Toward optimal feature selection using ranking methods and classification algorithms

Authors

  • Jasmina Novaković, Megatrend University, Faculty of Computer Science, Belgrade
  • Perica Strbac, Megatrend University, Faculty of Computer Science, Belgrade
  • Dusan Bulatović, Megatrend University, Faculty of Computer Science, Belgrade

DOI:

https://doi.org/10.2298/YJOR1101119N

Keywords:

Feature selection, feature ranking methods, classification algorithms, classification

Abstract

We present a comparison of several feature ranking methods on two real datasets. We consider six ranking methods that fall into two broad categories: statistical and entropy-based. Four supervised learning algorithms are used to build models: IB1, Naive Bayes, the C4.5 decision tree, and the RBF network. We show that the choice of ranking method can be important for classification accuracy: in our experiments, different combinations of ranking methods and supervised learning algorithms yield quite different balanced accuracies. Our cases confirm that, to be confident that the feature subset giving the highest accuracy has been selected, it is advisable to use several different ranking indices.
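To make the procedure concrete, the sketch below reproduces the spirit of this pipeline: features are ranked by a statistical index (chi-squared) and an entropy-based one (mutual information, closely related to information gain), and a classifier's balanced accuracy is compared on the top-k features from each ranking. This is an illustrative reconstruction assuming scikit-learn; the dataset (UCI breast cancer), the classifier, and the particular indices are stand-ins, not the authors' original setup.

```python
# Illustrative sketch (not the authors' implementation): rank features with
# a statistical index (chi-squared) and an entropy-based index (mutual
# information), then compare the balanced accuracy of a Naive Bayes
# classifier trained on the top-k features from each ranking.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import chi2, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Stand-in dataset from the UCI repository; the paper's datasets may differ.
X, y = load_breast_cancer(return_X_y=True)

# Statistical index: chi-squared requires non-negative inputs, so shift
# each feature to a zero minimum before scoring.
chi2_scores, _ = chi2(X - X.min(axis=0), y)
# Entropy-based index: mutual information between each feature and the class.
mi_scores = mutual_info_classif(X, y, random_state=0)

k = 10  # number of top-ranked features to keep
for name, scores in [("chi-squared", chi2_scores), ("mutual info", mi_scores)]:
    top_k = np.argsort(scores)[::-1][:k]  # indices of the k best features
    acc = cross_val_score(GaussianNB(), X[:, top_k], y,
                          cv=10, scoring="balanced_accuracy").mean()
    print(f"{name:12s} top-{k} features -> balanced accuracy {acc:.3f}")
```

Because the two indices can produce different top-k subsets, the resulting accuracies generally differ, which is exactly why the abstract recommends trying several ranking indices rather than relying on one.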

Published

2011-03-01

Section

Research Articles