Toward optimal feature selection using ranking methods and classification algorithms
DOI: https://doi.org/10.2298/YJOR1101119N

Keywords: feature selection, feature ranking methods, classification algorithms, classification

Abstract
We present a comparison of several feature ranking methods on two real datasets. We consider six ranking methods, which fall into two broad categories: statistical and entropy-based. Four supervised learning algorithms are used to build the models: IB1, Naive Bayes, the C4.5 decision tree, and the RBF network. We show that the choice of ranking method can be important for classification accuracy: in our experiments, different ranking methods combined with different supervised learning algorithms yield quite different balanced accuracies. Our cases confirm that, to be sure the subset of features giving the highest accuracy has been selected, the use of several different indices is recommended.

References
Abe, N., Kudo, M. (2005) Entropy criterion for classifier-independent feature selection. Lecture Notes in Computer Science, 3684: 689-695
Almuallim, H., Dietterich, T.G. (1991) Learning with many irrelevant features. in: Proceedings of AAAI-91, Anaheim, pp. 547-552
Ben-Bassat, M. (1982) Pattern recognition and reduction of dimensionality. in: Krishnaiah, P.R., Kanal, L.N. (eds.) Handbook of Statistics II, North-Holland, pp. 773-791
Blum, A., Rivest, R. (1992) Training a 3-node neural network is NP-complete. Neural Networks, 5(1): 117-127
Blum, A.L., Langley, P. (1997) Selection of relevant features and examples in machine learning. Artificial Intelligence, 97(1-2): 245-271
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J. (1984) Classification and regression trees. Belmont, CA: Wadsworth International Group
Caruana, R., Freitag, D. (1994) Greedy attribute selection. in: Proceedings of the International Conference on Machine Learning (ICML-94), Menlo Park, CA: AAAI Press, pp. 28-36
Das, S. (2001) Filters, wrappers and a boosting-based hybrid for feature selection. in: Proceedings of the 18th International Conference on Machine Learning (ICML-01)
Dash, M., Liu, H. (1997) Feature selection for classification. Intelligent Data Analysis: An International Journal, 1(3), www-east.elsevier.com/ida/free.htm
Dash, M., Liu, H., Yao, J. (1997) Dimensionality reduction of unsupervised data. in: Proceedings of the 9th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'97), Newport Beach, CA: IEEE Computer Society, pp. 532-539
Dash, M., Liu, H. (1999) Handling large unsupervised data via dimensionality reduction. in: Proceedings of the SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD-99)
Doak, J. (1992) An evaluation of feature selection methods and their application to computer security. Davis: University of California, Department of Computer Science, Technical report
Domingos, P., Pazzani, M. (1997) On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29(2-3): 103-130
Duch, W., Adamczak, R., Grabczewski, K. (2001) A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks, 12(2): 277-306
Dy, J.G., Brodley, C.E. (2000) Feature subset selection and order identification for unsupervised learning. in: Proceedings of the 17th International Conference on Machine Learning, pp. 247-254
Fayyad, U.M., Irani, K.B. (1992) The attribute selection problem in decision tree generation. in: Proceedings of the 9th National Conference on Artificial Intelligence (AAAI-92), AAAI Press, pp. 104-110
Frank, A., Asuncion, A. (2010) UCI machine learning repository. Irvine: University of California, School of Information and Computer Science, http://archive.ics.uci.edu/ml
Hall, M.A., Smith, L.A. (1998) Practical feature subset selection for machine learning. in: Proceedings of the 21st Australasian Computer Science Conference, pp. 181-191
Holte, R.C. (1993) Very simple classification rules perform well on most commonly used datasets. Machine Learning, 11(1): 63-90
John, G.H., Kohavi, R., Pfleger, K. (1994) Irrelevant features and the subset selection problem. in: Hirsh, H., Cohen, W.W. (eds.) Proceedings of the 11th International Conference on Machine Learning, New Brunswick: Rutgers University, pp. 121-129
Kim, Y., Street, W., Menczer, F. (2000) Feature selection for unsupervised learning via evolutionary search. in: Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 365-369
Kira, K., Rendell, L.A. (1992) The feature selection problem: Traditional methods and a new algorithm. in: Proceedings of AAAI-92, San Jose, pp. 122-126
Kohavi, R., John, G.H. (1997) Wrappers for feature subset selection. Artificial Intelligence, 97(1-2): 273-324
Koller, D., Sahami, M. (1996) Toward optimal feature selection. in: Proceedings of the International Conference on Machine Learning, pp. 284-292
Kuramochi, M., Karypis, G. (2005) Gene classification using expression profiles: A feasibility study. International Journal on Artificial Intelligence Tools, 14(4): 641
Liu, H., Setiono, R. (1995) Chi2: Feature selection and discretization of numeric attributes. in: Proceedings of the 7th IEEE International Conference on Tools with Artificial Intelligence, pp. 388-391
Liu, H., Motoda, H. (1998) Feature selection for knowledge discovery and data mining. Kluwer Academic Publishers
Liu, H., Setiono, R. (1996) A probabilistic approach to feature selection: A filter solution. in: Saitta, L. (ed.) Proceedings of the International Conference on Machine Learning (ICML-96), Jul. 3-6, 1996, Bari, Italy, San Francisco: Morgan Kaufmann Publishers, pp. 319-327
Mitra, P., Murthy, C.A., Pal, S.K. (2002) Unsupervised feature selection using feature similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(3): 301-312
Quinlan, J.R. (1993) C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann Publishers
Robnik-Šikonja, M., Kononenko, I. (2003) Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning, 53(1-2): 23-69
Siedlecki, W., Sklansky, J. (1988) On automatic feature selection. International Journal of Pattern Recognition and Artificial Intelligence, 2(2): 197
Talavera, L. (1999) Feature selection as a preprocessing step for hierarchical clustering. in: Proceedings of the 16th International Conference on Machine Learning (ICML-99)
Weiss, S.M., Kulikowski, C.A. (1991) Computer Systems That Learn. San Francisco, CA: Morgan Kaufmann
Wyse, N., Dubes, R., Jain, A.K. (1980) A critical evaluation of intrinsic dimensionality algorithms. in: Gelsema, E.S., Kanal, L.N. (eds.) Pattern Recognition in Practice, Morgan Kaufman Publishers, Inc., pp. 415-425
Xing, E.P., Jordan, M.I., Karp, R.M. (2001) Feature selection for high-dimensional genomic microarray data. in: Proceedings of the 18th International Conference on Machine Learning, pp. 601-608
Yang, J., Honavar, V. (1998) Feature subset selection using a genetic algorithm. IEEE Intelligent Systems, 13(2): 44-49
Copyright (c) YUJOR
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.