Feature subset selection plays an essential role in data mining applications: it speeds up the mining algorithm and improves predictive performance. This paper investigates the performance of three families of feature selection methods for classification: filter, wrapper and hybrid. The two filter methods used are based on information gain and correlation measures. Correlation-based Feature Selection (CFS) filters and wrappers were implemented with three different search mechanisms: best-first, greedy stepwise and genetic search. The effectiveness of the selected features was assessed by comparing the accuracy and runtime of five traditional classification algorithms trained on only the selected features against the same algorithms trained on all features.
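As a concrete illustration of the first filter, the following is a minimal sketch, assuming the WEKA 3.x Java API, of information gain ranking with a fixed cut-off; the file name dataset.arff and the choice of the ten top-ranked attributes are illustrative placeholders rather than values taken from this study.

import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.InfoGainAttributeEval;
import weka.attributeSelection.Ranker;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class InfoGainFilterDemo {
    public static void main(String[] args) throws Exception {
        // Load the dataset and mark the last attribute as the class (assumed layout).
        Instances data = new DataSource("dataset.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        // Rank attributes by information gain and keep the ten highest-ranked ones.
        AttributeSelection selector = new AttributeSelection();
        Ranker search = new Ranker();
        search.setNumToSelect(10);
        selector.setEvaluator(new InfoGainAttributeEval());
        selector.setSearch(search);
        selector.SelectAttributes(data);

        // Project the data onto the selected attributes (plus the class).
        Instances reduced = selector.reduceDimensionality(data);
        System.out.println("Selected " + (reduced.numAttributes() - 1) + " attributes");
    }
}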
[...] In this study we compared the feature selection performance of the filter and wrapper models and combined the two models to create new hybrid models for feature selection. The remainder of this paper is organized as follows: Section 2 describes the feature selection methods and classification algorithms used. Section 3 presents the proposed approach. Section 4 presents the dataset description, experimental results and discussion. The conclusions and future research are presented in Section 5.

CLASSIFICATION TECHNIQUES

We consider five popular classification techniques: the RBF network, Naïve Bayes, the J48 decision tree, and the PART and JRip rule learning algorithms. [...]
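The five learners can be instantiated through the WEKA Java API roughly as sketched below; this is an assumed setup with default parameters rather than the exact configuration used in the experiments, and RBFNetwork, which ships with WEKA 3.6, is distributed as a separate package in later releases.

import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.functions.RBFNetwork;
import weka.classifiers.rules.JRip;
import weka.classifiers.rules.PART;
import weka.classifiers.trees.J48;

public class ClassifierSuite {
    // Returns fresh, default-parameter instances of the five learners compared in this study.
    public static Classifier[] buildClassifiers() {
        return new Classifier[] {
            new RBFNetwork(),   // radial basis function network
            new NaiveBayes(),   // probabilistic Naive Bayes
            new J48(),          // C4.5-style decision tree
            new PART(),         // partial-decision-tree rule learner
            new JRip()          // RIPPER rule learner
        };
    }
}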
[...] The performance of the methods has been evaluated in terms of accuracy, execution time and the reduction in the number of selected features.

RESULTS AND DISCUSSIONS

The classification and feature selection methods were applied using the WEKA software. The results reported in this section were obtained with 10-fold cross-validation over the dataset. Combinations of feature selection and classification methods were examined for the dataset. The accuracy and runtime of the classifiers obtained with the filter methods are shown in Table 1. Table 2 gives the performance of the classifiers with the wrapper feature selection methods.

ACKNOWLEDGEMENT

The authors would like to thank Dr. [...]
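The 10-fold cross-validation protocol described above can be reproduced with a short routine such as the following, again assuming the WEKA 3.x Java API; the wall-clock timing wrapper is our own illustrative addition and not part of WEKA's Evaluation class.

import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.core.Instances;

public class CrossValidation {
    // Runs 10-fold cross-validation with a fixed seed and reports accuracy and runtime.
    public static void evaluate(Classifier classifier, Instances data) throws Exception {
        long start = System.currentTimeMillis();

        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(classifier, data, 10, new Random(1));

        long elapsed = System.currentTimeMillis() - start;
        System.out.printf("Accuracy: %.2f%%  Runtime: %d ms%n", eval.pctCorrect(), elapsed);
    }
}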
[...] The wrappers and CFS filters have been implemented. The experimental observations of the hybrid feature selection models for classification are given in Table 3. It is observed that the accuracy and runtime of almost all classifiers improved after applying the feature selection methods to the dataset. In some combinations the accuracy and the number of selected features fluctuate; however, processing time improves in all combinations. The highest accuracy is obtained by the hybrid information gain filter with best-first and genetic search, but at the cost of a larger number of selected features. [...]
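A hybrid model of this kind composes a filter stage with a wrapper stage. The sketch below, assuming the WEKA 3.x Java API, chains a CFS filter using best-first search with a J48-based wrapper applied to the reduced attribute set; the particular pairing and parameter values are illustrative and do not reproduce the exact configurations behind Table 3.

import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.BestFirst;
import weka.attributeSelection.CfsSubsetEval;
import weka.attributeSelection.WrapperSubsetEval;
import weka.classifiers.trees.J48;
import weka.core.Instances;

public class HybridSelection {
    public static Instances select(Instances data) throws Exception {
        // Stage 1: CFS filter with best-first search prunes the attribute set.
        AttributeSelection filterStage = new AttributeSelection();
        filterStage.setEvaluator(new CfsSubsetEval());
        filterStage.setSearch(new BestFirst());
        filterStage.SelectAttributes(data);
        Instances filtered = filterStage.reduceDimensionality(data);

        // Stage 2: a wrapper around J48 refines the subset, scoring candidates
        // by internal 5-fold cross-validation on the filtered data.
        WrapperSubsetEval wrapper = new WrapperSubsetEval();
        wrapper.setClassifier(new J48());
        wrapper.setFolds(5);

        AttributeSelection wrapperStage = new AttributeSelection();
        wrapperStage.setEvaluator(wrapper);
        wrapperStage.setSearch(new BestFirst());
        wrapperStage.SelectAttributes(filtered);
        return wrapperStage.reduceDimensionality(filtered);
    }
}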
[...] Department of Computer Science, University of Waikato, Hamilton, NZ.
John, G.H., Kohavi, R., Pfleger, K., "Irrelevant Features and the Subset Selection Problem", In Proceedings of the 11th International Conference on Machine Learning (ICML94), pp. 121-
Nigel Williams, Sebastian Zander, Grenville Armitage, "Preliminary Performance Comparison of Five Machine Learning Algorithms for Practical IP Traffic Flow Classification", ACM SIGCOMM Computer Communication Review, Vol. 36, No. 5, pp. 7-
Haleh Vafaie and Kenneth De Jong, "Genetic Algorithms as a Tool for Feature Selection in Machine Learning", In Proceedings of the 4th International Conference on Tools with Artificial Intelligence, pp. 200-
Y. [...]
[...] GSW searches greedily through the space of attribute subsets, progressing forward from the empty set or backward from the full set. GS uses a simple genetic algorithm. Genetic algorithms are best known for their ability to efficiently search large spaces about which little is known a priori. Since genetic algorithms are relatively insensitive to noise, they are an excellent choice as the basis of a more robust feature selection strategy for improving the performance of classification systems [14]. [...]
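The two search strategies can be configured roughly as follows, assuming the WEKA 3.x Java API (GeneticSearch is distributed as a separate package in recent WEKA releases); the population size, number of generations and crossover/mutation probabilities shown are WEKA's defaults, not settings taken from the experiments reported here.

import weka.attributeSelection.ASSearch;
import weka.attributeSelection.GeneticSearch;
import weka.attributeSelection.GreedyStepwise;

public class SearchStrategies {
    // Greedy stepwise search starting from the empty set and adding attributes.
    public static ASSearch greedyStepwise() {
        GreedyStepwise search = new GreedyStepwise();
        search.setSearchBackwards(false);
        return search;
    }

    // Simple genetic search over attribute subsets with WEKA's default parameters.
    public static ASSearch geneticSearch() {
        GeneticSearch search = new GeneticSearch();
        search.setPopulationSize(20);
        search.setMaxGenerations(20);
        search.setCrossoverProb(0.6);
        search.setMutationProb(0.033);
        return search;
    }
}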