A filter-based feature construction and feature selection approach for classification using Genetic Programming
Introduction
Classification is an important supervised machine learning task that aims to assign unknown instances to their corresponding categories based on the information contained in predefined features [1], [2]. The quality of the features is therefore a main factor affecting classification performance [3]. Without prior knowledge, it is difficult to know which features are effective, so a sufficiently large number of features is usually predefined, many of which turn out to be irrelevant or redundant. Irrelevant and redundant features are not useful for classification and can even reduce classification performance [4], [5]. In some real-world classification applications, the available features do not have adequate discrimination ability [3], so the trained classification model cannot achieve adequate classification performance.
Feature selection is used to select effective features and remove irrelevant or redundant ones [1], while feature construction is employed to create new higher-level features from the original ones in order to reduce the dimensionality of the feature space and increase classification performance [6]. Wrapper and filter are two typical families of feature construction and selection approaches, distinguished by their evaluation criteria. Wrapper-based approaches use the classification performance of a learning algorithm as the evaluation criterion, while filter-based approaches use information measures such as Information Gain (IG) [7], Information Gain Ratio (IGR) [3] and Correlation [8], [9]. Wrapper-based approaches depend on the learning algorithm, and in general they achieve better classification performance than filter-based approaches. Since no learning algorithm is involved in the evaluation measures, filter-based approaches are faster and their classification models are more general than those of wrapper-based approaches [10], [11]. Moreover, experiments show that our proposed filter-based feature construction and feature selection method (FCMFS) can also achieve better classification performance than a wrapper-based feature construction method.
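To make the filter criterion concrete, the following minimal sketch computes Information Gain for a discrete feature without training any classifier. It is only an illustration under stated assumptions: the function names, the toy data and the use of already-discretized feature values are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG = H(class) - H(class | feature) for a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Toy data (hypothetical): two classes, two candidate features.
labels = ["a", "a", "b", "b"]
informative = [0, 0, 1, 1]   # separates the two classes perfectly
irrelevant = [0, 1, 0, 1]    # carries no information about the class

ig_informative = information_gain(informative, labels)
ig_irrelevant = information_gain(irrelevant, labels)
```

A filter-based method would rank or score candidate features with a measure of this kind, whereas a wrapper-based method would instead train and evaluate a classifier for each candidate.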
To address feature construction and feature selection problems, efficient global search algorithms are needed. Genetic Programming (GP) [12], [13] offers flexible pattern representations such as trees, and can use any kind of logical and mathematical expression inside these representations [14]. Such expressions can transform original features into new higher-level constructed features, and can also be used to select effective features. Therefore, GP can be used to solve feature selection [15], [16], [17] and feature construction [3], [10] tasks.
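The dual role of a GP tree described above can be sketched as follows. This is a simplified illustration, not the representation used in the paper: the nested-tuple encoding, the three-operator function set and the helper names are all assumptions made for the example.

```python
import operator

# Hypothetical function set for internal nodes; the paper's own set is not listed here.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, instance):
    """Evaluate an expression tree on one instance (a list of feature values).

    A leaf is an int index into the instance; an internal node is a tuple
    (function_name, left_subtree, right_subtree).
    """
    if isinstance(tree, int):
        return instance[tree]
    name, left, right = tree
    return FUNCTIONS[name](evaluate(left, instance), evaluate(right, instance))

def used_features(tree):
    """Return the set of original feature indices that appear as leaves."""
    if isinstance(tree, int):
        return {tree}
    _, left, right = tree
    return used_features(left) | used_features(right)

# (f0 * f2) - f3: one constructed feature built from three original features.
tree = ("-", ("*", 0, 2), 3)
instance = [2.0, 7.0, 5.0, 1.0]

constructed_value = evaluate(tree, instance)   # feature construction
selected = sorted(used_features(tree))         # implicit feature selection
```

Evaluating the tree yields the value of the constructed feature for an instance, while the leaves that appear in the tree identify the subset of original features it relies on; here feature 1 is never used.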
Feature construction approaches are used to transform the original feature space into another, higher-level feature space. In general, GP can be used to construct single features. Otero et al. (2002) [3] used IGR as the fitness function and constructed a single feature. Muharram et al. (2005) [18] proposed a single feature construction method that employed information gain, the Gini index, chi-square, and a combination of information gain and the Gini index as fitness functions. Guo et al. [19], [20] proposed similar single feature construction methods, the difference being that these methods use the Fisher criterion as the fitness function. Since single constructed features do not have adequate discrimination ability for classification, multiple feature construction approaches have been investigated to improve classification performance. Neshatian et al. (2012) [10] used a fitness function that maximized the purity of class intervals and constructed the same number of features as the number of classes. Krawiec (2002) [21] proposed an archive-based multiple feature construction method that stored useful individuals during the evolutionary run. Ahmed et al. (2014) [22] divided the best individual into all possible sub-trees, which were transformed into multiple features. Moreover, cooperative coevolution strategies [23], [24], [25], [26] that create multiple concurrent populations have been used to construct multiple features.
As a GP population evolves, many excellent individuals are often lost. In order to preserve effective constructed features during a GP run, a multiple feature construction approach (FCM) that stores top individuals was proposed in our previous work [27]. However, how to set the parameter is a problem. In this work, we investigate the impact of the parameter on the experimental results and set a value that is as large as possible to maintain the classification performance. As a consequence, redundant features may be produced among the constructed features. Therefore, we employ a GP-based feature selection approach (FS) that uses a correlation-based method to reduce feature redundancy and increase feature relevancy. The resulting approach, named FCMFS, first uses FCM to perform feature construction and then uses FS to perform feature selection.
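The idea of storing top individuals across generations can be illustrated with a small archive update. This is a hedged sketch rather than the FCM procedure itself: the archive size `k`, the string encoding of individuals, the fitness values and the function name `update_archive` are hypothetical.

```python
import heapq

def update_archive(archive, population, k):
    """Merge a generation into the archive and keep the k fittest distinct individuals.

    Both arguments are lists of (fitness, expression) pairs; higher fitness is better.
    """
    best = {}
    for fitness, expr in archive + population:
        # Keep only the best fitness seen for each distinct expression.
        if expr not in best or fitness > best[expr]:
            best[expr] = fitness
    return heapq.nlargest(k, ((f, e) for e, f in best.items()))

# Two toy generations (hypothetical fitness values and expressions).
archive = []
generations = [
    [(0.40, "f0+f1"), (0.55, "f0*f2"), (0.10, "f3")],
    [(0.52, "f1-f2"), (0.55, "f0*f2"), (0.20, "f3*f3")],
]
for population in generations:
    archive = update_archive(archive, population, k=3)
```

A larger `k` preserves more candidate constructed features but makes it more likely that some of them are redundant, which is the motivation for following construction with a selection stage.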
To facilitate comparison, both the feature selection and the feature construction approaches use standard GP. The overall goal of this paper is to propose a feature construction and feature selection approach (FCMFS), which first constructs multiple features using FCM and then selects effective constructed features using FS, and to investigate the effectiveness of the proposed FCMFS by comparing it with other feature processing methods. To achieve this overall goal, the following four objectives will be investigated.
Objective 1. Propose a filter-based multiple feature construction approach using GP (FCM) and a filter-based feature selection approach using GP (FS), and investigate whether features selected by FS or constructed by FCM can obtain better classification performance than original features.
Objective 2. Develop a feature construction and feature selection approach named FCMFS, and investigate whether FCMFS can obtain equivalent classification performance with a smaller number of features compared with FCM, and whether FCMFS can achieve better classification performance with fewer features than FS on the nine datasets.
Objective 3. Investigate another feature construction and feature selection method, named FSFCM, that first selects features using FS and then constructs features using FCM, and verify whether FCMFS can achieve better classification performance than FSFCM.
Objective 4. Investigate whether our proposed FCMFS can achieve better performance than three state-of-the-art methods, including one wrapper-based feature construction method [28], one filter-based multiple feature construction method [10] and one single-stage feature construction and feature selection method [29].
The rest of the paper is arranged as follows. The next section describes the background information involved in this paper. Section 3 presents the GP-based feature construction and feature selection approaches. Section 4 describes the experimental design. Section 5 presents the experimental results with discussions. Section 6 provides conclusions and future directions.
Genetic programming
Evolutionary computation (EC) techniques are inspired by Darwin’s theory of evolution [30]. Genetic programming (GP) [12], [31], genetic algorithms (GA), particle swarm optimization (PSO) and ant colony optimization (ACO) are effective EC algorithms due to their global search ability. In addition, some new EC algorithms, such as extremal optimization (EO) algorithms, are used to solve optimization problems [32], [33]. Addressing multi-objective optimization problems using EC algorithms is getting
Methodology
We use standard GP representation methods to solve feature construction and feature selection problems. Each individual is represented as a tree-like structure. The leaf nodes of an individual are drawn randomly from the original features, constructed features or selected features, depending on the feature processing method, as described below. The internal nodes are functions that come from a function set. Genetic operators, including reproduction, crossover and mutation, are
Benchmark techniques
To verify the effectiveness of our proposed FCMFS, three state-of-the-art techniques, including one wrapper-based feature construction method [28], one filter-based multiple feature construction method [10] and one single-stage feature construction and feature selection method [29], are chosen for comparison.
The first is a conventional wrapper-based feature construction method using GP which constructs a single feature, i.e., one GP run only outputs the best individual and the fitness function uses
Experimental results and discussions
We arranged the following experiments. Firstly, FCMFS is compared with three benchmark techniques (FCW, FCMMR and SFCFS) to verify the effectiveness of the proposed FCMFS. Secondly, the effectiveness of the six feature processing methods in Section 3 is compared: (1) FCMFS is compared with two baselines (ALL, FCS) to further verify the effectiveness of the proposed FCMFS, (2) FCM and FS are compared with ALL to verify the effectiveness of the FCM and FS methods, and (3) FCMFS is compared with FCM and FS
Conclusions and future work
This paper proposes a filter-based feature construction and feature selection approach using GP (FCMFS) that is divided into two stages, i.e., first using a multiple feature construction approach to store top individuals (FCM) and then using a feature selection approach to select an effective feature subset (FS). The experiments on nine datasets show that FCM and FS can obtain better performance than the original features. FCMFS can maintain the classification performance with a smaller number of features
CRediT authorship contribution statement
Jianbin Ma: Conceptualization, Methodology, Software, Validation, Supervision, Investigation, Writing - original draft. Xiaoying Gao: Writing - review & editing, Formal analysis, Visualization, Validation.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work is supported by Hebei Agricultural University (No. ZD201702, No. LG201707) and Hebei Provincial Department of Human Resources and Social Security, China (No. CN201709).
References (64)
- et al., Feature subset selection in large dimensionality domains, Pattern Recognit. (2010)
- et al., Weighted nearest neighbors feature selection, Knowl.-Based Syst. (2019)
- et al., Novel feature selection method for genetic programming using metabolomic 1H NMR data, Chemometr. Intell. Lab. Syst. (2006)
- et al., Breast cancer diagnosis using genetic programming generated feature, Pattern Recognit. (2006)
- Generative learning of visual concepts using multiobjective genetic programming, Pattern Recognit. Lett. (2007)
- et al., A hybrid multiple feature construction approach using genetic programming, Appl. Soft Comput. (2019)
- et al., Constrained population extremal optimization-based robust load frequency control of multi-area interconnected power system, Int. J. Electr. Power Energy Syst. (2019)
- et al., Design of PID controller based on a self-adaptive state-space predictive functional control using extremal optimization method, J. Franklin Inst. B (2018)
- et al., Design of fractional order PID controller for automatic regulator voltage system based on multi-objective extremal optimization, Neurocomputing (2015)
- et al., Enhanced multi-objective particle swarm optimisation for estimating hand postures, Knowl.-Based Syst. (2018)