Two-stage learning for multi-class classification using genetic programming

doi:10.1016/j.neucom.2012.01.048

Neurocomputing

Volume 116, 20 September 2013, Pages 311-316

https://doi.org/10.1016/j.neucom.2012.01.048

Abstract

This paper introduces a two-stage strategy for multi-class classification problems. The proposed technique is an advancement of the traditional binary decomposition method. In the first stage, classifiers are trained for each class versus the remaining classes. A modified fitness value is used to select good discriminators for the imbalanced data. In the second stage, the classifiers are integrated and treated as a single chromosome that can classify any of the classes in the dataset. A population of such classifier-chromosomes is created from the good classifiers (for individual classes) of the first stage. This population is evolved further, with a fitness that combines accuracy and conflicts. The proposed method encourages classifier combinations with good discrimination among all classes and fewer conflicts. The two-stage learning has been tested on several benchmark datasets, and the results are encouraging.

Introduction

Data classification finds application in many real-world problems, such as fraud detection, face recognition, speech recognition and knowledge extraction from databases. The field is receiving increased attention due to the unpredictability and complexity of real-world data. Evolutionary algorithms have shown good performance on classification tasks. Genetic Programming (GP) is an evolutionary algorithm introduced by Koza [1] for the automatic evolution of computer programs (including classifiers). GP has been successfully used for the evolution of classifier programs such as decision trees [2]. Other GP-based classification approaches include the evolution of neural networks [3], [4], [5], autonomous classification systems [6], rule induction algorithms [7], and fuzzy rule-based systems and fuzzy Petri nets [5], [8]. Most of these methods involve defining a grammar that is used to create and evolve classification algorithms using GP.

Various researchers [9], [10], [11], [12], [13] have used GP for the evolution of classification rules. The rule-based systems include atomic representations proposed by Eggermont [14], [15] and SQL-based representations proposed by Freitas et al. [12]. Tunstel and Jamshidi [16], Berlanga et al. [17] and Mendes et al. [18] introduced the evolution of fuzzy rules using GP. Chien et al. [19] used a fuzzy discrimination function for classification. Falco et al. [20] discovered comprehensive classification rules that use continuous-valued attributes. Bojarczuk et al. [21], [22] used different sets of functions applicable to different types of attributes, representing rules in disjunctive normal form. This type of GP is also referred to as constrained-syntax GP. Tsakonas et al. [23] introduced two GP-based systems for the medical domain and achieved noticeable performance. Lin et al. [24] proposed a layered GP, where different layers correspond to different populations that perform feature extraction and classification. Another method is the evolution of arithmetic expressions for classification. Arithmetic expressions can be used for numerical data, and they output a real value. This real value is translated into a classification decision using thresholds. The thresholding schemes include static thresholds [25], [26], dynamic thresholds [26], [27] and slotted thresholds [28].
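As a minimal sketch of the static-threshold idea mentioned above, the following Python snippet maps the real-valued output of an arithmetic expression to a binary decision. The expression `example_expr`, the default threshold of 0.0 and the function names are illustrative assumptions and are not taken from the cited works; dynamic and slotted thresholds are not shown.

```python
from typing import Callable, Sequence


def classify_static(expr: Callable[[Sequence[float]], float],
                    x: Sequence[float],
                    threshold: float = 0.0) -> int:
    """Return 1 if the expression output exceeds the threshold, else 0."""
    return 1 if expr(x) > threshold else 0


def example_expr(x: Sequence[float]) -> float:
    """Hypothetical evolved arithmetic expression over two numeric features."""
    return x[0] * x[1] - 2.0


if __name__ == "__main__":
    # The same expression can give different decisions under different thresholds.
    print(classify_static(example_expr, [2.0, 3.0]))
    print(classify_static(example_expr, [2.0, 3.0], threshold=5.0))
```

The threshold is fixed before evolution in this sketch, so only the expression itself would be subject to search.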

Multi-class classification problems are common in real-world applications, for tasks such as object recognition, character recognition, person recognition, disease diagnosis and several others. Many classification algorithms are binary in nature and must be extended for multi-class classification. These include neural networks, decision trees, k-nearest neighbor, naive Bayes classifiers, and support vector machines [29]. GP also needs to be extended for multi-class classification problems, and several methods have been presented for this purpose. The most noticeable among them is the one-versus-all method, also known as the binary decomposition method, which has been used widely in GP-based multi-class classification. In this method, one classifier is evolved for each class, discriminating that class from the other classes present in the data. The final decision is made by presenting the input vector to the classifiers of all classes. The classifier with a positive or the highest output is declared the winner. This method has been explored by many researchers [30], [31], [32], [33], [34]. A relatively different method, proposed by Muni et al. [35], uses a multi-tree representation, where a single classifier is an integrated version of the individual classifiers for all classes. This amalgamated classifier is evolved in search of the best classifier that can classify any of the classes in one evolution.
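The one-versus-all decision rule described above can be sketched in a few lines of Python. This is only an illustration of the "highest output wins" rule; the toy discriminators and all names below are hypothetical stand-ins for evolved GP expressions.

```python
from typing import Callable, Dict, Sequence

Classifier = Callable[[Sequence[float]], float]


def one_vs_all_predict(classifiers: Dict[str, Classifier],
                       x: Sequence[float]) -> str:
    """Present x to every per-class classifier and return the label
    whose classifier produces the highest output."""
    return max(classifiers, key=lambda label: classifiers[label](x))


# Hypothetical per-class discriminators (one per class).
toy_classifiers: Dict[str, Classifier] = {
    "A": lambda x: x[0] - 1.0,
    "B": lambda x: x[1] - 1.0,
    "C": lambda x: 0.5 - x[0] - x[1],
}

if __name__ == "__main__":
    print(one_vs_all_predict(toy_classifiers, [2.0, 0.0]))
```

Taking the maximum always yields a decision, but it hides the cases where several classifiers, or none, give a positive output.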

Several other methods, such as ‘all versus all’ [36], error correcting output codes [37], and generalized error correcting output codes [38], have also been used to tackle multi-class classification problems with binary classification algorithms. However, none of them has been used in GP, due to the large number of computations they require.

The drawback of the binary decomposition method is the occurrence of conflicting situations, where more than one classifier outputs a positive signal or none of the classifiers outputs a belong-to signal. These situations degrade the classification accuracy. Several conflict resolution methods have been devised for this problem, but they require extra processing during the training and classification steps. Another problem is the presence of skewed data: the data appears unbalanced when a single class is classified against all the remaining classes. This problem has been addressed by increasing the number of training instances so that the data appears balanced for each class [30], [36]. This is named the ‘interleaved data format’, where the samples belonging to the class under consideration are repeated and alternately placed between samples belonging to other classes. This strategy increases the size of the training data as well as the training time.
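The two issues above can be made concrete with a short Python sketch. `is_conflict` flags the case where zero or more than one classifier gives a positive output, and `interleave` builds a training sequence in which samples of the target class are repeated and alternated with samples of the other classes. The exact layout used in the cited works is not given here, so the cyclic repetition below is an assumption, as are all names.

```python
from itertools import cycle
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, class label)


def is_conflict(outputs: Sequence[float]) -> bool:
    """True if zero or more than one classifier gives a positive output."""
    positives = sum(1 for o in outputs if o > 0)
    return positives != 1


def interleave(samples: List[Sample], target: str) -> List[Sample]:
    """Alternate target-class samples (repeated cyclically, an assumption)
    with the samples of all other classes."""
    own = [s for s in samples if s[1] == target]
    others = [s for s in samples if s[1] != target]
    if not own or not others:
        return list(samples)  # nothing to interleave
    result: List[Sample] = []
    for repeated, other in zip(cycle(own), others):
        result.append(repeated)
        result.append(other)
    return result


if __name__ == "__main__":
    data = [([1.0], "A"), ([2.0], "B"), ([3.0], "C"), ([4.0], "B")]
    print([label for _, label in interleave(data, "A")])
```

In this sketch the interleaved sequence is twice as long as the list of non-target samples, which illustrates why the approach increases training time.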

The proposed staged approach overcomes these two problems. It evolves the classifiers in two different stages that perform discrimination and integration, and it incorporates a discriminative fitness function that takes care of skewed data without increasing the computation. The integrated evolution eliminates the conflicting situations, thereby decreasing the evaluation time otherwise required for conflict resolution. The proposed algorithm is detailed in the next section.
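The integration stage can be illustrated with a hedged Python sketch. It assumes that an integrated classifier (chromosome) holds one stage-one discriminator per class and that its fitness rewards accuracy and penalizes conflicts. The paper's actual fitness formula is not given in this excerpt, so the form `accuracy - conflict_weight * conflict_rate` and all names are assumptions made for illustration only.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Classifier = Callable[[Sequence[float]], float]
Chromosome = Dict[str, Classifier]      # one discriminator per class label
Sample = Tuple[Sequence[float], str]    # (feature vector, class label)


def build_chromosome(pools: Dict[str, List[Classifier]],
                     rng: random.Random) -> Chromosome:
    """Pick one stage-one classifier per class to form an integrated classifier."""
    return {label: rng.choice(pool) for label, pool in pools.items()}


def fitness(chromosome: Chromosome, data: List[Sample],
            conflict_weight: float = 1.0) -> float:
    """Assumed fitness: accuracy minus a weighted conflict rate."""
    if not data:
        raise ValueError("data must not be empty")
    correct = 0
    conflicts = 0
    for x, label in data:
        outputs = {c: f(x) for c, f in chromosome.items()}
        # Conflict: zero or more than one positive output.
        if sum(1 for v in outputs.values() if v > 0) != 1:
            conflicts += 1
        if max(outputs, key=outputs.get) == label:
            correct += 1
    n = len(data)
    return correct / n - conflict_weight * conflicts / n


if __name__ == "__main__":
    pools = {"A": [lambda x: x[0] - 1.0], "B": [lambda x: x[1] - 1.0]}
    chrom = build_chromosome(pools, random.Random(0))
    print(fitness(chrom, [([2.0, 0.0], "A"), ([0.0, 2.0], "B")]))
```

A population of such chromosomes could then be evolved with any standard selection and variation scheme; those operators are deliberately left out because the excerpt does not describe them.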

Section snippets

Proposed methodology

Many attempts have been made to develop general approaches to multi-class classification. One of the well-known methods in the machine learning community is the one-vs.-all method. It involves learning a discriminator for each class label against all the remaining labels. The proposed classification mechanism uses the same principle but divides the training process into two stages. The first stage resembles the traditional binary decomposition method. The output, given by this stage, is a set of classifier populations for

Results

Five benchmark multi-class classification problems have been selected from the UCI ML repository [41] for the performance evaluation of this work. We have selected the datasets based on the following properties:

(1) Dataset should be real or numerical valued.
(2) Problem should be multi-class classification.
(3) There should be no missing values.
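The three selection criteria can be checked programmatically. The sketch below is an illustration only: it assumes a dataset is a list of (feature values, label) rows, that "multi-class" means at least three distinct labels, and that missing values appear as `None` or NaN. The function name `meets_criteria` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[object], str]  # (feature values, class label)


def meets_criteria(rows: List[Row]) -> bool:
    """Check numeric features, at least three classes, and no missing values."""
    labels = {label for _, label in rows}
    if len(labels) < 3:  # criterion (2): multi-class
        return False
    for features, _ in rows:
        for value in features:
            if value is None:  # criterion (3): missing value
                return False
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                return False  # criterion (1): real or numerical valued
            if math.isnan(value):  # criterion (3): NaN treated as missing
                return False
    return True


if __name__ == "__main__":
    rows = [([5.1, 3.5], "a"), ([6.2, 2.9], "b"), ([7.7, 3.0], "c")]
    print(meets_criteria(rows))
```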

The datasets have been chosen from a variety of application domains to show the applicability of GP classification as well as the generality of our proposed optimization technique.

Conclusions

The proposed two-stage learning mechanism for multi-class classification using Genetic Programming has yielded better results than the one-versus-all (binary decomposition) method. This is because the binary decomposition method suffers from conflicting situations, whereas we have used a fitness measure that favors accurate classifiers and fewer conflicting outputs. The proposed method reduces the computation required to perform the conflict resolution during the

Hajira Jabeen has been working as an assistant professor at Iqra University, Islamabad, Pakistan, since 2009. Her fields of expertise include evolutionary computation, swarm intelligence and data classification.

References (48)

M.D. Ritchie, Genetic programming neural networks: a powerful bioinformatics tool for human genetics, Appl. Soft Comput. (2007)
A. Tsakonas, A comparison of classification accuracy of four genetic programming-evolved intelligent structures, Inf. Sci. (2006)
M. Oltean et al., An autonomous GP-based system for regression and classification problems, Appl. Soft Comput. (2009)
B.C. Chien et al., Learning discriminant functions with fuzzy attributes for classification using genetic programming, Expert Syst. Appl. (2002)
C.C. Bojarczuk, A constrained-syntax genetic programming system for discovering classification rules: application to medical data sets, Artif. Intelligence Med. (2004)
K.H. Liu et al., Cancer classification using Rotation Forest, Comput. Biol. Med. (2008)
M. Galar et al., An overview of ensemble methods for binary classifiers in multi-class problems: experimental study on one-vs-one and one-vs-all schemes, Pattern Recognition (2011)
J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection (1992)
J.R. Koza, Concept formation and decision tree induction using the genetic programming paradigm, Lect. Notes Comput. Sci. (1991)
D. Rivero et al., Modifying genetic programming for artificial neural network development for data mining, Soft Comput. (2008)
G.A. Pappa et al., Evolving rule induction algorithms with multiobjective grammar based genetic programming, Knowl. Inf. Syst. (2008)
J. Eggermont, Evolving Fuzzy Decision Trees for Data Classification, Proceedings of the 14th Belgium Netherlands...
R. Konig et al., Genetic programming - a tool for flexible rule extraction, IEEE Cong. Evol. Comput. (2007)
A.P. Engelbrecht et al., A building block approach to genetic programming for rule discovery, in data mining: a heuristic approach
E. Carreno et al., Evolution of classification rules for comprehensible knowledge discovery, IEEE Cong. Evol. Comput. (2007)
A.A. Freitas, A Genetic Programming Framework for Two Data Mining Tasks: Classification and Generalized Rule Induction (1997)
C.S. Kuo et al., Applying genetic programming technique in classification trees, Soft Computing (2007)
J. Eggermont, A.E. Eiben, J.I. Hemert, A comparison of genetic programming variants for data classification....
J. Eggermont, J.N. Kok, W.A. Kosters, GP For Data Classification, Partitioning The Search Space, Proceedings of the...
E. Tunstel et al., On genetic programming of fuzzy rule-based systems for intelligent control, Int. J. Intelligent Autom. Soft Comput. (1996)
F.J. Berlanga et al., A genetic-programming-based approach for the learning of compact fuzzy rule-based classification...
R.R.F. Mendes et al., Discovering fuzzy classification rules with genetic programming and co-evolution, Principles of...
I.D. Falco, A.D. Cioppa, E. Tarantino, Discovering interesting classification rules with GP, Appl. Soft Comput. 4 (2002)...
C.C. Bojarczuk et al., An innovative application of a constrained-syntax genetic programming system to the problem of predicting survival of patients (2003)

Abdul Rauf Baig has been associated with the National University of Computing and Emerging Technologies, NU-FAST, Islamabad, Pakistan, since 2004. His fields of expertise include artificial intelligence, data mining and swarm intelligence.
