Applied Soft Computing

Volume 80, July 2019, Pages 761-775

Feature selection based on brain storm optimization for data classification

https://doi.org/10.1016/j.asoc.2019.04.037

Highlights

  • A hybrid model (FAM-BSO) for feature selection and data classification is proposed.

  • Fuzzy ARTMAP (FAM) is first used for incremental learning of data.

  • A Brain Storm Optimization (BSO) is then used for feature selection.

  • The outcome indicates that FAM-BSO is able to produce promising results.

Abstract

Brain storm optimization (BSO) is a new and effective swarm intelligence method inspired by the human brainstorming process. This paper presents a novel BSO-based feature selection technique for data classification. Specifically, the Fuzzy ARTMAP (FAM) model, which is employed as an incremental learning neural network, is combined with BSO, which acts as a feature selection method, to produce the hybrid FAM-BSO model for feature selection and optimization. Firstly, FAM is used to create a number of prototype nodes incrementally. Then, BSO is used to search for and select an optimal subset of features that produces high accuracy with the minimum number of features. Ten benchmark problems and a real-world case study are employed to evaluate the performance of FAM-BSO. The results are quantified statistically using the bootstrap method with 95% confidence intervals. The outcome indicates that FAM-BSO is able to produce promising results as compared with those from the original FAM and other feature selection methods, including particle swarm optimization, genetic algorithm, genetic programming, and ant colony optimization.

Introduction

Feature selection identifies an optimal subset of features from data samples for classification problems. Feature selection methods aim to improve the predictive accuracy and/or reduce the computational time of classification algorithms. They are important in data mining and pattern recognition problems. However, selecting an optimal subset of features is a challenging optimization task, and is time consuming due to the large search space and complex relationships among the features [1].

Feature selection techniques consist of two parts: (i) a search technique to find the optimal subset of features, and (ii) a classifier or learning algorithm to evaluate the effectiveness of the selected subset. In general, feature selection techniques can be divided into three categories: filter-based, wrapper-based, and embedded methods [2]. Filter-based methods mainly focus on the properties of data samples without considering the underlying learning scheme [3]. In contrast, wrapper-based methods employ a classifier or a learning algorithm to evaluate the effectiveness of various feature subsets, and adopt a search technique to find the optimal one. Unlike wrapper methods, embedded methods incorporate feature selection into the training process, in order to reduce the computational time of re-evaluating different feature subsets. Regularization methods, e.g., the least absolute shrinkage and selection operator (LASSO) [4] and Elastic Net [5], are popular embedded methods. Since wrapper-based methods consider both the feature subset and the classifier, they are generally more effective than filter-based methods. Nevertheless, wrapper methods are more computationally demanding, because a classifier must be retrained on each selected feature subset [6].
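
As an illustration of the wrapper idea, the sketch below scores a candidate feature subset by retraining a classifier using only those features. The nearest-centroid classifier and all function names here are our own illustrative choices, not part of any method cited above.

```python
import numpy as np

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Tiny stand-in classifier: predict the class whose centroid is closest."""
    classes = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    preds = classes[dists.argmin(axis=1)]
    return float((preds == y_test).mean())

def wrapper_score(X_train, y_train, X_val, y_val, subset):
    """Wrapper evaluation: retrain and evaluate using only the selected features."""
    if len(subset) == 0:          # an empty subset gets the worst possible score
        return 0.0
    cols = list(subset)
    return nearest_centroid_accuracy(X_train[:, cols], y_train,
                                     X_val[:, cols], y_val)
```

A search technique (sequential, evolutionary, or otherwise) would call `wrapper_score` repeatedly, which is exactly why wrapper methods are more expensive than filter methods.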

Many wrapper-based feature selection methods have been proposed to perform the search process and find the optimal subset of features, such as sequential forward selection (SFS) and sequential backward selection (SBS) [7]. While both methods can produce promising results, they suffer from several problems, such as nesting effects [3] and computational complexity [1]. To overcome these problems, population-based evolutionary computation (EC) algorithms, such as the genetic algorithm (GA) [8], [9], ant colony optimization (ACO) [10], genetic programming (GP) [11], and particle swarm optimization (PSO) [12], have been widely used. These algorithms have shown promising results in solving single-, multi- and many-objective optimization problems owing to their capability of conducting a global search.

Many EC methods have been adopted for feature selection, each with its own advantages and disadvantages. As an example, the learning process of PSO is based on the Euclidean distance [13], which limits its adaptability. Nevertheless, PSO is less complex, in terms of both run-time and memory requirements, and converges faster than the GA and GP models [14], [15]. PSO also has a simple structure with only a few parameters to adjust. These properties make PSO a useful approach that has been applied to many fields, including feature selection. In comparison, brain storm optimization (BSO) [16] is a new swarm intelligence algorithm inspired by the human brainstorming process. To the best of our knowledge, the effectiveness of BSO in feature selection problems is yet to be investigated, which is the key focus of this paper.
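
The brainstorming process behind BSO (group ideas into clusters, pick a base idea from one or two clusters, perturb it with a step size that decays over iterations, keep improvements) can be sketched for a continuous minimization problem as follows. The fitness-sorted clustering shortcut and all parameter values are our simplifying assumptions, not the original BSO specification in [16].

```python
import numpy as np

def bso_minimize(f, dim, n=30, m=3, iters=100, seed=0):
    """Simplified Brain Storm Optimization sketch (continuous minimization)."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-5, 5, (n, dim))          # initial "ideas"
    fit = np.array([f(x) for x in pop])
    for t in range(iters):
        # 1. Group ideas into m clusters (cheap variant: sort by fitness, split).
        order = np.argsort(fit)
        clusters = np.array_split(order, m)
        for _ in range(n):
            # 2. Pick a base idea: one cluster's best, a random member,
            #    or a combination of ideas from two clusters.
            if rng.random() < 0.8:               # single-cluster path
                c = clusters[rng.integers(m)]
                base = pop[c[0]] if rng.random() < 0.4 else pop[rng.choice(c)]
            else:                                # two-cluster combination
                c1, c2 = rng.choice(m, 2, replace=False)
                r = rng.random()
                base = (r * pop[rng.choice(clusters[c1])]
                        + (1 - r) * pop[rng.choice(clusters[c2])])
            # 3. Perturb with a logsig-scheduled step size that shrinks over time.
            xi = rng.random() / (1 + np.exp((t - 0.5 * iters) / (iters / 10)))
            cand = base + xi * rng.normal(size=dim)
            # 4. Replace the current worst idea if the new one is better.
            worst = fit.argmax()
            fc = f(cand)
            if fc < fit[worst]:
                pop[worst], fit[worst] = cand, fc
    best = fit.argmin()
    return pop[best], fit[best]
```

Because new ideas are generated from members of all clusters rather than from a single global best, the population explores several regions of the search space at once, which is the escape-from-local-optima property discussed later in this section.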

On the other hand, fuzzy ARTMAP (FAM) [17] is a supervised neural network that merges the capability of Adaptive Resonance Theory (ART) [18] in solving the stability–plasticity dilemma with the capability of fuzzy set theory in handling vague and imprecise human linguistic information. FAM is an incremental learning model that operates by comparing the similarity between its existing prototype nodes and each input sample against a threshold. If the similarity test is not satisfied, FAM incrementally adds a new prototype node to its structure to encode the current learning sample, without forgetting or corrupting previously learned samples. It has been used with the GA to solve data classification problems [19], [20], [21].
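
The prototype-matching behaviour described above can be sketched with a minimal Fuzzy ART-style learner. This shows only the unsupervised category core (complement coding, choice function, vigilance test, fast learning) and omits FAM's supervised match-tracking layer; the class name and default parameter values are illustrative assumptions.

```python
import numpy as np

class FuzzyARTSketch:
    """Minimal Fuzzy ART-style incremental learner (illustrative only)."""
    def __init__(self, rho=0.7, alpha=0.001, beta=1.0):
        self.rho, self.alpha, self.beta = rho, alpha, beta  # vigilance, choice, learning rate
        self.w = []                               # prototype weight vectors

    def learn(self, x):
        I = np.concatenate([x, 1.0 - x])          # complement coding keeps |I| constant
        # Choice function ranks existing prototypes by fuzzy-AND overlap.
        order = sorted(range(len(self.w)),
                       key=lambda j: -np.minimum(I, self.w[j]).sum()
                                      / (self.alpha + self.w[j].sum()))
        for j in order:
            # Vigilance test: similarity of sample and prototype vs. threshold rho.
            if np.minimum(I, self.w[j]).sum() / I.sum() >= self.rho:
                # Resonance: update the winning prototype toward the sample.
                self.w[j] = (self.beta * np.minimum(I, self.w[j])
                             + (1 - self.beta) * self.w[j])
                return j
        self.w.append(I.copy())                   # no match: add a new prototype node
        return len(self.w) - 1
```

Note how previously learned prototypes are never overwritten by a dissimilar sample; a new node is created instead, which is the stability side of the stability–plasticity dilemma.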

In this study, we propose a hybrid FAM-BSO model for feature selection. Firstly, FAM is used to learn the input samples incrementally. Then, BSO is adopted to search and select an optimal feature subset. The proposed FAM-BSO model exploits the concept of “open prototype” to keep the most important input features while maximizing classification accuracy. Ten benchmark problems are employed to evaluate the effectiveness of FAM-BSO. The results are quantified statistically using the bootstrap method [22] with 95% confidence intervals. In addition, a real-world human motion recognition problem is used to demonstrate the applicability of FAM-BSO.

Building upon our previously proposed fuzzy-based feature selection and rule extraction model reported in [21], the main contribution of this study is to hybridize a nature-inspired optimization algorithm, BSO, with a neural-fuzzy algorithm, FAM, to undertake feature selection and classification problems. Unlike the GA and PSO models, which use the best solution and the global best solution, respectively, to update individual solutions, the brainstorming process of BSO uses all possible solutions to generate new ones. As compared with the GA and PSO models, this updating mechanism helps BSO escape from local optima. While PSO has a fast convergence rate, it is not suitable for multi-modal optimization problems [23]. Nevertheless, FAM-BSO requires longer execution durations owing to the procedure of clustering its solutions (as shown in Section 5.2). In summary, the main contributions of this paper are two-fold:

  • a hybrid FAM-BSO model that is able to maximize classification accuracy and minimize the number of features;

  • a comprehensive evaluation of FAM-BSO in feature selection and data classification using benchmark and real-world problems, and performance comparison pertaining to the feature selection capability of BSO (which constitutes a new application of BSO) with other EC-based algorithms.

The rest of this paper is organized as follows. Section 2 presents a review of both traditional and EC-based feature selection methods. Section 3 explains the structures of the BSO and Adaptive Resonance Theory (ART) models. The details of the proposed FAM-BSO model are described in Section 4. Section 5 provides the experimental results and discussion. Finally, conclusions and suggestions for further research are given in Section 6.

Related work

This section reviews both traditional and EC-based feature selection methods. Two well-known traditional feature selection methods are SBS [24] and SFS [25]. These methods sequentially remove or add features until no improvement in classification accuracy is observed. Once certain features are removed or selected, they cannot be updated in future steps, which is the main drawback of the SBS and SFS methods. To alleviate this problem, sequential forward floating selection (SFFS) and sequential backward floating selection (SBFS) have been proposed.

The structure and dynamics of BSO and ART models

In this section, we first explain the BSO model. Then, the structures of the ART models, specifically the Fuzzy ART and FAM models, are described in detail.

The FAM-BSO model

The proposed FAM-BSO model contains two main stages, which are the learning stage and feature selection stage. FAM is used in the first stage to learn the training samples, and BSO is employed in the second stage to extract the best feature subset with the aim to increase classification accuracy and reduce model complexity. The details are explained as follows.
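
A common way to fold the two goals of the second stage (higher accuracy, fewer features) into a single fitness value for the optimizer is a weighted sum of the accuracy and the fraction of features discarded. The weight value below is an assumed setting for illustration, not the one used in the paper.

```python
def fitness(accuracy, n_selected, n_total, w=0.9):
    """Higher is better: reward classification accuracy and penalize the
    fraction of features kept. w (assumed, 0 < w < 1) trades off the two goals."""
    return w * accuracy + (1 - w) * (1.0 - n_selected / n_total)
```

With such a fitness, two subsets achieving the same accuracy are ranked by size, so the search is steered toward smaller subsets, matching the stated aim of increasing accuracy while reducing model complexity.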

Experimental studies

To evaluate the effectiveness of FAM-BSO, ten benchmark data sets from the UCI machine learning repository [50] and a real-world case study, i.e., human motion detection [51], are used. The details of the UCI data sets are listed in Table 2. The selected data sets cover different characteristics, including varying numbers of features and samples, which are used to assess the feature selection and classification capabilities of FAM-BSO. The PID samples overlap each other.

Summary

In this paper, a new FAM-BSO model, which is an evolutionary-based feature selection method for data classification, has been proposed. Firstly, FAM was used as the underlying model to learn the training samples. Then, the BSO model was adopted to search for and select the best feature subset that maximizes the classification accuracy while minimizing the number of features. The effectiveness of FAM-BSO has been evaluated using ten benchmark data sets and a real-world case study.

Acknowledgments

This work is partially supported by the National Natural Science Foundation of China (Grant Nos. 61773197, 61772344, 61811530324, 61732011 and 61761136008), the Nanshan District Science and Technology Innovation Bureau (grant No. LHTD20170007), the Science and Technology Innovation Committee of Shenzhen City (Grant Nos. CKFW2016041415372174, GJHZ20170314114424152), and the Natural Science Foundation of Shenzhen University (Grant Nos. 827-000140, 827-000230, and 2017060).

Declaration of competing interest

The authors declare that they have no competing interests.

References (61)

  • Pourpanah, F., et al., A hybrid model of fuzzy ARTMAP and genetic algorithm for data classification and rule extraction, Expert Syst. Appl. (2016)

  • Wang, K.J., et al., An improved artificial immune recognition system with the opposite sign test for feature selection, Knowl.-Based Syst. (2014)

  • Pudil, P., et al., Floating search methods in feature selection, Pattern Recognit. Lett. (1994)

  • Das, A.K., et al., A new hybrid feature selection approach using feature association map for supervised and unsupervised classification, Expert Syst. Appl. (2017)

  • Moradi, P., et al., Integration of graph clustering with ant colony optimization for feature selection, Knowl.-Based Syst. (2015)

  • Sivagaminathan, R.K., et al., A hybrid approach for feature subset selection using neural networks and ant colony optimization, Expert Syst. Appl. (2007)

  • Aghdam, M.H., et al., Text feature selection using ant colony optimization, Expert Syst. Appl. (2009)

  • Davis, R.A., et al., Novel feature selection method for genetic programming using metabolomic 1H NMR data, Chemometr. Intell. Lab. Syst. (2006)

  • Wang, X., et al., Feature selection based on rough sets and particle swarm optimization, Pattern Recognit. Lett. (2007)

  • Huang, C.L., et al., A distributed PSO–SVM hybrid system with feature selection and parameter optimization, Appl. Soft Comput. (2008)

  • Precup, R.E., et al., Nature-inspired optimal tuning of input membership functions of Takagi-Sugeno-Kang fuzzy models for anti-lock braking systems, Appl. Soft Comput. (2015)

  • Unler, A., et al., A discrete particle swarm optimization method for feature selection in binary classification problems, European J. Oper. Res. (2010)

  • Xue, B., et al., Particle swarm optimisation for feature selection in classification: novel initialisation and updating mechanisms, Appl. Soft Comput. (2014)

  • Carpenter, G.A., et al., A massively parallel architecture for a self-organizing neural pattern recognition machine, Comput. Vis. Graph. Image Process. (1987)

  • Carpenter, G.A., et al., Fuzzy ART: fast stable learning and categorization of analog patterns by an adaptive resonance system, Neural Netw. (1991)

  • Carpenter, G.A., et al., ARTMAP: supervised real-time learning and classification of nonstationary data by a self-organizing neural network, Neural Netw. (1991)

  • Tan, C.J., et al., A multi-objective evolutionary algorithm-based ensemble optimizer for feature selection and classification with neural network models, Neurocomputing (2014)

  • Wang, C.H., et al., Improved particle swarm optimization to minimize periodic preventive maintenance cost for series-parallel systems, Expert Syst. Appl. (2011)

  • Pourpanah, F., et al., A Q-learning-based multi-agent system for data classification, Appl. Soft Comput. (2017)

  • Tibshirani, R., Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B Stat. Methodol. (1996)