Genetic programming for feature extraction and construction in image classification

https://doi.org/10.1016/j.asoc.2022.108509

Highlights

  • A new GP representation to get an effective combination of features and classifiers.

  • A new feature extraction layer to effectively extract various image features.

  • A new feature construction layer to produce high-level features.

  • A new mutation operator for dynamically adjusting GP program sizes.

Abstract

Genetic Programming (GP) has been successfully applied to image classification and achieved promising results. However, most existing methods either address binary image classification tasks only or need a predefined classifier to perform multi-class image classification while using GP for feature extraction. This limits their flexibility since it is unknown which combinations of classifiers and features are the most effective for an image classification task. Furthermore, high image variations increase the difficulty of feature extraction and image classification. This paper proposes a GP approach with a new program representation, new functions, and new terminals. The new approach can conduct feature extraction, feature construction, and classification automatically and simultaneously. It can extract and construct informative image features, select a suitable classification algorithm instead of relying on a predefined classifier, and perform classification for binary and multi-class image classification tasks. In addition, this paper develops a new mutation operator based on the fitness of the population for dynamically adjusting the size of the evolved GP programs. The experimental results on eight datasets with different variations and difficulties show that the proposed approach achieves higher classification accuracy than most of the benchmark methods. Further analysis shows that the GP-evolved programs have appropriate tree sizes and potentially high interpretability.

Introduction

Computer vision aims to train computers to interpret the visual world in the same way that humans do [1]. Image classification is a promising area within computer vision, with many real-world applications such as self-driving, face recognition, medical diagnosis, and biological identification [2], [3]. The task of image classification is to classify images into different groups according to the content in images. Due to the high dimensionality and variations of the image data, it is challenging to develop an approach that is able to capture useful information from images and then conduct classification effectively.

An image is often composed of many thousands of pixels, many of which carry redundant or irrelevant information that makes high classification accuracy difficult to achieve. The aim of feature extraction is to transform raw image pixels into informative features, which can effectively reduce image dimensionality. Effective feature extraction methods can improve the performance of an image classification system. A variety of feature extraction methods have been developed to capture useful features from images, such as local binary pattern (LBP) [4], scale-invariant feature transform (SIFT) [5], and histogram of oriented gradients (HOG) [6]. Such features are capable of reflecting the interesting parts of an image by capturing the salient information within it, e.g., edges, textures, and orientations [7]. However, it is difficult to achieve promising results on complex image classification tasks by using a single feature extraction method, since a single type of feature might not effectively represent the whole image. Furthermore, high image variations in scale, illumination, rotation, and background increase the difficulty of extracting useful features for classification. The currently popular image classification methods based on deep convolutional neural networks (CNNs) can learn informative image features for classification, but these methods might require rich expertise to design the architecture for a particular task, and need a large number of training instances and computational resources [8].
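To make the idea of turning raw pixels into a compact descriptor concrete, the following is a minimal sketch of the basic 3x3 LBP operator in plain Python. It assumes a grayscale image given as a list of rows and skips border pixels; the function name `lbp_histogram` and the toy input are illustrative assumptions, not the implementation used in the paper.

```python
def lbp_histogram(image):
    """Return a normalized 256-bin histogram of basic 3x3 LBP codes.

    `image` is a list of equal-length rows of grayscale intensities.
    Border pixels are skipped because they lack a full neighborhood.
    """
    # Clockwise neighbor offsets, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(image), len(image[0])
    hist = [0] * 256
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbor at least as bright as the center sets its bit.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist


# Toy input (assumption): a uniform 4x4 patch, so every interior
# pixel has all eight neighbors equal to the center.
flat = [[5] * 4 for _ in range(4)]
features = lbp_histogram(flat)
```

Whatever the image size, the descriptor has 256 entries, which illustrates how feature extraction reduces dimensionality. Practical LBP variants (uniform patterns, larger radii) differ in detail from this sketch.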

Feature construction produces new high-level features from existing features that are already potentially useful. For instance, new features can be created by adding LBP features and SIFT features together. The aim is to build more effective features for better describing images. Features captured by employing one type of feature extraction method, such as texture feature extraction, might not be effective for representing the overall variations of complex image data. Feature construction is capable of building multiple high-level image features for capturing different patterns of an image, which can potentially improve the classification accuracy. However, domain knowledge is often required to determine which features should be selected and how to construct new features from the selected features when dealing with different classification tasks such as facial expression classification and texture classification.
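As a small illustration of combining existing features, the sketch below shows two simple construction operators on feature vectors: concatenation and element-wise addition. Which operation the phrase "adding together" refers to is not stated here, so both are shown as assumptions; the function names and the toy vectors are hypothetical.

```python
def concatenate(first, second):
    """Build a longer feature vector that keeps both inputs side by side."""
    return list(first) + list(second)


def add_elementwise(first, second):
    """Build a same-length vector by summing matching positions."""
    if len(first) != len(second):
        raise ValueError("element-wise addition needs equal-length vectors")
    return [a + b for a, b in zip(first, second)]


# Toy vectors standing in for two extracted descriptors (assumption).
texture_features = [0.5, 0.25]
gradient_features = [1.0, 2.0]

joined = concatenate(texture_features, gradient_features)
summed = add_elementwise(texture_features, gradient_features)
```

Concatenation preserves all information but grows the dimensionality, whereas element-wise addition keeps the size fixed and requires the two descriptors to have matching lengths.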

One promising technique for image classification is genetic programming (GP) [9], which uses evolutionary principles to automatically evolve/learn computer programs (solutions) to conduct image classification. GP is well known for its good global search ability, flexible variable-length representation, potentially high interpretability of the evolved solutions, and ability to learn from a relatively small number of training instances [8], [10], [11]. Through its evolutionary operators, i.e., crossover and mutation, GP is likely to find solutions that domain experts might not devise. Thus, GP has been successfully applied to image classification and achieved promising results [12].

The representation of GP programs is typically a tree-based structure, where leaf nodes called terminals can be raw pixels, image features, or constants, and internal and root nodes contain the potential operations/functions, such as feature extraction. With a flexible tree-based representation, multiple existing feature extraction methods can be employed as GP functions to allow it to automatically learn informative features for classification. For example, LBP, HOG, and SIFT can be the functions of the evolved GP programs to capture meaningful image features [8], which are invariant to certain variations such as scale and rotation. However, some existing GP-based methods only deal with binary image classification tasks [10], [13], [14]. In these methods, the output of GP trees is a floating-point number, which can be easily used for binary classification. However, using a single feature is often not effective for dealing with complex image classification tasks. Furthermore, multi-class image classification tasks are more common than binary classification in the real world. To address these issues, some methods employ GP for extracting informative features, and these extracted features are then used as the inputs of a classification algorithm such as 1-nearest neighbor (1NN) and support vector machine (SVM) for classification [8], [11], [15]. However, the combination of the features and classifiers produced by these methods might not be the most effective when tackling different classification tasks such as scene classification and texture image classification. Moreover, integrating the existing feature extraction methods into a GP program to automatically learn useful features might not be sufficient. Designing new functions of GP that allow it to construct informative high-level features using extracted features for classification is desirable but this is challenging. 
Therefore, this paper aims to propose a GP-based approach with a new representation for both binary and multi-class classification, which can perform feature extraction, feature construction, and classification automatically and simultaneously.
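The tree-based representation described above can be sketched as follows. This is a toy model under stated assumptions, not the GPRM representation: trees are nested tuples, the only terminal is the string "image", the function set (`mean`, `range`, `concat`) is hypothetical, and a 1NN classifier is applied to the features the tree outputs.

```python
import math

# Hypothetical function set: extraction nodes map an image to a short
# feature vector, and the construction node joins two feature vectors.
FUNCTIONS = {
    "mean": lambda img: [sum(map(sum, img)) / (len(img) * len(img[0]))],
    "range": lambda img: [max(map(max, img)) - min(map(min, img))],
    "concat": lambda a, b: a + b,
}


def evaluate(node, image):
    """Evaluate a tree given as the terminal "image" or (name, *children)."""
    if node == "image":
        return image
    name, *children = node
    return FUNCTIONS[name](*(evaluate(child, image) for child in children))


def predict_1nn(tree, train, image):
    """Label `image` with the class of its nearest training example."""
    query = evaluate(tree, image)
    nearest = min(train,
                  key=lambda pair: math.dist(evaluate(tree, pair[0]), query))
    return nearest[1]


# Toy tree: two extraction nodes feeding one construction node.
tree = ("concat", ("mean", "image"), ("range", "image"))
dark = [[0, 1], [1, 0]]
bright = [[9, 8], [8, 9]]
train = [(dark, "dark"), (bright, "bright")]
label = predict_1nn(tree, train, [[7, 9], [8, 8]])
```

In this sketch the classifier is fixed outside the tree, which corresponds to the existing methods discussed above; letting the tree itself select the classifier is the kind of flexibility the proposed approach targets.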

One of the problems for GP is that the size of an evolved program can become very large without significantly improving its fitness. This phenomenon is called bloat [16]. Such GP programs might lead to overfitting problems and high computational costs. Many methods tackle this problem by limiting GP tree sizes within a predefined range [8], [10]. However, restricting GP programs to a limited size is not effective, since some difficult image classification tasks may need large, complex solutions to achieve better performance [17]. Therefore, this paper develops a new mutation operator based on the fitness of the population for dynamically adjusting the growth or shrinkage of GP programs during the evolutionary process.
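The details of the proposed mutation operator are not given in this section, so the sketch below only illustrates the general idea of letting population fitness decide whether a program grows or shrinks. The rule used here (grow a program whose fitness is below the population mean, shrink it otherwise) is purely an assumption for illustration, as are all function and node names.

```python
import random


def size(tree):
    """Count nodes in a tree written as a terminal string or (name, *children)."""
    if isinstance(tree, str):
        return 1
    return 1 + sum(size(child) for child in tree[1:])


def grow(tree, rng):
    """Wrap the tree in a new unary function node, adding one node."""
    return (rng.choice(["sqrt", "abs"]), tree)


def shrink(tree, rng):
    """Replace the tree by one of its subtrees, removing at least one node."""
    if isinstance(tree, str):
        return tree
    return rng.choice(tree[1:])


def adaptive_mutation(tree, fitness, population_fitness, rng):
    """Assumed rule: below-average programs grow, the others shrink."""
    mean_fitness = sum(population_fitness) / len(population_fitness)
    if fitness < mean_fitness:
        return grow(tree, rng)
    return shrink(tree, rng)


rng = random.Random(0)
tree = ("add", ("lbp", "image"), ("hog", "image"))
population_fitness = [0.4, 0.6, 0.8]
weak_child = adaptive_mutation(tree, 0.4, population_fitness, rng)
strong_child = adaptive_mutation(tree, 0.8, population_fitness, rng)
```

Unlike a fixed depth or size limit, a rule of this kind lets program size follow the search dynamics, which is the motivation stated above.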

The overall goal of this paper is to address the above-discussed limitations by proposing a new GP approach with a new representation and a new mutation operator for (binary and multi-class) image classification tasks. The new approach is called GPRM for short. The evolved GP program can automatically perform feature extraction, feature construction, and classification, generating an effective combination of image features and classifiers. Furthermore, a new mutation operator will be developed to evolve GP programs with a suitable size. Specifically, this paper will answer the following questions.

(1) How to develop a GP program structure to automatically and simultaneously perform feature extraction, feature construction, and classification in the GPRM approach?

(2) What functions and terminals of GPRM are good for extracting useful and informative image features?

(3) How to develop a new mutation operator to dynamically adjust the size of the evolved GP programs during the evolutionary process?

(4) Can GPRM achieve better performance than other competitive image classification methods?

(5) Can the evolved GP programs be interpretable?

The remainder of this paper is organized as follows. In Section 2, we present the background of this study and review typical related work. Section 3 proposes a new image classification approach using GP with a new representation and a new mutation operator. Experiment design is described in Section 4. Section 5 reports and discusses experimental results. Section 6 presents the conclusion and future work.

Section snippets

Background and related work

This section introduces the essential background and basic concepts of image features and some commonly used feature extraction methods. Then, it reviews related work on image classification using GP. The limitations of these methods are discussed, showing the motivations of the proposed approach.

The proposed approach

In this section, the proposed GPRM approach is discussed in detail. The overall process of the proposed image classification system is first introduced. Then, this section analyzes the new GP representation, including the new GP program structure, the new function set, and the new terminal set. Finally, a new mutation operator based on the fitness of the population is developed for dynamically adjusting the size of evolved GP programs.

Experiment design

This section presents the design of experiments containing benchmark datasets, benchmark methods, and parameter settings.

Results and discussions

This section analyzes the experimental results of GPRM on the eight benchmark datasets by comparing it with the benchmark methods, including classification performance and computational cost.

Further analysis

In this section, we analyze why the GPRM approach achieves better performance through some evolved example programs and demonstrate the effectiveness of the proposed mutation operator.

Conclusions and future work

The goal of this paper was to develop a new GP-based approach for feature extraction and feature construction in binary and multi-class image classification, which has been achieved by developing the GPRM approach with a new program structure, a new function set, and a new terminal set. The GPRM approach can automatically learn useful features and select the most effective classifiers for different image classification tasks. GPRM achieved better performance on eight image classification

CRediT authorship contribution statement

Qinglan Fan: Idea, Algorithm design, Software, Writing – original draft, Writing – review & editing. Ying Bi: Discussions of algorithm design, Results, Writing – review & editing, Comments, Supervision. Bing Xue: Discussions of results, Writing – review & editing, Supervision, Project administration, Funding acquisition. Mengjie Zhang: Discussions of results, Writing – review & editing, Supervision, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the Marsden Fund of New Zealand Government, New Zealand under Contracts VUW1913 and VUW1914, the Science for Technological Innovation Challenge (SfTI) fund under contract 2019-S7-CRS, the University Research Fund at Victoria University of Wellington, New Zealand grant number 223805/3986, MBIE Data Science SSIF Fund, New Zealand under the contract RTVU1914, and National Natural Science Foundation of China (NSFC), China under Grant 61876169.

References (45)

  • Koza J.R. Genetic Programming: On the Programming of Computers by Means of Natural Selection (1992)
  • Shao L. et al. Feature learning for image classification via multiobjective genetic programming. IEEE Trans. Neural Netw. Learn. Syst. (2014)
  • Gandomi A. et al. Handbook of Genetic Programming Applications (2015)
  • Atkins D. et al. A domain independent Genetic Programming approach to automatic feature extraction for image classification
  • Bi Y. et al. An automatic feature extraction approach to image classification using genetic programming
  • Suganuma M. et al. Hierarchical feature construction for image classification using Genetic Programming
  • Poli R. et al. A Field Guide to Genetic Programming (2008)
  • Price S.R. et al. GOOFeD: Extracting advanced features for image classification via improved genetic programming
  • Cui F. et al. Edge feature extraction based on digital image processing techniques
  • Vadakkenveettil B. Grey level co-occurrence matrices: Generalisation and some new features. Int. J. Comput. Sci. Eng. Inf. Technol. (2012)
  • Liu C. et al. Gabor feature based classification using the enhanced fisher linear discriminant model for face recognition. IEEE Trans. Image Process. (2002)
  • Yang M. et al. A survey of shape feature extraction techniques. Pattern Recognit. (2008)