Multi-objective genetic programming for feature learning in face recognition

https://doi.org/10.1016/j.asoc.2021.107152

Highlights

  • A new GP representation to automatically learn effective features for classification

  • A single-objective GP algorithm for face feature learning

  • A single-objective GP algorithm with weighted objectives for face feature learning

  • Two multi-objective GP algorithms for multi-objective image feature learning

Abstract

Face recognition is a challenging task due to large variations in pose, expression, ageing, and illumination. As an effective approach to face recognition, feature learning can be formulated as a multi-objective optimisation task of maximising classification accuracy and minimising the number of learned features. However, most existing algorithms focus on improving classification accuracy without considering the number of learned features. In this paper, we propose new multi-objective genetic programming (GP) algorithms for feature learning in face recognition. To achieve effective face feature learning, a new individual representation is developed to allow GP to select informative regions from the input image, extract features using various descriptors, and combine the extracted features for classification. Two new multi-objective GP algorithms, one based on the idea of non-dominated sorting (NSGPFL) and the other on the idea of Strength Pareto (SPGPFL), are then proposed to simultaneously optimise these two objectives. NSGPFL and SPGPFL are compared with a single-objective GP for feature learning (GPFL), a single-objective GP that weights the two objectives (GPFLW), and a large number of baseline methods. The experimental results show the effectiveness of NSGPFL and SPGPFL, which achieve better or comparable classification performance while learning a small number of features.

Introduction

Face recognition is an active research area in computer vision [1]. The task is to identify the face of a person from a number of face images. Face recognition has a wide range of applications in entertainment, smart cards, information security, law enforcement and surveillance [2]. However, due to the wide variations of pose, illumination, ageing, expression, resolution, and occlusion, face recognition remains a challenging task.

A face recognition system typically has two main stages: a face representation stage and a face matching stage [3]. Face representation aims to extract a set of discriminative features so that the face images can be easily distinguished. Face matching aims to develop effective classifiers that assign the face images to different groups using the extracted features. In general, compared with face matching, face representation has a more significant effect on the recognition/classification performance and is more challenging [4]. Many methods have been developed to obtain an effective face representation, such as scale-invariant feature transform (SIFT) [5], local binary patterns (LBP) [6] and Gabor wavelets [7]. Extracting features with one of these methods often needs human intervention and domain knowledge. Unlike feature extraction, feature learning aims to automatically learn/extract features from images without human intervention or domain knowledge. Many feature learning methods have been proposed in recent years to automatically learn representations of faces and have achieved better performance than methods using manually extracted features [4]. Most feature learning methods are based on neural networks (NNs), which learn representations from raw data using many non-linear layers. However, NN-based methods have limitations: they require a large number of training instances, the learned representation has poor interpretability, the model complexity is fixed, and designing an effective architecture requires rich expertise. Besides NNs, genetic programming (GP) has also been applied to automatically learn features for image classification [8], [9]. GP is an evolutionary algorithm that aims to automatically evolve computer programs to solve a task without any predefined solution structure [10]. Compared with NNs, GP has a flexible representation that can evolve variable-length solutions for a task. In addition, GP can evolve tree-based solutions with high interpretability from a small number of training instances [11]. Many face recognition tasks have only a small number of instances in each class, which makes it difficult to train an effective NN without data augmentation. In contrast, GP solutions often have fewer parameters, so GP can learn features effectively from such a small number of training instances [11]. Therefore, this study develops a new GP-based approach to learning features for face recognition.
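
To make the notion of a tree-based feature-learning program concrete, the following is a minimal sketch of one fixed tree that selects regions, applies descriptors and concatenates the results. The function names (`region`, `hist_descriptor`, `mean_std_descriptor`, `combine`), the two descriptors and all region coordinates are assumptions chosen for illustration; this excerpt does not list the actual function and terminal sets.

```python
import numpy as np

# Hypothetical function set for illustration only; the paper's actual
# functions and terminals are not listed in this excerpt.

def region(image, x, y, w, h):
    """Select a rectangular region; x, y, w, h play the role of terminals."""
    return image[y:y + h, x:x + w]

def hist_descriptor(patch, bins=8):
    """Stand-in descriptor: normalised grey-level histogram of the patch."""
    counts, _ = np.histogram(patch, bins=bins, range=(0, 256))
    return counts / max(patch.size, 1)

def mean_std_descriptor(patch):
    """Second stand-in descriptor that yields two features."""
    return np.array([patch.mean(), patch.std()])

def combine(*feature_vectors):
    """Root node: concatenate the features produced by all branches."""
    return np.concatenate(feature_vectors)

def example_program(image):
    """One fixed tree; GP would evolve the tree shape and arguments instead."""
    return combine(
        hist_descriptor(region(image, 0, 0, 16, 16)),
        mean_std_descriptor(region(image, 8, 8, 20, 12)),
    )

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32))  # synthetic grey-level image
features = example_program(image)
print(features.shape)  # (10,)
```

In this sketch the length of the feature vector (8 histogram bins plus 2 statistics) depends on the shape of the tree, which is why the number of learned features can vary from one evolved program to another.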

The dimension of the features in a face representation is important, and a small number of image features is typically preferred for fast applications. In many traditional face recognition systems, dimensionality reduction methods, such as principal component analysis (PCA) and linear discriminant analysis (LDA), are employed to reduce the dimension of the features [12], [13]. A small number of features can not only shorten the training time of a classification algorithm but also potentially improve interpretability. However, the majority of existing feature learning methods, such as convolutional neural networks (CNNs) and auto-encoders (AEs), focus on improving classification accuracy and ignore the number of learned/extracted features [2], [14]. To address this, it is possible to simultaneously maximise the classification performance and minimise the number of learned/extracted features. These two objectives are potentially conflicting because a small number of features represents limited information about the data and the between-class similarity may be reduced. A simple and straightforward way to deal with these two objectives is to combine them into a single objective using a weighted sum approach [15]. However, it is difficult to set the weights of the two objectives because the optimal number of features for a task is unknown. Instead, the problem can be formulated as a multi-objective optimisation problem and solved directly using an existing multi-objective algorithm. Although many algorithms have been developed for feature learning [8], [16], very few works have focused on multi-objective feature learning that simultaneously maximises the classification performance on the training set and minimises the number of learned/extracted features. This study aims to fill this gap by developing multi-objective GP-based feature learning algorithms for face recognition.
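
The difficulty of choosing weights can be illustrated with a small sketch of a weighted-sum fitness. The formula, the normalisation by `MAX_FEATURES`, the two candidate solutions and the weight values are all assumptions made for illustration; the exact weighting used by GPFLW is not given in this excerpt.

```python
def weighted_fitness(accuracy, n_features, weight, max_features):
    """Scalarise the two objectives; higher is better.

    weight in [0, 1] trades accuracy against the normalised feature count.
    This is an assumed form, not necessarily the one used by GPFLW.
    """
    feature_ratio = n_features / max_features
    return weight * accuracy - (1.0 - weight) * feature_ratio

# Hypothetical candidates: (training accuracy, number of learned features).
candidates = {"large": (0.95, 400), "small": (0.90, 50)}
MAX_FEATURES = 500  # assumed cap; the text notes the true maximum is unknown

def best(weight):
    """Return the name of the candidate preferred under the given weight."""
    return max(
        candidates,
        key=lambda name: weighted_fitness(*candidates[name], weight, MAX_FEATURES),
    )

print(best(0.99))  # large
print(best(0.5))   # small
```

The preferred solution flips as the weight changes, so a single weighted objective commits to one trade-off in advance, whereas a multi-objective formulation keeps both candidates as alternatives.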

In recent years, evolutionary multi-objective optimisation algorithms have been widely applied to many real-world tasks, such as network planning [17], bound-constrained real-world problems [18], and the spread spectrum radar polyphase code design problem [19]. As an evolutionary approach, multi-objective GP has been proposed for symbolic regression and modelling [20] and for the optimisation of morphological filters [21]. Multi-objective GP has also been applied to feature extraction and construction [22], [23], [24]. However, no GP-based algorithms have been developed for multi-objective feature learning. Existing multi-objective GP methods, such as those in [22], [23], focus on maximising the performance and minimising the tree size rather than the number of features. In many GP-based feature learning algorithms, the number of features changes dynamically and automatically during the evolutionary process [16], [25], [26]. These methods may learn a large number of features if the objective places no constraint on the feature number. Multi-objective GP methods have seldom been used to simultaneously maximise the classification performance and minimise the number of learned/extracted features. Feature selection also has these two objectives, and a number of evolutionary multi-objective feature selection algorithms have been proposed, such as those in [27], [28]. However, the maximum number of features is known in feature selection but unknown in feature learning. This difference makes the two tasks, and the behaviours/landscapes of the two objectives, very different. It also makes the multi-objective feature learning task more difficult. Therefore, it is necessary to investigate multi-objective feature learning and develop new multi-objective feature learning algorithms to solve it.

The overall goal of this study is to develop new multi-objective feature learning algorithms for face recognition using GP with the objectives of maximising the classification performance and minimising the number of learned features. To effectively learn features from face images, a new representation, a new function set and a new terminal set will be developed to allow GP to automatically detect regions from the input images, use descriptors to extract features and combine the extracted features for classification. Then we develop two single-objective feature learning algorithms and two multi-objective feature learning algorithms based on GP with the new representation:

  • single-objective GP for feature learning (GPFL)

  • single-objective GP for weighting two objectives (GPFLW)

  • multi-objective GP for feature learning using the idea of non-dominated sorting (NSGPFL)

  • multi-objective GP for feature learning using the idea of strength Pareto (SPGPFL)
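
The two selection ideas named in the last two items can be sketched on a toy population. This is a minimal illustration in the style of the well-known NSGA-II and SPEA2 algorithms, not the NSGPFL or SPGPFL implementation: the population values are invented, the strength is computed over the population only (no external archive), and the crowding-distance and density terms of those algorithms are omitted.

```python
def dominates(a, b):
    """a, b are (accuracy, n_features): maximise accuracy, minimise features."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better

def non_dominated_fronts(points):
    """Peel off successive non-dominated fronts; returns lists of indices."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts

def strength_raw_fitness(points):
    """Strength-Pareto-style raw fitness (lower is better).

    Strength of an individual = number of individuals it dominates.
    Raw fitness = sum of the strengths of all individuals that dominate it,
    so non-dominated individuals get 0.
    """
    n = len(points)
    strength = [sum(dominates(points[i], points[j]) for j in range(n)) for i in range(n)]
    return [
        sum(strength[j] for j in range(n) if dominates(points[j], points[i]))
        for i in range(n)
    ]

# Hypothetical population: (training accuracy, number of learned features).
population = [(0.95, 400), (0.90, 50), (0.85, 60), (0.95, 450), (0.70, 10)]
print(non_dominated_fronts(population))  # [[0, 1, 4], [2, 3]]
print(strength_raw_fitness(population))  # [0, 0, 1, 1, 0]
```

Both views agree on which individuals are non-dominated (indices 0, 1 and 4); they differ in how the dominated individuals are ranked for selection.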

These four algorithms will be examined on four face recognition datasets of different image sizes, numbers of instances and difficulty. Specifically, we will investigate

  1. whether NSGPFL and SPGPFL can achieve better classification performance and learn a smaller number of features than GPFL;

  2. whether NSGPFL and SPGPFL can achieve better classification performance and learn a smaller number of features than GPFLW with different weighting factors;

  3. which of NSGPFL and SPGPFL is better at improving the classification performance and reducing the number of learned features; and

  4. whether NSGPFL and SPGPFL with the new individual representation can achieve better classification performance than 34 non-GP-based baseline methods, including CNN-based methods and methods using well-known face features.

The rest of the paper is organised as follows. Section 2 provides background of this study and reviews typical related work. Section 3 proposes the new representation of GP, new single-objective GP-based feature learning algorithms and new multi-objective GP-based feature learning algorithms. The experimental design is presented in Section 4. Section 5 discusses and analyses the experimental results. The final section presents conclusions and future work.

Section snippets

Background and related work

This section provides background about multi-objective optimisation and GP. It also reviews typical work on face recognition and GP for image feature learning.

The proposed approaches

This section describes the proposed single-objective and multi-objective GP-based feature learning approaches. First, it introduces the new representation, the function set and the terminal set. Second, it presents the four GP-based feature learning algorithms, i.e., single-objective GP for feature learning (GPFL), single-objective GP for feature learning with a weighted objective (GPFLW), multi-objective GP for feature learning using the idea of non-dominated sorting (NSGPFL), and

Datasets

To examine the performance of the proposed approaches, four well-known face recognition datasets of varying difficulty are used for conducting the experiments. The four datasets are ORL [46], Extended Yale B [47], Aberdeen [48], and Faces95 [49]. The details of the four datasets are listed in Table 1. These datasets have different image sizes, numbers of classes, and numbers of training and test instances. It is noted that these four datasets are not extremely large datasets but they

Results and discussions

This section discusses and analyses the performance of GPFL, GPFLWs, NSGPFL, SPGPFL, and the baseline methods on the four different datasets. The classification results and the number of learned features of GPFL, GPFLWs, NSGPFL, and SPGPFL on the four datasets are shown in Table 2 and in Figs. 4 and 5. In Figs. 4 and 5, “NSGPFL-Best” and “SPGPFL-Best” indicate the approximated Pareto fronts of the 30 runs. The comparisons with NSGPFL, SPGPFL and a large number of non-GP-based baseline

Conclusions

The goal of this paper was to develop multi-objective GP-based feature learning algorithms for face recognition by simultaneously optimising the classification performance and the number of learned features. The goal has been successfully achieved by developing the NSGPFL and SPGPFL algorithms with a new representation, a new function set and a new terminal set. With the new representation, the new GP algorithms can automatically select small regions from the input image, select

CRediT authorship contribution statement

Ying Bi: Idea, Algorithm design, Software, Draft writing and revision. Bing Xue: Discussions, Feedback, Comments, Supervision. Mengjie Zhang: Discussions, Feedback, Comments, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported in part by the Marsden Fund of New Zealand Government, New Zealand under Contracts VUW1509, VUW1615, VUW1913 and VUW1914, the Science for Technological Innovation Challenge (SfTI) fund under grant E3603/2903, the University Research Fund at Victoria University of Wellington, New Zealand grant number 223805/3986, MBIE Data Science SSIF Fund, New Zealand under the contract RTVU1914, and National Natural Science Foundation of China (NSFC), China under Grant 61876169.

References (60)

  • Jain A.K. et al. Handbook of Face Recognition (2011)
  • Zhao W. et al. Face recognition: A literature survey. ACM Comput. Surv. (2003)
  • Ding C. et al. Multi-directional multi-level dual-cross patterns for robust face recognition. IEEE Trans. Pattern Anal. Mach. Intell. (2015)
  • Lu J. et al. Learning compact binary face descriptor for face recognition. IEEE Trans. Pattern Anal. Mach. Intell. (2015)
  • Lowe D.G. Distinctive image features from scale-invariant keypoints. Proc. Int. J. Comput. Vis. (2004)
  • Ahonen T. et al. Face description with local binary patterns: Application to face recognition. IEEE Trans. Pattern Anal. Mach. Intell. (2006)
  • Shen L. et al. A review on Gabor wavelets for face recognition. Pattern Anal. Appl. (2006)
  • Al-Sahaf H. et al. A survey on evolutionary machine learning. J. R. Soc. N.Z. (2019)
  • Bi Y. et al. A survey on genetic programming to image analysis. J. Zhengzhou Univ. (Eng. Sci.) (2018)
  • Koza J.R. Genetic Programming: On the Programming of Computers by Means of Natural Selection (1992)
  • Al-Sahaf H. et al. Keypoints detection and feature extraction: A dynamic genetic programming approach for evolving rotation-invariant texture image descriptors. IEEE Trans. Evol. Comput. (2017)
  • Turk M. et al. Eigenfaces for recognition. J. Cogn. Neurosci. (1991)
  • Belhumeur P.N. et al. Eigenfaces vs. fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. (1997)
  • Yi D. et al. Learning face representation from scratch (2014)
  • Antonio L.M. et al. Coevolutionary multiobjective evolutionary algorithms: Survey of the state-of-the-art. IEEE Trans. Evol. Comput. (2017)
  • Bi Y. et al. Genetic programming with image-related operators and a flexible program structure for feature learning to image classification. IEEE Trans. Evol. Comput. (2021)
  • Shao L. et al. Feature learning for image classification via multiobjective genetic programming. IEEE Trans. Neural Netw. Learn. Syst. (2014)
  • Liang Y. et al. Figure-ground image segmentation using feature-based multi-objective genetic programming techniques. Neural Comput. Appl. (2019)
  • Y. Bi, B. Xue, M. Zhang, Automatically extracting features for face classification using multi-objective genetic...
  • Bi Y. et al. An evolutionary deep learning approach using genetic programming with convolution operators for image classification
