Elsevier

Information Sciences

Volume 369, 10 November 2016, Pages 347-367

Evolving genetic programming classifiers with novelty search

https://doi.org/10.1016/j.ins.2016.06.044

Highlights

  • Novelty Search is applied for the first time to supervised classification with GP.

  • Two new variants of NS are proposed, overcoming some of its main shortcomings.

  • NS achieves competitive results compared to objective-based search.

  • Results show bloat control properties in binary tasks for NS.

Abstract

Novelty Search (NS) is a unique approach towards search and optimization, where an explicit objective function is replaced by a measure of solution novelty. However, NS has been mostly used in evolutionary robotics, while its usefulness in classic machine learning problems has not been explored. This work presents an NS-based genetic programming (GP) algorithm for supervised classification. Results show that NS can solve real-world classification tasks; the algorithm is validated on real-world benchmarks for binary and multiclass problems. These results are made possible by using a domain-specific behavior descriptor. Moreover, two new versions of the NS algorithm are proposed: Probabilistic NS (PNS) and a variant of Minimal Criteria NS (MCNS). The former models the behavior of each solution as a random vector and eliminates all of the original NS parameters while reducing the computational overhead of the NS algorithm. The latter uses a standard objective function to constrain and bias the search towards high-performance solutions. The paper also discusses the effects of NS on GP search dynamics and code growth. Results show that NS can be used as a realistic alternative for supervised classification, and that, specifically for binary problems, the NS algorithm exhibits an implicit bloat control ability.

Introduction

Evolutionary algorithms (EAs) are a broad family of search and optimization algorithms that are based on a simplified model of Neo-Darwinian evolution [9], achieving impressive results in many domains [15]. The bio-inspired origins of EAs suggest a substantial difference with respect to traditional optimization approaches. However, EAs are guided by an objective function and specially designed search operators just like most optimization algorithms [25]. The use of an objective function in standard EAs is a key difference with respect to natural evolution, which is an open-ended process that lacks a predefined purpose.

There is, however, another approach to building EAs, normally referred to as open-ended artificial evolution. Open-ended algorithms do not use an objective function to drive the search, at least not an explicit one. An important feature of open-ended systems is the continuous emergence of novelty [1]. In fact, some of the earliest EAs were open-ended [6], but they have mostly been used in specialized domains such as artificial life [35] and interactive search [14]. Only recently has open-ended search been proposed to solve mainstream problems; one promising algorithm is Novelty Search (NS), proposed by Lehman and Stanley [19]. NS was conceived to overcome deception in evolutionary robotics (ER) [19], [20], [22], a common issue in most challenging problems [47].

Lehman and Stanley relate deception to problem hardness, stating that “[a] deceptive problem is one in which a reasonable EA will not reach the desired objective in a reasonable amount of time” [22] (p.193). The core idea behind NS is that using an objective function to determine fitness in challenging problems may mislead the search and prevent it from reaching a global optimum. Therefore, NS abandons the objective function as the source of selective pressure and instead bases selective pressure on the novelty or “uniqueness” of each individual, measured on a description of the behavior that the individual exhibits. From the NS perspective, a behavior refers to a description of the interaction between a candidate solution and its domain-specific context.

NS has achieved promising results in different areas of ER [48], such as navigation [11], [19], [20], [21], [22], [44], morphology design [23] and gait control [22]. Despite the growing evidence that NS can be used as an alternative to traditional objective-based search (OS), we conjecture that it is not yet widely used for the following reasons. First, most work on NS has been limited to ER, providing little insight regarding the competence of NS in other areas, particularly in common machine learning problems. Second, NS introduces several additional algorithm parameters that must be heuristically tuned. Third, NS relies on a kernel method to estimate the uniqueness of each new solution based on its dissimilarity with previously generated solutions. Such an approach leads to a high computational overhead, which is normally mitigated with additional heuristics. Finally, NS has been shown to struggle when behavior space is large [13]: the search for specific behaviors in these cases can become very slow while the algorithm explores many uninteresting solutions. To address this problem, Lehman and Stanley proposed an extension to NS called Minimal Criteria NS (MCNS) [21], where a solution is considered to be novel only if it is unique and satisfies some domain-specific minimal criteria, thus reducing the portion of behavior space that is explored.
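The minimal-criteria idea can be illustrated with a short Python sketch. It assumes that novelty scores have already been computed, that the minimal criterion is a lower bound on training accuracy, and that individuals failing the criterion simply receive a novelty of zero. The function name `mcns_scores` and the threshold `min_accuracy` are hypothetical and chosen only for illustration.

```python
import numpy as np

def mcns_scores(novelty, accuracy, min_accuracy=0.5):
    """Return novelty scores filtered by a minimal criterion.

    Assumption of this sketch: an individual that does not reach
    `min_accuracy` is treated as not novel and gets a score of zero.
    """
    novelty = np.asarray(novelty, dtype=float)
    accuracy = np.asarray(accuracy, dtype=float)
    return np.where(accuracy >= min_accuracy, novelty, 0.0)

# Three individuals: the first is the most novel but fails the criterion.
scores = mcns_scores(novelty=[0.9, 0.4, 0.7], accuracy=[0.45, 0.80, 0.62])
print(scores)
```

Under these assumptions the first individual is excluded from selection despite its high novelty, which is how the criterion shrinks the portion of behavior space that is explored.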

The present work builds on previous contributions to extend the NS paradigm. Firstly, we apply NS to supervised classification with genetic programming (GP) and propose a behavior descriptor for evolved GP classifiers, whereas previous works on NS have focused mainly on ER. The NS approach is tested on twelve real-world datasets, considering binary and multiclass problems and using two different GP-based classifiers. Secondly, an extension to the basic NS algorithm is proposed, where the novelty of a solution is estimated probabilistically by modelling each behavior as a random vector. The proposed strategy is called probabilistic novelty search (PNS); it reduces the computational cost of the original NS algorithm and eliminates all the parameters introduced by NS. Thirdly, several NS variants are extensively tested and compared, including NS, MCNS and PNS. Results show that NS-based GP can perform competitively relative to a standard OS, while endowing the search with implicit bloat control in some cases. Preliminary results of this research were presented in [26], [31], [32], [41], [43]; however, those works only studied the general applicability of the original NS algorithm on synthetic pattern recognition problems without considering any algorithmic improvements or real-world scenarios. Nonetheless, those works served as a proof-of-concept for the proposed approach, which is fully explored and evaluated in the current paper. In summary, the work presented here will help establish NS as a viable alternative for GP-based machine learning.
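To give a feel for what a probabilistic novelty estimate could look like, the following Python sketch scores each individual by how improbable its behavior is within the current population. Everything specific here is an assumption of the sketch rather than a detail stated above: behaviors are taken to be binary vectors, their components are treated as independent, probabilities are estimated as frequencies in the current population, and novelty is the negative log-probability of the observed behavior. The function name `probabilistic_novelty` is hypothetical.

```python
import numpy as np

def probabilistic_novelty(behaviors):
    """Score each binary behavior vector by its negative log-probability.

    Assumptions of this sketch: components are independent and their
    probabilities are estimated from the current population only.
    """
    b = np.asarray(behaviors, dtype=float)         # shape (m, n), entries 0 or 1
    p_one = b.mean(axis=0)                         # estimated P(component = 1)
    p_obs = np.where(b == 1, p_one, 1.0 - p_one)   # probability of each observed bit
    # Every observed bit occurs at least once, so p_obs is strictly positive.
    return -np.log(p_obs).sum(axis=1)

population = [[1, 1, 0],
              [1, 1, 0],
              [1, 0, 0],
              [0, 1, 1]]
scores = probabilistic_novelty(population)
print(scores.argmax())  # index of the rarest behavior
```

In this form the score needs no neighbor count and no archive, and its cost grows linearly with the population size, which is in line with the motivation given for PNS.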

The remainder of this paper is organized as follows. Section 2 provides the required background for this work: an overview of GP is given and the concept of behaviors in GP is introduced, discussing how it relates to objective-based fitness and semantics as understood within GP literature. Section 3 describes the NS algorithm and the proposed MCNS variant. Section 4 presents our basic approach towards applying NS with a GP-based classifier. Afterwards, the proposed PNS is described in Section 5. The experimental setup and results are presented in Section 6. Finally, Section 7 contains a summary, conclusions and future work.

Section snippets

Background

This section introduces GP, analyzes the search spaces used by GP, introduces the concept of behavior in GP, and discusses how it can be related with an open-ended search algorithm such as NS.

Novelty search

NS introduces a new perspective to guide an evolutionary search, inspired by the open-ended nature of biological evolution [39]. Lehman and Stanley conjectured that the objective function does not necessarily reward stepping stones in the search space that will ultimately lead to the desired goal, particularly in challenging problems [19]. NS measures progress by focusing on the uniqueness or novelty of each new individual, which is a dynamic measure that depends on the search progress at any

NS for supervised classification

To apply NS successfully, a behavior descriptor must first be proposed [13]. For instance, in a maze navigation problem Lehman and Stanley used the final robot position as the behavior descriptor [19], [20], [21], [22]. In this work, our goal is to apply NS in GP-based classification. In particular, we use two GP classifiers: a simple binary classifier based on a static threshold [40], [50] and a recently proposed multiclass approach [12], [30]. Both algorithms are wrapper methods that evolve

Probabilistic NS

As stated in Section 3, NS suffers from some shortcomings, which are addressed by the proposal developed in this section. In particular, computing novelty using Eq. (1) can lead to several problems. First, it is not evident what the optimal number of neighbors k is for computing sparseness. Second, the computation of novelty based on sparseness has a time complexity of O((m+q)^2), where m is the population size and q is the archive size, which will grow unbounded if it is not implemented as a
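The cost discussed in this snippet can be made concrete with a minimal Python sketch of sparseness, written under the usual NS formulation in which the novelty of an individual is the average distance to its k nearest neighbors among the current population and the archive. The Euclidean distance, the default value of `k` and the function name `sparseness` are assumptions of this sketch, not details taken from Eq. (1).

```python
import numpy as np

def sparseness(population, archive, k=3):
    """Mean distance from each population behavior to its k nearest neighbors.

    Neighbors are drawn from the population plus the archive. Euclidean
    distance is an assumption of this sketch.
    """
    pop = np.asarray(population, dtype=float)                  # shape (m, n)
    arch = np.asarray(archive, dtype=float).reshape(-1, pop.shape[1])
    pool = np.vstack([pop, arch])                              # m + q behaviors
    diff = pop[:, None, :] - pool[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))                    # shape (m, m + q)
    m = pop.shape[0]
    dist[np.arange(m), np.arange(m)] = np.inf                  # ignore self-distance
    k = min(k, pool.shape[0] - 1)
    nearest = np.sort(dist, axis=1)[:, :k]
    return nearest.mean(axis=1)

population = [[0, 0], [0, 1], [1, 0], [10, 10]]
archive = [[1, 1]]
print(sparseness(population, archive, k=2).argmax())  # the isolated behavior
```

The distance matrix grows with both m and q, so an archive that is never pruned makes every generation more expensive, and the result also depends on the choice of k.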

Experimental evaluation

In this section we present an experimental comparison of all the novelty-based variants discussed thus far against a standard OS. The algorithms are compared on real-world classification datasets taken from several public repositories, summarized in Table 1. The first eight datasets in Table 1 are used to pose binary classification problems; datasets with 3 or more classes are divided into several binary classification problems using different combinations of classes (e.g., C1C2,

Summary, conclusions and future work

This work presents the first application of the NS approach to supervised classification with GP and makes several contributions. First, the concept of behavior space is framed as a conceptual middle ground between the well-known concept of objective space and the recently popular semantic space in GP, and can be extended to include both of them. Second, a domain-specific descriptor has been proposed and tested on supervised classification tasks considering real-world data for binary and

Acknowledgements

This research was partially supported by CONACYT Basic Science Research Project No. 178323, DGEST (México) Research Projects No. 5149.13-P and TIJ-ING-2012-110, TecNM (México) Research Projects 5414.14-P and 5621.15-P, as well as by FP7-Marie Curie-IRSES 2013 European Commission program with project ACoBSEC with contract No. 612689. The fourth author acknowledges centre grant (to BioISI, Centre Reference: UID/MULTI/04046/2013) from FCT/MCTES/PIDDAC, Portugal. Finally, first and last authors are

References (50)

  • J. Doucette et al., Novelty-based fitness: An evaluation under the Santa Fe Trail, EuroGP (2010)
  • A. Eiben et al., Introduction to Evolutionary Computing, Natural Computing (2007)
  • J. Gomes et al., Progressive minimal criteria novelty search
  • J.C. Gomes, P. Urbano, A.L. Christensen, Evolution of swarm robotics systems with novelty search, CoRR abs/1304.3362....
  • V. Ingalalli et al., A multi-dimensional genetic programming approach for multi-class classification problems
  • S. Kistemaker et al., Critical factors in the performance of novelty search, Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, GECCO ’11 (2011)
  • T. Kowaliw et al., Promoting creative design in interactive evolutionary computation, IEEE Trans. Evol. Comput. (2012)
  • J. Koza, Human-competitive results produced by genetic programming, Genetic Prog. Evolvable Mach. (2010)
  • J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection (1992)
  • W.B. Langdon et al., Fitness causes bloat, Proceedings of the Second On-line World Conference on Soft Computing in Engineering Design and Manufacturing (1997)
  • P. Larrañaga et al., Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation (2001)
  • J. Lehman et al., Exploiting open-endedness to solve problems through the search for novelty
  • J. Lehman et al., Efficiently evolving programs through the search for novelty, Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation, GECCO ’10 (2010)
  • J. Lehman et al., Revising the evolutionary computation abstraction: Minimal criteria novelty search, Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation (2010)
  • J. Lehman et al., Abandoning objectives: Evolution through the search for novelty alone, Evol. Comput. (2011)