A survey of techniques for characterising fitness landscapes and some possible ways forward
Introduction
Metaheuristics have become popular for solving complex optimisation problems where classical optimisation methods are either infeasible or perform poorly. Despite many success stories, it is well known that these techniques fail on some problems, and there is in fact very little understanding of which algorithms, or algorithm variants, are more suitable for solving which kinds of problems. Nor is any one optimisation algorithm superior to all others on every problem. This was shown theoretically by Wolpert and Macready with their famous ‘No-Free-Lunch’ theorems for search/optimisation [87], [88]. In the case of simple hill-climbing algorithms it is relatively straightforward to estimate which problems will be easy and which will be harder to solve. In the case of more complex metaheuristics, however, it is not as easy to predict the degree to which problems will present difficulties for algorithms. As expressed by Culberson [12]: “The researcher trying to solve a problem is then placed in the unfortunate position of having to find a representation, operators and parameter settings to make a poorly understood system solve a poorly understood problem. In many cases he might be better served concentrating on the problem itself”. This article focuses on ways of better understanding problems, in the hope that practitioners and researchers will have better guidance in choosing appropriate algorithms.
Many attempts at characterising optimisation problems have focused on finding a measure that could divide problems into those that are easy and those that are hard to solve [30], [48], [23]. These attempts have not been very successful. In the literature, whenever a publication appears proposing some measure of problem hardness, a number of subsequent publications can usually be found with counter-examples for which the proposed hardness measure does not hold. Some authors even provide counter-examples to their own techniques, pre-empting the inevitable ‘counter-paper’.
Much of the earlier work on predicting problem hardness assumed genetic algorithms, giving rise to the notions of GA-hard and GA-easy problems [14], [28], [33], [47]. As pointed out by Guo and Hsu [23], “Any efforts like this are doomed to fail”, because the class of genetic algorithms is too broad: the same problem can change from hard to easy when the GA settings are changed. If finding a GA hardness measure is infeasible, then surely finding a general problem hardness measure is infeasible too? Even assuming such a general difficulty measure could be found, He et al. [24] have proved that a predictive version of such a measure, i.e. one that runs in polynomial time, cannot exist (unless P = NP or BPP = NP). The general agreement in the literature seems to be that no satisfactory problem difficulty measure for search heuristics has been found [30], [23], [24].
A possible reason for this is that although there are many factors (such as deception, ruggedness, and non-linear separability) that clearly affect problem difficulty, no one factor appears to be necessary or sufficient for characterising problem hardness. For example, modality, although an important consideration, cannot be used as the only estimate of complexity for search algorithms. Horn and Goldberg [28] show that there are problems with minimal modality (such as long path problems) that are hard for a GA to optimise and that there are problems with maximum modality (such as their one-max function with “bumps”) that are easy for a GA to optimise. Kallel et al. [34] confirm that multimodality is neither necessary nor sufficient as a predictor of difficulty for both hill-climbers and genetic algorithms. Forrest and Mitchell [18] studied GA failure, showing that some GA-deceptive problems are easy for a GA and that some non-deceptive problems are difficult for a GA. They conclude that GA-deception is only one factor that contributes to the difficulty of search for a GA. It has therefore become widely accepted, as expressed by Smith et al. [64], that “No single measure or description can possibly characterise any high-dimensional heterogeneous search space”.
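To make the notion of modality concrete, the following minimal sketch counts strict local optima of a bit-string fitness function by exhaustive enumeration under a single-bit-flip neighbourhood. It is an illustration only: the function names, the neighbourhood choice, and in particular `bumpy_onemax` (a hypothetical one-max variant with a bonus for even numbers of ones) are assumptions made here and do not reproduce the constructions of Horn and Goldberg [28].

```python
from itertools import product


def onemax(bits):
    """Plain one-max: fitness is the number of ones in the string."""
    return sum(bits)


def bumpy_onemax(bits):
    """Hypothetical 'bumpy' variant (an assumption for illustration):
    one-max plus a bonus of 1.5 when the number of ones is even."""
    ones = sum(bits)
    return ones + (1.5 if ones % 2 == 0 else 0.0)


def count_local_optima(fitness, n):
    """Count strict local maxima over all 2**n bit strings, where the
    neighbours of a string are the n strings one bit flip away."""
    count = 0
    for bits in product((0, 1), repeat=n):
        f = fitness(bits)
        is_optimum = True
        for i in range(n):
            neighbour = bits[:i] + (1 - bits[i],) + bits[i + 1:]
            if fitness(neighbour) >= f:
                is_optimum = False
                break
        if is_optimum:
            count += 1
    return count


if __name__ == "__main__":
    n = 8
    print("one-max local optima:", count_local_optima(onemax, n))
    print("bumpy one-max local optima:", count_local_optima(bumpy_onemax, n))
```

Under these assumptions plain one-max has a single optimum, whereas in the bumpy variant every string with an even number of ones is a strict local optimum. The count alone says nothing about how hard either function is for a particular algorithm, which is the point made above. Exhaustive enumeration is only feasible for small `n`.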
Instead of trying to find one measure of hardness, a more realistic approach could be to determine the characteristics of a problem and then use these characteristics to determine which algorithm would be best suited to solving that problem. What is hard for a Particle Swarm Optimisation (PSO) algorithm to solve might not necessarily be hard for a Genetic Algorithm (GA) to solve, or even for a PSO with different parameter settings. It is hoped that, by analysing problems in more depth, it will become possible to distinguish problems based on their characteristics.
This paper addresses the topic of characterising optimisation problems. The aims are, firstly, to discuss characteristics of problems that could potentially make them hard to solve and, secondly, to provide an overview of existing techniques for analysing these problem characteristics. Section 2 starts with an overview of different views of fitness landscapes. Although the term ‘fitness landscape’ is used frequently in the literature, it can have different meanings in different contexts, and these are elaborated on in Section 2. Section 3 provides a summary of different features of optimisation problems that could potentially affect the difficulty of solving the problem. The most important contribution of this work is in Section 4, which surveys existing techniques for characterising optimisation problems from the 1980s to the present. Important features of each technique are highlighted, such as the focus, the level of search independence, the assumptions on which the technique is based, and the result produced by the technique. The paper concludes with suggestions on how research in this area can move forward.
Fitness landscapes
For most optimisation problems there is a fitness function that reflects the objectives of the problem to be solved. (Problems that do not have a readily available fitness function are excluded from this study.) Potential solutions to a problem are compared
Features of fitness functions and landscapes
This section summarises a number of features of optimisation problems that could influence the ability of algorithms to solve the problems. The features listed are not in any way exhaustive. There may be features not known or not mentioned here that could influence the behaviour of optimisation algorithms. The purpose is to summarise those features that are commonly discussed in the literature. Measuring or quantifying these features is not always straightforward and this is discussed further in
Measures and techniques for analysing fitness landscapes
For low dimensional problems, the associated fitness landscape could be visualised. A graphical representation could then give some indication of the features of the problem to be solved. Two problem landscapes could be compared in terms of ruggedness, deception, neutrality, etc. simply through visual inspection. In reality, however, problems are too complex to be visualised, so some other way of analysing problem characteristics is needed. The ideal would be to have a single numerical measure
Discussion
The aim of this study was to make sense of the body of work outlined in Table 1 in order to better utilise these techniques in practical ways. This section highlights what the survey reveals: where the focus has been, where the gaps are and possible ways in which techniques can be adapted to be more usable or relevant. The main points of the discussion in this section are summarised as possible ways forward in Table 2.
Conclusion
This paper provides a survey of existing techniques for characterising problems. Each technique is described in terms of the focus (what is measured), the level of search independence, assumptions on which the technique is based, and the result produced. The survey reveals how the focus has changed over the last two decades. Some characteristics, such as ruggedness, are the focus of many different techniques, but others, such as symmetry, are not well represented. Suggestions are made for ways
References (91)
Evolutionary algorithms in noisy environments: theoretical issues and guidelines for practice. Comput. Methods Appl. Mech. Eng. (2000)
Epistasis variance: a viewpoint on GA-hardness. Found. Genetic Algorithms (1991)
et al. Algorithmic information theory
et al. Genetic algorithm difficulty and the modality of fitness landscapes
et al. Towards a general theory of adaptive walks on rugged landscapes. J. Theor. Biol. (1987)
et al. Neutrality in fitness landscapes. Appl. Math. Comput. (2001)
Re-evaluating genetic algorithm performance under coordinate rotation of benchmark functions. A survey of some theoretical and practical aspects of genetic algorithms. BioSystems (1996)
et al. Evaluating evolutionary algorithms. Artif. Intell. (1996)
The evolution of evolvability in genetic programming
Fitness distance correlation analysis: an instructive counterexample