Towards fast approximations for the hypervolume indicator for multi-objective optimization problems by Genetic Programming

https://doi.org/10.1016/j.asoc.2022.109103

Highlights

  • A model that approximates the Hypervolume indicator is obtained using GP.

  • Models are highly efficient, orders of magnitude faster in total computation time.

  • Models are highly accurate, and can be used to guide an indicator-based MOEA.

Abstract

Hypervolume (HV) has become one of the most popular indicators to assess the quality of Pareto front approximations. However, the best algorithm for computing these values has a computational complexity of $O(N^{k/3}\,\mathrm{polylog}(N))$ for N candidate solutions and k objectives. In this study, we propose a regression-based approach to learn new mathematical expressions that approximate the HV value while improving computational efficiency. In particular, Genetic Programming is used as the modeling technique, because it can produce compact and efficient symbolic models. To evaluate this approach, we exhaustively measure the deviation of the new models from the real HV values using the DTLZ and WFG benchmark suites. We also test the new models by using them as a guiding mechanism within the indicator-based algorithm SMS-EMOA. The results are consistent and promising: the new models report very low errors and a high correlation for problems with 3, 4, and 5 objectives. More striking is the execution time achieved by these models. In a direct comparison against the standard HV calculation, they achieve speedups of close to 100X for a single front and over 1000X for all the HV contributions in a population. In full runs of SMS-EMOA, speedups reach over 10X compared with the standard Monte Carlo approximations of the HV, particularly for large population sizes. Finally, the evolved models generalize across multiple complex problems: they are trained using only two problems from the DTLZ benchmark and perform efficiently and effectively on all remaining DTLZ and WFG benchmark problems.

Introduction

Real-world problems often require the simultaneous optimization of several competing objectives, leading to multi-objective optimization problems (MOPs). One important characteristic of MOPs is that their solution sets, the so-called Pareto sets, as well as their images, the Pareto fronts, typically form objects of dimension $(k-1)$, where $k$ is the number of objectives considered in the given problem. For the numerical treatment of such problems, multi-objective evolutionary algorithms (MOEAs) have caught many researchers’ and practitioners’ interest during the last two decades. Reasons for this include that MOEAs are global in nature, very robust, require minimal assumptions on the model, and are capable of computing finite-size approximations of the entire Pareto set/front of the given MOP in a single run of the algorithm. Since the outcome of every MOEA is an entire set of candidate solutions (population) that ideally resembles the solution set (mainly the Pareto front), one question that naturally arises is how to measure the obtained approximation quality. This is needed to compare different solution sets and guide the MOEA towards the “best” Pareto front approximation.

One performance indicator that is widely used is the Hypervolume indicator (HV, [1], [2]). Although this indicator has several valuable properties [2], [3], [4], it has one critical weakness: the cost of evaluating the HV value of a given candidate set grows exponentially with the number of objectives. That is, while this cost is relatively low for bi-objective problems (compared to the overall cost of a MOEA), the computation of the HV values becomes the bottleneck for MOPs with more objectives, which represents a severe drawback for the applicability of the HV in modern applications. As decision-making processes become more sophisticated, the related MOPs naturally involve a growing number of objectives. This affects not only the quality assessment of a given solution set, but also the correct functioning of those MOEAs that rely on computing thousands of HV values and HV contributions (i.e., the contribution of an individual of a given population to the HV value) within one run of the algorithm. A straightforward computation of the HV value of a given set S of size N has a complexity of $O(N^{k+1})$, while the best algorithm has a complexity of $O(N^{k/3}\,\mathrm{polylog}(N))$. The literature reports several methods that aim to reduce the computational cost of the HV. For instance, some works propose algorithms that reduce the complexity of the computation for specific cases [3], [5]; others employ techniques to approximate the values of the Hypervolume [6]; and finally, algorithms specialized in the Hypervolume contributions have also been proposed [7], [8], [9].
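To make these quantities concrete, the following is a minimal sketch, not the implementation used in this work. It assumes minimization, a user-chosen reference point `ref`, and, for the exact routine, only two objectives. The function names `hv_2d`, `hv_monte_carlo`, and `hv_contributions` are hypothetical and chosen for illustration; the sampling box corner `ideal` is likewise an assumption and must lie at or below every point in each objective.

```python
import random

def hv_2d(points, ref):
    """Exact hypervolume of a 2-objective minimization set w.r.t. `ref`."""
    # Only points that are strictly better than the reference point count.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    volume, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:                 # sweep in increasing f1
        if f2 < best_f2:               # non-dominated so far: adds a new slab
            volume += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return volume

def hv_monte_carlo(points, ref, ideal, n_samples=20000, seed=0):
    """Monte Carlo estimate of the hypervolume for any number of objectives.

    Samples uniformly in the box [ideal, ref] and counts the samples that
    are weakly dominated by at least one point of the set.
    """
    rng = random.Random(seed)
    box = 1.0
    for lo, hi in zip(ideal, ref):
        box *= hi - lo
    hits = 0
    for _ in range(n_samples):
        z = [rng.uniform(lo, hi) for lo, hi in zip(ideal, ref)]
        if any(all(pi <= zi for pi, zi in zip(p, z)) for p in points):
            hits += 1
    return box * hits / n_samples

def hv_contributions(points, ref, hv=hv_2d):
    """Exclusive contribution of each point: HV(S) - HV(S without the point)."""
    total = hv(points, ref)
    return [total - hv(points[:i] + points[i + 1:], ref)
            for i in range(len(points))]

front = [(1, 3), (2, 2), (3, 1)]
ref = (4, 4)
exact = hv_2d(front, ref)                       # 6.0
approx = hv_monte_carlo(front, ref, ideal=(1, 1))
contribs = hv_contributions(front, ref)         # [1.0, 1.0, 1.0]
```

Note that `hv_contributions` calls the HV routine once per individual, which illustrates why contribution-based selection multiplies the cost of a single HV evaluation by the population size.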

In this work, we propose using a machine learning regression technique that can produce relatively simple and efficient models that approximate the Hypervolume indicator’s behavior. The goal is to approximate the real indicator value, with minimal deviation, for any given problem. The modeling strategy considered is Genetic Programming (GP), which can produce models expressed as symbolic mathematical expressions. The GP system is set up to obtain efficient and straightforward models, avoiding unnecessarily large or complex structures. Thus, the main advantage of the resulting expressions is their low computational complexity (most run in linear time), which significantly reduces computation time while preserving the quality of the obtained approximation.

Accordingly, we can summarize the main contributions of this study as follows.

  • We pose the problem of deriving approximate models of the HV indicator as a supervised learning problem that we approach through GP regression.

  • We show that the learned models are highly efficient, particularly when combined with an adequate updating process, achieving large speedups relative to the state-of-the-art.

  • We present results that show that the evolved models effectively approximate the HV indicator, allowing them to be used in two common scenarios: (1) as quality indicators and (2) to guide the search of an indicator-based MOEA, with both tasks tested on 3-, 4-, and 5-objective MOPs.

  • The evolved models are quite general, since models trained on two benchmarks can be used to guide an indicator-based MOEA on a variety of different MOPs.
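The first contribution, posing HV approximation as a supervised learning problem, can be illustrated with a deliberately simplified sketch. Everything specific here is an assumption made for illustration and is not taken from this work: the 2-objective setting, the reference point `REF`, the synthetic front sampler `random_front`, the single hand-picked linear-time feature, and in particular the ordinary least-squares line fit, which stands in for GP only to keep the example self-contained.

```python
import random
import statistics

REF = (1.1, 1.1)  # assumed reference point for all fronts

def hv_2d(points, ref=REF):
    """Exact 2-objective hypervolume (minimization); used as ground truth."""
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    volume, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:
            volume += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return volume

def random_front(rng, n):
    """Hypothetical sampler: n points on f1^p + f2^p = 1 with random shape p."""
    p = rng.uniform(0.5, 2.0)
    front = []
    for _ in range(n):
        f1 = rng.random()
        front.append((f1, (1.0 - f1 ** p) ** (1.0 / p)))
    return front

def feature(front, ref=REF):
    """Hypothetical linear-time feature: largest box dominated by one point."""
    return max((ref[0] - f1) * (ref[1] - f2) for f1, f2 in front)

def fit_line(xs, ys):
    """Closed-form least squares for y = a*x + b (a stand-in for GP)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

rng = random.Random(0)
fronts = [random_front(rng, rng.randint(5, 50)) for _ in range(300)]
X = [feature(f) for f in fronts]   # model inputs
y = [hv_2d(f) for f in fronts]     # targets: exact HV values

a, b = fit_line(X[:200], y[:200])  # train on the first 200 fronts
pred = [a * x + b for x in X[200:]]
mae = statistics.fmean(abs(p - t) for p, t in zip(pred, y[200:]))
print(f"held-out mean absolute error: {mae:.4f}")
```

The point of the sketch is the shape of the learning problem: each training instance pairs a cheap description of a front with its exact HV value, and the learned model is judged by its deviation on fronts it has not seen.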

The remainder of this paper is organized as follows. In Section 2, we briefly present some definitions and related work on multi-objective optimization and GP, respectively. In Section 3, we present the problem formulation, posing it as a supervised learning problem on which to apply GP. Afterwards, we outline our proposed approach in Section 4 and provide details of the main algorithms used. In Section 5 we present the main results of our study. Finally, Section 6 contains the conclusions and future work.

Section snippets

Background on multi-objective optimization and performance indicators

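A continuous MOP can be mathematically defined as follows: $\min_{x \in D} F(x)$, s.t. $G(x) \le 0$, $H(x) = 0$. Here, $F: D \subset \mathbb{R}^n \to \mathbb{R}^k$, $F(x) = (f_1(x), \ldots, f_k(x))$, is the objective function that is defined by the individual objectives $f_i: D \subset \mathbb{R}^n \to \mathbb{R}$. The domain $D$ of $F$ is defined by the subset of $\mathbb{R}^n$ that satisfies all inequality and equality constraints, $D := \{x \in \mathbb{R}^n : G(x) \le 0 \text{ and } H(x) = 0\}$.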

The optimality of a MOP is defined by the concept of dominance. Let $v, w \in \mathbb{R}^k$; then we say that the vector $v$ is less than $w$ ($v <_p w$) if $v_i < w_i$ for all $i \in \{1, \ldots, k\}$
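As a small illustration, the componentwise relation above can be written directly in code. The sketch below also includes the standard Pareto dominance test for minimization and a naive non-dominated filter; these two are textbook definitions added for context, since the snippet above is cut off before stating them. All function names are hypothetical.

```python
def strictly_less(v, w):
    """v <_p w: every component of v is strictly smaller than that of w."""
    return all(vi < wi for vi, wi in zip(v, w))

def dominates(v, w):
    """Standard Pareto dominance (minimization): v is no worse than w in
    every objective and strictly better in at least one."""
    no_worse = all(vi <= wi for vi, wi in zip(v, w))
    better = any(vi < wi for vi, wi in zip(v, w))
    return no_worse and better

def nondominated(points):
    """Naive O(N^2) filter keeping the points no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

population = [(1, 3), (2, 2), (3, 1), (3, 3)]
front = nondominated(population)   # (3, 3) is dominated by (2, 2)
```

A point that is strictly less than another also dominates it, but not vice versa: `(1, 3)` dominates `(2, 3)` although `strictly_less((1, 3), (2, 3))` is false.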

Computational problem statement

In this work, the goal of deriving a MOEA performance indicator is posed as a synthesis or learning problem instead of a traditional analytical or formal derivation. While this could be modeled in different ways, we propose defining a supervised machine learning problem to build a new model to compute a performance indicator. To do so, it is necessary to define a target functionality that a learning algorithm will attempt to match, contained in a set of training instances. From this, it is then

Methodology

This section outlines our proposed approach to derive models that can efficiently approximate the HV indicator. The proposed methodology includes the following main stages.

  • 1.

    Generate the learning dataset. From a set of widely used MO benchmarks, we obtain a sample of approximations to the Pareto front. We did this by running a MOEA on these problems and applying a heuristic sampling policy. To compute the HV for each data set (ground truth), we apply the WFG implementation.

  • 2.

Experiments and results

The experiments were carried out on a Dell PowerEdge R730 server with 2X Intel Xeon E5-2650 processors and 512 GB RAM running KVM virtual machines over Ubuntu Linux. The GPTIPS 2.0 software was downloaded from https://sites.google.com/site/gptips4matlab, running on MATLAB Version: 9.3.0.713579 (R2017b). To compute the HV indicator, we used the WFG implementation, the SMS-EMOA code was obtained from PlatEMO, with the

Conclusions and future work

In this work, we have presented a new methodology that allows approximating the Hypervolume (HV) values for MOPs. In particular, we have, for the first time in the related literature, posed this task as a supervised learning problem and used GP to evolve solutions for it. We have confirmed the reliability of our models through a comprehensive set of experimental evaluations. Numerical results show that our models approximate the real HV value of the selected benchmark problems, DTLZ and WFG, with great accuracy but

CRediT authorship contribution statement

Cristian Sandoval: Methodology, Software, Validation, Investigation, Data curation, Visualization. Oliver Cuate: Formal analysis, Investigation, Writing – review & editing, Methodology. Luis C. González: Conceptualization, Writing – original draft, Writing – review & editing, Supervision, Project administration, Funding acquisition. Leonardo Trujillo: Conceptualization, Methodology, Resources, Writing – original draft, Writing – review & editing, Supervision, Project administration, Funding

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The first author was supported by CONACYT (Mexico) doctoral scholarship with CVU number 789493. Oliver Cuate acknowledges Instituto Politécnico Nacional and funding from project SIP 20221947.

References (60)

  • Sotelo A. et al. Identification of epilepsy stages from ECoG using genetic programming classifiers. Comput. Biol. Med. (2013)

  • Zitzler E. et al. Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach. IEEE Trans. Evol. Comput. (1999)

  • Zitzler E. et al. The hypervolume indicator revisited: On the design of Pareto-compliant indicators via weighted integration

  • Beume N. et al. On the complexity of computing the hypervolume indicator. IEEE Trans. Evol. Comput. (2009)

  • Ishibuchi H. et al. Comparison of hypervolume, IGD and IGD+ from the viewpoint of optimal distributions of solutions

  • Bader J. et al. HypE: An algorithm for fast hypervolume-based many-objective optimization. Evol. Comput. (2011)

  • Bradstreet L. et al. Updating exclusive hypervolume contributions cheaply

  • Bringmann K. et al. An efficient algorithm for computing hypervolume contributions. Evol. Comput. (2010)

  • Emmerich M.T.M. et al. Computing hypervolume contributions in low dimensions: Asymptotically optimal algorithm and complexity results

  • Hillermeier C. Nonlinear Multiobjective Optimization: A Generalized Homotopy Approach, Vol. 135 (2001)

  • Van Veldhuizen D.A. Multiobjective Evolutionary Algorithms: Classifications, Analyses, and New Innovations. Tech. Rep. (1999)

  • Coello C.A.C. et al. Solving multiobjective optimization problems using an artificial immune system. Genet. Program. Evol. Mach. (2005)

  • Schütze O. et al. Using the averaged Hausdorff distance as a performance measure in evolutionary multiobjective optimization. IEEE Trans. Evol. Comput. (2012)

  • Bogoya J.M. et al. A (p,q)-averaged Hausdorff distance for arbitrary measurable sets. Math. Comput. Appl. (2018)

  • Bogoya J.M. et al. The averaged Hausdorff distances in multi-objective optimization: A review. Mathematics (2019)

  • Brockhoff D. et al. On the properties of the R2 indicator

  • Hansen M.P. et al. Evaluating the Quality of Approximations to the Non-Dominated Set (1994)

  • Zitzler E. et al. Quality assessment of Pareto set approximations

  • Dilettoso E. et al. A weakly Pareto compliant quality indicator. Math. Comput. Appl. (2017)

  • Ishibuchi H. et al. Modified distance calculation in generational distance and inverted generational distance
