Automatic design of specialized algorithms for the binary knapsack problem

https://doi.org/10.1016/j.eswa.2019.112908

Highlights

  • Complex optimization problems arise in many artificial intelligence fields.

  • Algorithms are automatically designed for a complex optimization problem.

  • The automatic design produces several algorithms for the same problem.

  • The algorithms are specialized for sets of instances.

  • The novel algorithms that emerge are computationally effective.

Abstract

Not all problem instances of a difficult combinatorial optimization problem have the same degree of difficulty for a given algorithm. Surprisingly, apparently similar problem instances may require notably different computational efforts to be solved. Few studies have explored the case in which the algorithm that solves a combinatorial optimization problem is automatically designed. In that setting, the search for the best algorithms may produce algorithms that are specialized according to the problem instances used during the constructive step. Following a constructive process based on genetic programming that combines heuristic components with an exact method, new algorithms for the binary knapsack problem are produced. We found that most of the automatically designed algorithms perform better when solving instances of the same type used during construction, although the algorithms also perform well with other types of similar instances. The rest of the algorithms are partially specialized. We also found that the exact method, which only solves a small knapsack problem, has a key role in these results. When the algorithms are produced without considering such a method, the errors are higher. We observed this fact when the algorithms were constructed with a combination of instances from different types. These results suggest that the better the pre-classification of the instances of an optimization problem, the more specific and more efficient the algorithms produced by the automatic generation of algorithms are. Consequently, the method described in this article accelerates the search for efficient methods for NP-hard optimization problems.

Introduction

Not all problem instances of a difficult optimization problem present the same degree of difficulty for a given algorithm. Surprisingly, apparently similar problem instances may require very different computational efforts to be solved. In fact, the space that contains all problem instances is divided by frontiers known as phase transitions, which have been studied for some specific problems (Achlioptas, Naor & Peres, 2005). For instance, for the symmetric travelling salesman problem (TSP), a typical difficult combinatorial optimization problem, random problem instances with different sizes and degrees of regularity have been generated to study the difficulty of solving each problem instance with mathematical programming methods (Schawe & Hartmann, 2016). Such methods were found to face different levels of difficulty depending on the problem instance, and consequently, the phase transitions for this problem are evident. Similar studies analyze this behavior for other optimization problems, such as the asymmetric TSP (Zhang, 2004), the three-coloring problem (Boettcher & Percus, 2004) and maximum satisfiability (Pruegel-Bennett & Tayarani-Najaran, 2012). Moreover, it is possible to identify appropriate features that differentiate problem instances of a set of combinatorial optimization problems in order to predict the performance of a given algorithm (Smith-Miles & Lopes, 2012; Smith-Miles et al., 2014). This fact suggests that, instead of designing a common algorithm for all problem instances of a problem, it is convenient to design several algorithms with specific functionalities for the type of problem instance they face. This reasoning is in accordance with the No Free Lunch theorem, which establishes that no heuristic algorithm for a problem performs better than another on all problem instances (Wolpert & Macready, 1997).

A different way to address difficult optimization problems is to automatically construct the best algorithm to solve the problem. The automatic generation of algorithms is a technique that automatically assembles the components that potentially make up an algorithm for a given problem (Loyola et al., 2016; Ryser-Welch et al., 2015). The determination of the best algorithm for an optimization problem can also be formulated as a meta-optimization problem through mathematical programming (Mitsos, Najman & Kevrekidis, 2018). Specifically, if $a_\pi$ is an algorithm that solves an optimization problem $\pi$ and $\Omega_\pi$ is the space containing all possible algorithms that solve $\pi$, then a performance measure $P$ is used to guide the search process as established in Eq. (1). The search for the best algorithm for the specific problem can then be conducted through some of the existing heuristics and, in particular, with genetic programming. Genetic programming is particularly suited for this task because it involves populations of syntax trees that are suitable to represent combinations of instructions, such as those occurring in an algorithm (Koza, 2003; Poli et al., 2008). Artificial evolution occurs through the application of the operators of selection, crossover, mutation and reproduction. The components of the algorithms may be specific heuristics already existing for the problem or their atomic parts. Thus, diverse algorithmic combinations are explored generation after generation, evaluating each new algorithm according to its capacity to solve the evaluation problem instances. The approaches used thus far focus on the search for a single new algorithm for all problem instances rather than on the search for specialized algorithms according to the type of problem instances.

$$\max P(a_\pi) \quad \text{subject to} \quad a_\pi \in \Omega_\pi \tag{1}$$
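As a minimal illustration of the meta-optimization idea in Eq. (1), the following sketch searches a tiny, hand-written algorithm space by exhaustive enumeration. It is not the genetic programming procedure discussed in the text: the three greedy rules, the names `OMEGA`, `performance` and `best_algorithm`, and the choice of mean profit as the performance measure are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Instance = Tuple[Sequence[int], Sequence[int], int]  # (profits, weights, capacity)
Algorithm = Callable[[Sequence[int], Sequence[int], int], int]


def greedy(key: Callable[[int, int], float]) -> Algorithm:
    """Build a greedy knapsack heuristic that inserts items by decreasing key(p, w)."""
    def algorithm(profits, weights, capacity):
        order = sorted(range(len(profits)),
                       key=lambda j: key(profits[j], weights[j]),
                       reverse=True)
        remaining, total = capacity, 0
        for j in order:
            if weights[j] <= remaining:  # item j still fits
                remaining -= weights[j]
                total += profits[j]
        return total  # profit collected by this heuristic
    return algorithm


# Hypothetical algorithm space: three candidate algorithms (weights assumed > 0).
OMEGA: Dict[str, Algorithm] = {
    "by_profit": greedy(lambda p, w: p),
    "by_light_weight": greedy(lambda p, w: -w),
    "by_ratio": greedy(lambda p, w: p / w),
}


def performance(algorithm: Algorithm, instances: List[Instance]) -> float:
    """Assumed performance measure P: mean profit over the given instances."""
    return sum(algorithm(*inst) for inst in instances) / len(instances)


def best_algorithm(omega: Dict[str, Algorithm], instances: List[Instance]) -> str:
    """Return the name of the algorithm in omega that maximizes P."""
    return max(omega, key=lambda name: performance(omega[name], instances))


instance_a = ([60, 100, 120], [10, 20, 30], 50)
instance_b = ([10, 9, 9], [10, 5, 5], 10)
print(best_algorithm(OMEGA, [instance_a]))
print(best_algorithm(OMEGA, [instance_b]))
```

Even in this toy setting, the algorithm selected depends on the instances used to evaluate it, which is the effect that motivates searching for specialized algorithms.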

The Binary Knapsack Problem (KP) is one of the difficult optimization problems that seem to offer varying resistance depending on the type of problem instance to be solved. KP consists of determining a set of items that fits in a container of limited capacity and has maximum profit. If $x_j$ is a binary variable that indicates the presence of item $j$ in the container and $p_j$ indicates its profit, then the objective function to be maximized is the total profit of the items selected from the $n$ available items (Eqs. (2) and (3)). In turn, if $w_j$ indicates the weight of item $j$, then the capacity constraint establishes that the capacity $W$ is not exceeded. Thus, a solution to the problem consists of an allocation of binary values to the vector $x$. The simplicity of the mathematical formulation of this problem hides the difficulty of solving any problem instance in a low computational time. The NP-hardness of the problem has motivated an enormous diversity of studies with exact and heuristic algorithms and the use of a variety of different problem instances (Bienstock, 2008; Darehmiraki & Mishmast Nehi, 2007; Martello & Toth, 1990). Furthermore, Pisinger (2005) performs a meticulous characterization of KP instances. Based on the relation between the $p_j$ and $w_j$ values, several groups of KP instances are generated so that the classification of problem instances combines elements that come from experimentation and from the structure of the problem. The results reveal that some problem instances are easier to solve for some of the exact algorithms tested. Specifically, although the classification identifies no phase transitions between these types of problem instances, it is clear that some groups are easier to solve than others by a branch and bound algorithm.

$$\max Z(x) = \sum_{j=1}^{n} p_j x_j \tag{2}$$

$$\text{s.t.} \quad \sum_{j=1}^{n} w_j x_j \le W, \quad x_j \in \{0,1\}, \quad j = 1, \ldots, n \tag{3}$$
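To make the formulation in Eqs. (2) and (3) concrete, the sketch below evaluates a binary vector against the objective and the capacity constraint, and solves a small instance exactly with the textbook dynamic programming recurrence over capacities. The function names `evaluate` and `solve_dp` are hypothetical, integer weights are assumed, and this is not claimed to be the dynamic programming variant used by the authors.

```python
from typing import List, Sequence, Tuple


def evaluate(x: Sequence[int], profits: Sequence[int],
             weights: Sequence[int], capacity: int) -> Tuple[int, bool]:
    """Return (Z(x), feasible): the objective of Eq. (2) and the check of Eq. (3)."""
    total_profit = sum(p * xj for p, xj in zip(profits, x))
    total_weight = sum(w * xj for w, xj in zip(weights, x))
    return total_profit, total_weight <= capacity


def solve_dp(profits: Sequence[int], weights: Sequence[int],
             capacity: int) -> Tuple[int, List[int]]:
    """Exact 0/1 knapsack by dynamic programming (assumes integer weights)."""
    n = len(profits)
    # best[j][c] = maximum profit using only the first j items with capacity c
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        p, w = profits[j - 1], weights[j - 1]
        for c in range(capacity + 1):
            best[j][c] = best[j - 1][c]  # option 1: leave item j out
            if w <= c and best[j - 1][c - w] + p > best[j][c]:
                best[j][c] = best[j - 1][c - w] + p  # option 2: put item j in
    # Backtrack through the table to recover the binary vector x.
    x = [0] * n
    c = capacity
    for j in range(n, 0, -1):
        if best[j][c] != best[j - 1][c]:
            x[j - 1] = 1
            c -= weights[j - 1]
    return best[n][capacity], x


value, x = solve_dp([60, 100, 120], [10, 20, 30], 50)
print(value, x)  # → 220 [0, 1, 1]
```

The table has (n + 1)(W + 1) entries, so the running time is pseudo-polynomial: it grows with the numeric value of W, which is why an exact component of this kind is practical only on small subproblems.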

The algorithms automatically designed for KP are capable of determining the optimal solution for some types of problem instances (Parada, Herrera, Sepúlveda & Parada, 2015). The authors show that algorithms constructed from elementary heuristic components determine the optimal solution of several KP instances. Although it is natural to think that a single algorithm is able to cope with all instances of the same problem with the same efficiency, the results present evidence of algorithms with different computational performances according to the problem instances used in their design. In addition, the constructed algorithms are combinations of heuristics that do not consider hybridizations with exact methods for the KP. Although such methods are time-consuming, using them to solve a part of the problem may improve the efficiency of the constructed methods. Both issues have been little explored in the literature in terms of the computational effort required and the performance of the resulting algorithms.

Several decision-making problems have been studied directly through formulations of KP. Its formulation directly represents the minimization of raw materials (Martin, Hokama, Morabito & Munari, 2019), the selection of a portfolio (Vaezi, Sadjadi & Makui, 2019) or the clustering of data (Wedashwara, Mabu, Obayashi & Kuremoto, 2016), among many other problems (Kellerer, Pferschy & Pisinger, 2004). Moreover, KP is one of the fundamental NP-hard problems in the field of combinatorial optimization; thus, indirectly, through a sequence of polynomial transformations, KP can be connected as a subproblem of many other problems in the NP-hard class (Martello & Toth, 1990). Consequently, the search for an efficient algorithm for KP is also the search for an algorithm for all the other problems. Thus, the automatic generation of algorithms constitutes an algorithm search tool for KP that also impacts the search for algorithms for the whole family of difficult problems.

This work describes the automatic construction of specialized algorithms for different types of KP instances; an exact method is included as one of the elementary components used to construct them. The algorithms are constructed through genetic programming, evolving syntax trees that are decoded as algorithms for the problem. The components of the syntax trees are heuristics for filling the knapsack and refining a feasible solution. In addition, an exact optimization terminal based on dynamic programming is included that, when invoked, corrects a part of the current solution. The set of instances of each type is divided into two groups: the first group is used for the construction, whereas the second group is used to evaluate the already constructed algorithms. The specialization of the algorithms is verified by performing cross-evaluations of the different types considered, and the relevance of the exact method is numerically evaluated.
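The idea of combining a filling heuristic with an exact component that corrects part of the current solution can be sketched as follows. This is only an illustration under stated assumptions: the greedy profit-to-weight rule, the function names `greedy_fill`, `dp_subproblem` and `dp_correct`, and the way the window of items is chosen are hypothetical and are not taken from the article. Positive integer weights are assumed.

```python
from typing import List, Sequence


def greedy_fill(profits: Sequence[int], weights: Sequence[int],
                capacity: int) -> List[int]:
    """Fill the knapsack by decreasing profit/weight ratio (weights assumed > 0)."""
    order = sorted(range(len(profits)),
                   key=lambda j: profits[j] / weights[j], reverse=True)
    x = [0] * len(profits)
    remaining = capacity
    for j in order:
        if weights[j] <= remaining:
            x[j] = 1
            remaining -= weights[j]
    return x


def dp_subproblem(profits: Sequence[int], weights: Sequence[int],
                  capacity: int) -> List[int]:
    """Solve a small 0/1 knapsack exactly and return its binary vector."""
    n = len(profits)
    best = [0] * (capacity + 1)
    # take[j][c] records whether item j improved the optimum at capacity c
    take = [[False] * (capacity + 1) for _ in range(n)]
    for j in range(n):
        for c in range(capacity, weights[j] - 1, -1):
            candidate = best[c - weights[j]] + profits[j]
            if candidate > best[c]:
                best[c] = candidate
                take[j][c] = True
    chosen = [0] * n
    c = capacity
    for j in range(n - 1, -1, -1):
        if take[j][c]:
            chosen[j] = 1
            c -= weights[j]
    return chosen


def dp_correct(x: Sequence[int], profits: Sequence[int], weights: Sequence[int],
               capacity: int, window: Sequence[int]) -> List[int]:
    """Re-optimize only the items in `window`, keeping every other item fixed."""
    in_window = set(window)
    fixed_weight = sum(weights[j] for j in range(len(x))
                       if x[j] and j not in in_window)
    residual = capacity - fixed_weight  # capacity left for the window items
    sub = dp_subproblem([profits[j] for j in window],
                        [weights[j] for j in window], residual)
    new_x = list(x)
    for j, bit in zip(window, sub):
        new_x[j] = bit
    return new_x


p, w, W = [60, 100, 120], [10, 20, 30], 50
x0 = greedy_fill(p, w, W)             # heuristic solution
x1 = dp_correct(x0, p, w, W, [0, 2])  # exact correction on items 0 and 2
print(x0, x1)  # → [1, 1, 0] [0, 1, 1]
```

Because the current assignment of the window items is itself feasible for the subproblem, the corrected solution is never worse than the one it starts from, and the cost of the exact step is bounded by the size of the window and the residual capacity.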

Definitions of the components used for the automatic generation of algorithms are presented in Section 2, whereas the classification of the used problem instances is presented in Section 3. The results are presented in Section 4, and the primary conclusions of the study are presented in Section 5.

Section snippets

Automatic generation of algorithms for KP

To identify a feasible solution for the meta-optimization problem, a genetic programming algorithm is used. The evolutionary process is performed in two stages: the construction of algorithms and their evaluation. In the first stage, the algorithms are constructed by evolving syntactic trees, making use of operations of selection, reproduction, crossover and mutation. The syntactic trees are decoded as algorithms to solve KP such that a population of syntactic trees corresponds to a population

Selection of problem instances for KP

The problem instances were constructed by using a random generator of problems (Pisinger, 2005). The generator allows for the production of problem instances with different distribution ranges of $w_j$ and $p_j$, from which six types of KP instances previously identified in Pisinger (2005) are generated: almost strongly correlated (ASC), CEIL instances, weakly correlated (WC), non-correlated (NC), strongly correlated (SC) and items with similar weight (SW). The distribution of space $w_j$ -

Results

Most of the constructed algorithms are specialized in the determination of near-optimal solutions for the problem instances of the same type as those that were used during their construction. The rest of the algorithms are partially specialized. In fact, it is detected that the algorithms have better performance in evaluating problem instances of the same type compared with the evaluation of problem instances of other types. The evaluations performed with all types of problem instances are

Conclusions

New algorithms for KP specialized for types of problem instances were obtained through a genetic programming modeling to provide near-optimal solutions for the KP. The elementary components of the algorithms were defined from a decomposition of classical heuristics for the KP already existing in the literature. Additionally, an exact optimization component that allows for improvement of an existing solution was constructed using an existing dynamic programming algorithm. The evolutionary

CRediT authorship contribution statement

Nicolás Acevedo: Conceptualization, Methodology, Software, Investigation, Formal analysis, Writing - original draft. Carlos Rey: Conceptualization, Methodology, Software, Investigation, Formal analysis, Writing - original draft. Carlos Contreras-Bolton: Conceptualization, Methodology, Software, Investigation, Formal analysis, Writing - original draft, Writing - review & editing. Victor Parada: Conceptualization, Methodology, Software, Investigation, Formal analysis, Writing - original draft,

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This study was partially supported by the Institute of Complex Engineering Systems: BASAL CONICYT- FB0816. We also acknowledge projects USA1899-Vridei 061919VP-PAP Universidad de Santiago de Chile and DICYT-USACH 061919PD.

References (26)

  • H. Kellerer et al. Knapsack problems (2004)

  • J. Koza. Genetic programming IV: Routine human-competitive machine intelligence (2003)

  • C. Loyola et al. Automatic design of algorithms for the traveling salesman problem. Cogent Engineering (2016)