2.1 Introduction

Within the fields of Genetic Algorithms (GA) and Artificial Intelligence (AI), a variety of computational substrates with the power to find solutions to a large variety of problems have been described. Research has specialized on different computational substrates that each excel in different problem domains. For example, Artificial Neural Networks (ANN) [28] have proven effective at classification, Genetic Programs (by which we mean mathematical tree-based genetic programming, abbreviated GP) [18] are often used to find complex equations to fit data, Neuro Evolution of Augmenting Topologies (NEAT) [35] is good at robotics control problems [7], and Markov Brains (MB) [8, 12, 21] are used to test hypotheses about evolutionary behavior [25] (among many other examples). Given the wide range of problems and vast number of computational substrates, practitioners of GA and AI face the difficulty that every new problem requires an assessment to find an appropriate computational substrate and specific parameter tuning to achieve optimal results.

Methods have been proposed that combine different computational substrates. “AutoML” [27, 36] is a method designed to select a computational substrate most appropriate for a given problem, and then generate a solution in that substrate. Another compound method is the “mixture of experts” concept, where an artificial neural network is allowed to be constructed from a heterogeneous set of sub-networks, originally pioneered by Jacobs et al. [14] and similar to more recent work [31]. These methods choose from existing substrates or create a network of existing substrates.

In this manuscript we propose a compound method that borrows elements from various known computational substrates and uses them to create a heterogeneous genetic algorithm that allows for direct low-level integration between components from the different substrates. We call this approach the Buffet Method, in deference to the No Free Lunch theorem [29, 39, 40, 42], which loosely “state[s] that any two optimization algorithms are equivalent when their performance is averaged across all possible problems” [41]. In this paper we chose components from four computational substrates and combined these components with a MB framework to construct one possible implementation of the Buffet Method. Each substrate was selected because it has been observed to be successful in some domain. We will show that a MB that incorporates widely heterogeneous computational elements from other systems (in this case GP, NEAT, ANNs, and MB) can combine the advantages of these different systems, resulting in the ability to automatically discover high quality, often hybrid, solutions while avoiding pitfalls suffered by the individual systems.

2.2 Methods

We will be using terms that require definition and/or specification. When we talk about “computational substrate” we refer to the definition of a particular method of computation. In this light, ANN, MB, etc., all define computational substrates, but so does a biological brain (though we are not proposing the integration of biological neurons into a buffet method…yet). It is also important to note that when we talk about ANN, GP, and NEAT, we are talking about particular implementations (usually the initial or canonical implementation). In particular, when we use GP we specifically mean mathematical tree-based Genetic Programming and by no means dismiss the large body of work that exists investigating other forms of Genetic Programming. We also wish to note that we are using Markov Brains in two ways in this paper. Firstly, MB are being used as the underlying method that allows for the interconnection between elements from different computational substrates. Secondly, we will be using two types of MB gates that have been in use since MB were first proposed. When talking about Markov Brains in the first sense, we will use MB, and in the second sense we will use CMB (for canonical Markov Brain).

Every computational substrate specifies unique sub-component behavior, the ways sub-components connect with each other, the actions available to the substrate, how inputs are received, how outputs are delivered, and how the substrate stores internal states to allow for memory and recurrence. For example, GP is constructed from nodes arranged as a tree whereas ANNs have a fixed layered topology. CMB logic gates work on digital inputs and not on the continuous values such as those used by NEAT. It is therefore impossible to create one system that integrates unmodified elements from all systems. Instead, we identified essential qualities of each system and devised a way to incorporate these characteristics in a new system.

2.2.1 Markov Brains: An Introduction

Markov Brains describe a computational substrate composed of three primary elements: Nodes, Gates, and Wires. Nodes are simply values (either input, output, or hidden/recurrent). Gates are logical units which execute computations between nodes. Wires connect input nodes to gates and gates to output nodes. When a MB is executed, all of its gates are processed in parallel.

If more than one gate output is connected to the same output node, then the gate output values are summed (other methods such as overwrite and average have been tested, but their discussion is outside the scope of this article). In many cases, MB are used with binary inputs and produce binary outputs; in these cases the output values are discretized as 1 if the value is greater than 0, else 0.
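The resolution rule for colliding gate outputs, together with the discretization convention, can be sketched in Python (function and variable names here are illustrative, not the actual framework API):

```python
def aggregate_outputs(gate_outputs, num_nodes):
    """Sum the values each gate wrote to its target output nodes.

    gate_outputs: list of (node_index, value) pairs produced by all gates
    (an illustrative structure, not the actual implementation).
    """
    nodes = [0.0] * num_nodes
    for node_index, value in gate_outputs:
        nodes[node_index] += value          # "sum" resolution for collisions
    return nodes

def discretize(nodes):
    # Binary output convention described in the text: 1 if value > 0, else 0.
    return [1 if v > 0 else 0 for v in nodes]
```

Note that with this convention, two gates writing +1 and −2 to the same node yield a discretized 0, which is why the choice of resolution method (sum vs. overwrite vs. average) can matter.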

Usually, MB are used to find solutions to problems which require multiple updates (e.g. navigation of a robot). On each update the inputs are set (conversion of sensor state to input nodes), the MB is executed, and then the “world” is updated based on the state of the output nodes (e.g. output nodes are used to control motors). Memory between updates is achieved with hidden nodes. These extra nodes are added by reserving extra space in the input and output buffers. The values written to the output hidden nodes are copied to the input hidden nodes after each MB execution. In some configurations of MB, additional input nodes are reserved so that output node values may be copied in the same manner as hidden nodes, providing direct access to the last outputs.
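The input/execute/output loop with hidden-node recurrence described above might look like the following sketch (the `brain` and `world` interfaces are hypothetical placeholders, not the actual framework API):

```python
def run_agent(brain, world, num_hidden, updates=100):
    """One possible input/execute/output loop with hidden-node recurrence.

    `brain(inputs) -> outputs` executes all gates once; the trailing
    `num_hidden` slots of both buffers are the hidden nodes.
    """
    hidden = [0.0] * num_hidden
    for _ in range(updates):
        inputs = world.sense() + hidden       # sensor values plus recurrent state
        outputs = brain(inputs)
        hidden = outputs[-num_hidden:]        # copy hidden outputs back to inputs
        world.act(outputs[:-num_hidden])      # motor outputs drive the world
    return world.score()
```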

Since MB architecture is only one layer deep, output values are limited to operations that only use single gate execution or a summation of such executions. This limitation can be overcome by allowing hidden nodes and executing the MB multiple times with a single set of inputs. More elaborate deeper topologies of nodes and gates are possible. For example, if output nodes are available as inputs to gates, then the output of one gate can be accessed by another gate in a single MB execution. Of course, in this serial configuration, the gates cannot be run in parallel and a gate execution order must be established (usually as additional information extracted from the genome when the gates are being constructed).

The number of input and output nodes is determined by the task and the number of hidden nodes is set by the user. The gates and wires are initially randomly generated, but then are subject to selection, reproduction, and mutation (see encoding methods below).

Gates may have any number of inputs and outputs. The basic gate type is a deterministic logic gate using between 1 and 4 inputs that converts the inputs to bits (with the discretization function mentioned above) and then delivers between 1 and 4 outputs derived from a genetically determined look-up table. Neuron gates are a more complex example gate type that take one or more inputs, sum these inputs, then deliver an evolvable output value if the input total exceeds an evolvable threshold and 0 if it does not. The summed value in a neuron gate can persist over multiple updates providing the gate with its own local memory (apart from the hidden nodes). Neuron gates have additional configuration options that, for example, allow them to interactively alter their threshold value. Other gate types that have been explored include counters, timers [13], mathematical operations, and feedback [32]. It is not our intention here to describe the full range of possible (or even existing) gate types, but rather to convey that the range of possible behaviors is not limited and could even include nested MB networks. Adding new gate types only requires the implementation of the gate’s internal process (gate update) and construction and/or mutation behavior (depending on the encoding type). For a more detailed description of Markov Brains, see [12].

In actuality, the logic contained within gates can define any computational operation. Because of the I/O standardization of MB gates, any collection of gate types will be compatible, regardless of internal computational processes. The modular, inter-operable structure of MB [12] lays the foundation for creating a heterogeneous computational substrate that adopts elements from multiple sources, and this is what allows for the Buffet Method.

There are many processes that can be used to construct a MB. While the construction method will likely affect evolvability, it can be considered separately from the MB substrate. In order to generate more robust results (i.e. to ensure that the encoding method is not critical to the results), we replicated all experiments using a Genetic Encoding method and a Direct Encoding method.

2.2.2 Genetic Encoding

Genetic Encoding uses a genome (a string of numbers) and a translation method. The occurrence of a predefined start codon (a sub-string of numbers, e.g. ‘43, 212’) identifies a region (gene) that defines a gate. Each gate is associated with a particular start codon. The sequence following the start codon provides information needed to define the function of that gate and how it is wired. Thus, every sequence of ‘43, 212’ in the genome will initiate the subsequent translation of a gate (of the type associated with ‘43, 212’) when encountered during a linear read. Note that this allows for overlapping gate genes. Since each gate type requires different data for its construction, the information that must be extracted from the genome after the start codon will be different and must be defined by the gate type. Gate types can be allowed or disallowed by adding or removing their start codons from the translation method.

For example, consider the genome sub-string in Fig. 2.1. If this were part of a genome being translated into a MB, the first thing that would happen is that the ‘43, 212’ sub-string would be located. As it happens, ‘43, 212’ is the start codon for a deterministic gate. The next two values, 31 and 89, would be used to determine the number of inputs and number of outputs. Since deterministic gates have between 1 and 4 inputs and 1 and 4 outputs, each of these values would be processed with ((value mod 4) + 1), resulting in 4 inputs and 2 outputs. The next 8 values determine the input and output addresses. This gate will use all 4 input address values but only the first 2 output address values. Since a mutation could alter the number of inputs or outputs, the additional genome locations representing the 3rd and 4th output addresses go unread so that such mutations will not result in a frame shift. In order to process the input and output address values, the input genome values are modded by the number of input nodes (including hidden) and the output values are modded by the number of output nodes (including hidden). The following 64 values (bold text in the figure) are modded by 2 to generate 64 binary values for the look-up table. Since this gate only has 4 inputs and 2 outputs, a significant number of these look-up table values will not be used, but their genome positions remain reserved (like the unused address values) to avoid frame shifts in the case of mutations that alter the number of inputs or outputs.
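A sketch of this translation step, following the field layout of the example (the function name and exact layout are illustrative reconstructions of the scheme described above, not the actual implementation):

```python
START_CODON = (43, 212)   # deterministic-gate codon from the example

def decode_deterministic_gate(genome, i, n_in_nodes, n_out_nodes):
    """Decode one deterministic gate whose start codon begins at index i."""
    g = genome[i + 2:]                          # skip the start codon
    n_inputs  = (g[0] % 4) + 1                  # 1..4 inputs
    n_outputs = (g[1] % 4) + 1                  # 1..4 outputs
    # Always step over 4+4 address slots so mutations cannot cause frame shifts.
    in_addr  = [g[2 + k] % n_in_nodes  for k in range(4)][:n_inputs]
    out_addr = [g[6 + k] % n_out_nodes for k in range(4)][:n_outputs]
    table    = [v % 2 for v in g[10:10 + 64]]   # 64 bits; unused bits ignored
    return n_inputs, n_outputs, in_addr, out_addr, table
```

With the values from the example (31 and 89 after the codon), this yields ((31 mod 4) + 1) = 4 inputs and ((89 mod 4) + 1) = 2 outputs.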

Fig. 2.1

Illustration of Indirect encoding for Markov Brains. A sub-string of values found in a genome encodes each gate of a Markov Brain, each site on this string specifies different aspects of the gate, such as the number of inputs and outputs, or how the inputs and outputs connect to nodes

Reproduction and mutation are simple when using this form of genetic encoding—the genome is copied, random mutations are applied, and then the resulting genome is translated into a new MB. We allow for point mutations (the alteration of one random site in the genome), copy mutations (where a section of the genome is selected and copied to another location), and deletion mutations (where a section of the genome is deleted). The mutation rates are established by the user and are defined as a per-site percent chance. Sexual reproduction is achieved by crossover between parent genomes (although all experiments in this paper were asexual).
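These three mutation operators can be sketched as follows (rates, segment length, and value range here are illustrative defaults, not the paper's settings):

```python
import random

def mutate(genome, point_rate=0.005, copy_rate=0.00002, del_rate=0.00002,
           seg_len=128, max_value=256):
    """Apply point, copy, and deletion mutations to a numeric genome."""
    g = list(genome)
    # Point mutations: each site independently replaced with a random value.
    g = [random.randrange(max_value) if random.random() < point_rate else v
         for v in g]
    if random.random() < copy_rate * len(g):      # segment duplication
        src = random.randrange(len(g))
        dst = random.randrange(len(g))
        g[dst:dst] = g[src:src + seg_len]
    if random.random() < del_rate * len(g):       # segment deletion
        src = random.randrange(len(g))
        del g[src:src + seg_len]
    return g
```

Because copy and deletion mutations change genome length, translation must tolerate variable-length genomes, which the start-codon scheme above naturally does.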

2.2.3 Direct Encoding

The Direct Encoding method we used in this paper generates the initial MB populations using randomly generated genomes and translates the genomes using the method described above, ensuring that experiments using Direct or Genetic encoding have the same starting condition. Thereafter, however, organisms perform reproduction and mutation by copying the MB and applying mutations to each component of the MB directly. This method adds the requirement of specifying mutational operators for each gate type based on the unique structure of that gate type. In addition, the mutation rates of every possible mutation must be determined explicitly. In this paper the direct encoding method allows for mutations that can add a new randomly generated gate, copy an existing randomly selected gate, remove an existing randomly selected gate, alter the input or output wires of a gate, or alter a gate's internal parameters.

2.2.4 Multi-Step Functions

MBs execute all gates in parallel within a single update. If a computation requires the participation of multiple gates in sequence, then there must be multiple updates. If a task requires multiple updates and the MB performs multiple updates (e.g. a navigation task where the input/update/output loop is repeated), then multi-step processes can occur over time. On the other hand, some tasks pose a problem and expect an answer, or are time-critical. In these cases we allow the MB to update multiple times between setting inputs and reading outputs. This is similar in concept to allowing evolution to use portions of the Jordan recurrent architecture, the Elman recurrent architecture, and evolvable connections as in NEAT [9, 16, 35].

2.2.5 Gate Types

For this paper we used four architectures: Canonical Markov Brain (CMB), GP, ANN, and NEAT. We chose these because each has been shown to excel in different problem domains. Elements from other systems, including neural Turing machines [22], HyperNEAT [34], and POMDP [17] were considered and could be incorporated later.

Our intent here is not to compare these different computational substrates, which would not be possible or meaningful given that we are re-implementing them in the MB framework, but rather to use these architectures for investigation.

To represent MB we selected two gate types: deterministic and probabilistic, which were the first and most commonly used gates. When we are referring to the use of these gate types we will use the abbreviation CMB (i.e. Canonical Markov Brain) rather than MB to avoid confusion. These gates take 1–4 inputs and generate 1–4 outputs. The update function is a look-up table that maps inputs to outputs. In the probabilistic gate, every input pattern can result in any output pattern. For each possible input pattern, every possible output pattern is assigned a probability. Determining which output is generated by a given input requires the generation of a random number during gate update.
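A probabilistic gate's update can be sketched as a weighted row lookup (the table layout shown is one plausible structure for illustration, not necessarily the actual one):

```python
def probabilistic_gate(in_bits, table, n_out, rng):
    """Sample an output pattern for the given input pattern.

    table[row] holds one (unnormalized) weight per possible output pattern;
    a deterministic gate is the special case where one weight is 1 and the
    rest are 0.
    """
    row = int("".join(map(str, in_bits)), 2)      # input bit pattern -> row index
    probs = table[row]
    r = rng.random() * sum(probs)                 # weighted sampling
    pattern = 0
    for i, p in enumerate(probs):
        r -= p
        if r < 0:
            pattern = i
            break
    # Unpack the chosen pattern index into n_out output bits (MSB first).
    return [(pattern >> (n_out - 1 - k)) & 1 for k in range(n_out)]
```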

From GP, we co-opted the idea of unary and binary math operations. A GP gate takes 1 or 2 inputs and then performs an operation generating a single output (see Table 2.1).

Table 2.1 Computational elements from genetic programming

From ANN, we adopted the transfer function (summed weighted inputs) and the hyperbolic tangent as the threshold function. These ANN gates have 1–4 inputs serving as the input layer, and 1–4 outputs serving as the output layer, identical to a small ANN without a hidden layer. The specific function of such an ANN gate is controlled by a weight matrix.
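A minimal sketch of such an ANN gate, assuming a plain weight matrix and the hyperbolic tangent threshold described above:

```python
import math

def ann_gate(inputs, weights):
    """ANN-style gate: weighted sums passed through tanh.

    weights[j][i] is the weight from input i to output j (an illustrative
    layout). Equivalent to a tiny ANN with no hidden layer.
    """
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)))
            for row in weights]
```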

From NEAT we borrowed more complex weighting and integration methods [33]. These gates are a hybrid between ANN and GP in that they take multiple inputs, apply weights, aggregate them (product, sum, max, min, or mean), and then pass them though a mathematical operation. These gates have 1–2 inputs, and a single output (for the specific operators used, see Table 2.2).
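The weight/aggregate/operate pipeline of a NEAT gate can be sketched as follows (the aggregator set follows the text; the dictionary-based dispatch is our own illustrative structure, and Table 2.2 lists the actual operators):

```python
import math

AGGREGATORS = {"sum": sum, "product": math.prod, "max": max, "min": min,
               "mean": lambda xs: sum(xs) / len(xs)}

def neat_gate(inputs, weights, aggregator, operation):
    """NEAT-style gate: weight each input, aggregate, then apply an operation."""
    weighted = [w * x for w, x in zip(weights, inputs)]
    return operation(AGGREGATORS[aggregator](weighted))
```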

Table 2.2 Computational elements from NEAT; the inputs are aggregated into a single value before the operation is applied

2.2.6 Tasks

We chose a range of tasks that includes examples for which each of the gate types described above has demonstrated competence. The purpose here is not to benchmark these gate types or the Buffet Method as a whole, but to demonstrate that the Buffet Method allows evolution to leverage the computational abilities of the different gate types, which, on average, yields better results than the use of a single gate type. Moreover, our results show that evolution often finds solutions comprised of combinations of gate types.

2.2.6.1 Xor

For this task, two binary inputs are given, and fitness is awarded if the logical XOR operation is correctly computed [15]. This is an extremely simple task for a MB, since the initial population is made from randomly generated logic gates that likely implement this function. However, evolving an ANN to solve this task is not trivial, and this task has been used before as a benchmark example for NEAT [15, 37]. The fitness (performance) is evaluated by presenting 100 pairs of binary inputs one pair at a time, and comparing agent output to the expected output. Correct answers are tallied resulting in a fitness between 0 and 100. Each agent is given 10 updates to allow multi-step computation to be performed before evaluating the output.

2.2.6.2 Symbolic Regression

We organized a small set of functions for symbolic regression, which all seem to be equally complicated for the GA to find regardless of method (data not shown). Here, we show the result for one function with two inputs x1 and x2: f(x1, x2) = (x1 · x2) · (x2 − x1). The fitness of the agent is determined by the difference between this function and the agent's response (sampled 100 times with random input values between −2.0 and 2.0); the squared differences are summed. Each agent is given 10 updates to allow multi-step computation.

2.2.6.3 Inverted Pendulum

This task [4] involves balancing a vertical beam on a cart that can move only left or right, as if on rails. Each agent was evaluated 10 times. The beam (pendulum) is mounted on top of the cart such that it can freely rotate around its mounting point in one perpendicular axis (for an overview see Fig. 2.2). The agent can accelerate the cart left or right, and the time the pendulum is above the cart is recorded during 100 simulation updates. The inputs are the current angle of the pendulum Θ, its first derivative Θ̇, the location of the cart x, and the current velocity of the cart ẋ. For each simulation update the agent experiences 8 multi-step computations, and the output is the cart acceleration \(\overrightarrow {F}\) (limited between −1.0 and 1.0). Both inputs and outputs in this task are continuous (floating point) values. The code for this was ported from OpenAI [26].

Fig. 2.2

Schematic overview of the inverted pendulum [38]. A weight with mass m = 0.1 kg is mounted on top of a beam (pendulum) of length l = 0.5 m. The cart must move forward and backward to balance the pendulum upright. The simulation is stopped if the angle Θ increases above 12°. The cart has a mass M of 1.0 kg, but can be accelerated with a force \( \overrightarrow {F}\). We model gravity to be earth-like: 9.8 m/s²

2.2.6.4 Value Judgment

This task originates from decision-making research in psychology. An agent is confronted with two noisy signals and must identify the stronger signal [19]. Imagine two flickering lights of which you have to identify the one that is lit more often. In this task each agent has two binary inputs and two binary outputs. Each evaluation consists of 100 updates. At the beginning of each evaluation, a random value of 0 or 1 is generated that determines whether the first or second input will be more likely. During each update there is a 55% probability that the more likely input will be 1 and the other input will be 0, and a 45% probability of the opposite. For the first 80 updates, the outputs are ignored. The agent then has 20 updates to provide an answer. Outputs of 0, 0 or 1, 1 are ignored, but 0, 1 and 1, 0 trigger an end of evaluation. If an output of 0, 1 is provided and the second input was more likely, or 1, 0 and the first input was more likely, then the agent receives one point. This test is repeated 100 times and the results are summed, resulting in a fitness value between 0 and 100. Note: (a) guessing will result in an average score of 50, and (b) a perfect score is unlikely because of the small difference between the blink probabilities (0.55 vs. 0.45).

2.2.6.5 Block-Catching Task

This task involves an embodied agent that must either catch small blocks or avoid large blocks that are moving obliquely towards the agent [5]. This is a task we thoroughly investigated before [3, 21, 30], and it can be solved by MBs and ANNs (for an illustration of the task see Fig. 2.3). The task is not trivial because information must be integrated over time. Agents in this task are only allowed one brain update per world update, so to be successful they must integrate and remember information while new information is presented.

Fig. 2.3

(a) In the simulation, large or small blocks fall toward the bottom row of a 20 × 20 discrete world, only one at a time. As a block falls, it moves diagonally in a fixed direction, either to the right or left (e.g. on each time step a given block will move down one row and one column to the left). The agent is situated at the bottom row and can move left or right. For the purpose of illustrating the task, a large brick may fall to the left, then the next block might be a small block that will fall to the right. In this example the agent is rewarded for catching small blocks and punished for catching large blocks, but the task may be reversed, or made more complicated with more rules (e.g. catch left-falling blocks). (b) A depiction of the agent's states (bottom left: triangles depict sensors, circles illustrate brain (internal) states, trapezoids denote actuators) and the sequence of activity patterns on the agent's 4-bit retina (right), as a large brick falls to the right. Reproduced from [21], with permission

2.2.6.6 Associative Memory

In the associative learning task the agent is rewarded for each unique location visited along a predefined path. The agent is given one input representing whether the agent is on a straight part of the path or not. Two other inputs are also provided: one if the path progresses to the left and another if the path progresses to the right. Fitness is increased for each unique location along the path the agent visits, and decreased for every time step that the agent is not on the path. Between each fitness evaluation the set of inputs for “turn left” and “turn right” are randomized. No information is provided indicating the end of the path. At the beginning of a fitness evaluation the agent first must discover the mapping between signs and turn directions; this behavior, when performed, looks like exploration. The agent may then use that information to follow the path properly, which appears like an exploitation phase [11] (Fig. 2.4). The original work was performed in AVIDA [1], and this particular extension of the task to associative learning was proposed by Anselmo Pontes in as-yet-unpublished dissertation work. From prior experiments (data not shown) we know that MBs as well as ANNs are well-suited to solve this task.

Fig. 2.4

Overview of the associative learning task. The agent (orange triangle) navigates a path (white) on which there are four signals that the agent can only see when standing on top of them. One signal indicates that the path continues in a straight line (black circle). Two other signals (green diamond and blue star) indicate that the path will turn left or right, with the meaning of the signals randomized at the start of every experiment. Thus, the agent must first explore what the signals mean and then exploit that information. The final signal is given when navigating off the path into poison (purple) and has negative consequences, the severity of which is set by the experimenter

2.2.6.7 Noisy Foraging

This task uses an embodied agent that must search for randomly placed food on a discrete grid; once the food is collected, the agent must return to a previously defined home location to increase fitness. This foraging trip can be repeated many times over the lifetime of the agent; on each repetition the randomly placed food is moved farther away from home. The home and food locations are marked by beacons that can be seen from a great distance by the agent. The agent has eight detectors, each covering a 45° arc, that can detect the food and home beacons. Agents can turn 45° left or right or move forward. Additional sensors inform the agent about having reached the home location or a location where food can be found. Each agent has between 9900 and 10,155 time steps to collect food. The variation in lifetime prevents agents from deploying a repetitive search pattern (data not shown). Fitness is defined according to the following function:

$$\displaystyle \begin{aligned} w = \sum_{j=0}^{j<s_{i}} 1.05^{\frac{1}{t_{i,j}} d_{i,j}^{2}} \end{aligned} $$
(2.1)

where fitness is w, the distance of food to home is d, the time each trip takes is t (only counting from the last time an organism leaves either food or home until it reaches the other), and the number of successful trips per trial is s. This equation rewards efficient resource collection by penalizing time-consuming search (\(\frac {1}{t}\)) and exponentially rewarding the collection of more distant resources (d²). The sum runs over all successful foraging trips completed in one evaluation of the agent.
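The per-trip term of Eq. (2.1) can be computed directly; a sketch for one evaluation, with trips given as (distance, time) pairs (function and variable names are illustrative):

```python
def foraging_fitness(trips):
    """Fitness of Eq. (2.1) for one evaluation.

    trips: list of (distance, time) pairs, one per successful trip.
    Each trip contributes 1.05 ** (d**2 / t), so longer distances grow
    the reward exponentially while longer times shrink the exponent.
    """
    return sum(1.05 ** ((d ** 2) / t) for d, t in trips)
```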

2.2.6.8 Maze Solving Task

In this navigation task [8], the agent is rewarded for navigating a binary maze consisting of a sequence of long parallel walls, each containing a single door somewhere along the wall (Fig. 2.5). When passing through a door the agent receives an input (1) if the next door is to the right of the current door or an input (0) if the next door is not to the right of the current door. The agent can only perform three actions: step forward, sidestep left, or sidestep right. This task requires the agent to identify sporadic signals and remember them (at least until the next door) in order to navigate efficiently.

Fig. 2.5

Overview of the navigation task. A typical maze (panel A), with gray boxes indicating walls and white boxes indicating empty space to navigate. The agent (red triangle) finds the shortest path through the maze by attending to the signals received when passing through the doors. A signal (purple speech bubble) indicates that the next door is to the right (from the perspective of the agent), and the absence of a signal indicates that the next door is either straight ahead or to the left. When the agent navigates successfully through the maze, ideally following the optimal path (dashed red line), it is moved back to the start. A schematic view of inputs feeding into the MB and its connection to outputs is shown in panel B. If both actuators receive a positive signal then the agent attempts to step forward. If actuator A receives a positive signal and B does not then the agent attempts to step to the left, and vice versa for a step to the right. An attempt to step sideways into a wall is conveyed by left/right wall inputs. The agent experiences single updates, disallowing multi-step computation

2.2.6.9 Behavioral Optimization in “Berry-World”

This task was designed to test speciation and specialization of behavior [6]; however, it also allows us to test how agents integrate past information to optimize future decisions. The environment is a small grid (6 × 6) surrounded by a wall and filled with two types of collectible resources (commonly referred to as red and blue berries, hence “berry world”). The agent is evaluated for 200 time steps. On each time step the agent can turn 45° left or right, move forward, or consume the berry at its current location. If a berry is consumed and the agent moves, then the empty spot is replenished with a new random berry (red or blue). The agent can sense the state (empty, red, blue, or wall) of its current location and of the locations in front of it, to the front left, and to the front right. Consumption of each berry is rewarded with one point; however, if the type of berry consumed differs from the previous one, then a task-switching cost (1.4 points) is subtracted. The task-switching penalty discourages random consumption and encourages foraging strategies that minimize switching. On each world update (time step), agents receive five binary inputs. The first two describe the state of the location occupied by the agent (1, 0 = red food; 0, 1 = blue food; 0, 0 = no food here). The other three inputs depend on the state of the location in front of the agent (red food, blue food, or wall). The agent provides two binary outputs (0, 0 = no action; 0, 1 = turn right; 1, 0 = turn left; 1, 1 = move forward).

2.2.7 Experimental Parameters

For each possible combination of brains (CMB, GP, NEAT, ANN, and buffet), environments (XOR, symbolic regression, inverted pendulum, value judgment, berry world, block catching, maze, associative learning, and noisy foraging), and encoding (direct and indirect), we investigated 100 replicates of each condition. Each condition was seeded with 10,000 randomly generated agents in the first generation (to increase the likelihood of starting with an initially viable agent). After that, the population size was reduced to 100 agents and evolution progressed for 5000 generations. In each generation, roulette wheel selection was used to choose parents, and mutated offspring were generated via asexual reproduction. We expect that our results should also apply to other search methods such as MAP-Elites [24], novelty search [20], and sophisticated hill climbers [10].
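Roulette wheel (fitness-proportionate) selection can be sketched as follows (a generic implementation assuming non-negative fitness values, not the framework's own code):

```python
import random

def roulette_select(fitnesses, n, rng=random):
    """Pick n parent indices with probability proportional to fitness."""
    total = sum(fitnesses)
    picks = []
    for _ in range(n):
        r = rng.random() * total          # spin the wheel
        for i, f in enumerate(fitnesses):
            r -= f
            if r < 0:
                picks.append(i)
                break
    return picks
```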

All experiments were implemented and performed using the MABE framework [6]. MABE is a general purpose digital evolution framework designed to allow for the arbitrary combination of modules in order to construct agent-based evolution experiments. Here we leveraged MABE’s ability to “swap” world modules (fitness functions) between experiments while leaving computational substrates, genomes, mutational operations, and the selection method fixed.

2.3 Results

While most combinations of brains and environments performed well, some combinations failed to find a solution or performed sub-optimally (see Fig. 2.6). For example, NEAT, GP, and ANNs struggled with the Berry World and Foraging environments, and CMB was unable to solve the pendulum task in the time allotted. The Buffet Method was able to solve all tasks near perfectly.

Fig. 2.6

Comparison of different experimental conditions: the buffet method run allowing only CMB, NEAT, GP, or ANN gates, respectively, compared to allowing all gate types at the same time (columns). Each condition was tested on nine different tasks (XOR, symbolic regression, inverted pendulum, value judgment, berry world, block catching, maze navigation, associative memory, and noisy foraging), represented by the rows. Average performance for the indirect encoding is shown as a solid black line; direct encoding is represented by a dashed line. The red dashed line indicates the maximum performance on each task where applicable, except for symbolic regression, which tries to minimize the error

When comparing CMB, GP, ANN, and NEAT to the Buffet Method, we find that the Buffet Method generally evolves populations to higher fitness regardless of direct or indirect encoding (see Fig. 2.7), but in some cases it is not the best. In particular, ANNs performed better on the Inverted Pendulum task, and CMBs performed better on the associative memory task.

Fig. 2.7

Final normalized scores after 5000 generations for both encoding methods. See the legend for the gate conditions. The x-axis shows eight different environments, omitting the symbolic regression task because that task minimizes a score. The y-axis \(\bar {W}\) is average fitness. Standard error is shown. (a) Results when genetic encoding was used. (b) Results when direct encoding was used

To investigate which components evolution selects, we reconstructed gate usage over evolutionary time for CMB, NEAT, GP, and ANN gates (see Fig. 2.8). In some environments (symbolic regression and inverted pendulum), ANN gates are predominantly used, while in XOR and Berry World, CMB gates were dominant. In all other environments we find more than one gate type, with a slight bias toward one type or another (depending on the task). Inspecting individual brains (data not shown) confirms that evolved brains are composed of different gate types. This indicates that the Buffet Method does not merely allow evolution to select a single optimal gate type, but also generates heterogeneous brains composed of different gate types.
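A reconstruction of this kind amounts to tallying gate-type frequencies per generation. The sketch below shows the idea under a simplifying assumption (not the paper's actual data pipeline): each brain is reduced to a list of gate-type labels.

```python
from collections import Counter

def gate_usage_over_time(generations):
    """Tally how often each gate type appears in each generation.

    `generations` is a list of generations; each generation is a list of
    brains, and each brain is represented here simply as a list of
    gate-type labels (a stand-in for the real brain data structure).
    Returns one Counter per generation.
    """
    return [sum((Counter(brain) for brain in generation), Counter())
            for generation in generations]
```

Plotting these per-generation counts (normalized by population size) yields curves like those in Fig. 2.8.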

Fig. 2.8

Gate usage for the different experimental conditions using genetic encoding (direct encoding results are similar, data not shown): deterministic logic gates in orange, probabilistic logic gates in red, GP gates in shades of green, NEAT gates in shades of blue, and ANN gates in dark gray

2.4 Discussion and Conclusion

We showed that the Buffet Method performs generally well across all tasks, while each of the subsets (CMB, NEAT, GP, ANN) failed on at least one task and under-performed on others.

While we cannot say for certain why different computational substrates struggle with some tasks and not others, it is worth noting that the two tasks that were problematic for NEAT, GP, and ANN involve directional navigation. We do know that CMBs’ difficulty with the pendulum task results from the fact that MB gates are not capable of producing negative outputs (because we chose to represent CMB with deterministic and probabilistic gates, which are binary and non-negative), and thus can only accelerate the cart in one direction. We could have resolved the problem of negative numbers for CMBs in the Inverted Pendulum task by changing the meaning of the outputs for that task: we could have discretized the input to a string of bits and, instead of a force, used two binary outputs, where both 00 and 11 mean nothing, and 01 or 10 indicate an acceleration to the left or right, respectively. We did not re-implement the input and output of the Inverted Pendulum task so that non-continuous-value substrates could solve it, in order to highlight that not all computational substrates will always be able to solve all problems. Our approach allowed us to test whether the Buffet Method was capable of discovering solutions using the provided elements, given that the problem could be solved by at least some of those elements. One of the advantages of the Buffet Method is that it allows for the combination of significantly different computational substrates; in this case, the substrates had different limitations on their input and output specifics. Previous work has shown that the representation of a task can affect evolutionary outcomes [2]. With the Buffet Method we were able to include tasks without needing to consider how, or even if, each task would interface with every substrate.
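The hypothetical two-bit re-encoding described above (which we deliberately did not implement) would decode like this; the function name is ours:

```python
def two_bit_force(out_a, out_b):
    """Decode two binary brain outputs into a cart acceleration direction:
    00 and 11 mean no force, 01 means left (-1), 10 means right (+1).

    A sketch of the re-encoding discussed (but not used) in the text.
    """
    if out_a == out_b:         # 00 or 11: apply no force
        return 0
    return 1 if out_a else -1  # 10: accelerate right, 01: accelerate left
```

Such a decoder would let strictly binary, non-negative substrates like CMB express both force directions, at the cost of tailoring the task interface to the substrate.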

Considering that the Buffet Method has access to all of the elements of the individual computational substrates, it is noteworthy that it could not always find the optimal solution. For example, a MB using only ANN gates is superior to the Buffet Method on the pendulum task. Why did the Buffet Method not simply discard all other gates while retaining the ANN gates? The same logic applies to the association task, where using MB gates alone produced better results than the Buffet Method. We expect that this effect may be related to historical contingency: if a strategy that provided some fitness using a sub-optimal gate type was discovered, that gate and strategy may have been “locked in”, prohibiting the discovery of a more optimal solution. But this is conjecture and requires more investigation.

2.5 Future Work

Here we incorporated ANN, NEAT, and GP gates into a MB substrate, but the integration of probabilistic and deterministic MB logic gates and ANN weighted threshold elements into GP or NEAT should also be investigated. This idea is not entirely novel; for instance, some implementations of Cartesian GP include binary logic elements [23]. Future exploration into integrated computational substrates, the methods that allow for arbitrary integration, and methods for testing these emerging systems provides ample opportunity for research and development.

The work presented here suggests that the typical approach of finding the correct computational substrate for a given problem is sub-optimal compared to the Buffet Method, in which evolution can be used not only to discover which computational substrate is optimal for a given problem, but also to generate new hybrid systems in an automated manner.

Far from insisting that everyone should abandon what they are doing to work on the Buffet Method, we hope that work continues to explore separate domains, so that whenever a new idea is shown to perform well (even if its domain is narrow), it can be integrated into the buffet.

One area we did not explore is task classification based on gate usage. The Buffet Method could be applied to a greater number of tasks, and the times to achieve optimal solutions given access to different subsets of gates could be compared. Alternatively, the distributions of gates that make up solutions when all gates are provided could be compared. Such profiles could provide an objective task classification method.
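One way to operationalize such a profile is sketched below: normalize a solution's gate-type counts into a distribution, then compare tasks by the distance between their distributions. Both helpers and the gate-type labels are hypothetical illustrations, not part of the study.

```python
def gate_profile(gate_counts):
    """Normalize gate-type counts from an evolved solution into a usage
    distribution that could serve as a task 'fingerprint'.
    (Hypothetical helper; gate-type names are illustrative.)"""
    total = sum(gate_counts.values())
    return {gate: count / total for gate, count in gate_counts.items()}

def profile_distance(p, q):
    """L1 distance between two gate-usage profiles; a small distance may
    indicate tasks with similar computational demands."""
    gates = set(p) | set(q)
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in gates)
```

Clustering tasks by such distances would give a classification grounded in which computational elements evolution actually recruits, rather than in a priori task descriptions.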

Lastly, we found that the Buffet Method creates new heterogeneous solutions made from different components. One cannot help but wonder how components never intended to work together might suddenly form functional computational machines. We will explore these hybrid substrates in the future.