On sampling error in genetic programming
Created by W.Langdon from
gp-bibliography.bib Revision:1.8010
@Article{Schweim:NatComput,
author = "Dirk Schweim and David Wittenberg and Franz Rothlauf",
title = "On sampling error in genetic programming",
journal = "Natural Computing",
year = "2022",
volume = "21",
pages = "173--186",
keywords = "genetic algorithms, genetic programming, Sampling
error, Initial supply, Building blocks, Initial
population, Ramped half-and-half, Full, Grow, n-Grams",
ISSN = "1567-7818",
URL = "https://rdcu.be/cmvW6",
DOI = "doi:10.1007/s11047-020-09828-w",
size = "14 pages",
abstract = "The initial population in genetic programming (GP)
should form a representative sample of all possible
solutions (the search space). While large populations
accurately approximate the distribution of possible
solutions, small populations tend to incorporate a
sampling error. This paper analyzes how the size of a
GP population affects the sampling error and
contributes to answering the question of how to size
initial GP populations. First, we present a
probabilistic model of the expected number of subtrees
for GP populations initialized with full, grow, or
ramped half-and-half. Second, based on our frequency
model, we present a model that estimates the sampling
error for a given GP population size. We validate our
models empirically and show that, compared to smaller
population sizes, our recommended population sizes
largely reduce the sampling error of measured fitness
values. Increasing the population sizes even more,
however, does not considerably reduce the sampling
error of fitness values. Last, we recommend population
sizes for some widely used benchmark problem instances
that result in a low sampling error. A low sampling
error at initialization is necessary (but not
sufficient) for a reliable search since lowering the
sampling error means that the overall random variations
in a random sample are reduced. Our results indicate
that sampling error is a severe problem for GP, making
large initial population sizes necessary to obtain a
low sampling error. Our model allows practitioners of
GP to determine a minimum initial population size so
that the sampling error is lower than a threshold,
given a confidence level.",
notes = "Johannes Gutenberg University, Jakob-Welder-Weg
9, 55128 Mainz, Germany",
}