Scaling of program functionality

Abstract

The distribution of fitness values (landscapes) of programs tends to a limit as the programs get bigger. We use Markov chain convergence theorems to give general upper bounds on the length of programs needed for convergence. How big programs need to be to approach the limit depends on the type of computer they run on. We give bounds (exponential in N, N log N and smaller) for five computer models: any computer, the average (also called amorphous or random) computer, the cyclic computer, the bit flip computer, and a computer with the four functions AND, NAND, OR and NOR. Programs can be treated as lookup tables which map between their inputs and their outputs. Using this, we prove similar convergence results for the distribution of functions implemented by linear computer programs. We show that most functions are constants and the remainder are mostly parsimonious. The effect of ad hoc rules on genetic programming (GP) is described and new heuristics are proposed. We give bounds on how long programs need to be before the distribution of their functionality is close to its limiting distribution, both in general and for average computers. The computational importance of destroying information is discussed with respect to reversible and quantum computers. Mutation randomises a genetic algorithm population in \(\frac{1}{4}(l+1)(\log(l)+4)\) generations. Results for average computers and a model like genetic programming are confirmed experimentally.

References

  1. D.H. Wolpert, W.G. Macready, No free lunch theorems for optimization. IEEE Trans. Evolut. Comput. 1(1), 67–82 (1997)

  2. J.A. Foster, Review: Discipulus: a commercial genetic programming system. Genet. Prog. Evol. Mach. 2(2), 201–203 (2001). doi:10.1023/A:1011516717456

  3. C.M. Reidys, P.F. Stadler, Combinatorial landscapes. SIAM Rev. 44(1), 3–54 (2002). doi:10.1137/S0036144501395952

  4. W.B. Langdon, R. Poli, Foundations of Genetic Programming (Springer, 2002)

  5. W.B. Langdon, Convergence rates for the distribution of program outputs, in GECCO 2002: Proceedings of the Genetic and Evolutionary Computation Conference, New York, 9–13 July 2002, ed. by W.B. Langdon, E. Cantú-Paz, K. Mathias, R. Roy, D. Davis, R. Poli, K. Balakrishnan, V. Honavar, G. Rudolph, J. Wegener, L. Bull, M.A. Potter, A.C. Schultz, J.F. Miller, E. Burke, N. Jonoska, pp. 812–819

  6. W.B. Langdon, How many good programs are there? How long are they? in Foundations of Genetic Algorithms VII, Torremolinos, Spain, 4–6 September 2002, ed. by K.A. De Jong, R. Poli, J.E. Rowe (Morgan Kaufmann, 2002), pp. 183–202

  7. W.B. Langdon, Mapping non-conventional extensions of genetic programming, in Unconventional Computing 2006, vol. 4135. LNCS, ed. by C.S. Calude, M.J. Dinneen, G. Paun, G. Rozenberg, S. Stepney (Springer, York, 4–8 September 2006), pp. 166–180, doi:10.1007/11839132_14

  8. W. Banzhaf, P. Nordin, R.E. Keller, F.D. Francone, Genetic Programming—An Introduction; On the Automatic Evolution of Computer Programs and its Applications (Morgan Kaufmann, San Francisco, 1998)

  9. W. Banzhaf, Challenging the program counter, in The Grand Challenge in Non-Classical Computation: International Workshop, York, UK, 18–19 April 2005, ed. by S. Stepney, S. Emmott

  10. W. Feller, An Introduction to Probability Theory and Its Applications, 2nd ed., vol. 1 (John Wiley and Sons, New York, 1957)

  11. W. Feller, An Introduction to Probability Theory and Its Applications, vol. 2. (John Wiley and Sons, New York, 1966)

  12. O. Haggstrom, Finite Markov Chains and Algorithmic Applications, vol. 52. London Mathematical Society Student Texts. (Cambridge University Press, Cambridge, 2002)

  13. P. Diaconis, Group Representations in Probability and Statistics, vol. 11. Lecture notes-Monograph Series. (Institute of Mathematical Sciences, Hayward, California, 1988)

  14. J.S. Rosenthal, Convergence rates for Markov chains. SIAM Rev. 37(3), 387–405 (1995)

  15. D. Stirzaker, Probability and Random Variables A Beginner’s Guide (Cambridge University Press, Cambridge, 1999)

  16. P. Nordin, Evolutionary Program Induction of Binary Machine Code and its Applications. PhD thesis, Universität Dortmund, Fachbereich Informatik (1997)

  17. J.G. Propp, D.B. Wilson, Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Struct. Algorithms 9(1–2), 223–252 (1996)

  18. W.B. Langdon, The distribution of reversible functions is Normal, in Genetic Programming Theory and Practise, Chap. 11, ed. by R.L. Riolo, B. Worzel, (Kluwer, 2003), pp. 173–188

  19. P.G. Bishop, Using reversible computing to achieve fail-safety, in Proceedings of the Eighth International Symposium On Software Reliability Engineering, Albuquerque, NM, USA, 2–5 Nov 1997 (IEEE Press, 1997), pp. 182–191

  20. T. Bäck, Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms (Oxford University Press, New York, 1996)

  21. J. Garnier, L. Kallel, M. Schoenauer, Rigorous hitting times for binary mutations. Evol. Comput. 7(2), 173–203 (1999)

  22. Y. Gao, An upper bound on the convergence rates of canonical genetic algorithms. Complex. Int. 5 (1998)

  23. T. Blickle, Theory of Evolutionary Algorithms and Application to System Synthesis. PhD thesis, Swiss Federal Institute of Technology, Zurich (November 1996)

  24. W.B. Langdon, T. Soule, R. Poli, J.A. Foster, The evolution of size and shape, in Advances in Genetic Programming 3, Chap. 13, ed. by L. Spector, W.B. Langdon, U.-M. O’Reilly, P.J. Angeline (MIT Press, 1999), pp. 163–190

  25. W.B. Langdon, Genetic Programming and Data Structures. (Kluwer, Boston, 1998)

  26. A. Teller, Genetic programming, indexed memory, the halting problem, and other curiosities, in Proceedings of the 7th annual Florida Artificial Intelligence Research Symposium, Pensacola, Florida, USA, May 1994 (IEEE Press, 1994), pp. 270–274

  27. F.D. Francone, Discipulus Owner’s Manual, version 3.0 draft edition. 11757 W. Ken Caryl Avenue F, PBM 512, Littleton, Colorado 80127-3719, USA (2001)

  28. W.B. Langdon, P. Nordin, Evolving hand-eye coordination for a humanoid robot with machine code genetic programming, in Genetic Programming, Proceedings of EuroGP’2001, vol. 2038. LNCS, Lake Como, Italy, 18–20 April 2001, ed. by J.F. Miller, M. Tomassini, P.L. Lanzi, C. Ryan, A.G.B. Tettamanzi, W.B. Langdon (Springer, 2001), pp. 313–324

  29. R. Poli, W.B. Langdon, Sub-machine-code genetic programming, in Advances in Genetic Programming 3, Chap. 13, ed. by L. Spector, W.B. Langdon, U.-M. O’Reilly, and P.J. Angeline (MIT Press, 1999), pp. 301–323

Acknowledgements

I would like to thank Jeffrey Rosenthal, Tom Westerdale, James A. Foster, Riccardo Poli, Ingo Wegener, Nic McPhee, Michael Vose, Jon Rowe, Wolfgang Banzhaf, Tina Yu and the anonymous referees.

Author information

Correspondence to W. B. Langdon.

Appendix A: Summary

The distribution of outputs produced by all computers converges to a limiting distribution as their (linear) programs get longer. We provide a general quantitative upper bound (\(2.30\,a\,I^{a}\), where \(I\) is the number of instructions and \(a\) is the length of program needed to store every possible value in the computer’s memory, Sect. 5). Tighter bounds are given for four types of computer. There are radical differences in their convergence rates. The length of programs needed for convergence depends heavily on the type of computer, the size of its (data) memory \(N\) and its instruction set.

The cyclic computer (Sect. 6) converges most slowly, taking \(\le 0.35 \times 2^{2N}\) instructions for large \(N\). In contrast, the bit flip computer (Sect. 7) takes only \(\frac{1}{4}(N+1)(\log(m)+4)\) random instructions (\(m\) is the number of bits in the output register). In both computers, the distributions of outputs and of functions converge at the same rate to a uniform limiting distribution.
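
To illustrate the bit flip model, the following Monte Carlo sketch (not from the paper; the memory size, the all-zero start state and the use of the low-order bits as the output register are illustrative assumptions) estimates how quickly the output register approaches uniformity when each random instruction flips one randomly chosen memory bit, and compares the observed total variation distance with the \(\frac{1}{4}(N+1)(\log(m)+4)\) bound quoted above.

```python
import random
from collections import Counter
from math import log

def tv_from_uniform(N, m, steps, trials=20000, seed=1):
    """Empirical total variation distance between the m-bit output
    register and the uniform distribution, after `steps` random
    single-bit-flip instructions on an N-bit memory (all-zero start)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        mem = 0
        for _ in range(steps):
            mem ^= 1 << rng.randrange(N)        # flip one random memory bit
        counts[mem & ((1 << m) - 1)] += 1       # low m bits = output register
    return 0.5 * sum(abs(counts.get(x, 0) / trials - 2**-m)
                     for x in range(2**m))

N, m = 16, 4
bound = (N + 1) * (log(m) + 4) / 4              # bound quoted above
for steps in (1, 4, round(bound), 4 * round(bound)):
    print(f"{steps:3d} instructions: TV distance {tv_from_uniform(N, m, steps):.3f}")
```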

In Sect. 8 we introduced a random, or amorphous, model of computer. This represents the average behaviour over all computers (cf. the no free lunch theorems [1]). It takes fewer than \((15.3 + 2.30\,m)/\log I\) random instructions to get close to the uniform output limit. The limiting distribution contains only functions that are constants. Again convergence is exponential, with 90% of programs of length \(1.6\,n\,2^{N}\) yielding constants (\(n\) is the size of the input register in bits).
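
To get a feel for how small this bound is, a short sketch (assuming, as the constant 2.30 ≈ ln 10 suggests, that the logarithm is natural; this is an assumption about the paper’s convention) evaluates \((15.3 + 2.30\,m)/\log I\) for a few output widths \(m\) and instruction set sizes \(I\):

```python
from math import log

def amorphous_bound(m, I):
    """Upper bound quoted above on the number of random instructions the
    average ('amorphous') computer needs before its output distribution
    is close to uniform: m output bits, I instructions in the set."""
    return (15.3 + 2.30 * m) / log(I)   # natural log assumed

for m, I in [(1, 4), (8, 16), (32, 256)]:
    print(f"m={m:2d}, I={I:3d}: about {amorphous_bound(m, I):5.1f} instructions")
```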

Section 9 shows that the output of programs composed of four common Boolean operators converges to a uniform distribution within \(\frac{1}{2}N(\log(m)+4)\) random instructions. The importance of the pragmatic heuristic of write-protecting the input register is highlighted, since without it there are no “interesting” functions in the limit of long programs.
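
The following sketch samples the truth tables implemented by long random AND/NAND/OR/NOR programs with write-protected inputs. It is an illustrative linear register machine, not the paper’s exact setup: the register count, program length, and the choice of the last register as the single-bit output are all assumptions.

```python
import random
from collections import Counter

OPS = [lambda a, b: a & b, lambda a, b: 1 - (a & b),   # AND, NAND
       lambda a, b: a | b, lambda a, b: 1 - (a | b)]   # OR, NOR

def random_program(N, n, length, rng):
    """A random linear program over N one-bit registers. Registers
    0..n-1 hold the write-protected inputs; the rest start at 0.
    Each instruction stores op(reg[s1], reg[s2]) in a non-input register."""
    return [(rng.choice(OPS), rng.randrange(N), rng.randrange(N),
             rng.randrange(n, N)) for _ in range(length)]

def truth_table(prog, N, n):
    """Truth table of the program's last register over all 2^n inputs."""
    table = []
    for x in range(2 ** n):
        reg = [(x >> i) & 1 for i in range(n)] + [0] * (N - n)
        for op, s1, s2, d in prog:
            reg[d] = op(reg[s1], reg[s2])
        table.append(reg[-1])
    return tuple(table)

rng = random.Random(2)
N, n, length = 8, 2, 200
tables = Counter(truth_table(random_program(N, n, length, rng), N, n)
                 for _ in range(2000))
print(tables.most_common(5))   # which 2-input functions dominate?
```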

In Sect. 10 we showed quantum and reversible computers do not have synchronising sequences and consequently the behaviour of their long programs is radically different from that of conventional computers.

Section 11 shows the number of generations, \(\frac{1}{4}(l+1)(\log(l)+4)\), needed for mutation alone to randomise a bit string genetic algorithm (chromosome of \(l\) bits).
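
A quick Monte Carlo sketch of this follows. The all-zero start, the per-bit mutation rate of \(1/l\) and the use of mean Hamming weight as a crude proxy for mixing are all assumptions; the theorem itself concerns distance from the uniform distribution.

```python
import random
from math import log

def mutation_mixing_demo(l=64, trials=200, seed=3):
    """Start from an all-zero chromosome of l bits and flip each bit
    independently with probability 1/l per generation. At equilibrium the
    mean Hamming weight is l/2; we watch it approach that value by the
    bound quoted above, t* = (l+1)(log(l)+4)/4."""
    rng = random.Random(seed)
    t_star = round((l + 1) * (log(l) + 4) / 4)
    for gens in (1, t_star // 4, t_star):
        total = 0
        for _ in range(trials):
            bits = [0] * l
            for _ in range(gens):
                bits = [b ^ (rng.random() < 1 / l) for b in bits]
            total += sum(bits)
        expected = l / 2 * (1 - (1 - 2 / l) ** gens)   # exact per-bit analysis
        print(f"gens={gens:3d}: mean weight {total / trials:5.1f} "
              f"(analytic {expected:5.1f}, equilibrium {l / 2})")

mutation_mixing_demo()
```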

Practical GP fitness functions will converge faster than the distribution of all functions, since they typically test only a small part of the whole function. Real GP systems allow rapid movement about the computer’s state space and so appear to be close to the bit flipping (Sect. 7) and four Boolean instruction (Sect. 9) models. We speculate that rapid \(O(|\text{test set}|\,N\log m)\) convergence in fitness distributions may be observed.

The number of minimal solutions to XOR (even-2 parity) grows quadratically with memory size (cf. Sect. 5.3), but this corresponds to a rapidly falling proportion as memory is increased.

In the Boolean linear systems considered, complex functions are very rare even in short programs and appear to reach a peak in their frequency near \(l = N/m\). This suggests a size limit of \(O(N/m)\) might be beneficial to linear GP. The peak is followed by exponential decline, with the same exponent (\(\approx\sqrt{2/N^{3}}\)) as the other non-trivial functions. Since \(\sqrt{2/N^{3}} < \log I\), the number of solutions grows exponentially with program length \(l\). It also appears that the frequency of complex functions decreases dramatically as the number of operations needed to create them from the program’s inputs increases. That is, most functions are parsimonious.

Cite this article

Langdon, W.B. Scaling of program functionality. Genet Program Evolvable Mach 10, 5–36 (2009). https://doi.org/10.1007/s10710-008-9065-y
