Scaling of program functionality

Abstract

The distribution of fitness values (landscapes) of programs tends to a limit as the programs get bigger. We use Markov chain convergence theorems to give general upper bounds on the length of programs needed for convergence. How big programs need to be to approach the limit depends on the type of computer they run on. We give bounds (exponential in N, N log N and smaller) for five computer models: any computer, the average (also called amorphous or random) computer, the cyclic computer, the bit flip computer, and a computer with the four functions AND, NAND, OR and NOR. Programs can be treated as lookup tables which map between their inputs and their outputs. Using this, we prove similar convergence results for the distribution of functions implemented by linear computer programs. We show that most functions are constants and the remainder are mostly parsimonious. The effect of ad hoc rules on genetic programming (GP) is described and new heuristics are proposed. We give bounds on how long programs need to be before the distribution of their functionality is close to its limiting distribution, both in general and for average computers. The computational importance of destroying information is discussed with respect to reversible and quantum computers. Mutation randomises a genetic algorithm population in \(\frac{1}{4}(l+1)(\log(l)+4)\) generations. Results for average computers and a model like genetic programming are confirmed experimentally.

References

  1. D.H. Wolpert, W.G. Macready, No free lunch theorems for optimization. IEEE Trans. Evolut. Comput. 1(1), 67–82 (1997)

  2. J.A. Foster, Review: Discipulus: a commercial genetic programming system. Genet. Prog. Evol. Mach. 2(2), 201–203 (2001). doi:10.1023/A:1011516717456

  3. C.M. Reidys, P.F. Stadler, Combinatorial landscapes. SIAM Rev. 44(1), 3–54 (2002). doi:10.1137/S0036144501395952

  4. W.B. Langdon, R. Poli, Foundations of Genetic Programming (Springer, 2002)

  5. W.B. Langdon, Convergence rates for the distribution of program outputs, in GECCO 2002: Proceedings of the Genetic and Evolutionary Computation Conference, New York, 9–13 July 2002, ed. by W.B. Langdon, E. Cantú-Paz, K. Mathias, R. Roy, D. Davis, R. Poli, K. Balakrishnan, V. Honavar, G. Rudolph, J. Wegener, L. Bull, M.A. Potter, A.C. Schultz, J.F. Miller, E. Burke, N. Jonoska, pp. 812–819

  6. W.B. Langdon, How many good programs are there? How long are they? in Foundations of Genetic Algorithms VII, Torremolinos, Spain, 4–6 September 2002, ed. by K.A. De Jong, R. Poli, J.E. Rowe (Morgan Kaufmann, 2002), pp. 183–202

  7. W.B. Langdon, Mapping non-conventional extensions of genetic programming, in Unconventional Computing 2006, vol. 4135. LNCS, ed. by C.S. Calude, M.J. Dinneen, G. Paun, G. Rozenberg, S. Stepney (Springer, York, 4–8 September 2006), pp. 166–180, doi:10.1007/11839132_14

  8. W. Banzhaf, P. Nordin, R.E. Keller, F.D. Francone, Genetic Programming—An Introduction; On the Automatic Evolution of Computer Programs and its Applications (Morgan Kaufmann, San Francisco, 1998)

  9. W. Banzhaf, Challenging the program counter, in The Grand Challenge in Non-Classical Computation: International Workshop, York, UK, 18–19 April 2005, ed. by S. Stepney, S. Emmott

  10. W. Feller, An Introduction to Probability Theory and Its Applications, 2nd ed., vol. 1 (John Wiley and Sons, New York, 1957)

  11. W. Feller, An Introduction to Probability Theory and Its Applications, vol. 2. (John Wiley and Sons, New York, 1966)

  12. O. Haggstrom, Finite Markov Chains and Algorithmic Applications, vol. 52. London Mathematical Society Student Texts. (Cambridge University Press, Cambridge, 2002)

  13. P. Diaconis, Group Representations in Probability and Statistics, vol. 11. Lecture notes-Monograph Series. (Institute of Mathematical Sciences, Hayward, California, 1988)

  14. J.S. Rosenthal, Convergence rates for Markov chains. SIAM Rev. 37(3), 387–405 (1995)

  15. D. Stirzaker, Probability and Random Variables A Beginner’s Guide (Cambridge University Press, Cambridge, 1999)

  16. P. Nordin, Evolutionary Program Induction of Binary Machine Code and its Applications. PhD thesis, Universität Dortmund, Fachbereich Informatik (1997)

  17. J.G. Propp, D.B. Wilson, Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Struct. Algorithms 9(1–2), 223–252 (1996)

  18. W.B. Langdon, The distribution of reversible functions is Normal, in Genetic Programming Theory and Practise, Chap. 11, ed. by R.L. Riolo, B. Worzel, (Kluwer, 2003), pp. 173–188

  19. P.G. Bishop, Using reversible computing to achieve fail-safety, in Proceedings of the Eighth International Symposium On Software Reliability Engineering, Albuquerque, NM, USA, 2–5 Nov 1997 (IEEE Press, 1997), pp. 182–191

  20. T. Bäck, Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms (Oxford University Press, New York, 1996)

  21. J. Garnier, L. Kallel, M. Schoenauer, Rigorous hitting times for binary mutations. Evol. Comput. 7(2), 173–203 (1999)

  22. Y. Gao, An upper bound on the convergence rates of canonical genetic algorithms. Complex. Int. 5 (1998)

  23. T. Blickle, Theory of Evolutionary Algorithms and Application to System Synthesis. PhD thesis, Swiss Federal Institute of Technology, Zurich (November 1996)

  24. W.B. Langdon, T. Soule, R. Poli, J.A. Foster, The evolution of size and shape, in Advances in Genetic Programming 3, Chap. 13, ed. by L. Spector, W.B. Langdon, U.-M. O’Reilly, P.J. Angeline (MIT Press, 1999), pp. 163–190

  25. W.B. Langdon, Genetic Programming and Data Structures. (Kluwer, Boston, 1998)

  26. A. Teller, Genetic programming, indexed memory, the halting problem, and other curiosities, in Proceedings of the 7th annual Florida Artificial Intelligence Research Symposium, Pensacola, Florida, USA, May 1994 (IEEE Press, 1994), pp. 270–274

  27. F.D. Francone, Discipulus Owner’s Manual, version 3.0 draft edition. 11757 W. Ken Caryl Avenue F, PBM 512, Littleton, Colorado 80127-3719, USA (2001)

  28. W.B. Langdon, P. Nordin, Evolving hand-eye coordination for a humanoid robot with machine code genetic programming, in Genetic Programming, Proceedings of EuroGP’2001, vol. 2038. LNCS, Lake Como, Italy, 18–20 April 2001, ed. by J.F. Miller, M. Tomassini, P.L. Lanzi, C. Ryan, A.G.B. Tettamanzi, W.B. Langdon (Springer, 2001), pp. 313–324

  29. R. Poli, W.B. Langdon, Sub-machine-code genetic programming, in Advances in Genetic Programming 3, Chap. 13, ed. by L. Spector, W.B. Langdon, U.-M. O’Reilly, and P.J. Angeline (MIT Press, 1999), pp. 301–323

Acknowledgements

I would like to thank Jeffrey Rosenthal, Tom Westerdale, James A. Foster, Riccardo Poli, Ingo Wegener, Nic McPhee, Michael Vose, Jon Rowe, Wolfgang Banzhaf, Tina Yu and the anonymous referees.

Author information

Correspondence to W. B. Langdon.

Appendix A: Summary

The distribution of outputs produced by all computers converges to a limiting distribution as their (linear) programs get longer. We provide a general quantitative upper bound (\(2.30\,a\,I^{a}\), where \(I\) is the number of instructions and \(a\) is the length of program needed to store every possible value in the computer’s memory, Sect. 5). Tighter bounds are given for four types of computer. There are radical differences in their convergence rates. The length of programs needed for convergence depends heavily on the type of computer, the size of its (data) memory \(N\) and its instruction set.

The cyclic computer (Sect. 6) converges most slowly, taking \(\le 0.35 \times 2^{2N}\) instructions for large \(N\). In contrast, the bit flip computer (Sect. 7) takes only \(\frac{1}{4}(N+1)(\log(m)+4)\) random instructions (\(m\) is the number of bits in the output register). In both computers, the distributions of outputs and of functions converge at the same rate to a uniform limiting distribution.
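
To illustrate the bit flip model, the following Monte Carlo sketch (not from the paper; the memory size, the all-zero start state and the use of the low-order bits as the output register are illustrative assumptions) estimates how quickly the output register approaches uniformity when each random instruction flips one randomly chosen memory bit, and compares the observed total variation distance with the \(\frac{1}{4}(N+1)(\log(m)+4)\) bound quoted above.

```python
import random
from collections import Counter
from math import log

def tv_from_uniform(N, m, steps, trials=20000, seed=1):
    """Empirical total variation distance between the m-bit output
    register and the uniform distribution, after `steps` random
    single-bit-flip instructions on an N-bit memory (all-zero start)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        mem = 0
        for _ in range(steps):
            mem ^= 1 << rng.randrange(N)        # flip one random memory bit
        counts[mem & ((1 << m) - 1)] += 1       # low m bits = output register
    return 0.5 * sum(abs(counts.get(x, 0) / trials - 2**-m)
                     for x in range(2**m))

N, m = 16, 4
bound = (N + 1) * (log(m) + 4) / 4              # bound quoted above
for steps in (1, 4, round(bound), 4 * round(bound)):
    print(f"{steps:3d} instructions: TV distance {tv_from_uniform(N, m, steps):.3f}")
```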

In Sect. 8 we introduced a random, or amorphous, model of computer. This represents the average behaviour over all computers (cf. the no free lunch theorems [1]). It takes fewer than \((15.3 + 2.30\,m)/\log I\) random instructions to get close to the uniform output limit. The limiting distribution contains only functions that are constants. Again convergence is exponential, with 90% of programs of length \(1.6\,n\,2^{N}\) yielding constants (\(n\) is the size of the input register in bits).
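
To get a feel for how small this bound is, a short sketch (assuming, as the constant 2.30 ≈ ln 10 suggests, that the logarithm is natural; this is an assumption about the paper’s convention) evaluates \((15.3 + 2.30\,m)/\log I\) for a few output widths \(m\) and instruction set sizes \(I\):

```python
from math import log

def amorphous_bound(m, I):
    """Upper bound quoted above on the number of random instructions the
    average ('amorphous') computer needs before its output distribution
    is close to uniform: m output bits, I instructions in the set."""
    return (15.3 + 2.30 * m) / log(I)   # natural log assumed

for m, I in [(1, 4), (8, 16), (32, 256)]:
    print(f"m={m:2d}, I={I:3d}: about {amorphous_bound(m, I):5.1f} instructions")
```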

Section 9 shows that the output of programs composed of four common Boolean operators converges to a uniform distribution within \(\frac{1}{2}N(\log(m)+4)\) random instructions. The importance of the pragmatic heuristic of write-protecting the input register is highlighted, since without it there are no “interesting” functions in the limit of long programs.
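
The following sketch samples the truth tables implemented by long random AND/NAND/OR/NOR programs with write-protected inputs. It is an illustrative linear register machine, not the paper’s exact setup: the register count, program length, and the choice of the last register as the single-bit output are all assumptions.

```python
import random
from collections import Counter

OPS = [lambda a, b: a & b, lambda a, b: 1 - (a & b),   # AND, NAND
       lambda a, b: a | b, lambda a, b: 1 - (a | b)]   # OR, NOR

def random_program(N, n, length, rng):
    """A random linear program over N one-bit registers. Registers
    0..n-1 hold the write-protected inputs; the rest start at 0.
    Each instruction stores op(reg[s1], reg[s2]) in a non-input register."""
    return [(rng.choice(OPS), rng.randrange(N), rng.randrange(N),
             rng.randrange(n, N)) for _ in range(length)]

def truth_table(prog, N, n):
    """Truth table of the program's last register over all 2^n inputs."""
    table = []
    for x in range(2 ** n):
        reg = [(x >> i) & 1 for i in range(n)] + [0] * (N - n)
        for op, s1, s2, d in prog:
            reg[d] = op(reg[s1], reg[s2])
        table.append(reg[-1])
    return tuple(table)

rng = random.Random(2)
N, n, length = 8, 2, 200
tables = Counter(truth_table(random_program(N, n, length, rng), N, n)
                 for _ in range(2000))
print(tables.most_common(5))   # which 2-input functions dominate?
```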

In Sect. 10 we showed quantum and reversible computers do not have synchronising sequences and consequently the behaviour of their long programs is radically different from that of conventional computers.

Section 11 shows the number of generations, \(\frac{1}{4}(l+1)(\log(l)+4)\), needed for mutation alone to randomise a bit string genetic algorithm (chromosome of \(l\) bits).
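
A quick Monte Carlo sketch of this follows. The all-zero start, the per-bit mutation rate of \(1/l\) and the use of mean Hamming weight as a crude proxy for mixing are all assumptions; the theorem itself concerns distance from the uniform distribution.

```python
import random
from math import log

def mutation_mixing_demo(l=64, trials=200, seed=3):
    """Start from an all-zero chromosome of l bits and flip each bit
    independently with probability 1/l per generation. At equilibrium the
    mean Hamming weight is l/2; we watch it approach that value by the
    bound quoted above, t* = (l+1)(log(l)+4)/4."""
    rng = random.Random(seed)
    t_star = round((l + 1) * (log(l) + 4) / 4)
    for gens in (1, t_star // 4, t_star):
        total = 0
        for _ in range(trials):
            bits = [0] * l
            for _ in range(gens):
                bits = [b ^ (rng.random() < 1 / l) for b in bits]
            total += sum(bits)
        expected = l / 2 * (1 - (1 - 2 / l) ** gens)   # exact per-bit analysis
        print(f"gens={gens:3d}: mean weight {total / trials:5.1f} "
              f"(analytic {expected:5.1f}, equilibrium {l / 2})")

mutation_mixing_demo()
```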

Practical GP fitness functions will converge faster than the distribution of all functions, since they typically test only a small part of the whole function. Real GP systems allow rapid movement about the computer’s state space and so appear to be close to the bit flipping (Sect. 7) and four Boolean instruction (Sect. 9) models. We speculate that rapid \(O(|\text{test set}|\,N\log m)\) convergence in fitness distributions may be observed.

The number of minimal solutions to XOR (even-2 parity) grows quadratically with memory size (cf. Sect. 5.3), but this corresponds to a rapidly falling proportion as memory is increased.

In the Boolean linear systems considered, complex functions are very rare even in short programs and appear to reach a peak in their frequency near \(l = N/m\). This suggests a size limit of \(O(N/m)\) might be beneficial to linear GP. The peak is followed by exponential decline, with the same exponent (\(\approx\sqrt{2/N^{3}}\)) as the other non-trivial functions. Since \(\sqrt{2/N^{3}} < \log I\), the number of solutions grows exponentially with program length \(l\). It also appears that the frequency of complex functions decreases dramatically as the number of operations needed to create them from the program’s inputs increases. That is, most functions are parsimonious.

Cite this article

Langdon, W.B. Scaling of program functionality. Genet Program Evolvable Mach 10, 5–36 (2009). https://doi.org/10.1007/s10710-008-9065-y
