Semantically-based crossover in genetic programming: application to real-valued symbolic regression

Genetic Programming and Evolvable Machines

Abstract

We investigate the effects of semantically-based crossover operators in genetic programming, applied to real-valued symbolic regression problems. We propose two new relations derived from the semantic distance between subtrees, known as semantic equivalence and semantic similarity. These relations are used to guide variants of the crossover operator, resulting in two new crossover operators: semantics aware crossover (SAC) and semantic similarity-based crossover (SSC). SAC, which was introduced and studied previously, is included here for the purpose of comparison and analysis. SSC extends SAC by more closely controlling the semantic distance between subtrees to which crossover may be applied. The new operators were tested on a number of real-valued symbolic regression problems and compared with standard crossover (SC), context aware crossover (CAC), Soft Brood Selection (SBS), and No Same Mate (NSM) selection. The experimental results show that, on the problems examined and with computational effort measured by the number of function node evaluations, only SSC and SBS were significantly better than SC, and SSC was often better than SBS. Further experiments were conducted to analyse the sensitivity of SSC's performance to its parameter settings. This analysis leads to the conclusion that SSC is more constructive and has higher locality than SAC, NSM and SC; we believe these are the main reasons for the improved performance of SSC.
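The idea of guiding crossover by a semantic distance between subtrees can be illustrated with a minimal Python sketch. The abstract does not give the distance formula or the thresholds, so everything below is an assumption made for illustration: subtrees are modelled as plain callables, the distance is taken to be the mean absolute difference of outputs on a set of sample points, and the names `semantic_distance`, `pick_similar_pair`, `eps`, `lower`, `upper` and `max_trial` (with their default values) are hypothetical.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

# Stand-in for an evaluable GP subtree (assumption: one real input, one real output).
Subtree = Callable[[float], float]


def sampling_semantics(subtree: Subtree, points: Sequence[float]) -> List[float]:
    """Approximate a subtree's semantics by its outputs on fixed sample points."""
    return [subtree(x) for x in points]


def semantic_distance(a: Subtree, b: Subtree, points: Sequence[float]) -> float:
    """Mean absolute difference of sampled outputs (assumed form of the distance)."""
    sa = sampling_semantics(a, points)
    sb = sampling_semantics(b, points)
    return sum(abs(u - v) for u, v in zip(sa, sb)) / len(points)


def semantically_equivalent(a: Subtree, b: Subtree, points: Sequence[float],
                            eps: float = 1e-4) -> bool:
    """Equivalence: the distance is below a small threshold (hypothetical eps)."""
    return semantic_distance(a, b, points) < eps


def semantically_similar(a: Subtree, b: Subtree, points: Sequence[float],
                         lower: float = 1e-4, upper: float = 0.4) -> bool:
    """Similarity: different, but not too different (hypothetical bounds)."""
    return lower < semantic_distance(a, b, points) < upper


def pick_similar_pair(subtrees1: Sequence[Subtree], subtrees2: Sequence[Subtree],
                      points: Sequence[float], max_trial: int = 12,
                      rng: Optional[random.Random] = None
                      ) -> Optional[Tuple[Subtree, Subtree]]:
    """Try up to max_trial random pairs; return the first similar pair, else None.

    A caller would fall back to ordinary random subtree selection on None.
    """
    rng = rng or random.Random()
    for _ in range(max_trial):
        a, b = rng.choice(subtrees1), rng.choice(subtrees2)
        if semantically_similar(a, b, points):
            return a, b
    return None
```

In this sketch, two identical subtrees are equivalent but not similar, which matches the ordinary-English sense of similarity that excludes sameness. A crossover operator built on it would swap the returned pair between the parents, and use a standard random choice when no similar pair is found within the trial budget.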

Notes

  1. Since Sampling Semantics is defined for any subtree, it can be used in particular to estimate the semantics of the whole tree. We will use it in this way in the examples in later sections.

  2. We are using similarity here in its ordinary English meaning, where A is similar to B implies that A is not the same as B, as opposed to a common mathematical convention in which similarity includes equivalence.

  3. We assume that the computation costs of all primitive functions are the same, or at least negligibly different when compared to the cost of individual fitness evaluation.

  4. SACs with different X are denoted as SACX (with X = 2, 3, 4, and 5).

  5. Denoted as SSCX, where X is 8, 12, 16, and 20.

  6. In the original version, CAC is applied after 80% of individual evaluations in a run. Here we use node evaluations instead, for the purpose of uniform comparison. To compensate for this difference, we also tried applying CAC earlier, during the last 40%, 60%, etc. of node evaluations. The results were worse than those reported in the paper. We omit these results and report only the best results for CAC, to keep the result tables compact.

  7. Denoted as CACX with X = 1, 2, and 4.

  8. Denoted as SBSXY, with X = 3, 4 and Y = 1, 2, 4.

  9. Denoted as SSCUX where X is 0.1, 0.2, 0.4, 0.6, 0.8, or 1.

  10. Denoted as SSCLX with X = 1, 2, 3, 4, and 5.

  11. Denoted as SSCMTX, with X = 4, 8, 12, 16, and 20.

  12. Denoted as SSCNPX with X = 0.5, 1 or 2.

  13. The values for SSCLX and SSCPX are not shown in this table as they have little effect.

  14. We tried increasing Max_Trial to compensate for decreasing the upper bound. This was unsuccessful: if UBSS is too small, the exchange of semantics between the two parents is also too small, so SSC is more readily trapped in local optima.

  15. Denoted as SACX, for X = 1, 2, 3, 4, or 5.

  16. In fact, one simple way to use our method is to transform Boolean function learning problems into real-valued ones, as in [54].

References

  1. C. Alan, Meaning and language: an introduction to semantics and pragmatics (Oxford Textbooks in Linguistics, Cambridge, 2004)

  2. L. Altenberg, in Advances in Genetic Programming, ed. by K. E. Kinnear, Jr., The evolution of evolvability in genetic programming, chap. 3 (MIT Press, Cambridge, 1994), pp. 47–74

  3. C. Baier, J.P. Katoen, Principles of Model Checking (MIT Press, Cambridge, 2008).

  4. L. Beadle, C. Johnson, Semantically driven crossover in genetic programming, in Proceedings of the IEEE World Congress on Computational Intelligence (IEEE Press, New York, 2008), pp. 111–116

  5. L. Beadle, C.G. Johnson, Semantic analysis of program initialisation in genetic programming. Genet. Program. Evol. Mach. 10(3), 307–337 (2009)

  6. L. Beadle, C. G. Johnson, Semantically driven mutation in genetic programming. in 2009 IEEE Congress on Evolutionary Computation, ed. by A. Tyrrell (IEEE Computational Intelligence Society, IEEE Press, Trondheim, Norway, 18–21 May 2009), pp. 1336–1342

  7. R.E. Bryant, Graph-based algorithms for Boolean function manipulation. IEEE Trans. Comp. C-35, 677–691 (1986)

  8. E.K. Burke, S. Gustafson, G. Kendall, Diversity in genetic programming: an analysis of measures and correlation with fitness. IEEE Trans. Evol. Comput. 8(1), 47–62 (2004)

  9. R. Cleary, M. O’Neill, Solving knapsack problems with attribute grammars, in Proceedings of the Grammatical Evolution Workshop, 2004

  10. R. Cleary, M. O’Neill, in Proceedings of the Evolutionary Computation in Combinatorial Optimization. An attribute grammar decoder for the 01 multi-constrained knapsack problem (Springer, Berlin, 2005), pp. 34–45

  11. A.M. Collins, M.R. Quillian, Retrieval time from semantic memory. J. Verbal Learn. Verbal Behav. 8, 240–247 (1969)

  12. J.M. Daida, D.S. Ampy, M. Ratanasavetavadhana, H. Li, O. Chaudhri, in Proceedings of the Genetic and Evolutionary Computation Conference, (GECCO’1999). Challenges with verification, repeatability, and meaningful comparison in genetic programming: Gibson’s magic (Morgan Kaufmann, 1999), pp. 1851–1858

  13. M. de la Cruz Echeandía, A.O. de la Puente, M. Alfonseca, in Proceedings of the IWINAC 2005. Attribute grammar evolution (Springer, Berlin, 2005), pp. 182–191

  14. K. Deb, H.G. Beyer, in Proceedings of the Genetic and Evolutionary Computation Conference. Self-adaptation in real-parameter genetic algorithms with simulated binary crossover (Morgan Kaufmann, July 1999), pp. 172–179

  15. E. Galvan-Lopez, M. O’Neill, in CIG. On the effects of locality in a permutation problem: the sudoku problem (IEEE, 2009)

  16. E. Galvan-Lopez, M. O’Neill, in MICAI, Lecture Notes in Computer Science. Towards understanding the effects of locality in genetic programming (Springer, Berlin, 2009)

  17. J. Gottlieb, G. Raidl, in Proceedings of the Genetic and Evolutionary Computation Conference. The effects of locality on the dynamics of decoder-based evolutionary search (ACM, 2000), pp. 283–290

  18. S. Gustafson, E.K. Burke, N. Krasnogor, in Proceedings of the 2005 IEEE Congress on Evolutionary Computation. On improving genetic programming for symbolic regression, vol. 1. (IEEE Press, Edinburgh, 2005), pp. 912–919

  19. S. Hengpraprohm, P. Chongstitvatana, in Proceedings of ISCIT International Symposium on Communications and Information Technologies. Selective crossover in genetic programming, Nov 2001, pp. 14–16

  20. N.X. Hoai, R. McKay, D. Essam, in Proceedings of the 2002 Congress on Evolutionary Computation (CEC2002). Solving the symbolic regression problem with tree-adjunct grammar guided genetic programming: the comparative results (IEEE Press, 2002), pp. 1326–1331

  21. N.X. Hoai, R.I. McKay, D. Essam, Representation and structural difficulty in genetic programming. IEEE Trans. Evol. Comput. 10(2), 157–166 (2006)

  22. N.X. Hoai, R.I.B. McKay, D. Essam, H. Abbass, in Genetic Programming 7th European Conference, EuroGP 2004, Proceedings, vol. 3003 of LNCS, ed. by M. Keijzer, U.-M. O’Reilly, S.M. Lucas, E. Costa, T. Soule, Toward an alternative comparison between different genetic programming systems (Springer, Berlin, 2004), pp. 67–77

  23. T.-H. Hoang, D. Essam, R.I.B. McKay, X.H. Nguyen, in Proceedings of the 2007 International Symposium on Intelligent Computation and Applications (ISICA). Building on success in genetic programming: adaptive variation & developmental evaluation (China University of Geosciences Press, Wuhan, China, Sep 2007)

  24. T. Ito, H. Iba, S. Sato, in Proceedings of the 1998 IEEE World Congress on Computational Intelligence. Depth-dependent crossover for genetic programming (IEEE Press, May 1998), pp. 775–780

  25. T. Ito, H. Iba, S. Sato, in Advances in Genetic Programming. A self-tuning mechanism for depth-dependent crossover (IEEE Press, June 1999), pp. 377–399

  26. C. Johnson, in Proceedings of the 4th European Conference on Genetic Programming (EuroGP2002). Deriving genetic programming fitness properties by static analysis (Springer, Berlin, 2002), pp. 299–308

  27. C. Johnson, in Recent Advances in Soft Computing. Genetic programming with guaranteed constraints (The Nottingham Trent University, UK, 2002), pp. 134–140

  28. C. Johnson, in Proceedings of the UK Workshop on Computational Intelligence. What can automatic programming learn from theoretical computer science (University of Birmingham, Birmingham, 2002)

  29. C. Johnson, in Proceedings of the 10th European Conference on Genetic Programming (EuroGP2007). Genetic programming with fitness based on model checking (Springer, Berlin, 2007), pp. 114–124

  30. C. Johnson, in Proceedings of the 12th European Conference on Genetic Programming (EuroGP2009). Genetic programming crossover: Does it cross over? (Springer, Berlin, 2009), pp. 97–108

  31. G. Katz, D. Peled, Genetic programming and model checking: Synthesizing new mutual exclusion algorithms. Automated technology for verification and analysis. Lect. Notes Comput. Sci. 5311, 33–47 (2008)

  32. G. Katz, D. Peled, Model checking-based genetic programming with an application to mutual exclusion. Tools Algorithm. Constr. Anal. Syst. 4963, 141–156 (2008)

  33. M. Keijzer, in Proceedings of EuroGP’2003. Improving symbolic regression with interval arithmetic and linear scaling (Springer, Berlin, April 2003), pp. 70–82

  34. D. Knuth, Semantics of context-free languages. Math. Syst. Theory 2, 95 (1968)

  35. J. Koza, Genetic Programming: On the Programming of Computers by Natural Selection (MIT Press, Cambridge, 1992)

  36. J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection (The MIT Press, Cambridge, 1992)

  37. K. Krawiec, P. Lichocki, in Genetic and Evolutionary Computation Conference, GECCO 2009, Proceedings, Montreal, Québec, Canada, July 8–12, 2009, ed. by F. Rothlauf. Approximating geometric crossover in semantic space (ACM, New York, 2009), pp. 987–994

  38. K. Krawiec, B. Wieloch, in GECCO ’09: Proceedings of the 11th Annual conference on Genetic and evolutionary computation. Functional modularity for genetic programming (ACM, Montreal, July 2009), pp. 995–1002

  39. W.B. Langdon, in Proceedings of the Genetic and Evolutionary Computation Conference. Size fair and homologous tree genetic programming crossovers (Morgan Kaufmann, July 1999), pp. 1092–1097

  40. W. B. Langdon, R. Poli, Foundations of Genetic Programming (Springer, Berlin, 2002)

  41. H. Majeed, C. Ryan, in Proceedings of the 9th European Conference on Genetic Programming. A less destructive, context-aware crossover operator for GP, Lecture Notes in Computer Science (Springer, Berlin, April 2006), pp. 36–48

  42. H. Majeed, C. Ryan, in Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation (GECCO). On the constructiveness of context-aware crossover (ACM Press, New York, July 2007), pp. 1659–1666

  43. N. McPhee, B. Ohs, T. Hutchison, in Proceedings of 11th European Conference on Genetic Programming. Semantic building blocks in genetic programming (Springer, Berlin, 2008), pp. 134–145

  44. N. Mori, B. McKay, N.X. Hoai, D. Essam, S. Takeuchi, A new method for simplifying algebraic expressions in genetic programming called equivalent decision simplification. J. Adv. Comput. Intell. Intell. Inform. 13(3), 237–244 (2009)

  45. F. Nielson, H.R. Nielson, C. Hankin, Principles of Program Analysis (Springer, Berlin, 2005)

  46. H.R. Nielson, F. Nielson, Semantics with Applications: An Appetizer (Springer, London, 2007)

  47. U.M. O’Reilly, F. Oppacher, Program search with a hierarchical variable length representation: genetic programming, simulated annealing and hill climbing. Lect. Notes Comput. Sci. 866(1), 397–406 (1994)

  48. R. Poli, W.B. Langdon, in Proceedings of Soft Computing in Engineering Design and Manufacturing Conference. Genetic programming with one-point crossover (Springer, Berlin, June 1997), pp. 180–189

  49. R. Poli, W.B. Langdon, N.F. McPhee, A Field Guide to Genetic Programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk, 2008. (With contributions by J. R. Koza).

  50. B.J. Ross, Logic-based genetic programming with definite clause translation grammars. New Gen. Comput. 19(4), 313–337 (2001)

  51. F. Rothlauf, Representations for Genetic and Evolutionary Algorithms, 2nd edn. (Springer, Berlin, 2006)

  52. F. Rothlauf, D. Goldberg, Redundant representations in evolutionary algorithms. Evol. Comput. 11(4), 381–415 (2003)

  53. F. Rothlauf, M. Oetzel, in Proceedings of the 9th European Conference on Genetic Programming. On the locality of grammatical evolution, Lecture Notes in Computer Science (Springer, Berlin, April 2006), pp. 320–330

  54. R.P. Salustowicz, J. Schmidhuber, Probabilistic incremental program evolution. Evol. Comput. 5(2), 123–141 (1997)

  55. W.A. Tackett, Recombination, Selection, and the Genetic Construction of Computer Programs. PhD thesis, University of Southern California, USA, 1994

  56. W.A. Tackett, A. Carmi, in Proceedings of the 1994 IEEE World Congress on Computational Intelligence. The unique implications of brood selection for genetic programming (IEEE Press, Orlando, Florida, USA, 27–29 June 1994)

  57. N.Q. Uy, N.X. Hoai, M. O’Neill, in Proceedings of EuroGP09. Semantic aware crossover for genetic programming: the case for real-valued function regression (Springer, Berlin, April 2009), pp. 292–302.

  58. M.L. Wong, K.S. Leung, in Proceedings of the 7th IEEE International Conference on Tools with Artificial Intelligence. An induction system that learns programs in different programming languages using genetic programming and logic grammars (1995)

  59. M.L. Wong, K.S. Leung, in Proceedings of the Fourth Congress of the Italian Association for Artificial Intelligence. Learning programs in different paradigms using genetic programming (Springer, Berlin, 1995)

  60. P. Wong, M. Zhang, in 2008 IEEE World Congress on Computational Intelligence, ed. by J. Wang. SCHEME: caching subtrees in genetic programming (IEEE Computational Intelligence Society, IEEE Press, Hong Kong, 1–6 June 2008)

Acknowledgements

This paper was funded under a Postgraduate Scholarship from the Irish Research Council for Science Engineering and Technology (IRCSET). The authors would like to thank the members of NCRA (Natural Computing Research & Applications Group) at University College Dublin. The second author was partly funded by The Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01.14.09 for doing this work.

Author information

Corresponding author

Correspondence to Nguyen Xuan Hoai.

About this article

Cite this article

Uy, N.Q., Hoai, N.X., O’Neill, M. et al. Semantically-based crossover in genetic programming: application to real-valued symbolic regression. Genet Program Evolvable Mach 12, 91–119 (2011). https://doi.org/10.1007/s10710-010-9121-2

  • DOI: https://doi.org/10.1007/s10710-010-9121-2
