Abstract
The recent Crossover Bias theory has shown that bloat in Genetic Programming can be caused by the proliferation of small unfit individuals in the population. Inspired by this theory, Operator Equalisation is the most recent and successful bloat control method available. A recent work attempted to replicate the evolutionary dynamics of Operator Equalisation by joining two key ingredients found in older and newer bloat control methods. However, the resulting dynamics were very different from what was expected, which prompted a further investigation into the reasons that make Operator Equalisation so successful. It was revealed that, at least for complex symbolic regression problems, the distribution of program lengths enforced by Operator Equalisation is nearly flat, in contrast with the peaky, well-delimited distributions of the other approaches. In this work we study the importance of having flat program length distributions for bloat control. We measure the flatness of the distributions found in previous and new Operator Equalisation variants and correlate it with the amount of search performed by each approach. We also analyze where this search occurs and how bloat correlates with these properties. We conclude by presenting a possible explanation for the unique behavior of Operator Equalisation.
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Silva, S., Vanneschi, L. (2011). The Importance of Being Flat–Studying the Program Length Distributions of Operator Equalisation. In: Riolo, R., Vladislavleva, E., Moore, J. (eds) Genetic Programming Theory and Practice IX. Genetic and Evolutionary Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1770-5_12
DOI: https://doi.org/10.1007/978-1-4614-1770-5_12
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-1769-9
Online ISBN: 978-1-4614-1770-5
eBook Packages: Computer Science, Computer Science (R0)