Behavioral Diversity and a Probabilistically Optimal GP Ensemble

Genetic Programming and Evolvable Machines

Abstract

We propose N-version Genetic Programming (NVGP) as an ensemble method to enhance the accuracy and reduce the performance fluctuation of programs produced by genetic programming. Diversity is essential for forming successful ensembles. NVGP quantifies the behavioral diversity of ensemble members and defines an ensemble as NVGP optimal when fault occurrences among its members are independent. Applied to a DNA segment classification problem, NVGP optimal ensembles showed a significant improvement in accuracy.
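
The independence criterion in the abstract can be illustrated with a small calculation. What follows is a minimal sketch and not the paper's procedure, which this preview does not show. It assumes a binary classification task, an odd number of members combined by majority vote, and one shared member error rate `p`; the function names are hypothetical. If faults are independent, the number of failing members on an input follows a binomial distribution, so the ensemble error rate is the upper tail of that distribution, and a measured ensemble error rate can be compared against it.

```python
from math import comb


def independent_majority_error(p: float, n: int) -> float:
    """Probability that a majority of n members fail on one input,
    assuming each member fails independently with probability p."""
    if n % 2 == 0:
        raise ValueError("n must be odd so that a strict majority always exists")
    k_min = n // 2 + 1  # smallest number of failing members that outvotes the rest
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


def observed_majority_error(faults: list[list[int]]) -> float:
    """Fraction of test cases on which a majority of members fail.

    faults[i][j] is 1 if member i errs on test case j, else 0.
    """
    n = len(faults)
    cases = len(faults[0])
    wrong = sum(1 for j in range(cases) if sum(member[j] for member in faults) > n // 2)
    return wrong / cases


if __name__ == "__main__":
    # Toy fault matrix (made up for illustration): 3 members, 4 test cases.
    toy_faults = [
        [1, 0, 0, 1],
        [1, 1, 0, 0],
        [0, 1, 0, 0],
    ]
    print(independent_majority_error(0.2, 3))
    print(observed_majority_error(toy_faults))
```

An observed error rate well above the binomial figure suggests that member faults are positively correlated, which is the situation the behavioral diversity measure is meant to detect.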




About this article

Cite this article

Imamura, K., Soule, T., Heckendorn, R.B. et al. Behavioral Diversity and a Probabilistically Optimal GP Ensemble. Genetic Programming and Evolvable Machines 4, 235–253 (2003). https://doi.org/10.1023/A:1025124423708


  • DOI: https://doi.org/10.1023/A:1025124423708
