Abstract
Given infinite time, humans would model complex data in a manner that depends on prior expert knowledge. The goal of the present study is to extend and enhance a computational evolution system (CES) whose ultimate objective is to tinker with data as a human would. This is accomplished by providing flexibility in the model-building process and a meta-layer that learns how to generate better models. The key to CES is its ability to identify and exploit expert knowledge from biological databases or prior analytical results. Our prior results have demonstrated that CES is capable of efficiently navigating large and rugged fitness landscapes toward the discovery of biologically meaningful genetic models of disease. Further, we have shown that the efficacy of CES is improved dramatically when the system is provided with statistical or biological expert knowledge. The specific aim of the present study was to apply CES to the genetic analysis of prostate cancer aggressiveness in a large sample of European Americans. We introduce here the use of Pareto-optimization to help address overfitting in the learning system. We further introduce a post-processing step that uses hierarchical cluster analysis to generate expert knowledge from the landscape of best models and their predictions across patients. We find that the combination of Pareto-optimization and post-processing of results greatly improves the genetic analysis of prostate cancer.
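As a rough illustration of how Pareto-optimization can counter overfitting, the sketch below keeps only the models that no other model beats on every objective at once. The abstract does not say which objectives CES trades off; the two used here (classification accuracy and model size) are assumptions chosen for illustration, and the names `Model`, `dominates`, and `pareto_front` are hypothetical rather than part of CES.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str        # hypothetical label for a candidate model
    accuracy: float  # assumed objective 1: higher is better
    size: int        # assumed objective 2: lower is better (e.g. node count)


def dominates(a: Model, b: Model) -> bool:
    """Return True if a is no worse than b on both objectives and better on one."""
    no_worse = a.accuracy >= b.accuracy and a.size <= b.size
    strictly_better = a.accuracy > b.accuracy or a.size < b.size
    return no_worse and strictly_better


def pareto_front(models: list[Model]) -> list[Model]:
    """Keep the models that no other model dominates (input order preserved)."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


# Made-up candidates, not results from the study.
models = [
    Model("A", 0.70, 3),
    Model("B", 0.74, 5),
    Model("C", 0.74, 9),   # same accuracy as B but larger, so B dominates it
    Model("D", 0.78, 12),
    Model("E", 0.69, 6),   # less accurate and larger than A, so A dominates it
]

front = pareto_front(models)
print([m.name for m in front])
```

Under these assumptions, a larger model survives only if it buys additional accuracy, which is one way a Pareto front discourages needlessly complex models that fit noise.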
Acknowledgements
This work was supported by NIH grants LM009012, LM010098 and AI596-94. We would like to thank the participants of present and past Genetic Programming Theory and Practice Workshops (GPTP) for their stimulating feedback and discussion that helped formulate some of the ideas in this paper.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Moore, J.H., Hill, D.P., Sulovari, A., Kidd, L.C. (2013). Genetic Analysis of Prostate Cancer Using Computational Evolution, Pareto-Optimization and Post-processing. In: Riolo, R., Vladislavleva, E., Ritchie, M., Moore, J. (eds) Genetic Programming Theory and Practice X. Genetic and Evolutionary Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6846-2_7
DOI: https://doi.org/10.1007/978-1-4614-6846-2_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6845-5
Online ISBN: 978-1-4614-6846-2
eBook Packages: Computer Science, Computer Science (R0)