Evolving Better RNAfold Structure Prediction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10781)

Abstract

Grow and graft genetic programming (GGGP) evolves more than 50000 parameters in a state-of-the-art C program, making functional source code changes that give more accurate predictions of how RNA molecules fold. Genetic improvement (GI) updates 29% of the dynamic programming free energy model parameters. On 4655 known secondary structures from RNA_STRAND, GI gives better results in most cases (50.3%); 29.0% are worse and 20.7% are unchanged. It also does better than the parameters recommended by Andronescu, M., et al.: Bioinformatics 23(13) (2007) i19–i28.
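
The better/worse/unchanged split reported above is a paired comparison over a set of known structures. A minimal sketch of such a tally follows, assuming each structure has one accuracy score under the original parameters and one under the evolved parameters, with higher meaning better. The function name, the tolerance, and the example scores are hypothetical and are not taken from the paper, which may measure accuracy differently.

```python
from collections import Counter


def tally_outcomes(baseline_scores, evolved_scores, tol=1e-12):
    """Classify paired scores as better, worse, or unchanged.

    Both arguments are sequences of per-structure accuracy scores
    (hypothetical metric, higher is better). Returns a dict mapping
    each outcome to a (count, percentage) pair.
    """
    if len(baseline_scores) != len(evolved_scores):
        raise ValueError("score lists must be the same length")

    counts = Counter({"better": 0, "worse": 0, "unchanged": 0})
    for base, new in zip(baseline_scores, evolved_scores):
        if new > base + tol:
            counts["better"] += 1
        elif new < base - tol:
            counts["worse"] += 1
        else:
            counts["unchanged"] += 1

    total = len(baseline_scores)
    return {
        outcome: (n, 100.0 * n / total if total else 0.0)
        for outcome, n in counts.items()
    }


if __name__ == "__main__":
    # Made-up scores for four structures, for illustration only.
    baseline = [0.50, 0.50, 0.50, 0.50]
    evolved = [0.60, 0.40, 0.50, 0.70]
    for outcome, (n, pct) in tally_outcomes(baseline, evolved).items():
        print(f"{outcome}: {n} ({pct:.1f}%)")
```

Counting ties separately matters here: with a sizeable unchanged group, "better in most cases" and "better more often than worse" are different statements.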

Notes

  1. The ViennaRNA package must first be configured with ./configure --enable-sse. See https://www.tbi.univie.ac.at/RNA/documentation.html

References

  1. Tsunoda, M., et al.: Structural basis for recognition of cognate tRNA by tyrosyl-tRNA synthetase from 3 kingdoms. Nucleic Acids Res. 35(13), 4289–4300 (2007). https://doi.org/10.1093/nar/gkm417

  2. Crick, F.: Central dogma of molecular biology. Nature 227, 561–563 (1970). https://doi.org/10.1038/227561a0

  3. Andronescu, M., et al.: RNA STRAND: The RNA secondary structure and statistical analysis database. BMC Bioinformatics 9(1), 340 (2008). https://doi.org/10.1186/1471-2105-9-340

  4. Reeder, J., et al.: pknotsRG: RNA pseudoknot folding including near-optimal structures and sliding windows. Nucleic Acids Res. 35(Suppl 2), W320–W324 (2007). https://doi.org/10.1093/nar/gkm258

  5. Langdon, W.B., Harman, M.: Grow and graft a better CUDA pknotsRG for RNA Pseudoknot free energy calculation. In: GI 2015 Workshop, pp. 805–810 (2015). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/langdon_2015_gi_pknots.html

  6. Lorenz, R., et al.: ViennaRNA package 2.0. Algorithms Mol. Biol. 6(1), 26 (2011). https://doi.org/10.1186/1748-7188-6-26

  7. Lee, J., et al.: RNA design rules from a massive open laboratory. PNAS 111(6), 2122–2127 (2013). https://doi.org/10.1073/pnas.1313039111

  8. Harman, M., Jia, Y., Langdon, W.B.: Babel Pidgin: SBSE can grow and graft entirely new functionality into a real world system. In: Le Goues, C., Yoo, S. (eds.) SSBSE 2014. LNCS, vol. 8636, pp. 247–252. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-09940-8_20

  9. Jia, Y., Harman, M., Langdon, W.B., Marginean, A.: Grow and serve: growing Django citation services using SBSE. In: Barros, M., Labiche, Y. (eds.) SSBSE 2015. LNCS, vol. 9275, pp. 269–275. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-22183-0_22

  10. Kocsis, Z.A., Swan, J.: Genetic programming + proof search = automatic improvement. J. Autom. Reasoning 60(2), 157–176 (2018). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/PolyfinicJAR.html

  11. Langdon, W.B., Lam, B.Y.H., Petke, J., Harman, M.: Improving CUDA DNA analysis software with genetic programming. In: GECCO, pp. 1063–1070 (2015). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/Langdon_2015_GECCO.html

  12. Langdon, W.B.: Genetic improvement of software for multiple objectives. In: Barros, M., Labiche, Y. (eds.) SSBSE 2015. LNCS, vol. 9275, pp. 12–28. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-22183-0_2

  13. Langdon, W.B., Lam, B.Y.H., Modat, M., Petke, J., Harman, M.: Genetic improvement of GPU software. GP & EM 18(1), 5–44 (2017). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/Langdon_2016_GPEM.html

  14. Langdon, W.B.: Genetically improved software. In: Gandomi, A.H., et al. (eds.) Handbook of Genetic Programming Applications, pp. 181–220. Springer, Cham (2015). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/langdon_2015_hbgpa.html

  15. Petke, J., Haraldsson, S.O., Harman, M., Langdon, W.B., White, D.R., Woodward, J.R.: Genetic improvement of software: a comprehensive survey. IEEE Transactions on Evolutionary Computation (In press). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/Petke_gisurvey.html

  16. Langdon, W.B., Harman, M.: Optimising existing software with genetic programming. IEEE Trans. Evol. Comput. 19(1), 118–135 (2015). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/Langdon_2013_ieeeTEC.html

  17. Bruce, B.R., Petke, J., Harman, M.: Reducing energy consumption using genetic improvement. In: GECCO, pp. 1327–1334. ACM (2015)

  18. Wu, F., Weimer, W., Harman, M., Jia, Y., Krinke, J.: Deep parameter optimisation. In: Silva, S., et al., (Eds.) GECCO, pp. 1375–1382 (2015). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/Wu_2015_GECCO.html

  19. Marginean, A., Barr, E.T., Harman, M., Jia, Y.: Automated transplantation of call graph and layout features into kate. In: Barros, M., Labiche, Y. (eds.) SSBSE 2015. LNCS, vol. 9275, pp. 262–268. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-22183-0_21

  20. Le Goues, C., Nguyen, T., Forrest, S., Weimer, W.: GenProg: a generic method for automatic software repair. IEEE Trans. Softw. Eng. 38(1), 54–72 (2012). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/DBLP_journals_tse_GouesNFW12.html

  21. Langdon, W.B., Lorenz, R.: Improving SSE parallel code with grow and graft genetic programming. In: Petke, J., et al. (Eds.) GI-2017, pp. 1537–1538. ACM (2017). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/Langdon_2017_GI.html

  22. Koza, J.R.: Genetic Programming. MIT press, Cambridge, MA (1992). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/koza_book.html

  23. Banzhaf, W., Nordin, P., Keller, R.E., Francone, F.D.: Genetic Programming - An Introduction. Morgan Kaufmann, San Francisco (1998). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/banzhaf_1997_book.html

  24. Poli, R., Langdon, W.B., McPhee, N.F.: A field guide to genetic programming. Lulu Enterprises, UK (2008). http://www.gp-field-guide.org.uk

  25. Das, R.: Personal Communication (2017)

  26. Langdon, W.B.: Evolving better RNAfold C source code. Technical Report RN/17/08, University College London (2017). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/langdon_RN1708.html

  27. MacKerell Jr., A.D., Banavali, N., Foloppe, N.: Development and current status of the CHARMM force field for nucleic acids. Biopolymers 56(4), 257–265 (2000). https://doi.org/10.1002/1097-0282(2000)56:4%3c257::AID-BIP10029%3e3.0.CO;2-W

  28. Zuber, J., et al.: A sensitivity analysis of RNA folding nearest neighbor parameters identifies a subset of free energy parameters with the greatest impact on RNA secondary structure prediction. Nucleic Acids Res. 45(10), 6168–6176 (2017). https://doi.org/10.1093/nar/gkx170

  29. Angeline, P.J.: Multiple interacting programs: a representation for evolving complex behaviors. Cybern. Syst. 29(8), 779–803 (1998). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/angeline_1998_mips3.html

  30. Langdon, W.B.: Genetic Programming and Data Structures. Kluwer, Norwell (1998). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/langdon_book.html

  31. Langdon, W.B., Lam, B.Y.H.: Genetically improved BarraCUDA. BioData Min., 20(28) (2017). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/Langdon_2017_BDM.html

  32. Weimer, W., Nguyen, T., Le Goues, C., Forrest, S.: Automatically finding patches using genetic programming. In: ICSE, pp. 364–374 (2009). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/Weimer_2009_ICES.html

  33. Andronescu, M., et al.: Efficient parameter estimation for RNA secondary structure prediction. Bioinformatics 23(13), i19–i28 (2007). https://doi.org/10.1093/bioinformatics/btm223

  34. Schmidt, M., Lipson, H.: Distilling free-form natural laws from experimental data. Science 324(5923), 81–85 (2009). http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/Science09_Schmidt.html

Acknowledgements

I am grateful for the assistance of Rhiju Das and Fernando Portela, and our anonymous reviewers.

Corresponding author

Correspondence to William B. Langdon.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Langdon, W.B., Petke, J., Lorenz, R. (2018). Evolving Better RNAfold Structure Prediction. In: Castelli, M., Sekanina, L., Zhang, M., Cagnoni, S., García-Sánchez, P. (eds) Genetic Programming. EuroGP 2018. Lecture Notes in Computer Science, vol 10781. Springer, Cham. https://doi.org/10.1007/978-3-319-77553-1_14

  • DOI: https://doi.org/10.1007/978-3-319-77553-1_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77552-4

  • Online ISBN: 978-3-319-77553-1

  • eBook Packages: Computer Science (R0)
