Evolving SQL Queries from Examples with Developmental Genetic Programming

Part of the book series: Genetic and Evolutionary Computation (GEVO)

Abstract

Large databases are becoming ever more common, as are the opportunities for discovering useful knowledge within them. Evolutionary computation methods such as genetic programming have previously been applied to several aspects of the problem of discovering knowledge in databases. The more specific task of producing human-comprehensible SQL queries has several potential applications but has thus far been explored only to a limited extent. In this chapter we show how developmental genetic programming can automatically generate SQL queries from sets of positive and negative examples. We show that a developmental genetic programming system can produce queries that are reasonably accurate while excelling in human comprehensibility relative to the well-known C5.0 decision tree generation system.
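To make the task in the abstract concrete, the following is a minimal sketch of how a candidate query might be scored against sets of positive and negative examples. It is not the chapter's system or its fitness function: the table `people`, its columns, the example IDs, the candidate WHERE clauses, and the choice of precision, recall, and F1 as the score are all assumptions made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory table (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, age INTEGER, education TEXT)"
    )
    conn.executemany(
        "INSERT INTO people VALUES (?, ?, ?)",
        [
            (1, 45, "Masters"),
            (2, 23, "HS-grad"),
            (3, 52, "Doctorate"),
            (4, 31, "Bachelors"),
            (5, 60, "HS-grad"),
            (6, 28, "Masters"),
        ],
    )
    return conn


def score_query(conn, where_clause, positives, negatives):
    """Return (precision, recall, f1) of a candidate WHERE clause.

    The clause is interpolated directly into the SQL text, so this is
    only suitable for trusted, machine-generated candidates.
    """
    rows = conn.execute(f"SELECT id FROM people WHERE {where_clause}").fetchall()
    returned = {row[0] for row in rows}
    tp = len(returned & positives)   # positive examples the query returns
    fp = len(returned & negatives)   # negative examples the query returns
    fn = len(positives - returned)   # positive examples the query misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


conn = build_demo_db()
positives = {1, 3, 6}  # hypothetical rows the user wants returned
negatives = {2, 4, 5}  # hypothetical rows the user wants excluded

# Two hand-written candidates standing in for evolved queries.
candidates = ["age > 40", "education IN ('Masters', 'Doctorate')"]
scores = {c: score_query(conn, c, positives, negatives) for c in candidates}
best = max(candidates, key=lambda c: scores[c][2])
print(best, scores[best])
```

In an evolutionary setting, a score of this kind would rank candidate queries within a population; how the chapter's system actually constructs and evaluates queries is described in the full text, not in this sketch.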

Notes

  1. http://hampshire.edu/lspector/push.html.

  2. C5.0 is available at http://rulequest.com/see5-info.html.

Acknowledgements

We thank Gerome Miklau for advice regarding databases and the UCI Machine Learning Repository for use of the adult dataset; see http://archive.ics.uci.edu/ml/index.html. This material is based upon work supported by the National Science Foundation under Grant No. 1017817. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Author information

Corresponding author

Correspondence to Thomas Helmuth.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Helmuth, T., Spector, L. (2013). Evolving SQL Queries from Examples with Developmental Genetic Programming. In: Riolo, R., Vladislavleva, E., Ritchie, M., Moore, J. (eds) Genetic Programming Theory and Practice X. Genetic and Evolutionary Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6846-2_1

  • DOI: https://doi.org/10.1007/978-1-4614-6846-2_1

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-6845-5

  • Online ISBN: 978-1-4614-6846-2

  • eBook Packages: Computer Science, Computer Science (R0)
