Genetic Programming for Natural Language Parsing

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3003)

Abstract

The aim of this paper is to demonstrate the effectiveness of the genetic programming approach to the automatic parsing of sentences from real texts. Classical parsing methods rely on complete search techniques to find the different interpretations of a sentence. However, the size of the search space increases exponentially with the length of the sentence or text to be parsed and with the size of the grammar, so exhaustive search methods can fail to reach a solution in a reasonable time. This paper presents the implementation of a probabilistic bottom-up parser based on genetic programming, which works with a population of partial parses, i.e. parses of sentence segments. The quality of each individual is computed as a measure of its probability, which is obtained from the probabilities of the grammar rules and lexical tags involved in the parse. In the approach adopted here, the size of the generated trees is limited by the length of the sentence. In this way, the size of the search space, which is determined by the length of the sentence to parse, the number of valid lexical tags for each word and, especially, the size of the grammar, is also limited.
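As an illustration of how the quality of a partial parse might be derived from rule and tag probabilities, the following minimal sketch scores a parse tree as the product of the probabilities of the grammar rules and lexical tags it uses. The abstract does not give the exact formula, so the product form is an assumption, and the toy grammar, the tag table, and the names `RULE_PROB`, `TAG_PROB`, `Node` and `parse_probability` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical toy grammar: P(rule) for each (left-hand side, right-hand side).
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "Noun")): 0.6,
    ("NP", ("Noun",)): 0.4,
    ("VP", ("Verb", "NP")): 0.7,
    ("VP", ("Verb",)): 0.3,
}

# Hypothetical lexical tag probabilities, P(tag | word).
TAG_PROB = {
    ("the", "Det"): 1.0,
    ("dog", "Noun"): 0.9,
    ("dog", "Verb"): 0.1,
    ("barks", "Verb"): 0.8,
    ("barks", "Noun"): 0.2,
}


@dataclass
class Node:
    """A parse-tree node; leaves carry a word and a lexical tag as label."""
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None


def parse_probability(node: Node) -> float:
    """Assumed fitness: product of the rule and tag probabilities in the tree.

    Unknown rules or (word, tag) pairs score 0.0, so invalid parses
    receive the lowest possible quality.
    """
    if node.word is not None:
        return TAG_PROB.get((node.word, node.label), 0.0)
    rhs = tuple(child.label for child in node.children)
    p = RULE_PROB.get((node.label, rhs), 0.0)
    for child in node.children:
        p *= parse_probability(child)
    return p


# A partial parse covering only the segment "the dog".
np_tree = Node("NP", [Node("Det", word="the"), Node("Noun", word="dog")])
# A complete parse of "the dog barks".
s_tree = Node("S", [np_tree, Node("VP", [Node("Verb", word="barks")])])

print(parse_probability(np_tree))
print(parse_probability(s_tree))
```

In this sketch a partial parse such as `np_tree` can be scored on its own, which is what allows a population of sentence-segment parses to be ranked before any complete parse exists.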

Supported by projects TIC2003-09481-C04 and 07T/0030/2003.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Araujo, L. (2004). Genetic Programming for Natural Language Parsing. In: Keijzer, M., O’Reilly, UM., Lucas, S., Costa, E., Soule, T. (eds) Genetic Programming. EuroGP 2004. Lecture Notes in Computer Science, vol 3003. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24650-3_21

  • DOI: https://doi.org/10.1007/978-3-540-24650-3_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21346-8

  • Online ISBN: 978-3-540-24650-3

  • eBook Packages: Springer Book Archive
