Genetic Programming with Embedded Feature Construction for High-Dimensional Symbolic Regression

  • Conference paper
Intelligent and Evolutionary Systems

Part of the book series: Proceedings in Adaptation, Learning and Optimization (PALO, volume 8)

Abstract

Feature construction is an effective way to overcome the limitation of poor data representation in many tasks, such as high-dimensional symbolic regression. Genetic Programming (GP) is a good choice for feature construction because of its natural ability to explore the feature space and to detect and combine important features. However, little work has been devoted to enhancing the generalisation performance of GP for high-dimensional symbolic regression through feature construction. This work aims to develop a new feature construction method, genetic programming with embedded feature construction (GPEFC), for high-dimensional symbolic regression. GPEFC keeps track of new, small, informative building blocks in the individuals with the best fitness gain and constructs new features from these building blocks. The newly constructed features dynamically augment the terminal set of GP. A series of experiments was conducted to investigate the learning ability and generalisation performance of GPEFC. The results show that GPEFC evolves more compact models efficiently, and has better learning ability and better generalisation performance than standard GP.
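The abstract gives only a high-level description of GPEFC, so the following Python sketch is an illustration of the general idea rather than the authors' implementation. It assumes expression trees stored as nested tuples, a "small" building block defined as a subtree of depth at most 1, and "fitness gain" defined as a child having lower error than its parent. The function set, the helper names (`small_building_blocks`, `augment`) and the `CF<n>` naming scheme are all hypothetical.

```python
import operator

# Hypothetical function set; the abstract does not state the one used in the paper.
FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate(tree, row):
    """Evaluate a tree on one data row (a dict mapping terminal name to value)."""
    if isinstance(tree, str):
        return row[tree]
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, row), evaluate(right, row))


def depth(tree):
    """Depth of a tree; a bare terminal has depth 0."""
    if isinstance(tree, str):
        return 0
    return 1 + max(depth(tree[1]), depth(tree[2]))


def subtrees(tree):
    """Yield every subtree, root first."""
    yield tree
    if not isinstance(tree, str):
        yield from subtrees(tree[1])
        yield from subtrees(tree[2])


def small_building_blocks(tree, max_depth=1):
    """Collect distinct non-terminal subtrees no deeper than max_depth (assumed threshold)."""
    blocks = []
    for sub in subtrees(tree):
        if not isinstance(sub, str) and depth(sub) <= max_depth and sub not in blocks:
            blocks.append(sub)
    return blocks


def augment(terminals, rows, parent, child, parent_err, child_err):
    """Promote blocks that are new in an improved child to constructed terminals.

    Returns copies of the terminal list and data rows, plus a dict that maps
    each constructed feature name to the block it stands for.
    """
    terminals = list(terminals)
    rows = [dict(r) for r in rows]
    constructed = {}
    if child_err >= parent_err:  # no fitness gain (assumed criterion): record nothing
        return terminals, rows, constructed
    old = list(subtrees(parent))
    for block in small_building_blocks(child):
        if block in old:
            continue  # not new: the parent already contained this block
        name = f"CF{len(terminals)}"  # hypothetical naming scheme
        constructed[name] = block
        terminals.append(name)
        for r in rows:
            r[name] = evaluate(block, r)  # materialise the feature as a data column
    return terminals, rows, constructed


# Toy usage: the child introduces mul(x0, x2) and has lower error than its parent.
data = [{"x0": 1.0, "x1": 2.0, "x2": 3.0}, {"x0": 2.0, "x1": 0.5, "x2": 1.0}]
parent = ("add", "x0", "x1")
child = ("add", ("mul", "x0", "x2"), "x1")
new_terminals, new_rows, constructed = augment(
    ["x0", "x1", "x2"], data, parent, child, parent_err=0.9, child_err=0.4
)
print(new_terminals, constructed)
```

In this sketch, later generations could sample the constructed terminal like any original input variable, which is one way to read "augment the terminal set dynamically". How the paper measures informativeness and limits the number of constructed features is not stated in the abstract and is left out here.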

Author information

Corresponding author

Correspondence to Qi Chen.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Chen, Q., Zhang, M., Xue, B. (2017). Genetic Programming with Embedded Feature Construction for High-Dimensional Symbolic Regression. In: Leu, G., Singh, H., Elsayed, S. (eds) Intelligent and Evolutionary Systems. Proceedings in Adaptation, Learning and Optimization, vol 8. Springer, Cham. https://doi.org/10.1007/978-3-319-49049-6_7

  • DOI: https://doi.org/10.1007/978-3-319-49049-6_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49048-9

  • Online ISBN: 978-3-319-49049-6

  • eBook Packages: Engineering, Engineering (R0)
