Skip to main content

RECIPE: A Grammar-Based Framework for Automatically Evolving Classification Pipelines

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 10196))

Abstract

Automatic Machine Learning is a growing area of machine learning that has a similar objective to the area of hyper-heuristics: to automatically recommend optimized pipelines, algorithms or appropriate parameters to specific tasks without much dependency on user knowledge. The background knowledge required to solve the task at hand is actually embedded into a search mechanism that builds personalized solutions to the task. Following this idea, this paper proposes RECIPE (REsilient ClassifIcation Pipeline Evolution), a framework based on grammar-based genetic programming that builds customized classification pipelines. The framework is flexible enough to receive different grammars and can be easily extended to other machine learning tasks. RECIPE overcomes the drawbacks of previous evolutionary-based frameworks, such as generating invalid individuals, and organizes a high number of possible suitable data pre-processing and classification methods into a grammar. Results of f-measure obtained by RECIPE are compared to those two state-of-the-art methods, and shown to be as good as or better than those previously reported in the literature. RECIPE represents a first step towards a complete framework for dealing with different machine learning tasks with the minimum required human intervention.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Notes

  1. 1.

    https://github.com/RecipeML/Recipe.

  2. 2.

    https://github.com/RecipeML/Recipe/tree/master/grammars.

  3. 3.

    http://scikit-learn.org/stable/modules/classes.html.

References

  1. Banzhaf, W., Francone, F.D., Keller, R.E., Nordin, P.: Genetic Programming - An Introduction: On the Automatic Evolution of Computer Programs and Its Applications. Morgan Kaufmann Publishers Inc., Burlington (1998)

    Book  MATH  Google Scholar 

  2. Pappa, G.L., Ochoa, G., Hyde, M.R., Freitas, A.A., Woodward, J., Swan, J.: Contrasting meta-learning and hyper-heuristic research: the role of evolutionary algorithms. Genet. Program. Evolvable Mach. 15(1), 3–35 (2014)

    Article  Google Scholar 

  3. Olson, R.S., Bartley, N., Urbanowicz, R.J., Moore, J.H.: Evaluation of a tree-based pipeline optimization tool for automating data science. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 485–492 (2016)

    Google Scholar 

  4. Thornton, C., Hutter, F., Hoos, H.H., Leyton-Brown, K.: Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In: Proceedings of the International Conference on Knowledge Discovery and Data Mining, pp. 847–855 (2013)

    Google Scholar 

  5. Feurer, M., Klein, A., Eggensperger, K., Springenberg, J., Blum, M., Hutter, F.: Efficient and robust automated machine learning. In: Proceedings of the International Conference on Neural Information Processing Systems, pp. 2755–2763 (2015)

    Google Scholar 

  6. Olson, R.S., Urbanowicz, R.J., Andrews, P.C., Lavender, N.A., Kidd, L.C., Moore, J.H.: Automating biomedical data science through tree-based pipeline optimization. In: Squillero, G., Burelli, P. (eds.) EvoApplications 2016. LNCS, vol. 9597, pp. 123–137. Springer, Heidelberg (2016). doi:10.1007/978-3-319-31204-0_9

    Chapter  Google Scholar 

  7. McKay, R., Hoai, N., Whigham, P., Shan, Y., O’Neill, M.: Grammar-based genetic programming: a survey. Genet. Program. Evolvable Mach. 11(3), 365–396 (2010)

    Article  Google Scholar 

  8. Mendoza, H., Klein, A., Feurer, M., Springenberg, J., Hutter, F.: Towards automatically-tuned neural networks. In: Proceedings of the ICML AutoML Workshop (2016)

    Google Scholar 

  9. Stanley, K.O., Miikkulainen, R.: Evolving neural networks through augmenting topologies. Evol. Comput. 10(2), 99–127 (2002)

    Article  Google Scholar 

  10. Yao, X.: Evolving artificial neural networks. Proc. IEEE 87(9), 1423–1447 (1999)

    Article  Google Scholar 

  11. Pappa, G.L., Freitas, A.A.: Automating the Design of Data Mining Algorithms: An Evolutionary Computation Approach. Springer, Heidelberg (2009)

    MATH  Google Scholar 

  12. Dioşan, L., Rogozan, A., Pecuchet, J.P.: Improving classification performance of support vector machine by genetically optimising kernel shape and hyper-parameters. Appl. Intell. 36(2), 280–294 (2012)

    Article  Google Scholar 

  13. Mantovani, R.G., Rossi, A.L.D., Vanschoren, J., Bischl, B., de Carvalho, A.: Effectiveness of random search in SVM hyper-parameter tuning. In: Proceedings of the International Joint Conference on Neural Networks, pp. 1–8 (2015)

    Google Scholar 

  14. Barros, R.C., Basgalupp, M.P., de Carvalho, A.C.P.L.F., Freitas, A.A.: Automatic design of decision-tree algorithms with evolutionary algorithms. Evol. Comput. 21(4), 659–684 (2013)

    Article  Google Scholar 

  15. Sá, A.G.C., Pappa, G.L.: Towards a method for automatically evolving bayesian network classifiers. In: Proceedings of the Conference Companion on Genetic and Evolutionary Computation, pp. 1505–1512 (2013)

    Google Scholar 

  16. Sá, A.G.C., Pappa, G.L.: A hyper-heuristic evolutionary algorithm for learning bayesian network classifiers. In: Bazzan, A.L.C., Pichara, K. (eds.) IBERAMIA 2014. LNCS (LNAI), vol. 8864, pp. 430–442. Springer, Heidelberg (2014). doi:10.1007/978-3-319-12027-0_35

    Google Scholar 

  17. Springenberg, J.T., Klein, A., Falkner, S., Hutter, F.: Bayesian optimization with robust bayesian neural networks. In: Proceedings of the Conference on Neural Information Processing Systems (2016)

    Google Scholar 

  18. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. ACM SIGKDD Explor. Newsl. 11, 10–18 (2009)

    Article  Google Scholar 

  19. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: SciKit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

    MathSciNet  MATH  Google Scholar 

  20. Feurer, M., Springenberg, J.T., Hutter, F.: Initializing bayesian hyperparameter optimization via meta-learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 1128–1135 (2015)

    Google Scholar 

  21. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)

    Article  Google Scholar 

  22. Whigham, P.A., Dick, G., Maclaurin, J., Owen, C.A.: Examining the “best of both worlds” of grammatical evolution. In: Proceedings of the Conference on Genetic and Evolutionary Computation, pp. 1111–1118 (2015)

    Google Scholar 

  23. Witten, I.H., Frank, E., Hall, M.A.: Data Mining: Practical Machine Learning Tools and Techniques, 3rd edn. Morgan Kaufmann Publishers Inc., Burlington (2011)

    Google Scholar 

  24. Asuncion, A., Newman, D.: UCI machine learning repository (2007)

    Google Scholar 

  25. Freitas, A.A., Vasieva, O., de Magalhães, J.P.: A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related. BMC Genomics 12(1) (2011)

    Google Scholar 

  26. Souto, M., Costa, I., Araujo, D., Ludermir, T., Schliep, A.: Clustering cancer gene expression data: a comparative study. BMC Bioinf. 9(1), 497 (2008)

    Article  Google Scholar 

  27. Wan, C., Freitas, A.A., De Magalhães, J.P.: Predicting the pro-longevity or anti-longevity effect of model organism genes with new hierarchical feature selection methods. IEEE/ACM Trans. Comput. Biol. Bioinform. 12(2), 262–275 (2015)

    Article  Google Scholar 

  28. Wilcoxon, F., Katti, S., Wilcox, R.A.: Critical values and probability levels for the Wilcoxon rank sum test and the Wilcoxon signed rank test. Sel. Tables Math. Stat. 1, 171–259 (1970)

    MATH  Google Scholar 

  29. Bergstra, J., Bardenet, R., Bengio, Y., Kégl, B.: Algorithms for hyper-parameter optimization. In: Proceedings of the International Conference on Neural Information Processing Systems, pp. 2546–2554 (2011)

    Google Scholar 

  30. Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13(1), 281–305 (2012)

    MathSciNet  MATH  Google Scholar 

Download references

Acknowledgments

This work was partially supported by the following Brazilian Research Support Agencies: CNPq, CAPES and FAPEMIG.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Alex G. C. de Sá .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

de Sá, A.G.C., Pinto, W.J.G.S., Oliveira, L.O.V.B., Pappa, G.L. (2017). RECIPE: A Grammar-Based Framework for Automatically Evolving Classification Pipelines. In: McDermott, J., Castelli, M., Sekanina, L., Haasdijk, E., García-Sánchez, P. (eds) Genetic Programming. EuroGP 2017. Lecture Notes in Computer Science(), vol 10196. Springer, Cham. https://doi.org/10.1007/978-3-319-55696-3_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-55696-3_16

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-55695-6

  • Online ISBN: 978-3-319-55696-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics