Evolving rule induction algorithms with multi-objective grammar-based genetic programming

  • Regular Paper
Knowledge and Information Systems

Abstract

Multi-objective optimization has played a major role in solving problems where two or more conflicting objectives must be optimized simultaneously. This paper presents a multi-objective grammar-based genetic programming (MOGGP) system that automatically evolves complete rule induction algorithms, which in turn produce rule models that are both accurate and compact. The system was compared with a single-objective grammar-based genetic programming (GGP) system and three other rule induction algorithms. In total, 20 UCI data sets were used to generate and test generic rule induction algorithms, which can now be applied to any classification data set. Experiments showed that, in general, the proposed MOGGP finds rule induction algorithms with competitive predictive accuracies and more compact models than the algorithms it was compared with.
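
To make the trade-off between accuracy and compactness concrete, the sketch below shows generic Pareto dominance over two objectives. It is only an illustration under stated assumptions: the abstract does not say which selection scheme MOGGP uses, and the `Candidate` type, the `model_size` measure, and all numeric values are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical evolved algorithm, summarized by two objectives."""
    name: str
    accuracy: float   # predictive accuracy; higher is better
    model_size: int   # e.g. number of rule conditions; lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.accuracy >= b.accuracy and a.model_size <= b.model_size
    strictly_better = a.accuracy > b.accuracy or a.model_size < b.model_size
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up values, for illustration only.
candidates = [
    Candidate("A", 0.90, 40),
    Candidate("B", 0.88, 12),
    Candidate("C", 0.85, 30),
    Candidate("D", 0.90, 55),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy example, neither A nor B dominates the other: A is more accurate, B is more compact. A single-objective search that optimizes accuracy alone would have no reason to prefer B, nor to prefer A over the larger but equally accurate D.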

Author information

Corresponding author

Correspondence to Gisele L. Pappa.

About this article

Cite this article

Pappa, G.L., Freitas, A.A. Evolving rule induction algorithms with multi-objective grammar-based genetic programming. Knowl Inf Syst 19, 283–309 (2009). https://doi.org/10.1007/s10115-008-0171-1

  • DOI: https://doi.org/10.1007/s10115-008-0171-1
