Abstract
Decision tree induction is a popular method in machine learning. The information gain computed for each attribute and its threshold helps find a small number of rules for data classification. However, there has been little research on how many rules are appropriate for a given data set. In this paper, an evolutionary multi-objective optimization approach with genetic programming is applied to the data classification problem to find the minimum error rate for each decision tree size. Following the structural risk minimization principle suggested by Vapnik, we can then determine the number of rules that gives the best generalization performance. The approach provides a hierarchy of decision trees with their classification performance, which is compared with the results of C4.5.
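The selection idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the candidate list, the error values, and the function names are hypothetical, and a held-out validation error is assumed here as a stand-in for the generalization estimate that structural risk minimization calls for. The sketch keeps the lowest training error for each tree size, drops sizes that a smaller tree already matches, and then picks the size with the lowest validation error.

```python
from typing import Dict, List, Tuple

# (number of rules, training error, validation error).
# All values below are made up for illustration; they are not results.
Candidate = Tuple[int, float, float]


def best_per_size(cands: List[Candidate]) -> Dict[int, Candidate]:
    """Keep, for each tree size, the candidate with the lowest training error."""
    best: Dict[int, Candidate] = {}
    for cand in cands:
        size, train_err, _ = cand
        if size not in best or train_err < best[size][1]:
            best[size] = cand
    return best


def pareto_front(best: Dict[int, Candidate]) -> List[Candidate]:
    """Drop sizes whose training error a smaller tree already matches or beats."""
    front: List[Candidate] = []
    lowest = float("inf")
    for size in sorted(best):
        if best[size][1] < lowest:
            front.append(best[size])
            lowest = best[size][1]
    return front


def select_size(front: List[Candidate]) -> int:
    """Pick the size with the lowest validation error; ties go to the smaller tree."""
    return min(front, key=lambda c: (c[2], c[0]))[0]


# Hypothetical population, e.g. the final generation of an evolutionary run.
candidates: List[Candidate] = [
    (2, 0.30, 0.32),
    (2, 0.28, 0.31),
    (3, 0.20, 0.24),
    (4, 0.21, 0.26),  # no better on training error than the 3-rule tree
    (5, 0.12, 0.18),
    (7, 0.08, 0.19),
    (9, 0.05, 0.23),  # lowest training error, but worse on validation data
]

front = pareto_front(best_per_size(candidates))
chosen = select_size(front)
print([c[0] for c in front], chosen)
```

In this toy data the training error keeps falling as rules are added, while the validation error starts rising again for the larger trees, which is the trade-off that a size-versus-error front makes visible.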
References
Blake, C., Keogh, E., Merz, C.J.: UCI repository of machine learning databases. In: Proceedings of the Fifth International Conference on Machine Learning (1998)
Bot, M.C.J.: Improving induction of linear classification trees with genetic programming. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2000), pp. 403–410. Morgan Kaufmann, San Francisco (2000)
Bot, M.C.J., Langdon, W.B.: Application of genetic programming to induction of linear classification trees. In: Proceedings of the 3rd European Conference on Genetic Programming (2000)
Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996)
Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth International Group (1984)
Fayyad, U.M.: On the induction of decision trees for multiple concept learning. Ph. D. dissertation, EECS department, University of Michigan (1991)
Fonseca, C.M., Fleming, P.J.: Genetic algorithms for multiobjective optimization: Formulation, discussion and generalization. In: Proceedings of the Fifth Int. Conf. on Genetic Algorithms, pp. 416–423. Morgan Kaufmann, San Francisco (1993)
Freitas, A.A., Pappa, G.L., Kaestner, C.A.A.: Attribute selection with a multiobjective genetic algorithm. In: Proceedings of the 16th Brazilian Symposium on Artificial Intelligence, pp. 280–290. Springer, Heidelberg (2002)
Irani, K.B., Khaminsani, V.A.: Knowledge based automation of semiconductor manufacturing. In SRC Project Annual Review Report, The University of Michigan, Ann Arbor (1991)
De Jong, E.D., Pollack, J.B.: Multi-objective methods for tree size control. Genetic Programming and Evolvable Machines 4(3), 211–233 (2003)
Kim, D., Hallam, J.: An evolutionary approach to quantify internal states needed for the woods problem. In: From Animals to Animats 7, pp. 312–322. MIT Press, Cambridge (2002)
Mingers, J.: An empirical comparison of selection measures for decision-tree induction. Machine Learning 4(2), 227–243 (1989)
Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
Quinlan, J.R.: Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research 4, 77–90 (1996)
Quinlan, J.R., Rivest, R.: Inferring decision trees using the minimum description length principle. Information and Computation 80(3), 227–248 (1989)
Vapnik, V.N.: The nature of statistical learning theory. Springer, Heidelberg (1995)
Zitzler, E.: Evolutionary Algorithms for Multiobjective Optimization: Methods and Applications. Ph. D. dissertation, Swiss Federal Institute of Technology (1999)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, D. (2004). Structural Risk Minimization on Decision Trees Using an Evolutionary Multiobjective Optimization. In: Keijzer, M., O’Reilly, UM., Lucas, S., Costa, E., Soule, T. (eds) Genetic Programming. EuroGP 2004. Lecture Notes in Computer Science, vol 3003. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24650-3_32
DOI: https://doi.org/10.1007/978-3-540-24650-3_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21346-8
Online ISBN: 978-3-540-24650-3
eBook Packages: Springer Book Archive