A Study of Decision Tree Induction for Data Stream Mining Using Boosting Genetic Programming Classifier

  • Conference paper
Swarm, Evolutionary, and Memetic Computing (SEMCCO 2011)

Abstract

Genetic Programming (GP) is an evolutionary soft computing approach. Data streams have become a prevalent form of input. This paper studies a GP classifier on data streams and compares its classification performance with that of other state-of-the-art data mining and stream classification approaches. Boosting is a machine learning meta-algorithm for supervised learning. A weak learner is a classifier that is only slightly correlated with the true classification: it labels examples better than random guessing. In contrast, a strong learner is a classifier that is arbitrarily well correlated with the true classification. Boosting combines a set of weak learners to create a strong learner. The boosted GP approach is observed to outperform boosted Naïve Bayes classification, which suggests that GP is a competent algorithm for data stream classification.
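The weak-to-strong idea can be illustrated with a minimal sketch. The abstract does not say which boosting variant or weak learner the paper uses, so everything here is an assumption: the sketch uses AdaBoost, a one-feature threshold rule (a "decision stump") stands in for the GP classifier, and the function names and toy data are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump.

    A stump predicts `polarity` when row[feature] <= threshold, else -polarity.
    It is only a stand-in weak learner (assumption), not the paper's GP classifier.
    """
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[f] <= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, row):
    _, f, thr, pol = stump
    return pol if row[f] <= thr else -pol

def adaboost(X, y, rounds):
    """Train `rounds` stumps; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the logarithm below is always defined.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)   # vote weight of this stump
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified examples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Weighted majority vote of all stumps."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

def count_correct(ensemble, X, y):
    return sum(predict(ensemble, row) == yi for row, yi in zip(X, y))

# Toy 1-D data (hypothetical) that no single threshold separates.
X = [[v] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

print("one stump:   ", count_correct(adaboost(X, y, rounds=1), X, y), "of", len(X))
print("three stumps:", count_correct(adaboost(X, y, rounds=3), X, y), "of", len(X))
```

Each stump on its own is only somewhat better than chance on this data, while the weighted vote of a few stumps fits the training set. A stream setting would additionally need incremental or chunk-wise training, which this sketch leaves out.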




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kumar, D.J.N., Murthy, J.V.R., Satapathy, S.C., Pullela, S.V.V.S.R.K. (2011). A Study of Decision Tree Induction for Data Stream Mining Using Boosting Genetic Programming Classifier. In: Panigrahi, B.K., Suganthan, P.N., Das, S., Satapathy, S.C. (eds) Swarm, Evolutionary, and Memetic Computing. SEMCCO 2011. Lecture Notes in Computer Science, vol 7076. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27172-4_39

  • DOI: https://doi.org/10.1007/978-3-642-27172-4_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27171-7

  • Online ISBN: 978-3-642-27172-4

  • eBook Packages: Computer Science, Computer Science (R0)
