Abstract
Missing values are an unavoidable issue in many real-world datasets. Classification with incomplete data has to be addressed carefully because inadequate treatment often leads to a big classification error. Interval genetic programming (IGP) is an approach to directly use genetic programming to evolve an effective and efficient classifier for incomplete data. This paper proposes a method to improve IGP for classification with incomplete data by integrating IGP with ensemble learning to build a set of classifiers. Experimental results show that the integration of IGP and ensemble learning to evolve a set of classifiers for incomplete data can achieve better accuracy than IGP alone. The proposed method is also more accurate than other common methods for classification with incomplete data.
Keywords
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Acuna, E., Rodriguez, C.: The treatment of missing values and its effect on classifier accuracy. In: Banks, D., McMorris, F.R., Arabie, P., Gaul, W. (eds.) Classification, Clustering, and Data Mining Applications. Studies in Classification, Data Analysis, and Knowledge Organisation, pp. 639–647. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-642-17103-1_60
Asuncion, A., Newman, D.: UCI Machine Learning Repository (2013)
Buuren, S., Groothuis-Oudshoorn, K.: MICE: multivariate imputation by chained equations in R. J. Stat. Softw. 45, 1–67 (2011)
Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
Espejo, P.G., Ventura, S., Herrera, F.: A survey on the application of genetic programming to classification. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 40, 121–144 (2010)
García-Laencina, P.J., Sancho-Gómez, J.-L., Figueiras-Vidal, A.R.: Pattern classification with missing data: a review. Neural Comput. Appl. 19, 263–282 (2010)
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The weka data mining software: an update. ACM SIGKDD Explor. Newsl. 11, 10–18 (2009)
Hall, M.A.: Correlation-based feature selection for discrete and numeric class machine learning. In: Proceedings of the Seventeenth International Conference on Machine Learning, pp. 359–366 (2000)
Koza, J.R.: Genetic Programming III: Darwinian Invention and Problem Solving, vol. 3 (1999)
Liu, Y., Brown, S.D.: Comparison of five iterative imputation methods for multivariate classification. Chemom. Intell. Lab. Syst. 120, 106–115 (2013)
Luke, S., et al.: A Java-based evolutionary computation research system, March 2004. http://cs.gmu.edu/~eclab/projects/ecj
Neshatian, K., Zhang, M., Andreae, P.: A filter approach to multiple feature construction for symbolic learning classifiers using genetic programming. IEEE Trans. Evol. Comput. 16, 645–661 (2012)
Opitz, D.W., Maclin, R.: Popular ensemble methods: an empirical study. J. Artif. Intell. Res. (JAIR) 11, 169–198 (1999)
Tran, C.T., Zhang, M., Andreae, P.: Directly evolving classifiers for missing data using genetic programming. In: 2016 IEEE Congress on Evolutionary Computation (CEC), pp. 5278–5285 (2016)
Tran, C.T., Zhang, M., Andreae, P., Xue, B., Bui, L.T.: An effective and efficient approach to classification with incomplete data. Knowl.-Based Syst. 154, 1–16 (2018)
White, I.R., Royston, P., Wood, A.M.: Multiple imputation using chained equations: issues and guidance for practice. Stat. Med. 30, 377–399 (2011)
Wu, X., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng, A., Liu, B., Philip, S.Y., et al.: Top 10 algorithms in data mining. Knowl. Inf. Syst. 14(1), 1–37 (2008)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Tran, C.T., Zhang, M., Xue, B., Andreae, P. (2018). Genetic Programming with Interval Functions and Ensemble Learning for Classification with Incomplete Data. In: Mitrovic, T., Xue, B., Li, X. (eds) AI 2018: Advances in Artificial Intelligence. AI 2018. Lecture Notes in Computer Science(), vol 11320. Springer, Cham. https://doi.org/10.1007/978-3-030-03991-2_53
Download citation
DOI: https://doi.org/10.1007/978-3-030-03991-2_53
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03990-5
Online ISBN: 978-3-030-03991-2
eBook Packages: Computer ScienceComputer Science (R0)