Abstract
Lymphoma cancer classification with DNA microarray data is an important problem in bioinformatics. Many machine learning techniques have been applied to the problem and have produced valuable results. However, the medical field requires not only a highly accurate classifier but also in-depth analysis and understanding of the classification rules obtained. Because gene expression data have thousands of features, it is nearly impossible to represent and understand their complex relationships directly. In this paper, we adopt SNR (signal-to-noise ratio) feature selection to reduce the dimensionality of the data, and then use genetic programming to generate cancer classification rules from the selected features. In experiments on a lymphoma cancer dataset, the proposed method yielded an average test accuracy of 96.6%, and it discovered an arithmetic classification rule set that classifies all the samples correctly.
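The SNR feature selection step mentioned above can be illustrated with a minimal sketch. The abstract does not spell out the formula, so this assumes the definition commonly used for two-class microarray data: the difference of the class means divided by the sum of the class standard deviations, computed per gene. The function names `snr_scores` and `top_genes`, the use of the population standard deviation, and the choice to score zero-spread genes as 0 are all assumptions made for illustration, not details taken from the paper.

```python
from statistics import mean, pstdev


def snr_scores(samples, labels):
    """Return one SNR score per gene for a two-class problem.

    samples: list of expression vectors, one per sample
    labels:  list of 0/1 class labels, same length as samples
    """
    class0 = [s for s, y in zip(samples, labels) if y == 0]
    class1 = [s for s, y in zip(samples, labels) if y == 1]
    scores = []
    for g in range(len(samples[0])):
        a = [s[g] for s in class0]
        b = [s[g] for s in class1]
        spread = pstdev(a) + pstdev(b)
        # Assumption: a gene with zero spread in both classes gets score 0.
        scores.append((mean(a) - mean(b)) / spread if spread > 0 else 0.0)
    return scores


def top_genes(samples, labels, k):
    """Return the indices of the k genes with the largest |SNR|."""
    scores = snr_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda g: abs(scores[g]), reverse=True)
    return ranked[:k]
```

Under these assumptions, the indices returned by `top_genes` would select the reduced feature set that a rule learner such as genetic programming then works with.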
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Hong, JH., Cho, SB. (2004). Lymphoma Cancer Classification Using Genetic Programming with SNR Features. In: Keijzer, M., O’Reilly, UM., Lucas, S., Costa, E., Soule, T. (eds) Genetic Programming. EuroGP 2004. Lecture Notes in Computer Science, vol 3003. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24650-3_8
DOI: https://doi.org/10.1007/978-3-540-24650-3_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21346-8
Online ISBN: 978-3-540-24650-3
eBook Packages: Springer Book Archive