Abstract
We developed genetic programming (GP)-based evolutionary ensemble system for the early diagnosis, prognosis and prediction of human breast cancer. This system has effectively exploited the diversity in feature and decision spaces. First, individual learners are trained in different feature spaces using physicochemical properties of protein amino acids. Their predictions are then stacked to develop the best solution during GP evolution process. Finally, results for HBC-Evo system are obtained with optimal threshold, which is computed using particle swarm optimization. Our novel approach has demonstrated promising results compared to state of the art approaches.
References
Aminzadeh F, Shadgar B, Osareh A (2011) A robust model for gene analysis and classification. Int J Multimed Appl 3:11–20
Good BM, Loguercio S, Griffith OL, Nanis M, Wu C, Su AI (2014) The cure: making a game of gene selection for breast cancer survival prediction. arXiv preprint arXiv 1402:3632
Jiang X, Wei R, Zhao Y, Zhang T (2008) Using Chou’s pseudo amino acid composition based on approximate entropy and an ensemble of AdaBoost classifiers to predict protein subnuclear location. Amino Acids 34(4):669–675
Khan A, Majid A, Hayat M (2011) CE-PLoc: an ensemble classifier for predicting protein subcellular locations by fusing different modes of pseudo amino acid composition. Comput Biol Chem 35(4):218–229
Lavanya D, Rani KU (2012) Ensemble decision making system for breast cancer data. Int J Comput Appl 51(17):0975–8887
Li ZC, Zhou XB, Dai Z, Zou XY (2009) Prediction of protein structural classes by Chou’s pseudo amino acid composition: approached using continuous wavelet transform and principal component analysis. Amino Acids 37(2):415–425
Li DC, Liu CW, Hu SC (2011) A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets. Artif Intell Med 52:45–52. doi:10.1016/j.artmed.2011.02.001
Majid A, Safdar A, Mubashar I, Nabeela K (2014) Prediction of human breast and colon cancers from imbalanced data using nearest neighbor and support vector machines. Comput Methods Progr Biomed 113(3):792–808
Maqsood H, Khan A, Yeasin M (2012) Prediction of membrane proteins using split amino acid and ensemble classification. Amino Acids 42(6):2447–2460
Munteanu CR, Magalhães AL, Uriarte E, González-Díaz H (2009) Multi-target QPDR classification model for human breast and colon cancer-related proteins using star graph topological indices. J Theor Biol 257(2):303–311
Pena-Reyes CA, Sipper M (1999) A fuzzy-genetic approach to breast cancer diagnosis. Artif Intell Med 17:131–155
Ruxandra S, Stoean C (2013) Modeling medical decision making by support vector machines, explaining by rules of evolutionary algorithms with feature selection. Expert Syst Appl 40:2677–2686
Safdar A, Majid A, Khan A (2014) IDM-PhyChm-Ens: intelligent decision-making ensemble methodology for classification of human breast cancer using physicochemical properties of amino acids. Amino Acids 46(4):977–993
Xin M, Guo J, Liu H, Xie J, Sun X (2012) Sequence-based prediction of DNA-binding residues in proteins with conservation and correlation information. IEEE ACM Trans Comput Biol Bioinf 9(6):1766–1775
Conflict of interest
It is stated that we authors do not have any type of “conflict of interest” in the submission of this paper.
Author information
Authors and Affiliations
Corresponding author
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
About this article
Cite this article
Majid, A., Ali, S. HBC-Evo: predicting human breast cancer by exploiting amino acid sequence-based feature spaces and evolutionary ensemble system. Amino Acids 47, 217–221 (2015). https://doi.org/10.1007/s00726-014-1871-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00726-014-1871-3