An evolutionary framework for machine learning applied to medical data

doi:10.1016/j.knosys.2019.104982

Knowledge-Based Systems

Volume 185, 1 December 2019, 104982

https://doi.org/10.1016/j.knosys.2019.104982

Abstract

Supervised learning problems can be tackled with a wide variety of machine learning approaches. In recent years there has been increasing interest in using the evolutionary computation paradigm as a search method for classifiers, supporting the applied machine learning technique. In this context, knowledge representation in the form of logical rules has been one of the most widely accepted machine learning approaches because of its expressiveness. This paper proposes an evolutionary framework for rule-based classifier induction. Our proposal uses genetic programming to build a search method for classification rules (IF/THEN). With this approach, we deal with problems such as maximum rule length and rule intersection. The experiments have been carried out on our domain of interest, medical data. The results define a methodology to follow when evaluating learning methods for knowledge discovery from medical data. Moreover, comparison with other methods shows that our proposal can be very useful for the analysis and classification of data from the medical domain.

Introduction

Machine learning can be approached as the systematic study of algorithms and systems that improve their knowledge or performance based on experience [1], [2], [3], [4], [5], [6]. Building machines able to learn from experience has long been a matter of debate, and such machines have proven to have a meaningful level of learning ability. Thus, the introduction of machine learning techniques into problem solving in computer science is of vital importance, since there exist problems that cannot be solved through common programming techniques [7], [8], [9], [10]. For such problems, no consistent mathematical model is available to guide the programmer. Solving those problems nevertheless has the potential to transform aspects of our lives, and machine learning methods may provide the key to their solution [11].

According to [12], the tasks involving machine learning can be classified into three main categories: supervised, unsupervised and reinforcement learning. Supervised learning builds a model from a set of inputs and a corresponding set of outputs. The goal is to find a mapping relating inputs to outputs. In contrast, unsupervised learning does not rely on labeled experience as supervised learning does: there are no labels on the data. The goal of this approach is to capture the true structure of the data in order to disclose knowledge. In reinforcement learning, intelligent processes (which can be called agents) interact with each other in a dynamic environment to reach their targets. In this approach, agents learn from a series of reinforcements (rewards or punishments), which distinguishes it from supervised learning. Since such agents are endowed with a reinforcement process, they can evolve by learning from their environment [13].

The approach that uses examples (also called instances or patterns) to create programs is known as the learning methodology, and the set of examples is referred to as the training data (or training set). The estimate of the target function, which is learned by the learning algorithm and maps inputs to outputs, is known as the solution of the learning problem (or decision function). Usually, a set of candidate functions known as hypotheses is selected before learning begins. Therefore, the set of hypotheses can be seen as the first key of the learning strategy. The second key is the learning method, i.e., the method that takes the training data as input and chooses a hypothesis from the hypothesis space. A learning problem with binary outputs is called binary classification, one with a finite number of outputs is known as multi-class classification, and one with real-valued (continuous) outputs is known as regression.
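The terms above can be made concrete with a minimal Python sketch. It is only an illustration under stated assumptions: the hypothesis set is a handful of threshold rules on a single numeric input, the learning method simply picks the hypothesis with the lowest training error, and the toy training set and all names (`make_threshold_hypotheses`, `training_error`, `learn`) are hypothetical rather than taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, int]          # (input value, binary label)
Hypothesis = Callable[[float], int]  # maps an input to a predicted label


def make_threshold_hypotheses(thresholds: Sequence[float]) -> List[Tuple[float, Hypothesis]]:
    """Build the hypothesis set: one candidate 'predict 1 if x > t' per threshold t."""
    return [(t, (lambda x, t=t: 1 if x > t else 0)) for t in thresholds]


def training_error(h: Hypothesis, data: Sequence[Example]) -> float:
    """Fraction of training examples that the hypothesis labels incorrectly."""
    return sum(h(x) != y for x, y in data) / len(data)


def learn(data: Sequence[Example],
          hypotheses: List[Tuple[float, Hypothesis]]) -> Tuple[float, Hypothesis]:
    """Learning method: choose the hypothesis with the lowest training error."""
    return min(hypotheses, key=lambda th: training_error(th[1], data))


# Toy training set (assumed data): a binary classification problem.
training_set = [(1.0, 0), (2.0, 0), (3.5, 1), (4.0, 1)]
hypotheses = make_threshold_hypotheses([0.5, 1.5, 2.5, 3.75])

# The chosen hypothesis plays the role of the decision function.
best_threshold, decision_function = learn(training_set, hypotheses)
```

Here the hypothesis set is fixed before learning starts, and the learning method is a plain exhaustive search over it; richer hypothesis spaces generally need a smarter search strategy.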

The exponential growth in the amount of available medical data raises the problems of efficient storage and management of information, as well as of disclosing useful information from the data. This is a challenge in computational medicine, calling for the development of methods and tools able to transform data into medical knowledge about the underlying mechanisms. Those tools (methods) allow us to go beyond a simple description of the data and provide knowledge in the form of models. Through this data abstraction involving a model, we are able to obtain predictions about systems [8], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25].

There are several medical domains where machine learning techniques have been applied to discover knowledge, such as diagnosis and prognosis, medical imaging and signal processing, and planning and scheduling. Diagnosis and prognosis are the most common among them. Diagnosis is the process of selectively collecting information concerning a patient for its subsequent interpretation according to previous knowledge, as evidence for or against the presence or absence of disorders [26]. In a prognostic process, information is also collected from the patient and interpreted, but in this case the goal is to predict the future behavior of the patient’s condition. Because of their predictive nature, prognosis systems are often used as tools to establish medical treatments [27]. The goal of machine learning in the context of diagnosis and prognosis is to discover the knowledge needed to interpret the gathered information. In some cases, this knowledge has been expressed as probabilistic relationships between clinical features and the proposed diagnosis (or prognosis). In other cases, a rule-based representation has been selected, so as to provide the expert with an explanation of the decision. Moreover, there are other cases where the system is designed as a black-box decision maker, which is totally indifferent to the interpretation of its decisions. In summary, machine learning techniques are well suited to solving these kinds of problems due to their ability to carry out searches in extremely complex spaces.

Today we are accumulating a huge amount of data from medical domains, and we are interested in transforming it into knowledge. This knowledge can then be used to assist the decision-making process performed in the diagnosis of diseases [17], [20], [25]. Disclosing knowledge from data is one of the tasks supported by learning algorithms. In the case of the medical data domain, we are interested in inducing classifiers from a set of labeled patterns as a kind of supervised learning. We aim at developing classifiers in the form of a set of rules. To achieve this, we will devise an evolutionary algorithm, i.e., an algorithm inspired by the principles of evolution by natural selection and the principles of genetics, tuned for this kind of problem.

The task of particular interest that we are going to implement aims at the development of interpretable solutions, enabling us to transform clinical data into medical knowledge and to incorporate existing clinical evidence. The concrete goal of this task is threefold: (1) to develop a novel algorithm for building a classifier from labeled examples, and (2) to test and (3) to validate the algorithm with medical data sets.

As a general approach and in relation to evolutionary computation, we have developed an evolutionary framework based on genetic programming to induce rules of the form $\text{IF } \langle \text{conditions} \rangle \text{ THEN } \langle \text{class} \rangle$, which are able to classify patterns coming from the search space. The framework has been implemented to obtain rule-based classifiers, relying on the idea of sequential covering (or separate and conquer) to produce each rule [28]. In this setting, the individuals evolved by our learning algorithm are single classification rules (Michigan-style) [29]. The biggest challenge faced by the proposed framework was to solve the rule intersection problem, i.e., patterns classified by rules into different classes. To solve this problem, we have proposed an ensemble method to classify patterns at the intersection.
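The sequential covering idea and the rule intersection problem can be illustrated with a short Python sketch. This is not the authors' algorithm: the rule search below is a deliberately simple exhaustive search over single-condition rules standing in for the genetic programming search, and the `Rule` class, the helper names and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set, Tuple

Pattern = Tuple[Tuple[float, ...], str]  # (feature vector, class label)


@dataclass(frozen=True)
class Rule:
    """IF x[feature] <= threshold (or > threshold when `greater`) THEN label."""
    feature: int
    greater: bool
    threshold: float
    label: str

    def covers(self, x: Sequence[float]) -> bool:
        value = x[self.feature]
        return value > self.threshold if self.greater else value <= self.threshold


def find_rule(data: List[Pattern], label: str) -> Optional[Rule]:
    """Placeholder search (NOT genetic programming): try every single-condition
    rule and keep the error-free one that covers the most patterns of `label`."""
    best, best_cover = None, 0
    for feature in range(len(data[0][0])):
        for threshold in sorted({x[feature] for x, _ in data}):
            for greater in (False, True):
                rule = Rule(feature, greater, threshold, label)
                covered = [y for x, y in data if rule.covers(x)]
                if covered and all(y == label for y in covered) and len(covered) > best_cover:
                    best, best_cover = rule, len(covered)
    return best


def sequential_covering(data: List[Pattern]) -> List[Rule]:
    """Separate and conquer: learn one rule, remove the patterns it covers, repeat."""
    rules: List[Rule] = []
    remaining = list(data)
    while remaining:
        labels = sorted({y for _, y in remaining})
        candidates = [find_rule(remaining, label) for label in labels]
        rule = next((r for r in candidates if r is not None), None)
        if rule is None:  # no error-free single-condition rule exists; stop
            break
        rules.append(rule)
        remaining = [(x, y) for x, y in remaining if not rule.covers(x)]
    return rules


def matching_labels(rules: List[Rule], x: Sequence[float]) -> Set[str]:
    """Classes predicted for x; more than one class means x lies in an intersection."""
    return {rule.label for rule in rules if rule.covers(x)}


# Toy data (assumed): two features, two classes.
data = [((1.0, 5.0), "healthy"), ((2.0, 6.0), "healthy"),
        ((7.0, 5.5), "sick"), ((8.0, 6.5), "sick")]
rules = sequential_covering(data)
```

Because later rules are learned only on the patterns that remain uncovered, they can also match patterns already removed. In this toy run the second rule covers the "healthy" patterns too, so `matching_labels` returns two classes for them, which is the kind of conflict an intersection-handling step has to resolve.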

To achieve the aims above, the remainder of this paper is organized as follows. Section 2 covers the background to our proposal. Section 3 presents our proposal, an evolutionary framework to induce rule-based classifiers. This section includes details such as encoding and search strategy, rule generalization, fitness functions, genetic operators, classification model, rule intersection, the evolutionary algorithm and classifier improvement. Section 4 outlines the main features of the four clinical data sets involved in the experiments, as well as the results reached by this proposal compared with other machine learning methods. Section 5 presents the conclusions of this work. Appendix A and Appendix B present an example of a rule-based classifier (for dataset #1) and the proofs of the theorems, respectively. Additionally, this paper provides supplemental material in the Supplemental Material Document (SMD), which supports and complements the contributions of this research.

Background

Genetic programming (GP) is a flexible and powerful evolutionary technique that uses a set of functions and terminals to produce computable expressions [30], [31], [32]. This allows us to find general solutions, in the form of IF-THEN rules, able to classify patterns from a given problem [28], [29], [33].
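As a rough illustration of how a function set and a terminal set combine into a computable expression, the following Python sketch evaluates a rule antecedent encoded as a nested tuple. The function set, the attribute names (`age`, `pressure`) and the example rule are assumptions chosen for illustration, not the encoding used by the authors.

```python
import operator
from typing import Any, Dict

# Function set: internal nodes of the expression tree (an illustrative choice).
FUNCTIONS = {
    "and": lambda a, b: a and b,
    "or": lambda a, b: a or b,
    "not": lambda a: not a,
    "<=": operator.le,
    ">": operator.gt,
}


def evaluate(tree: Any, pattern: Dict[str, float]) -> Any:
    """Recursively evaluate an expression tree on one pattern.

    A tuple is a function node: (function name, child, child, ...).
    A string terminal is an attribute name looked up in `pattern`.
    Any other terminal is treated as a numeric constant.
    """
    if isinstance(tree, tuple):
        name, *children = tree
        return FUNCTIONS[name](*(evaluate(child, pattern) for child in children))
    if isinstance(tree, str):
        return pattern[tree]
    return tree


# Hypothetical rule: IF (age > 50 AND NOT (pressure <= 120)) THEN class = "at_risk"
antecedent = ("and", (">", "age", 50), ("not", ("<=", "pressure", 120)))

covered = evaluate(antecedent, {"age": 63, "pressure": 140})      # condition holds
not_covered = evaluate(antecedent, {"age": 40, "pressure": 140})  # condition fails
```

In a GP setting, trees of this kind would be created at random and then modified by crossover and mutation; the sketch only shows the evaluation step that any such search needs.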

Using GP for rule induction to generate classifiers has become one of the most important applications in this field, since it allows us to capture the main features of a given

An evolutionary framework for rule induction

This section describes our evolutionary proposal, which defines a Rule Induction Method based on Genetic Programming (called RIM-GP). Before explaining the method, we define the terms classification rule and rule-based classifier according to the approach assumed in this work. This will be useful in the analysis and understanding of RIM-GP.

Definition 1 Classification Rule

Let $D^{d}$ (or $D$ for short) be a data set of dimension $d$, which has been partitioned into a set of classes $\{C_{0}, C_{1}, \dots, C_{m}\}$, whose $C_{i} \cap C_{j} = \emptyset$

Results on medical data

This section describes the experiments carried out with the proposed approach on four clinical datasets from the public repository of the Center for Machine Learning and Intelligent Systems, http://archive.ics.uci.edu/ml/datasets/. For each case, we describe the essentials of the data set and provide the results achieved by the proposal. In particular, a visual analysis of the data sets used, in conjunction with the RIM-GP results, is presented in the Supplemental Material Document (SMD). At

Conclusions

Machine learning, as a practical matter, deals with the extraction of the right features from the data to build the right models achieving the right tasks [6], [60]. We have also seen that the most common approaches used in machine learning are classification and regression. In that sense, we can say that machine learning processes aim to render classification expressions as simple as possible for humans to understand [67]. The creation and assessment of intelligent machines whose learning is

CRediT authorship contribution statement

José A. Castellanos-Garzón: Conceptualization, Formal analysis, Investigation, Writing-original draft. Ernesto Costa: Supervision, Funding acquisition, Investigation, Validation, Methodology. José Luis Jaimes S.: Formal analysis, Methodology, Writing-review & editing. Juan M. Corchado: Supervision, Funding acquisition, Project administration, Validation.

Acknowledgments

This work has been carried out under the iCIS, Spain project (CENTRO-07-ST24-FEDER-002003), which has been co-financed by QREN, Spain, in the scope of the Mais Centro Program and European Union’s FEDER. This work has also been partially supported by the Interreg V-A Spain-Portugal Program (PocTep) and the European Regional Development Fund (ERDF) under the IOTEC project (grant 0123_IOTEC_3_E).

References (67)

Holmes J.H. et al., Learning classifier systems: New models, successful applications, Inform. Process. Lett. (2002)
Nolan J.R., Computer systems that learn: an empirical study of the effect of noise on the performance of three classification methods, Expert Syst. Appl. (2002)
Zhang C. et al., Multi-imbalance: An open-source software for multi-class imbalance learning, Knowl.-Based Syst. (2019)
Xiao Q. et al., Multi-view manifold regularized learning-based method for prioritizing candidate disease miRNAs, Knowl.-Based Syst. (2019)
Fujita H. et al., Computer aided detection for fibrillations and flutters using deep convolutional neural network, Inform. Sci. (2019)
Hong J.-H. et al., The classification of cancer based on DNA microarray data that uses diverse ensemble genetic programming, Artif. Intell. Med. (2006)
Peña Reyes C.A. et al., Evolutionary computation in medicine: an overview, Artif. Intell. Med. (2000)
Tsakonas A. et al., Evolving rule-based systems in two medical domains using genetic programming, Artif. Intell. Med. (2004)
Lucas P., Analysis of notions of diagnosis, Artif. Intell. (1998)
Lucas P., Prognostic methods in medicine, Artif. Intell. (1999)
Naredo E. et al., Evolving genetic programming classifiers with novelty search, Inform. Sci. (2016)
De Falco I. et al., Discovering interesting classification rules with genetic programming, Appl. Soft Comput. (2002)
Castellanos-Garzón J.A. et al., An evolutionary computational model applied to cluster analysis of DNA microarray data, Expert Syst. Appl. (2013)
Castellanos-Garzón J.A. et al., A visual analytics framework for cluster analysis of DNA microarray data, Expert Syst. Appl. (2013)
Duda R.O. et al., Pattern Classification (2001)
Embrey D., Human Error, Human Reliability Associates (2005)
Wu X. et al., Top 10 algorithms in data mining, Knowl. Inf. Syst. (2008)
Flach P., Machine Learning: The Art and Science of Algorithms that Make Sense of Data (2012)
Fujita H. et al., Resilience analysis of critical infrastructures: A cognitive approach based on granular computing, IEEE Trans. Cybern. (2019)
Cristianini N. et al., An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods (2000)
Russell S. et al., Artificial Intelligence: A Modern Approach (2003)
Bishop C.M., Pattern Recognition and Machine Learning (2006)
Bandyopadhyay S. et al.
Kumar T.P. et al., Prediction of cancer class with majority voting genetic programming classifier using gene expression data, IEEE/ACM Trans. Comput. Biol. Bioinform. (2009)
Kumar R. et al., Classification rule discovery for diabetes patients by using genetic programming, Int. J. Soft Comput. Eng. (2012)
Larrañaga P. et al., Machine learning in bioinformatics, Brief. Bioinform. (2006)
Liu K.-H. et al., A genetic programming-based approach to the classification of multiclass microarray datasets, Bioinformatics (2009)
Maulik U. et al., Multiobjective Genetic Algorithms for Clustering: Applications in Data Mining and Bioinformatics (2011)
Podgorelec V. et al., Knowledge discovery with classification rules in a cardiovascular dataset, Comput. Methods Programs Biomed. (2005)
Soni J. et al., Intelligent and effective heart disease prediction system using weighted associative classifiers, Int. J. Comput. Sci. Eng. (2011)
Vargas C.M.B. et al., Computational Biology and Applied Bioinformatics (2011)
Freitas A.A., Soft Computing for Knowledge Discovery and Data Mining, Part II (2008)
Pappa G.L. et al., Automating the Design of Data Mining Algorithms: An Evolutionary Computation Approach (2010)

^☆: No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.knosys.2019.104982.

¹: http://bisite.usal.es.
