Intrusion detection by machine learning: A review
Introduction
The Internet has become an essential part of daily life. It supports people in many areas, such as business, entertainment, and education. In particular, the Internet has become an important component of business models (Shon & Moon, 2007). In business operations, both companies and customers rely on Internet applications such as websites and e-mail. Therefore, the security of information carried over the Internet needs careful attention. Intrusion detection is one major research problem for business and personal networks.
Because the Internet environment carries many risks of network attack, various systems have been designed to block Internet-based attacks. In particular, intrusion detection systems (IDSs) help the network resist external attacks. That is, the goal of IDSs is to provide a wall of defense against attacks on computer systems connected to the Internet. IDSs can be used to detect different types of malicious network communication and computer system usage, a task that the conventional firewall cannot perform. Intrusion detection is based on the assumption that the behavior of intruders differs from that of a legitimate user (Stallings, 2006).
In general, IDSs can be divided into two categories based on their detection approaches: anomaly detection and misuse (signature) detection (Anderson, 1995; Rhodes et al., 2000). Anomaly detection tries to determine whether deviations from established normal usage patterns should be flagged as intrusions. Misuse detection, on the other hand, uses patterns of well-known attacks or weak spots of the system to identify intrusions.
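The contrast between the two approaches can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the reviewed systems: the signature strings, the single "logins per hour" feature, the three-standard-deviation threshold, and all function names are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical signatures of well-known attacks (misuse detection).
KNOWN_ATTACK_SIGNATURES = ("/etc/passwd", "' OR '1'='1")


def misuse_detect(payload: str) -> bool:
    """Flag a payload that contains any known attack pattern."""
    return any(sig in payload for sig in KNOWN_ATTACK_SIGNATURES)


def fit_normal_profile(values):
    """Summarize normal usage of one feature as (mean, standard deviation)."""
    return mean(values), stdev(values)


def anomaly_detect(value, profile, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mu, sigma = profile
    return abs(value - mu) > threshold * sigma


# Made-up observations of normal behavior: logins per hour for one user.
normal_logins_per_hour = [4, 5, 6, 5, 4, 6, 5, 5]
profile = fit_normal_profile(normal_logins_per_hour)

print(misuse_detect("GET /../../etc/passwd"))  # matches a stored signature
print(anomaly_detect(40, profile))             # far outside the normal profile
```

The sketch also hints at the usual trade-off: the signature check can only recognize attacks it already has a pattern for, while the profile check can flag unseen behavior but depends on how well the normal profile was estimated.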
In the literature, a number of anomaly detection systems have been developed based on many different machine learning techniques (cf. Section 3). For example, some studies apply single learning techniques, such as neural networks, genetic algorithms, and support vector machines. Other systems combine different learning techniques, as in hybrid or ensemble approaches. In particular, these techniques are developed as classifiers, which are used to classify or recognize whether incoming Internet access is normal or an attack. However, there has been no review of these different machine learning techniques in the intrusion detection domain.
Therefore, the goal of this paper is to review 55 related studies/systems published from 2000 to 2007 by examining what techniques have been used, what experiments have been conducted, and what should be considered for future work from a machine learning perspective.
This paper is organized as follows. Section 2 provides an overview of machine learning techniques and briefly describes a number of related techniques for intrusion detection. Section 3 compares related work based on the types of classifier design, the chosen baselines, the datasets used for experiments, etc. The conclusion and a discussion of future research are given in Section 4.
Section snippets
Pattern classification
Pattern recognition is the act of taking in raw data and taking an action based on the category of the data (Michalski, Bratko, & Kubat, 1998). Supervised and unsupervised learning methods can be used to solve different pattern recognition problems (Theodoridis and Koutroumbas, 2006). Supervised learning uses the training data to create a function, where each training example consists of a pair of an input vector and an output (i.e. the class label).
The
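As a minimal sketch of the supervised setting described above, the following example builds a classification function from (input vector, class label) pairs. The 1-nearest-neighbour rule is chosen here only because it is short; the feature values and the name `train_1nn` are hypothetical and not taken from the reviewed studies.

```python
import math

# Each training example pairs an input vector with its class label.
# The two features (e.g. duration, traffic volume) and their values are made up.
training_data = [
    ((0.2, 1.0), "normal"),
    ((0.3, 1.5), "normal"),
    ((9.0, 80.0), "attack"),
    ((8.5, 75.0), "attack"),
]


def train_1nn(examples):
    """Return a function that labels x with the class of its nearest training example."""
    def classify(x):
        _, label = min(examples, key=lambda pair: math.dist(pair[0], x))
        return label
    return classify


classifier = train_1nn(training_data)
print(classifier((0.25, 1.2)))  # normal
print(classifier((8.8, 78.0)))  # attack
```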
Types of classifier design
The methods for intrusion detection can generally be divided into three categories: single, hybrid, and ensemble. To illustrate the types of classifier design, Table 1 shows how many of the 55 articles use single, ensemble, and hybrid classifiers, respectively. Fig. 1 presents the year-wise distribution of these articles in terms of their classifier design.
Regarding Table 1, single classifiers account for the largest number of articles between 2000 and 2007. On the other hand, very few
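The three design types can be sketched roughly as follows. This is an illustration under stated assumptions, not the definition used in the reviewed articles: the ensemble is modeled as a majority vote over several base classifiers, the hybrid as a two-stage cascade, and the features, thresholds, and function names are all hypothetical.

```python
from collections import Counter

# Three hypothetical single classifiers, each a simple threshold rule.
def by_duration(conn):
    return "attack" if conn["duration"] > 5.0 else "normal"

def by_bytes(conn):
    return "attack" if conn["bytes"] > 1000 else "normal"

def by_failed_logins(conn):
    return "attack" if conn["failed_logins"] > 3 else "normal"


def ensemble(classifiers):
    """Combine base classifiers by majority vote (one assumed form of ensemble)."""
    def classify(conn):
        votes = Counter(clf(conn) for clf in classifiers)
        return votes.most_common(1)[0][0]
    return classify


def hybrid(first_stage, second_stage):
    """Cascade two stages: only traffic flagged by the first is re-examined by the second."""
    def classify(conn):
        if first_stage(conn) == "normal":
            return "normal"
        return second_stage(conn)
    return classify


vote = ensemble([by_duration, by_bytes, by_failed_logins])
cascade = hybrid(by_duration, vote)

conn = {"duration": 7.0, "bytes": 100, "failed_logins": 0}
print(by_duration(conn))  # the single classifier alone
print(vote(conn))         # the majority of the three classifiers
print(cascade(conn))      # the cascade's final decision
```

With an odd number of base classifiers and two classes, the vote cannot tie, which keeps the sketch free of tie-breaking logic.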
Discussion and conclusion
We have reviewed current studies of intrusion detection by machine learning techniques. In particular, this paper reviews papers published between 2000 and 2007. In addition, the review covers a large number of machine learning techniques used in the intrusion detection domain, including single, hybrid, and ensemble classifiers.
The comparative results of related work indicate that developing intrusion detection systems using machine learning techniques still requires further research.
References (77)
- et al. (2005). Application of SVM and ANN for intrusion detection. Computers & Operations Research.
- et al. (2005). An intelligent intrusion detection system (IDS) for anomaly and misuse detection in computer networks. Expert Systems with Applications.
- et al. (2008). Intrusion detection in computer networks by a modular ensemble of one-class classifiers. Information Fusion.
- et al. (2003). Detecting intrusion with rule-based integration of multiple models. Computers & Security.
- et al. (2006). A clustering-based method for unsupervised intrusion detections. Pattern Recognition Letters.
- et al. (2003). The neural network models for IDS based on the asymmetric costs of false negative errors and false positive errors. Expert Systems with Applications.
- et al. (2002). Use of K-nearest neighbor classifier for intrusion detection. Computers & Security.
- et al. (2007). An active learning based TCM-KNN algorithm for supervised network intrusion detection. Computers & Security.
- et al. (2004). A genetic clustering method for intrusion detection. Pattern Recognition.
- et al. (2007). A hierarchical intrusion detection model based on the PCA neural networks. Neurocomputing.
- An empirical analysis of the probabilistic K-nearest neighbour classifier. Pattern Recognition Letters.
- Intrusion detection using an ensemble of intelligent paradigms. Journal of Network and Computer Applications.
- Intrusion detection by integrating boosting genetic fuzzy classifier and data mining criteria for rule pre-screening. Journal of Network and Computer Applications.
- Modeling intrusion detection system using hybrid intelligent systems. Journal of Network and Computer Applications.
- A Bayesian paradigm for designing intrusion detection systems. Computational Statistics and Data Analysis.
- Applying genetic algorithm for classifying anomalous TCP/IP packets. Neurocomputing.
- A hybrid machine learning approach to network anomaly detection. Information Sciences.
- A new approach to intrusion detection based on an evolutionary soft computing model using neuro-fuzzy classifiers. Computer Communications.
- Genetic-fuzzy rule mining approach and evaluation of feature selection techniques for anomaly intrusion detection. Pattern Recognition.
- A latent class modeling approach to detect network intrusion. Computer Communications.
- Intrusion detection using hierarchical neural network. Pattern Recognition Letters.
- Application of online-training SVMs for real-time intrusion detection with different considerations. Computer Communications.
- A parallel genetic local search algorithm for intrusion detection in computer networks. Engineering Applications of Artificial Intelligence.
- A new framework for learning classifier models in data mining.
- An introduction to neural networks.
- Intrusion detection through behavior model. Computer Communications.
- Neural networks for pattern recognition.
- Classification and regression trees.
- Hybrid flexible neural-tree-based intrusion detection systems. International Journal of Intelligent Systems.
- Detection and Summarization of Novel Network Attacks Using Data Mining.
- A geometric framework for unsupervised anomaly detection: Detecting intrusions in unlabeled data.
- Using artificial anomalies to detect unknown and known network intrusions. Knowledge and Information Systems.