1 Introduction
Nowadays, the Internet has sparked a great technological revolution in terms of the exchange of information, knowledge and science between individuals and even institutions; at the same time, the use of the web has become one of the essential necessities of our daily life. Unfortunately, this dependence on the web has led some individuals to exploit it illegally through hacking, espionage, data theft, extortion and other malicious activities.
This reality poses a significant security threat to both individuals and companies. This issue is also becoming a real challenge for computer science researchers and developers.
Therefore, it is necessary to implement a security policy to protect company data and personal information from unexpected attacks. Several tools are available to protect data and personal information. The purpose of this protection is to reduce the risks to the confidentiality, integrity and availability of data.
An Intrusion Detection System (IDS) is considered to be the most important tool for ensuring the proper functioning of computer security systems, first because it helps guarantee the stability of the system, and second because most attacks occur after an intrusion or through the injection of a malicious application. The IDS is in charge of the response in the event of an attack, as well as of the stop or continuity strategies [8].
There are two main types of intrusion detection approaches in the literature: those based on scenarios (such as signature research, pattern matching, etc.) and those based on behavioral approaches (for example, Bayesian analysis, statistical analysis and neural networks).
This last category aims to recognize abnormal behavior by comparing observed activity against a model of normal (or abnormal) behavior learned from prior observation of the system; learning is therefore possible in this case. In contrast, in a scenario-based approach, the IDS relies on a pre-existing knowledge base that references the various known attacks likely to be carried out against a computer system.
This knowledge is used by the IDS for the recognition of events produced by intrusion actions in the computer system that it observes. Therefore, this method requires regular updating of the knowledge base and the IDS focuses directly on the identification of misuse.
It is also possible to compare intrusion detection systems based on the data sources they rely on. Some IDS, known as HIDS (Host IDS), are based on the execution histories of specific programs or instruction sequences, which are often provided by the operating system but sometimes also by applications. Other IDS, typically known as NIDS (Network IDS), analyze the packets sent over the network.
In theory, two response modes can be distinguished for IDS. Usually, a passive response is adopted: the IDS broadcasts an alert and identifies the detected attack to an analysis or broadcast system by recording the detected intrusions in a log file.
However, more active responses should be considered, where the IDS aims to stop an attack at the moment of its detection by interrupting a connection or even counter-attacking [23].
In order to improve the efficiency of intrusion detection systems, several solutions have been proposed in this field. The authors remain focused on achieving this objective by conducting research on the use and integration of bio-inspiration techniques in general and particle swarm optimization (PSO) in particular.
PSO is a bio-inspired optimization metaheuristic that was proposed by Eberhart and Kennedy in 1995 [12]. The technique was inspired by the collective behavior of bird flocks and fish schools.
Each particle in PSO is analogous to a fish or a bird moving through the search space, with its own position and velocity. While searching for the global optimum position, particles keep track of their local best positions [9]. In this paper, it is proposed to use the Correlation based Features Selection (CFS) feature evaluator, with the bio-inspired PSO technique as its search method, for selecting only the relevant features. Subsequently, the Random Forest (RF) classifier is chosen for attack classification in a network.
The RF algorithm is one of the most popular machine learning techniques. The remainder of this article is organized as follows: Section 2 reviews related work in the field of intrusion detection systems, distinguishing approaches based on machine learning methods from those that focus on deep learning.
Section 3 presents the authors’ proposal, followed by a brief analysis of the KDDCup’99 data set and its versions used in this article, including statistics and data preprocessing. This section concludes with a description of the different evaluation metrics used. Section 4 presents the analysis and discussion of the experimental results. Finally, Section 5 presents the conclusion and suggests future research directions.
2 Related Work
Information security is an important area of research because of its role in the daily lives of individuals and in the operation of institutions.
An intrusion detection system (IDS) is considered an important means of improving the quality of computer security. In recent years, a considerable number of studies on intrusion detection have been published. In this section, a selection of these works is presented.
During the preceding decade, several studies have been carried out in the intrusion detection area, some based on machine learning methods and others on deep learning. Studies based on machine learning techniques are presented first, followed by those based on deep learning.
2.1 IDS based on Machine Learning Techniques
In [6], the authors propose a feature selection algorithm and use the selected features to build an intrusion detection system based on the least squares support vector machine (LSSVM-IDS).
They evaluated it on three data sets, KDDCup’99, NSL-KDD and Kyoto 2006+, and showed that their algorithm gives improved accuracy per attack class. The paper presented by Altwaijry and Algarny in 2012 [5] explains the use of a Naïve Bayesian classifier for intrusion detection.
The authors evaluated their proposal by category of attacks on the 10 percent subset of KDDCup’99 and the corrected-KDD data set. In [22], the authors focused on the cluster center and nearest neighbor (CANN) approach to feature representation with the aim of detecting intrusions.
They evaluated their approach on the KDDCup’99 data set, using four types of attacks. In 2016, Han X. et al. [16] suggested principal component analysis for feature extraction and proposed an intrusion detection algorithm based on the traditional Naïve Bayesian classification algorithm.
The authors used the 10 percent subset of KDDCup’99 (494,020 records, including 19.69 percent normal and 80.31 percent attack) to evaluate the performance of their solution.
2.2 IDS based on Deep Learning Techniques
Since 2006, several studies on deep learning methods for intrusion detection have been published. The paper presented by Tang et al. in 2016 [28] describes an approach based on a deep neural network composed of a 6-dimensional input layer, three hidden layers of 12, 6 and 3 neurons respectively, and a 2-dimensional output layer.
The authors tested their approach on the NSL-KDD data set, and their model achieved an accuracy of around 75.75 percent. The NDAE (Non-symmetric Deep Auto-Encoder) model, based on a deep auto-encoder, was proposed by Shone et al. in 2018 [27].
In this study, the auto-encoder reduced the number of attributes from 41 to 28. The proposed model is composed of an input layer, six hidden layers and an output layer.
Their model was evaluated using the 10 percent subset of KDDCup’99 and the NSL-KDD data set; with a random forest-based classifier for 5-class classification, the authors obtained accuracies of 97.85 percent and 85.42 percent on the two data sets respectively.
In 2016, Javaid et al. [19] developed a flexible and efficient NIDS (Network Intrusion Detection System) based on a deep learning approach. The authors applied the self-taught learning (STL) technique.
They used the NSL-KDD data set to evaluate their system, which achieved accuracy rates of 88.39 percent and 79.10 percent for 2-class and 5-class classification respectively. The paper in [4] applies a Restricted Boltzmann Machine (RBM) and a Deep Belief Network (DBN) in a deep learning approach to anomaly detection.
Feature reduction is performed by a first RBM, and the resulting weights are passed to a second RBM to create the DBN. The authors tested their approach on the KDDCup’99 data set, and their model showed improved accuracy (97.9 percent).
In 2017, Yin et al. [35] proposed an approach based on deep learning using a recurrent neural network (RNN-IDS). They chose the sigmoid function for activation and SoftMax as a classification function. The authors implemented their solution and tested it on the NSL-KDD data set.
The evaluation of their proposal shows an accuracy rate of 83.28 percent and 81.29 percent for a binary and multi-class (5-class) classification respectively on the KDDTest+ data set and an accuracy rate of 68.55 percent and 64.67 percent for a 2-class and 5-class respectively on the KDDTest-21 data set. A summary of some related works is shown in Table 1 below.
Used algo/model | Data set | Classification | Accuracy (%) | Ref. |
NDAE (DL - AE - RF) | NSL-KDD | 5-class | 85.42 | [27] |
NDAE (DL - AE - RF) | 10% KddCup’99 | 5-class | 97.85 | [27] |
DNN - AE - SM | NSL-KDD | 2-class, 5-class | - | [26] |
DL - AE - SM (STL, SMR) | NSL-KDD | 2-class, STL | 88.39 | [19] |
DL - AE - SM (STL, SMR) | NSL-KDD | 2-class, SMR | 78.06 | [19] |
DL - AE - SM (STL, SMR) | NSL-KDD | 5-class, STL | 79.10 | [19] |
DL - AE - SM (STL, SMR) | NSL-KDD | 5-class, SMR | 75.23 | [19] |
AE - DBN | 10% KddCup’99 | 2-class | 92.10 | [3] |
DBN | 40% NSL-KDD | 5-class | 97.45 | [15] |
DBN | 10% KddCup’99 | 5-class | 93.49 | [4] |
DBN - LR | 10% KddCup’99 | 5-class | 97.90 | [14] |
DRBM | 10% KddCup’99 | 2-class | 94.00 | |
DNN | 10% KddCup’99 | 2-class | 93.00 | [31] |
DNN | 10% KddCup’99 | 5-class | 93.50 | [31] |
DNN | NSL-KDD | 2-class | 80.10 | [31] |
DNN | NSL-KDD | 5-class | 78.50 | [31] |
RNN | NSL-KDD Test+ | 2-class | 83.28 | [35] |
RNN | NSL-KDD Test-21 | 2-class | 68.55 | [35] |
RNN | NSL-KDD Test+ | 5-class | 81.29 | [35] |
RNN | NSL-KDD Test-21 | 5-class | 64.67 | [35] |
LSTM RNN | KddCup’99 | 5-class | 97.54 | [21] |
HC + SVM | KddCup’99 | 5-class | 95.72 | [17] |
CT + SVM | 1998 DARPA | 5-class | 69.80 | [20] |
NB + KNN | NSL-KDD | 5-class | 84.86 | [25] |
KNN + SVM + PSO | KddCup’99 | 5-class | 88.72 | [1] |
K-means + KNN | KddCup’99 | 5-class | 99.01 | [30] |
GMMs + PSO + SVM | KddCup’99 | 5-class | 99.99 | [18] |
FL + GA | 10% KddCup’99 | 5-class | 94.60 | [10] |
K-Means + NB + BNN | KddCup’99 | 5-class | 99.90 | [11] |
NDAE: Non-symmetric Deep Auto-Encoder; DL: Deep Learning; BNN: Back-propagation Neural Network; ANN: Artificial Neural Network; DNN: Deep Neural Network; RNN: Recurrent Neural Network; DBN: Deep Belief Network; DRBM: Discriminative Restricted Boltzmann Machine; AE: Auto-Encoder; SM: Soft-Max; SMR: Soft-Max Regression; STL: Self-Taught Learning; CT: Clustering Tree; LSTM: Long Short-Term Memory; GMMs: Gaussian Mixture Models; IDS: Intrusion Detection System; MDS: Malicious Detection System; NADS: Network Anomaly Detection System; LR: Logistic Regression; RF: Random Forest; HC: Hierarchical Clustering; NB: Naïve Bayes; FL: Fuzzy Logic; GA: Genetic Algorithm; KNN: K-Nearest Neighbor; SVM: Support Vector Machine; PSO: Particle Swarm Optimization; SGD: Stochastic Gradient Descent.
3 Proposed Approach
As reported in some studies, such as that of Maniriho and Ahmad in 2018 [24], certain features have no influence on the attack detection process; moreover, these unnecessary features may have a negative impact on detection performance.
The study described in this section proposes an IDS model based on machine learning methods for attack detection, built on the selection of the features that have an important influence on the attack determination process.
To achieve this objective and select only the relevant features for training the proposed model, various feature evaluators were compared through multiple tests.
Three evaluators, namely Correlation based Features Selection (CFS), Pearson’s Correlation (PC) and Gain Ratio (GR), were the focus of these tests.
The ranking scores generated per feature class by these three evaluators were used to select the twenty-one features considered relevant out of a total of forty-one. The proposed model is illustrated in the block diagram in Figure 1 below.
A brief description of the three evaluators used follows.
3.1 Correlation based Feature Selection
The principle of Correlation based Features Selection (CFS) is to measure Pearson’s correlation between an attribute and the class in order to determine the worth of the attribute. Nominal attributes are evaluated value by value, by treating each value as an indicator.
A weighted average is used to determine the overall correlation of a nominal attribute. The particle swarm optimization method is chosen as the search method for this feature evaluator. This approach was introduced by Eberhart and Kennedy in 1995 [12]. PSO is a population-based method that aims to find a near-optimal solution in the search space.
At each iteration of the PSO algorithm, each individual (particle
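For reference, a minimal sketch of the canonical global-best PSO update is given below. It is not necessarily the exact variant used by the authors: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the swarm size, the bounds and the `sphere` test objective are all illustrative assumptions. In the CFS-PSO setting the particles would encode feature subsets and the objective would be the CFS merit, whereas this sketch minimizes a simple continuous function to show the velocity and position updates.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` with a basic global-best PSO (illustrative parameters)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its personal best; the swarm shares a global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clamped to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Hypothetical test objective with its minimum (0) at the origin."""
    return sum(v * v for v in x)


best_pos, best_val = pso_minimize(sphere, dim=2)
```

For feature selection, a binary encoding (one bit per feature) is commonly used in place of the continuous positions shown here.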
3.2 Pearson’s Correlation
The Pearson correlation coefficient, denoted r, is a measure used to detect the presence or absence of a linear relationship between two variables.
The value of this measure of correlation varies from $-1$ to $+1$.
A positive measure indicates that the two variables increase together, while a negative measure indicates that as one variable increases, the other decreases; when this value is close to $0$, there is no linear relationship between the two variables.
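As an illustration, the coefficient can be computed as follows. This is a minimal sketch assuming the standard sample definition, r = cov(x, y) / (s_x · s_y); the function name `pearson_r` and the toy inputs are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Numerator: sum of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: product of the root sums of squared deviations.
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sx * sy)


r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])      # perfectly increasing together
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])    # one increases, the other decreases
```

A ranker-based evaluator would compute this value between each feature and the class label and sort the features by its magnitude.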
3.3 Gain Ratio
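An extension of information gain, called gain ratio, was used to select the best feature for splitting the data set. The gain ratio is calculated by normalizing the information gain by the split information.
Information gain alone favors a feature that has a large number of values; the normalization corrects this bias. The Gain Ratio (GR) is calculated as follows:
$$\mathrm{GR} = \frac{\mathrm{IG}}{\mathrm{SI}}$$
where:
IG: is the Information Gain.
SI: is the Split Information, which can be calculated as follows:
$$\mathrm{SI} = -\sum_{i=1}^{k} \frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|}$$
where:
$S$ is the data set, $k$ is the number of distinct values of the feature, and $S_i$ is the subset of $S$ in which the feature takes its $i$-th value.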
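The gain ratio computation can be sketched as follows, assuming the standard entropy-based definitions of information gain and split information for a nominal feature. The helper names and the toy records are hypothetical and only serve to show how a many-valued feature is penalized.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of a nominal feature: information gain / split information."""
    n = len(labels)
    # Group the class labels by feature value.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Information gain: class entropy minus the weighted entropy after the split.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information: entropy of the partition induced by the feature itself.
    split_info = -sum(len(g) / n * math.log2(len(g) / n) for g in groups.values())
    return info_gain / split_info if split_info > 0 else 0.0


labels = ["normal", "normal", "attack", "attack"]
gr_two_values = gain_ratio(["tcp", "tcp", "udp", "udp"], labels)
gr_four_values = gain_ratio(["a", "b", "c", "d"], labels)
```

Both toy features separate the classes perfectly (information gain of 1 bit), but the four-valued feature has twice the split information and therefore half the gain ratio.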
3.4 Dataset Description
Research in the field of intrusion detection (ID) requires data sets to evaluate the efficiency and effectiveness of proposed solutions. In this context, a variety of freely accessible network-based data sets are available for intrusion detection research.
Among these data sets, we focused on the KDDCup’99 data set in our work; this data set is primarily concerned with intrusion detection and was constructed from the original network traffic data collected by the DARPA 1998 evaluation program under the supervision of the Massachusetts Institute of Technology (MIT) Lincoln Laboratory.
This data set is often used in the literature and is composed of around 4,900,000 connection records, each of which consists of 41 values, one per feature, and is labeled as either normal or an attack [29]. The KDDCup’99 records are grouped into a normal traffic class and four attack categories, as shown below:
– Normal: it indicates that the network traffic record is normal or benign.
– Denial of Service attack (DoS): a kind of attack that tries to make computing resources (server, host, memory, ...) inaccessible to legitimate clients, for example by filling memory, with the objective of exhausting the victim’s resources.
– Probing attack (Probe): this category includes all kinds of malicious activity in which the perpetrator gathers detailed information about the system infrastructure and its security configuration, with the goal of bypassing the firewall and conducting critical attacks.
– Remote to Local attack (R2L): the intruder does not belong to the computer network, but sends packets to the server or to another machine in order to gain access as a local user.
– User to Root attack (U2R): the intruder first obtains access as a legitimate or normal user, after several attempts to access network resources, and then attempts to gain root or superuser privileges.
In 2009, Tavallaee et al. [29] developed a refined and improved version of the KDDCup’99 corpus under the name NSL-KDD.
For security researchers, the number of publicly available data sets for network IDS (NIDS) is limited. KDDCup’99 and NSL-KDD are the most widely utilized and publicly available data sets for testing the effectiveness of different existing and newly announced machine learning methods [32].
In this paper, the NSL-KDD data set is used to train and test the proposed solution for intrusion detection. This version of the data set is derived from the main KDDCup’99 data set.
It is a reduced and improved version of the data set and contains 125,973 instances. A brief description of these data sets is reported in Table 2 and Table 3. The different connection types in the KDDCup’99 and NSL-KDD data sets are:
– Probe (Probing): ipsweep, nmap, portsweep, satan.
– DoS (Denial of Service): back, land, neptune, pod, smurf, teardrop.
– U2R (User to Root): buffer_overflow, loadmodule, perl, rootkit.
– R2L (Remote to Local): ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster.
Connection type | No. of instances (before preprocessing) | No. of unique instances (after preprocessing) | Reduction (%) |
DoS | 3,883,370 | 247,267 | 93.63 |
Probe | 41,102 | 13,860 | 66.28 |
R2L | 1,126 | 999 | 11.28 |
U2R | 52 | 52 | 0.00 |
Total attacks | 3,925,650 | 262,178 | 93.32 |
Normal | 972,781 | 812,814 | 16.44 |
Total | 4,898,431 | 1,074,992 | 78.05 |
3.5 Data Preprocessing
As mentioned above, the KDDCup’99 and NSL-KDD data sets contain 41 features of different types, distributed as follows: three are nominal (’Protocol type’, ’Service’ and ’Flag’), four are binary and the remaining thirty-four are continuous.
Because most algorithms and methods only work with numbers, and in order to obtain better experimental results, the data sets must be preprocessed.
Firstly, one-hot encoding [36] is used to transform the nominal features into discrete features. For example, dummy variables are used to encode the textual values of the ’Protocol type’ feature (i.e. [1,0,0], [0,1,0], [0,0,1] for tcp, udp, icmp). The nominal features ’Protocol type’, ’Service’ and ’Flag’ of the 10% KDDCup’99 training data set have 3, 66 and 11 categories respectively.
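The encoding of the ’Protocol type’ feature described above can be sketched as follows. The helper `one_hot` and the fixed category order (tcp, udp, icmp) are illustrative assumptions that match the example vectors given in the text.

```python
def one_hot(value, categories):
    """Return a 0/1 indicator vector with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


# Category order assumed from the example in the text: tcp, udp, icmp.
PROTOCOLS = ["tcp", "udp", "icmp"]

encoded = [one_hot(p, PROTOCOLS) for p in ["udp", "tcp", "icmp"]]
```

The same helper would apply to ’Service’ and ’Flag’ with their own category lists, yielding 3 + 66 + 11 indicator columns for the three nominal features of the 10% KDDCup’99 training set.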
Secondly, another main step to complete is the standard normalization, also called standardization or z-score normalization. The purpose of this step is to scale all features in order to guarantee that all predictor values are on the same scale.
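The principle of z-score normalization is to subtract from the data their empirical mean $\mu$ and to divide the result by their empirical standard deviation $\sigma$.
Such that, for each feature $x$, the normalized value is:
$$x' = \frac{x - \mu}{\sigma}$$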
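A minimal sketch of z-score normalization for one feature column is shown below. It assumes the population standard deviation (division by n) and maps a constant column to zeros; both choices, as well as the function name `zscore`, are assumptions made for the example rather than details stated in the text.

```python
import math


def zscore(column):
    """Standardize a list of numbers to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    # Population standard deviation (assumption: divide by n, not n - 1).
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0] * n
    return [(x - mean) / std for x in column]


scaled = zscore([2.0, 4.0, 6.0])
```

In a train/test setting, the mean and standard deviation are usually estimated on the training set only and then reused to scale the test set.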
The training and testing sets in the KDDCup’99 collection contain a multitude of duplicate instances, which must be handled during the preprocessing phase.
This duplication is one of the main disadvantages of this data set. These redundancies have a negative impact on the results of the experiments and must therefore be removed. The training and test data sets had about 78.05 percent and 80.68 percent of duplicated instances respectively [29] (see Table 2).
In the preprocessing procedure, it is often also important to remove records that contain incorrect values, such as character strings in numerical fields or vice versa, missing values, etc.
After preprocessing the KDDCup’99 data, the 4,898,431 records that constitute the initial training set were reduced to 1,074,992 unique data points; this significant reduction represents a rate of 78.05 percent, as shown in Table 2.
Similarly, for the KDDCup’99 test set, a total of 2,984,154 data points was reduced to 576,449 unique instances, which represents a reduction rate of 80.68 percent. The results of this table (Table 3) are interpreted in Figures 2a and 2b below.
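The duplicate removal step and the reduction rates quoted above can be sketched as follows. The helper names and the three toy records are hypothetical; the instance counts are those reported in the text.

```python
def drop_duplicates(records):
    """Keep the first occurrence of each record, preserving the original order."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(record)  # lists are not hashable, tuples are
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def reduction_rate(n_before, n_after):
    """Percentage of instances removed by preprocessing."""
    return 100.0 * (n_before - n_after) / n_before


toy = [[0, "tcp", "http"], [0, "tcp", "http"], [1, "udp", "domain_u"]]
unique_toy = drop_duplicates(toy)

train_reduction = round(reduction_rate(4_898_431, 1_074_992), 2)
test_reduction = round(reduction_rate(2_984_154, 576_449), 2)
```

Applied to the counts in the text, this formula gives 78.05 percent for the training set and 80.68 percent for the test set.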
As previously stated, all instances of the KDDCup’99 data set and its derivatives are composed of 41 features. Each feature holds a single type of variable: continuous, discrete or symbolic [33].
Generally, features are divided into four aspects or classes (see Table 5). The first nine features relate to basic intrinsic properties of the network connection, such as connection duration, protocol type and network service (http, telnet, etc.), and are grouped to form a first aspect or base class (B).
No. f | Feature label | Type | No. f | Feature label | Type |
Basic features class (B) | | | Traffic ‘same-Service’ features class (TS) | | |
1 | duration | Continuous | 23 | count | Continuous |
2 | protocol_type | Symbolic | 24 | srv_count | Continuous |
3 | service | Symbolic | 25 | serror_rate | Continuous |
4 | flag | Symbolic | 26 | srv_serror_rate | Continuous |
5 | src_bytes | Continuous | 27 | rerror_rate | Continuous |
6 | dst_bytes | Continuous | 28 | srv_rerror_rate | Continuous |
7 | land | Symbolic | 29 | same_srv_rate | Continuous |
8 | wrong_fragment | Continuous | 30 | diff_srv_rate | Continuous |
9 | urgent | Continuous | 31 | srv_diff_host_rate | Continuous |
Content features class (C) | | | Traffic ‘same-Host’ features class (TH) | | |
10 | hot | Continuous | 32 | dst_host_count | Continuous |
11 | num_failed_logins | Continuous | 33 | dst_host_srv_count | Continuous |
12 | logged_in | Symbolic | 34 | dst_host_same_srv_rate | Continuous |
13 | num_compromised | Continuous | 35 | dst_host_diff_srv_rate | Continuous |
14 | root_shell | Continuous | 36 | dst_host_same_src_port_rate | Continuous |
15 | su_attempted | Continuous | 37 | dst_host_srv_diff_host_rate | Continuous |
16 | num_root | Continuous | 38 | dst_host_serror_rate | Continuous |
17 | num_file_creations | Continuous | 39 | dst_host_srv_serror_rate | Continuous |
18 | num_shells | Continuous | 40 | dst_host_rerror_rate | Continuous |
19 | num_access_files | Continuous | 41 | dst_host_srv_rerror_rate | Continuous |
20 | num_outbound_cmds | Continuous | | | |
21 | is_host_login | Symbolic | | | |
22 | is_guest_login | Symbolic | | | |
The following thirteen features correspond to domain knowledge or the content of a network connection. The purpose of the content aspect features (C) is to assess the payload of the original TCP packets and to detect attacks that are hidden and uncommon, such as those of the U2R and R2L classes.
To identify such attacks, the researchers extracted information such as the number of failed login attempts, which suggests intrusive behavior [34]. The other two classes are grouped under the name Traffic; this broad traffic aspect gathers so-called time-based features, which are calculated with respect to a time interval.
The first traffic aspect, the “same service” (TS) features, examines only the connections established during the last two seconds that have the same service as the current connection.
The second traffic aspect, which includes the last ten features, the “same host” (TH) features, analyzes the connections made in the last two seconds that have the same destination host as the current connection, in order to calculate statistics on the behavior of the network connection relating to protocol, service, etc. [2].
3.6 Evaluation Criteria
Generally, to evaluate the IDS detection precision, the following measures are often used:
– True Positive (TP): this metric represents the number of attacks detected and correctly classified by the model.
– True Negative (TN): a metric that indicates the number of normal instances predicted and correctly classified as normal traffic.
– False Positive (FP): this metric represents the number of normal instances recognized and incorrectly classified as attacks by the model.
– False Negative (FN): a metric that indicates the number of attacks predicted and incorrectly classified as normal traffic by the model.
These metrics often form the confusion matrix values shown in Table 4 below for a binary classification problem.
Other measures can be calculated from the values of this confusion matrix (Table 4), as follows. Detection Rate (DR) or True Positive Rate (TPR):
$$\mathrm{DR} = \frac{TP}{TP + FN}$$
False Alarm Rate (FAR) or False Positive Rate (FPR):
$$\mathrm{FAR} = \frac{FP}{FP + TN}$$
Precision:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
Overall accuracy is defined as the proportion of instances that have been correctly classified. This metric is less useful when there is a significant imbalance between the classes:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
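A minimal sketch of these four measures is given below, assuming the standard confusion-matrix definitions: DR = TP / (TP + FN), FAR = FP / (FP + TN), precision = TP / (TP + FP) and accuracy = (TP + TN) / (TP + TN + FP + FN). The function name and the example counts are hypothetical and are not results from the paper.

```python
def ids_metrics(tp, tn, fp, fn):
    """Compute DR, FAR, precision and accuracy from binary confusion-matrix counts."""
    return {
        "DR": tp / (tp + fn),                      # attacks correctly detected
        "FAR": fp / (fp + tn),                     # normal traffic flagged as attack
        "precision": tp / (tp + fp),               # flagged instances that are attacks
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts for illustration only.
metrics = ids_metrics(tp=90, tn=95, fp=5, fn=10)
```

With a strongly imbalanced data set, accuracy can stay high even when DR is poor, which is why DR and FAR are reported alongside it.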
4 Experiment Results and Discussion
After applying the three attribute evaluation metrics (CFS-PSO, PC-Ranker, GR-Ranker), the results obtained for binary classification are shown in Table 6. Therefore, for each feature class, it is also important to choose the most relevant or influential features for the intrusion detection process.
Features class | CFS (search method: PSO), positions of the 21 best selected features | Pearson’s Correlation (search method: Ranker), positions of the 21 best selected features | Gain Ratio (search method: Ranker), positions of the 21 best selected features |
Basic (B) | 1, 3, 4, 5, 6, 7 | 3, 4, 8 | 3, 4, 5, 6, 8 |
Content (C) | 12, 14, 15, 16, 21, 22 | 12 | 12 |
Traffic ‘same-Service’ (TS) | 26, 27, 29, 30 | 23, 25, 26, 27, 28, 29, 30, 31 | 23, 25, 26, 28, 29, 30, 31 |
Traffic ‘same-Host’ (TH) | 34, 35, 37, 38, 39 | 32, 33, 34, 35, 36, 38, 39, 40, 41 | 32, 33, 34, 35, 37, 38, 39, 41 |
So, the most relevant features are chosen for each class (Basic: B, Content: C, Traffic same Service: TS and Traffic same Host: TH) according to their order of merit within their respective classes. These features are presented in Table 6.
For example, when the CFS-PSO technique is used in the binary classification case, the best features selected for the Basic class are (duration, service, flag, src_bytes, dst_bytes, land).
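The mapping from the feature positions listed in Table 6 to feature names, and the projection of a record onto the selected features, can be sketched as follows. The list `KDD_FEATURES` uses the standard KDDCup’99 feature names in their usual order (positions are 1-based); the helper names are hypothetical.

```python
# Standard KDDCup'99 feature names, in order (position 1 = "duration").
KDD_FEATURES = [
    # 1-9: Basic (B)
    "duration", "protocol_type", "service", "flag", "src_bytes", "dst_bytes",
    "land", "wrong_fragment", "urgent",
    # 10-22: Content (C)
    "hot", "num_failed_logins", "logged_in", "num_compromised", "root_shell",
    "su_attempted", "num_root", "num_file_creations", "num_shells",
    "num_access_files", "num_outbound_cmds", "is_host_login", "is_guest_login",
    # 23-31: Traffic 'same-Service' (TS)
    "count", "srv_count", "serror_rate", "srv_serror_rate", "rerror_rate",
    "srv_rerror_rate", "same_srv_rate", "diff_srv_rate", "srv_diff_host_rate",
    # 32-41: Traffic 'same-Host' (TH)
    "dst_host_count", "dst_host_srv_count", "dst_host_same_srv_rate",
    "dst_host_diff_srv_rate", "dst_host_same_src_port_rate",
    "dst_host_srv_diff_host_rate", "dst_host_serror_rate",
    "dst_host_srv_serror_rate", "dst_host_rerror_rate",
    "dst_host_srv_rerror_rate",
]

# Positions selected by CFS-PSO, copied from Table 6.
CFS_PSO_POSITIONS = [1, 3, 4, 5, 6, 7, 12, 14, 15, 16, 21, 22,
                     26, 27, 29, 30, 34, 35, 37, 38, 39]


def feature_names(positions):
    """Translate 1-based feature positions into feature names."""
    return [KDD_FEATURES[p - 1] for p in positions]


def select_features(record, positions):
    """Project a 41-value record onto the selected 1-based positions."""
    return [record[p - 1] for p in positions]


basic_selected = feature_names([1, 3, 4, 5, 6, 7])
reduced_record = select_features(list(range(1, 42)), CFS_PSO_POSITIONS)
```

The Basic-class positions resolve to the six names quoted in the example above, and any 41-value record is reduced to 21 values.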
4.1 Analysis of Experimental Results
After the top twenty-one features have been selected for each attribute evaluator from the entire data set, the resulting data set can be used, in the binary classification experiments, to train and test a variety of machine learning techniques, such as Naïve Bayes, Random Forest, Stochastic Gradient Descent, Deep Learning, K-Nearest Neighbors and Support Vector Machine.
The obtained results are presented below. Based on the reduced NSL-KDD data set, which contains only the twenty-one best selected features for each attribute evaluator (presented in Table 6), various performance measures such as DR, FAR, precision and accuracy can be calculated from the confusion matrix.
Table 7 presents the Detection Rate and False Alarm Rate obtained by each machine learning technique for each attribute evaluator. These results are illustrated in Figures 3a and 3b.
ML | DR (CFS-PSO) | DR (PC-Ranker) | DR (GR-Ranker) | FAR (CFS-PSO) | FAR (PC-Ranker) | FAR (GR-Ranker) |
NB | 0.6977 | 0.6892 | 0.6889 | 0.0485 | 0.0461 | 0.0483 |
RF | 0.9884 | 0.9851 | 0.9887 | 0.0130 | 0.0288 | 0.0158 |
SGD | 0.9056 | 0.9605 | 0.9495 | 0.0460 | 0.0745 | 0.0764 |
DL | 0.9107 | 0.9518 | 0.9522 | 0.0491 | 0.0716 | 0.0730 |
KNN | 0.9801 | 0.9813 | 0.9792 | 0.0268 | 0.0355 | 0.0278 |
SVM | 0.9108 | 0.9606 | 0.9476 | 0.0721 | 0.0738 | 0.0754 |
In the same way, precision and accuracy can be calculated. Table 8 presents the precision and accuracy obtained by each machine learning technique for each attribute evaluator. The results of this table are illustrated in Figures 4a and 4b.
ML | Precision (CFS-PSO) | Precision (PC-Ranker) | Precision (GR-Ranker) | Accuracy (CFS-PSO) | Accuracy (PC-Ranker) | Accuracy (GR-Ranker) |
NB | 0.9500 | 0.9518 | 0.9496 | 0.8070 | 0.8032 | 0.8021 |
RF | 0.9902 | 0.9783 | 0.9881 | 0.9878 | 0.9791 | 0.9868 |
SGD | 0.9630 | 0.9446 | 0.9426 | 0.9264 | 0.9454 | 0.9383 |
DL | 0.9608 | 0.9462 | 0.9452 | 0.9280 | 0.9417 | 0.9413 |
KNN | 0.9797 | 0.9733 | 0.9790 | 0.9771 | 0.9741 | 0.9762 |
SVM | 0.9435 | 0.9450 | 0.9432 | 0.9182 | 0.9458 | 0.9377 |
Finally, Table 9 shows a performance comparison, in terms of accuracy, of the proposed method with some other recent methods using the same data set (NSL-KDDTest). It can be seen from the table that the proposed method (CFS-PSO + RF) ranks first in terms of accuracy in the binary classification case.
Therefore, the RF classifier based on the proposed CFS-PSO attribute evaluator performs better than the other competitive techniques in the binary classification case (see Figure 5).
4.2 Discussion of Experimental Results
Applying a method for eliminating unnecessary features from a data set is indispensable, because these extra features decrease the precision and efficiency of the prediction algorithms. Additionally, as the number of features in a data set grows, so does the search space.
In this research, feature selection and reduction were performed by keeping only the most relevant features. To accomplish this, three attribute evaluation metrics were applied in binary classification: CFS-PSO, PC-Ranker and GR-Ranker. The results are shown in Table 6.
In order to improve the DR and optimize the performance of the IDS, the three attribute evaluation metrics were applied to the data set, selecting the same number of relevant features for each of them.
After running several tests, twenty-one relevant features were selected. Various performance measures were calculated from the confusion matrix, including DR, FAR, precision and accuracy. In the binary classification case, the performance comparison results in Tables 7 and 8 indicate that the proposed technique (CFS-PSO attribute evaluation metric combined with the RF classifier) achieved a DR of 98.84%, higher than the other machine learning techniques, while its False Alarm Rate (FAR) of 1.3% is the lowest. In terms of precision and accuracy, Figures 4a and 4b also compare the performances and show that the proposed method takes first place, with a precision of 99.02% and an accuracy of 98.78%.
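The ranking discussed above can be checked directly from the reported values. The sketch below copies the CFS-PSO columns of Tables 7 and 8 into a dictionary and picks the best classifier per measure; the helper name `best` is hypothetical.

```python
# Values copied from the CFS-PSO columns of Tables 7 and 8.
CFS_PSO_RESULTS = {
    "NB":  {"DR": 0.6977, "FAR": 0.0485, "precision": 0.9500, "accuracy": 0.8070},
    "RF":  {"DR": 0.9884, "FAR": 0.0130, "precision": 0.9902, "accuracy": 0.9878},
    "SGD": {"DR": 0.9056, "FAR": 0.0460, "precision": 0.9630, "accuracy": 0.9264},
    "DL":  {"DR": 0.9107, "FAR": 0.0491, "precision": 0.9608, "accuracy": 0.9280},
    "KNN": {"DR": 0.9801, "FAR": 0.0268, "precision": 0.9797, "accuracy": 0.9771},
    "SVM": {"DR": 0.9108, "FAR": 0.0721, "precision": 0.9435, "accuracy": 0.9182},
}


def best(measure, higher_is_better=True):
    """Return the classifier with the best value for the given measure."""
    pick = max if higher_is_better else min
    return pick(CFS_PSO_RESULTS, key=lambda name: CFS_PSO_RESULTS[name][measure])


winners = {
    "DR": best("DR"),
    "FAR": best("FAR", higher_is_better=False),  # lower false alarm rate is better
    "precision": best("precision"),
    "accuracy": best("accuracy"),
}
```

With CFS-PSO as the attribute evaluator, RF comes first on all four measures, with KNN second.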
5 Conclusion and Future Work
This paper presents an effective intrusion detection technique that is divided into two phases. In the first phase, relevant features were selected by eliminating those that do not have a significant influence on the intrusion detection procedure.
This was achieved by using an attribute evaluator called Correlation based Features Selection (CFS), with the Particle Swarm Optimization (PSO) method as its search strategy, resulting in a feature space reduction of approximately 50%.
In the second phase, the proposed classification algorithm, Random Forest (RF), and other machine learning algorithms were tested to evaluate the performance of the proposed method. Experiments were conducted on the reduced NSL-KDD data set containing only twenty-one features.
The experiments carried out in this study are divided into three classes. In the first, the chosen attribute evaluator (CFS-PSO) is compared with two other evaluators, PC-Ranker and GR-Ranker. In the second, the proposed classifier (RF) is compared with other machine learning classifiers, namely NB, SGD, DL, KNN and SVM.
The experimental results on the NSL-KDD data set show the promising performance of the proposed techniques in terms of accuracy and detection rate compared to competitive methods.
In the final class of experiments, the proposed technique is compared to different previously existing methods.
The obtained performance results indicate that the proposed technique outperforms the other methods in binary classification. Finally, it should be noted that the current study has two major limitations, namely real-time operation and the ability to detect zero-day attacks.
To address these limitations and further improve the proposed technique, future work could focus on finding more efficient solutions for detecting zero-day attacks and on developing an IDS that works in real time. It is also recommended to test the technique on other data sets such as UNSW-NB15 and CSE-CIC-IDS2018.