Distributed two phase intrusion detection system using machine learning techniques and underlying big data storage and processing architecture- HDFS
DOI:
https://doi.org/10.4114/intartif.vol28iss76pp124-148
Keywords:
Accuracy, HDFS, Intrusion Detection System, Machine Learning, Two Phase Intrusion Detection
Abstract
Securing data is crucial for organizations in the internet era, and Intrusion Detection Systems (IDS) help provide this security. Researchers have implemented IDS models with a variety of tools and methods. However, several performance concerns that matter from a security standpoint remain unresolved: time efficiency (referred to as timeliness), accuracy, and fault tolerance. The proposed intrusion detection model has two phases of detection, each using a different set of machine learning algorithms. Phase I employs Support Vector Machine (SVM) and k-nearest neighbor (kNN), whereas Phase II uses Decision Tree and Naïve Bayes. This two-phase detection serves to reduce false positives and false negatives. To offset the execution time of these four techniques, a big data environment, the Hadoop Distributed File System (HDFS), is used as the underlying storage and processing structure. With this arrangement, the model achieves an overall accuracy of 97.29% across known and unknown attacks: 99.49% for known attacks and 96.28% for unknown attacks. Time efficiency is measured for both training and testing of the model; training with 10,000 records took 0.7 seconds, which is very efficient compared to existing systems. Detailed performance results are discussed in the results section. In addition, because of HDFS, the model is a distributed and fault-tolerant intrusion detection system.
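The abstract names the classifiers in each phase but does not state how their outputs are combined. As an illustration only, the following minimal Python sketch shows one possible two-phase control flow. The combination rule (accept the Phase I verdict when its models agree, otherwise defer to Phase II and flag an attack if any Phase II model does) is an assumption and is not taken from the paper. The function `two_phase_detect` and the threshold-based stand-in classifiers are hypothetical; trained SVM, kNN, Decision Tree, and Naïve Bayes models would take their place in a real system.

```python
from typing import Callable, Sequence

Record = Sequence[float]
Classifier = Callable[[Record], int]  # returns 1 for attack, 0 for normal


def two_phase_detect(
    record: Record,
    phase1: Sequence[Classifier],
    phase2: Sequence[Classifier],
) -> int:
    """Classify one record with a hypothetical two-phase rule.

    Assumed rule (not from the paper): if all Phase I models agree,
    return their verdict; otherwise consult Phase II and report an
    attack if any Phase II model predicts one.
    """
    votes1 = [clf(record) for clf in phase1]
    if len(set(votes1)) == 1:
        return votes1[0]
    votes2 = [clf(record) for clf in phase2]
    return 1 if any(votes2) else 0


# Stand-in classifiers: simple thresholds on made-up features, used
# only so the sketch runs without any machine learning library.
def svm_like(r: Record) -> int:
    return int(r[0] > 0.5)


def knn_like(r: Record) -> int:
    return int(r[1] > 0.5)


def tree_like(r: Record) -> int:
    return int(r[0] + r[1] > 0.9)


def nb_like(r: Record) -> int:
    return int(r[2] > 0.5)


PHASE_1 = [svm_like, knn_like]   # stands in for SVM and kNN
PHASE_2 = [tree_like, nb_like]   # stands in for Decision Tree and Naive Bayes

if __name__ == "__main__":
    for rec in ([0.9, 0.9, 0.0], [0.1, 0.1, 0.9], [0.9, 0.1, 0.0]):
        print(rec, two_phase_detect(rec, PHASE_1, PHASE_2))
```

Under this assumed rule, Phase II runs only for records on which the Phase I models disagree, which is one way a second phase could revisit borderline cases without reclassifying every record.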
License
Copyright (c) 2025 Iberamia & The Authors

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Open Access publishing.
Inteligencia Artificial (Ed. IBERAMIA)
ISSN: 1988-3064 (on line).