Unsupervised Anomaly Detection for Network Intrusion Using Deep Autoencoders and Statistical Modeling

Manimegalai, R (2025) Unsupervised Anomaly Detection for Network Intrusion Using Deep Autoencoders and Statistical Modeling. 2025 International Conference on Next Generation Computing Systems (ICNGCS). pp. 1-10.

Unsupervised_Anomaly_Detection_for_Network_Intrusion_Using_Deep_Autoencoders_and_Statistical_Modeling.pdf - Published Version

Abstract

Network security faces unprecedented challenges as cyber threats evolve rapidly: global cyber damages now reach $10.5 trillion annually, and network intrusion attempts have recently increased by 238%. Effective defense depends on accurate detection of abnormal behavior, but conventional supervised methods require large labeled datasets, which is not feasible given continuously evolving threats and scarce labels. The lack of high-quality labeled network traffic data, together with the emergence of zero-day attacks that evade signature-based detection, calls for unsupervised techniques capable of identifying new attack patterns without prior knowledge of malicious activity. This research proposes a novel hybrid unsupervised method that combines deep autoencoders with adaptive statistical filtering (z-score-based analysis) to identify network traffic anomalies. The method uses the reconstruction property of autoencoders to learn normal behavior patterns and flags records with high reconstruction error as anomalies, while z-score statistical analysis provides dynamic threshold adjustment and explainable confidence estimates. The approach addresses the shortcomings of existing methods by combining the pattern-learning capability of deep neural networks with statistical insight through adaptive threshold selection, yielding a robust framework suited to practical application. The proposed approach uses a symmetric five-layer autoencoder architecture (41→32→16→32→41 neurons) trained on unlabeled network traffic to produce compressed representations of normal behavior patterns. The reconstruction errors are analyzed with adaptive z-score normalization and sliding-window statistics to identify anomalous activity that lies far from the learned baselines.
The hybrid approach also avoids the threshold-selection problem of standalone autoencoder approaches and maintains computational efficiency suitable for real-time computation. Extensive experimental evaluation was conducted on network traffic datasets with 125,973 connection records covering diverse attack types, including DoS, Probe, R2L, and U2R intrusions. The model was evaluated on unlabeled network traffic from diverse operating environments and outperformed existing baseline methods. Results show that the hybrid approach achieves 91.4% accuracy, 91.7% precision, 89.3% recall, and a 90.4% F1-score, an improvement of 8-15% over traditional approaches such as Isolation Forest, One-Class SVM, and standalone statistical approaches.
Real-world deployment testing across three operational environments (enterprise, academic, and cloud service provider networks) confirms practical feasibility, with 67% fewer false positive alerts, a 34% improvement in mean time to detection, and seamless integration with widely used SIEM platforms including Splunk, QRadar, and ArcSight. Monitoring over 18 months demonstrates sustained detection effectiveness with minimal performance degradation, indicating accuracy and robustness in real-time anomaly detection suitable for production cybersecurity operations. The research offers a theoretically rigorous yet practically viable answer to the urgent challenge of unsupervised network anomaly detection, presenting both algorithmic innovations and a deployment-ready realization that advances the state of the art in cybersecurity automation.
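
The scoring stage described in the abstract (per-record reconstruction error followed by sliding-window z-score thresholding) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the autoencoder itself is omitted, and the function names, the window size of 50, the minimum history of 10, the z-score threshold of 3.0, and the choice to keep flagged records out of the baseline are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


def reconstruction_error(x, x_hat):
    """Mean squared error per record; rows are connection records.

    `x_hat` is assumed to be the output of some trained autoencoder
    (e.g. a 41-32-16-32-41 network) applied to `x`.
    """
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return np.mean((x - x_hat) ** 2, axis=1)


def sliding_zscore_flags(errors, window=50, min_history=10,
                         z_threshold=3.0, eps=1e-8):
    """Flag errors that sit far above a sliding-window baseline.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    errors = np.asarray(errors, dtype=float)
    history = deque(maxlen=window)          # most recent unflagged errors
    z_scores = np.zeros(len(errors))
    flags = np.zeros(len(errors), dtype=bool)

    for i, err in enumerate(errors):
        if len(history) >= min_history:
            baseline = np.array(history)
            # One-sided test: only unusually HIGH errors count as anomalies.
            z_scores[i] = (err - baseline.mean()) / (baseline.std() + eps)
            flags[i] = z_scores[i] > z_threshold
        # Assumption: flagged records are excluded from the baseline so a
        # burst of anomalies does not inflate the window statistics.
        if not flags[i]:
            history.append(err)
    return z_scores, flags


if __name__ == "__main__":
    # Synthetic error stream with one injected spike, for demonstration only.
    stream = 0.1 + 0.01 * np.sin(np.arange(100))
    stream[80] = 5.0
    z, flags = sliding_zscore_flags(stream)
    print("flagged indices:", np.flatnonzero(flags))
```

Because the baseline is recomputed from recent records, the effective threshold moves with the traffic rather than being fixed in advance. The z-score itself can be reported alongside each alert as a simple measure of how far the record lies from recent normal behavior.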

Item Type: Article
Subjects: C Computer Science and Engineering > Network Security
C Computer Science and Engineering > Computer security and Data security
C Computer Science and Engineering > Embedded and Real-Time Systems
Divisions: Computer Science and Engineering
Depositing User: Dr Krishnamurthy V
Date Deposited: 15 Dec 2025 08:32
Last Modified: 15 Dec 2025 08:32
URI: https://ir.psgitech.ac.in/id/eprint/1592
