Abstract
Domain fluxing, a technique attackers use to evade conventional Command and Control detection, presents a significant challenge for cybersecurity. The technique relies on Domain Generation Algorithms (DGAs) to generate domain names dynamically, often producing nonsensical character sequences. This work presents a real-time DGA detection framework that analyzes Non-existent (NX) domain responses and applies statistical anomaly detection to identify malicious activity. Detected DGA domains are then classified into 56 families by a HybridBERT framework, which integrates Bidirectional Encoder Representations from Transformers (BERT) with an attention mechanism and statistical characteristics. The dataset, comprising approximately 0.3 million samples from various online sources, was pre-processed to remove redundant data (approximately 25% of the total) and then divided into training, validation, and testing sets in a 60:20:20 ratio. The BERT model was fine-tuned with its first five layers frozen and trained over 20 epochs with early stopping, achieving an overall precision of 96%. Despite significant class imbalance, the framework performed robustly on both word-based and pseudorandom DGAs, with precision, recall, and F1-score reported for a comprehensive evaluation. The proposed framework improves the ability of cybersecurity systems to detect zero-day DGAs and offers a scalable solution for real-time DGA classification.
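The pre-processing described in the abstract (removing redundant samples, then a 60:20:20 split) might look roughly like the sketch below. It is illustrative only and not the authors' code: the function name `dedupe_and_split`, the `(domain, family)` input format, case-insensitive exact-match deduplication, and per-family (stratified) splitting are all assumptions, the last one chosen because the abstract mentions significant class imbalance.

```python
import random
from collections import defaultdict


def dedupe_and_split(samples, percents=(60, 20, 20), seed=0):
    """Deduplicate (domain, family) pairs and split them per family.

    Hypothetical helper: the abstract only states that redundant data
    was removed and that a 60:20:20 ratio was used.
    """
    # Assumption: "redundant" means the same domain seen more than once,
    # compared case-insensitively. The first occurrence is kept.
    seen = set()
    by_family = defaultdict(list)
    for domain, family in samples:
        key = domain.strip().lower()
        if key in seen:
            continue
        seen.add(key)
        by_family[family].append((key, family))

    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, val, test = [], [], []
    for family in sorted(by_family):
        rows = by_family[family]
        rng.shuffle(rows)
        # Integer arithmetic avoids floating-point rounding surprises.
        n_train = len(rows) * percents[0] // 100
        n_val = len(rows) * percents[1] // 100
        train += rows[:n_train]
        val += rows[n_train:n_train + n_val]
        test += rows[n_train + n_val:]  # remainder goes to the test set
    return train, val, test
```

Because the counts are floored per family, a family with fewer than five samples would land entirely in the test set under this sketch; a real pipeline would need an explicit policy for such rare classes.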
| Original language | English |
|---|---|
| Pages (from-to) | 160393-160410 |
| Number of pages | 18 |
| Journal | IEEE Access |
| Volume | 13 |
| Publication status | Published - 10 Sept 2025 |
Keywords
- Malware
- Feature extraction
- Servers
- Hidden Markov models
- Classification algorithms
- Vectors
- Real-time systems
- Domain Name System
- Semantics
- Convolutional neural networks