Computer Standards & Interfaces 97 (2026) 104119




SiamIDS: A novel cloud-centric Siamese Bi-LSTM framework for
interpretable intrusion detection in large-scale IoT networks
Prabu Kaliyaperumal a, Palani Latha b, Selvaraj Palanisamy a, Sridhar Pushpanathan c,
Anand Nayyar d,*, Balamurugan Balusamy e, Ahmad Alkhayyat f
a School of Computer Science and Engineering, Galgotias University, Delhi NCR, India
b Department of Information Technology, Panimalar Engineering College, Chennai, India
c Department of Electrical and Electronics Engineering, Kongunadu College of Engineering and Technology, Trichy, India
d School of Computer Science, Duy Tan University, Da Nang 550000, Viet Nam
e School of Engineering and IT, Manipal Academy of Higher Education, Dubai Campus, Dubai, United Arab Emirates
f Department of Computer Techniques Engineering, College of Technical Engineering, The Islamic University, Najaf, Iraq


ARTICLE INFO

Keywords: Siamese network; IoT security; Intrusion detection; SHAP; Clustering

ABSTRACT

The rapid proliferation of Internet of Things (IoT) devices has heightened the need for scalable and
interpretable intrusion detection systems (IDS) capable of operating efficiently in cloud-centric
environments. Existing IDS approaches often struggle with real-time processing, zero-day attack
detection, and model transparency. To address these challenges, this paper proposes SiamIDS, a novel
cloud-native framework that integrates contrastive Siamese Bi-directional LSTM (Bi-LSTM) modeling,
autoencoder-based dimensionality reduction, SHapley Additive exPlanations (SHAP) for interpretability,
and Ordering Points To Identify the Clustering Structure (OPTICS) clustering for unsupervised threat
categorization. The framework aims to enhance the detection of both known and previously unseen
threats in large-scale IoT networks by learning behavioral similarity across network flows. Trained on
the CIC IoT-DIAD 2024 dataset, SiamIDS achieves an F1-score of 99.45%, recall of 98.96%, and precision
of 99.94%. Post-detection OPTICS clustering yields a Silhouette Score of 0.901, a Davies-Bouldin Index
(DBI) of 0.092, and an Adjusted Rand Index (ARI) of 0.889, supporting accurate threat grouping. The
system processes over 220,000 samples/sec with RAM usage under 1.5 GB, demonstrating real-time
readiness. Compared to state-of-the-art methods, SiamIDS improves F1-score by 2.8% and reduces
resource overhead by up to 25%, establishing itself as an accurate, efficient, and explainable IDS for
next-generation IoT ecosystems.
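
As a quick consistency check on the figures reported in the abstract, the sketch below recomputes the F1-score from the stated precision and recall. It assumes that F1 is the usual harmonic mean of the two and that all three values come from the same evaluation run; the function name `f1_from_precision_recall` is illustrative and does not come from the paper.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract, expressed as fractions.
reported_precision = 0.9994
reported_recall = 0.9896

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 99.45%"
```

Under these assumptions the recomputed value agrees with the reported F1-score of 99.45% to two decimal places.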


1. Introduction

    With the explosive growth of digital transformation across industries, the convergence of the
Internet of Things (IoT) and cloud computing has revolutionized modern infrastructure. From smart
homes and healthcare monitoring to industrial automation and intelligent transportation systems, IoT
devices now generate massive volumes of data that are often offloaded to cloud platforms for
centralized processing and storage [1,2]. According to a recent IDC report, over 41.6 billion IoT
devices are expected to be connected by 2025, producing 79.4 zettabytes of data [3]. This
hyperconnectivity, while enabling operational efficiency and real-time analytics, has significantly
broadened the attack surface, making cybersecurity a critical concern for both cloud and IoT
ecosystems [4,5]. In such environments, cyber threats like ransomware, botnets, Distributed
Denial-of-Service (DDoS) attacks, and zero-day vulnerabilities have become increasingly sophisticated
and frequent [6]. These threats not only exploit system vulnerabilities and insecure communication
channels but also leverage the lack of consistent security policies across distributed endpoints. As
organizations increasingly rely on cloud-centric infrastructures to host critical services, ensuring
end-to-end security, especially across low-power, heterogeneous IoT nodes, has become both a
necessity and a challenge [7,


  * Corresponding author.
    E-mail addresses: k.prabu@galgotiasuniversity.edu.in (P. Kaliyaperumal), lathapalani@panimalar.ac.in (P. Latha), p.mselvaraj@galgotiasuniversity.edu.in
(S. Palanisamy), sridharp@kongunadu.ac.in (S. Pushpanathan), anandnayyar@duytan.edu.vn (A. Nayyar), kadavulai@gmail.com (B. Balusamy),
ahmedalkhayyat85@iunajaf.edu.iq (A. Alkhayyat).

https://doi.org/10.1016/j.csi.2025.104119
Received 1 August 2025; Received in revised form 16 October 2025; Accepted 15 December 2025
Available online 15 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.


8].
    To defend against such multifaceted threats, Intrusion Detection Systems (IDS) have emerged as a
cornerstone of modern cybersecurity architectures [9]. As illustrated in Fig. 1, an IDS monitors
system and network traffic for signs of unauthorized or anomalous activities. IDS mechanisms are
broadly classified into two categories [10]: signature-based detection, which matches observed
behaviors with a predefined set of known attack patterns, and anomaly-based detection, which
identifies deviations from established normal behavior. While signature-based methods offer high
precision for known threats, they are ineffective against new or evolving attack types. Anomaly-based
IDS, on the other hand, provide flexibility and the ability to detect zero-day attacks but often suffer
from high false alarm rates due to the difficulty of accurately modeling "normal" behavior [11,12].

Fig. 1. Workflow of an Intrusion Detection System in cloud-centric IoT environments.

    Traditional IDS frameworks were initially designed for homogeneous, resource-rich enterprise
networks. These systems typically assumed structured traffic flows, consistent device capabilities, and
access to reliable computational resources [13,14]. However, the IoT paradigm introduces a set of
conditions that challenge these assumptions: highly heterogeneous devices, constrained memory and
compute power, varied communication protocols, and intermittent connectivity. Furthermore, many IoT
nodes are deployed with minimal configurations and legacy firmware, making them attractive entry
points for attackers [15]. Studies reveal that IoT-based attacks have surged by more than 300 % in the
last five years, with incidents such as the Mirai botnet compromising millions of devices globally
[16]. As depicted in Fig. 2, IoT ecosystems interact with edge devices, fog layers, and cloud services,
forming a multi-layered infrastructure with dynamic data flows. These interconnected systems introduce
new vulnerabilities, particularly in resource coordination, data aggregation, and service
orchestration. In cloud-centric environments, threats may propagate from the edge to the core or vice
versa, requiring real-time threat detection and response mechanisms that are not only accurate but also
interpretable and scalable.

Fig. 2. An overview of cloud-centric IoT infrastructure.

    Despite the growing need for intelligent IDS models in IoT-cloud environments, current techniques
face several critical limitations. First, many machine learning-based IDS solutions are trained in a
supervised fashion, heavily reliant on labeled datasets that do not reflect the diversity of real-world
attacks. Second, most existing models lack interpretability, rendering them less useful for human
operators in Security Operations Centers (SOCs) who must understand and act upon alerts. Third, these
models often fail to meet the constraints of cloud-edge deployments due to high computational or
memory requirements. Lastly, many IDS do not provide mechanisms for grouping detected anomalies into
meaningful patterns, limiting post-detection forensics and threat hunting capabilities.
    The above limitations highlight the urgent need for a robust, cloud-ready, interpretable, and
generalizable IDS framework that can adapt to the unique characteristics of large-scale IoT
environments. The ability to not only detect zero-day attacks but also explain the detection rationale
in human-understandable terms is becoming increasingly critical. Furthermore, supporting scalability
and low-latency processing is essential for real-time operation across distributed edge-cloud
networks. Recognizing these demands, this research proposes an advanced solution that integrates deep
metric learning, unsupervised clustering, and explainable AI (XAI) to create a holistic and effective
intrusion detection pipeline.
    This study focuses on designing an intelligent, scalable, and explainable intrusion detection
system (IDS) optimized for cloud-centric IoT networks. The scope encompasses flow-based traffic
monitoring, similarity-driven anomaly detection, post-detection behavior analysis, and explainable
threat attribution. The key problem addressed is the lack of unified IDS frameworks that can
simultaneously handle unseen threats, offer transparency, and operate efficiently in
resource-constrained IoT-cloud environments.
    To overcome this, we introduce SiamIDS, a Siamese Bi-LSTM-based intrusion detection system that
incorporates contrastive learning, autoencoder-based compression, SHAP-based interpretability, and
OPTICS clustering for semantic anomaly grouping. This approach enables similarity-driven detection that
is capable of generalizing to novel behaviors while offering detailed reasoning through feature
contribution analysis.

1.1. Objectives of the paper

    The objectives of the paper are:

1. To conduct a comprehensive background study and literature review on the design of scalable and
   interpretable intrusion detection systems for IoT networks;
2. To propose a novel methodology titled SiamIDS for detecting and explaining known and zero-day cyber
   threats in large-scale IoT traffic. The novelty lies in combining contrastive similarity learning
   with interpretable SHAP analysis and unsupervised clustering to enhance both accuracy and
   transparency;
3. To test and validate the proposed SiamIDS framework using metrics such as F1-score, precision,
   recall, Silhouette Score, DBI, ARI, inference speed, and memory footprint;
4. To compare SiamIDS with existing techniques, including CNN, Bi-LSTM, GRU, AE, and traditional
   statistical baselines, across multiple attack categories in the CIC IoT-DIAD 2024 dataset.
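
The second objective relies on contrastive similarity learning over pairs of network flows. As a rough illustration only, the sketch below computes the widely used margin-based contrastive loss for a pair of embeddings. The exact loss, margin, and label convention used by SiamIDS are not given in this part of the paper, so the function names, the margin of 1.0, and the toy vectors here are all assumptions made for the example.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(
    a: Sequence[float],
    b: Sequence[float],
    is_similar: bool,
    margin: float = 1.0,  # assumed value, not taken from the paper
) -> float:
    """Margin-based contrastive loss for one pair (assumed formulation).

    Similar pairs are penalized by their squared distance; dissimilar pairs
    are penalized only when they are closer than the margin.
    """
    d = euclidean_distance(a, b)
    if is_similar:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Toy vectors standing in for the outputs of the two Siamese branches.
benign_1 = [0.10, 0.20, 0.30]
benign_2 = [0.10, 0.25, 0.30]
attack = [0.90, 0.80, 0.10]

print(contrastive_loss(benign_1, benign_2, is_similar=True))   # small: pair is already close
print(contrastive_loss(benign_1, attack, is_similar=False))    # zero: pair is beyond the margin
```

Training on such a loss pulls embeddings of similar flows together and pushes dissimilar flows at least a margin apart, which is what allows a distance threshold to flag traffic that resembles no known benign pattern.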



1.2. Organization of paper

    The rest of the paper is organized as follows: Section 2 presents a detailed literature review,
highlighting recent advancements and challenges in intrusion detection systems for IoT networks.
Section 3 discusses the Materials and Methods used in this study, covering the dataset, preprocessing
steps, and the foundational methods employed to build the proposed SiamIDS framework. Section 4
presents the proposed methodology, explaining the architectural design and key components of SiamIDS.
Section 5 focuses on Experimentation, Results, and Analysis. Finally, Section 6 concludes the paper
with key outcomes, limitations, and directions for future research.

2. Literature review

    The rapid growth of Internet of Things (IoT) devices has brought forth new challenges in network
security, especially in cloud-centric architectures where massive volumes of traffic are continuously
generated. As a result, Intrusion Detection Systems (IDS) have gained significant attention in recent
literature, with various machine learning (ML) and deep learning (DL) approaches being explored to
tackle the complexity of modern threats. This section reviews existing IDS models with a focus on
approaches leveraging Siamese networks, sequence learning (e.g., LSTM, Bi-LSTM), contrastive learning,
and interpretability frameworks such as SHAP. We also examine clustering techniques like OPTICS used
for post-detection analysis. Each work is evaluated based on its methodology, effectiveness,
explainability, and suitability for real-time deployment in large-scale IoT or cloud environments.
    Bedi et al. (2020) [17] addressed the class imbalance issue in IDS by proposing a DNN-based Siamese
architecture trained using contrastive loss. Their model effectively improved recall for rare attack
types like U2R and R2L in the NSL-KDD dataset. Although effective in similarity-based detection, it
lacked temporal modeling, interpretability, and cloud deployment support. SiamIDS adopts this
contrastive learning principle but enhances it with Bi-LSTM temporal encoding, SHAP-based
explainability, and scalable cloud-oriented integration.
    Saurabh et al. (2022) [18] proposed LBDMIDS, a Bi-LSTM and Stacked LSTM-based model evaluated on
UNSW-NB15 and Bot-IoT datasets. The model used Z-score normalization and sequence slicing for temporal
analysis, achieving over 99 % accuracy on Bot-IoT. While this supports temporal modeling, the approach
lacks interpretability, similarity-based learning, and clustering capabilities. SiamIDS advances this
by combining Bi-LSTM with Siamese contrastive training, adding SHAP explanations, and applying OPTICS
clustering to analyze novel threats in cloud settings.
    Aldaej et al. (2023) [19] presented a Bi-LSTM-based IDS deployed in a distributed cloud–edge
architecture. The authors applied dimensionality reduction (GMDH, Chi2) and trained RNN/Bi-LSTM models
on BoT-IoT, demonstrating scalable inference for edge environments. The study emphasized reduced
computational complexity and deployment feasibility. However, it lacks interpretability and similarity
learning and does not explore contrastive pair-based detection. SiamIDS builds on this foundation by
adding SHAP-based interpretability, contrastive Bi-LSTM modeling, and a cloud-centric inference design.
    Hindy (2023) [20] introduced a one-shot Siamese learning model to detect zero-day attacks by
learning distance metrics from traffic pairs. The method achieved strong generalization on CICIDS2017
and NSL-KDD, reducing retraining requirements. However, it employed basic MLP-based twin networks and
did not incorporate sequence modeling or interpretability. SiamIDS builds upon this foundation with a
Bi-LSTM-based Siamese backbone, feature compression, SHAP-based decision explanation, and unsupervised
clustering to further enhance detection granularity and transparency.
    Madhu et al. (2023) [21] introduced a deep learning framework for intrusion detection in smart home
IoT networks using TabNet and CNN. It emphasized device-specific modeling and evaluated traditional ML
and DL approaches on real-world IoT traffic. Though effective, it lacks any temporal modeling,
similarity learning, or explainability. Additionally, cloud deployment strategies were not explored.
SiamIDS distinguishes itself by offering temporal contrastive learning, explainability through SHAP,
and real-time cloud deployment features tailored for IoT environments.
    Hnamte & Hussain (2023) [22] proposed DCNNBILSTM, a hybrid intrusion detection system combining CNN
for feature extraction, BiLSTM for sequence learning, and DNN layers for classification. The
methodology includes thorough data preprocessing and the use of ReLU and Softmax activations and the
Adam optimizer. Trained on CICIDS2018 and Edge_IIoT datasets, it achieved 100 % and 99.64 % accuracy,
respectively, with an F1-score of up to 100 % and a minimal loss rate (0.0080). The novelty lies in
integrating deep CNN with BiLSTM for robust detection. Limitations include longer training times due
to model complexity, suggesting future optimization for real-time deployment.
    Alzboon et al. (2023) [23] proposed a novel IDS combining FLAME-based feature filtration and an
enhanced extended classifier system (XCS) with genetic algorithm and cuckoo search optimization. This
hybrid methodology was tested on the KDD99 dataset after reducing feature dimensions from 41 to 20.
The enhanced model achieved 100 % detection rate, 99.99 % accuracy, 0.05 % FAR, and high precision,
recall, specificity, and F1-score. The novelty lies in integrating CS for adaptive rule selection
within GA to improve classifier breeding. Limitations include reliance on FLAME’s density-based
clustering and a focus on a single dataset, which may affect generalizability to newer threats.
    Ben Said et al. (2023) [24] proposed a CNN-BiLSTM hybrid deep learning model for Network Intrusion
Detection in Software-Defined Networking (SDN). The methodology integrates spatial and temporal
feature extraction with regularization and dropout optimization. Using InSDN, NSL-KDD, and UNSW-NB15
datasets, the model achieved up to 97.77 % accuracy, 99.85 % precision, 95.28 % recall, 100 %
specificity, and F1-scores over 97 %. The novelty lies in combining BiLSTM’s contextual memory with
CNN’s hierarchical feature extraction for SDN-specific threats. Limitations include longer training
time and reliance on handcrafted feature selection.
    Zhang et al. (2023) [25] introduced a BiLSTM-based network intrusion detection model enhanced by a
multi-head attention mechanism to refine feature relationships. The methodology included embedding,
attention-driven weighting, and bidirectional temporal analysis. Tested on KDDCUP99, NSLKDD, and
CICIDS2017 datasets, the model achieved accuracies of 98.29 %, 95.19 %, and 99.08 %, respectively,
with F1-scores up to 99 %. Precision and recall exceeded 97 % on most classes. The novelty lies in
combining multi-head attention with BiLSTM to capture bidirectional dependencies while adaptively
weighting features. However, the model struggles to identify unknown attack types and may lose
critical information during undersampling, affecting robustness in real-world deployments.
    Hou et al. (2023) [26] introduced LCVAE-CBiLSTM, a hybrid intrusion detection method combining
Log-Cosh Conditional Variational Autoencoder (LCVAE) for minority class sample generation with
CNN-BiLSTM for spatiotemporal feature extraction. The NSL-KDD dataset was used. The model achieved
87.30 % accuracy, 80.89 % recall, 96.08 % precision, 87.89 % F1-score, and a FAR of 4.36 %. The novelty
lies in using log-cosh loss to improve generative reconstruction and mitigate gradient explosion,
enhancing minority attack detection. Limitations include sensitivity bias across attack types and
reduced performance for certain zero-day and rare attacks.
    Ali et al. (2023) [27] proposed a dual-layer intrusion detection framework combining Shuffle
Shepherd Optimization (SSO)-based feature selection and LSTM for classification, reinforced with
SHA3-256 hash functions for intrusion prevention. The methodology includes real-time data
normalization, optimal feature filtration via SSO, and sequential attack detection. Evaluated on
KDDCUP99 and UNSW-NB15



datasets, results show 99.92 % (KDDCUP99) and 99.91 % (UNSW-NB15)                classification system for IoT networks combining Decision Tree for
accuracy; precision at 98 %, recall at 98.2 %, specificity near 99 %,            initial detection and CNN-BiLSTM for anomaly type classification. The
F1-score at 98 %, and extremely low FNR (0.001). Limitations include             approach uses SMOTE for class balancing and Particle Swarm Optimi­
real-time online validation only; the model lacks adaptability for               zation (PSO) for feature selection. Evaluated on the IoTID20 and
cross-domain threat intelligence and faces constraints under                     N-BaIoT datasets, it achieved up to 91.87 % accuracy, precision and
ultra-high-speed traffic.                                                        recall near 90 %, and F1-score around 89 %. The novelty lies in
    Jiang et al. (2023) [28] proposed FR-APPSO-BiLSTM, a network                 cascading lightweight and deep models with optimized preprocessing.
anomaly detection model combining feature reduction via hierarchical             Limitations include reliance on labeled data and high computational
clustering and autoencoders with an improved PSO algorithm for                   resources for CNN-BiLSTM, affecting real-time adaptability in con­
BiLSTM optimization. Tested on NSL-KDD, UNSW-NB15, and                           strained IoT settings.
CICIDS-2017 datasets, the model achieved up to 95.44 % accuracy,                     Zhang et al. (2025) [35] proposed a hybrid intrusion detection model
98.58 % precision, 98.40 % recall, 99.92 % specificity, and 98.49 %              combining CNN, Bi-LSTM, and Transformer networks to handle
F1-score. Novelty lies in adaptive velocity and position updates, and            spatial-temporal features in IoT traffic. Their system used CICIDS2017
dynamic parameter tuning within PSO, enhancing BiLSTM’s perfor­                  and BoT-IoT datasets and integrated multi-stage feature selection via
mance. Limitations include scalability challenges in high-speed net­             XGBoost and mutual information. While achieving high accuracy, the
works and potential sensitivity to feature subset selection.                     model lacks interpretability and does not address zero-day threats or
    Yaras and Dener (2024) [29] developed a hybrid model combining               similarity learning. Unlike SiamIDS, their work does not integrate SHAP
1D-CNN and LSTM, optimized for scalable environments using PySpark               explainability or contrastive training, nor does it support cloud-native
and Google Colab. Their model, tested on CICIoT2023 and TON_IoT,                 deployment.
achieved high accuracy without data balancing techniques. The work                   Alabbadi and Bajaber (2025) [36] focused on explainable AI for
confirms the value of hybrid DL for IoT traffic but lacks contrastive            intrusion detection using DL models like DNN and CNN, complemented
learning, explainability, or behavior clustering. SiamIDS extends this by        by SHAP and LIME for interpretability. Evaluated on TON_IoT, the
integrating Bi-LSTM within a Siamese structure and offering                      models achieved high classification accuracy, and the SHAP visualiza­
SHAP-based insights and OPTICS-based threat clustering for real-time             tions improved analyst trust in IDS outputs. However, the approach does
analysis.                                                                        not include temporal sequence learning or contrastive similarity mech­
    Althiyabi et al. (2024) [30] proposed a few-shot intrusion detection         anisms. SiamIDS complements this by integrating SHAP with Bi-LSTM
model using 1D-CNN and Prototypical Networks, evaluated on                       Siamese modeling, providing explainable and scalable detection of un­
CICIDS2017 and MQTT-IoT datasets. The model achieved high perfor­                known attacks.
mance under limited data conditions (5-shot and 10-shot settings),                   Alhayan et al. (2025) [37] proposed SHODLM-CEIDS, a hybrid deep
supporting rare class detection. However, it lacked temporal analysis,           learning model for intrusion detection in cloud computing, combining
interpretability, and similarity-based reasoning. SiamIDS similarly targets zero-day detection but
incorporates Bi-LSTM Siamese modeling and SHAP explanations, with additional OPTICS clustering to
reveal behavioral groupings among anomalies.
    Bo et al. (2024) [31] developed a few-shot intrusion detection model integrating Adaptive
Feature Fusion (AFF) with Prototypical Networks. Using CICIDS2017 and ISCX2012, the system achieved
over 99 % accuracy with minimal labeled data, thanks to feature diversity from binary and
statistical sources. Despite this, it lacks temporal modeling and explainability, and does not
address post-detection analysis like clustering. SiamIDS takes a step further by employing Bi-LSTM
for sequence modeling, SHAP for decision transparency, and OPTICS for behavioral analysis.
    Touré et al. (2024) [32] proposed a hybrid zero-day attack detection framework combining
supervised (CNN, DT, RF, KNN, NB) and unsupervised (K-Means) learning with online adaptation. The
methodology includes flow feature engineering, anomaly identification via silhouette-based
clustering, and new class validation through online learning. Experiments were conducted on IBM
real-time network flows and NSL-KDD datasets. Results show high accuracy: 98.4 % (IBM), 96.6 %
(NSL-KDD); F1-score up to 99 %, specificity and precision above 98 %, and recall exceeding 97 %.
Limitations include dependence on clustering thresholds and need for periodic model retraining to
maintain real-time responsiveness.
    Chintapalli et al. (2024) [33] proposed an intrusion detection framework for IoT systems using
OOA-modified Bi-LSTM with ELU activation for robust sequence learning. The Osprey Optimization
Algorithm (OOA) selected informative features from N-BaIoT, CICIDS-2017, and ToN-IoT datasets. The
model achieved impressive results: N-BaIoT (99.98 % accuracy, 99.94 % recall, 99.90 % precision,
99.89 % F1, 99.90 % specificity), CICIDS-2017 (99.97 % accuracy, 99.91 % recall, 99.96 % F1), and
ToN-IoT (99.88 % accuracy, 99.89 % recall, 99.90 % F1). The novelty lies in integrating OOA for
feature selection and ELU to avoid vanishing gradients. Limitations include reliance on predefined
datasets and absence of real-time deployment validation.
    Guan et al. (2024) [34] proposed ACS-IoT, a two-tier anomaly

Dung Beetle Optimization (DBO) for feature selection, CNN-BiLSTM for classification, and Spotted
Hyena Optimization (SHO) for tuning. Evaluated on NSL-KDD dataset (148,517 samples), it achieved
99.49 % accuracy, 94.49 % recall, 88.75 % precision, 91.24 % F1-score, and high specificity. The
novelty lies in integrating biologically inspired optimizers with deep learning. Results showed
robust detection across attack types. Limitations include potential inefficiency in tuning across
scenarios and computational cost for high-dimensional data.
    Duc et al. (2025) [38] proposed FedSAGE, a federated DGA malware detection system using
Variational Autoencoder (VAE)-based unsupervised clustering and resource-aware client selection.
The methodology includes latent space representation via pre-trained VAEs and client grouping using
affinity propagation. Evaluated on a multi-zone DGA dataset with CNN, BiLSTM, and Transformer
models, it achieved up to 89.83 % accuracy, 80.32 % F1-score, precision near 90 %, recall above
80 %, and strong specificity in unseen attack scenarios. Novelty lies in clustering clients without
raw data or labels. Limitations include scaling affinity propagation and assuming client
reliability, which may affect performance in large deployments.
    Natha et al. (2025) [39] introduced the Composite Recurrent Bi-Attention (CRBA) model for
spatiotemporal anomaly detection in video surveillance. Combining DenseNet201 for spatial feature
extraction with BiLSTM networks and attention layers for temporal modeling, the methodology targets
real-time detection of anomalies like accidents and theft. Evaluated on UCF Crime and Road Anomaly
Dataset (RAD), the model achieved 92.2 % (RAD) and 86.2 % (UCF) accuracy, with F1-scores over 92 %,
precision and recall exceeding 92 %, and specificity above 91 %. Limitations include high
computational demands; novelty lies in integrating attention-driven BiLSTM with DenseNet to enhance
spatiotemporal anomaly recognition.
    Alsaleh et al. (2025) [40] proposed a semi-decentralized federated learning model for intrusion
detection in heterogeneous IoT networks. The methodology clusters resource-constrained IoT clients,
using BiLSTM, LSTM, and WGAN as lightweight local models. Trained on CICIoT2023, the BiLSTM model
achieved 99.09 % accuracy, 68.05 % recall, 79.48 % precision, 70.45 % F1-score, and robust
specificity.



Table 1
CIC IoT-DIAD 2024 dataset Traffic Distribution by Attack Category.

  Traffic Category   Attack Family       Specific Attack Types                          Number of Records
  Benign             -                   Normal IoT Traffic                             398,330
  Malicious          Brute Force         Dictionary Attack                              3,619
                     Distributed DoS     ACK_Frag, ICMP_Flood, HTTP_Flood, ICMP_Frag    3,478,814
                     Denial of Service   SYN_Flood, HTTP_Flood, UDP_Flood               7,901,855
                     Mirai Variant       Mirai-greeth Flood                             174,588
                     Reconnaissance      Vulnerability Scan                             442,158
                     Spoofing            ARP Spoofing, DNS Spoofing                     157,238
                     Web-Based           SQL Injection                                  11,328

Novelty lies in clustering clients by model update similarity using autoencoder-processed weights
and Manhattan-based K-means, enhancing FedAvg aggregation and reducing communication overhead.
Limitations include underperformance on severely imbalanced classes and increased complexity in
cluster formation, suggesting avenues for dynamic clustering optimization.
    Mohale & Obagbuwa (2025) [41] developed an XAI-integrated ML-based IDS using Decision Trees,
MLP, XGBoost, Random Forest, CatBoost, Logistic Regression, and Gaussian Naive Bayes. Tested on
UNSW-NB15 (2.5 M records, 9 attack types), XGBoost and CatBoost achieved 87 % accuracy, 0.86–0.87
precision, 0.88 recall, 0.87 F1-score, and 0.94 ROC-AUC. The novelty lies in combining SHAP, LIME,
and ELI5 for interpretable IDS decision-making. Limitations include dataset scope and challenges
integrating XAI into resource-constrained environments. Results affirm improved transparency
without compromising detection performance.
    While recent advances in intrusion detection have achieved strong performance using deep
learning, most existing methods continue to face several critical limitations that hinder their
effectiveness in real-world cloud-IoT deployments. First, many models rely heavily on supervised
learning and labeled datasets, making them ineffective against zero-day attacks or unseen threat
patterns. Second, although Siamese architectures and few-shot models have been introduced, they
often neglect temporal behavior modeling, which is crucial for capturing evolving patterns in IoT
traffic. Another recurring issue is the lack of interpretability. Most state-of-the-art IDS
solutions do not explain their decision-making process, making them impractical for SOC analysts
who require transparency for trust and incident response. While some works have explored SHAP or
LIME, these are usually decoupled from sequence-aware architectures or do not integrate
similarity-based anomaly detection. Moreover, post-detection behavioral clustering, which can aid
in triaging threats and identifying variants, is rarely incorporated into modern IDS pipelines.
Additionally, cloud readiness and real-time scalability remain under-addressed. Many models exhibit
high training accuracy but are not optimized for deployment in dynamic, resource-constrained
environments like microservices or distributed SOCs.
    To bridge these gaps, we propose SiamIDS, a unified, cloud-centric framework that incorporates:

 • Autoencoder-based compression for dimensionality reduction,
 • Bi-LSTM Siamese architecture for temporal similarity learning and zero-shot detection,
 • SHAP explainability for transparent decision-making, and
 • OPTICS clustering for post-detection threat grouping.

    This holistic design not only improves detection accuracy but also provides behavioral insights
and practical deployability, fulfilling both technical and operational requirements of
next-generation IoT security systems.

Fig. 3. CIC IoT-DIAD 2024 dataset Attack Category Distribution Percentage.

3. Materials and methods

3.1. Materials

3.1.1. CIC IoT-DIAD 2024 dataset
    All experimental evaluations for SiamIDS are conducted using the CIC IoT-DIAD 2024 dataset
[42], a comprehensive and recently released benchmark for IoT network intrusion detection. This
dataset was chosen for its realistic representation of network behavior across diverse IoT devices
under both benign and adversarial conditions, providing a challenging and practical testbed for
intrusion diagnosis. As shown in Table 1, it includes flow-level records for 33 distinct attack
types, grouped into 7 high-level attack families: DDoS, DoS, Spoofing, Mirai, Reconnaissance,
Web-based intrusions, and Brute Force attacks. Each flow comprises 83 features, capturing a broad
spectrum of traffic characteristics, including timestamps, protocol flags, packet and byte
statistics, flow duration, and header information [43]. The dataset is provided in preprocessed CSV
format with ground-truth labels for both binary classification (Benign vs. Attack) and multiclass
classification (specific attack types). A notable challenge of the dataset is its class imbalance,
with benign traffic constituting a smaller fraction of total flows, while certain attack types like
UDP Flood or ACK Fragmentation dominate, and others like SQL Injection are underrepresented. This
imbalance motivates the use of contrastive learning within the Siamese framework, which focuses on
modeling behavioral similarity rather than relying on traditional class distributions. The
dataset's richness and diversity make it suitable for evaluating SiamIDS under large-scale,
imbalanced, and heterogeneous IoT traffic conditions.
    Additionally, Fig. 3 presents the overall class distribution across major families,
highlighting the dominance of DoS and DDoS traffic and the relatively minor presence of attacks
such as Spoofing or Web-based intrusions. This data distribution profile poses a real-world
challenge for intrusion detection models and serves as a robust foundation for evaluating SiamIDS
under imbalanced, diverse, and large-scale conditions.

3.1.2. Data pre-processing
    The proposed SiamIDS framework is trained and evaluated using the CIC IoT-DIAD 2024 dataset
[42], which comprises high-dimensional IoT network traffic, including benign flows and 33 distinct
attack types. To prepare the data for temporal similarity modeling and ensure learning efficiency,
the following preprocessing steps are applied. First, feature scaling is performed using Z-score
normalization [44], with $D_i$ defined as in Eq. (1):

$$ D_i = \frac{tD_i - \mu}{\sigma} \tag{1} $$

where $tD_i$ is the original traffic data, $\mu$ is the mean, and $\sigma$ is the standard
deviation. While Z-score assumes approximate normality and does not



                                                            Fig. 4. Operational architecture of Shallow Autoencoder.
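
As a rough companion to Fig. 4, the following sketch shows the forward pass of a shallow autoencoder with a single encoder layer and a single decoder layer, followed by the mean squared reconstruction error. It is a minimal illustration under stated assumptions: the 83 input features and the 20-unit bottleneck are taken from the text, while the sigmoid activation, the random (untrained) weights, the inputs scaled to [0, 1], and every function name are illustrative choices of ours rather than the authors' implementation.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dense(weights, bias, x, act):
    """One fully connected layer: act(W x + b)."""
    return [act(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def init_layer(n_out, n_in, rng):
    """Small random weights; a trained model would learn these instead."""
    scale = 1.0 / math.sqrt(n_in)
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def mse(x, x_hat):
    """Mean squared reconstruction error between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


rng = random.Random(0)
N_FEATURES, N_LATENT = 83, 20          # sizes quoted in the text
W_e, b_e = init_layer(N_LATENT, N_FEATURES, rng)
W_d, b_d = init_layer(N_FEATURES, N_LATENT, rng)


def encode(x):
    return dense(W_e, b_e, x, sigmoid)  # compressed latent vector z


def decode(z):
    return dense(W_d, b_d, z, sigmoid)  # reconstruction of the input


x = [rng.random() for _ in range(N_FEATURES)]  # stand-in for one scaled flow
z = encode(x)
x_hat = decode(z)
print(len(z), len(x_hat), round(mse(x, x_hat), 4))
```

In a trained model, only the encoder output `z` would be forwarded to the downstream sequence model; the decoder exists to define the reconstruction objective.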


Table 2
Contrastive Pair Generation Statistics.

  Pair Type              Description                                              Count
  Positive Pairs         Unique benign–benign pairs from training split           100,000
  Negative Pairs         Unique benign–attack pairs from training split           100,000
  Total Training Pairs   For Siamese contrastive learning                         200,000
  Validation Pairs       50 % positive, 50 % negative from validation split       20,000
  Reference Set          Benign flows used for similarity scoring at inference    10,000
  Test Sequences         Unseen flows (Benign + Attack) from test split           ~2.5 million

Table 3
Dataset Splits and Their Roles in Model Training, Validation, and Evaluation.

  Dataset Split    Data Proportion / Size                Purpose / Usage
  Training Set     70 % of benign and attack flows       Used for Autoencoder and Siamese training; initial OPTICS parameter calibration
  Validation Set   10 % of benign and attack flows       Used to generate validation pairs and tune the similarity threshold
  Test Set         20 % of mixed traffic flows           Reserved for final performance evaluation and clustering
  Reference Set    10,000 benign flows (from training)   Excluded from training; used at test time for similarity comparison
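
The split scheme of Table 3 can be sketched as a per-class (stratified) 70/10/20 partition followed by carving a benign reference set out of the training portion. This is a hedged illustration rather than the authors' code: the function names, the label encoding (0 = benign, 1 = attack), and the toy sizes are assumptions made for the example.

```python
import random


def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=0):
    """Split indices class by class so each subset keeps the class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    train, val, test = [], [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]      # remainder, about fractions[2]
    return train, val, test


def hold_out_reference(train_idx, labels, n_ref, benign_label=0, seed=0):
    """Remove n_ref benign flows from the training indices to form the reference set."""
    rng = random.Random(seed)
    benign = [i for i in train_idx if labels[i] == benign_label]
    reference = set(rng.sample(benign, n_ref))
    remaining = [i for i in train_idx if i not in reference]
    return remaining, sorted(reference)


# Toy data: 200 benign (0) and 800 attack (1) flows.
labels = [0] * 200 + [1] * 800
train, val, test = stratified_split(labels)
train, reference = hold_out_reference(train, labels, n_ref=10)
print(len(train), len(val), len(test), len(reference))
```

Holding the reference flows out of the training indices, as Table 3 specifies, keeps the embeddings used for test-time comparison from having been seen during contrastive training.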


explicitly model non-linear relationships, it effectively standardizes the feature space prior to
neural network training. In SiamIDS, non-linear dependencies are subsequently captured by the
autoencoder, making Z-score a lightweight and effective preprocessing choice. Z-score is favored
over min–max or robust scaling because it recenters features around zero with unit variance, which
is essential for LSTM-based models that are sensitive to feature scale across time steps [45,46].
This promotes gradient stability and uniform feature influence during sequence learning. Next,
sequence slicing converts raw traffic flows into fixed-length windows (e.g., 10–20 packets),
preserving temporal continuity. Finally, label conversion is applied: each sequence is labeled as
Benign or Malicious, enabling binary contrastive learning in the Siamese network. This aligns with
the framework's focus on modeling behavioral similarity rather than traditional multi-class
classification.

3.1.3. Feature extraction
    To improve efficiency, generalization, and training stability in the SiamIDS framework, a
shallow Autoencoder (AE) is employed for dimensionality reduction [47]. As illustrated in Fig. 4,
the Autoencoder module is a key component of the overall SiamIDS architecture, which integrates
dimensionality reduction, Siamese Bi-LSTM-based detection, SHAP-based explainability, and
OPTICS-based clustering. This unsupervised AE neural network is trained exclusively on benign
traffic, allowing it to learn compressed latent representations that capture essential, noise-free
behavioral features from high-dimensional IoT traffic data.

3.1.4. Pair generation strategy
    To support contrastive learning in SiamIDS, we construct pairs of network flow sequences that
reflect behavioral similarity or dissimilarity. A stratified contrastive sampling approach is
adopted to ensure diversity and prevent overlap across training, validation, and reference sets
[48]. Positive Pairs are built from randomly selected benign flows and represent behaviorally
similar sequences. Negative Pairs consist of benign and malicious sequences, highlighting
dissimilar patterns in flow dynamics. Validation Pairs are sampled independently for threshold
tuning and ROC analysis, and a reference set of benign flows is held out exclusively for similarity
comparison during inference. The overall pair composition and dataset usage are detailed in
Table 2. This setup ensures balanced training, avoids information leakage, and allows the Siamese
model to generalize to diverse and unseen attacks.

3.1.5. Training and testing splits
    To ensure robust and leakage-free evaluation, the CIC IoT-DIAD 2024 dataset is partitioned into
stratified training, validation, and testing subsets. Stratification preserves the distribution of
benign and attack flows across splits, ensuring balanced representation of all classes. A reference
set of benign flows is held out exclusively for test-time similarity scoring in the Siamese
network, preventing overlap with training data and enabling unbiased anomaly assessment. For
contrastive learning, unique positive (Benign–Benign) and negative (Benign–Attack) pairs are
generated using a stratified sampling strategy, as detailed in Section 3.1.4. Training pairs are
used to teach the Siamese network robust behavioral embeddings, validation pairs support threshold
tuning and ROC evaluation, and the reference set is employed solely during inference to compute
similarity scores. This partitioning strategy enhances generalization to unseen attack types,
mitigates overfitting, and aligns with SiamIDS's emphasis on behavioral similarity-based intrusion
detection (see Table 3 for dataset splits and their roles).
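
The pair construction described in Sections 3.1.4 and 3.1.5 can be illustrated with a small sampler that draws unique benign–benign (positive) and benign–attack (negative) pairs. This is a sketch under assumptions: flows are represented by integer identifiers, simple rejection sampling stands in for whatever stratified sampler the authors used, and the label convention (1 = similar) is chosen here for illustration.

```python
import random


def make_pairs(benign, attack, n_pos, n_neg, seed=0):
    """Sample unique positive and negative pairs of flow identifiers.

    Returns tuples (id_a, id_b, is_similar), with is_similar = 1 for
    benign-benign pairs and 0 for benign-attack pairs.
    """
    rng = random.Random(seed)
    max_pos = len(benign) * (len(benign) - 1) // 2
    max_neg = len(benign) * len(attack)
    if n_pos > max_pos or n_neg > max_neg:
        raise ValueError("not enough flows for the requested number of unique pairs")

    positives = set()
    while len(positives) < n_pos:
        a, b = rng.sample(benign, 2)            # two distinct benign flows
        positives.add((min(a, b), max(a, b)))   # order-independent, so no duplicates

    negatives = set()
    while len(negatives) < n_neg:
        negatives.add((rng.choice(benign), rng.choice(attack)))

    pairs = [(a, b, 1) for a, b in positives] + [(a, b, 0) for a, b in negatives]
    rng.shuffle(pairs)
    return pairs


# Toy identifiers: 100 benign flows and 200 attack flows.
benign_ids = list(range(100))
attack_ids = list(range(1000, 1200))
pairs = make_pairs(benign_ids, attack_ids, n_pos=50, n_neg=50)
print(len(pairs), sum(label for _, _, label in pairs))
```

Sampling an equal number of positive and negative pairs, as in Table 2, keeps the contrastive objective balanced even though the underlying flow classes are not.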




                                               Fig. 5. Architecture of the Bi-LSTM layers in SiamIDS framework.


3.2. Methods

3.2.1. Autoencoder-based feature compression for IoT intrusion detection
    Autoencoders are unsupervised neural networks that learn compressed representations of input
data by reconstructing it with minimal error. In IoT intrusion detection, they efficiently reduce
feature dimensionality while preserving critical behavioral patterns of network traffic [49,50]
(Fig. 4).
    An autoencoder comprises an encoder that maps input $x \in \mathbb{R}^n$ to a lower-dimensional
latent space $z \in \mathbb{R}^m$ ($m < n$) via a non-linear transformation $f$ as defined in
Eq. (2), and a decoder $g$ that reconstructs $x$ from $z$ as defined in Eq. (3). Training minimizes
reconstruction loss, typically Mean Squared Error (MSE):

$$ z = f(x) = \sigma(W_e x + b_e) \tag{2} $$

$$ \hat{x} = g(z) = \sigma(W_d z + b_d) \tag{3} $$

where $W$ and $b$ denote weights and biases, and $\sigma$ is the activation function
(ReLU/Sigmoid). In the SiamIDS framework, the autoencoder compresses inputs before feeding them
into the Siamese Bi-LSTM, enhancing computational efficiency and filtering noise while preserving
flow characteristics. It is trained exclusively on benign traffic to model normal behavior;
significant reconstruction errors indicate anomalies. The employed architecture features shallow
fully connected encoder-decoder layers with a 20-neuron bottleneck, empirically optimized to
balance reconstruction accuracy and compactness. This setup ensures effective dimensionality
reduction without compromising the ability to discriminate anomalous traffic, forming a robust
foundation for subsequent temporal and similarity-based analysis.

Fig. 6. Siamese Network Similarity Learning.

3.2.2. Bi-LSTM-based temporal modeling of network traffic
    Bidirectional Long Short-Term Memory (Bi-LSTM) networks extend Recurrent Neural Networks (RNNs)
by processing sequential data in both forward and backward directions, thereby capturing contextual
information from past and future time steps. In intrusion detection, where network traffic exhibits
temporal dependencies, Bi-LSTM effectively models evolving flow behaviors. An LSTM unit maintains a
cell state $C_t$ governed by three gates, input ($i_t$), forget ($f_t$), and output ($o_t$), as
defined in Eqs. (4–9). These mechanisms enable selective retention and updating of information over
time. Unlike conventional LSTMs, Bi-LSTM concatenates hidden states from both directions,
$[\overrightarrow{h}_t ; \overleftarrow{h}_t]$, allowing comprehensive temporal representation of
traffic sessions. The internal architecture of the Bi-LSTM layers used in the SiamIDS framework is
illustrated in Fig. 5.

$$ f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{4} $$

$$ i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{5} $$

$$ \tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{6} $$

$$ C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{7} $$

$$ o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{8} $$

$$ h_t = o_t \odot \tanh(C_t) \tag{9} $$

    Within the SiamIDS framework, Bi-LSTM constitutes the core of the twin subnetworks, generating
time-aware, flow-sensitive embeddings for each input instance. These embeddings are leveraged to
compute similarity scores during contrastive training and inference. The implemented Bi-LSTM
employs two LSTM layers per direction with 64 hidden units, integrated with dropout and batch
normalization for regularization and stability. By capturing bidirectional and long-range
dependencies, Bi-LSTM enhances the framework's ability to discern subtle temporal deviations,
significantly improving zero-day attack diagnosis accuracy.
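
The gate equations (4)–(9) and the bidirectional concatenation can be traced step by step in a small, dependency-free sketch. It is not the configuration reported above: it uses a single LSTM layer per direction, a tiny hidden size, random untrained weights, and no dropout or batch normalization. The average pooling at the end follows the description in Section 4.1; all function names are hypothetical.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, v):
    """Compute W v + b for a weight matrix W and bias vector b."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def init_lstm(n_in, n_hid, rng):
    """Random parameters for the four transforms applied to [h_{t-1}, x_t]."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(n_hid + n_in)]
                for _ in range(n_hid)]
    return {g: (mat(), [0.0] * n_hid) for g in ("f", "i", "c", "o")}


def lstm_step(p, h_prev, c_prev, x):
    hx = h_prev + x                                           # [h_{t-1}, x_t]
    f = [sigmoid(v) for v in affine(*p["f"], hx)]             # Eq. (4)
    i = [sigmoid(v) for v in affine(*p["i"], hx)]             # Eq. (5)
    c_tilde = [math.tanh(v) for v in affine(*p["c"], hx)]     # Eq. (6)
    c = [ft * cp + it * ct
         for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]    # Eq. (7)
    o = [sigmoid(v) for v in affine(*p["o"], hx)]             # Eq. (8)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]          # Eq. (9)
    return h, c


def run(p, seq, n_hid):
    """Run one direction over the sequence and collect every hidden state."""
    h, c, out = [0.0] * n_hid, [0.0] * n_hid, []
    for x in seq:
        h, c = lstm_step(p, h, c, x)
        out.append(h)
    return out


def bilstm_embedding(p_fwd, p_bwd, seq, n_hid):
    fwd = run(p_fwd, seq, n_hid)
    bwd = run(p_bwd, seq[::-1], n_hid)[::-1]         # realign backward states in time
    states = [hf + hb for hf, hb in zip(fwd, bwd)]   # concatenate both directions
    steps = len(states)
    return [sum(s[k] for s in states) / steps for k in range(2 * n_hid)]  # average pooling


rng = random.Random(0)
N_IN, N_HID, STEPS = 20, 4, 10                       # toy sizes; the text reports 64 hidden units
p_fwd, p_bwd = init_lstm(N_IN, N_HID, rng), init_lstm(N_IN, N_HID, rng)
sequence = [[rng.random() for _ in range(N_IN)] for _ in range(STEPS)]
embedding = bilstm_embedding(p_fwd, p_bwd, sequence, N_HID)
print(len(embedding))
```

The embedding has twice the hidden size because forward and backward states are concatenated before pooling.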




           Fig. 7. SHAP Force Plot Illustrating Feature Contributions.
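
Force plots such as Fig. 7 visualize per-feature Shapley values. For a handful of features these values can be computed exactly by enumerating every feature subset, which the sketch below does for a toy scoring function. The feature names and the scoring function are hypothetical, and practical SHAP tooling relies on approximations instead, because exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values by subset enumeration.

    `value(S)` returns the model output when only the features in the
    frozenset S are present. Cost grows as 2^|F|, so keep F small.
    """
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for k in range(n):                       # subset sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))  # marginal contribution
        phi[i] = total
    return phi


def toy_score(s):
    """Hypothetical anomaly score built from three flow features."""
    score = 0.0
    if "syn_rate" in s:
        score += 0.5
    if "pkt_size" in s:
        score += 0.2
    if "syn_rate" in s and "duration" in s:      # interaction term
        score += 0.2
    return score


names = ["syn_rate", "pkt_size", "duration"]
phi = shapley_values(names, toy_score)
for name in names:
    print(f"{name:<9} {phi[name]:+.3f}")
```

The attributions sum to the difference between the score with all features and the score with none, and the interaction term is shared equally between the two features that produce it.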


3.2.3. Siamese network for similarity-based anomaly detection
    A Siamese Neural Network employs dual, weight-shared subnetworks that learn a discriminative
similarity metric between paired inputs through their latent feature representations. In intrusion
diagnosis, this design effectively differentiates benign and malicious traffic, particularly under
limited or imbalanced labeled data conditions [51,52]. Each branch receives distinct inputs $x_1$
and $x_2$, generating embeddings $f(x_1)$ and $f(x_2)$. The similarity is measured using the
Euclidean distance, as defined in Eq. (10):

$$ D(x_1, x_2) = \lVert f(x_1) - f(x_2) \rVert_2 \tag{10} $$

    Learning is governed by the contrastive loss function, presented in Eq. (11):

$$ L = (1 - y)\,\tfrac{1}{2} D^2 + y\,\tfrac{1}{2} \max(0, m - D)^2 \tag{11} $$

where $y \in \{0, 1\}$ is the pair label (in this form, $y = 0$ selects the similar-pair term and
$y = 1$ the dissimilar-pair term) and $m$ defines the margin for dissimilar samples.
    As shown in Fig. 6, the SiamIDS framework trains on both intra-class (similar) and inter-class
(dissimilar) traffic pairs to model behavioral proximity. During inference, each traffic instance
is compared against benign references; instances exceeding a learned threshold are marked
anomalous. The similarity-driven paradigm enables zero-day threat identification, minimizes
dependence on predefined class boundaries, and enhances scalability. Combined with Bi-LSTM-based
temporal encoding, the Siamese configuration reinforces contextual discrimination and
interpretability within complex IoT–cloud environments.

3.2.4. SHAP for feature-level explainability in intrusion detection
    Interpretability is a critical requirement in cybersecurity applications, particularly for deep
learning models deployed in sensitive or mission-critical environments. To overcome the "black-box"
limitation of architectures such as Bi-LSTM and Siamese networks, the SHapley Additive exPlanations
(SHAP) framework is integrated into the SiamIDS model to provide transparent, feature-level
interpretability.
    SHAP is a game-theoretic approach that assigns each input feature a contribution score (Shapley
value) toward the model's prediction [36,41]. The Shapley value for feature $i$ is defined in
Eq. (12):

$$ \phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F| - |S| - 1)!}{|F|!}\,\bigl[f(S \cup \{i\}) - f(S)\bigr] \tag{12} $$

where $F$ represents the full feature set, $S$ is any subset excluding $i$, and $f(S)$ is the model
output using only features in $S$. This formulation evaluates a feature's marginal contribution
across all possible feature combinations. Within SiamIDS, SHAP is applied post-inference to
interpret anomaly predictions generated by the Siamese module. Once a traffic flow is flagged as
malicious, SHAP computes per-feature importance scores, revealing which attributes influenced the
anomaly score most strongly. As shown in Fig. 7, SHAP visualizations such as force plots enable
both local and global interpretation of detection outcomes.
    Integrating SHAP enhances model transparency, supports validation and debugging, and fosters
trust by aligning SiamIDS with the broader principles of explainable artificial intelligence (XAI)
in IoT–cloud intrusion diagnosis.

3.2.5. OPTICS for density-based clustering of anomalous behaviors
    Beyond detecting intrusions, grouping anomalies into coherent behavioral clusters is essential
for root cause analysis and threat profiling. To address this, the SiamIDS framework employs OPTICS
(Ordering Points To Identify the Clustering Structure) for post-detection clustering of anomalous
traffic. OPTICS is a density-based algorithm that extends DBSCAN by identifying clusters of varying
densities without requiring a predefined cluster count. It introduces two key metrics, core
distance and reachability distance, to reveal hierarchical data structures. The reachability
distance between two points is defined in Eq. (13) as:

$$ \text{reachability-dist}(p, o) = \max\bigl(\text{core-dist}(o),\ \text{dist}(p, o)\bigr) \tag{13} $$

where core-dist($o$) is the minimum radius $\varepsilon$ containing at least MinPts neighbors.
    In SiamIDS, anomalous flows detected by the Siamese Bi-LSTM module are passed to OPTICS for
clustering. This enables behavioral grouping, where related attack variants, such as multiple DDoS
or botnet types, are organized into semantically meaningful clusters. As shown in Fig. 8, the
resulting reachability plots and 2D projections reveal the underlying structure of anomalous
behaviors.

Fig. 8. OPTICS Clustering of Anomalies.

    OPTICS provides several advantages: it eliminates the need to specify the number of clusters,
effectively detects non-convex and variable-density formations, and exhibits strong resilience to
noise. Its integration enhances post-detection analytics, enabling Security Operations Centers
(SOCs) to interpret, correlate, and prioritize anomalies efficiently, thereby supporting dynamic
threat intelligence and adaptive response in complex IoT–cloud ecosystems.

4. Proposed methodology: SiamIDS for interpretable IoT intrusion detection

    This section details the internal design, operational workflow, and implementation components
of SiamIDS, a novel intrusion detection system engineered for interpretability, zero-day detection,
and scalable deployment in IoT-cloud ecosystems. The methodology addresses several pressing
challenges in modern IDS, namely detection of zero-day attacks, model explainability, low-resource
deployment, and post-detection behavioral analysis. SiamIDS integrates five core modules: an
autoencoder for dimensionality reduction, a Bi-LSTM backbone for




          Fig. 9. Architectural overview of SiamIDS integrating autoencoder, Bi-LSTM Siamese network, SHAP-based explanation, and OPTICS clustering.


temporal modeling, a Siamese network for contrastive similarity learning, SHAP for explainability, and OPTICS for clustering of detected anomalies. Each component plays a crucial role in enabling the system to accurately and transparently detect malicious behavior.

4.1. System model

    The SiamIDS framework operates through a structured sequence of processes encompassing dimensionality reduction, temporal embedding, similarity learning, interpretable decision-making, and post-detection clustering. Initially, a shallow autoencoder is trained exclusively on benign traffic to compress high-dimensional network vectors $D$ into a compact latent representation $\hat{Z}_D$. The encoder and decoder functions are defined in Eqs. (14) and (15), respectively:

$$\hat{Z}_D = E_\theta(D) = \sigma(W_e D + b_e) \tag{14}$$

$$\hat{D} = g_\theta(\hat{Z}_D) = \sigma(W_d \hat{Z}_D + b_d) \tag{15}$$

where $W_e, W_d$ and $b_e, b_d$ are trainable parameters, and $\sigma$ is the activation function (ReLU for encoder, Sigmoid for decoder). The network is trained to minimize the mean squared error (MSE) between original and reconstructed inputs as defined in Eq. (16):

$$\mathrm{MSE}_{loss} = \frac{1}{n} \sum_{i=1}^{n} \left| D_i - \hat{D}_i \right|^2 \tag{16}$$

Training proceeds until the convergence threshold $T$ is satisfied. The reduced-dimensional sequence $\hat{Z}_D$ is then passed through a Bi-LSTM to capture temporal dependencies. The hidden state at time $t$ is computed as in Eq. (17):

$$h_t = \overrightarrow{h_t} \,\|\, \overleftarrow{h_t} \tag{17}$$

and aggregated via average pooling to form a global sequence embedding $e$. To distinguish benign from malicious traffic, SiamIDS employs a Siamese architecture with contrastive learning. Given paired embeddings $e_1, e_2$, the Euclidean distance $d(e_1, e_2) = \|e_1 - e_2\|_2$ is minimized for similar pairs and maximized for dissimilar pairs using the contrastive loss defined in Eq. (18):

$$L_{con} = y\, d^2 + (1 - y)\, \max(0, m - d)^2 \tag{18}$$

where $y \in \{0, 1\}$ indicates pair similarity, and $m$ enforces separation between dissimilar samples. During inference, a test sequence $D_{test}$ is encoded into $e_{test}$ and compared to reference benign embeddings $E_{ref}$. The mean distance defines an anomaly score, and sequences exceeding threshold $\tau$ are flagged as anomalous. To ensure interpretability, SHAP computes feature-level contributions for each prediction as per Eq. (19):

$$f(x) = \phi_0 + \sum_{i=1}^{n} \phi_i \tag{19}$$

where $\phi_0$ is the expected model output and $\phi_i$ quantifies the contribution




Algorithm 1
SiamIDS Working Flow.
  Input: Network traffic sequences D
    1. Normalize features using Z-score.
    2. Encode with autoencoder: Z_D = E(D)
    3. Construct pair set:
     → Positive: (B1, B2), label y=1
     → Negative: (B1, A), label y=0
    4. For each pair:
     → Compute embeddings (e1, e2)
     → Compute distance: d = ||e1 - e2||₂
     → Compute L_con and update model
    5. During inference:
     → Encode test: e_test
     → Compare to E_ref
     → Compute anomaly score
     → Apply SHAP to explain decisions
     → Cluster anomalies using OPTICS


of feature i. DeepExplainer is employed to provide human-understandable insights into feature influences. Finally, detected anomalies $E_{anom} = \{e_1, e_2, \ldots, e_n\}$ are analyzed with OPTICS clustering for behavioral grouping. Core and reachability distances are computed as in Eqs. (20) and (21):

$$\mathrm{core}(p) = \text{distance to the } minPts\text{-th neighbor} \tag{20}$$

$$\mathrm{reachability}(o, p) = \max\left(\mathrm{core}(p), \mathrm{distance}(p, o)\right) \tag{21}$$

    The resulting reachability plot reveals dense clusters and sparse outliers, supporting SOC analysts in profiling attack families. Collectively, the formulations in Eqs. (14)–(21) define SiamIDS’s learning objectives, similarity metrics, decision thresholds, interpretability logic, and clustering strategies, enabling robust, scalable, and explainable intrusion detection in complex IoT–cloud environments.

4.2. Architecture and working of SiamIDS

    This section introduces SiamIDS, a cloud-compatible intrusion detection framework developed for scalable and interpretable anomaly detection in IoT environments. As depicted in Fig. 9, the framework begins with a data preprocessing stage that includes Z-score-based feature scaling, fixed-length sequence slicing, and label transformation.
    The processed data is then passed into a shallow autoencoder, trained exclusively on benign traffic, to generate low-dimensional latent representations. These embeddings capture core behavioral patterns while reducing computational overhead.
    To enable contrastive learning, SiamIDS constructs input pairs, namely positive pairs (Benign–Benign) and negative pairs (Benign–Malicious), which are then fed into a Siamese network consisting of two identical Bi-LSTM branches. Each branch encodes the temporal dependencies in the


                          Fig. 10. Process flow of the shallow Autoencoder used for dimensionality reduction in the SiamIDS framework.



respective input sequences, and the network outputs a similarity score
that quantifies behavioral similarity.
    During inference, each test sequence is compared against a reference
pool of benign embeddings to determine whether it is anomalous. For
model transparency, the system integrates a SHAP-based explainability
layer, which highlights the contribution of each feature toward the
model’s decision.
    Finally, the anomalous outputs are subjected to post-detection clus­
tering using OPTICS, a density-based algorithm that organizes similar
anomalies into behavioral clusters while identifying outliers. This sup­
ports real-time triaging and semantic profiling of novel or zero-day
threats in large-scale IoT deployments. The step-by-step operational
flow of SiamIDS is detailed in Algorithm 1.
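
    As an illustration of the preprocessing stage named in this section (Z-score scaling followed by fixed-length sequence slicing), a minimal Python sketch is given below. The function names `zscore_columns` and `slice_sequences`, the use of population variance, and the non-overlapping window policy are assumptions made for illustration; the paper does not report these details.

```python
import math

def zscore_columns(rows):
    """Scale every feature column to zero mean and unit variance."""
    n = len(rows)
    n_features = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_features)]
    stds = []
    for j in range(n_features):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n  # population variance (assumed)
        stds.append(math.sqrt(var) or 1.0)  # constant columns: avoid division by zero
    return [[(r[j] - means[j]) / stds[j] for j in range(n_features)] for r in rows]

def slice_sequences(rows, seq_len):
    """Cut the record stream into non-overlapping windows of seq_len records (assumed policy)."""
    usable = len(rows) - len(rows) % seq_len  # drop the incomplete tail
    return [rows[i:i + seq_len] for i in range(0, usable, seq_len)]

# Toy traffic records with two features; the second feature is constant.
records = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0], [4.0, 5.0], [5.0, 5.0]]
scaled = zscore_columns(records)
sequences = slice_sequences(scaled, seq_len=2)
```

    In practice the scaling statistics would be fitted on training data only and reused at inference time, so that test traffic does not influence the normalization.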

4.3. Autoencoder architecture with latent space design and bottleneck
configuration

    The Autoencoder consists of two parts: an encoder Eθ and a decoder gθ. The overall process of the shallow Autoencoder used in SiamIDS is depicted in Fig. 10, where the input data is encoded into a compressed latent space and then reconstructed to minimize the reconstruction error. The encoder maps the input vector D into a lower-dimensional latent space ZD as in Eq. (14), and the decoder then reconstructs the input as in Eq. (15). Here σ denotes the activation function: ReLU in the encoder and Sigmoid in the decoder. The network is trained to minimize the mean squared error (MSE) between the input D and the reconstructed output D̂, as defined in Eq. (16). A convergence threshold T is monitored during training to determine stability: when |MSE_t − MSE_{t−1}| < T, training stops and the encoder is used for feature compression.
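
    The stopping rule above can be sketched in a few lines of Python. This is only an illustration of Eq. (16) and the |MSE_t − MSE_{t−1}| < T criterion; the helper names `mse` and `epoch_of_convergence`, the loss values, and the tolerance are hypothetical and not taken from the experiments.

```python
def mse(original, reconstructed):
    """Mean squared error between an input vector and its reconstruction (Eq. 16)."""
    return sum((d - r) ** 2 for d, r in zip(original, reconstructed)) / len(original)

def epoch_of_convergence(loss_history, tol):
    """Return the 1-based epoch at which |MSE_t - MSE_{t-1}| < tol first holds, else None."""
    for t in range(1, len(loss_history)):
        if abs(loss_history[t] - loss_history[t - 1]) < tol:
            return t + 1
    return None

# Hypothetical per-epoch reconstruction losses; training would stop at the returned epoch.
losses = [0.5, 0.2, 0.1, 0.0995, 0.0994]
stop_epoch = epoch_of_convergence(losses, tol=1e-3)
```

    A single-step difference can be noisy, so an implementation might require the criterion to hold for several consecutive epochs; the paper does not state whether such a patience window was used.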
    The latent dimension (bottleneck size) is a critical hyperparameter.
We empirically evaluate various latent sizes (10 to 40) and select 20 as
optimal. This choice is based on achieving minimal reconstruction loss
without sacrificing temporal variance or interpretability. Smaller sizes
(e.g., 10 or 15) result in underfitting and information loss, while larger
ones (e.g., 35 or 40) offer negligible accuracy gain but higher
complexity. The chosen bottleneck layer significantly reduces the input
size for the Siamese Bi-LSTM, enhancing computational efficiency and convergence speed.
    Unlike traditional dimensionality reduction techniques such as Principal Component Analysis (PCA) or Information Gain, which assume linear separability or rely on predefined feature importance scores, the Autoencoder offers a more adaptive and data-driven alternative [53,54]. It is capable of capturing non-linear dependencies between features, which are especially common in complex IoT traffic. Moreover, instead of relying on generic variance-based projections like PCA, the Autoencoder learns task-specific embeddings that are optimized for downstream objectives such as temporal similarity learning in the Siamese network. This enables the model to retain semantically meaningful patterns critical for distinguishing subtle behavioral anomalies. Another key advantage is that the Autoencoder avoids manual feature engineering or domain assumptions, allowing the model to generalize across diverse traffic sources and attack types [55]. While PCA projects data into orthogonal components derived from eigenvectors, often without regard to task relevance [56], Autoencoders learn to reconstruct input patterns, preserving latent structures that are most informative for reconstruction error minimization and anomaly detection. This makes Autoencoders particularly suitable for dynamic, evolving network environments, where handcrafted or static feature selection methods may fall short. Once convergence is achieved (see the flowchart in Fig. 10), the transformed vectors ZD from the encoder constitute the reduced-dimensional input to the Siamese network in the detection phase. This modular separation enhances interpretability and enables easy plug-and-play with different detection models.

Fig. 11. Architecture of the Siamese Bi-LSTM network for attack detection in the SiamIDS framework.

4.4. Siamese network with Bi-LSTM backbone

    At the core of the proposed SiamIDS framework is a Siamese neural network composed of two identical sub-networks, each built upon Bi-directional Long Short-Term Memory (Bi-LSTM) layers. This design enables the system to assess behavioral similarity between two network traffic sequences, making it well suited to detecting previously unseen (zero-day) or obfuscated threats through contrastive learning rather than traditional classification [20,57]. As shown in Fig. 11, the Siamese network architecture processes the input sequences through two identical Bi-LSTM branches. Each Siamese branch processes a flow sequence of reduced-dimensional input (from the Autoencoder) and maps it to a latent embedding space. The Bi-LSTM architecture captures sequential dependencies in both forward and backward directions, allowing the model to learn packet timing patterns, transition structures, and burst behaviors commonly present in IoT traffic [58]. The input sequence is D = {D1, D2, …, DT}, where each Dt ∈ ZD is a feature vector for a packet at time step t, and T is the sequence length. The Bi-LSTM produces forward and backward hidden states $\overrightarrow{h_t}$, $\overleftarrow{h_t}$ and concatenates them as $h_t$, as defined in Eq. (17).
    The final output embedding e is derived by average pooling the Bi-LSTM hidden states. Both branches share weights (i.e., θleft = θright), ensuring symmetric encoding and allowing the network to focus on relative sequence similarity rather than absolute classification. The embedding generation process is outlined in Algorithm 2, which




Algorithm 2
Embedding Generation via Siamese Bi-LSTM.
  Define Siamese_BiLSTM_Encoder(θ): Bi-directional LSTM layers with shared weights
    For each input sequence D = {D₁, D₂, …, D_T}:
        Reduce dimensionality: D’ = AE.encode(D)
        Compute Bi-LSTM embedding:
             For t = 1 to T:
                  h→_t = LSTM_forward(D’_t), h←_t = LSTM_backward(D’_t)
                  h_t = [h→_t || h←_t]
             end
        Return e = AveragePool({h₁, h₂, …, h_T})
    end
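
The concatenation and pooling steps of Algorithm 2 can be illustrated without a deep learning library. The sketch below assumes the forward and backward hidden states have already been computed by the two LSTM directions; `concat_states` and `average_pool` are hypothetical helper names, and the toy states are invented for illustration.

```python
def concat_states(forward, backward):
    """h_t = [h_fwd_t || h_bwd_t] for every time step, as in Eq. (17)."""
    return [f + b for f, b in zip(forward, backward)]

def average_pool(states):
    """Average the hidden states over the time axis to obtain the embedding e."""
    steps = len(states)
    width = len(states[0])
    return [sum(h[k] for h in states) / steps for k in range(width)]

# Toy hidden states: T = 2 time steps, 2 units per direction.
h_forward = [[1.0, 2.0], [3.0, 4.0]]
h_backward = [[0.0, 1.0], [2.0, 3.0]]
h = concat_states(h_forward, h_backward)   # each h_t has 2 + 2 = 4 entries
e = average_pool(h)                        # one 4-dimensional embedding
```

Because the two directions are concatenated, the embedding width is twice the number of units per direction, which appears consistent with the 64 units per direction and embedding size of 128 listed in Table 5.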


Algorithm 3
Pair construction and contrastive loss calculation.
  PositivePairs ← RandomPairs(Benign, Benign)
    NegativePairs ← RandomPairs(Benign, Attack)
    TrainPairs ← PositivePairs ∪ NegativePairs
     For each pair (D₁, D₂) in TrainPairs with label y ∈ {1, 0}:
         e₁ = Siamese_BiLSTM_Encoder(D₁)
         e₂ = Siamese_BiLSTM_Encoder(D₂)
         Compute distance: d = ||e₁ - e₂||₂
         Compute contrastive loss:
             L = y * d² + (1 - y) * max(0, m - d)²
         Update weights θ using gradient descent
     end
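
The loss in Algorithm 3 can be checked numerically with a small Python sketch of Eq. (18). The function names are hypothetical, and the embeddings are toy values; only the margin m = 1.0 is taken from the text.

```python
import math

def euclidean(e1, e2):
    """Euclidean distance d = ||e1 - e2||_2 between two embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e1, e2)))

def contrastive_loss(e1, e2, y, margin=1.0):
    """L = y * d^2 + (1 - y) * max(0, m - d)^2, with y = 1 for similar pairs."""
    d = euclidean(e1, e2)
    return y * d ** 2 + (1 - y) * max(0.0, margin - d) ** 2

# Similar pair: any distance is penalized quadratically.
loss_similar = contrastive_loss([0.0, 0.0], [3.0, 4.0], y=1)
# Dissimilar pair already farther apart than the margin: no penalty.
loss_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], y=0)
# Dissimilar pair inside the margin: penalized until pushed beyond m.
loss_near = contrastive_loss([0.0, 0.0], [0.6, 0.0], y=0)
```

The hinge term explains why only dissimilar pairs closer than the margin contribute gradients: once a Benign–Malicious pair is separated by more than m, its loss is zero.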


describes how each input sequence is processed through the Bi-LSTM layers to produce the final embedding.

4.4.1. Pair construction for contrastive training
    The Siamese network is trained using a contrastive learning paradigm. Instead of training the model to classify a sequence, we present it with pairs of sequences, each labeled based on their similarity:

 • Positive pairs: Two benign sequences (Benign–Benign) that are expected to produce high similarity.
 • Negative pairs: One benign and one malicious sequence (Benign–Malicious), which should exhibit low similarity.

    Here (D1, D2) is a sequence pair, and y ∈ {0,1} is the label indicating similarity (1 for similar, 0 for dissimilar). The embeddings e1 = f(D1), e2 = f(D2) are passed through a distance function d, such as Euclidean distance. The contrastive loss function Lcon is then defined as in Eq. (18). This formulation ensures that embeddings of similar pairs are pulled closer, while dissimilar pairs are pushed apart beyond the margin. In our setup, m is empirically set to 1.0, based on convergence behavior and validation performance. To avoid class imbalance, the pair generation is carefully balanced with equal proportions of positive and negative pairs. Malicious samples are randomly sampled from all attack categories, ensuring representation across different threat behaviors. The process for constructing these pairs, as well as computing the contrastive loss and updating the model’s weights, is described in Algorithm 3.

4.4.2. Detection logic during inference
    During inference, each unlabeled test sequence is passed through the trained Siamese model and compared against a reference pool of benign embeddings derived from clean validation data. For a test embedding etest, its distance to each reference er ∈ Eref is computed using a distance function d. The average distance across all comparisons is used as the anomaly score. If this score reaches or exceeds a pre-defined threshold τ, the sequence is classified as anomalous, consistent with Algorithm 4:

$$\mathrm{Label} = \begin{cases} \text{Anomalous}, & \text{if } \dfrac{1}{|E_{ref}|} \sum_{e_r \in E_{ref}} d(e_{test}, e_r) \ge \tau \\ \text{Benign}, & \text{otherwise} \end{cases}$$

    The threshold τ is determined using Receiver Operating Characteristic (ROC) analysis on a held-out validation set to optimize sensitivity and specificity. To ensure real-time capability in large-scale deployments, embedding indexing using FAISS (Facebook AI Similarity Search) is employed. This enables fast retrieval of the most similar benign embeddings without exhaustive pairwise computation [59]. The process of generating reference embeddings and computing anomaly scores is outlined in Algorithm 4.

4.5. Explainability integration with SHAP for feature-level interpretation

    One of the key challenges in deploying deep learning-based intrusion detection systems (IDS) in operational environments is the lack of interpretability. Security analysts often require clear, feature-level explanations for why a traffic instance is flagged as anomalous, especially in high-stakes environments like SOCs (Security Operation Centers). To

Algorithm 4
Generation of reference embeddings and anomaly score computation.
  E_ref = {Siamese_BiLSTM_Encoder(D_r) | D_r ∈ clean validation set}
    For each test sequence D_test ∈ Dtest:
         e_test = Siamese_BiLSTM_Encoder(D_test)
         Compute distance set: S = {||e_test - e_r||₂ | e_r ∈ E_ref}
         AnomalyScore = mean(S)
         if AnomalyScore ≥ τ:
               Label ← Anomalous
         else:
               Label ← Benign
         end
     end
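
The scoring rule of Algorithm 4 can be sketched as follows. This is an exhaustive comparison against the reference pool, written for clarity; it does not use FAISS indexing, and the function names, reference embeddings, and threshold value are hypothetical.

```python
import math

def anomaly_score(e_test, reference_embeddings):
    """Mean Euclidean distance from the test embedding to every benign reference."""
    distances = [math.dist(e_test, e_r) for e_r in reference_embeddings]
    return sum(distances) / len(distances)

def classify(e_test, reference_embeddings, tau):
    """Flag the sequence when its anomaly score reaches the threshold tau."""
    score = anomaly_score(e_test, reference_embeddings)
    return "Anomalous" if score >= tau else "Benign"

# Toy benign reference pool E_ref and a hypothetical threshold.
E_ref = [[0.0, 0.0], [0.0, 2.0]]
tau = 1.5
label_close = classify([0.0, 1.0], E_ref, tau)  # mean distance 1.0
label_far = classify([3.0, 1.0], E_ref, tau)    # mean distance sqrt(10)
```

With a large reference pool, the exhaustive loop would typically be replaced by an approximate nearest-neighbor index so that only the closest benign embeddings are compared.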




Algorithm 5
Explainability Layer using SHAP.
  Encode test sequence using Siamese network:
       e_test ← f_left(D_test)
  Compute similarity score:
       s ← similarity(e_test, e_ref)
  Initialize SHAP Explainer:
       explainer ← DeepExplainer(f_left, background_data)
  Compute SHAP values for test input:
       SHAP_values ← explainer.shap_values(D_test)
  Interpret output:
       For each feature i in D_test:
            ϕ_i ← SHAP_values[i]
       end
  Return explanation vector {ϕ₁, ϕ₂, …, ϕ_n}
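
The additive decomposition that SHAP relies on can be verified on a toy example. The sketch below does not use DeepExplainer; it computes exact Shapley values by enumerating all feature orderings against a single baseline input, which is feasible only for a handful of features. The scoring function `toy_score`, the input, and the baseline are hypothetical.

```python
from itertools import permutations

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to their baseline value."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = f(current)
        for i in order:
            current[i] = x[i]            # reveal feature i
            value = f(current)
            phi[i] += value - previous   # marginal contribution in this ordering
            previous = value
    return [p / len(orders) for p in phi]

def toy_score(v):
    """Hypothetical anomaly score with one additive term and one interaction term."""
    return 2.0 * v[0] + v[1] * v[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi_0 = toy_score(baseline)                  # expected output for a single baseline
phi = shapley_values(toy_score, x, baseline)
reconstructed = phi_0 + sum(phi)             # should equal toy_score(x)
```

The interaction term v[1] * v[2] is split equally between the two features involved, which is the fairness property inherited from cooperative game theory. DeepExplainer approximates such values for deep networks rather than enumerating orderings.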


address this, the SiamIDS framework integrates a SHapley Additive exPlanations (SHAP) layer, enabling feature-level interpretability for similarity-based decisions made by the Siamese network. SHAP is a game-theoretic approach to explaining the output of machine learning models by computing the contribution of each input feature toward the model’s prediction. It is based on the concept of Shapley values from cooperative game theory, which assigns a fair value to each player (feature) based on their contribution to the final outcome [41,60].
    Given a model f and input D ∈ DZ, SHAP aims to express the model’s prediction as in Eq. (22).

$$f(D) = \phi_0 + \sum_{i=1}^{n} \phi_i \tag{22}$$

where ϕ0 is the model’s expected output and ϕi represents the Shapley value or contribution of feature i. In the context of SiamIDS, SHAP is applied to the left branch of the Siamese network to explain why a test sequence is similar or dissimilar to a reference benign sequence.
    Although SHAP is traditionally designed for explaining classification or regression outputs, it is adapted in SiamIDS to interpret similarity scores produced by the Siamese network. Specifically, SHAP is applied to the left branch of the Siamese architecture, which receives the test sequence and encodes it into a latent embedding etest. This embedding is then compared to a reference benign embedding eref, and the similarity (or distance) between the two determines whether the test sequence is considered anomalous. To explain this similarity decision, a SHAP explainer (DeepExplainer) is initialized to compute the contribution of each input feature toward the final similarity score. A high positive SHAP value indicates that a feature increases dissimilarity (supports anomaly), while a negative value suggests alignment with benign behavior. The step-by-step procedure for SHAP-based interpretation within SiamIDS is detailed in Algorithm 5, including encoding the input, computing similarity, initializing the explainer, and generating feature-level SHAP values. This produces a ranked explanation vector indicating the most influential features responsible for the anomaly classification.
    The integration of SHAP into SiamIDS provides several practical benefits that enhance both operational utility and trust in the detection process. First, SHAP explanations offer valuable analyst insight by highlighting which protocol fields or flow-level features (such as Flow Duration, Packet Length Variance, or TCP Flag PSH) contributed most significantly to a sequence being flagged as anomalous. This granular feedback helps analysts quickly understand behavioral deviations from benign patterns. Second, the model’s explainability fosters trust and transparency, which is particularly important in high-assurance domains where AI-assisted decisions must be auditable and compliant with regulatory standards. Third, SHAP enables detailed root-cause analysis, helping determine whether anomalies are driven by unusual timing patterns, abnormal port behavior, or traffic volume inconsistencies. Lastly, SHAP can be used for model debugging, offering visibility into whether the Siamese network is overfitting to irrelevant features or overlooking critical ones. This makes SHAP a powerful component not only for improving incident response but also for refining model robustness during development and retraining phases.

4.6. Behavioral clustering of anomalies using OPTICS

    While the Siamese Bi-LSTM architecture effectively detects anomalous sequences by measuring their dissimilarity from known benign behavior, the detection output alone is insufficient for understanding the structure of emerging or zero-day threats. To enhance post-detection analysis, the SiamIDS framework incorporates a lightweight clustering layer using OPTICS (Ordering Points To Identify the Clustering Structure). This component allows the system to group behaviorally similar anomalies and uncover hidden attack families, improving threat visibility and aiding security analysts in response planning.
    OPTICS is a density-based clustering algorithm that extends DBSCAN

Algorithm 6
OPTICS-Based Clustering of Anomalous Embeddings in SiamIDS.
  Set OPTICS parameters:
       min_samples ← 10
       xi ← 0.05
  Initialize OPTICS model:
       optics_model ← OPTICS(min_samples, xi, metric=’euclidean’)
  Fit model on anomalous embeddings:
       optics_model.fit(E_anom)
  Extract reachability plot and cluster structure:
       reachability ← optics_model.reachability_
       ordering ← optics_model.ordering_
       labels ← optics_model.labels_
  Post-process labels:
       For each embedding e_i in E_anom:
            If labels[i] == -1:
                 Mark as noise
            Else:
                 Assign to cluster C_j
       end
  Return cluster labels and noise point indices
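
The two quantities that OPTICS builds its reachability plot from, Eqs. (20) and (21), can be computed directly for a small point set. The sketch below is a brute-force illustration and not the library routine called in Algorithm 6; the function names are hypothetical, and counting neighbors without the point itself is an assumed convention (some implementations count the point as its own first neighbor).

```python
import math

def core_distance(points, p, min_pts):
    """Distance from points[p] to its min_pts-th nearest neighbor, excluding itself (Eq. 20)."""
    distances = sorted(
        math.dist(points[p], q) for j, q in enumerate(points) if j != p
    )
    return distances[min_pts - 1]

def reachability_distance(points, o, p, min_pts):
    """max(core(p), distance(p, o)), as in Eq. (21)."""
    return max(core_distance(points, p, min_pts), math.dist(points[p], points[o]))

# Three nearby anomaly embeddings and one isolated embedding (toy values).
E_anom = [[0.0, 0.0], [0.0, 1.0], [0.0, 2.0], [10.0, 0.0]]
core_0 = core_distance(E_anom, p=0, min_pts=2)
reach_near = reachability_distance(E_anom, o=1, p=0, min_pts=2)
reach_far = reachability_distance(E_anom, o=3, p=0, min_pts=2)
```

Points inside a dense group share small reachability values bounded below by the core distance, whereas the isolated point has a large value; this contrast is what appears as valleys and peaks in the reachability plot.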




Table 4
Experimental Environment Setup.
  Component          Configuration
  Platform           Google Colab Pro
  OS Environment     Linux-based Virtual Machine
  CPU                2.3 GHz Intel Xeon (virtualized)
  RAM                16 GB
  GPU                NVIDIA Tesla T4
  Python Version     3.10
  Major Libraries    TensorFlow 2.13, Keras, scikit-learn, SHAP, FAISS, OPTICS
  Runtime Type       GPU-enabled (CUDA-supported)

Table 5
Model Hyperparameters and Configurations.
  Component        Parameter          Value / Description
  Autoencoder      Latent Size        20 (compressed feature dimension)
                   Activation         ReLU
                   Loss               Mean Squared Error (MSE)
                   Optimizer          Adam
                   Learning rate      0.001
                   Epochs             39
                   Batch Size         512
  Siamese Model    Bi-LSTM Units      64 units (per direction)
                   Embedding Size     128
                   Loss Function      Contrastive Loss
                   Margin             1.0
                   Epochs             30
                   Optimizer          Adam
                   Learning rate      0.001
                   Batch Size         256
  SHAP             Explainer Type     DeepExplainer (left Siamese branch)
  OPTICS           min_samples        50
                   xi                 0.05
                   Distance Metric    Euclidean

by removing the requirement of a fixed global density threshold. Instead of forcing a predefined number of clusters, OPTICS generates a reachability plot that reveals variable-density clusters and outlier points (noise) without relying on user-specified k values or epsilon parameters. This makes it well suited to unsupervised threat categorization in cybersecurity, where attack behaviors can vary in structure, intensity, and frequency. Unlike k-means or hierarchical clustering, which assume convex or hierarchical cluster shapes, OPTICS adapts naturally to irregular or elongated cluster boundaries, which are common in network traffic data [61,62].
    Once the Siamese model flags a sequence as anomalous, its corresponding latent embedding etest ∈zDk is preserved for further analysis. The collection of all such anomalous embeddings, denoted as Eanom = {e1, e2, …, en}, is then passed to the OPTICS algorithm for unsupervised clustering. OPTICS operates by computing core distances and reachability distances to build a reachability plot that reveals the hierarchical density-based structure in the data. Unlike DBSCAN or k-means, OPTICS does not require a fixed number of clusters or a neighborhood radius, but instead relies on two key parameters: min_samples (minimum points to form a dense region) and xi (minimum steepness to detect cluster boundaries). In SiamIDS, we set min_samples = 10 and xi = 0.05 to allow flexible and fine-grained clustering. The detailed procedure for applying OPTICS to the SiamIDS anomaly embeddings is presented in Algorithm 6, including parameter initialization, model fitting, cluster label extraction, and noise identification. These clusters, along with the detected noise points, form the basis for post-detection threat interpretation, allowing analysts to profile attack behaviors and prioritize investigation.

5. Experimentation, results and analysis

5.1. Experimental setup

    The SiamIDS framework was implemented using Python 3.10, leveraging core libraries including TensorFlow 2.13, Keras 2.13, scikit-learn 1.3.2, SHAP 0.41.0, FAISS 1.7.4, and OPTICS 0.9.0. All experiments were conducted on Google Colab Pro, running a Linux-based virtual machine configured with 2 virtual CPU cores (2.3 GHz Intel Xeon), 16 GB RAM, and an NVIDIA Tesla T4 GPU with 16 GB memory. GPU acceleration (CUDA 12.1 and cuDNN 8.9) was used for both model

preserved while avoiding overfitting. The Siamese Bi-LSTM configuration, including hidden units, embedding size, contrastive margin, and learning rate, was calibrated to maximize temporal feature representation and inter-class separation while maintaining stable convergence. OPTICS parameters, such as min_samples and xi, were selected to produce meaningful clusters of anomalous flows, effectively distinguishing dense attack groups from sparse outliers. SHAP’s DeepExplainer was used to provide interpretable, feature-level insights post-inference. This hyperparameter selection process was guided by performance metrics including reconstruction error, clustering quality, and detection effectiveness on the validation set. The finalized hyperparameters reflect empirically validated settings that enable robust, scalable, and interpretable intrusion detection within complex IoT–cloud environments. Table 5 summarizes these configurations for all SiamIDS modules.

5.3. Performance metrics

    To comprehensively evaluate the effectiveness of SiamIDS, we assess its performance using detection metrics, clustering metrics, and interpretability insights. Each component provides quantitative or qualitative insights into the accuracy, behavior, and explainability of the system.

5.3.1. Detection metrics
    The intrusion detection performance of SiamIDS is measured using widely accepted metrics derived from the confusion matrix: True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN). Accuracy quantifies the overall proportion of correctly identified benign and malicious flows and is calculated using Eq. (23). Precision, defined in Eq. (24), reflects the proportion of true malicious instances among all instances predicted as malicious. Recall (or sensitivity), given
training and inference to ensure efficient computation. The complete                       in Eq. (25), measures the model’s ability to correctly detect actual at­
experimental environment, including hardware, runtime configuration,                       tacks. To balance both precision and recall, especially important in
and major software components, is detailed in Table 4.                                     imbalanced datasets, the F1-score is used, as defined in Eq. (26). Spec­
                                                                                           ificity, expressed in Eq. (27), complements recall by capturing the pro­
5.2. Hyperparameters and model configuration                                               portion of correctly identified benign traffic. A crucial metric for security
                                                                                           applications is the False Negative Rate (FNR), shown in Eq. (28), as it
    The architecture of SiamIDS comprises four primary components: a                       represents the rate at which attacks are missed. Additionally, we
shallow Autoencoder, a Siamese Bi-LSTM for temporal similarity                             compute the Area Under the ROC Curve (AUC-ROC) using Eq. (29),
modeling, SHAP for interpretability, and OPTICS for clustering of                          which evaluates the model’s ability to discriminate between benign and
anomalous flows. Each component’s hyperparameters were determined                          malicious flows across various thresholds, summarizing overall detec­
through iterative empirical validation to optimize performance, gener­                     tion performance into a single scalar value.
alizability, and stability. For the Autoencoder, the latent size, batch size,                                  TP + TN
and training epochs were tuned to balance dimensionality reduction                         Accuracy =                                                                  (23)
                                                                                                          TP + TN + FP + FN
with accurate reconstruction, ensuring essential traffic patterns are
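As a minimal sketch of how the confusion-matrix metrics of this subsection can be evaluated, the snippet below computes precision, recall, F1, and FNR from raw counts, and adds accuracy and specificity only when a true-negative count is supplied. The function name `detection_metrics` and its argument names are illustrative assumptions, not part of the SiamIDS implementation. The example call uses the aggregate counts quoted in Section 5.4.2 (2,487,450 TP, 1,450 FP, 26,136 FN); the aggregate TN count is not stated in the text, so it is left out rather than guessed.

```python
from typing import Optional


def detection_metrics(tp: int, fp: int, fn: int, tn: Optional[int] = None) -> dict:
    """Confusion-matrix metrics in the sense of Eqs. (23)-(28).

    Hypothetical helper for illustration; assumes all denominators are non-zero.
    """
    precision = tp / (tp + fp)          # Eq. (24)
    recall = tp / (tp + fn)             # Eq. (25)
    metrics = {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),  # Eq. (26)
        "fnr": fn / (fn + tp),          # Eq. (28), equal to 1 - recall
    }
    if tn is not None:
        # Accuracy and specificity both need the true-negative count.
        metrics["accuracy"] = (tp + tn) / (tp + tn + fp + fn)  # Eq. (23)
        metrics["specificity"] = tn / (tn + fp)                # Eq. (27)
    return metrics


# Aggregate counts as quoted in Section 5.4.2 (TN is not reported there).
overall = detection_metrics(tp=2_487_450, fp=1_450, fn=26_136)
for name, value in overall.items():
    print(f"{name}: {value:.4f}")
```

With these counts the sketch yields a precision of about 0.9994, a recall of about 0.9896, an F1-score of about 0.9945, and an FNR of about 1.04 %, which agrees with the overall figures reported for SiamIDS.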



                                 Fig. 12. (a–g). MSE Loss Curve for Autoencoder-based dimensionality reduction.


\[ \text{Precision} = \frac{TP}{TP + FP} \tag{24} \]

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{25} \]

\[ F1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{26} \]

\[ \text{Specificity} = \frac{TN}{TN + FP} \tag{27} \]

\[ \text{FNR} = \frac{FN}{FN + TP} \tag{28} \]

\[ \text{AUC} = \int_{0}^{1} TPR(FPR) \, d(FPR) \tag{29} \]

Fig. 13. (a–h). Confusion Matrices for the Binary Classification.

Fig. 14. (a–h). AUC for each individual attack type.

5.3.2. Clustering metrics

    To evaluate the quality of clustering in the post-detection stage using OPTICS, we employ three widely used metrics: Silhouette Score, Davies–Bouldin Index (DBI), and Adjusted Rand Index (ARI). These collectively assess intra-cluster cohesion, inter-cluster separation, and alignment with ground truth labels.
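The three clustering metrics named above can be illustrated with a small, dependency-free sketch. The helper names (`silhouette`, `davies_bouldin`, `adjusted_rand`) and the six toy points are assumptions made for this example only; they are not the SiamIDS code or data. The ARI is evaluated in the standard pair-counting (contingency-table) form, which is assumed here to match the chance-adjusted index used in the paper. The sketch also assumes every cluster has at least two members and that there are at least two clusters.

```python
from collections import Counter
from math import comb, dist


def _groups(points, labels):
    """Collect the points of each cluster label."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return groups


def silhouette(points, labels):
    """Mean of S(i) = (b(i) - a(i)) / max{a(i), b(i)} over all samples."""
    groups = _groups(points, labels)
    total = 0.0
    for p, lab in zip(points, labels):
        own = groups[lab]
        # a(i): mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b(i): smallest mean distance to the members of any other cluster
        b = min(sum(dist(p, q) for q in g) / len(g)
                for k, g in groups.items() if k != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


def davies_bouldin(points, labels):
    """Average over clusters of the worst (sigma_i + sigma_j) / d(c_i, c_j)."""
    groups = _groups(points, labels)
    cents = {k: tuple(sum(c) / len(g) for c in zip(*g)) for k, g in groups.items()}
    sigma = {k: sum(dist(p, cents[k]) for p in g) / len(g) for k, g in groups.items()}
    worst = [max((sigma[i] + sigma[j]) / dist(cents[i], cents[j])
                 for j in groups if j != i) for i in groups]
    return sum(worst) / len(groups)


def adjusted_rand(true, pred):
    """Chance-adjusted pair agreement (undefined if both labelings are trivial)."""
    def pairs(counts):
        return sum(comb(c, 2) for c in counts)
    same_both = pairs(Counter(zip(true, pred)).values())
    same_true = pairs(Counter(true).values())
    same_pred = pairs(Counter(pred).values())
    expected = same_true * same_pred / comb(len(true), 2)
    maximum = 0.5 * (same_true + same_pred)
    return (same_both - expected) / (maximum - expected)


# Toy data: two compact, well-separated groups (illustrative only).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
y_true = ["DoS", "DoS", "DoS", "Recon", "Recon", "Recon"]
y_pred = [1, 1, 1, 0, 0, 0]  # same grouping, different label names

print(f"silhouette: {silhouette(points, y_pred):.3f}")
print(f"DBI:        {davies_bouldin(points, y_pred):.3f}")
print(f"ARI:        {adjusted_rand(y_true, y_pred):.3f}")
```

Because the ARI compares pairs of samples rather than label names, the permuted cluster identifiers in `y_pred` still give perfect agreement. In practice, scikit-learn provides `silhouette_score`, `davies_bouldin_score`, and `adjusted_rand_score` in `sklearn.metrics`, which would be the natural choice at the scale of the experiments reported here.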
    The Silhouette Score, S(i), shown in Eq. (30), measures how well a sample is matched to its own cluster compared to other clusters. A higher score (closer to 1) indicates better-defined clusters:

\[ S(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}} \tag{30} \]

    Where:

 • a(i): Average intra-cluster distance of sample i
 • b(i): Minimum average distance to points in the nearest neighboring cluster (inter-cluster)

    The Davies–Bouldin Index (DBI), defined in Eq. (31), evaluates the average "similarity" between clusters; lower values indicate better separation and compactness:

\[ \text{DBI} = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} \left( \frac{\sigma_i + \sigma_j}{d(c_i, c_j)} \right) \tag{31} \]

    Where:

 • k: Number of clusters
 • σi: Average distance of all samples in cluster i to centroid ci
 • d(ci,cj): Distance between centroids of clusters i and j

    Finally, the Adjusted Rand Index (ARI), given in Eq. (32), quantifies the similarity between predicted cluster labels and ground truth attack classes, adjusted for random chance. An ARI close to 1 indicates strong agreement.

\[ \text{ARI} = \frac{\dfrac{a + b}{\binom{n}{2}} - E\left[\dfrac{a + b}{\binom{n}{2}}\right]}{\max\left(\dfrac{a + b}{\binom{n}{2}}\right) - E\left[\dfrac{a + b}{\binom{n}{2}}\right]} \tag{32} \]

    Where:

 • a: Number of pairs of elements that are in the same cluster in both true and predicted clusterings
 • b: Number of pairs that are in different clusters in both true and predicted clusterings
 • n: Total number of samples, so that $\binom{n}{2}$ is the number of sample pairs
 • Index: Number of agreeing pairs between predicted and true labels
 • Expected Index: Expected number of agreeing pairs by chance

5.3.3. Interpretability

    SHAP values are used to identify the most influential features in prediction decisions for anomalous sequences. This qualitative layer enhances explainability, enabling analysts to interpret why a sequence deviated from benign behavior, and supports post-hoc validation.

5.4. Evaluation and results

    This section presents the experimental evaluation of the proposed SiamIDS framework across four key dimensions: detection performance, anomaly clustering, interpretability, and resource efficiency. The results demonstrate that SiamIDS is not only accurate and explainable, but also



lightweight and scalable for real-world IoT intrusion detection in cloud environments.

5.4.1. Evaluation of latent space in autoencoder-based dimensionality reduction

    To identify the optimal latent space dimension for effective feature reduction, a shallow autoencoder was trained and evaluated across a range of latent sizes: 40, 35, 30, 25, 20, 15, and 10. The corresponding Mean Squared Error (MSE) loss curves for both training and validation are shown in Figs. 12(a–g). As observed, the MSE steadily decreases as the latent size is reduced from 40 to 20, indicating improved reconstruction fidelity as the representation becomes more compact yet still expressive. Notably, the lowest validation loss is achieved at latent size 20, suggesting this setting offers the best trade-off between dimensionality reduction and information preservation. However, when the latent size is further reduced to 15 and 10, the MSE begins to increase again, signaling underfitting due to excessive compression and loss of critical behavioral patterns in the network traffic. This U-shaped trend in the MSE validates the selection of 20 as the optimal latent dimension, as it maintains low reconstruction error while minimizing model complexity. This compact representation not only accelerates downstream Siamese training but also enhances generalization by eliminating redundant or noisy features.

5.4.2. Evaluation of detection performance using confusion matrices

    Fig. 13 (a–g) presents confusion matrices for the binary classification of seven attack types: BruteForce, DDoS, DoS, Mirai, Recon, Spoofing, and Web-Based attacks. Each matrix reports true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN), illustrating the classifier’s ability to distinguish each attack from benign traffic. Fig. 13 (h) shows the overall confusion matrix for all attack types combined, summarizing the model’s performance on the full test set. The results indicate varying levels of detection performance across attack categories. For BruteForce, Mirai, Recon, Spoofing, and Web-Based attacks, the model achieves high true positive rates with relatively few false negatives, reflecting effective detection of these attack types. However, DDoS and DoS attacks exhibit a higher number of false negatives and false positives, suggesting that the classifier faces challenges distinguishing these high-volume attacks from benign flows. The overall matrix shows strong discrimination between attack and benign traffic, with a total of 2,487,450 true positives versus 26,136 false negatives and 1,450 false positives, indicating robust detection at the aggregate level. These matrices highlight the strengths of SiamIDS in detecting most attack types while identifying specific areas, such as DDoS and DoS detection, for further improvement.
    Fig. 14 (a–g) illustrates the AUC values for each individual attack type: BruteForce, DDoS, DoS, Mirai, Recon, Spoofing, and Web-Based attacks. These plots demonstrate the model’s ability to distinguish each attack from benign traffic across different classification thresholds. High AUC scores close to 1 indicate strong performance, with the classifier effectively balancing true positive and false positive rates for each attack category. Fig. 14 (h) presents the overall AUC combining all attack types, reflecting the aggregate detection capability of the model on the entire test set. The high overall AUC confirms the model’s robustness and consistent performance in identifying diverse attacks while minimizing false alarms, making it suitable for practical deployment in network security environments.
    The classification performance of SiamIDS across different attack types is detailed in Table 6. The model demonstrates consistently high recall values of nearly 0.99 across all attack classes, underscoring its effectiveness in correctly detecting true positives and minimizing false negatives. Precision varies more widely, ranging from 0.86 (BruteForce, Web-Based) to nearly 0.999 (DoS, DDoS), indicating slight fluctuations in the false positive rate due to overlaps in traffic patterns. Specificity remains strong across all categories (above 0.97), demonstrating the model’s ability to correctly identify benign flows and reduce false alarms. The F1-scores, which harmonize precision and recall, are consistently above 0.91, reinforcing the balanced detection capability of the framework. Overall accuracy exceeds 0.98 across all classes, confirming the system’s robustness in distinguishing between benign and malicious behavior. The relatively lower precision for BruteForce and Web-Based attacks suggests minor classification challenges, likely due to subtle similarities with legitimate traffic. Nevertheless, the SiamIDS framework delivers reliable and scalable detection performance across a broad range of attack vectors, making it well-suited for operational deployment in cloud-scale IoT infrastructures.

Table 6
Detection Performance Metrics of SiamIDS.
  Attack Family       Precision   Recall   Specificity   F1-Score    Accuracy
  BruteForce          0.8575      0.9890   0.9985        0.9185      0.9984
  DDoS                0.9978      0.9900   0.9813        0.9939      0.9891
  DoS                 0.9989      0.9900   0.9792        0.9945      0.9895
  Mirai               0.9654      0.9900   0.9845        0.9776      0.9861
  Recon               0.9826      0.9899   0.9805        0.9862      0.9854
  Spoofing            0.9566      0.9900   0.9823        0.9730      0.9845
  Web-Based           0.8594      0.9898   0.9954        0.9200      0.9952
  Overall             0.9994      0.9896   0.9818        0.9945      0.9894

    The contrastive Siamese Bi-LSTM architecture effectively captures behavioral dissimilarities without relying on attack-specific labels. Moreover, ROC curve analysis enabled threshold tuning to optimize


                                                         Fig. 15. False Negative Rates across Attack Families.



                                                  Fig. 16. OPTICS Multiclass Clustering Confusion Matrix.


trade-offs between false positives and false negatives, enhancing the model’s reliability in operational contexts.
    The false negative rates (FNR) across all attack types remain consistently low, at around 1 %, as shown in Fig. 15, indicating the model’s strong ability to detect attacks with minimal missed cases. The overall FNR of 1.04 % reflects reliable threat detection, reducing the risk of undetected malicious activity in network traffic.

5.4.3. Evaluation of OPTICS-based clustering of anomalous behavior

    To enhance the interpretability of anomalies identified by the Siamese network, OPTICS clustering was applied to all anomalous sequences. This density-based method, which does not require a predefined number of clusters, identified 14 behaviourally distinct groups using reachability and local density criteria. The clustering quality was strong, with a Silhouette Score of 0.901, a DBI of 0.092, and an Adjusted Rand Index (ARI) of 0.889, indicating that the resulting clusters were both well-separated and closely aligned with ground-truth attack classes.
    The confusion matrix (Fig. 16) visualizes the alignment between predicted clusters and actual attack types following label post-processing. Each cluster was examined to interpret its dominant behavioural characteristics and corresponding attack type:

 • DoS clusters displayed highly repetitive packet bursts with short inter-arrival times and stable source–destination pairs, capturing their flooding behavior.
 • DDoS clusters exhibited similar burst patterns but with distributed source addresses and variable intensity, explaining their partial overlap with DoS and Recon flows.
 • Reconnaissance clusters were characterized by sequential port-scanning patterns, moderate flow duration, and a high diversity of destination ports, features unique to probing activities.
 • Spoofing clusters showed forged source addresses with consistent payload sizes, demonstrating deceptive identity traits while maintaining communication frequency patterns.
 • Brute-Force clusters reflected short, high-frequency login attempts and uniform packet payloads, highlighting their credential-guessing nature despite low sample volume.
 • Mirai botnet traffic formed coherent clusters distinguished by device-specific periodic beaconing and TCP synchronization anomalies, marking automated command-and-control behavior.


                                  Fig. 17. Top Three SHAP-Contributing Features for Six Representative Anomalous Cases.



                                                 Fig. 18. Impact of Component on SiamIDS Performance.


 • Web-Based attack clusters exhibited irregular request–response sizes and longer flow durations, occasionally merging with DoS or Spoofing patterns due to shared transport-layer traits.

    Quantitatively, DoS attacks exhibited the highest clustering accuracy, with over 1.56 million flows correctly grouped, followed by DDoS (688,785) and Reconnaissance (87,428) samples. Spoofing and Brute-Force behaviors were distinctly isolated, with 31,124 and 716 correctly grouped flows, respectively. Mirai traffic was reliably captured in a single dense cluster (34,554 flows). About 6.7 % of anomalous sequences were marked as noise by OPTICS, representing potential zero-day attacks, evasive threat variants, or anomalous benign activities requiring deeper forensic inspection.
    These findings demonstrate that SiamIDS embeddings effectively preserve temporal and statistical traits of diverse IoT threats, enabling OPTICS to form semantically coherent, behavior-driven clusters. By removing the need for predefined cluster counts, this post-detection step strengthens interpretability, supports attack attribution, and enhances operational readiness for cloud-scale intrusion diagnosis.

5.4.4. Evaluation of SHAP-based explainability for anomalous predictions

    To enhance the interpretability of SiamIDS predictions, SHAP (SHapley Additive exPlanations) values were computed for anomalous sequences using the DeepExplainer on the Siamese network’s left branch. This enabled the identification of the most influential features driving dissimilarity judgments between a given sequence and the benign reference set. Fig. 17 summarizes this feature-level analysis, offering both tabular and visual perspectives on how specific features contributed to anomaly decisions.
    Fig. 17 presents the top three SHAP-contributing features for six representative anomalous cases. Each row corresponds to a unique sequence (A1–A6), and the marked cells indicate the features with the highest SHAP attribution. For instance, in Case A1 (Web-Based attack), Flow Duration, Dst Port, and Pkt Size Var were the dominant contributors, indicating short, bursty traffic targeting unusual ports with irregular packet sizes, traits that significantly deviate from benign flow patterns and are common in web exploitation attempts. In Case A2 (DDoS), Tot Fwd Pkts, Flow IAT Mean, and Init Fwd Win surfaced as key drivers, reflecting automated high-volume flows typical of DDoS floods. Similarly, Case A3 (Spoofing) highlighted Src IP, Bwd IAT Max, and Pkt Len Std Dev as top contributors, revealing address inconsistencies and timing deviations characteristic of spoofed communication. Case A4 (Reconnaissance) showed high attribution for Protocol, Fwd Pkt Len Mean, and Pkt Rate, which capture systematic probing with non-standard protocols and uniform packet emission rates.
    In contrast, Case A5, a benign sample incorrectly flagged as anomalous (false positive), exhibited influence from Fwd IAT Min, Pkt Size Mean, and Flag PSH. The overlap of these traits with attack-like behaviors explains the misclassification and demonstrates how SHAP helps analysts interpret and refine detection boundaries. Finally, Case A6, labeled as noise by OPTICS and considered a zero-day candidate, presented Bwd Pkts/s, Fwd IAT Var, and TotLen Bwd as top contributors, indicating a unique traffic pattern unseen in other clusters and suggesting either a novel or evasive behavior type.
    Beyond interpretability, the SHAP analysis offers actionable insights for real-world intrusion analysis and response. For instance, feature patterns like Flow Duration and Dst Port enable analysts to recognize targeted exploitation attempts, while Tot Fwd Pkts and Flow IAT Mean serve as early warning indicators for volumetric DDoS behavior. The analysis of false positives (Case A5) aids in threshold calibration and model retraining, and the interpretation of unseen feature combinations (Case A6) demonstrates SHAP’s role in zero-day investigation. Thus, SHAP explanations not only clarify SiamIDS’s internal reasoning but also support root-cause analysis, adaptive tuning, and informed response decisions in operational IoT intrusion detection.
    Collectively, these results show that SiamIDS embeddings effectively preserve key temporal and statistical characteristics of diverse IoT attack types. SHAP-based explainability provides transparent, feature-level reasoning that enhances trust, supports forensic validation, and strengthens the interpretability of the model’s anomaly judgments in practical deployments.

5.5. Analysis of the proposed SiamIDS

5.5.1. Component-wise impact

    The ablation study, visualized in Fig. 18, confirms the necessity of each component within the SiamIDS framework. While the exclusion of SHAP or OPTICS had no effect on core detection metrics, it removed critical layers for explainability and behavioural grouping. The removal of the Autoencoder reduced performance due to increased input dimensionality and training inefficiency. More substantial degradation occurred when the Bi-LSTM was replaced with a feedforward MLP, and when the Siamese structure was replaced with a standard DNN, highlighting the significance of temporal modeling and similarity-based



learning in capturing complex traffic behaviours and ensuring robust detection.

5.5.2. Resource efficiency and real-time suitability
    To ensure practical deployability in large-scale IoT environments, SiamIDS was designed
with a focus on computational efficiency and scalability. As detailed in Table 7, the overall
pipeline keeps resource usage low across all stages: training, inference, explainability, and
clustering. The Autoencoder module, trained solely on benign sequences with a latent size of
20, completes training in 4.8 min, consumes 820 MB RAM, and is stored as a compact 9.6 MB
model file. This enables rapid deployment and retraining in lightweight environments. The
Siamese Bi-LSTM network, trained on 200,000 contrastive pairs, converges within 8.5 min, with
a model size of 13.2 MB and an inference time of 3.2 s per 100 K samples, even while comparing
against a 10,000-sample reference embedding set. This demonstrates the architecture's
suitability for high-throughput similarity scoring.
    Interpretability via SHAP adds negligible overhead, just 0.4 s per flagged sequence, as it
is selectively applied only to anomalous flows. Similarly, the OPTICS clustering step, applied
to 150,000 anomalies, completes in just 2.3 min, enabling real-time post-detection behavioral
grouping without compromising responsiveness. The total inference time for processing 1 million
flows is approximately 4.5 s, confirming that SiamIDS is real-time capable, as illustrated in
Fig. 19.

Table 7
Resource Utilization Metrics of SiamIDS Framework.
  Component          Metric                       Value     Execution Context
  Autoencoder        Training Time                4.8 min   On benign sequences (latent size = 20)
  Autoencoder        Model Size                   9.6 MB    Stored in HDF5 format (compressed)
  Autoencoder        Peak RAM Usage               820 MB    During training on 200,000 sequences
  Siamese Bi-LSTM    Training Time                8.5 min   Trained on 200,000 pairs
  Siamese Bi-LSTM    Model Size                   13.2 MB   Includes shared Bi-LSTM weights and embedding head
  Siamese Bi-LSTM    Inference Time (per 100 K)   3.2 s     Pairwise similarity with 10,000 reference embeddings
  SHAP               Explainer Time/Seq           0.4 s     Applied only on flagged anomalous samples
  OPTICS             Clustering Time              2.3 min   For 150,000 anomalous sequences
  Overall Pipeline   Total Inference Time (1 M)   4.5 s     Real-time capable for 1 million test sequences

5.5.3. Statistical significance analysis
    To validate the robustness of SiamIDS, a Wilcoxon signed-rank test was performed comparing
SiamIDS with baseline models across all seven attack types. This non-parametric test is
suitable for paired, non-normally distributed performance data and evaluates whether observed
improvements are statistically significant. Table 8 presents the results for F1-Score across
attack families. All p-values are below 0.05, confirming that SiamIDS significantly outperforms
the baseline models at the 5 % significance level. These results provide strong statistical
evidence that the observed performance improvements are unlikely to have arisen by chance,
reinforcing the reliability of the proposed framework.

5.5.4. Analysis of comparative performance with state-of-the-art methods
    To evaluate the real-world viability of SiamIDS, Table 9 compares SiamIDS with recent
state-of-the-art models from literature in terms of accuracy, resource demands, and real-time
suitability. To facilitate a fair and consistent comparison, resource-related metrics for
existing methods (such as training time, model size, RAM usage, and inference speed) were
estimated based on reported architectural configurations, typical computational settings, and
available implementation details. SiamIDS performs strongly on detection metrics such as
precision (99.94 %) and F1-score (99.45 %), and it reports the shortest training time
(13.3 min) and the highest inference speed (>220,000 samples/sec) in Table 9, while
maintaining a model size under 10 MB. These results highlight its balance of effectiveness and
deployability, making it well suited to cloud-based microservices, SOC pipelines, and IoT
security orchestration frameworks.

Table 8
Wilcoxon Signed-Rank Test Results Comparing SiamIDS with Baseline Models.
  Attack Family   SiamIDS Median     Baseline Median     Wilcoxon W      p-value
  BruteForce      0.9185             0.8760              21              0.0032
  DDoS            0.9939             0.9821              19              0.0025
  DoS             0.9945             0.9814              20              0.0028
  Mirai           0.9776             0.9603              18              0.0041
  Recon           0.9862             0.9715              19              0.0035
  Spoofing        0.9730             0.9552              20              0.0029
  Web-Based       0.9200             0.8857              21              0.0031
  Overall         0.9945             0.9778              19              0.0026
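As an illustration of the test used in Section 5.5.3, the following minimal Python sketch computes the signed-rank statistic W = min(W+, W-) and an exact two-sided p-value for paired scores. The function name wilcoxon_signed_rank is hypothetical. The input pairs are simply the seven per-family medians from Table 8, used only as sample data; the per-run scores behind the reported W and p values are not listed in the paper, so the output is not expected to reproduce Table 8.

```python
def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test for paired samples.

    Returns (W, p) with W = min(W+, W-). Zero differences are dropped,
    tied absolute differences receive average ranks, and p is the exact
    p-value under the null hypothesis that each rank is equally likely
    to carry a positive or a negative sign.
    """
    # Round to suppress floating-point noise in the differences.
    diffs = [round(a - b, 6) for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank |differences| from smallest to largest, averaging ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Exact null distribution of the positive-rank sum. Ranks are doubled
    # so that average ranks (multiples of 0.5) become integer keys.
    counts = {0: 1}
    for r in ranks:
        step = int(round(2 * r))
        updated = {}
        for s, c in counts.items():
            updated[s] = updated.get(s, 0) + c
            updated[s + step] = updated.get(s + step, 0) + c
        counts = updated
    tail = sum(c for s, c in counts.items() if s <= 2 * w)
    p = min(1.0, 2 * tail / 2 ** n)
    return w, p


# Sample input only: per-family medians copied from Table 8.
siamids = [0.9185, 0.9939, 0.9945, 0.9776, 0.9862, 0.9730, 0.9200]
baseline = [0.8760, 0.9821, 0.9814, 0.9603, 0.9715, 0.9552, 0.8857]

w, p = wilcoxon_signed_rank(siamids, baseline)
print(f"W = {w}, p = {p:.6f}")  # W = 0, p = 0.015625 for these seven pairs
```

With seven pairs that all favour one method, the smallest attainable two-sided p-value is 2/2^7 = 0.015625, which is why the number of paired observations matters when interpreting such a test.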

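The pairwise similarity scoring against a reference embedding set, mentioned in Section 5.5.2 and Table 7, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Euclidean metric, the use of benign-only references, the nearest-reference rule, the threshold value, and all function names are hypothetical choices made for the example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def anomaly_score(embedding, references):
    """Distance from an embedding to its closest reference embedding.

    Assumption: `references` holds embeddings of known-benign sequences,
    so a large score means the sequence resembles none of them.
    """
    return min(euclidean(embedding, ref) for ref in references)


def is_anomalous(embedding, references, threshold):
    """Flag a sequence when its nearest-reference distance exceeds the threshold."""
    return anomaly_score(embedding, references) > threshold


# Toy 2-D embeddings; a real deployment would use the Siamese Bi-LSTM outputs.
benign_refs = [[0.0, 0.0], [1.0, 1.0]]
print(is_anomalous([0.1, 0.0], benign_refs, threshold=2.0))  # False
print(is_anomalous([5.0, 5.0], benign_refs, threshold=2.0))  # True
```

Each scored sequence costs one distance computation per reference, so the size of the reference set (10,000 in Table 7) directly scales inference time.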

Fig. 19. Inference speed versus training time of the proposed SiamIDS compared to existing methods, highlighting real-time capabilities and training time efficiency.
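The throughput and training-time figures quoted for SiamIDS follow directly from the values in Table 7. The short sketch below reproduces that arithmetic; the helper name throughput and the variable names are illustrative only.

```python
def throughput(samples: int, seconds: float) -> float:
    """Average number of samples processed per second."""
    return samples / seconds


# Values taken from Table 7.
PIPELINE_SAMPLES = 1_000_000   # test sequences processed by the full pipeline
PIPELINE_SECONDS = 4.5         # total inference time for those sequences
AUTOENCODER_TRAIN_MIN = 4.8
SIAMESE_TRAIN_MIN = 8.5

pipeline_rate = throughput(PIPELINE_SAMPLES, PIPELINE_SECONDS)
per_flow_us = PIPELINE_SECONDS / PIPELINE_SAMPLES * 1e6
total_training_min = AUTOENCODER_TRAIN_MIN + SIAMESE_TRAIN_MIN

print(f"Pipeline throughput: {pipeline_rate:,.0f} samples/sec")  # about 222,222
print(f"Average time per flow: {per_flow_us:.1f} microseconds")  # 4.5
print(f"Combined training time: {total_training_min:.1f} min")   # 13.3
```

The first result matches the inference speed of more than 220,000 samples/sec, and the last matches the 13.3 min training time reported for SiamIDS.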



Table 9
Comparative Evaluation of Proposed SiamIDS with Existing Methods.
  Reference                       Dataset                            Precision   Recall    F1-Score   Accuracy   Training Time (min)   Model Size (MB)   RAM Usage (GB)   Inference Speed (samples/sec)   Real-Time Suitability
  Zhang et al. [35]               CICIDS2017, BoT-IoT                99.69 %     99.49 %   99.81 %    99.80 %    45–60                 >100              4.5              50K                             No
  Aldaej et al. [19]              BoT-IoT                            99.45 %     98.25 %   99.12 %    99.56 %    25                    35                2.8              95K                             Limited
  Yaras & Dener [29]              CICIoT2023, TON_IoT                98.75 %     98.75 %   98.75 %    98.75 %    30–35                 40                3.2              80K                             Limited
  Alabbadi & Bajaber [36]         TON_IoT                            99.53 %     99.17 %   99.33 %    99.96 %    40                    55                3.5              60K                             No
  Bedi et al. [17]                NSL-KDD                            91.46 %     92.99 %   -          -          18                    25                2                100K                            Moderate
  Hindy [20]                      CICIDS2017, NSL-KDD                -           98.00 %   -          86.42 %    20                    28                2.3              105K                            Moderate
  Althiyabi et al. [30]           CICIDS2017, MQTT                   93.46 %     93.13 %   92.40 %    93.13 %    15                    22                2                95K                             Moderate
  Madhu et al. [21]               IoT testbed data                   95.00 %     92.00 %   95.00 %    96.00 %    28                    50                3                70K                             No
  Saurabh et al. [18]             UNSW-NB15, Bot-IoT                 97.00 %     96.00 %   96.00 %    96.60 %    30                    38                3.1              85K                             Limited
  Bo et al. [31]                  CICIDS2017, ISCX2012               -           98.29 %   -          97.78 %    25–30                 33                2.5              90K                             Moderate
  Touré et al. [32]               IBM, NSL-KDD                       98.00 %     97.00 %   99.00 %    98.4 %     40                    50                4                75K                             Moderate
  Alhayan et al. [37]             NSL-KDD                            88.75 %     94.49 %   91.24 %    99.49 %    50                    90                6                60K                             Limited
  Guan et al. [34]                IoTID20, N-BaIoT                   90 %        90 %      89 %       91.87 %    35                    60                5                55K                             Limited
  Hnamte & Hussain [22]           CICIDS2018, Edge_IIoT              100 %       100 %     100 %      100 %      >60                   >90               8                45K                             No
  Alzboon et al. [23]             KDD99                              99.99 %     99.99 %   99.99 %    99.99 %    30                    40                3                80K                             Limited
  Ben Said et al. [24]            InSDN, NSL-KDD, UNSW-NB15          99.85 %     95.28 %   >97 %      97.77 %    45                    65                4                60K                             Moderate
  Zhang et al. [25]               KDDCUP99, NSLKDD, CICIDS2017       >97 %       >97 %     99 %       99.08 %    40                    60                4.5              65K                             Limited
  Duc et al. [38]                 Custom DGA dataset                 90 %        >80 %     80.32 %    89.83 %    >50                   >100              >6               40K                             No
  Hou et al. [26]                 NSL-KDD                            96.08 %     80.89 %   87.89 %    87.30 %    35                    55                4                45K                             No
  Ali et al. [27]                 KDDCUP99, UNSW-NB15                98 %        98.2 %    98 %       99.91 %    30                    40                3.5              85K                             Moderate
  Chintapalli et al. [33]         N-BaIoT, CICIDS-2017, ToN-IoT      >99.9 %     >99.9 %   >99.9 %    >99.9 %    40                    50                4                90K                             Limited
  Jiang et al. [28]               NSL-KDD, UNSW-NB15, CICIDS-2017    98.58 %     98.40 %   98.49 %    95.44 %    30                    55                4.2              70K                             Moderate
  Natha et al. [39]               RAD, UCF Crime                     >92 %       >92 %     >92 %      ~92 %      >60                   85                >6               35K                             No
  Alsaleh et al. [40]             CICIoT2023                         79.48 %     68.05 %   70.45 %    99.09 %    30                    40                3                80K                             Limited
  Mohale & Obagbuwa (2025) [41]   UNSW-NB15                          87 %        88 %      87 %       87 %       30                    40                3.5              85K                             Moderate
  Proposed SiamIDS                CIC IoT-DIAD 2024                  99.94 %     98.96 %   99.45 %    98.94 %    13.3                  <10               <1.5             220K                            Yes
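The F1-score column in Table 9 is the harmonic mean of precision and recall. As a quick consistency check, the sketch below recomputes the F1-score of the SiamIDS row from its reported precision and recall; the function name f1_score is illustrative and no external library is assumed.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall of the "Proposed SiamIDS" row in Table 9, in percent.
siamids_f1 = f1_score(99.94, 98.96)
print(f"F1 = {siamids_f1:.2f} %")  # F1 = 99.45 %
```

The recomputed value agrees with the 99.45 % F1-score reported for SiamIDS.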


5.6. Discussion

    The experimental results confirm that SiamIDS achieves a balanced integration of detection
accuracy, interpretability, and operational efficiency, three pillars often pursued separately
in intrusion detection research. Its use of a Siamese Bi-LSTM architecture enables the system
to learn nuanced temporal patterns and behavioral similarities between network sequences, which
proves especially effective for identifying rare and evolving threats such as zero-day attacks.
Compared to conventional classification-based IDS models, SiamIDS demonstrates better
generalization and lower reliance on labeled training data. The contrastive learning approach
not only enhances robustness to class imbalance but also facilitates meaningful latent space
embeddings, as evidenced by the high clustering coherence reported with OPTICS. By categorizing
attacks behaviorally rather than merely by labels, the system supports semantically aware
threat profiling, which can aid incident response teams in prioritizing actions based on
behavioral similarity. Furthermore, the integration of SHAP explanations empowers the model
with transparency, a critical feature in real-world SOC deployments where interpretability
directly affects operator trust and response time. Analysts can clearly understand which
features (e.g., protocol flags, packet timing) drove the anomaly decision, which reduces
investigation overhead. From a deployment perspective, SiamIDS is lightweight and modular. It
can function as a cloud-hosted microservice, enabling scalability and easy integration into
existing monitoring ecosystems. Its small model size and low RAM usage make it suitable for
deployment in resource-constrained environments as well. However, despite these strengths,
certain limitations merit attention. For instance, low-volume attacks that closely mimic benign
behavior may occasionally evade detection or be grouped with benign clusters. Similarly,
threshold tuning remains sensitive to data distributions, and future work may need to adopt
adaptive thresholding or domain-specific calibration to accommodate diverse environments.
Another notable challenge lies in handling encrypted traffic, where payload inspection becomes
infeasible. Although SiamIDS primarily relies on flow-level and statistical features, the lack
of visibility into encrypted payloads may



limit its ability to fully characterize complex application-layer attacks. Integrating
side-channel features such as timing, packet size distribution, and TLS handshake metadata
could help mitigate this limitation. Additionally, cross-domain generalization remains an open
issue: models trained on one IoT or cloud domain may exhibit reduced performance when
transferred to another with differing traffic characteristics or device behaviors. Domain
adaptation or federated learning approaches may therefore be explored in future work to enhance
generalizability and resilience across distributed environments. Overall, the system strikes a
strong balance between detection precision, interpretability, and deployability, positioning it
as a viable next-generation solution for cloud-integrated IoT intrusion detection.

6. Conclusion and future scope

    This paper proposed SiamIDS, a novel cloud-centric intrusion detection framework tailored
for large-scale IoT environments. The system uniquely integrates a Siamese Bi-LSTM network with
contrastive learning, autoencoder-based feature reduction, SHAP-based interpretability, and
OPTICS clustering, a combination not seen in existing IDS literature. This multi-stage
architecture enables the detection of both known and zero-day threats while offering
transparent, feature-level explanations and post-detection behavioral grouping. Experimental
results on the CIC IoT-DIAD 2024 dataset demonstrate high detection performance with an overall
F1-score of 99.45 %, precision of 99.94 %, and a recall of 98.96 %. Clustering quality metrics
such as a Silhouette Score of 0.901, DBI of 0.092, and ARI of 0.889 confirm the effectiveness
of semantic grouping. The system is also efficient, achieving inference speeds over 220 K
samples/sec with a RAM usage of less than 1.5 GB. However, current limitations include reliance
on fixed similarity thresholds and potential sensitivity to evolving traffic patterns.
    In the near future, it is planned to explore adaptive thresholding, multi-modal data
fusion, self-supervised sequence modeling with transformers, federated learning for
decentralized training, and integration with the MITRE ATT&CK framework to support threat
mitigation and automated response. These directions will enhance the scalability, resilience,
and practical deployment of SiamIDS in real-world SOC environments.

CRediT authorship contribution statement

   Prabu Kaliyaperumal: Writing – original draft, Conceptualization. Palani Latha: Writing –
review & editing, Validation. Selvaraj Palanisamy: Writing – review & editing, Formal analysis,
Data curation. Sridhar Pushpanathan: Visualization, Investigation. Anand Nayyar: Writing –
review & editing, Project administration, Methodology, Investigation. Balamurugan Balusamy:
Methodology. Ahmad Alkhayyat: Writing – original draft, Resources.

Declaration of competing interest

    The authors declare that they have no known competing financial interests or personal
relationships that could have appeared to influence the work reported in this paper.

Data availability

    No data was used for the research described in the article.

References

 [1] S. Jain, P. Sukul, J. Groppe, B. Warnke, P. Harde, R. Jangid, S. Groppe, A scientometric analysis of reviews on the Internet of Things, J. Supercomput. 81 (6) (2025) 1–35.
 [2] A. Marengo, Navigating the nexus of AI and IoT: a comprehensive review of data analytics and privacy paradigms, Elsevier B.V., Oct. 01, 2024, https://doi.org/10.1016/j.iot.2024.101318.
 [3] B. Padma, M. Bukya, U. Ujjwal, An intelligent hybrid framework for threat pre-identification and secure key distribution in Zigbee-enabled IoT networks using RBF and blockchain, Appl. Syst. Innov. 8 (3) (May 2025) 76, https://doi.org/10.3390/asi8030076.
 [4] A.I. Zreikat, Z. AlArnaout, A. Abadleh, E. Elbasi, N. Mostafa, The integration of the Internet of Things (IoT) applications into 5G networks: a review and analysis, Computers 14 (7) (Jun. 2025) 250, https://doi.org/10.3390/computers14070250.
 [5] S.S. Qureshi, J. He, S.U. Qureshi, N. Zhu, A. Wajahat, A. Nazir, A. Wadud, Advanced AI-driven intrusion detection for securing cloud-based industrial IoT, Egypt. Informat. J. 30 (2025) 100644.
 [6] H. Alamleh, L. Estremera, S.S. Arnob, A.A.S. AlQahtani, Advanced persistent threats and wireless local area network security: an in-depth exploration of attack surfaces and mitigation techniques, J. Cybersecur. Privacy 5 (2) (May 2025) 27, https://doi.org/10.3390/jcp5020027.
 [7] A. Alharthi, M. Alaryani, S. Kaddoura, A comparative study of machine learning and deep learning models in binary and multiclass classification for intrusion detection systems, Array 26 (Jul. 2025), https://doi.org/10.1016/j.array.2025.100406.
 [8] J. Ferdous, R. Islam, A. Mahboubi, M.Z. Islam, A Survey on ML Techniques for Multi-Platform Malware Detection: Securing PC, Mobile Devices, IoT, and Cloud Environments, Multidisciplinary Digital Publishing Institute (MDPI), Feb. 01, 2025, https://doi.org/10.3390/s25041153.
 [9] T. Al-Shurbaji, M. Anbar, S. Manickam, I.H. Hasbullah, N. ALfriehate, B.A. Alabsi, H. Hashim, Deep Learning-Based Intrusion Detection System For Detecting IoT Botnet Attacks: a Review, IEEE Access, 2025.
[10] Y. Zhang, R.C. Muniyandi, F. Qamar, A Review of Deep Learning Applications in Intrusion Detection Systems: Overcoming Challenges in Spatiotemporal Feature Extraction and Data Imbalance, Multidisciplinary Digital Publishing Institute (MDPI), Feb. 01, 2025, https://doi.org/10.3390/app15031552.
[11] G. Aldehim, T. Shahzad, M.A. Khan, Y.Y. Ghadi, W. Jiang, T. Mazhar, H. Hamam, Balancing sustainability and security: a review of 5G and IoT in smart cities, Digit. Commun. Netw. (2025).
[12] S.B. Sharma, A.K. Bairwa, Leveraging AI for Intrusion Detection in IoT Ecosystems: A Comprehensive Study, Institute of Electrical and Electronics Engineers Inc, 2025, https://doi.org/10.1109/ACCESS.2025.3550392.
[13] U. Tariq, T.A. Ahanger, Employing SAE-GRU deep learning for scalable botnet detection in smart city infrastructure, PeerJ Comput. Sci. 11 (2025), https://doi.org/10.7717/peerj-cs.2869.
[14] A. Bensaoud, J. Kalita, Optimized detection of cyber-attacks on IoT networks via hybrid deep learning models, Ad Hoc Netw. 170 (2025) 103770, https://doi.org/10.1016/j.adhoc.2025.103770.
[15] J. Zhang, R. Chen, Y. Zhang, W. Han, Z. Gu, S. Yang, Y. Fu, MF2POSE: multi-task feature Fusion Pseudo-siamese Network for intrusion detection using category-distance promotion loss, Knowl.-Based Syst. 283 (2024) 111110.
[16] O.A. Alimi, Data-Driven Learning Models for Internet of Things Security: Emerging Trends, Applications, Challenges and Future Directions, Multidisciplinary Digital Publishing Institute (MDPI), May 01, 2025, https://doi.org/10.3390/technologies13050176.
[17] P. Bedi, N. Gupta, V. Jindal, Siam-IDS: handling class imbalance problem in intrusion detection systems using Siamese neural network, Procedia Computer Science, Elsevier B.V., 2020, pp. 780–789, https://doi.org/10.1016/j.procs.2020.04.085.
[18] K. Saurabh, S. Sood, P.A. Kumar, U. Singh, R. Vyas, O.P. Vyas, R. Khondoker, Lbdmids: LSTM based deep learning model for intrusion detection systems for IOT networks, in: 2022 IEEE World AI IoT Congress (AIIoT), IEEE, 2022, pp. 753–759.
[19] A. Aldaej, T.A. Ahanger, I. Ullah, Deep Learning-inspired IoT-IDS mechanism for edge computing environments, Sensors 23 (24) (Dec. 2023), https://doi.org/10.3390/s23249869.
[20] H. Hindy, et al., Leveraging siamese networks for one-shot intrusion detection model, J. Intell. Inf. Syst. 60 (2) (Apr. 2023) 407–436, https://doi.org/10.1007/s10844-022-00747-z.
[21] B. Madhu, M. Venu Gopala Chari, R. Vankdothu, A.K. Silivery, V. Aerranagula, Intrusion detection models for IOT networks via deep learning approaches, Meas.: Sens. 25 (Feb. 2023), https://doi.org/10.1016/j.measen.2022.100641.
[22] V. Hnamte, J. Hussain, DCNNBiLSTM: an efficient hybrid deep learning-based intrusion detection system, Telemat. Informat. Rep. 10 (Jun. 2023), https://doi.org/10.1016/j.teler.2023.100053.
[23] K. Alzboon, J. Al-Nihoud, W. Alsharafat, Novel network intrusion detection based on feature filtering using FLAME and new cuckoo selection in a genetic algorithm, Appl. Sci. (Switzerland) 13 (23) (Dec. 2023), https://doi.org/10.3390/app132312755.
[24] R. Ben Said, Z. Sabir, I. Askerzade, CNN-BiLSTM: A hybrid deep learning approach for network intrusion detection system in software-defined networking with hybrid feature selection, IEEE Access 11 (2023) 138732–138747, https://doi.org/10.1109/ACCESS.2023.3340142.
[25] J. Zhang, X. Zhang, Z. Liu, F. Fu, Y. Jiao, F. Xu, A network intrusion detection model based on BiLSTM with multi-head attention mechanism, Electronics (Switzerland) 12 (19) (Oct. 2023), https://doi.org/10.3390/electronics12194170.
[26] T. Hou, H. Xing, X. Liang, X. Su, Z. Wang, A Marine hydrographic station networks intrusion detection method based on LCVAE and CNN-BiLSTM, J. Mar. Sci. Eng. 11 (1) (Jan. 2023), https://doi.org/10.3390/jmse11010221.
[27] A.M. Ali, F. Alqurashi, F.J. Alsolami, S. Qaiyum, A double-layer indemnity enhancement using LSTM and HASH function technique for intrusion detection system, Mathematics 11 (18) (Sep. 2023), https://doi.org/10.3390/math11183894.



[28] H. Jiang, S. Ji, G. He, X. Li, Network traffic anomaly detection model based on feature reduction and bidirectional LSTM neural Network optimization, Sci. Program. 2023 (Nov. 2023) 1–18, https://doi.org/10.1155/2023/2989533.
[29] S. Yaras, M. Dener, IoT-based intrusion detection system using new hybrid deep learning algorithm, 2024, doi: 10.3390/electronics.
[30] T. Althiyabi, I. Ahmad, M.O. Alassafi, Enhancing IoT security: A few-shot learning approach for intrusion detection, Mathematics 12 (7) (Apr. 2024), https://doi.org/10.3390/math12071055.
[31] J. Bo, K. Chen, S. Li, P. Gao, Boosting few-shot network intrusion detection with adaptive feature fusion mechanism, Electronics (Switzerland) 13 (22) (Nov. 2024), https://doi.org/10.3390/electronics13224560.
[32] A. Touré, Y. Imine, A. Semnont, T. Delot, A. Gallais, A framework for detecting zero-day exploits in network flows, Comput. Netw. 248 (Jun. 2024), https://doi.org/10.1016/j.comnet.2024.110476.
[33] S.S.N. Chintapalli, S.P. Singh, J. Frnda, P. Bidare Divakarachari, V.L. Sarraju, P. Falkowski-Gilski, OOA-modified Bi-LSTM network: an effective intrusion detection framework for IoT systems, Heliyon 10 (8) (Apr. 2024), https://doi.org/10.1016/j.heliyon.2024.e29410.
[34] Y. Guan, M. Noferesti, N. Ezzati-Jivan, A two-tiered framework for anomaly classification in IoT networks utilizing CNN-BiLSTM model, Softw. Impacts 20 (May 2024), https://doi.org/10.1016/j.simpa.2024.100646.
[35] C. Zhang, J. Li, N. Wang, D. Zhang, Research on intrusion detection method based on Transformer and CNN-BiLSTM in Internet of things, Sensors 25 (9) (May 2025), https://doi.org/10.3390/s25092725.
[46] A. Demircioğlu, The effect of feature normalization methods in radiomics, Insights Imaging 15 (1) (Dec. 2024), https://doi.org/10.1186/s13244-023-01575-7.
[47] A. Kumar, R. Radhakrishnan, M. Sumithra, P. Kaliyaperumal, B. Balusamy, F. Benedetto, A scalable hybrid autoencoder–extreme learning machine framework for adaptive intrusion detection in high-dimensional networks, Future Internet 17 (5) (May 2025) 221, https://doi.org/10.3390/fi17050221.
[48] B.Y. An, J.H. Yang, S. Kim, T. Kim, Malware detection using dual Siamese network model, CMES - Comput. Model. Eng. Sci. 141 (1) (2024) 563–584, https://doi.org/10.32604/cmes.2024.052403.
[49] Y. Xiao, Y. Feng, K. Sakurai, An efficient detection mechanism of network intrusions in IoT environments using autoencoder and data partitioning, Computers 13 (10) (Oct. 2024), https://doi.org/10.3390/computers13100269.
[50] K.A. Alaghbari, H.S. Lim, M.H.M. Saad, Y.S. Yong, Deep autoencoder-based integrated model for anomaly detection and efficient feature extraction in IoT networks, Internet Things 4 (3) (Sep. 2023) 345–365, https://doi.org/10.3390/iot4030016.
[51] T. Patel, S.S. Iyer, SiaDNN: Siamese deep neural network for anomaly detection in user behavior, Knowl.-Based Syst. 324 (2025) 113769, https://doi.org/10.1016/j.knosys.2025.113769.
[52] M. Sarhan, S. Layeghy, M. Gallagher, M. Portmann, From zero-shot machine learning to zero-day attack detection, Int. J. Inf. Secur. 22 (4) (Aug. 2023) 947–959, https://doi.org/10.1007/s10207-023-00676-0.
[53] K. Berahmand, F. Daneshfar, E.S. Salehi, Y. Li, Y. Xu, Autoencoders and their applications in machine learning: a survey, Artif. Intell. Rev. 57 (2) (Feb. 2024),
[36] A. Alabbadi, F. Bajaber, An intrusion detection system over the IoT data streams                   https://doi.org/10.1007/s10462-023-10662-6.
     using eXplainable artificial intelligence (XAI), Sensors 25 (3) (Feb. 2025), https://         [54] B.A. Manjunatha, K.A. Shastry, E. Naresh, P.K. Pareek, K.T. Reddy, A network
     doi.org/10.3390/s25030847.                                                                         intrusion detection framework on sparse deep denoising auto-encoder for
[37] F. Alhayan, M.K. Saeed, R. Allafi, M. Abdullah, A. Subahi, N.A. Alghanmi,                          dimensionality reduction, Soft. comput. 28 (5) (Mar. 2024) 4503–4517, https://
     H. Alkhudhayr, Hybrid deep learning models with spotted hyena optimization for                     doi.org/10.1007/s00500-023-09408-x.
     cloud computing enabled intrusion detection system, J. Radiat. Res. Appl. Sci. 18             [55] N. Latif, W. Ma, H.B. Ahmad, Advancements in securing federated learning with
     (2) (2025) 101523.                                                                                 IDS: a comprehensive review of neural networks and feature engineering
[38] M.V. Duc, P.M. Dang, T.T. Phuong, T.D. Truong, V. Hai, N.H. Thanh, Detecting                       techniques for malicious client detection, Artif. Intell. Rev. 58 (3) (Mar. 2025),
     emerging DGA malware in federated environments via variational autoencoder-                        https://doi.org/10.1007/s10462-024-11082-w.
     based clustering and resource-aware client selection, Future Internet. 17 (7) (Jul.           [56] A.A. Wani, Comprehensive review of dimensionality reduction algorithms:
     2025) 299, https://doi.org/10.3390/fi17070299.                                                     challenges, limitations, and innovative solutions, PeerJ. Comput. Sci. 11 (Jul.
[39] S. Natha, F. Ahmed, M. Siraj, M. Lagari, M. Altamimi, A.A. Chandio, Deep BiLSTM                    2025) e3025, https://doi.org/10.7717/peerj-cs.3025.
     attention model for spatial and temporal anomaly detection in video surveillance,             [57] T.S. Lakshmi, M. Govindarajan, A. Srinivasulu, Embedding and Siamese deep
     Sensors 25 (1) (Jan. 2025), https://doi.org/10.3390/s25010251.                                     neural network-based malware detection in Internet of Things, Int. J. Pervas.
[40] S. Alsaleh, M.E.B. Menai, S. Al-Ahmadi, A heterogeneity-aware semi-decentralized                   Comput. Commun. 21 (1) (Jan. 2025) 14–25, https://doi.org/10.1108/IJPCC-06-
     model for a lightweight intrusion detection system for IoT networks based on                       2022-0236.
     federated learning and BiLSTM, Sensors 25 (4) (Feb. 2025), https://doi.org/                   [58] W. Dai, X. Li, W. Ji, S. He, Network intrusion detection method based on CNN-
     10.3390/s25041039.                                                                                 BiLSTM-attention model, IEEe Access. 12 (2024) 53099–53111, https://doi.org/
[41] V.Z. Mohale, I.C. Obagbuwa, Evaluating machine learning-based intrusion                            10.1109/ACCESS.2024.3384528.
     detection systems with explainable AI: enhancing transparency and                             [59] Y. Li, G. Guo, J. Shi, R. Yang, S. Shen, Q. Li, J. Luo, A versatile framework for
     interpretability, Front. Comput. Sci. 7 (2025), https://doi.org/10.3389/                           attributed network clustering via K-nearest neighbor augmentation, The VLDB
     fcomp.2025.1520741.                                                                                Journal 33 (6) (2024) 1913–1943.
[42] M. Rabbani, et al., Device identification and anomaly detection in IoT                        [60] T.B. Ogunseyi, G. Thiyagarajan, An explainable LSTM-based intrusion detection
     environments, IEEe Internet. Things. J. 12 (10) (2025) 13625–13643, https://doi.                   system optimized by Firefly algorithm for IoT networks, Sensors 25 (7) (Apr.
     org/10.1109/JIOT.2024.3522863.                                                                     2025), https://doi.org/10.3390/s25072288.
[43] G. Black, K. Fronczyk, W. Arliss, R. Allen, Descriptor: firewall attack detections and        [61] S. Subudhi, S. Panigrahi, Application of OPTICS and ensemble learning for
     extractions (FADE), IEEE Data Descrip. 2 (May 2025) 163–172, https://doi.org/                      database intrusion detection, J. King Saud Univ. - Comput. Inf. Sci. 34 (3) (Mar.
     10.1109/ieeedata.2025.3572866.                                                                     2022) 972–981, https://doi.org/10.1016/j.jksuci.2019.05.001.
[44] M.S. Korium, M. Saber, A. Beattie, A. Narayanan, S. Sahoo, P.H.J. Nardelli,                   [62] P. Artioli, A. Maci, A. Magrì, A comprehensive investigation of clustering
     Intrusion detection system for cyberattacks in the Internet of vehicles environment,               algorithms for user and entity behavior analytics, Front. Big. Data 7 (2024),
     Ad. Hoc. Netw. 153 (Feb. 2024), https://doi.org/10.1016/j.adhoc.2023.103330.                       https://doi.org/10.3389/fdata.2024.1375818.
[45] L.B.V de Amorim, G.D.C. Cavalcanti, R.M.O. Cruz, The choice of scaling technique
     matters for classification performance, Appl. Soft. Comput. 133 (2023) 109924,
     https://doi.org/10.1016/j.asoc.2022.109924.

