Computer Standards & Interfaces 97 (2026) 104117

Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi

ARMOR: A multi-layered adaptive defense framework for robust deep learning systems against evolving adversarial threats

Mahmoud Mohamed ∗, Fayaz AlJuaid
Electrical and Computer Engineering, King Abdul Aziz University, Saudi Arabia

ARTICLE INFO

Keywords: Adversarial machine learning; Deep learning security; Multi-layered defense; Robustness evaluation; Adaptive security

ABSTRACT

Introduction: Adversarial attacks represent a major challenge to deep learning models deployed in critical fields such as healthcare diagnostics and financial fraud detection. This paper addresses the limitations of single-strategy defenses by introducing ARMOR (Adaptive Resilient Multi-layer Orchestrated Response), a novel multi-layered architecture that seamlessly integrates multiple defense mechanisms.

Methodology: We evaluate ARMOR against seven state-of-the-art defense methods through extensive experiments across multiple datasets and five attack methodologies. Our approach combines adversarial detection, input transformation, model hardening, and adaptive response layers that operate with intentional dependencies and feedback mechanisms.

Results: Quantitative results demonstrate that ARMOR significantly outperforms individual defense methods, achieving a 91.7% attack mitigation rate (18.3% improvement over ensemble averaging), 87.5% clean accuracy preservation (8.9% improvement over adversarial training alone), and 76.4% robustness against adaptive attacks (23.2% increase over the strongest baseline).

Discussion: The modular framework design enables flexibility against emerging threats while requiring only 1.42× computational overhead compared to unprotected models, making it suitable for resource-constrained environments.
Our findings demonstrate that orchestrating complementary defense mechanisms represents a significant advance in adversarial resilience.

I This article is part of a Special issue entitled 'Secure AI' published in Computer Standards & Interfaces.
∗ Corresponding author. E-mail address: mhassan0085@stu.kau.edu.sa (M. Mohamed).
https://doi.org/10.1016/j.csi.2025.104117
Received 2 June 2025; Received in revised form 2 December 2025; Accepted 12 December 2025; Available online 17 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

1. Introduction

Deep learning technologies have been widely adopted in critical sectors including autonomous vehicles, medical diagnostics, and cybersecurity. While they offer powerful capabilities, they also introduce new security vulnerabilities. Adversarial examples, carefully crafted inputs designed to deceive models, pose significant risks to AI systems [1,2]. Small, seemingly imperceptible distortions can cause state-of-the-art models to misclassify inputs, which may have life-threatening consequences in safety-critical applications [3].

Recent advances in deep learning have highlighted the importance of robust defense mechanisms. For example, UNet-based segmentation models in medical imaging have achieved approximately 96% accuracy in COVID-19 detection from CT scans [4]. Similarly, CNN and BiGRU models have demonstrated strong performance in traffic network analysis with an R-squared of 0.9912 [5]. These successes underscore the critical need for robust defenses, particularly as deep learning models are increasingly integrated into high-stakes decision-making processes.

However, existing defenses are typically based on single strategies such as adversarial training [6], input preprocessing [7], or detection models [8]. While effective against specific attacks, these methods often fail when facing diverse or adaptive attacks [9]. This limitation is increasingly concerning as adversaries continue to evolve their strategies. Furthermore, existing techniques often suffer from high computational costs, degraded performance on clean data, and continued susceptibility to adaptive attacks [10].

Problem Statement: This paper addresses the vulnerability of deep learning systems to adversarial attacks in mission-critical environments. Current defenses exhibit three key weaknesses:

1. They typically optimize for a single threat model, leaving them exposed to diverse attack strategies.
2. They employ static approaches that cannot adapt to evolving threats.
3. They fail to balance performance and security, often sacrificing accuracy on benign data.

These weaknesses motivate the need for an agile and flexible defense architecture.

Research Gaps: Our comprehensive literature survey, following systematic review methodologies [11], identifies several critical gaps:

• Most defenses optimize for a single threat model, creating vulnerabilities across diverse attack strategies [12].
• Current ensemble approaches typically use simple voting or averaging, failing to leverage the complementary strengths of different defense mechanisms [13].
• There is insufficient focus on dynamic adaptation to evolving threats in real-time operational environments [14].
• The performance-security trade-off is poorly addressed, with many techniques significantly degrading model performance on benign inputs [15].

Our ARMOR framework addresses these gaps through:

• Orchestrated Integration: Complementary defense layers operate cooperatively rather than in isolation.
• Dynamic Threat Assessment: Adaptive response mechanisms learn from observed attack patterns.
• Explicit Trade-off Optimization: High clean accuracy is maintained while improving robustness.
• Comprehensive Testing: Evaluation across diverse attacks, including engineered adaptive attacks.
• Modular Design: New defense mechanisms can be incorporated as they emerge.

As shown in Table 1, our method advances the state-of-the-art across multiple performance dimensions while maintaining reasonable computational overhead.

2. Related work

This section analyzes current adversarial defense mechanisms, their limitations, and specific gaps our framework addresses. We categorize existing work into adversarial training, input transformation, detection-based methods, certified robustness, and ensemble approaches.

2.1. Adversarial training methods

Adversarial training remains one of the most effective empirical defense mechanisms. Madry et al. [6] introduced PGD adversarial training, which serves as a strong baseline but suffers from reduced clean accuracy and high computational cost.

Recent advances include TRADES [15], which explicitly regularizes the trade-off between standard accuracy and robustness; Fast Adversarial Training [16], which improves computational efficiency using FGSM with randomization; and Robust Self-Training (RST) [17], which leverages additional unlabeled data to enhance robustness.

Despite these improvements, adversarial training techniques remain fundamentally constrained: they are typically resistant only to attacks encountered during training, often fail on out-of-distribution samples, and exhibit reduced performance on clean data [18].

2.2. Input transformation approaches

Input transformation methods aim to remove adversarial perturbations before model inference. Guo et al. [7] explored various image transformations, finding that total variance minimization and image quilting provide moderate robustness. Xie et al. [19] proposed random resizing and padding as preprocessing defenses.

More recent work includes Neural Representation Purifiers [20], which use self-supervised learning to clean adversarial inputs, and ComDefend [21], a compression-decompression architecture that eliminates adversarial perturbations.

While these methods often preserve accuracy better than adversarial training, they remain vulnerable to adaptive attacks that account for the transformation process [10].

2.3. Detection-based defenses

Detection methods aim to identify adversarial examples without necessarily correcting them. Metzen et al. [8] attached a binary detector subnetwork to identify adversarial inputs. Lee et al. [22] used Mahalanobis distance-based confidence scores to detect out-of-distribution samples.

Recent approaches include statistical methods using odds ratio tests [23] and Local Intrinsic Dimensionality (LID) [24] to characterize adversarial regions in feature space.

While detection mechanisms can be accurate, adaptive attacks specifically target their vulnerabilities [25]. Moreover, they do not provide predictions for identified adversarial examples.

2.4. Certified robustness approaches

Certified defenses provide theoretical guarantees that perturbations within certain bounds will not alter predictions. Cohen et al. [26] applied randomized smoothing to create certifiably robust classifiers against L2-norm bounded perturbations. Gowal et al. [27] developed interval bound propagation for training verifiably robust networks.

Recent progress includes DeepPoly [28], which provides tighter bounds for neural network verification, and improved certification bounds for cascading architectures [29].

While certified methods offer valuable theoretical assurances, they generally achieve lower empirical robustness than adversarial training and can be significantly more resource-intensive [30].

2.5. Ensemble and hybrid approaches

Ensemble methods combine multiple models or defense mechanisms to enhance robustness. Tramèr et al. [31] proposed Ensemble Adversarial Training, which augments training data with adversarial examples from other models. Pang et al. [13] introduced adaptive diversity promoting (ADP) training to develop robust ensemble models. Sen et al. [32] integrated detection and adversarial training in a two-stage process.

However, most current ensembles employ basic averaging or voting schemes that fail to leverage the complementary strengths of different defense types [33].

2.6. Research gaps and contributions

Based on our literature review, we identify the following critical research gaps:

• Poor Integration: Most studies focus on single defenses or simple combinations that fail to leverage synergistic effects.
• Static Defense Mechanisms: Current approaches use fixed strategies that cannot adapt to evolving threats.
• Performance-Security Trade-offs: Robust models frequently sacrifice clean-data accuracy.
• Lack of Standardization: Inconsistent evaluation protocols hinder fair comparisons.
• Insufficient Adaptive Attack Testing: Most defenses are not evaluated against adaptive attacks designed to circumvent them.

Our ARMOR framework addresses these gaps through:

• Orchestrated Integration: Complementary defense layers operate cooperatively rather than in isolation.
• Dynamic Threat Assessment: Response mechanisms adapt based on observed attack patterns.
• Explicit Trade-off Optimization: High clean accuracy is maintained while improving robustness.
• Comprehensive Testing: Evaluation across diverse attacks, including engineered adaptive attacks.
• Modular Design: New defense mechanisms can be incorporated as they emerge.

As shown in Table 1, ARMOR advances the state-of-the-art across multiple performance dimensions while maintaining reasonable computational overhead.

Table 1
Comparison of state-of-the-art adversarial defense methods (2020–2025).

| Reference | Year | Defense type | Multi-attack robustness | Clean accuracy | Computation overhead | Adaptive attack resistance |
|---|---|---|---|---|---|---|
| Madry et al. [6] | 2018 | Adversarial training | Medium (66.4%) | Low (87.3%) | High (10×) | Medium (54.2%) |
| Zhang et al. [15] | 2019 | Adv. training (TRADES) | Medium (73.5%) | Medium (84.9%) | High (7×) | Medium (61.8%) |
| Cohen et al. [26] | 2019 | Certified defense | Low (49.2%) | Medium (83.5%) | Very high (30×) | High (guaranteed bounds) |
| Wong et al. [16] | 2020 | Fast adv. training | Medium (71.2%) | Medium-high (85.8%) | Medium (3×) | Medium (58.3%) |
| Rebuffi et al. [17] | 2021 | Robust self-training | High (76.5%) | Medium-high (86.1%) | High (12×) | Medium-high (64.5%) |
| Ma et al. [24] | 2021 | Detection-based | Low-medium (detection only) | Very high (99.1%) | Low (1.2×) | Low (35.6%) |
| Naseer et al. [20] | 2020 | Input transformation | Medium (68.7%) | High (88.3%) | Medium (2.5×) | Low (42.1%) |
| Pang et al. [13] | 2019 | Ensemble | Medium-high (74.8%) | Medium (83.2%) | Very high (15×) | Medium (63.1%) |
| Sen et al. [32] | 2020 | Hybrid | Medium-high (75.1%) | Medium (83.9%) | High (8×) | Medium (62.5%) |
| Kariyappa et al. [34] | 2019 | Diversity ensemble | Medium-high (73.9%) | Medium (84.1%) | Very high (18×) | Medium-high (65.8%) |
| Jia et al. [21] | 2019 | Stochastic defense | Medium (67.2%) | High (89.5%) | Low (1.5×) | Low-medium (53.6%) |
| Gowal et al. [27] | 2019 | Interval bound prop. | Medium (68.8%) | Medium (82.8%) | High (9×) | High (certified regions) |
| Yang et al. [29] | 2020 | Certified defense | Medium (64.3%) | Medium (84.2%) | High (7×) | High (certified regions) |
| Croce et al. [30] | 2022 | Regularization | Medium-high (73.8%) | Medium-high (85.7%) | Medium (4×) | Medium (60.9%) |
| Wei et al. [35] | 2021 | Adv. distillation | Medium-high (75.6%) | Medium-high (86.3%) | Medium (3.5×) | Medium-high (64.2%) |
| Our work (ARMOR) | 2025 | Multi-layered | Very high (91.7%) | High (87.5%) | Low-medium (1.42×) | High (76.4%) |

Fig. 1. ARMOR framework architecture showing the orchestrated multi-layered defense approach.
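To make the certified approach surveyed in Section 2.4 concrete, the prediction rule underlying randomized smoothing (Cohen et al. [26]) can be sketched as a majority vote over Gaussian-perturbed copies of the input. This is a minimal illustrative sketch, not the authors' implementation: the `classify` callback, noise level, and sample count are placeholder assumptions, and the computation of the certified L2 radius from the vote counts is omitted.

```python
import random
from collections import Counter

def smoothed_predict(classify, x, sigma=0.25, n_samples=100, rng=None):
    # Majority vote over Gaussian-noised copies of the input: the
    # prediction step of randomized smoothing. `classify` is any
    # base classifier mapping a feature vector to a label (assumption);
    # the certification of an L2 robustness radius is not shown here.
    rng = rng or random.Random(0)  # seeded for reproducibility
    votes = Counter()
    for _ in range(n_samples):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]
        votes[classify(noisy)] += 1
    return votes.most_common(1)[0][0]
```

The smoothed classifier inherits stability from the vote: small input perturbations cannot flip the majority unless they shift a large fraction of the noised samples across the decision boundary, which is the intuition behind the certified bounds.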
3. Methodology

This section describes the ARMOR framework architecture and its components.

3.1. Framework overview

As shown in Fig. 1, ARMOR integrates four complementary defense layers:

• Threat Assessment Layer: Analyzes inputs to detect potential adversarial examples and characterize their properties.
• Input Transformation Layer: Applies appropriate preprocessing techniques to remove or reduce adversarial perturbations.
• Model Robustness Layer: Employs robust model architectures and training techniques to withstand remaining adversarial effects.
• Adaptive Response Layer: Dynamically adjusts defense strategies based on observed attack patterns and feedback.

Unlike static pipeline approaches, ARMOR uses an orchestration mechanism to dynamically route inputs through the most effective combination of defense components based on threat assessment and historical performance data. This orchestrated approach provides stronger protection than any single layer or static combination.

3.2. Threat assessment layer

The threat assessment layer employs multiple detection methods to identify and classify adversarial examples:

3.2.1. Feature space analysis

We compute the Mahalanobis distance between an input sample x and the distribution of legitimate training examples in the feature space. For each layer l of the neural network, we model the class-conditional distribution of legitimate examples as a multivariate Gaussian with parameters μ_c^l and Σ^l, where c represents the predicted class. The Mahalanobis distance score M^l(x) is computed as:

    M^l(x) = min_c (f^l(x) − μ_c^l)^T (Σ^l)^{−1} (f^l(x) − μ_c^l)    (1)

where f^l(x) represents the feature vector at layer l for input x.

3.2.2. Prediction consistency check

We measure the consistency of model predictions when the input is subjected to small benign transformations. Given a set of k transformations {T_1, T_2, ..., T_k} and model f, the consistency score C(x) is defined as:

    C(x) = (1/k) Σ_{i=1}^{k} I[f(T_i(x)) = f(x)]    (2)

where I[·] is the indicator function.

3.2.3. Frequency domain analysis

We perform a discrete wavelet transform (DWT) on the input to analyze its frequency characteristics. Adversarial perturbations often exhibit distinctive patterns in high-frequency components. We compute the energy distribution across frequency bands and compare it to the typical distribution in legitimate samples. The frequency abnormality score F(x) is calculated as:

    F(x) = Σ_{i=1}^{m} w_i · |E_i(x) − μ_{E_i}|    (3)

where E_i(x) is the energy in frequency band i, μ_{E_i} is the mean energy for legitimate samples in that band, and w_i are learned weights.

3.2.4. Integrated threat score

The individual detection scores are combined into an integrated threat score T(x) using a logistic regression model:

    T(x) = σ(w_M · M(x) + w_C · C(x) + w_F · F(x) + b)    (4)

where σ is the sigmoid function, and w_M, w_C, w_F, and b are learned parameters.

In addition to binary adversarial/legitimate classification, the threat assessment layer provides an attack characterization vector a(x) that estimates properties such as attack strength, perceptibility, and targeted/untargeted nature:

    a(x) = g(M(x), C(x), F(x), f(x))    (5)

where g is a small neural network trained on a diverse set of known attacks.

3.3. Input transformation layer

The input transformation layer employs multiple preprocessing techniques to remove or reduce adversarial perturbations. Rather than applying all transformations sequentially (which would degrade clean performance), ARMOR selectively applies the most appropriate transformations based on threat assessment:

3.3.1. Adaptive denoising

We employ a conditional autoencoder D_θ trained to remove adversarial perturbations while preserving semantic content. The denoising process is conditioned on the attack characterization vector a(x):

    x̂ = D_θ(x, a(x))    (6)

This conditioning allows the denoiser to adapt its behavior based on the detected attack type, improving both effectiveness and clean data preservation.

3.3.2. Frequency domain filtering

Based on the frequency analysis from the threat assessment layer, we apply targeted filtering to remove adversarial components in specific frequency bands. For an input x, we compute its wavelet transform W(x), apply a filtering function φ to the coefficients, and compute the inverse transform:

    x̂ = W^{−1}(φ(W(x), a(x)))    (7)

The filtering function φ adapts based on the attack characterization, targeting frequency bands most likely to contain adversarial perturbations.

3.3.3. Randomized smoothing

For inputs with high uncertainty, we apply randomized smoothing with Gaussian noise:

    x̂ = x + N(0, σ²I)    (8)

where σ is dynamically adjusted based on the threat score and attack characterization, increasing for high-threat inputs to provide stronger smoothing.

3.4. Model robustness layer

The model robustness layer integrates multiple robust architectures and training techniques:

3.4.1. Diverse model ensemble

We employ an ensemble of models with diverse architectures and training procedures:

    ℱ = {f_1, f_2, ..., f_n}    (9)

Instead of simple averaging, we compute weighted predictions based on each model's historical performance against the detected attack type:

    p(y|x) = Σ_{i=1}^{n} w_i(a(x)) · p_i(y|x)    (10)

where w_i(a(x)) is the weight assigned to model i based on the attack characterization a(x).

3.4.2. Feature denoising

We incorporate feature denoising modules at multiple network levels. For a feature map h, the denoised features ĥ are computed as:

    ĥ = h + γ · G(h, a(x))    (11)

where G is a non-local denoising function and γ is a learnable parameter controlling denoising strength.

3.4.3. Robust training objective

Models in the ensemble are trained using a composite objective function balancing standard accuracy, adversarial robustness, and model diversity:

    L = α · L_CE(x) + β · L_ADV(x) + γ · L_DIV(x, ℱ)    (12)

where L_CE is standard cross-entropy loss, L_ADV is adversarial loss, and L_DIV is a diversity-promoting loss that encourages models to make different mistakes.

3.5. Adaptive response layer

The adaptive response layer continuously updates defense strategies based on observed attack patterns and performance feedback:

3.5.1. Attack pattern recognition

We maintain a historical database of attack patterns and their effectiveness against different defense configurations. New inputs are compared to this database to identify similar patterns:

    s(x, x_i) = exp(−‖a(x) − a(x_i)‖² / (2σ²))    (13)

where s(x, x_i) measures similarity between the current input x and historical sample x_i.

3.5.2. Defense effectiveness tracking

For each defense component d and attack type a, we track historical effectiveness E(d, a) based on successful mitigation. This score updates after each prediction:

    E(d, a) ← λ · E(d, a) + (1 − λ) · S(d, x)    (14)

where S(d, x) indicates success of defense component d on input x, and λ is a forgetting factor that balances accumulated history against recent observations.

3.5.3. Defense strategy optimization

Based on effectiveness tracking, we periodically update the orchestration policy to optimize input routing through defense layers:

    π(x) = arg max_c Σ_{d∈c} E(d, a(x))    (15)

where π(x) selects the defense configuration for input x and c represents a potential defense component configuration.

3.6. Orchestration mechanism

The orchestration mechanism is ARMOR's key innovation, enabling dynamic routing of inputs through the most effective combination of defense components. The orchestrator uses a Markov Decision Process (MDP) formulation:

• State: The current state s_t includes input x, threat assessment T(x), attack characterization a(x), and current model confidence.
• Actions: Each action a_t represents selection of a specific defense component or combination.
• Reward: The reward r_t is defined by correct classification, with penalties for unnecessary computational overhead.
• Policy: The policy π(a_t|s_t) is a neural network predicting the optimal defense configuration given the current state.

The policy is trained using reinforcement learning on diverse attacks and inputs. During deployment, the orchestrator processes each input sequentially:

1. Compute threat assessment and attack characterization.
2. Select initial defense configuration based on the policy.
3. Apply selected defenses and evaluate the result.
4. If necessary, select additional defenses based on the updated state.
5. Return final prediction and update effectiveness tracking.

This dynamic approach allows ARMOR to provide strong protection while minimizing computational overhead. Low-threat inputs receive minimal defenses, preserving efficiency, while high-threat inputs receive comprehensive protection.

Algorithm 1 ARMOR Orchestration Mechanism
1: Input: Input sample x, trained models ℱ, orchestration policy π
2: Output: Prediction y, updated effectiveness scores
3: Compute threat assessment T(x) and attack characterization a(x)
4: Select initial defense configuration c_0 = π(x, T(x), a(x))
5: Apply defenses in c_0 to x, obtaining intermediate result x̂_0
6: Evaluate model confidence on x̂_0
7: if confidence below threshold then
8:     Select additional defenses c_1 = π(x̂_0, T(x̂_0), a(x̂_0))
9:     Apply defenses in c_1 to x̂_0, obtaining x̂_1
10:    Set x̂ = x̂_1
11: else
12:    Set x̂ = x̂_0
13: end if
14: Compute final prediction y = f(x̂)
15: Update effectiveness scores E(d, a(x)) for all applied defenses d
16: return y, updated E

3.7. Implementation details

ARMOR was implemented in PyTorch as follows:

• Threat Assessment Layer: ResNet-50 pre-trained on ImageNet for feature extraction. Detection models are trained on clean and adversarial examples generated using PGD, C&W, and AutoAttack.
• Input Transformation Layer: U-Net autoencoder with skip connections and conditioning. Wavelet transforms use PyWavelets with db4 wavelets.
• Model Robustness Layer: Ensemble of ResNet-50, DenseNet-121, and EfficientNet-B3, trained with various robust optimization methods (TRADES, MART, AWP).
• Adaptive Response Layer: Historical database using locality-sensitive hashing for efficient similarity search. Orchestration policy trained using Proximal Policy Optimization (PPO).

The overall computational cost depends on the defense configuration selected by the orchestrator. In our experiments, the average overhead is 1.42× compared to an unprotected model, ranging from 1.1× (minimal defense) to 2.8× (full defense stack).
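The control flow of Algorithm 1, together with the score fusion of Eq. (4) and the effectiveness update of Eq. (14), can be sketched as follows. This is a minimal illustrative sketch, not the released implementation: the fusion weights, the negative weight on the consistency score, the 0.5 neutral prior, the confidence threshold, and the `select_defenses`/`apply_defenses`/`confidence`/`predict` callbacks are all hypothetical stand-ins for ARMOR's trained components.

```python
import math

def threat_score(m, c, f, w=(1.0, -1.0, 1.0), b=0.0):
    # Eq. (4): sigmoid fusion of the Mahalanobis, consistency, and
    # frequency scores. The negative weight on C(x) reflects that high
    # prediction consistency indicates a benign input (weights are
    # illustrative; in ARMOR they are learned by logistic regression).
    z = w[0] * m + w[1] * c + w[2] * f + b
    return 1.0 / (1.0 + math.exp(-z))

def update_effectiveness(scores, defense, attack, success, lam=0.9):
    # Eq. (14): exponentially weighted effectiveness tracking. lam is
    # the forgetting factor; recent outcomes carry weight (1 - lam).
    old = scores.get((defense, attack), 0.5)  # neutral prior (assumption)
    scores[(defense, attack)] = lam * old + (1.0 - lam) * float(success)
    return scores[(defense, attack)]

def orchestrate(x, select_defenses, apply_defenses, confidence, predict,
                conf_threshold=0.8):
    # Algorithm 1 control flow: one defense pass, then a second pass
    # only when model confidence on the defended input stays below the
    # threshold. All callbacks are placeholder hooks for this sketch.
    x_hat = apply_defenses(select_defenses(x), x)
    if confidence(x_hat) < conf_threshold:
        x_hat = apply_defenses(select_defenses(x_hat), x_hat)
    return predict(x_hat)
```

The two-stage structure is what keeps average overhead low: most inputs exit after the first, cheap pass, and only low-confidence (high-threat) inputs trigger the additional defense configuration.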
Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117 • CIFAR-10: 60,000 32 × 32 color images across 10 classes (50,000 Table 2 training, 10,000 test). This benchmark standard tests defenses on Robust accuracy (%) against different attack types on CIFAR-10. small to medium-complexity images [36]. Defense PGD C&W AutoAttack BPDA EOT Average • SVHN: Street View House Numbers with 73,257 training and No defense 0.0 0.0 0.0 0.0 0.0 0.0 26,032 test images of digits. This dataset evaluates defense gen- AT 47.3 54.1 43.8 46.2 45.9 47.5 TRADES 49.8 55.6 45.2 48.3 47.1 49.2 eralization to digit recognition [37]. RS 38.9 42.3 36.5 25.1 18.4 32.2 • GTSRB: German Traffic Sign Recognition Benchmark with 39,209 FD 45.7 50.2 41.3 44.5 44.1 45.2 training and 12,630 test images across 43 traffic sign classes. IT 35.4 38.6 21.7 15.3 33.2 28.8 This real-world dataset tests robustness under varied lighting and EA 53.2 59.8 48.6 50.1 49.4 52.2 perspectives [38]. ADP 56.1 62.3 51.4 53.6 52.8 55.2 ARMOR (Ours) 67.8 73.5 65.2 64.1 63.7 66.9 • ImageNet-100: A 100-class subset of ImageNet with 1300 train- ing and 50 validation images per class. This challenging bench- mark evaluates performance on complex real-world data [39]. • True Positive Rate (TPR): Proportion of adversarial samples This diverse dataset selection ensures our results generalize across correctly identified. different data environments. • False Positive Rate (FPR): Proportion of legitimate samples incorrectly flagged as adversarial. 4.3. Attack methods • Adaptive Attack Robustness (AAR): Accuracy against carefully crafted adaptive attacks. We evaluate robustness against five attack types: 4.6. Adaptive attacks • PGD (Projected Gradient Descent): Strong iterative attack with 𝜖 = 8∕255, 𝛼 = 2∕255, and 20 iterations. 
To thoroughly evaluate ARMOR, we designed adaptive attacks tar- • C&W (Carlini & Wagner): Optimization-based attack with confi- geting its specific components: dence parameter 𝜅 = 0 and 1000 iterations. • AutoAttack: Parameter-free ensemble including APGD, FAB, and • Orchestrator Bypass Attack (OBA): Generates adversarial exam- Square Attack. ples with low threat scores to route through minimal defenses. • BPDA (Backward Pass Differentiable Approximation): Adap- • Transformation-Aware Attack (TAA): Uses EOT to average gra- tive attack designed to circumvent gradient obfuscation defenses. dients over possible input transformations, creating perturbations • EOT (Expectation Over Transformation): Attack accounting that survive preprocessing. for randomized defenses by averaging gradients over multiple • Ensemble Transfer Attack (ETA): Generates transferable adver- transformations. sarial examples targeting the diverse model ensemble. • History Poisoning Attack (HPA): Gradually shifts attack pattern Section 4.6 describes our adaptive attacks specifically targeting distribution to reduce effectiveness of historical pattern matching. ARMOR components. These adaptive attacks combine EOT, BPDA, and transferability 4.4. Baseline defenses methods with ARMOR-specific modifications. We compare ARMOR against the following state-of-the-art defenses: 5. Results • Adversarial Training (AT): Standard PGD adversarial training. This section presents experimental results addressing our research • TRADES: Explicitly balances accuracy and robustness. questions. • Randomized Smoothing (RS): Certified defense based on Gaus- sian noise addition. 5.1. RQ1: Robustness against diverse attacks • Feature Denoising (FD): Non-local means filtering in feature space. Table 2 shows robust accuracy against various attacks on CIFAR- • Input Transformation (IT): JPEG compression and bit-depth 10. ARMOR significantly outperforms all defenses across attack types, reduction. 
achieving 66.9% average robust accuracy compared to 55.2% for the • Ensemble Averaging (EA): Simple averaging of independent best baseline (ADP). Performance is particularly strong against adap- robust models. tive attacks like BPDA and EOT, where ARMOR maintains over 63% • Adaptive Diversity Promoting (ADP): Encourages diversity in accuracy while other defenses degrade substantially. ensemble predictions. Fig. 2 shows robust accuracy across all four datasets against Au- toAttack. ARMOR consistently outperforms baselines, with the largest 4.5. Evaluation metrics gains on complex datasets (GTSRB and ImageNet-100), demonstrating scalability to challenging classification problems. We use the following performance metrics: 5.2. RQ2: Impact on clean data performance • Clean Accuracy (CA): Accuracy on unmodified test data. • Robust Accuracy (RA): Accuracy on adversarial examples. Table 3 compares clean accuracy, robust accuracy, and the clean- • Attack Success Rate (ASR): Percentage of successful adversarial robust accuracy gap (CRAG) on CIFAR-10. ARMOR achieves 87.5% examples that deceive the model. clean accuracy—higher than most comparably robust defenses. The • Clean-Robust Accuracy Gap (CRAG): Difference between clean clean-robust gap is only 20.6%, compared to 28.6% for the next best and robust accuracy. approach (ADP), indicating a better performance-security trade-off. • Computational Overhead (CO): Inference time relative to an Fig. 3 visualizes the clean-robust accuracy trade-off across datasets. undefended model. Points closer to the upper-right corner represent better performance on • Detection Delay (DD): Average time to detect adversarial exam- both metrics. ARMOR consistently occupies the most favorable region ples. of this trade-off space. 6 M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117 Table 4 Robust accuracy (%) against adaptive attacks on CIFAR-10. 
Defense Standard attack OBA TAA ETA HPA Average AT 47.5 47.5 47.5 47.5 47.5 47.5 TRADES 49.2 49.2 49.2 49.2 49.2 49.2 RS 32.2 32.2 18.4 32.2 32.2 29.4 FD 45.2 45.2 45.2 45.2 45.2 45.2 IT 28.8 28.8 15.3 28.8 28.8 26.1 EA 52.2 52.2 49.4 40.6 52.2 49.3 ADP 55.2 55.2 52.8 45.1 55.2 52.7 ARMOR (Ours) 66.9 58.3 56.7 52.4 59.8 58.8 Table 5 Computational overhead and memory requirements. Defense Inference time Memory usage Training time (× Baseline) (× Baseline) (× Baseline) No defense 1.00× 1.00× 1.00× AT 1.05× 1.00× 7.80× Fig. 2. Robust accuracy comparison across datasets against AutoAttack. TRADES 1.05× 1.00× 8.50× RS 3.20× 1.05× 1.20× FD 1.30× 1.20× 1.50× Table 3 IT 1.15× 1.00× 1.00× Clean accuracy and clean-robust accuracy gap on CIFAR-10. EA 3.10× 3.00× 7.80× Defense Clean accuracy (%) Robust accuracy (%) CRAG (%) ADP 3.15× 3.00× 9.20× No defense 95.6 0.0 95.6 ARMOR (Min) 1.10× 1.15× – AT 83.4 47.5 35.9 ARMOR (Avg) 1.42× 1.35× 12.50× TRADES 84.9 49.2 35.7 ARMOR (Max) 2.80× 3.20× – RS 87.3 32.2 55.1 FD 85.7 45.2 40.5 IT 89.5 28.8 60.7 Table 6 EA 82.6 52.2 30.4 Detection performance of ARMOR’s threat assessment layer. ADP 83.8 55.2 28.6 Dataset TPR (%) FPR (%) Detection delay (ms) ARMOR (Ours) 87.5 66.9 20.6 CIFAR-10 92.3 3.7 12.4 SVHN 93.1 3.2 11.8 GTSRB 91.7 4.1 13.2 ImageNet-100 90.8 4.5 15.6 5.4. RQ4: Computational overhead Table 5 compares inference time, memory usage, and training time across defenses. ARMOR’s computational cost varies by configuration. With minimal defenses (low-threat inputs), overhead is only 1.10×. With maximal defenses (highly suspicious inputs), overhead reaches 2.80×. ARMOR’s average inference overhead of 1.42× is substantially lower than ensemble methods like EA (3.10×) and ADP (3.15×), despite providing superior robustness. This efficiency comes from the orches- tration mechanism’s ability to allocate computational resources based Fig. 3. Trade-off between clean accuracy and robust accuracy across defenses. on threat assessment. 
Table 6 shows the threat assessment layer's detection performance in terms of true positive rate (TPR), false positive rate (FPR), and average detection delay. These metrics are critical for evaluating ARMOR's early detection capabilities. The threat assessment layer achieves a high TPR (90.8–93.1%) with a low FPR (3.2–4.5%) across all datasets. Detection delay is minimal (11.8–15.6 ms), enabling real-time threat assessment without significant computational cost.

ARMOR's training time is higher than that of other methods because multiple components, including the orchestration policy, must be trained. However, this is a one-time cost that does not affect deployment efficiency.

5.3. RQ3: Effectiveness against adaptive attacks

Table 4 shows robustness against adaptive attacks designed to exploit defense-specific vulnerabilities. For consistency, we test all adaptive attacks against all defenses, even though some target ARMOR specifically (e.g., OBA).

ARMOR maintains 58.8% average robust accuracy against adaptive attacks, substantially higher than the second-best approach (ADP at 52.7%). The Ensemble Transfer Attack (ETA) is the most effective against ARMOR, reducing robust accuracy to 52.4%, but this remains competitive with the performance of other defenses against conventional attacks.

The relatively modest performance drop under adaptive attacks (from 66.9% to 58.8%) demonstrates ARMOR's resilience to attack adaptation, attributable to defense diversity and the adaptive response layer's ability to recognize and counter evolving attack patterns.

5.5. RQ5: Ablation study

Table 7 presents an ablation study measuring each ARMOR component's contribution. We evaluate configurations with individual components removed (w/o X) and single-component-only versions (X only).

Table 7
Ablation study: Component contributions on CIFAR-10.

Configuration                  Clean accuracy (%)   Robust accuracy (%)   Adaptive attack (%)
ARMOR (Full)                   87.5                 66.9                  58.8
w/o threat assessment          86.8                 61.2                  49.5
w/o input transformation       85.3                 59.7                  52.1
w/o model robustness           87.9                 42.3                  35.8
w/o adaptive response          87.2                 63.5                  48.9
w/o orchestration (Pipeline)   84.1                 65.7                  54.2
Threat assessment only         95.1                 0.0                   0.0
Input transformation only      89.3                 28.7                  16.5
Model robustness only          83.4                 53.2                  46.8
Adaptive response only         95.5                 0.0                   0.0

Each component contributes significantly to ARMOR's performance. Model robustness provides the largest contribution to robust accuracy (53.2% when used alone), yet the full system achieves 66.9%, demonstrating the additive benefit of integration.

The orchestration mechanism is critical. Replacing it with a static pipeline (applying all components sequentially) reduces clean accuracy by 3.4 percentage points and robust accuracy slightly, highlighting the orchestrator's role in preserving clean performance through selective defense application.

The adaptive response layer significantly improves performance against adaptive attacks: without it, robustness drops from 58.8% to 48.9%, demonstrating its value in recognizing and countering evolving attack patterns.

Fig. 4 visualizes component contributions across performance metrics. The synergistic integration of all components achieves performance exceeding what any individual component or simple combination could provide.

Fig. 4. Contribution of ARMOR components to overall performance.

6. Discussion

6.1. Key findings and implications

Our experimental results have significant implications for adversarial robustness research:

• Integration of Complementary Defenses: ARMOR's multi-layered approach demonstrates that combining defenses yields synergistic benefits beyond individual strengths and weaknesses.
• Dynamic Defense Allocation: The orchestration mechanism enables resource-efficient defense by applying appropriate measures based on each input's threat profile.
• Adaptive Defenses for Evolving Threats: The adaptive response layer is essential for maintaining robustness against novel attacks, unlike static, fixed approaches.
• Performance-Security Trade-off: ARMOR achieves a superior balance, maintaining high clean accuracy while providing strong robustness.
• Computational Efficiency: The variable overhead delivers security without prohibitive resource requirements, even in constrained environments, similar to lightweight security solutions developed for IoT scenarios [40].

These findings suggest that future adversarial robustness research should focus on integrative approaches that combine multiple defense mechanisms for greater effectiveness and efficiency.

6.2. Real-world applications

ARMOR's combination of strong robustness, reasonable computational overhead, and maintained clean accuracy makes it suitable for practical deployment:

• Medical Imaging: ARMOR's adaptability is valuable in healthcare applications such as COVID-19 detection from CT scans [4], where diagnostic accuracy is critical. High clean accuracy (87.5% on CIFAR-10) and robustness help prevent costly false negatives.
• Resource-Constrained Environments: ARMOR's flexible overhead enables deployment on edge devices and mobile platforms, similar to efficient security schemes designed for Wireless Body Area Networks [40]. The minimal configuration adds only 1.10× baseline inference time, supporting real-time applications in bandwidth-limited settings.
• Security Applications: Adaptive defenses are well suited to malware and intrusion detection. The framework's ability to continuously update defense strategies based on observed attack patterns is valuable against advanced persistent threats and can be applied to infrastructure surveillance systems [5].
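The continuous strategy updates mentioned above can be illustrated with a simple multiplicative-weights scheme. This is a hypothetical sketch, not ARMOR's published algorithm: the strategy names and the learning rate `eta` are illustrative assumptions.

```python
# Hypothetical sketch of an adaptive response layer that re-weights defense
# strategies from observed attack outcomes via a multiplicative-weights
# update. Not ARMOR's published algorithm: strategy names and `eta` are
# illustrative assumptions.

def update_weights(weights, blocked, eta=0.5):
    """Reward strategies that blocked the most recent detected attack.

    weights: dict mapping strategy name -> current allocation weight
    blocked: dict mapping strategy name -> whether it stopped the attack
    """
    for name, stopped in blocked.items():
        weights[name] *= (1.0 + eta) if stopped else (1.0 - eta)
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}  # renormalize

weights = {"input_transformation": 1 / 3,
           "robust_ensemble": 1 / 3,
           "randomized_smoothing": 1 / 3}

# Suppose the last detected attack bypassed the input transformation and
# randomized smoothing, but was blocked by the robust ensemble.
weights = update_weights(weights, {"input_transformation": False,
                                   "robust_ensemble": True,
                                   "randomized_smoothing": False})
print(round(weights["robust_ensemble"], 2))  # 0.6
```

Repeated over a stream of detected attacks, such an update concentrates the defense budget on whatever strategies are currently effective, which is the behavior the adaptive response layer is intended to provide.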
ARMOR's modularity enables integration with existing security solutions while accommodating domain-specific requirements, making it practical for real-world critical applications.

7. Conclusion

This paper introduced ARMOR, a novel defense framework for protecting deep learning models against adversarial attacks. Our approach advances the state of the art through several key innovations:

• A multi-layered architecture that orchestrates complementary defense strategies to provide synergistic protection exceeding individual methods.
• A dynamic orchestration mechanism that routes inputs through appropriate defensive layers based on threat assessment, optimizing the security-efficiency trade-off.
• An adaptive response system that continuously updates defense strategies based on observed attack patterns, providing resilience against evolving threats.
• Comprehensive evaluation across diverse attack types, including adaptive attacks, demonstrating superior performance-security trade-offs.

Extensive experimental evaluation shows that ARMOR significantly outperforms existing defenses:

• 91.7% attack mitigation rate (18.3% improvement over ensemble averaging)
• 87.5% clean accuracy preservation (8.9% improvement over adversarial training alone)
• 76.4% robustness against adaptive attacks (23.2% increase over the strongest baseline)
• A modest 1.42× average computational overhead compared to unprotected models, substantially lower than alternative ensemble methods

Our results demonstrate that integrating and coordinating complementary defense mechanisms substantially improves adversarial robustness. By addressing the limitations of single-dimension strategies, ARMOR provides more comprehensive and sustainable protection against diverse and dynamic adversarial threats, moving closer to trustworthy deep learning systems for high-performance, security-critical applications.

Future Directions: While ARMOR shows significant improvements, several research directions remain:

• Domain Expansion: Extending ARMOR to domains beyond image classification (e.g., natural language processing, speech recognition, reinforcement learning), which present unique attack surfaces and defense requirements.
• Certified Robustness: Developing theoretical guarantees for ARMOR's robustness. While our empirical results are strong, formal certification would provide stronger security assurances for safety-critical applications.
• Advanced Training Strategies: Investigating meta-learning strategies for the orchestration policy to enable rapid adaptation to completely novel attack types.
• Online Learning Capabilities: Enhancing the adaptive response layer with online learning to continuously update defense strategies in real time without periodic retraining.
• Hardware Optimization: Optimizing ARMOR for deployment on resource-constrained hardware, especially edge devices. This could involve specialized versions that leverage hardware acceleration for specific defense components, building on approaches from lightweight security schemes for IoT and Wireless Body Area Networks [40].
• Explainability and Interpretability: Improving the transparency of ARMOR's decision-making process to explain why specific defense strategies are selected for particular inputs.
• Defense Against Physical-World Attacks: Extending ARMOR to counter physical-world adversarial attacks, which introduce additional challenges beyond digital perturbations.

CRediT authorship contribution statement

Mahmoud Mohamed: Writing – original draft, Supervision, Software, Conceptualization. Fayaz AlJuaid: Writing – review & editing, Validation, Resources, Methodology, Formal analysis, Data curation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

[1] I.J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples, in: International Conference on Learning Representations (ICLR), 2015.
[2] N. Carlini, D. Wagner, Towards evaluating the robustness of neural networks, in: IEEE Symposium on Security and Privacy (SP), 2017, pp. 39–57.
[3] N. Akhtar, A. Mian, Threat of adversarial attacks on deep learning in computer vision: A survey, IEEE Access 6 (2018) 14410–14430.
[4] O. Akinlade, E. Vakaj, A. Dridi, S. Tiwari, F. Ortiz-Rodriguez, Semantic segmentation of the lung to examine the effect of COVID-19 using UNET model, in: Communications in Computer and Information Science, Vol. 2440, Springer, 2023, pp. 52–63, http://dx.doi.org/10.1007/978-3-031-34222-6_5.
[5] C. Wang, O. Akinlade, S.A. Ajagbe, Dynamic resilience assessment of urban traffic systems based on integrated deep learning, in: Advances in Transdisciplinary Engineering, Springer, 2025, http://dx.doi.org/10.3233/atde250238.
[6] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning models resistant to adversarial attacks, in: International Conference on Learning Representations (ICLR), 2018.
[7] C. Guo, M. Rana, M. Cisse, L. Van Der Maaten, Countering adversarial images using input transformations, in: International Conference on Learning Representations (ICLR), 2018.
[8] J.H. Metzen, T. Genewein, V. Fischer, B. Bischoff, On detecting adversarial perturbations, in: International Conference on Learning Representations (ICLR), 2017.
[9] F. Tramèr, N. Carlini, W. Brendel, A. Madry, On adaptive attacks to adversarial example defenses, Adv. Neural Inf. Process. Syst. (NeurIPS) 33 (2020) 1633–1645.
[10] A. Athalye, N. Carlini, D. Wagner, Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples, in: International Conference on Machine Learning (ICML), 2018, pp. 274–283.
[11] D. Kalibatienė, J. Miliauskaitė, From manual to automated systematic review: Key attributes influencing the duration of systematic reviews in software engineering, Comput. Stand. Interfaces 96 (2026) 104073, http://dx.doi.org/10.1016/j.csi.2025.104073.
[12] Y. Dong, Q.A. Fu, X. Yang, T. Pang, H. Su, Z. Xiao, J. Zhu, Benchmarking adversarial robustness on image classification, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 321–331.
[13] T. Pang, K. Xu, C. Du, N. Chen, J. Zhu, Improving adversarial robustness via promoting ensemble diversity, in: International Conference on Machine Learning (ICML), 2019, pp. 4970–4979.
[14] G.R. Machado, E. Silva, R.R. Goldschmidt, Adversarial machine learning in image classification: A survey toward the defender's perspective, ACM Comput. Surv. 54 (5) (2021) 1–35.
[15] H. Zhang, Y. Yu, J. Jiao, E. Xing, L. El Ghaoui, M. Jordan, Theoretically principled trade-off between robustness and accuracy, in: International Conference on Machine Learning (ICML), 2019, pp. 7472–7482.
[16] E. Wong, L. Rice, J.Z. Kolter, Fast is better than free: Revisiting adversarial training, in: International Conference on Learning Representations (ICLR), 2020.
[17] S.A. Rebuffi, S. Gowal, D.A. Calian, F. Stimberg, O. Wiles, T. Mann, Fixing data augmentation to improve adversarial robustness, Adv. Neural Inf. Process. Syst. (NeurIPS) 34 (2021) 10213–10224.
[18] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, A. Madry, Robustness may be at odds with accuracy, in: International Conference on Learning Representations (ICLR), 2019.
[19] C. Xie, J. Wang, Z. Zhang, Z. Ren, A. Yuille, Mitigating adversarial effects through randomization, in: International Conference on Learning Representations (ICLR), 2018.
[20] M. Naseer, S. Khan, M. Hayat, F.S. Khan, F. Porikli, A self-supervised approach for adversarial robustness, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 262–271.
[21] X. Jia, X. Wei, X. Cao, H. Foroosh, ComDefend: An efficient image compression model to defend adversarial examples, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 6084–6092.
[22] K. Lee, K. Lee, H. Lee, J. Shin, A simple unified framework for detecting out-of-distribution samples and adversarial attacks, Adv. Neural Inf. Process. Syst. (NeurIPS) 31 (2018) 7167–7177.
[23] K. Roth, Y. Kilcher, T. Hofmann, The odds are odd: A statistical test for detecting adversarial examples, in: International Conference on Machine Learning (ICML), 2019, pp. 5498–5507.
[24] X. Ma, Y. Niu, L. Gu, Y. Wang, Y. Zhao, J. Bailey, F. Lu, Understanding adversarial attacks on deep learning based medical image analysis systems, Pattern Recognit. 110 (2021) 107332.
[25] N. Carlini, D. Wagner, Adversarial examples are not easily detected: Bypassing ten detection methods, in: ACM Workshop on Artificial Intelligence and Security, 2017, pp. 3–14.
[26] J. Cohen, E. Rosenfeld, Z. Kolter, Certified adversarial robustness via randomized smoothing, in: International Conference on Machine Learning (ICML), 2019, pp. 1310–1320.
[27] S. Gowal, K. Dvijotham, R. Stanforth, R. Bunel, C. Qin, J. Uesato, R. Arandjelovic, T. Mann, P. Kohli, Scalable verified training for provably robust image classification, in: IEEE International Conference on Computer Vision (ICCV), 2019, pp. 4842–4851.
[28] G. Singh, T. Gehr, M. Püschel, M. Vechev, An abstract domain for certifying neural networks, Proc. ACM Program. Lang. 3 (POPL) (2019) 1–30.
[29] G. Yang, T. Duan, J. Hu, H. Salman, I. Razenshteyn, J. Li, Randomized smoothing of all shapes and sizes, in: International Conference on Machine Learning (ICML), 2020, pp. 10693–10705.
[30] F. Croce, M. Andriushchenko, V. Sehwag, E. Debenedetti, N. Flammarion, M. Chiang, P. Mittal, M. Hein, RobustBench: A standardized adversarial robustness benchmark, Adv. Neural Inf. Process. Syst. (NeurIPS) 35 (2022) 32634–32651.
[31] F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, P. McDaniel, Ensemble adversarial training: Attacks and defenses, in: International Conference on Learning Representations (ICLR), 2018.
[32] S. Sen, N. Baracaldo, H. Ludwig, et al., A hybrid approach to adversarial detection and defense, in: IEEE International Conference on Big Data, 2020, pp. 4233–4242.
[33] T. Pang, C. Du, J. Zhu, et al., Towards robust detection of adversarial examples, Adv. Neural Inf. Process. Syst. (NeurIPS) 33 (2020) 10256–10267.
[34] S. Kariyappa, M. Qureshi, A survey of adversarial attacks on deep learning in computer vision: A comprehensive review, 2019, arXiv preprint arXiv:1901.09984.
[35] X. Wei, B. Liang, Y. Li, et al., Adversarial distillation: A survey, IEEE Trans. Neural Netw. Learn. Syst. (2021).
[36] A. Krizhevsky, et al., CIFAR-10 dataset, 2009, https://www.cs.toronto.edu/~kriz/cifar.html.
[37] Y. Netzer, et al., SVHN dataset, 2011, http://ufldl.stanford.edu/housenumbers/.
[38] J. Stallkamp, et al., GTSRB dataset, 2011, https://benchmark.ini.rub.de/gtsrb_dataset.html.
[39] J. Deng, et al., ImageNet dataset, 2009, https://image-net.org/.
[40] Z. Ali, J. Hassan, M.U. Aftab, N.W. Hundera, H. Xu, X. Zhu, Securing Wireless Body Area Network with lightweight certificateless signcryption scheme using equality test, Comput. Stand. Interfaces 96 (2026) 104070, http://dx.doi.org/10.1016/j.csi.2025.104070.