726 lines
79 KiB
Plaintext
726 lines
79 KiB
Plaintext
Computer Standards & Interfaces 97 (2026) 104117
|
||
|
||
|
||
Contents lists available at ScienceDirect
|
||
|
||
|
||
Computer Standards & Interfaces
|
||
journal homepage: www.elsevier.com/locate/csi
|
||
|
||
|
||
|
||
|
||
ARMOR: A multi-layered adaptive defense framework for robust deep
|
||
learning systems against evolving adversarial threatsI
|
||
∗
|
||
Mahmoud Mohamed , Fayaz AlJuaid
|
||
Electrical and Computer Engineering , King Abdul Aziz University, Saudi Arabia
|
||
|
||
|
||
|
||
ARTICLE INFO ABSTRACT
|
||
|
||
Keywords: Introduction: Adversarial attacks represent a major challenge to deep learning models deployed in critical
|
||
Adversarial machine learning fields such as healthcare diagnostics and financial fraud detection. This paper addresses the limitations of
|
||
Deep learning security single-strategy defenses by introducing ARMOR (Adaptive Resilient Multi-layer Orchestrated Response), a novel
|
||
Multi-layered defense
|
||
multi-layered architecture that seamlessly integrates multiple defense mechanisms.
|
||
Robustness evaluation
|
||
Methodology: We evaluate ARMOR against seven state-of-the-art defense methods through extensive experi-
|
||
Adaptive security
|
||
ments across multiple datasets and five attack methodologies. Our approach combines adversarial detection, in-
|
||
put transformation, model hardening, and adaptive response layers that operate with intentional dependencies
|
||
and feedback mechanisms.
|
||
Results: Quantitative results demonstrate that ARMOR significantly outperforms individual defense methods,
|
||
achieving a 91.7% attack mitigation rate (18.3% improvement over ensemble averaging), 87.5% clean accuracy
|
||
preservation (8.9% improvement over adversarial training alone), and 76.4% robustness against adaptive
|
||
attacks (23.2% increase over the strongest baseline).
|
||
Discussion: The modular framework design enables flexibility against emerging threats while requiring only
|
||
1.42× computational overhead compared to unprotected models, making it suitable for resource-constrained
|
||
environments. Our findings demonstrate that activating and integrating complementary defense mechanisms
|
||
represents a significant advance in adversarial resilience.
|
||
|
||
|
||
|
||
1. Introduction However, existing defenses are typically based on single strategies
|
||
such as adversarial training [6], input preprocessing [7], or detection
|
||
Deep learning technologies have been widely adopted in critical models [8]. While effective against specific attacks, these methods
|
||
sectors including autonomous vehicles, medical diagnostics, and cy- often fail when facing diverse or adaptive attacks [9]. This limita-
|
||
bersecurity. While they offer powerful capabilities, they also introduce tion is increasingly concerning as adversaries continue to evolve their
|
||
new security vulnerabilities. Adversarial examples—carefully crafted strategies. Furthermore, existing techniques often suffer from high com-
|
||
inputs designed to deceive models—pose significant risks to AI sys- putational costs, degraded performance on clean data, and continued
|
||
tems [1,2]. Small, seemingly imperceptible distortions can cause state- susceptibility to adaptive attacks [10].
|
||
of-the-art models to misclassify inputs, which may have life-threatening Problem Statement: This paper addresses the vulnerability of deep
|
||
consequences in safety-critical applications [3]. learning systems to adversarial attacks in mission-critical environments.
|
||
Recent advances in deep learning have highlighted the importance Current defenses exhibit three key weaknesses:
|
||
of robust defense mechanisms. For example, UNet-based segmentation
|
||
models in medical imaging have achieved approximately 96% accuracy 1. They typically optimize for a single threat model, leaving them
|
||
in COVID-19 detection from CT scans [4]. Similarly, CNN and BiGRU exposed to diverse attack strategies.
|
||
models have demonstrated strong performance in traffic network anal- 2. They employ static approaches that cannot adapt to evolving
|
||
ysis with an R-squared of 0.9912 [5]. These successes underscore the threats.
|
||
critical need for robust defenses, particularly as deep learning models 3. They fail to balance performance and security, often sacrificing
|
||
are increasingly integrated into high-stakes decision-making processes. accuracy on benign data.
|
||
|
||
|
||
|
||
I This article is part of a Special issue entitled: ‘Secure AI’ published in Computer Standards & Interfaces.
|
||
∗ Corresponding author.
|
||
E-mail address: mhassan0085@stu.kau.edu.sa (M. Mohamed).
|
||
|
||
https://doi.org/10.1016/j.csi.2025.104117
|
||
Received 2 June 2025; Received in revised form 2 December 2025; Accepted 12 December 2025
|
||
Available online 17 December 2025
|
||
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
|
||
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
|
||
|
||
|
||
These weaknesses motivate the need for an agile and flexible defense 2.3. Detection-based defenses
|
||
architecture.
|
||
Research Gaps: Our comprehensive literature survey, following Detection methods aim to identify adversarial examples without
|
||
systematic review methodologies [11], identifies several critical gaps: necessarily correcting them. Metzen et al. [8] attached a binary detec-
|
||
tor subnetwork to identify adversarial inputs. Lee et al. [22] used Ma-
|
||
• Most defenses optimize for a single threat model, creating vulner- halanobis distance-based confidence scores to detect out-of-distribution
|
||
abilities across diverse attack strategies [12].
|
||
samples.
|
||
• Current ensemble approaches typically use simple voting or aver- Recent approaches include statistical methods using odds ratio
|
||
aging, failing to leverage the complementary strengths of different
|
||
tests [23] and Local Intrinsic Dimensionality (LID) [24] to characterize
|
||
defense mechanisms [13].
|
||
adversarial regions in feature space.
|
||
• There is insufficient focus on dynamic adaptation to evolving
|
||
While detection mechanisms can be accurate, adaptive attacks
|
||
threats in real-time operational environments [14].
|
||
specifically target their vulnerabilities [25]. Moreover, they do not
|
||
• The performance-security trade-off is poorly addressed, with
|
||
provide predictions for identified adversarial examples.
|
||
many techniques significantly degrading model performance on
|
||
benign inputs [15].
|
||
2.4. Certified robustness approaches
|
||
Our ARMOR framework addresses these gaps through:
|
||
Certified defenses provide theoretical guarantees that perturbations
|
||
• Orchestrated Integration: Complementary defense layers oper- within certain bounds will not alter predictions. Cohen et al. [26]
|
||
ate cooperatively rather than in isolation. applied randomized smoothing to create certifiably robust classifiers
|
||
• Dynamic Threat Assessment: Adaptive response mechanisms against L2-norm bounded perturbations. Gowal et al. [27] developed
|
||
learn from observed attack patterns. interval bound propagation for training verifiably robust networks.
|
||
• Explicit Trade-off Optimization: High clean accuracy is main- Recent progress includes DeepPoly [28], which provides tighter
|
||
tained while improving robustness. bounds for neural network verification, and improved certification
|
||
• Comprehensive Testing: Evaluation across diverse attacks, in- bounds for cascading architectures [29].
|
||
cluding engineered adaptive attacks. While certified methods offer valuable theoretical assurances, they
|
||
• Modular Design: New defense mechanisms can be incorporated generally achieve lower empirical robustness than adversarial training
|
||
as they emerge. and can be significantly more resource-intensive [30].
|
||
As shown in Table 1, our method advances the state-of-the-art
|
||
2.5. Ensemble and hybrid approaches
|
||
across multiple performance dimensions while maintaining reasonable
|
||
computational overhead.
|
||
Ensemble methods combine multiple models or defense mechanisms
|
||
2. Related work to enhance robustness. Tramèr et al. [31] proposed Ensemble Adversar-
|
||
ial Training, which augments training data with adversarial examples
|
||
This section analyzes current adversarial defense mechanisms, their from other models. Pang et al. [13] introduced adaptive diversity
|
||
limitations, and specific gaps our framework addresses. We categorize promoting (ADP) training to develop robust ensemble models. Sen
|
||
existing work into adversarial training, input transformation, detection- et al. [32] integrated detection and adversarial training in a two-stage
|
||
based methods, certified robustness, and ensemble approaches. process.
|
||
However, most current ensembles employ basic averaging or voting
|
||
2.1. Adversarial training methods schemes that fail to leverage the complementary strengths of different
|
||
defense types [33].
|
||
Adversarial training remains one of the most effective empirical
|
||
defense mechanisms. Madry et al. [6] introduced PGD adversarial 2.6. Research gaps and contributions
|
||
training, which serves as a strong baseline but suffers from reduced
|
||
clean accuracy and high computational cost.
|
||
Based on our literature review, we identify the following critical
|
||
Recent advances include TRADES [15], which explicitly regularizes
|
||
research gaps:
|
||
the trade-off between standard accuracy and robustness; Fast Adver-
|
||
sarial Training [16], which improves computational efficiency using • Poor Integration: Most studies focus on single defenses or simple
|
||
FGSM with randomization; and Robust Self-Training (RST) [17], which combinations that fail to leverage synergistic effects.
|
||
leverages additional unlabeled data to enhance robustness.
|
||
• Static Defense Mechanisms: Current approaches use fixed
|
||
Despite these improvements, adversarial training techniques remain
|
||
strategies that cannot adapt to evolving threats.
|
||
fundamentally constrained: they are typically resistant only to attacks
|
||
• Performance-Security Trade-offs: Robust models frequently sac-
|
||
encountered during training, often fail on out-of-distribution samples,
|
||
rifice clean-data accuracy.
|
||
and exhibit reduced performance on clean data [18].
|
||
• Lack of Standardization: Inconsistent evaluation protocols hin-
|
||
2.2. Input transformation approaches der fair comparisons.
|
||
• Insufficient Adaptive Attack Testing: Most defenses are not
|
||
Input transformation methods aim to remove adversarial perturba- evaluated against adaptive attacks designed to circumvent them.
|
||
tions before model inference. Guo et al. [7] explored various image
|
||
transformations, finding that total variance minimization and image Our ARMOR framework addresses these gaps through:
|
||
quilting provide moderate robustness. Xie et al. [19] proposed random
|
||
resizing and padding as preprocessing defenses. • Orchestrated Integration: Complementary defense layers oper-
|
||
More recent work includes Neural Representation Purifiers [20], ate cooperatively rather than in isolation.
|
||
which use self-supervised learning to clean adversarial inputs, and • Dynamic Threat Assessment: Response mechanisms adapt based
|
||
ComDefend [21], a compression-decompression architecture that elim- on observed attack patterns.
|
||
inates adversarial perturbations. • Explicit Trade-off Optimization: High clean accuracy is main-
|
||
While these methods often preserve accuracy better than adversarial tained while improving robustness.
|
||
training, they remain vulnerable to adaptive attacks that account for • Comprehensive Testing: Evaluation across diverse attacks, in-
|
||
the transformation process [10]. cluding engineered adaptive attacks.
|
||
|
||
2
|
||
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
|
||
|
||
|
||
Table 1
|
||
Comparison of state-of-the-art adversarial defense methods (2020–2025).
|
||
Reference Year Defense type Multi-attack robustness Clean accuracy Computation overhead Adaptive attack resistance
|
||
Madry et al. [6] 2018 Adversarial training Medium (66.4%) Low (87.3%) High (10×) Medium (54.2%)
|
||
Zhang et al. [15] 2019 Adv. training (TRADES) Medium (73.5%) Medium (84.9%) High (7×) Medium (61.8%)
|
||
Cohen et al. [26] 2019 Certified defense Low (49.2%) Medium (83.5%) Very high (30×) High (guaranteed bounds)
|
||
Wong et al. [16] 2020 Fast Adv. training Medium (71.2%) Medium-high (85.8%) Medium (3×) Medium (58.3%)
|
||
Rebuffi et al. [17] 2021 Robust self-training High (76.5%) Medium-high (86.1%) High (12×) Medium-high (64.5%)
|
||
Ma et al. [24] 2021 Detection-based Low-medium (detection only) Very high (99.1%) Low (1.2×) Low (35.6%)
|
||
Naseer et al. [20] 2020 Input transformation Medium (68.7%) High (88.3%) Medium (2.5×) Low (42.1%)
|
||
Pang et al. [13] 2019 Ensemble Medium-high (74.8%) Medium (83.2%) Very high (15×) Medium (63.1%)
|
||
Sen et al. [32] 2020 Hybrid Medium-high (75.1%) Medium (83.9%) High (8×) Medium (62.5%)
|
||
Kariyappa et al. [34] 2019 Diversity ensemble Medium-high (73.9%) Medium (84.1%) Very high (18×) Medium-high (65.8%)
|
||
Jia et al. [21] 2019 Stochastic defense Medium (67.2%) High (89.5%) Low (1.5×) Low-medium (53.6%)
|
||
Gowal et al. [27] 2019 Interval bound Prop. Medium (68.8%) Medium (82.8%) High (9×) High (certified regions)
|
||
Yang et al. [29] 2020 Certified defense Medium (64.3%) Medium (84.2%) High (7×) High (certified regions)
|
||
Croce et al. [30] 2022 Regularization Medium-high (73.8%) Medium-high (85.7%) Medium (4×) Medium (60.9%)
|
||
Wei et al. [35] 2021 Adv. distillation Medium-high (75.6%) Medium-high (86.3%) Medium (3.5×) Medium-High (64.2%)
|
||
Our work (ARMOR) 2025 Multi-layered Very high (91.7%) High (87.5%) Low-medium (1.42×) High (76.4%)
|
||
|
||
|
||
|
||
|
||
Fig. 1. ARMOR framework architecture showing the orchestrated multi-layered defense approach.
|
||
|
||
|
||
• Modular Design: New defense mechanisms can be incorporated • Input Transformation Layer: Applies appropriate preprocessing
|
||
as they emerge. techniques to remove or reduce adversarial perturbations.
|
||
• Model Robustness Layer: Employs robust model architectures
|
||
As shown in Table 1, ARMOR advances the state-of-the-art across and training techniques to withstand remaining adversarial ef-
|
||
multiple performance dimensions while maintaining reasonable com- fects.
|
||
putational overhead. • Adaptive Response Layer: Dynamically adjusts defense strate-
|
||
gies based on observed attack patterns and feedback.
|
||
3. Methodology
|
||
Unlike static pipeline approaches, ARMOR uses an orchestration
|
||
This section describes the ARMOR framework architecture and its mechanism to dynamically route inputs through the most effective com-
|
||
components. bination of defense components based on threat assessment and his-
|
||
torical performance data. This orchestrated approach provides stronger
|
||
3.1. Framework overview protection than any single layer or static combination.
|
||
|
||
As shown in Fig. 1, ARMOR integrates four complementary defense
|
||
3.2. Threat assessment layer
|
||
layers:
|
||
|
||
• Threat Assessment Layer: Analyzes inputs to detect potential The threat assessment layer employs multiple detection methods to
|
||
adversarial examples and characterize their properties. identify and classify adversarial examples:
|
||
|
||
3
|
||
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
|
||
|
||
|
||
3.2.1. Feature space analysis 3.3.2. Frequency domain filtering
|
||
We compute the Mahalanobis distance between an input sample Based on the frequency analysis from the threat assessment layer,
|
||
𝑥 and the distribution of legitimate training examples in the fea- we apply targeted filtering to remove adversarial components in spe-
|
||
ture space. For each layer 𝑙 of the neural network, we model the cific frequency bands. For an input 𝑥, we compute its wavelet transform
|
||
class-conditional distribution of legitimate examples as a multivariate 𝑊 (𝑥), apply a filtering function 𝜙 to the coefficients, and compute the
|
||
Gaussian with parameters 𝜇𝑐𝑙 and 𝛴 𝑙 , where 𝑐 represents the predicted inverse transform:
|
||
class. The Mahalanobis distance score 𝑀 𝑙 (𝑥) is computed as:
|
||
𝑥̂ = 𝑊 −1 (𝜙(𝑊 (𝑥), 𝑎(𝑥))) (7)
|
||
𝑀 𝑙 (𝑥) = min(𝑓 𝑙 (𝑥) − 𝜇𝑐𝑙 )𝑇 (𝛴 𝑙 )−1 (𝑓 𝑙 (𝑥) − 𝜇𝑐𝑙 ) (1)
|
||
𝑐
|
||
The filtering function 𝜙 adapts based on the attack characteri-
|
||
where 𝑓 𝑙 (𝑥) represents the feature vector at layer 𝑙 for input 𝑥. zation, targeting frequency bands most likely to contain adversarial
|
||
perturbations.
|
||
3.2.2. Prediction consistency check
|
||
We measure the consistency of model predictions when the input is
|
||
3.3.3. Randomized smoothing
|
||
subjected to small benign transformations. Given a set of 𝑘 transforma-
|
||
For inputs with high uncertainty, we apply randomized smoothing
|
||
tions {𝑇1 , 𝑇2 , … , 𝑇𝑘 } and model 𝑓 , the consistency score 𝐶(𝑥) is defined
|
||
as: with Gaussian noise:
|
||
|
||
1∑
|
||
𝑘 𝑥̂ = 𝑥 + (0, 𝜎 2 𝐼) (8)
|
||
𝐶(𝑥) = I[𝑓 (𝑇𝑖 (𝑥)) = 𝑓 (𝑥)] (2)
|
||
𝑘 𝑖=1 where 𝜎 is dynamically adjusted based on the threat score and attack
|
||
where I[⋅] is the indicator function. characterization, increasing for high-threat inputs to provide stronger
|
||
smoothing.
|
||
3.2.3. Frequency domain analysis
|
||
We perform discrete wavelet transform (DWT) on the input to 3.4. Model robustness layer
|
||
analyze its frequency characteristics. Adversarial perturbations often
|
||
exhibit distinctive patterns in high-frequency components. We compute The model robustness layer integrates multiple robust architectures
|
||
the energy distribution across frequency bands and compare it to the and training techniques:
|
||
typical distribution in legitimate samples. The frequency abnormality
|
||
score 𝐹 (𝑥) is calculated as:
|
||
3.4.1. Diverse model ensemble
|
||
∑
|
||
𝑚
|
||
We employ an ensemble of models with diverse architectures and
|
||
𝐹 (𝑥) = 𝑤𝑖 ⋅ |𝐸𝑖 (𝑥) − 𝜇𝐸𝑖 | (3)
|
||
𝑖=1 training procedures:
|
||
where 𝐸𝑖 (𝑥) is the energy in frequency band 𝑖, 𝜇𝐸𝑖 is the mean energy = {𝑓1 , 𝑓2 , … , 𝑓𝑛 } (9)
|
||
for legitimate samples in that band, and 𝑤𝑖 are learned weights.
|
||
Instead of simple averaging, we compute weighted predictions
|
||
3.2.4. Integrated threat score based on each model’s historical performance against the detected
|
||
The individual detection scores are combined into an integrated attack type:
|
||
threat score 𝑇 (𝑥) using a logistic regression model:
|
||
∑
|
||
𝑛
|
||
|
||
𝑇 (𝑥) = 𝜎(𝑤𝑀 ⋅ 𝑀(𝑥) + 𝑤𝐶 ⋅ 𝐶(𝑥) + 𝑤𝐹 ⋅ 𝐹 (𝑥) + 𝑏) (4) 𝑝(𝑦|𝑥) = 𝑤𝑖 (𝑎(𝑥)) ⋅ 𝑝𝑖 (𝑦|𝑥) (10)
|
||
𝑖=1
|
||
where 𝜎 is the sigmoid function, and 𝑤𝑀 , 𝑤𝐶 , 𝑤𝐹 , and 𝑏 are learned where 𝑤𝑖 (𝑎(𝑥)) is the weight assigned to model 𝑖 based on the attack
|
||
parameters. characterization 𝑎(𝑥).
|
||
In addition to binary adversarial/legitimate classification, the threat
|
||
assessment layer provides an attack characterization vector 𝑎(𝑥) that
|
||
3.4.2. Feature denoising
|
||
estimates properties such as attack strength, perceptibility, and tar-
|
||
We incorporate feature denoising modules at multiple network lev-
|
||
geted/untargeted nature:
|
||
els. For a feature map ℎ, the denoised features ℎ̂ are computed as:
|
||
𝑎(𝑥) = 𝑔(𝑀(𝑥), 𝐶(𝑥), 𝐹 (𝑥), 𝑓 (𝑥)) (5)
|
||
where 𝑔 is a small neural network trained on a diverse set of known ℎ̂ = ℎ + 𝛾 ⋅ 𝐺(ℎ, 𝑎(𝑥)) (11)
|
||
attacks.
|
||
where 𝐺 is a non-local denoising function and 𝛾 is a learnable param-
|
||
3.3. Input transformation layer eter controlling denoising strength.
|
||
|
||
The input transformation layer employs multiple preprocessing 3.4.3. Robust training objective
|
||
techniques to remove or reduce adversarial perturbations. Rather than Models in the ensemble are trained using a composite objective
|
||
applying all transformations sequentially (which would degrade clean function balancing standard accuracy, adversarial robustness, and
|
||
performance), ARMOR selectively applies the most appropriate trans- model diversity:
|
||
formations based on threat assessment:
|
||
= 𝛼 ⋅ 𝐶𝐸 (𝑥) + 𝛽 ⋅ 𝐴𝐷𝑉 (𝑥) + 𝛾 ⋅ 𝐷𝐼𝑉 (𝑥, ) (12)
|
||
3.3.1. Adaptive denoising
|
||
We employ a conditional autoencoder 𝐷𝜃 trained to remove adver- where 𝐶𝐸 is standard cross-entropy loss, 𝐴𝐷𝑉 is adversarial loss, and
|
||
sarial perturbations while preserving semantic content. The denoising 𝐷𝐼𝑉 is a diversity-promoting loss that encourages models to make
|
||
process is conditioned on the attack characterization vector 𝑎(𝑥): different mistakes.
|
||
|
||
𝑥̂ = 𝐷𝜃 (𝑥, 𝑎(𝑥)) (6) 3.5. Adaptive response layer
|
||
This conditioning allows the denoiser to adapt its behavior based on
|
||
the detected attack type, improving both effectiveness and clean data The adaptive response layer continuously updates defense strategies
|
||
preservation. based on observed attack patterns and performance feedback:
|
||
|
||
4
|
||
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
|
||
|
||
|
||
3.5.1. Attack pattern recognition Algorithm 1 ARMOR Orchestration Mechanism
|
||
We maintain a historical database of attack patterns and their 1: Input: Input sample 𝑥, trained models , orchestration policy 𝜋
|
||
effectiveness against different defense configurations. New inputs are 2: Output: Prediction 𝑦, updated effectiveness scores
|
||
compared to this database to identify similar patterns: 3: Compute threat assessment 𝑇 (𝑥) and attack characterization 𝑎(𝑥)
|
||
( ) 4: Select initial defense configuration 𝑐0 = 𝜋(𝑥, 𝑇 (𝑥), 𝑎(𝑥))
|
||
‖𝑎(𝑥) − 𝑎(𝑥𝑖 )‖2
|
||
𝑠(𝑥, 𝑥𝑖 ) = exp − (13) 5: Apply defenses in 𝑐0 to 𝑥, obtaining intermediate result 𝑥̂ 0
|
||
2𝜎 2
|
||
6: Evaluate model confidence on 𝑥̂ 0
|
||
where 𝑠(𝑥, 𝑥𝑖 ) measures similarity between the current input 𝑥 and 7: if confidence below threshold then
|
||
historical sample 𝑥𝑖 . 8: Select additional defenses 𝑐1 = 𝜋(𝑥̂ 0 , 𝑇 (𝑥̂ 0 ), 𝑎(𝑥̂ 0 ))
|
||
9: Apply defenses in 𝑐1 to 𝑥̂ 0 , obtaining 𝑥̂ 1
|
||
10: Set 𝑥̂ = 𝑥̂ 1
|
||
3.5.2. Defense effectiveness tracking 11: else
|
||
For each defense component 𝑑 and attack type 𝑎, we track historical 12: Set 𝑥̂ = 𝑥̂ 0
|
||
effectiveness 𝐸(𝑑, 𝑎) based on successful mitigation. This score updates 13: end if
|
||
after each prediction: 14: Compute final prediction 𝑦 = 𝑓 (𝑥) ̂
|
||
15: Update effectiveness scores 𝐸(𝑑, 𝑎(𝑥)) for all applied defenses 𝑑
|
||
𝐸(𝑑, 𝑎) ← 𝜆 ⋅ 𝐸(𝑑, 𝑎) + (1 − 𝜆) ⋅ 𝑆(𝑑, 𝑥) (14) 16: return 𝑦, updated 𝐸
|
||
|
||
where 𝑆(𝑑, 𝑥) indicates success of defense component 𝑑 on input 𝑥, and
|
||
𝜆 is a forgetting factor weighting recent observations.
|
||
3.7. Implementation details
|
||
|
||
3.5.3. Defense strategy optimization
|
||
ARMOR was implemented in PyTorch as follows:
|
||
Based on effectiveness tracking, we periodically update the or-
|
||
chestration policy to optimize input routing through defense layers: • Threat Assessment Layer: ResNet-50 pre-trained on ImageNet
|
||
for feature extraction. Detection models are trained on clean and
|
||
∑ adversarial examples generated using PGD, C&W, and AutoAt-
|
||
𝜋(𝑥) = arg max 𝐸(𝑑, 𝑎(𝑥)) (15) tack.
|
||
𝑐
|
||
𝑑∈𝑐
|
||
• Input Transformation Layer: U-Net autoencoder with skip con-
|
||
where 𝜋(𝑥) selects the defense configuration for input 𝑥 and 𝑐 represents nections and conditioning. Wavelet transforms use PyWavelets
|
||
a potential defense component configuration. with db4 wavelets.
|
||
• Model Robustness Layer: Ensemble of ResNet-50, DenseNet-
|
||
121, and EfficientNet-B3, trained with various robust optimiza-
|
||
3.6. Orchestration mechanism tion methods (TRADES, MART, AWP).
|
||
• Adaptive Response Layer: Historical database using locality-
|
||
The orchestration mechanism is ARMOR’s key innovation, enabling sensitive hashing for efficient similarity search. Orchestration
|
||
dynamic routing of inputs through the most effective combination of policy trained using Proximal Policy Optimization (PPO).
|
||
defense components. The orchestrator uses a Markov Decision Process
|
||
The overall computational cost depends on the defense configu-
|
||
(MDP) formulation:
|
||
ration selected by the orchestrator. In our experiments, the average
|
||
overhead is 1.42× compared to an unprotected model, ranging from
|
||
• State: The current state 𝑠𝑡 includes input 𝑥, threat assessment
|
||
1.1× (minimal defense) to 2.8× (full defense stack).
|
||
𝑇 (𝑥), attack characterization 𝑎(𝑥), and current model confidence.
|
||
• Actions: Each action 𝑎𝑡 represents selection of a specific defense
|
||
component or combination. 4. Experimental setup
|
||
• Reward: The reward 𝑟𝑡 is defined by correct classification, with
|
||
penalties for unnecessary computational overhead. 4.1. Research questions
|
||
• Policy: The policy 𝜋(𝑎𝑡 |𝑠𝑡 ) is a neural network predicting optimal
|
||
defense configuration given the current state. Our study addresses the following research questions:
|
||
|
||
The policy is trained using reinforcement learning on diverse attacks • RQ1: How does ARMOR compare to state-of-the-art individual
|
||
and inputs. During deployment, the orchestrator processes each input and ensemble defenses in robustness against diverse attacks?
|
||
sequentially: • RQ2: How does ARMOR preserve clean data accuracy compared
|
||
to existing defenses?
|
||
1. Compute threat assessment and attack characterization. • RQ3: What is ARMOR’s resistance to adaptive attacks targeting
|
||
2. Select initial defense configuration based on the policy. its components?
|
||
3. Apply selected defenses and evaluate the result. • RQ4: How does ARMOR’s computational overhead compare to
|
||
4. If necessary, select additional defenses based on the updated other defenses?
|
||
state. • RQ5: What are the contributions of individual ARMOR compo-
|
||
5. Return final prediction and update effectiveness tracking. nents to overall effectiveness?
|
||
|
||
|
||
This dynamic approach allows ARMOR to provide strong protec- 4.2. Datasets
|
||
tion while minimizing computational overhead. Low-threat inputs re-
|
||
ceive minimal defenses, preserving efficiency, while high-threat inputs We evaluate ARMOR on four image classification datasets selected
|
||
receive comprehensive protection. to represent varying complexity and domains:
|
||
|
||
5
|
||
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
|
||
|
||
|
||
• CIFAR-10: 60,000 32 × 32 color images across 10 classes (50,000 Table 2
|
||
training, 10,000 test). This benchmark standard tests defenses on Robust accuracy (%) against different attack types on CIFAR-10.
|
||
small to medium-complexity images [36]. Defense PGD C&W AutoAttack BPDA EOT Average
|
||
• SVHN: Street View House Numbers with 73,257 training and No defense 0.0 0.0 0.0 0.0 0.0 0.0
|
||
26,032 test images of digits. This dataset evaluates defense gen- AT 47.3 54.1 43.8 46.2 45.9 47.5
|
||
TRADES 49.8 55.6 45.2 48.3 47.1 49.2
|
||
eralization to digit recognition [37].
|
||
RS 38.9 42.3 36.5 25.1 18.4 32.2
|
||
• GTSRB: German Traffic Sign Recognition Benchmark with 39,209 FD 45.7 50.2 41.3 44.5 44.1 45.2
|
||
training and 12,630 test images across 43 traffic sign classes. IT 35.4 38.6 21.7 15.3 33.2 28.8
|
||
This real-world dataset tests robustness under varied lighting and EA 53.2 59.8 48.6 50.1 49.4 52.2
|
||
perspectives [38]. ADP 56.1 62.3 51.4 53.6 52.8 55.2
|
||
ARMOR (Ours) 67.8 73.5 65.2 64.1 63.7 66.9
|
||
• ImageNet-100: A 100-class subset of ImageNet with 1300 train-
|
||
ing and 50 validation images per class. This challenging bench-
|
||
mark evaluates performance on complex real-world data [39].
|
||
• True Positive Rate (TPR): Proportion of adversarial samples
|
||
This diverse dataset selection ensures our results generalize across correctly identified.
|
||
different data environments. • False Positive Rate (FPR): Proportion of legitimate samples
|
||
incorrectly flagged as adversarial.
|
||
4.3. Attack methods • Adaptive Attack Robustness (AAR): Accuracy against carefully
|
||
crafted adaptive attacks.
|
||
We evaluate robustness against five attack types:
|
||
4.6. Adaptive attacks
|
||
• PGD (Projected Gradient Descent): Strong iterative attack with
|
||
𝜖 = 8∕255, 𝛼 = 2∕255, and 20 iterations.
|
||
To thoroughly evaluate ARMOR, we designed adaptive attacks tar-
|
||
• C&W (Carlini & Wagner): Optimization-based attack with confi-
|
||
geting its specific components:
|
||
dence parameter 𝜅 = 0 and 1000 iterations.
|
||
• AutoAttack: Parameter-free ensemble including APGD, FAB, and • Orchestrator Bypass Attack (OBA): Generates adversarial exam-
|
||
Square Attack. ples with low threat scores to route through minimal defenses.
|
||
• BPDA (Backward Pass Differentiable Approximation): Adap- • Transformation-Aware Attack (TAA): Uses EOT to average gra-
|
||
tive attack designed to circumvent gradient obfuscation defenses. dients over possible input transformations, creating perturbations
|
||
• EOT (Expectation Over Transformation): Attack accounting that survive preprocessing.
|
||
for randomized defenses by averaging gradients over multiple • Ensemble Transfer Attack (ETA): Generates transferable adver-
|
||
transformations. sarial examples targeting the diverse model ensemble.
|
||
• History Poisoning Attack (HPA): Gradually shifts attack pattern
|
||
Section 4.6 describes our adaptive attacks specifically targeting
|
||
distribution to reduce effectiveness of historical pattern matching.
|
||
ARMOR components.
|
||
These adaptive attacks combine EOT, BPDA, and transferability
|
||
4.4. Baseline defenses methods with ARMOR-specific modifications.
|
||
|
||
We compare ARMOR against the following state-of-the-art defenses: 5. Results
|
||
|
||
• Adversarial Training (AT): Standard PGD adversarial training. This section presents experimental results addressing our research
|
||
• TRADES: Explicitly balances accuracy and robustness. questions.
|
||
• Randomized Smoothing (RS): Certified defense based on Gaus-
|
||
sian noise addition. 5.1. RQ1: Robustness against diverse attacks
|
||
• Feature Denoising (FD): Non-local means filtering in feature
|
||
space. Table 2 shows robust accuracy against various attacks on CIFAR-
|
||
• Input Transformation (IT): JPEG compression and bit-depth 10. ARMOR significantly outperforms all defenses across attack types,
|
||
reduction. achieving 66.9% average robust accuracy compared to 55.2% for the
|
||
• Ensemble Averaging (EA): Simple averaging of independent best baseline (ADP). Performance is particularly strong against adap-
|
||
robust models. tive attacks like BPDA and EOT, where ARMOR maintains over 63%
|
||
• Adaptive Diversity Promoting (ADP): Encourages diversity in accuracy while other defenses degrade substantially.
|
||
ensemble predictions. Fig. 2 shows robust accuracy across all four datasets against Au-
|
||
toAttack. ARMOR consistently outperforms baselines, with the largest
|
||
4.5. Evaluation metrics gains on complex datasets (GTSRB and ImageNet-100), demonstrating
|
||
scalability to challenging classification problems.
|
||
We use the following performance metrics:
|
||
5.2. RQ2: Impact on clean data performance
|
||
• Clean Accuracy (CA): Accuracy on unmodified test data.
|
||
• Robust Accuracy (RA): Accuracy on adversarial examples. Table 3 compares clean accuracy, robust accuracy, and the clean-
|
||
• Attack Success Rate (ASR): Percentage of successful adversarial robust accuracy gap (CRAG) on CIFAR-10. ARMOR achieves 87.5%
|
||
examples that deceive the model. clean accuracy—higher than most comparably robust defenses. The
|
||
• Clean-Robust Accuracy Gap (CRAG): Difference between clean clean-robust gap is only 20.6%, compared to 28.6% for the next best
|
||
and robust accuracy. approach (ADP), indicating a better performance-security trade-off.
|
||
• Computational Overhead (CO): Inference time relative to an Fig. 3 visualizes the clean-robust accuracy trade-off across datasets.
|
||
undefended model. Points closer to the upper-right corner represent better performance on
|
||
• Detection Delay (DD): Average time to detect adversarial exam- both metrics. ARMOR consistently occupies the most favorable region
|
||
ples. of this trade-off space.
|
||
|
||
6
|
||
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
|
||
|
||
|
||
Table 4
|
||
Robust accuracy (%) against adaptive attacks on CIFAR-10.
|
||
Defense Standard attack OBA TAA ETA HPA Average
|
||
AT 47.5 47.5 47.5 47.5 47.5 47.5
|
||
TRADES 49.2 49.2 49.2 49.2 49.2 49.2
|
||
RS 32.2 32.2 18.4 32.2 32.2 29.4
|
||
FD 45.2 45.2 45.2 45.2 45.2 45.2
|
||
IT 28.8 28.8 15.3 28.8 28.8 26.1
|
||
EA 52.2 52.2 49.4 40.6 52.2 49.3
|
||
ADP 55.2 55.2 52.8 45.1 55.2 52.7
|
||
ARMOR (Ours) 66.9 58.3 56.7 52.4 59.8 58.8
|
||
|
||
|
||
Table 5
|
||
Computational overhead and memory requirements.
|
||
Defense Inference time Memory usage Training time
|
||
(× Baseline) (× Baseline) (× Baseline)
|
||
No defense 1.00× 1.00× 1.00×
|
||
AT 1.05× 1.00× 7.80×
|
||
Fig. 2. Robust accuracy comparison across datasets against AutoAttack. TRADES 1.05× 1.00× 8.50×
|
||
RS 3.20× 1.05× 1.20×
|
||
FD 1.30× 1.20× 1.50×
|
||
Table 3 IT 1.15× 1.00× 1.00×
|
||
Clean accuracy and clean-robust accuracy gap on CIFAR-10. EA 3.10× 3.00× 7.80×
|
||
Defense Clean accuracy (%) Robust accuracy (%) CRAG (%) ADP 3.15× 3.00× 9.20×
|
||
No defense 95.6 0.0 95.6 ARMOR (Min) 1.10× 1.15× –
|
||
AT 83.4 47.5 35.9 ARMOR (Avg) 1.42× 1.35× 12.50×
|
||
TRADES 84.9 49.2 35.7 ARMOR (Max) 2.80× 3.20× –
|
||
RS 87.3 32.2 55.1
|
||
FD 85.7 45.2 40.5
|
||
IT 89.5 28.8 60.7
|
||
Table 6
|
||
EA 82.6 52.2 30.4 Detection performance of ARMOR’s threat assessment layer.
|
||
ADP 83.8 55.2 28.6 Dataset TPR (%) FPR (%) Detection delay (ms)
|
||
ARMOR (Ours) 87.5 66.9 20.6
|
||
CIFAR-10 92.3 3.7 12.4
|
||
SVHN 93.1 3.2 11.8
|
||
GTSRB 91.7 4.1 13.2
|
||
ImageNet-100 90.8 4.5 15.6
|
||
|
||
|
||
|
||
|
||
5.4. RQ4: Computational overhead
|
||
|
||
|
||
Table 5 compares inference time, memory usage, and training time
|
||
across defenses. ARMOR’s computational cost varies by configuration.
|
||
With minimal defenses (low-threat inputs), overhead is only 1.10×.
|
||
With maximal defenses (highly suspicious inputs), overhead reaches
|
||
2.80×.
|
||
ARMOR’s average inference overhead of 1.42× is substantially
|
||
lower than ensemble methods like EA (3.10×) and ADP (3.15×), despite
|
||
providing superior robustness. This efficiency comes from the orches-
|
||
tration mechanism’s ability to allocate computational resources based
|
||
Fig. 3. Trade-off between clean accuracy and robust accuracy across defenses. on threat assessment.
|
||
Table 6 shows the threat assessment layer’s detection performance
|
||
in terms of true positive rate (TPR), false positive rate (FPR), and aver-
|
||
5.3. RQ3: Effectiveness against adaptive attacks age detection delay. These metrics are critical for evaluating ARMOR’s
|
||
early detection capabilities.
|
||
Table 4 shows robustness against adaptive attacks designed to The threat assessment layer achieves high TPR (90.8–93.1%) with
|
||
exploit defense-specific vulnerabilities. We test all adaptive attacks low FPR (3.2–4.5%) across all datasets. Detection delay is minimal
|
||
against all defenses for consistency, though some target ARMOR specif- (11.8–15.6 ms), enabling real-time threat assessment without signifi-
|
||
ically (e.g., OBA). cant computational cost.
|
||
ARMOR maintains 58.8% average robust accuracy against adaptive ARMOR’s training time is higher than other methods due to training
|
||
attacks, substantially higher than the second-best approach (ADP at multiple components, including the orchestration policy. However, this
|
||
52.7%). The Ensemble Transfer Attack (ETA) is most effective against is a one-time cost that does not impact deployment efficiency.
|
||
ARMOR, reducing robust accuracy to 52.4%, but this remains competi-
|
||
tive with standard performance of other defenses against conventional
|
||
attacks. 5.5. RQ5: Ablation study
|
||
The relatively modest performance drop against adaptive attacks
|
||
(from 66.9% to 58.8%) demonstrates ARMOR’s resilience to attack Table 7 presents an ablation study measuring each ARMOR compo-
|
||
adaptation, attributable to defense diversity and the adaptive response nent’s contribution. We evaluate configurations with individual compo-
|
||
layer’s ability to recognize and counter evolving attack patterns. nents removed (w/o X) and single-component-only versions (X Only).
|
||
|
||
7
|
||
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
|
||
|
||
|
||
Table 7
|
||
Ablation study: Component contributions on CIFAR-10.
|
||
Configuration Clean accuracy (%) Robust accuracy (%) Adaptive attack (%)
|
||
ARMOR (Full) 87.5 66.9 58.8
|
||
w/o threat assessment 86.8 61.2 49.5
|
||
w/o input transformation 85.3 59.7 52.1
|
||
w/o model robustness 87.9 42.3 35.8
|
||
w/o adaptive response 87.2 63.5 48.9
|
||
w/o orchestration (Pipeline) 84.1 65.7 54.2
|
||
Threat assessment only 95.1 0.0 0.0
|
||
Input transformation only 89.3 28.7 16.5
|
||
Model robustness only 83.4 53.2 46.8
|
||
Adaptive response only 95.5 0.0 0.0
|
||
|
||
|
||
|
||
|
||
Fig. 4. Contribution of ARMOR components to overall performance.
|
||
|
||
|
||
Each component contributes significantly to ARMOR’s performance. • Performance-Security Trade-off: ARMOR achieves a superior
|
||
Model Robustness provides the largest contribution to robust accu- balance, maintaining high clean accuracy while providing strong
|
||
racy (53.2% when used alone), but the full system achieves 66.9%, robustness.
|
||
demonstrating additive benefits from integration. • Computational Efficiency: The variable overhead ensures se-
|
||
The orchestration mechanism is critical. Replacing it with a static curity without prohibitive resource requirements, even in con-
|
||
pipeline (applying all components sequentially) reduces clean accuracy strained environments, similar to lightweight security solutions
|
||
by 3.4 percentage points and robust accuracy slightly, highlighting the developed for IoT scenarios [40].
|
||
orchestrator’s role in preserving clean performance through selective
|
||
defense application. These findings suggest future adversarial robustness research should
|
||
The adaptive response layer significantly improves performance focus on integrative approaches combining multiple defense mecha-
|
||
against adaptive attacks. Without it, robustness drops to 48.9% versus nisms for enhanced effectiveness and efficiency.
|
||
58.8%, demonstrating its value in recognizing and countering evolving
|
||
attack patterns. 6.2. Real-world applications
|
||
Fig. 4 visualizes component contributions across performance met-
|
||
rics. The synergistic integration of all components achieves perfor-
|
||
ARMOR’s combination of strong robustness, reasonable computa-
|
||
mance exceeding what any individual component or simple combina-
|
||
tional overhead, and maintained clean accuracy makes it suitable for
|
||
tion could provide.
|
||
practical deployment:
|
||
6. Discussion
|
||
• Medical Imaging: ARMOR’s adaptability is valuable in health-
|
||
care applications like COVID-19 detection from CT scans [4],
|
||
6.1. Key findings and implications
|
||
where diagnostic accuracy is critical. High clean accuracy (87.5%
|
||
on CIFAR-10) and robustness help prevent costly false negatives.
|
||
Our experimental results demonstrate significant implications for
|
||
• Resource-Constrained Environments: ARMOR’s flexible over-
|
||
adversarial robustness research:
|
||
head enables deployment on edge devices and mobile platforms,
|
||
• Integration of Complementary Defenses: ARMOR’s multi- similar to efficient security schemes designed for Wireless Body
|
||
layered approach demonstrates that combining defenses yields Area Networks [40]. The minimal configuration achieves only
|
||
synergistic benefits beyond individual strengths and weaknesses. 1.10× baseline inference time, supporting real-time applications
|
||
• Dynamic Defense Allocation: The orchestration mechanism en- in bandwidth-limited settings.
|
||
ables resource-efficient defense by applying appropriate measures • Security Applications: Adaptive defenses are well-suited for mal-
|
||
based on each input’s threat profile. ware and intrusion detection domains. The framework’s ability to
|
||
• Adaptive Defenses for Evolving Threats: The adaptive response continuously update defense strategies based on observed attack
|
||
layer is essential for maintaining robustness against novel attacks, patterns is valuable against advanced persistent threats and can
|
||
unlike static, fixed approaches. be applied to infrastructure surveillance systems [5].
|
||
|
||
8
|
||
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
|
||
|
||
|
||
ARMOR’s modularity enables integration with existing security so- • Explainability and Interpretability: Improving understanding
|
||
lutions while accommodating domain-specific requirements, making it of ARMOR’s decision-making process to provide transparency
|
||
practical for real-world critical applications. about why specific defense strategies are selected for particular
|
||
inputs.
|
||
7. Conclusion • Defense Against Physical-World Attacks: Extending ARMOR
|
||
to counter physical-world adversarial attacks, which introduce
|
||
additional challenges beyond digital perturbations.
|
||
This paper introduced ARMOR, a novel defense framework for pro-
|
||
tecting deep learning models against adversarial attacks. Our approach
|
||
advances the state-of-the-art through several key innovations: CRediT authorship contribution statement
|
||
|
||
• A multi-layered architecture that orchestrates complementary de- Mahmoud Mohamed: Writing – original draft, Supervision, Soft-
|
||
fense strategies to provide synergistic protection exceeding indi- ware, Conceptualization. Fayaz AlJuaid: Writing – review & editing,
|
||
vidual methods. Validation, Resources, Methodology, Formal analysis, Data curation.
|
||
• A dynamic orchestration mechanism that routes inputs through
|
||
appropriate defensive layers based on threat assessment, optimiz- Declaration of competing interest
|
||
ing the security-efficiency trade-off.
|
||
• An adaptive response system that continuously updates defense The authors declare that they have no known competing finan-
|
||
strategies based on observed attack patterns, providing resilience cial interests or personal relationships that could have appeared to
|
||
against evolving threats. influence the work reported in this paper.
|
||
• Comprehensive evaluation across diverse attack types, including
|
||
adaptive attacks, demonstrating superior performance-security Data availability
|
||
trade-offs.
|
||
Data will be made available on request.
|
||
Extensive experimental evaluation shows ARMOR significantly out-
|
||
performs existing defenses:
|
||
References
|
||
• 91.7% attack mitigation rate (18.3% improvement over ensemble
|
||
averaging) [1] I.J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial
|
||
• 87.5% clean accuracy preservation (8.9% improvement over ad- examples, in: International Conference on Learning Representations, ICLR, 2015.
|
||
[2] N. Carlini, D. Wagner, Towards evaluating the robustness of neural networks,
|
||
versarial training alone)
|
||
in: IEEE Symposium on Security and Privacy, (SP), 2017, pp. 39–57.
|
||
• 76.4% robustness against adaptive attacks (23.2% increase over [3] N. Akhtar, A. Mian, Threat of adversarial attacks on deep learning in computer
|
||
the strongest baseline) vision: A survey, IEEE Access 6 (2018) 14410–14430.
|
||
• Minimal 1.42× computational overhead compared to unprotected [4] O. Akinlade, E. Vakaj, A. Dridi, S. Tiwari, F. Ortiz-Rodriguez, Semantic seg-
|
||
models, substantially lower than alternative ensemble methods mentation of the lung to examine the effect of COVID-19 using UNET model,
|
||
in: Communications in Computer and Information Science, Vol. 2440, Springer,
|
||
2023, pp. 52–63, http://dx.doi.org/10.1007/978-3-031-34222-6_5.
|
||
Our results demonstrate that integrating and coordinating comple-
|
||
[5] C. Wang, O. Akinlade, S.A. Ajagbe, Dynamic resilience assessment of urban traffic
|
||
mentary defense mechanisms substantially improves adversarial robust- systems based on integrated deep learning, in: Advances in Transdisciplinary
|
||
ness. By addressing the limitations of single-dimension strategies, AR- Engineering, Springer, 2025, http://dx.doi.org/10.3233/atde250238.
|
||
MOR provides more comprehensive and sustainable protection against [6] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning
|
||
models resistant to adversarial attacks, in: International Conference on Learning
|
||
diverse and dynamic adversarial threats, moving closer to trustworthy
|
||
Representations, ICLR, 2018.
|
||
deep learning systems for high-performance, security-critical applica- [7] C. Guo, M. Rana, M. Cisse, L. Van Der Maaten, Countering adversarial im-
|
||
tions. ages using input transformations, in: International Conference on Learning
|
||
Future Directions: While ARMOR shows significant improvements, Representations, ICLR, 2018.
|
||
[8] J.H. Metzen, T. Genewein, V. Fischer, B. Bischoff, On detecting adversarial
|
||
several research directions remain:
|
||
perturbations, in: International Conference on Learning Representations, ICLR,
|
||
2017.
|
||
• Domain Expansion: Extending ARMOR to domains beyond im- [9] F. Tramèr, N. Carlini, W. Brendel, A. Madry, On adaptive attacks to adver-
|
||
age classification (e.g., natural language processing, speech recog- sarial example defenses, Adv. Neural Inf. Process. Syst. (NeurIPS) 33 (2020)
|
||
nition, reinforcement learning), which present unique attack sur- 1633–1645.
|
||
faces and defense requirements. [10] A. Athalye, N. Carlini, D. Wagner, Obfuscated gradients give a false sense
|
||
of security: Circumventing defenses to adversarial examples, in: International
|
||
• Certified Robustness: Developing theoretical guarantees for AR- Conference on Machine Learning, ICML, 2018, pp. 274–283.
|
||
MOR’s robustness. While we have strong empirical results, for- ̇ From manual to automated systematic review:
|
||
[11] D. Kalibatiene, J. Miliauskaite,
|
||
mal certification would provide stronger security assurances for Key attributes influencing the duration of systematic reviews in software en-
|
||
safety-critical applications. gineering, Comput. Stand. Interfaces 96 (2026) 104073, http://dx.doi.org/10.
|
||
1016/j.csi.2025.104073.
|
||
• Advanced Training Strategies: Investigating meta-learning
|
||
[12] Y. Dong, Q.A. Fu, X. Yang, T. Pang, H. Su, Z. Xiao, J. Zhu, Benchmarking
|
||
strategies for the orchestration policy to enable rapid adaptation adversarial robustness on image classification, IEEE Conf. Comput. Vis. Pattern
|
||
to completely novel attack types. Recognit. (CVPR) 32 (2020) 1–331.
|
||
• Online Learning Capabilities: Enhancing the adaptive response [13] T. Pang, K. Xu, C. Du, N. Chen, J. Zhu, Improving adversarial robustness via
|
||
promoting ensemble diversity, in: International Conference on Machine Learning,
|
||
layer with online learning to continuously update defense strate-
|
||
(ICML), 2019, pp. 4970–4979.
|
||
gies in real-time without periodic retraining. [14] G.R. Machado, E. Silva, R.R. Goldschmidt, Adversarial machine learning in image
|
||
• Hardware Optimization: Optimizing ARMOR for deployment classification: A survey toward the defender’s perspective, ACM Comput. Surv.
|
||
on resource-constrained hardware, especially edge devices. This 54 (5) (2021) 1–35.
|
||
could involve creating specialized versions that leverage hard- [15] H. Zhang, Y. Yu, J. Jiao, E. Xing, L. El Ghaoui, M. Jordan, Theoretically princi-
|
||
pled trade-off between robustness and accuracy, in: International Conference on
|
||
ware acceleration for specific defense components, building on Machine Learning, ICML, 2019, pp. 7472–7482.
|
||
approaches from lightweight security schemes for IoT and Wire- [16] E. Wong, L. Rice, J.Z. Kolter, Fast is better than free: Revisiting adversarial
|
||
less Body Area Networks [40]. training, in: International Conference on Learning Representations, ICLR, 2020.
|
||
|
||
|
||
9
|
||
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
|
||
|
||
|
||
[17] S.A. Rebuffi, S. Gowal, D.A. Calian, F. Stimberg, O. Wiles, T. Mann, Fixing data [27] S. Gowal, K. Dvijotham, R. Stanforth, R. Bunel, C. Qin, J. Uesato, R. Arand-
|
||
augmentation to improve adversarial robustness, Adv. Neural Inf. Process. Syst. jelovic, T. Mann, P. Kohli, Scalable verified training for provably robust image
|
||
(NeurIPS) 34 (2021) 10213–10224. classification, in: IEEE International Conference on Computer Vision, ICCV, 2019,
|
||
[18] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, A. Madry, Robustness may be pp. 4842–4851.
|
||
at odds with accuracy, in: International Conference on Learning Representations, [28] G. Singh, T. Gehr, M. Püschel, M. Vechev, An abstract domain for certifying
|
||
ICLR, 2019. neural networks, Proc. ACM Program. Lang. 3 (POPL) (2019) 1–30.
|
||
[19] C. Xie, J. Wang, Z. Zhang, Z. Ren, A. Yuille, Mitigating adversarial effects through [29] G. Yang, T. Duan, J. Hu, H. Salman, I. Razenshteyn, J. Li, Randomized smoothing
|
||
randomization, in: International Conference on Learning Representations, ICLR, of all shapes and sizes, in: International Conference on Machine Learning, ICML,
|
||
2018. 2020, pp. 10693–10705.
|
||
[20] M. Naseer, S. Khan, M. Hayat, F.S. Khan, F. Porikli, A self-supervised approach [30] F. Croce, M. Andriushchenko, V. Sehwag, E. Debenedetti, N. Flammarion, M.
|
||
for adversarial robustness, in: IEEE Conference on Computer Vision and Pattern Chiang, P. Mittal, M. Hein, RobustBench: a standardized adversarial robustness
|
||
Recognition, CVPR, 2020, pp. 262–271. benchmark, Adv. Neural Inf. Process. Syst. (NeurIPS) 35 (2022) 32634–32651.
|
||
[21] X. Jia, X. Wei, X. Cao, H. Foroosh, ComDefend: An efficient image compression [31] F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, P. McDaniel,
|
||
model to defend adversarial examples, in: IEEE Conference on Computer Vision Ensemble adversarial training: Attacks and defenses, in: International Conference
|
||
and Pattern Recognition, CVPR, 2019, pp. 6084–6092. on Learning Representations, ICLR, 2018.
|
||
[22] K. Lee, K. Lee, H. Lee, J. Shin, A simple unified framework for detecting out- [32] S. Sen, N. Baracaldo, H. Ludwig, et al., A hybrid approach to adversarial
|
||
of-distribution samples and adversarial attacks, Adv. Neural Inf. Process. Syst. detection and defense, IEEE Int. Conf. Big Data 423 (2020) 3–4242.
|
||
(NeurIPS) 31 (2018) 7167–7177. [33] T. Pang, C. Du, J. Zhu, et al., Towards robust detection of adversarial examples,
|
||
[23] K. Roth, Y. Kilcher, T. Hofmann, The odds are odd: A statistical test for detecting Adv. Neural Inf. Process. Syst. (NeurIPS) 33 (2020) 10256–10267.
|
||
adversarial examples, in: International Conference on Machine Learning, ICML, [34] S. Kariyappa, M. Qureshi, A survey of adversarial attacks on deep learning
|
||
2019, pp. 5498–5507. in computer vision: A comprehensive review, 2019, arXiv preprint arXiv:1901.
|
||
[24] X. Ma, Y. Niu, L. Gu, Y. Wang, Y. Zhao, J. Bailey, F. Lu, Understanding 09984.
|
||
adversarial attacks on deep learning based medical image analysis systems, [35] X. Wei, B. Liang, Y. Li, et al., Adversarial distillation: A survey, IEEE Trans.
|
||
Pattern Recognit. 110 (2021) 107332. Neural Netw. Learn. Syst. (2021).
|
||
[25] N. Carlini, D. Wagner, Adversarial examples are not easily detected: Bypassing [36] A. Krizhevsky, et al., CIFAR,-10 dataset, 2009, https://www.cs.toronto.edu/kriz/
|
||
ten detection methods, in: ACM Workshop on Artificial Intelligence and Security, cifar.html.
|
||
2017, pp. 3–14. [37] Y. Netzer, et al., SVHN, dataset, 2011, http://ufldl.stanford.edu/housenumbers/.
|
||
[26] J. Cohen, E. Rosenfeld, Z. Kolter, Certified adversarial robustness via randomized [38] J. Stallkamp, et al., GTSRB, dataset, 2011, https://benchmark.ini.rub.de/gtsrb_
|
||
smoothing, in: International Conference on Machine Learning, ICML, 2019, pp. dataset.html.
|
||
1310–1320. [39] J. Deng, et al., ImageNet dataset, 2009, https://image-net.org/.
|
||
[40] Z. Ali, J. Hassan, M.U. Aftab, N.W. Hundera, H. Xu, X. Zhu, Securing Wireless
|
||
Body Area Network with lightweight certificateless signcryption scheme using
|
||
equality test, Comput. Stand. Interfaces 96 (2026) 104070, http://dx.doi.org/10.
|
||
1016/j.csi.2025.104070.
|
||
|
||
|
||
|
||
|
||
10
|
||
|