Computer Standards & Interfaces 97 (2026) 104111
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
Refining decision boundaries via dynamic label adversarial training for robust traffic classification

Haoyu Tong a,c,d, Meixia Miao b,c,d, Yundong Liu a,c,d, Xiaoyu Zhang a,c,d,∗, Xiangyang Luo c,d, Willy Susilo e

a State Key Laboratory of Integrated Service Networks (ISN), Xidian University, Xi'an 710121, China
b School of Cyberspace Security, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
c Key Laboratory of Cyberspace Security, Ministry of Education of China, Zhengzhou 450001, China
d Henan Key Laboratory of Cyberspace Situation Awareness, Zhengzhou 450001, China
e School of Computing and Information Technology, University of Wollongong, Wollongong, Australia
ARTICLE INFO ABSTRACT
Keywords: Network traffic classification plays a critical role in securing modern communication systems, as it enables
Traffic classification the identification of malicious or abnormal patterns within traffic data. With the growing complexity of
Adversarial examples network environments, deep learning models have emerged as a compelling solution due to their ability to
Adversarial training
automatically learn discriminative representations from raw traffic. However, these models are highly vulner-
Label noise
able to adversarial examples, which can significantly degrade their performance by introducing imperceptible
perturbations. While adversarial training (AT) has emerged as a primary defense, it often suffers from label
noise, particularly when hard labels are forcibly assigned to adversarial examples whose true class may be
ambiguous. In this work, we first analyze the detrimental effect of label noise on adversarial training, revealing
that forcing hard labels onto adversarial examples can cause excessive shifts of the decision boundary away
from the adversarial examples, which in turn degrades the models generalization. Motivated by the theoretical
analysis, we propose Dynamic Label Adversarial Training (DLAT), a novel AT framework that mitigates label
noise via dynamically mixed soft labels. DLAT interpolates the logits of clean and adversarial examples
to estimate the labels of boundary-adjacent examples, which are then used as soft labels for adversarial
examples. By adaptively aligning the decision boundary toward the vicinity of adversarial examples, the
framework constrains unnecessary boundary shifts and alleviates generalization degradation caused by label
noise. Extensive evaluations on network traffic classification benchmarks validate the effectiveness of DLAT in
outperforming standard adversarial training and its variants in both robustness and generalization.
1. Introduction

Network traffic classification, which aims to determine the application or service associated with observed traffic packets, flows, or sessions, serves as a fundamental building block in a wide range of networking tasks, including intrusion detection, quality-of-service management, and traffic engineering [1,2]. In the early stages of network management, classification was carried out mainly through port-based identification [3,4] and deep packet inspection (DPI) [5,6]. However, these traditional approaches have become increasingly ineffective due to the widespread use of dynamic port allocation, encrypted communication protocols, and intentional obfuscation techniques [7,8]. As network environments become more complex and security-conscious, there is a growing demand for more intelligent and adaptive classification methods that do not rely on payload visibility or fixed port mappings.

In recent years, deep learning (DL) [9] has become a dominant paradigm for network traffic classification due to its ability to automatically extract the underlying representations from raw or lightly processed traffic data [10-14]. Compared to traditional statistical or machine learning approaches that rely heavily on manual feature engineering, deep neural networks, including convolutional, recurrent, and Transformer-based architectures, can effectively capture spatial and temporal patterns in traffic data, enabling high accuracy even in challenging scenarios such as previously unseen traffic. However,
This article is part of a Special Issue entitled "Secure AI" published in Computer Standards & Interfaces.
∗ Corresponding author at: State Key Laboratory of Integrated Service Networks (ISN), Xidian University, Xi'an 710121, China.
E-mail addresses: haoyutong@stu.xidian.edu.cn (H. Tong), miaofeng415@163.com (M. Miao), yundongliu@stu.xidian.edu.cn (Y. Liu),
xiaoyuzhang@xidian.edu.cn (X. Zhang), xiangyangluo@126.com (X. Luo), wsusilo@uow.edu.au (W. Susilo).
https://doi.org/10.1016/j.csi.2025.104111
Received 26 October 2025; Received in revised form 29 November 2025; Accepted 8 December 2025
Available online 13 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
despite their impressive performance, deep learning-based classifiers remain highly susceptible to adversarial examples. These are deliberately crafted inputs with imperceptible perturbations that cause models to misclassify [15,16]. In the context of traffic classification, adversarial perturbations can manipulate flow-level features or packet sequences in ways that evade detection without disrupting the underlying communication protocols. To mitigate this vulnerability, adversarial training has been widely adopted as a defense mechanism by introducing adversarial examples during model training to enhance robustness [17].

While adversarial training is effective in many domains, applying it to traffic classification poses unique challenges. Unlike natural image domains, traffic data distributions typically exhibit higher intrinsic dimensionality and more complex manifold structures. Different application protocols often share significant common subsequences at the byte level, creating naturally entangled features that separate classes through subtle statistical patterns rather than distinct visual characteristics. Furthermore, unlike images, where semantic meaning is often locally correlated, traffic features exhibit long-range dependencies across packet sequences, making them particularly sensitive to small, strategically placed perturbations. These characteristics cause even minor perturbations to readily shift traffic samples across class boundaries, leading to significant label noise during training. This issue is further exacerbated by standard adversarial training practices [18], which introduce perturbed examples into the training set while still assigning them the same labels as their clean examples, thereby intensifying the semantic mismatch between the true and assigned labels.

Traditional adversarial training typically enforces the original hard label on adversarial examples. While effective to some extent, this rigid supervision introduces significant label noise, especially when adversarial examples cross or approach decision boundaries. Consequently, the decision boundary is pushed away from perturbed examples, often reinforcing the robustness of the class in which the adversarial example is located at the expense of others. This imbalance undermines the overall robustness of the model, particularly in tasks such as traffic classification, where class semantics are inherently ambiguous and sensitive to perturbations.

To address this issue, we propose Dynamic Label Adversarial Training (DLAT), a novel adversarial training framework designed to mitigate the adverse effects of excessive label noise in robust network traffic classification. Rather than rigidly assigning the original hard label to adversarial examples, DLAT constructs soft labels for examples near decision boundaries through a similarity-guided strategy that takes advantage of the model's output distributions. Such soft labels help guide the decision boundary toward the neighborhood of adversarial examples, rather than forcing it away due to overconfident and potentially incorrect supervision. Instead of explicitly approximating the decision boundary using computationally intensive techniques, such as multi-step adversarial attacks with decaying step sizes, DLAT leverages the similarity between the output logits of clean and perturbed inputs to estimate the soft labels of the examples near the decision boundary. Since the similarity between their output distributions reflects how close the adversarial example lies to the current decision boundary, it serves as a reliable proxy for boundary proximity. Based on this similarity, DLAT interpolates between the model's predictions on the clean and adversarial inputs. When adversarial and clean outputs are closely aligned, the soft label remains closer to the clean prediction; conversely, greater divergence triggers a softer supervisory signal that better reflects the model's uncertainty regarding the adversarial input. This adaptive labeling mechanism mitigates the semantic distortion introduced by fixed-label training, thus reducing the risk of reinforcing incorrect decision boundaries and improving robustness under label noise. Specifically, DLAT computes the similarity between the output distributions of clean and adversarial examples to guide the interpolation between their corresponding logits. When the adversarial example is far from the boundary, a larger weight is assigned to the clean prediction. In contrast, when it is close to the boundary, more weight is allocated to the adversarial output. This similarity-guided interpolation enables precise estimation of soft labels for boundary-adjacent examples, which in turn facilitates more accurate adjustment of the decision boundary. By avoiding rigid supervision with hard labels, this adaptive labeling mechanism mitigates semantic distortion and helps the model learn more robust decision surfaces under label noise. Our key contributions are outlined as follows:

• We extend the understanding of label noise in adversarial training to the domain of network traffic classification. The compact and entangled distribution of traffic data makes it vulnerable to small perturbations, increasing the likelihood of label inconsistency in adversarial examples. This inconsistency corresponds to a higher degree of label noise, which enforces incorrect alignment and impedes the learning of robust decision boundaries.
• We provide a theoretical characterization of how hard-label supervision on shifted adversarial examples induces excessive movement of the decision boundary. Specifically, enforcing high-confidence predictions for adversarial examples distorts the classifier, increasing the risk of misclassification for nearby examples from other classes.
• We introduce a novel adversarial training method called DLAT, which dynamically assigns soft labels to adversarial examples based on their estimated proximity to the decision boundary. Instead of assigning uniform soft labels or incurring high computational overhead through explicit boundary detection, DLAT estimates soft labels through interpolation between clean and adversarial examples, substantially reducing the cost of label generation.

2. Related work

2.1. Traffic classification

Traffic classification, the task of identifying and categorizing network traffic based on application types, has evolved significantly over the years. Traditional methods such as port-based classification and payload inspection (DPI) were initially dominant but became ineffective due to dynamic port allocation, encryption, and protocol obfuscation. Statistical and machine learning-based approaches later emerged, leveraging flow-level features (e.g., packet size, inter-arrival time) to classify encrypted and unencrypted traffic. However, these methods still relied on manual feature engineering, which is time-consuming and error-prone. The advent of DNNs revolutionized traffic classification by automating feature extraction and improving accuracy. Lotfollahi et al. [10] first applied deep learning to the field of traffic classification; by leveraging stacked autoencoders (SAE) and CNN architectures, their method enables automatic extraction of network traffic features and achieves efficient classification of encrypted network traffic. Subsequent studies have advanced DL-based traffic classification in both accuracy and applicability. Wang et al. [19] proposed an end-to-end 1D-CNN model that processes raw packet bytes to capture spatial patterns, eliminating the need for manual feature design. Lan et al. [20] combined 1D-CNN, Bi-LSTM, and multi-head attention to classify darknet traffic, leveraging side-channel features to enhance robustness. LEXNet [21] further improved deployment efficiency by introducing a lightweight and interpretable CNN with residual connections and a prototype layer, enabling real-time inference on edge devices without sacrificing accuracy. Liu et al. [22] introduced TransECA-Net, an innovative hybrid architecture combining ECANet-enhanced CNN modules with Transformer encoders to simultaneously extract local channel-wise features and global temporal dependencies.
2.2. Adversarial example attacks and defense

While deep learning has significantly advanced traffic classification, it inherits the inherent vulnerabilities of DNNs and is susceptible to adversarial example attacks. Adversarial examples are inputs deliberately modified with subtle perturbations that cause the model to produce incorrect predictions while remaining imperceptible to human observers. This vulnerability also poses serious challenges to the security and reliability of DL-based traffic classification systems, highlighting the need for robust defense methods. Szegedy et al. [23] first revealed this weakness by formulating an optimization problem to find minimal perturbations that cause misclassification, attributing the phenomenon to local linearity in deep networks. Goodfellow et al. [15] introduced the Fast Gradient Sign Method (FGSM), which efficiently generates adversarial examples by leveraging the linear approximation of the loss function. Kurakin et al. [24] extended FGSM to an iterative version (BIM) to improve attack success. Madry et al. [17] further enhanced this with Projected Gradient Descent (PGD), adding random initialization to avoid local optima and establish a robust attack benchmark. Carlini and Wagner [25] proposed a strong optimization-based attack, C&W, that effectively bypasses gradient-masking defenses. Sadeghzadeh et al. [16] extended adversarial attacks to the traffic classification field, proposing the adversarial pad attack and the adversarial payload attack for packet and flow classification respectively, as well as the adversarial burst attack targeting the statistical characteristics of flow time series.

Adversarial training (AT) is a widely adopted defense strategy to enhance DNNs' robustness against such adversarial attacks by incorporating adversarial examples into the training process. Proposed by Goodfellow et al. [15], AT initially used FGSM adversarial examples combined with clean examples for optimization. Madry et al. [17] showed that stronger PGD-based adversarial examples provide better robustness through a min-max optimization. However, PGD training often leads to overfitting on adversarial examples and reduced accuracy on clean data, highlighting a trade-off between robustness and generalization. To address this, Zhang et al. [26] introduced TRADES to balance this trade-off with a regularized loss. Wang et al. [27] proposed MART, which treats misclassified examples differently to enhance robustness. Dong et al. [28] developed AWP, combining input and weight perturbations to flatten the loss landscape and further reduce robust error. However, the aforementioned methods were originally proposed for image classification tasks and are not specifically designed for robust traffic classification. Directly applying these methods to traffic classification may not yield optimal results. For example, adversarial training applied to traffic data frequently induces substantial label noise, and inadequate management of such noise can considerably hinder the enhancement of model robustness.

3. Preliminaries

3.1. Pre-processing

Consider a raw network traffic flow as a discrete byte-level sequence of arbitrary length. Formally, a raw traffic flow is defined as a variable-length sequence:

F = (b_1, b_2, ..., b_L),  (1)

where L ∈ N+ denotes the sequence length, and each byte b_i ∈ Z_256 = {0, 1, ..., 255}. The flow F thus resides in the input space S = ∪_{k=1}^{∞} Z_256^k, which encompasses all finite-length byte sequences. Following the methodology proposed by [19], each raw traffic flow F is standardized to a fixed length of 784 bytes to enable batch processing and compatibility with convolutional neural networks. Specifically, the transformation pipeline Ψ: S → Z_256^{28×28} consists of:

Truncation. To standardize the size of the input dimensions of the model, we truncate the flow to the first 784 bytes:

τ_k(F) = (b_1, ..., b_{min(L,k)}),  k = 784.  (2)

Zero-Padding. For flows with L < 784, zero-padding is applied to ensure uniform dimensionality:

π_784(F) = (b_1, ..., b_L, 0, ..., 0)  if L < 784;  τ_784(F)  otherwise.  (3)

Image Mapping. The resulting 784-dimensional vector is reshaped into a 28 × 28 grayscale image in row-major order. We define the mapping Φ: Z_256^784 → Z_256^{28×28} as:

Φ(f) = [ b_1    b_2    ...  b_28
         b_29   b_30   ...  b_56
         ...    ...    ...  ...
         b_757  b_758  ...  b_784 ],  (4)

where f = π_784(F) is the padded byte vector. This bijection arranges bytes row-by-row into a square image.

Normalization. Finally, pixel values are normalized to the range [0, 1]:

N(Φ(f))_{i,j} = Φ(f)_{i,j} / 255.  (5)

The resulting tensor x = N(Φ(π_784(F))) ∈ [0, 1]^{28×28} is used as the input to downstream neural models.

3.2. Notation

Let x ∈ [0, 1]^{28×28} denote the resulting input image. The neural network takes x as input and outputs either class predictions (e.g., traffic type or application label) or binary decisions (e.g., benign vs. malicious), depending on the task. Consider a K-class classification task on the dataset D = {(x_i, y_i)}_{i=1}^{N}, where x_i are preprocessed network traffic and y_i ∈ Y = {1, ..., K} are class labels. We consider a parameterized model f_θ: [0, 1]^{28×28} → [0, 1]^K that maps a normalized grayscale image x to a probability distribution over classes (i.e., p = f_θ(x)), and the final predicted label is obtained by ŷ = arg max_k p_k. We then denote the standard loss function in the standard training process:

L_st(θ, D) = (1/N) Σ_{i=1}^{N} ℓ(f_θ(x_i), y_i),  (6)

where N is the number of training examples, and ℓ(·) denotes a loss function that measures the discrepancy between the model prediction and the ground-truth label (e.g., cross-entropy).

3.3. Adversarial attack

Deep learning models are known to be vulnerable to adversarial examples perturbed by imperceptible noise that induce incorrect predictions. Network traffic classifiers based on deep learning inherit this vulnerability: small, carefully designed perturbations can cause significant degradation in classification performance. Formally, given a trained model f_θ and a clean input x, an adversary aims to craft a perturbed input x' = x + δ such that:

minimize ‖δ‖_p,  subject to: f_θ(x + δ) = y_target,  x + δ ∈ [0, 1]^{28×28},  (7)

where δ denotes the adversarial perturbation and ‖·‖_p (p ∈ {0, 1, 2, ∞}) quantifies the perturbation magnitude. For traffic image inputs, x' = x + δ maintains the structural properties of legitimate traffic while causing misclassification. Under a white-box threat model where adversaries possess full knowledge of both the preprocessing pipeline Ψ and classifier parameters θ, attacks are executed directly in the image domain.
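As a concrete illustration, the pre-processing pipeline of Section 3.1 (truncation, zero-padding, image mapping, and normalization, Eqs. (1)-(5)) can be sketched in a few lines of NumPy; the function name is ours, not from the paper:

```python
import numpy as np

def flow_to_image(flow: bytes, k: int = 784) -> np.ndarray:
    """Map a raw byte flow F = (b_1, ..., b_L) to a normalized 28x28 image."""
    b = np.frombuffer(flow, dtype=np.uint8)  # byte sequence over Z_256
    b = b[:k]                                # truncation tau_k, Eq. (2)
    b = np.pad(b, (0, k - len(b)))           # zero-padding pi_784, Eq. (3)
    img = b.reshape(28, 28)                  # row-major image mapping Phi, Eq. (4)
    return img.astype(np.float32) / 255.0    # normalization, Eq. (5)

x = flow_to_image(bytes(range(100)))         # toy flow with a 100-byte payload
print(x.shape)                               # (28, 28)
```

Because the pipeline is invertible on the payload region, a perturbation crafted in this 28 × 28 image domain can be mapped back to bytes, which is why the attacks below restrict themselves to payload pixels.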
Crucially, the perturbation is constrained within the payload region of the traffic image, rather than the padding area.

Payload-Constrained Perturbation. To ensure semantic fidelity when mapping perturbed inputs back to the traffic domain, the adversarial perturbation δ is restricted to the non-padding (i.e., payload) region:

P = {(i, j) | 28(i − 1) + j ≤ L},  (8)

where P denotes the set of pixels corresponding to the original L bytes of the flow F. During attack iterations, any updates falling outside P are explicitly zeroed out. While this constraint does not achieve the theoretically optimal adversarial perturbation, it aligns with realistic payload limitations in network traffic and therefore produces semantically faithful perturbations that are more suitable for practical deployment. In this work, we adopt PGD (Projected Gradient Descent) [17] as our primary adversarial method. Specifically, we perform iterative updates on the input image within the allowed perturbation budget ε and constrain the perturbation to the valid traffic region P:

x'_{t+1} = Π_{B_ε(x) ∩ P}( x'_t + α · sign(∇_x L(f_θ(x'_t), y)) ),  (9)

where L denotes the loss function, Π is the projection operator that restricts the updated input to the intersection of the valid region P and the ℓ_p-ball of radius ε centered at x, and α is the step size.

3.4. Adversarial training

One of the most effective defenses against adversarial attacks is adversarial training (AT), which enhances model robustness by incorporating adversarial examples into the training process. Specifically, it formulates the training objective as a min-max optimization:

min_θ (1/N) Σ_{i=1}^{N} max_{‖δ_i‖_p ≤ ε} ℓ(f_θ(x_i + δ_i), y_i).  (10)

For network traffic classifiers, we extend this paradigm with payload-aware constraints:

min_θ (1/N) Σ_{i=1}^{N} max_{δ_i ∈ S_i} ℓ(f_θ(x_i + δ_i), y_i),  (11)

where S_i = {δ : ‖δ‖_p ≤ ε and δ_{(i,j)} = 0, ∀(i, j) ∉ P_i} is the constraint set for the i-th example.

4. Label noise

Label noise in adversarial training refers to the semantic mismatch between the assigned labels and the true labels of adversarial examples. As first proposed by Dong et al. [18], this phenomenon arises from the practice of assigning adversarial examples the same labels as their clean input. Given a clean input-label pair (x, y), adversarial training constructs a perturbed input x' = x + δ and assigns it the original label y during training. However, the true label of x' may differ due to the semantic distortion introduced by the adversarial perturbation δ. This distributional shift is especially detrimental to learning robust representations, as it misguides the optimization process.

4.1. Amplified label noise in robust traffic classification

While label noise poses a general challenge in adversarial training, it becomes even more prominent in the context of robust network traffic classification. Unlike image data, where semantic changes are often human-perceivable, traffic data is inherently opaque and lacks intuitive visual features. Consequently, different classes of traffic data are compactly distributed and highly entangled, and small perturbations in the byte-level input space can lead to disproportionately large semantic changes that are not easily detectable by human inspection. In such a scenario, the probability of label mismatch between clean and adversarial examples increases. Let x be the image representation of a network flow (or packet) and x' = x + δ be its adversarial example. In standard adversarial training, each sample is annotated with a hard label y, while the underlying ground-truth semantics are better represented by a softer distribution P(Y | x), especially for adversarial examples lying close to the decision boundary. This inherent discrepancy between the hard label and the true soft distribution can be regarded as label noise. Under adversarial perturbations x', such mismatches are further amplified, leading to a higher effective label noise rate, which we define as

p_e(D') = (1/N) Σ_{i=1}^{N} I[ y_i ≠ arg max P(Y | x'_i) ],  (12)

where D' = {(x'_i, y_i)} denotes the adversarial training set, and P(Y | x'_i) reflects the (unknown) ground-truth label distribution of the perturbed input. Such excessive label noise disrupts supervised learning, preventing the model from accurately learning the underlying discriminative features of the data. As a result, the classifier may overfit to incorrect labels or adversarial patterns rather than the true class semantics. This issue is particularly critical in adversarial training for traffic classification, where decision boundaries between classes are inherently subtle and highly sensitive to small perturbations.

4.2. Impact of label noise on decision boundary robustness

Adversarial training assumes that the label of an adversarial example remains unchanged from its clean example. However, when an adversarial example crosses the decision boundary into a region semantically aligned with a different class, assigning it the original label introduces semantic inconsistency. We formalize this effect in a binary classification setting. Let the input space be X ⊂ R^d and the label space be Y = {A, B}. Consider a classifier f_θ: X → [0, 1], where f_θ(x) denotes the predicted probability of class A, and 1 − f_θ(x) is the probability of class B. The decision boundary is defined by the hypersurface B_θ = {x ∈ X | f_θ(x) = 0.5}. We consider an adversarial example x' generated from a clean input x of class A, such that x' lies in the classification region of class B, i.e., f_θ(x') < 0.5. During adversarial training, if x' is labeled as A (i.e., the same as x), then minimizing the loss on x' pushes the decision boundary toward class B, potentially degrading the robustness of that class.

Definition 1 (Margin Distance). Given an example x ∈ X and a classifier f: X → [0, 1], the margin distance from x to the decision boundary B = {x' ∈ X | f(x') = 0.5} is defined as:

dist(x, B) = min_{x' ∈ B} ‖x − x'‖_p.  (13)

Theorem 1 (Excessive Boundary Shift Induced by Hard-Label Adversarial Training). Consider a binary classifier f: X → [0, 1], with the pre-training decision boundary defined as:

B_pre = {x ∈ X | f_pre(x) = 0.5}.  (14)

Suppose x_A ∈ X_A is a clean example from class A and x'_A = x_A + δ is an adversarial example generated to cross B_pre, i.e., f_pre(x'_A) < 0.5. Let f_post be the classifier obtained via hard-label adversarial training using (x'_A, y_A) as supervision, where y_A = 1. Then, under hard-label supervision, the training objective enforces high-confidence predictions for x'_A, i.e.,

f_post(x'_A) ≫ 0.5,  (15)

which necessarily implies that the new decision boundary B_post = {x | f_post(x) = 0.5} must satisfy

dist(x'_A, B_post) = (f_post(x'_A) − 0.5) / ‖∇_x f_post(x'_A)‖_p.  (16)
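Returning to the attack side, the payload-constrained PGD update of Eqs. (8)-(9) can be sketched with a toy differentiable model standing in for the CNN; the helper names and the logistic stand-in are ours, not from the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def payload_pgd(x, y, w, b, mask, eps=0.1, alpha=0.02, steps=10):
    """Payload-constrained PGD in the spirit of Eqs. (8)-(9), with a toy
    logistic model p = sigmoid(w.x + b) in place of the traffic classifier.
    `mask` is 1 on payload positions and 0 on padding."""
    x_adv = x.copy()
    for _ in range(steps):
        p = sigmoid(w @ x_adv + b)
        grad = (p - y) * w                            # d BCE / d x for the toy model
        x_adv = x_adv + alpha * np.sign(grad) * mask  # ascent step, zeroed on padding
        x_adv = x + np.clip(x_adv - x, -eps, eps)     # project into the l_inf eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)              # stay a valid [0, 1] image
    return x_adv

rng = np.random.default_rng(0)
x = rng.random(784)                  # flattened 28x28 traffic image
w = rng.standard_normal(784)
mask = np.zeros(784)
mask[:100] = 1.0                     # only the first 100 bytes are payload
x_adv = payload_pgd(x, y=1.0, w=w, b=0.0, mask=mask)
print(np.all(x_adv[100:] == x[100:]))   # padding region untouched
```

Masking the step (rather than the final result) mirrors the paper's choice of zeroing out updates that fall outside P during the iterations.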
Fig. 1. Decision boundary changes: Hard-Label AT vs. Soft-Label DLAT.

In typical cases where f_post(x'_A) → 1, the post-training boundary moves far beyond x'_A in the direction of class B. As a result, many nearby class-B examples x_B ∈ X_B with small ‖x_B − x'_A‖_p may fall into the wrong side of the decision boundary, resulting in increased misclassification. The detailed proof can be found in the Appendix.

Although Theorem 1 is formulated in a binary classification setting for analytical clarity, the underlying insights naturally extend to multi-class scenarios. In the multi-class case, a classifier defines multiple decision boundaries between classes. Hard-label adversarial training on an adversarial example x' with true label y forces an increase in the logit margin:

z_y(x') − z_k(x'),  ∀k ≠ y,  (17)

which effectively pushes the decision boundaries of all other classes away from x'. When x' lies near the intersection of multiple class regions, this aggressive supervision disproportionately expands the region of class y at the expense of compressing neighboring class regions, analogous to the boundary distortion shown in the binary case.

Our dynamic label assignment mitigates this issue by relaxing the overconfident supervision for adversarial examples near decision boundaries. Rather than forcing x' deep into the original decision region, the interpolated target y_mix guides a more appropriate adjustment of the decision boundaries. This calibrated supervision prevents the excessive boundary shift described in Theorem 1, enabling the model to maintain robustness in practical multi-class traffic classification tasks.

5. Dynamic label adversarial training

Motivated by the analysis of the impact of label noise on the robustness of adversarial training in Section 4, we propose DLAT (Dynamic Label Adversarial Training), an adversarial training strategy that efficiently improves adversarial robustness utilizing dynamically mixed soft labels.

5.1. Design inspiration

In traditional adversarial training, assigning hard labels to adversarial examples introduces significant label noise, since the true label of an adversarial example may differ from its clean counterpart. This label noise forces the decision boundary to move far away from these examples, as shown in Fig. 1, ultimately leading to degraded model robustness. To address this issue, the first step is to mitigate label noise. According to Theorem 1 and Section 4.1, using soft labels can effectively reduce such label noise, thereby preventing the decision boundary from over-shifting. In binary classification, this corresponds to adjusting the boundary toward the neighborhood of the adversarial examples, which can be achieved by assigning a soft label such as (0.5, 0.5) to guide adversarial training. However, in multi-class classification, it is difficult to determine the soft labels of the examples near the decision boundary: the boundary may be the intersection of the decision regions of multiple classes, and a uniform soft label such as (1/|Y|, 1/|Y|, ..., 1/|Y|) does not fit the shape of the decision boundary well. A natural solution would be to find the examples near the current decision boundary that belong to the same class as the original class of the adversarial example, and use the model's output on them as a soft label. However, explicitly detecting the decision boundary via iterative adversarial attacks is computationally expensive. Instead, DLAT capitalizes on the fact that the decision boundary must lie within the space between clean and adversarial examples, using a lightweight interpolation mechanism to approximate the soft labels of boundary-adjacent examples.

5.2. Method design

In order to accurately estimate the soft label of the examples near the decision boundary, we first need to determine the proximity of the adversarial examples to the current decision boundary. When the adversarial examples are farther away from the decision boundary, the output logits of the clean examples are given higher weight in the interpolation, so that the decision boundary is promptly adjusted toward the vicinity of the adversarial examples; conversely, the adversarial examples are given higher weight in the interpolation, to prevent the adjusted decision boundary from moving too far past the adversarial examples.

Algorithm 1: Dynamic Label Adversarial Training
1:  Input: Network traffic dataset D; learning rate η; total training epochs T; model architecture f
2:  Initialize model f with parameters θ            // Model initialization
3:  for i ∈ [T] do
4:      foreach batch (X, Y) ∈ D do
5:          X' ← PGD(f, X, Y)                       // Adversarial example generation
6:          O ← f(X)
7:          O' ← f(X')
8:          KL ← Div(O, O')                         // KL-based distance computation
9:          α ← (tanh(KL) + 1) / 2
10:         Y_mix ← (1 − α) · O' + α · O            // Mixed label construction
11:         L_adv ← Div(O', Y_mix)
12:         L_clean ← CE(O, Y)
13:         L_total ← L_adv + L_clean
14:         θ ← θ − η · ∇_θ L_total                 // Model update
15:     end
16: end

Given a clean example x and its adversarial example x' = x + δ, let f denote the classifier with outputs O = f(x) and O' = f(x'). Since the mapping between clean examples and hard labels can be established quickly by training, we can utilize the Kullback–Leibler (KL) divergence to quantify the distance between the adversarial example and the decision boundary:

Div(O, O') = Σ_i softmax(O)_i · log( softmax(O)_i / softmax(O')_i ).  (18)

A higher Div typically indicates larger distortion and label noise. To obtain a stable and responsive mixing factor α ∈ [0, 1], we normalize Div(O, O') using the tanh function, which provides a smooth and symmetric mapping and naturally bounds the output. Accordingly, we define:

α = ( tanh(Div(O, O')) + 1 ) / 2.  (19)
5
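As a concrete illustration of Eqs. (18)–(19), the divergence and mixing-factor computation can be sketched in plain Python; the two-class logit vectors below are illustrative stand-ins for the classifier outputs 𝑶 and 𝑶′, not values from our experiments:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_div(clean_logits, adv_logits):
    # Eq. (18): Div(O, O') = sum_i softmax(O)_i * log(softmax(O)_i / softmax(O')_i).
    p, q = softmax(clean_logits), softmax(adv_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def mixing_factor(clean_logits, adv_logits):
    # Eq. (19): squash the non-negative divergence into a bounded
    # mixing factor via tanh.
    return (math.tanh(kl_div(clean_logits, adv_logits)) + 1.0) / 2.0

# Identical outputs give zero divergence, hence alpha = 0.5;
# strongly diverging outputs push alpha toward 1.
a_same = mixing_factor([2.0, -1.0], [2.0, -1.0])   # 0.5
a_far = mixing_factor([2.0, -1.0], [-1.0, 2.0])    # close to 1
```

Under this normalization, a heavily distorted adversarial example yields an 𝛼 near 1, so the mixed label constructed from 𝑶 and 𝑶′ leans more heavily on the clean output.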
Table 1
The balanced ISCX-VPN dataset.

Type | Total number (imbalanced) | Training set number (balanced) | Test set number (balanced)
VPN_Chat | 7946 | 1500 | 200
VPN_Email | 596 | 1500 | 59
VPN_File Transfer | 1898 | 1500 | 189
VPN_P2P | 912 | 1500 | 91
VPN_Streaming | 1199 | 1500 | 119
VPN_VoIP | 20,581 | 1500 | 200

Table 2
The balanced CICIoT2022 dataset.

This factor interpolates between 𝑶 and 𝑶′ to form the mixed soft label:

𝒚_mix = (1 − 𝛼) ⋅ 𝑶′ + 𝛼 ⋅ 𝑶. (20)

The training objective of DLAT combines two components. The first is a KL divergence loss that aligns the model's prediction on 𝒙′ with 𝒚_mix to improve the model's robustness:

ℒ_adv = 𝐷𝑖𝑣(𝑶′, 𝒚_mix), (21)

and the second is a cross-entropy loss that allows the model to learn generalizable knowledge and improve clean example classification accuracy:

ℒ_clean = − Σ_𝑖 𝒚_𝑖 ⋅ log softmax(𝑶)_𝑖. (22)

The overall loss is formulated as:

min_𝜽 max_{𝛿 ∈ Δ} [ ℒ_adv(𝑓_𝜽(𝒙 + 𝛿), 𝒚_mix) + ℒ_clean(𝑓_𝜽(𝒙), 𝒚) ]. (23)

By dynamically adapting label softness via Eqs. (18)–(20) and balancing the loss components of Eqs. (21)–(23), DLAT mitigates the excessive boundary shift caused by label noise, enabling models to learn robust decision boundaries for tasks like traffic classification. The pseudo-code for DLAT is presented in Algorithm 1.
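A minimal, framework-free sketch of the per-example loss construction in Eqs. (20)–(23); treating the model outputs as probability vectors (rather than logits) and working on a single example are simplifying assumptions:

```python
import math

def kl(p, q):
    # KL(p || q) for probability vectors, cf. the Div term of Eq. (18).
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def dlat_objective(clean_out, adv_out, onehot, alpha):
    # Eq. (20): mixed soft label between the adversarial and clean outputs.
    y_mix = [(1 - alpha) * pa + alpha * pc
             for pa, pc in zip(adv_out, clean_out)]
    loss_adv = kl(adv_out, y_mix)                       # Eq. (21)
    loss_clean = -sum(y * math.log(p)                   # Eq. (22)
                      for y, p in zip(onehot, clean_out) if y > 0)
    return loss_adv + loss_clean                        # Eq. (23): joint objective

# When the adversarial output already matches the clean output, the mixed
# label equals both and the adversarial term vanishes, leaving only the
# clean cross-entropy.
```

In a full training step these two terms would be computed per batch and minimized over the model parameters, as in Algorithm 1.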
Table 3
The balanced ISCX-ALL dataset.

Type | Total number (imbalanced) | Training set number (balanced) | Test set number (balanced)
Chat | 7681 | 5400 | 600
Email | 6459 | 5400 | 600
File Transfer | 7405 | 5400 | 600
P2P | 1849 | 1652 | 184
Streaming | 3936 | 3540 | 393
VoIP | 19,597 | 5400 | 600
VPN_Chat | 7946 | 5400 | 600
VPN_Email | 596 | 538 | 59
VPN_File Transfer | 1898 | 1754 | 189
VPN_P2P | 912 | 830 | 91
VPN_Streaming | 1199 | 1108 | 119
VPN_VoIP | 20,581 | 5400 | 600

6. Experiments

In this section, we perform a comprehensive set of experiments to evaluate the performance of DLAT on both clean and adversarial traffic. These evaluations are carried out on two datasets and compared against four state-of-the-art adversarial training methods from the computer vision field.

6.1. Experiment setup

Datasets. Experiments are performed using the ISCX VPN-nonVPN dataset [29] and the CICIoT2022 dataset [30]. The former includes encrypted and unencrypted traffic, while the latter focuses on IoT-related scenarios with both benign and malicious behaviors. We construct three experimental settings from these datasets. The first, referred to as ISCX-VPN, includes six categories of encrypted VPN traffic: VPN_Chat, VPN_Email, VPN_File Transfer, VPN_P2P, VPN_Streaming, and VPN_VoIP. The second setting, named ISCX-ALL, expands the classification scope to twelve categories by incorporating six VPN and six non-VPN traffic types. The third setting, derived from the CICIoT2022 dataset, defines a six-class classification task encompassing typical IoT device states and activities; the categories are Power, Idle, Interactions, Scenarios, Active, and Attacks. Since the original datasets exhibit significant class imbalance, we first split the data into training and testing sets with a 9:1 ratio, and then apply class-wise balancing separately within each subset to ensure a relatively balanced class distribution. The statistics of the balanced datasets are summarized in Tables 1, 2 and 3.

Training. We adopt five representative neural network architectures as backbone models: PreActResNet [31], DenseNet [32], MobileNet [33], WideResNet [34], and FFNN (Feed-Forward Neural Network) [35]. All models are trained for 80 epochs using momentum-based stochastic gradient descent (MSGD) [36], with a momentum coefficient of 0.9 and a weight decay of 5 × 10⁻⁴. The initial learning rate is set to 0.1, and a multi-stage learning rate decay strategy is applied: the learning rate is reduced by a factor of 10 at the 40th epoch.

Evaluation Metrics. In our experiments, we adopt two primary evaluation metrics to assess the effectiveness of DLAT: the robust classification accuracy (RCC) and the clean sample accuracy (ACC). RCC measures the proportion of adversarial traffic that is still classified correctly, indicating the robustness of the defense mechanism under adversarial attacks; a higher RCC implies stronger robustness. In contrast, ACC evaluates the classification accuracy on clean, unperturbed traffic, reflecting the model's predictive performance under normal conditions. A higher ACC indicates better generalization and utility in benign settings. We report both metrics to provide a comprehensive assessment of the trade-off between robustness and standard accuracy.

Baselines. We compare DLAT to representative adversarial training baselines, including PGD-AT [17], TRADES [26], MART [27], and AWP [28]. All baseline methods are implemented following their original settings. For TRADES, the trade-off parameter 𝜆 is set to 16, as suggested in the original paper. For AWP, the weight perturbation step size 𝛾 is set to 0.01. Unlike these training methods, which still rely on hard labels and thus remain sensitive to mislabeled data, DLAT explicitly incorporates soft-label supervision, making it more robust under label noise.
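The split-then-balance procedure described under Datasets can be sketched as follows. The helper name, the fixed per-class training target of 1500, and oversampling by duplication are illustrative assumptions; the actual per-class targets are those reported in Tables 1–3:

```python
import random

def split_and_balance(samples_by_class, train_ratio=0.9, train_target=1500, seed=0):
    # 9:1 train/test split within each class, followed by class-wise
    # balancing of the training split: large classes are downsampled and
    # small classes are oversampled by duplication toward `train_target`.
    # Assumes every class is non-empty.
    rng = random.Random(seed)
    train, test = {}, {}
    for cls, samples in samples_by_class.items():
        shuffled = samples[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_ratio)
        tr = shuffled[:cut]
        if len(tr) >= train_target:
            tr = tr[:train_target]
        else:
            tr = tr + [rng.choice(tr) for _ in range(train_target - len(tr))]
        train[cls] = tr
        test[cls] = shuffled[cut:]
    return train, test
```

For example, a class with 2,000 flows would yield 1,500 training and 200 test samples, while a class with 600 flows would be oversampled to the same 1,500 after a 540/60 split, mirroring the shape of Table 1.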
Attack and defense settings. For adversarial evaluation, we adopt the widely used PGD-20 attack under the 𝓁∞ norm constraint. The perturbation radius 𝜖 is set to 24/255, and the step size 𝛼 is 4/255. For generating the adversarial examples used in adversarial training, we employ PGD-10 under the same 𝓁∞-bounded perturbation settings.

6.2. The effectiveness of DLAT

Clean accuracy assessment. As shown in Table 4, the normal model trained without adversarial defenses achieves the highest ACC across
Table 4
The clean sample accuracy (ACC) and robust classification accuracy (RCC) of different adversarial training methods across five network architectures (ResNet, DenseNet, MobileNet, WideResNet, and FFNN) on the ISCX-VPN, ISCX-ALL and CICIoT2022 datasets (%).
Dataset Method Model
ResNet DenseNet MobileNet WideResNet FFNN
ACC RCC ACC RCC ACC RCC ACC RCC ACC RCC
ISCX-VPN
Normal 99.02 ± 0.30 0.00 ± 0.00 99.92 ± 0.08 0.67 ± 0.09 99.17 ± 0.00 3.58 ± 0.14 99.75 ± 0.00 0.83 ± 0.07 98.25 ± 0.00 7.67 ± 0.58
PGD-AT 98.72 ± 0.18 96.32 ± 0.29 96.02 ± 0.23 91.00 ± 0.72 97.87 ± 0.25 90.00 ± 2.69 99.35 ± 0.08 96.01 ± 0.11 97.25 ± 0.24 87.00 ± 0.81
TRADES 96.75 ± 0.37 94.62 ± 0.30 92.98 ± 0.29 89.92 ± 0.15 93.18 ± 0.44 85.35 ± 3.38 97.92 ± 0.24 96.03 ± 0.18 92.02 ± 0.41 83.68 ± 0.87
MART 98.08 ± 0.43 94.20 ± 0.59 82.65 ± 0.72 78.90 ± 0.53 80.83 ± 1.76 70.85 ± 1.74 98.51 ± 0.19 92.72 ± 0.17 93.28 ± 0.20 84.58 ± 0.60
AWP 98.18 ± 0.17 96.22 ± 0.17 95.40 ± 0.33 92.92 ± 0.09 93.40 ± 0.42 90.10 ± 0.49 73.82 ± 0.46 72.18 ± 0.54 95.63 ± 0.24 88.32 ± 0.29
DLAT 98.83 ± 0.09 96.53 ± 0.08 98.77 ± 0.26 93.93 ± 0.42 98.20 ± 0.10 93.07 ± 0.47 99.08 ± 0.05 96.38 ± 0.36 96.88 ± 0.17 86.37 ± 0.30
ISCX-ALL
Normal 93.95 ± 4.36 2.04 ± 1.06 96.70 ± 2.11 0.23 ± 0.07 91.52 ± 4.99 3.74 ± 0.12 96.22 ± 1.48 7.23 ± 0.48 88.48 ± 0.27 1.61 ± 0.21
PGD-AT 88.56 ± 0.10 87.34 ± 0.20 82.96 ± 0.26 80.61 ± 0.30 82.19 ± 0.24 78.87 ± 0.73 88.63 ± 0.03 86.12 ± 2.89 83.00 ± 0.34 77.23 ± 0.29
TRADES 88.31 ± 0.13 86.19 ± 0.45 79.19 ± 1.12 73.98 ± 3.39 80.39 ± 0.80 75.26 ± 2.93 87.32 ± 1.41 84.90 ± 2.54 76.47 ± 1.90 71.01 ± 0.75
MART 88.19 ± 0.18 86.33 ± 0.51 77.22 ± 0.19 76.08 ± 0.22 80.78 ± 0.33 77.79 ± 0.31 87.67 ± 0.12 86.10 ± 0.45 75.99 ± 0.64 69.95 ± 1.79
AWP 86.31 ± 0.11 85.44 ± 0.10 78.00 ± 0.19 76.43 ± 0.48 78.83 ± 0.07 77.58 ± 0.16 85.85 ± 0.12 84.71 ± 0.05 81.30 ± 0.21 76.91 ± 0.21
DLAT 89.44 ± 0.32 86.68 ± 0.40 88.83 ± 0.80 82.18 ± 0.43 84.35 ± 0.36 75.84 ± 1.27 88.71 ± 0.02 87.14 ± 0.41 86.79 ± 0.26 74.32 ± 0.81
CICIoT2022
Normal 99.82 ± 0.32 0.04 ± 0.01 99.73 ± 0.01 0.63 ± 0.02 98.50 ± 2.59 0.00 ± 0.00 99.99 ± 0.00 0.56 ± 0.01 99.67 ± 0.06 0.12 ± 0.06
PGD-AT 99.27 ± 0.08 96.26 ± 3.18 98.20 ± 0.02 96.86 ± 0.44 98.20 ± 0.79 97.65 ± 0.47 99.46 ± 0.21 93.73 ± 0.46 83.32 ± 2.40 81.36 ± 2.58
TRADES 98.35 ± 0.82 98.90 ± 0.57 98.04 ± 0.00 97.81 ± 1.36 98.05 ± 0.31 91.38 ± 0.74 98.06 ± 0.02 97.62 ± 0.19 96.84 ± 0.11 89.20 ± 0.27
MART 98.19 ± 0.02 96.37 ± 2.27 98.05 ± 0.31 95.50 ± 0.50 98.06 ± 0.28 95.20 ± 0.40 99.00 ± 0.05 97.00 ± 0.10 98.20 ± 0.20 91.28 ± 1.50
AWP 98.25 ± 0.10 96.50 ± 0.20 98.10 ± 0.15 96.00 ± 0.25 98.15 ± 0.12 95.50 ± 0.30 99.10 ± 0.05 98.00 ± 0.10 98.00 ± 0.15 90.10 ± 0.50
DLAT 99.70 ± 0.02 99.20 ± 0.12 98.89 ± 0.17 97.12 ± 0.24 98.06 ± 0.28 97.88 ± 0.14 99.66 ± 0.02 98.99 ± 0.11 98.87 ± 0.09 91.93 ± 0.86
Fig. 2. The robust classification accuracy (RCC) of DLAT under 𝓁1 and 𝓁2 norm-bounded PGD-20 attacks on the ISCX-VPN, ISCX-ALL and CICIoT2022 datasets.
all architectures, ranging from 98.25% to 99.92% on ISCX-VPN, from 88.48% to 96.70% on ISCX-ALL, and from 98.50% to 99.99% on CICIoT2022. However, it fails completely under adversarial attacks, with robust classification accuracy (RCC) close to zero. In the table, boldface highlights the best performance for each metric, while underlining indicates the second-best. Compared to the normal model, adversarial training methods such as PGD-AT, TRADES, and MART significantly improve robustness, albeit at the cost of decreased clean accuracy. Specifically, PGD-AT maintains relatively higher ACC (e.g., 98.72% on ResNet for ISCX-VPN and 88.56% on ISCX-ALL), while TRADES and MART show larger reductions in ACC on clean examples. Our method, DLAT, consistently achieves competitive ACC, reaching up to 98.83% on ResNet for ISCX-VPN and 89.44% on ISCX-ALL, surpassing all baselines on ISCX-ALL and maintaining top-tier accuracy on ISCX-VPN and CICIoT2022. These results demonstrate that DLAT effectively enhances robustness with minimal compromise to clean performance.

Robust accuracy assessment. We first evaluate the RCC of various adversarial training methods under adversarial attacks. As shown in Table 4, adversarial training markedly improves RCC compared with the normal model, which exhibits near-zero robustness. Among the compared methods, DLAT consistently surpasses most baselines in the majority of cases across both datasets and network architectures. Specifically, on ISCX-VPN, DLAT attains RCC scores above 86% across all architectures, notably outperforming PGD-AT, TRADES, MART, and AWP, with top results exceeding 96% on ResNet and WideResNet. Similarly, on ISCX-ALL and CICIoT2022, it maintains leading robustness, achieving up to 87.14% and 98.99% RCC on WideResNet and surpassing competing methods by a clear margin. These findings underscore the superior robustness of DLAT while retaining competitive clean accuracy.

Secondly, to further assess the robustness of DLAT against unseen adversarial threats, we evaluate it under a diverse set of attack methods, including adversarial perturbations constrained by different norm bounds (i.e., the 𝓁1 and 𝓁2 norms) as well as FGSM [15], PGD-100 [17], and AutoAttack [37]. We first report the performance of DLAT under 𝓁1- and 𝓁2-bounded PGD-20 attacks on the ISCX-VPN, ISCX-ALL, and CICIoT2022 datasets, as illustrated in Fig. 2. Each heatmap visualizes the RCC achieved by five different models under increasing perturbation radii. It can be observed that DLAT exhibits strong robustness under both 𝓁1- and 𝓁2-bounded PGD-20 attacks. Notably, the defense is more effective against 𝓁1-norm perturbations, as indicated by the overall darker color tones in the corresponding heatmaps. This suggests that DLAT better preserves classification performance when facing sparse but high-magnitude perturbations. Among the evaluated models, ResNet and DenseNet generally exhibit higher RCC scores across both norm types and datasets, with RCC remaining above 0.8 under moderate 𝓁1 perturbations (e.g., 𝜖 =
Fig. 3. The RCC of DLAT under FGSM, PGD-100, and AutoAttack on the ISCX-VPN, ISCX-ALL, and CICIoT2022 datasets.
Fig. 4. The robust classification accuracy (RCC) of various models across classes on ISCX-ALL under increasing adversarial perturbation radii.
1140/255). In contrast, MobileNet and DenseNet show relatively lower robustness, particularly under 𝓁2-bounded attacks, where RCC values gradually decrease below 0.6 as the perturbation radius increases. Nonetheless, the performance degradation across all models is smooth rather than abrupt, suggesting that DLAT retains a degree of robustness and stability.

As shown in Fig. 3, we further assess the performance of DLAT under three previously unseen adversarial attacks: FGSM, PGD-100, and AutoAttack. Under FGSM, all evaluated models exhibit strong robustness, with RCC values typically exceeding 0.85 below 𝜖 = 24/255, and models such as ResNet and WideResNet experience only marginal performance degradation. As the perturbation strength increases under PGD-100, the RCC gradually decreases across all models. Nonetheless, most models achieve RCCs above 0.5 at 𝜖 = 32/255 on the ISCX-VPN dataset, indicating a moderate level of robustness. AutoAttack presents the most challenging scenario, leading to a more pronounced decline in performance, particularly when 𝜖 exceeds 24/255. Despite this, architectures such as ResNet and WideResNet continue to maintain RCC above 0.5 at 𝜖 = 32/255, suggesting that DLAT remains effective even under adaptive and high-strength adversarial attacks. These results collectively demonstrate the generalization capability of the framework across a broad range of attacks and perturbation intensities.

Thirdly, we evaluate the robustness of DLAT under varying attack intensities, where the attack intensity corresponds to the radius of the adversarial perturbation (denoted by epsilon, 𝜖). As comprehensively illustrated in Fig. 4, we present the RCC performance for each individual class within the ISCX-ALL dataset (Chat, Email, File Transfer, P2P, Streaming, VoIP, VPN_Chat, VPN_Email, VPN_File Transfer, VPN_P2P, VPN_Streaming, and VPN_VoIP) across multiple network architectures (ResNet, DenseNet, MobileNet, WideResNet, FFNN) under increasing perturbation radii (𝜖 ranging from 0 to 56/255). The adversarial training of DLAT is performed using adversarial examples
Fig. 5. Comparison of accuracy and loss convergence results for DenseNet on the ISCX-ALL dataset: (a) accuracy curve; (b) loss curve.
generated with a perturbation radius of 𝜖 = 24/255. As shown in Fig. 4, across most classes and architectures, the trained models demonstrate strong robustness when the attack intensity remains within or below this radius (𝜖 ≤ 24/255), and the models still maintain relatively strong resilience to moderately larger perturbations (i.e., 24/255 < 𝜖 < 32/255). However, once 𝜖 exceeds 32/255, the attack becomes significantly stronger, leading to a noticeable drop in RCC, especially for non-VPN classes.

6.3. The efficiency of DLAT

To evaluate the training efficiency of DLAT, we compare its convergence with that of representative adversarial training baselines, including AT, TRADES, MART, and AWP. As illustrated in Fig. 5, DLAT demonstrates significantly faster convergence in both accuracy and loss. Specifically, in the accuracy curve (Fig. 5(a)), DLAT rapidly improves during the initial training epochs, reaching a stable accuracy above 0.85 within 30 epochs. In contrast, competing methods exhibit slower convergence and lower final performance, with TRADES and MART stabilizing below 0.80. Similarly, the loss curve (Fig. 5(b)) further highlights the advantage of DLAT in optimization stability. It consistently maintains a lower loss value throughout training and converges to a final loss below 0.3, which is noticeably lower than those of other methods. These results collectively demonstrate that DLAT not only accelerates the convergence process but also facilitates optimization toward better minima, indicating its efficiency and practicality for robust model training.

In addition to its fast convergence, DLAT maintains a training time per epoch comparable to other adversarial training methods, as reported in Table 5. Across different model architectures and datasets, the time cost of DLAT remains close to that of AT, TRADES, MART, and AWP. By achieving improved robustness and faster convergence without sacrificing efficiency, DLAT offers a practical solution for robust network traffic classification.

Table 5
Comparison of the time consumption for each epoch of the adversarial training methods (s).

Dataset | Model | AT | TRADES | MART | AWP | DLAT
ISCX-VPN | ResNet | 16.99 | 17.98 | 19.38 | 19.19 | 19.07
ISCX-VPN | DenseNet | 12.59 | 14.02 | 14.52 | 15.84 | 14.28
ISCX-VPN | MobileNet | 26.14 | 28.55 | 28.14 | 30.83 | 27.98
ISCX-VPN | WideResNet | 139.62 | 136.84 | 147.27 | 140.37 | 152.07
ISCX-VPN | FFNN | 4.02 | 3.85 | 3.94 | 4.36 | 4.41
ISCX-ALL | ResNet | 74.32 | 80.69 | 84.49 | 89.11 | 81.57
ISCX-ALL | DenseNet | 57.64 | 60.83 | 63.62 | 66.78 | 62.95
ISCX-ALL | MobileNet | 113.71 | 114.23 | 130.42 | 129.99 | 117.19
ISCX-ALL | WideResNet | 673.35 | 621.27 | 688.85 | 688.37 | 762.18
ISCX-ALL | FFNN | 16.43 | 15.03 | 17.86 | 17.62 | 16.31
CICIoT2022 | ResNet | 47.35 | 48.92 | 51.19 | 51.32 | 49.63
CICIoT2022 | DenseNet | 61.02 | 63.11 | 66.68 | 68.92 | 64.90
CICIoT2022 | MobileNet | 121.56 | 122.91 | 132.23 | 135.13 | 124.87
CICIoT2022 | WideResNet | 680.37 | 690.82 | 703.16 | 710.55 | 695.09
CICIoT2022 | FFNN | 18.06 | 19.42 | 18.98 | 19.56 | 20.43

7. Conclusion

In this paper, we investigated the vulnerability of deep traffic classifiers to adversarial examples and the label noise introduced by hard-label supervision in adversarial training. To address this issue, we proposed DLAT, a dynamic adversarial training framework that assigns soft labels to adversarial examples based on the similarity between clean and perturbed outputs. This similarity-guided interpolation helps mitigate label noise and align the decision boundary more effectively. Experimental results on traffic classification benchmarks demonstrate that DLAT consistently improves robustness and generalization over standard adversarial training.

CRediT authorship contribution statement

Haoyu Tong: Writing – original draft. Meixia Miao: Methodology, Formal analysis, Project administration. Yundong Liu: Data curation. Xiaoyu Zhang: Writing – original draft, Supervision. Xiangyang Luo: Resources, Funding acquisition. Willy Susilo: Visualization, Validation, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is funded by the Open Foundation of Key Laboratory of Cyberspace Security, Ministry of Education of China and Henan Key Laboratory of Cyberspace Situation Awareness (No. KLCS20240103), the National Natural Science Foundation of China (No. 62472345), and the Fundamental Research Funds for the Central Universities, China (No. QTZX25088).
Appendix. The proof of Theorem 1

Theorem 1 (Excessive Boundary Shift Induced by Hard-Label Adversarial Training). Consider a binary classifier 𝑓 : 𝒳 → [0, 1], with the pre-training decision boundary defined as:

ℬ_pre = {𝒙 ∈ 𝒳 : 𝑓_pre(𝒙) = 0.5}.

Suppose 𝒙_𝐴 ∈ 𝒳_𝐴 is a clean example from class A and 𝒙′_𝐴 = 𝒙_𝐴 + 𝛿 is an adversarial example generated to cross ℬ_pre, i.e., 𝑓_pre(𝒙′_𝐴) < 0.5. Let 𝑓_post be the classifier obtained via hard-label adversarial training using (𝒙′_𝐴, 𝑦_𝐴) as supervision, where 𝑦_𝐴 = 1. Then, under hard-label supervision, the training objective enforces high-confidence predictions for 𝒙′_𝐴, i.e.,

𝑓_post(𝒙′_𝐴) ≫ 0.5,

which necessarily implies that the new decision boundary ℬ_post = {𝒙 : 𝑓_post(𝒙) = 0.5} must satisfy

dist(𝒙′_𝐴, ℬ_post) = ( 𝑓_post(𝒙′_𝐴) − 0.5 ) / ‖∇_𝒙 𝑓_post(𝒙′_𝐴)‖_𝑝.

Proof. Let 𝒙_𝐴 ∈ 𝒳_𝐴 be a clean example correctly classified as class A, and let 𝒙′_𝐴 = 𝒙_𝐴 + 𝛿 be its adversarial variant generated to cross the original decision boundary ℬ_pre, i.e.,

𝑓_pre(𝒙′_𝐴) < 0.5.

Hard-label adversarial training uses the tuple (𝒙′_𝐴, 𝑦_𝐴 = 1) as supervised data, forcing the model 𝑓_post to assign high confidence to 𝒙′_𝐴:

𝑓_post(𝒙′_𝐴) → 1.

Now, consider the new decision boundary:

ℬ_post = {𝒙 : 𝑓_post(𝒙) = 0.5}.

We approximate 𝑓_post in a neighborhood of 𝒙′_𝐴 using a first-order Taylor expansion:

𝑓_post(𝒙) ≈ 𝑓_post(𝒙′_𝐴) + ∇_𝒙 𝑓_post(𝒙′_𝐴)ᵀ (𝒙 − 𝒙′_𝐴).

Let 𝒙* ∈ ℬ_post denote the closest point on the new boundary to 𝒙′_𝐴. By definition,

𝑓_post(𝒙*) = 0.5.

Using the linear approximation, we have:

0.5 ≈ 𝑓_post(𝒙′_𝐴) + ∇_𝒙 𝑓_post(𝒙′_𝐴)ᵀ (𝒙* − 𝒙′_𝐴).

Solving for the shift vector:

∇_𝒙 𝑓_post(𝒙′_𝐴)ᵀ (𝒙* − 𝒙′_𝐴) ≈ 0.5 − 𝑓_post(𝒙′_𝐴).

Let 𝒗 = ∇_𝒙 𝑓_post(𝒙′_𝐴) / ‖∇_𝒙 𝑓_post(𝒙′_𝐴)‖_𝑝 be the normalized gradient (i.e., the local normal direction to the decision boundary). Then the minimal distance from 𝒙′_𝐴 to the boundary is:

‖𝒙* − 𝒙′_𝐴‖_𝑝 = | 𝑓_post(𝒙′_𝐴) − 0.5 | / ‖∇_𝒙 𝑓_post(𝒙′_𝐴)‖_𝑝.

As 𝑓_post(𝒙′_𝐴) → 1, this implies:

dist(𝒙′_𝐴, ℬ_post) → 0.5 / ‖∇_𝒙 𝑓_post(𝒙′_𝐴)‖_𝑝.

This lower bound quantifies how far the decision boundary must move beyond 𝒙′_𝐴 to satisfy 𝑓_post(𝒙′_𝐴) = 1. If ‖∇_𝒙 𝑓_post(𝒙′_𝐴)‖_𝑝 is not excessively large, this distance is significant. Finally, since 𝒙′_𝐴 was crafted to lie just beyond ℬ_pre, i.e., in close proximity to the original boundary, the boundary movement beyond 𝒙′_𝐴 implies that the new decision boundary has crossed deep into the region previously occupied by class B. Therefore, class-B examples in the vicinity of 𝒙′_𝐴 are likely to be misclassified as class A under 𝑓_post. □

Data availability

Data will be made available on request.

References

[1] A. Azab, M. Khasawneh, S. Alrabaee, K.-K.R. Choo, M. Sarsour, Network traffic classification: Techniques, datasets, and challenges, Digit. Commun. Netw. 10 (3) (2024) 676–692.
[2] H. Yuan, G. Li, A survey of traffic prediction: from spatio-temporal data to intelligent transportation, Data Sci. Eng. 6 (1) (2021) 63–85.
[3] A.W. Moore, K. Papagiannaki, Toward the accurate identification of network applications, in: International Workshop on Passive and Active Network Measurement, Springer, 2005, pp. 41–54.
[4] A. Madhukar, C. Williamson, A longitudinal study of P2P traffic classification, in: 14th IEEE International Symposium on Modeling, Analysis, and Simulation, IEEE, 2006, pp. 179–188.
[5] S. Fernandes, R. Antonello, T. Lacerda, A. Santos, D. Sadok, T. Westholm, Slimming down deep packet inspection systems, in: IEEE INFOCOM Workshops 2009, IEEE, 2009, pp. 1–6.
[6] N. Hubballi, M. Swarnkar, M. Conti, BitProb: Probabilistic bit signatures for accurate application identification, IEEE Trans. Netw. Serv. Manag. 17 (3) (2020) 1730–1741, http://dx.doi.org/10.1109/TNSM.2020.2999856.
[7] A. Azab, P. Watters, R. Layton, Characterising network traffic for skype forensics, in: 2012 Third Cybercrime and Trustworthy Computing Workshop, 2012, pp. 19–27, http://dx.doi.org/10.1109/CTC.2012.14.
[8] H. Mohajeri Moghaddam, Skypemorph: Protocol Obfuscation for Censorship Resistance, University of Waterloo, 2013.
[9] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (7553) (2015) 436–444.
[10] M. Lotfollahi, M.J. Siavoshani, R.S.H. Zade, M. Saberian, Deep packet: a novel approach for encrypted traffic classification using deep learning, Soft Comput. 24 (2017) 1999–2012, URL https://api.semanticscholar.org/CorpusID:35187639.
[11] L. Yang, A. Finamore, F. Jun, D. Rossi, Deep learning and traffic classification: Lessons learned from a commercial-grade dataset with hundreds of encrypted and zero-day applications, 2021, arXiv preprint arXiv:2104.03182.
[12] M.H. Pathmaperuma, Y. Rahulamathavan, S. Dogan, A.M. Kondoz, Deep learning for encrypted traffic classification and unknown data detection, Sensors 22 (19) (2022) 7643.
[13] X. Lin, G. Xiong, G. Gou, Z. Li, J. Shi, J. Yu, Et-bert: A contextualized datagram representation with pre-training transformers for encrypted traffic classification, in: Proceedings of the ACM Web Conference 2022, 2022, pp. 633–642.
[14] X. Ma, W. Zhu, J. Wei, Y. Jin, D. Gu, R. Wang, EETC: An extended encrypted traffic classification algorithm based on variant resnet network, Comput. Secur. 128 (2023) 103175.
[15] I.J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples, in: International Conference on Learning Representations, ICLR, 2014.
[16] A.M. Sadeghzadeh, S. Shiravi, R. Jalili, Adversarial network traffic: Towards evaluating the robustness of deep-learning-based network traffic classification, IEEE Trans. Netw. Serv. Manag. 18 (2) (2021) 1962–1976.
[17] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning models resistant to adversarial attacks, in: International Conference on Learning Representations, ICLR, 2018.
[18] C. Dong, L. Liu, J. Shang, Label noise in adversarial training: A novel perspective to study robust overfitting, Adv. Neural Inf. Process. Syst. 35 (2022) 17556–17567.
[19] W. Wang, M. Zhu, J. Wang, X. Zeng, Z. Yang, End-to-end encrypted traffic classification with one-dimensional convolution neural networks, in: 2017 IEEE International Conference on Intelligence and Security Informatics, ISI, IEEE, 2017, pp. 43–48.
[20] J. Lan, X. Liu, B. Li, Y. Li, T. Geng, DarknetSec: A novel self-attentive deep learning method for darknet traffic classification and application identification, Comput. Secur. 116 (2022) 102663.
[21] K. Fauvel, F. Chen, D. Rossi, A lightweight, efficient and explainable-by-design convolutional neural network for internet traffic classification, in: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023, pp. 4013–4023.
[22] Z. Liu, Y. Xie, Y. Luo, Y. Wang, X. Ji, TransECA-net: A transformer-based model for encrypted traffic classification, Appl. Sci. 15 (6) (2025) 2977.
[23] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus, Intriguing properties of neural networks, 2013, arXiv:1312.6199.
[24] A. Kurakin, I.J. Goodfellow, S. Bengio, Adversarial examples in the physical world, in: Artificial Intelligence Safety and Security, Chapman and Hall/CRC, 2018, pp. 99–112.
[25] N. Carlini, D. Wagner, Towards evaluating the robustness of neural networks, in: 2017 IEEE Symposium on Security and Privacy, S&P, IEEE, 2017, pp. 39–57.
[26] H. Zhang, Y. Yu, J. Jiao, E. Xing, L. El Ghaoui, M. Jordan, Theoretically principled trade-off between robustness and accuracy, in: International Conference on Machine Learning, PMLR, 2019, pp. 7472–7482.
[27] Y. Wang, D. Zou, J. Yi, J. Bailey, X. Ma, Q. Gu, Improving adversarial robustness requires revisiting misclassified examples, in: International Conference on Learning Representations, ICLR, 2019.
[28] D. Wu, S.-T. Xia, Y. Wang, Adversarial weight perturbation helps robust generalization, Adv. Neural Inf. Process. Syst. 33 (2020) 2958–2969.
[29] G.D. Gil, A.H. Lashkari, M. Mamun, A.A. Ghorbani, Characterization of encrypted and VPN traffic using time-related features, in: Proceedings of the 2nd International Conference on Information Systems Security and Privacy, ICISSP 2016, SciTePress, Setúbal, Portugal, 2016, pp. 407–414.
[30] S. Dadkhah, H. Mahdikhani, P.K. Danso, A. Zohourian, K.A. Truong, A.A. Ghorbani, Towards the development of a realistic multidimensional IoT profiling dataset, in: 2022 19th Annual International Conference on Privacy, Security & Trust, PST, IEEE, 2022, pp. 1–11.
[31] K. He, X. Zhang, S. Ren, J. Sun, Identity mappings in deep residual networks, in: Computer Vision – ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part IV, Springer, 2016, pp. 630–645.
[32] G. Huang, Z. Liu, L. Van Der Maaten, K.Q. Weinberger, Densely connected convolutional networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4700–4708.
[33] A.G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, H. Adam, Mobilenets: Efficient convolutional neural networks for mobile vision applications, 2017, arXiv preprint arXiv:1704.04861.
[34] S. Zagoruyko, N. Komodakis, Wide residual networks, 2016, arXiv preprint arXiv:1605.07146.
[35] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning representations by back-propagating errors, Nature 323 (6088) (1986) 533–536.
[36] N. Qian, On the momentum term in gradient descent learning algorithms, Neural Netw. 12 (1) (1999) 145–151.
[37] F. Croce, M. Hein, Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks, in: ICML, 2020.