Journal of Systems Architecture 160 (2025) 103346
Fast post-quantum private set intersection from oblivious pseudorandom function for mobile social networks✩

Zhuang Shan a, Leyou Zhang a,*, Qing Wu b, Qiqi Lai c, Fuchun Guo d

a School of Mathematics and Statistics, Xidian University, Xi'an 710126, China
b School of Automation, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
c School of Computer Science, Shaanxi Normal University, Xi'an 710121, China
d Centre for Computer and Information Security Research, University of Wollongong, Wollongong, NSW 2522, Australia
ARTICLE INFO

Keywords: Mobile social networks; Private set intersection; Oblivious pseudorandom function; Private information retrieval

ABSTRACT

Mobile social networks have become integral to our daily lives, transforming communication methods and facilitating social interactions. With technological advancements, users generate vast amounts of valuable and sensitive personal data, which is stored on servers to enable instant information sharing. To protect the shared data, each platform has implemented techniques such as end-to-end encryption mechanisms, fully homomorphic encryption, etc. However, these approaches face several security and privacy challenges, including potential leaks of user data, vulnerabilities in encryption that expose private ciphertexts to probabilistic attacks, and threats posed by future quantum computers.

To address these issues, we introduce a private set intersection (PSI) protocol based on an oblivious pseudorandom function (OPRF) under the ring-LPR lattice problem. The proposed perturbed pseudorandom generator not only enhances the PSI's resistance to probabilistic attacks, but also yields a more efficient OPRF and PSI. The protocol boasts a time complexity of O(n log n) and is superior to existing well-known fast post-quantum PSI protocols operating at O(mn log(mn)), where m is the bit length of the cryptographic modulus and n represents the dimension of the security parameter. Simulation experiments and security analyses demonstrate that our proposal effectively preserves user privacy, ensures collusion resilience, verifies computation results, and maintains low computational costs. Finally, as an extension of our OPRF, we also give a fast private information retrieval (PIR) protocol.
1. Introduction

Mobile social networks have greatly enriched the ways people communicate and enhanced the convenience of social interactions. With the development of technology, users generate a large amount of useful and sensitive personal data within mobile social networks. This data often needs to be stored and processed to provide more personalized services and experiences [1,2]. However, due to the limited storage capacity of mobile social network devices, it is impossible to store all the data generated at any given moment, which presents challenges for data storage and privacy protection.

To address this issue while ensuring data confidentiality and security, many mobile social network platforms have started adopting advanced privacy-preserving technologies, such as private set intersection (PSI). This technology allows two or more parties to securely compute the intersection of their respective data sets without disclosing them. In this way, even when data is stored in distributed systems, it can effectively prevent data breaches and violations of user privacy, such as those caused by data leaks or unauthorized access. The application of PSI in mobile social networks not only enhances data security but also strengthens user trust in the platform, which is crucial for protecting user privacy and improving the platform's competitiveness. Mobile social networks can thus continue to provide a rich and vibrant social experience and efficient information services while safeguarding personal privacy. Furthermore, as an important application in the field of privacy computing, PSI has recently garnered widespread attention due to its efficiency and practicality, jointly promoting the rapid implementation of privacy computing technology and ensuring the secure flow and value extraction of data elements.
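The PSI goal described here can be pictured with a tiny sketch of the ideal functionality, i.e., what a trusted third party would compute; the contact lists are invented for illustration, and this is not the lattice-based protocol of this paper:

```python
# Ideal PSI functionality: both parties learn only the intersection of
# their private sets, nothing else. A PSI protocol emulates this trusted
# third party without ever pooling the raw sets. The example contact
# lists below are invented purely for illustration.

def ideal_psi(set_a, set_b):
    """Trusted-party view: output the intersection and nothing more."""
    return set_a & set_b

alice = {"carol@example.com", "dave@example.com", "erin@example.com"}
bob = {"dave@example.com", "frank@example.com"}

# Only the common contact is revealed; Alice never sees Bob's other
# contacts and vice versa.
print(sorted(ideal_psi(alice, bob)))
```

A real protocol must reach the same output through interaction over masked values, which is where the OPRF construction of this paper comes in.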
✩ This document is the result of a research project funded by the National Science Foundation.
* Corresponding author.
E-mail addresses: arcsec30@stu.xidian.edu.cn (Z. Shan), lyzhang@mail.xidian.edu.cn (L. Zhang), xiyouwuq@126.com (Q. Wu), laiqq@snnu.edu.cn (Q. Lai),
fuchun@uow.edu.au (F. Guo).
https://doi.org/10.1016/j.sysarc.2025.103346
Received 3 November 2024; Received in revised form 24 December 2024; Accepted 16 January 2025
Available online 25 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
There are many common construction tools for PSI [3], and oblivious transfer (OT) is one of them. An OT [4] is a crucial tool for secure multiparty computation. In this tool, the sender transmits data from a set of messages to the receiver but remains oblivious to which specific message was sent, while the receiver is unaware of the other messages they did not receive. This protocol is also known as the oblivious transfer protocol. The essence of an oblivious pseudorandom function is a pseudorandom function (PRF) enhanced with oblivious transfer capabilities.

In 1986, Goldreich, Goldwasser, and Micali introduced a new cryptographic primitive known as the pseudorandom function, whose output appears to be randomly chosen [5]. Two decades later, Naor and Reingold [6] noticed that their number-theoretic PRF allows for an interactive and oblivious evaluation, where a client with input x obtains F_k(x) for a function F_k(·) that is contributed by a server. Neither does the client learn the function (i.e., its key k), nor does the server learn x or F_k(x). Freedman et al. later called such a two-party protocol an OPRF and gave the first formal definitions and two OPRFs based on the Naor-Reingold PRF [7]. In 2009, Jarecki and Liu presented an efficient OPRF for securing intersection data [8].

Oblivious pseudorandom functions have been utilized in PSI [9]. The additional functionalities of oblivious pseudorandom functions also exhibit diversity, such as verifiable oblivious pseudorandom functions (VOPRF, [10]) and partially oblivious pseudorandom functions (POPRF, [11]).

Currently, OPRFs still face challenges, as summarized by Casacuberta, Hesse, and Lehmann [12]. Efficient OPRF constructions often rely on discrete-log or factoring-type hardness assumptions, which are vulnerable to quantum computers. This paper aims to address this by constructing OPRFs based on lattice hardness assumptions and improving their efficiency (see Figs. 1 and 2).

Fig. 1. Mobile social networks.

Fig. 2. Private set intersection.

1.1. Contributions

Regarding the open problem proposed by Casacuberta, there are currently quantum-resistant OPRFs, namely Albrecht et al.'s lattice-based VOPRF [10] and Boneh et al.'s isogeny-based OPRF [13]. Both constructions represent significant feasibility results but require further research to improve their efficiency [12]. So, a fast post-quantum private set intersection from oblivious pseudorandom function is proposed in this paper, and it has the following advantages:

• Asymmetric encryption is adopted, which is efficient and reduces the risk of privacy leakage. The PSI in this paper is constructed based on OPRF, which belongs to asymmetric encryption, thus reducing the number of interactions between users and lowering the risk of user privacy leakage. Compared to symmetric encryption, the operational cost of this asymmetric encryption is lower, reducing reliance on authoritative institutions.

• The structure of the OPRF is simple, and it is relatively efficient among post-quantum OPRFs. The OPRF used to construct PSI in this paper is based on a new lattice problem, namely the learning parity with rounding over rings problem (Ring-LPR). The Ring-LPR problem not only has a simple structure but also possesses the capability to resist quantum attacks.

• A perturbed pseudorandom generator (PPRG) can withstand probabilistic attacks. In addition to the OPRF, the PSI in this paper also includes a structure with a perturbed pseudorandom generator, which can overcome the weakness of weak encryption in symmetric encryption, thereby preventing adversaries from guessing the corresponding plaintext using statistical methods on the ciphertext ratios.

1.2. Technical overview

We adopted the oblivious transfer technique and Hamming correlation robustness, both of which are used in the OPRF construction presented in this paper. For the underlying pseudorandom function, we initially aimed to use learning parity with noise (LPN) over rings. However, this approach results in varying encryption outcomes for the same private data, preventing the recipient from matching the private data. Thus, we sought to make LPN over rings behave consistently, like learning with rounding (LWR), leading to the introduction of the concept of learning parity with rounding over rings (LPR over rings) in this paper.

To prove that LPR over rings is quantum-resistant, we established a reduction bridge between LPR over rings and LWR; that is, LPR over rings is reduced to LWR, not to LPN over rings. From (q = 2^n, p)-LWR instances, we demonstrated the hardness of (q = 2, p = 1)-LWR instances and of (q = 2, p = 1)-LWR over rings, where (q = 2, p = 1)-LWR over rings corresponds to LPR over rings. To verify that the post-quantum OPRF in this paper is computationally fast, we compared it with the LWE-instantiated OPRF from [14]. The results showed that, as the theoretical analysis suggested, the computational advantage grows as the security parameter increases.

Based on the OPRF, we constructed a private set intersection (PSI) protocol. Since the paper [15] analyzed that PSI based on symmetric encryption does not resist probabilistic attacks and proposed the concept of a perturbed pseudorandom generator, we used LPN over rings to construct a pseudorandom generator and proved that it satisfies the definition of a PPRG as given in [15].

1.3. Organizations

The structure of this paper is as follows. Section 2 provides the necessary definitions and lemmas as a foundation for the reader. Section 3 presents the construction and efficiency analysis of the OPRF, along with the definition and reduction of Ring-LPR. Section 4 details the construction of the PSI in this paper, its security proofs and the efficiency comparison with the LWE-based scheme, as well as the construction of the PPRG and the proof of its pseudorandomness. Finally, Section 5 summarizes the advantages and limitations of the PSI presented in this paper, as well as the extension of the OPRF to PIR.
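The design choice sketched in the technical overview, namely that fresh LPN noise makes two encodings of the same input differ while rounding makes the encoding repeatable, can be illustrated with a toy mod-2 sketch (dimensions and noise rate are our own choices, not the paper's parameters):

```python
import random

n = 16
random.seed(1)
A = [[random.randrange(2) for _ in range(n)] for _ in range(n)]
s = [random.randrange(2) for _ in range(n)]

def lpn_sample(A, s):
    # LPN-style encoding: A*s + e over Z_2 with fresh Bernoulli noise e,
    # so two encodings of the same s generally differ.
    e = [1 if random.random() < 0.25 else 0 for _ in range(len(A))]
    return [(sum(a * si for a, si in zip(row, s)) + ei) % 2
            for row, ei in zip(A, e)]

def rounded_sample(A, s):
    # Rounding-style encoding (the LWR/LPR idea): replace random noise by
    # a deterministic rounding of A*s, so the output is repeatable.
    return [(sum(a * si for a, si in zip(row, s)) % 4) // 2 for row in A]

print(lpn_sample(A, s) == lpn_sample(A, s))          # usually False: fresh noise
print(rounded_sample(A, s) == rounded_sample(A, s))  # always True: deterministic
```

Determinism is exactly what lets the receiver re-derive and match an encoded element, which motivates moving from LPN over rings to LPR over rings.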
2. Preliminary

Each element of a lattice Λ in R^n can be expressed as an integer linear combination of n linearly independent vectors. This set of linearly independent vectors is called a lattice basis, and the lattice basis is not unique. Given a lattice basis (v_1, …, v_n) of the lattice Λ, the fundamental parallelepiped is

P(v_1, …, v_n) = { ∑_{i=1}^{n} k_i v_i | k_i ∈ [0, 1) }.

If the lattice basis (v_1, …, v_n) is fixed, we write P(Λ) for P(v_1, …, v_n). For any x ∈ R^n, project it onto P(Λ); by the properties of projection, there is a unique y ∈ P(Λ) such that y − x ∈ Λ.

Use the symbol det(Λ) to represent the volume of the fundamental parallelepiped of the lattice Λ. In other words, det(Λ) is the determinant of the matrix composed of a lattice basis (v_1, …, v_n). For a given n-dimensional lattice, det(Λ) is the same for every choice of basis.

Given an n-dimensional lattice Λ, let (v_1, …, v_n) and (u_1, …, u_n) be two arbitrary bases of Λ. Then v_i = ∑_{j=1}^{n} m_{ij} u_j and u_i = ∑_{j=1}^{n} m'_{ij} v_j for i ∈ {1, …, n}, i.e., there are two integer matrices M and M' such that

(v_1, …, v_n)^T = M (u_1, …, u_n)^T and (u_1, …, u_n)^T = M' (v_1, …, v_n)^T.

It is easy to prove that M and M' are inverses of each other. Since both are integer matrices, det(M) · det(M') = 1 and det(M) = det(M') = ±1, so

det(v_1, …, v_n) = ± det(u_1, …, u_n).

Definition 1. An ideal is a subset of a ring (or field) that satisfies the following two properties:

1. Additive closure: If any two elements in the ideal are added, the result is still in the ideal. In other words, for any elements a and b in the ideal, a + b also belongs to that ideal.
2. Multiplicative absorptivity: If an element in the ideal is multiplied by any element in the ring (or field), the result is still in the ideal. In other words, for any element a in the ideal and any element r in the ring (or field), ar and ra belong to that ideal.

For a commutative ring, we further require that the ideal be closed under both addition and multiplication. Such an ideal is called a true ideal.

Definition 2. Referring to the definition of an ideal, the ideal lattice Λ̃ is a subset of the lattice Λ that satisfies the following two properties:

1. Additive closure: If any two elements in an ideal lattice are added, the result is still in the ideal lattice. In other words, for any elements a and b in an ideal lattice, a + b also belongs to that ideal lattice.
2. Multiplicative absorptivity: If an element in an ideal lattice is multiplied by an element in any other ideal lattice, the result remains in the ideal lattice. In other words, for any element a in the ideal lattice and any element r in another ideal lattice, both ar and ra belong to that ideal lattice.

Corollary 1. The ideal lattice Λ̃ is a true ideal of the lattice Λ.

A polynomial f(x) = a_0 + a_1 x + ⋯ + a_{n−1} x^{n−1} is mapped to

Rot(f) = a_0 I + a_1 X + ⋯ + a_{n−1} X^{n−1} ∈ Λ̃,

where Λ̃ is the image of Z[x]/⟨x^n + 1⟩ in the ideal lattice Λ, and

X = ( 0 0 0 ⋯ 0 1 ; 1 0 0 ⋯ 0 0 ; 0 1 0 ⋯ 0 0 ; 0 0 1 ⋯ 0 0 ; ⋮ ⋮ ⋮ ⋱ ⋮ ⋮ ; 0 0 0 ⋯ 1 0 ).

So there is

Rot(f) = ( a_0 a_{n−1} ⋯ a_1 ; a_1 a_0 ⋯ a_2 ; ⋮ ⋮ ⋱ ⋮ ; a_{n−1} a_{n−2} ⋯ a_0 ),

and it is easy to prove that this mapping is an isomorphism.

Definition 3 (Learning with Rounding, [16,17]). Let λ be the security parameter and n = n(λ), m = m(λ), q = q(λ), p = p(λ) be integers. The LWR problem states that for A ∈ Z_q^{m×n}, s ∈ Z_q^n, u ∈ Z_q^m, the following distributions are computationally indistinguishable: (A, ⌊As⌋_p) ≈_C (A, ⌊u⌋_p). Here ⌊x⌋_p = ⌊(p/q)x⌋, where ⌊·⌋ is the floor function, which rounds down to the nearest integer; for example, ⌊3.14⌋ = 3 and ⌊3⌋ = 3.

Definition 4 (Learning Parity with Noise, [18,19]). Let λ be the security parameter and n = n(λ), m = m(λ) be integers. The LPN problem states that for A ∈ Z_2^{m×n}, s ∈ Z_2^n, u, e ∈ Z_2^m, the following distributions are computationally indistinguishable: (A, As + e) ≈_C (A, u).

Definition 5 (Hamming Correlation Robustness, [14]). For a hash function H(·) and a pseudorandom function F_k(·) with key k, H(·) is Hamming correlation robust if H(x) ≈_C F_k(x).

Definition 6 (OT). The message sender sends data to the receiver from a set of pending messages but remains oblivious to which specific message was sent. Meanwhile, the receiver is unaware of the additional data they did not request. This protocol is also known as oblivious transfer.

Definition 7 (OPRF, [20]). Let the PRF key k consist of two bit-strings q, s ∈ {0,1}^λ. Let F(·) be a pseudorandom code that produces a pseudorandom string, and let H be a hash function. The pseudorandom function is computed as

OPRF_k(x) = H(q ⊕ [F(x) · s]),

where · denotes bitwise AND and ⊕ denotes bitwise XOR. For a randomly generated s, if F(x) has enough Hamming weight, then the function OPRF_k(x) is pseudorandom, assuming the hash function H is correlation robust.

Definition 8 (PSI, [14]). PSI enables two parties, each holding a private set of elements, to compute the intersection of the two sets while revealing nothing more than the intersection itself.

Definition 9 (Dihedral Coset Problem). Given a security parameter κ, an instance of the DCP_q^ℓ problem, where q denotes the modulus and ℓ represents the number of states, consists of states of the form

|0⟩|x_i⟩ + |1⟩|(x_i + s) mod q⟩, i ∈ [ℓ],

each storing 1 + ⌈log_2 q⌉ bits, where x_i ∈_R Z_q^n and s ∈ Z_q^n. If s can be computed with probability poly(1/log q) in time poly(log q), then the DCP_q^ℓ problem is considered to be broken.
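Definition 7's masking structure can be sketched with standard-library stand-ins: SHA-256 plays the correlation-robust hash H, and a hash-based expander plays the pseudorandom code F. Both instantiations and the parameter LAM are placeholders of ours, not the paper's choices:

```python
import hashlib
import random

LAM = 128  # toy security parameter (our choice, not the paper's)

def F(x: bytes, nbits: int) -> int:
    # Placeholder pseudorandom code: expand x into an nbits-bit string.
    h = hashlib.sha256(b"prc|" + x).digest()
    return int.from_bytes(h, "big") % (1 << nbits)

def oprf(k, x: bytes) -> bytes:
    # Definition 7: OPRF_k(x) = H(q XOR (F(x) AND s)), with key k = (q, s).
    q, s = k
    masked = q ^ (F(x, LAM) & s)
    return hashlib.sha256(masked.to_bytes(LAM // 8, "big")).digest()

random.seed(0)
key = (random.getrandbits(LAM), random.getrandbits(LAM))
assert oprf(key, b"alice") == oprf(key, b"alice")  # deterministic per key
assert oprf(key, b"alice") != oprf(key, b"bob")    # distinct inputs differ
```

The oblivious part of the real primitive, i.e., evaluating this function without the client learning k or the server learning x, is what the OT-based construction of Section 3 provides.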
Note 1. The Dihedral Coset Problem is a difficult problem in quantum computing, and solving it has a time complexity of O(e^n) or O(n!).

Lemma 1. If an efficient algorithm A can solve DCP_2^ℓ in polynomial time, then there exists an efficient algorithm B that can solve DCP_q^ℓ in polynomial time.

Proof. Suppose q = 2^n and there exists an efficient algorithm A that can solve DCP_2^ℓ in polynomial time. For instances of DCP_4^ℓ, we have

|0⟩|x_i⟩ + |1⟩|(x_i + s) mod 4⟩ = |0⟩|x_i'⟩ + |1⟩|(x_i' + s') mod 2⟩ + 2(|0⟩|x_i''⟩ + |1⟩|(x_i'' + s'') mod 2⟩), i ∈ [ℓ],

so running the algorithm A twice will solve DCP_{4=2^2}^ℓ. Similarly, running A four times will solve DCP_{16=2^4}^ℓ, and continuing in this manner, running the algorithm A n times will solve DCP_q^ℓ. Let O(A) represent the time complexity of the algorithm A. Thus, the resulting algorithm B runs in time nO(A), and B is an efficient algorithm. □

Definition 10 (Extrapolated Dihedral Coset Problem with Modulus 2, [21]). Given a security parameter κ, an instance of EDCP_{n,2,ρ}^ℓ is provided, where 2 denotes the modulus, ρ represents the probability density function, and ℓ denotes the number of states. Each state is expressed as

∑_{j∈supp(ρ)} ρ(j)|j⟩|(x_i + j·s) mod 2⟩, i ∈ [ℓ],

and stores 2 bits, where x_i ∈_R Z_2^n and s ∈ Z_2^n. If s can be determined with probability poly(1/(n log 2)) in time poly(n log 2), then the EDCP_{n,2,ρ}^ℓ problem is considered to be broken.

Lemma 2. If there exists an algorithm for solving EDCP_{n,2,ρ}^ℓ, then this algorithm can also solve DCP_2^ℓ.

Proof. Let

|b⟩ = (1/√2)|0⟩|x_i⟩ + (1/√2)|1⟩|(x_i + s) mod 2⟩.

Thus, ρ(0)|0⟩ = (1/√2)|0⟩ and ρ(1)|1⟩ = (1/√2)|1⟩. Hence, DCP_2^ℓ is a special case of EDCP_{n,2,ρ}^ℓ. Therefore, if there exists an algorithm for solving EDCP_{n,2,ρ}^ℓ, this algorithm can also solve DCP_2^ℓ. □

Lemma 3 ([21]). Let (n, q, r = Ω(√κ)) be an instance of G-EDCP and (n, q, α) be an instance of LWE. If there exists an algorithm for solving LWE_{n,q,α}, then there exists an algorithm for solving G-EDCP_{n,q,ρ_r}^ℓ.

Corollary 2. Let (n, 2, r = Ω(√κ)) be an instance of G-EDCP and (n, 2, α) be an instance of LPN. If there exists an algorithm for solving LPN_{n,α}, then there exists an algorithm for solving G-EDCP_{n,2,ρ_r}^ℓ.

3. Ring-LPR based OPRF

3.1. Constructing OPRF

Fig. 3 presents the ring-LPR-based oblivious pseudorandom function. In the next subsection, we prove the security of this oblivious pseudorandom function.

3.2. Security proof of OPRF

In this subsection, we provide the definition of the underlying lattice problem for the OPRF, learning parity with rounding, and its reduction proof.

Definition 11 (Learning Parity with Rounding). Let λ be the security parameter and n = n(λ), m = m(λ) be integers. The LPR problem states that for A ∈ Z_2^{m×n}, s ∈ Z_2^n, u ∈ Z_2^m, the following distributions are computationally indistinguishable: (A, ⌊As mod 4⌋_1) ≈_C (A, ⌊u⌋_1).

Definition 12 (Learning Parity with Rounding over Rings). The Ring-LPR problem states that for a, s, u ∈ R_2, the following distributions are computationally indistinguishable: (a, ⌊as mod 4⌋_1) ≈_C (a, ⌊u⌋_1).

Lemma 4. For an LWR problem instance ⌊As⌋_p, if there exists an algorithm A for solving s from ⌊As⌋_1, then there also exists an algorithm B for solving the LWR problem.

Proof. Given that there exists an algorithm A that can solve ⌊As⌋_1 = ⌊(1/q)As⌋, for an LWR problem instance ⌊As⌋_p we have

(1/p)⌊As⌋_p = (1/p)⌊(p/q)As⌋
            = (1/p)((p/q)As + e)   (e ∈ (−1, 0]^m)
            = (1/q)As + e'          (e' ∈ (−1/p, 0]^m)
            ≈ ⌊As⌋_1.

Thus, the algorithm A can be used to solve the LWR problem. □

We obtain the next corollaries from Lemma 3.

Corollary 3. Let (n, 2, r = Ω(√κ)) be an instance of G-EDCP and (n, 2, α) be an instance of 2-LWR. If there exists an algorithm for solving 2-LWR, then there exists an algorithm for solving G-EDCP_{n,2,ρ_r}^ℓ.

Corollary 4. Let (n, 2, r = Ω(√κ)) be an instance of G-EDCP and (n, 2, α) be an instance of LPR. If there exists an algorithm for solving LPR, then there exists an algorithm for solving G-EDCP_{n,2,ρ_r}^ℓ.

Lemma 5. If there exists an algorithm A for solving the Ring-LPR problem, then there also exists an algorithm B for solving the LPR problem.

Proof. For an instance of the Ring-LPR problem

b = ⌊a · s⌋_1,

where a = a_0 + a_1 x + ⋯ + a_{n−1} x^{n−1}, we can represent a as a circulant matrix, specifically

A_1 = ( a_0 a_{n−1} ⋯ a_1 ; a_1 a_0 ⋯ a_2 ; ⋮ ⋮ ⋱ ⋮ ; a_{n−1} a_{n−2} ⋯ a_0 ).

Thus,

b = ⌊a · s⌋_1 ⇒ b⃗ = ⌊A_1 s⃗⌋_1,

where a⃗ = (a_0, a_1, …, a_{n−1}) is the coefficient vector of a = a_0 + a_1 x + ⋯ + a_{n−1} x^{n−1}. We use a proof by contradiction. Suppose there exists an efficient algorithm A that can solve Ring-LPR in polynomial time. We take the first row of A_1, denote it as α_1, and have ⌊α_1 s⃗⌋_1 = b_1, where b_1 is the first component of b⃗. For the LWR problem instance β⃗ = ⌊Λs⃗⌋_1, assume
Λ^T = (α_1, α_2, …, α_m).

Thus, we use the algorithm A m times to find β_i such that ⌊γ_i⌋_1 = β_i = ⌊α_i s⃗⌋_1, and thus we can solve the equation

γ⃗ = Λs⃗, γ⃗^T = (γ_1, …, γ_m).

Assume that the time complexity of solving s from an LWR problem instance is O(Λ, β). According to Corollary 3, letting O(γ⃗ = Λs⃗) be the computational complexity of solving the equation γ⃗ = Λs⃗, we have

mO(A) + O(γ⃗ = Λs⃗) ≥ O(Λ, β) ≥ O(n!) or O(e^n).

Let m = n; then

O(A) ≥ (O(Λ, β) − O(γ⃗ = Λs⃗))/n ≥ (O(n!) − O(γ⃗ = Λs⃗))/n or (O(e^n) − O(γ⃗ = Λs⃗))/n.

This contradicts the assumption that there is an efficient algorithm A that can solve the Ring-LPR problem in polynomial time; thus the lemma holds. □

Fig. 3. Oblivious Pseudorandom Function (OPRF).
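The circulant step in the proof of Lemma 5, i.e., viewing multiplication by a in Z_2[x]/⟨x^n + 1⟩ as the matrix A_1 acting on the coefficient vector of s, can be checked numerically (toy dimensions of our own choosing; over Z_2 the reduction by x^n + 1 wraps cyclically, since −1 = 1):

```python
import random

def circulant(a):
    # Rot(a): circulant matrix of the coefficient vector a, matching the
    # matrix A_1 in the proof of Lemma 5 (row i, column j holds a[(i-j) mod n]).
    n = len(a)
    return [[a[(i - j) % n] for j in range(n)] for i in range(n)]

def ring_mul_mod2(a, s):
    # Multiply a(x)*s(x) in Z_2[x]/<x^n + 1>; over Z_2 we have x^n = -1 = 1,
    # so reduction wraps coefficient indices around cyclically.
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, sj in enumerate(s):
            out[(i + j) % n] = (out[(i + j) % n] + ai * sj) % 2
    return out

random.seed(7)
n = 8
a = [random.randrange(2) for _ in range(n)]
s = [random.randrange(2) for _ in range(n)]

A1 = circulant(a)
matvec = [sum(A1[i][j] * s[j] for j in range(n)) % 2 for i in range(n)]
assert matvec == ring_mul_mod2(a, s)
```

Each row α_i of A_1 then yields one inner-product sample, which is exactly how the proof feeds a Ring-LPR instance to the LWR solver row by row.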
3.3. Efficiency analysis

This section simulates the computational efficiency of the OPRF in this paper and the OPRF in [14] on a Mac, a Pad, and a Phone. The PRF of [14] is instantiated based on LWE.

3.3.1. Efficiency analysis on Mac

The tool used in this subsection is Python 3.12; the programs are run on a MacBook Air with an Apple M1 chip and 8.00 GB of RAM (see Fig. 4).

3.3.2. Efficiency analysis on mobile pad

The tool used in this subsection is Pydroid 3; the programs are run on a Xiaomi Pad 6 Pro with a Qualcomm(R) Snapdragon 8+ mobile platform @3.2 GHz (with Qualcomm AI Engine) and 8.00+3.00 GB of RAM (see Fig. 5).

Fig. 4. Parallel comparison of OPRF on Mac, where n represents the security parameter; unit is microseconds.
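The timing setup in these subsections can be reproduced with a minimal harness such as the following; the workload shown is a stand-in mod-2 matrix-vector product, not the authors' actual OPRF code, and all parameter values are our own:

```python
import random
import time

def time_op(fn, repeats=50):
    # Median-of-repeats wall-clock timing, reported in microseconds,
    # the unit used in Figs. 4 and 5.
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e6)
    return sorted(samples)[len(samples) // 2]

def toy_workload(n):
    # Stand-in for one evaluation at security parameter n:
    # a random mod-2 matrix-vector product of dimension n.
    A = [[random.randrange(2) for _ in range(n)] for _ in range(n)]
    s = [random.randrange(2) for _ in range(n)]
    return lambda: [sum(r[j] * s[j] for j in range(n)) % 2 for r in A]

for n in (50, 100, 200):
    print(n, round(time_op(toy_workload(n)), 1), "us")
```

Taking the median rather than the mean damps the scheduling fluctuations that are visible in the mobile-pad measurements.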
3.3.3. Summary of data comparison

From the simulation results, it can be seen that for n ≤ 250, the LWE-based OPRF in [14] is slightly faster, while for n > 250, the ring-LPR-based OPRF in this paper is faster. Furthermore, as n increases, the advantages of ring-LPR become more pronounced. Based on the simulation results for the Pad, the OPRF in this paper is more stable; although there are fluctuations, they are less significant compared to the LWE-based OPRF in [14].

4. PSI based on OPRF

In this paper, apart from the OPRF, another tool used in the construction of the PSI is a perturbed pseudorandom generator [15]. The perturbed pseudorandom generator in this paper is constructed from Ring-LPN.
Fig. 5. Parallel comparison of OPRF on mobile pads, where n represents the security parameter; unit is microseconds.

Next, we present the reduction process for Ring-LPN.

4.1. Reduction of Ring-LPN

Definition 13 (Learning Parity with Noise over Rings). The learning parity with noise over rings problem states that for a, s, e, u ∈ R_2, the following distributions are computationally indistinguishable: (a, as + e) ≈_C (a, u).

Corollary 5. If there exists an efficient algorithm A that can solve the Ring-LPN problem in polynomial time, then there also exists an algorithm that can solve the LPN problem.

Proof. The proof method is similar to that of Lemma 5, but here the computational complexity of A decreases. If we want the Ring-LPN problem to be approximately as hard as the LPN problem, then for the security parameters κ_1 of the Ring-LPN problem and κ_2 of the LPN problem, we need

e^{κ_1}/κ_1^2 ≥ e^{κ_2}, or (κ_1)!/κ_1^2 ≥ (κ_2)!.

Thus, we can roughly obtain κ_1 ≥ 1.5κ_2 and κ_2 ≥ 12. Note that O(n) is an asymptotically large quantity with respect to n. We use the most extreme case to determine the relationship between κ_1 and κ_2. □

4.2. Perturbed pseudorandom generator

Definition 14. Let a = a_0 + a_1 x + ⋯ + a_{n−1} x^{n−1} ∈ R_2. Define the norm of a as

‖a‖ = √(∑_{i=0}^{n−1} |a_i|^2).

Definition 15 ([15]). A pseudorandom generator with perturbation, denoted G_γ(·), is defined such that for x_1, x_2 ∈ X, there exists γ satisfying the following conditions:

1. When x_1 = x_2, Pr(G_γ(x_1) = G_γ(x_2)) ≤ O(exp(−n)) and ‖G_γ(x_1) − G_γ(x_2)‖ < γ;
2. When x_1 ≠ x_2, there exists N such that ‖G_γ(x_1) − G_γ(x_2)‖ ≥ γN, where clearly N = 1 is optimal.

Theorem 1. The Ring-LPN problem itself can be viewed as a pseudorandom generator with perturbation.

Proof. We prove each property separately. First, when x_1 = x_2, we have

Pr(G_γ(x_1) = G_γ(x_2)) = Pr(e_1 = e_2) = 1/2^n.

Additionally, set γ = n + 1, so

‖(Ax_1 + e_1) − (Ax_2 + e_2)‖ = ‖e_1 − e_2‖ < γ.

When x_1 ≠ x_2, set v_1 = G_γ(x_1) and v_2 = G_γ(x_2); then

Pr(‖v_1 − v_2‖ ≤ √n) = ∑_{k=0}^{n} C_n^k (1/3)^k (1/2)^{n−k} + ∑_{k=0}^{⌊n/2⌋} C_n^k (1/3)^k (1/6)^k (1/2)^{n−2k}.

Because

∑_{k=0}^{n} C_n^k (1/3)^k (1/2)^{n−k} = (1/2^n)((2/3) + (2/3)^2 + ⋯ + (2/3)^n) ≤ (3/2^n)(1 − (2/3)^n),

and

∑_{k=0}^{⌊n/2⌋} C_n^k (1/3)^k (1/6)^k (1/2)^{n−2k} ≤ (3·6/17)(1/2^n)(1 − (1/(3·6·2^n))^{2n}),

therefore

Pr(‖v_1 − v_2‖ ≤ √n) < (√n + 1)/2^n ≤ √n/2^{n−1}.

Thus, there is a very high probability that ‖v_1 − v_2‖ ≥ n + 1, and N = 1 (see Fig. 6). □

Fig. 6. Pseudorandom generator with perturbation G_γ(·).

4.3. PSI based on OPRF

Lemma 6. Assuming f(y) ≈_C u_1 and g(u_1) ≈_C u_2, then (g∘f)(y) ≈_C u_2.
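Theorem 1's same-input behavior can be probed with a toy Ring-LPN-style generator (matrix dimension and Bernoulli noise rate are our own toy choices): two evaluations on the same input differ only by their noise vectors, so the Hamming distance stays below the bound γ = n + 1 used in the proof.

```python
import random

n = 32
random.seed(3)
A = [[random.randrange(2) for _ in range(n)] for _ in range(n)]

def G(x):
    # Perturbed generator in the spirit of Theorem 1: A*x + e over Z_2,
    # with a fresh Bernoulli noise vector e acting as the perturbation.
    e = [1 if random.random() < 0.25 else 0 for _ in range(n)]
    return [(sum(a * xi for a, xi in zip(row, x)) + ei) % 2
            for row, ei in zip(A, e)]

x = [random.randrange(2) for _ in range(n)]
v1, v2 = G(x), G(x)

# Same input: the outputs differ only by the noise, so the (Hamming) norm
# of the difference stays below the bound gamma = n + 1 from the proof.
diff = sum(abs(b1 - b2) for b1, b2 in zip(v1, v2))
assert diff < n + 1
```

An exact collision G(x) = G(x) would require the two noise vectors to coincide, which happens with the exponentially small probability analyzed in the proof.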
Fig. 7. PSI based on OPRF.

Fig. 8. Parallel comparison of PSI on Mac, where n represents the security parameter; unit is microseconds.

Fig. 9. Parallel comparison of PSI on mobile pads, where n represents the security parameter; unit is microseconds.

Fig. 10. Comparison of PSI on mobile phones, where n represents the security parameter; unit is microseconds.
Fig. 11. PIR based on OPRF.

Lemma 7. Find a suitable pseudorandom function F̃_k : {0,1}^λ × {0,1}^* → {0,1}^ω. Assuming that the pseudorandom function F_k : {0,1}^λ × {0,1}^{ℓ_1} → {0,1}^ω and the hash function H_1 : {0,1}^* → {0,1}^{ℓ_1} are indistinguishable from random, we have

F̃_k(y) ≈_C F_k(H_1(y)).

Proof. On one hand, because of the pseudorandom function F̃_k : {0,1}^λ × {0,1}^* → {0,1}^ω, for any k ∈ {0,1}^λ and y ∈ Y ⊂ {0,1}^*, we have F̃_k(y) ≈_C u_ω ∈ {0,1}^ω.

On the other hand, due to the pseudorandom function F_k : {0,1}^λ × {0,1}^{ℓ_1} → {0,1}^ω, for u_{ℓ_1} ∈ {0,1}^{ℓ_1}, we have F_k(u_{ℓ_1}) ≈_C u_ω. According to the property of the hash function, we have H_1(y) ≈_C u_{ℓ_1}. Combining with Lemma 6, one can obtain that F_k(H_1(y)) ≈_C u_ω. Consequently, F̃_k(y) ≈_C F_k(H_1(y)). □

Theorem 2. If H_1 is a collision-resistant hash function and H_2 and H_3 are Hamming correlation robust, then the protocol in Fig. 7 securely realizes PSI in the semi-honest model when the parameters m, w are chosen as described in [14].

Proof. Perspective from P_1.

Hyb0: P_1's view and P_2's output in the real protocol.

Hyb1: Same as Hyb0 except that on P_2's side, for each i ∈ [ω], if s[i] = 0, then sample A_i ← {0,1}^m and compute B_i = A_i ⊕ D_i; otherwise sample B_i ← {0,1}^m and compute A_i = B_i ⊕ D_i. This hybrid is identical to Hyb0.

Hyb2: Initialize an m × w binary matrix D to all ones. Denote its column vectors by D_1, …, D_ω, so D_1 = ⋯ = D_ω = 1^m. For y ∈ Y, randomly select v ← [m]^ω, and set D_i[v[i]] = 0 for all i ∈ [ω].

Hyb3: Find a suitable pseudorandom function F̃_k : {0,1}^λ × {0,1}^* → {0,1}^ω. For y ∈ Y, compute ṽ = F̃_k(y), randomly select v ← [m]^ω, and set D_i[v[i]] = 0 for all i ∈ [ω].

Hyb4: Let there be a pseudorandom function F : {0,1}^λ × {0,1}^{ℓ_1} → {0,1}^ω and a hash function H_1 : {0,1}^* → {0,1}^{ℓ_1}. For y ∈ Y, compute v' = F_k(H_1(y)), randomly select v ← [m]^ω, and set D_i[v[i]] = 0 for all i ∈ [ω].

Hyb5: Let there be a pseudorandom function F : {0,1}^λ × {0,1}^{ℓ_1} → {0,1}^ω, a Hamming correlation robust H_2 : {0,1}^ω → [m]^ω, and a hash function H_1 : {0,1}^* → {0,1}^{ℓ_1}. For y ∈ Y, compute v' = F_k(H_1(y)) and v = H_2(v'), and set D_i[v[i]] = 0 for all i ∈ [ω].

Given that Hyb0 ≈_C Hyb1 ≈_C Hyb2 ≈_C Hyb3 and Hyb4 ≈_C Hyb5, and since according to Lemma 7 we know that Hyb3 ≈_C Hyb4, we have Hyb0 ≈_C Hyb5.

Perspective from P_2.

Hyb0: P_2's view in the real protocol.

Hyb1: ψ ← {0,1}^ℓ; all other aspects are consistent with the real protocol.

Hyb2: Introduce G_γ : {0,1}^ω → {0,1}^ℓ and a Hamming correlation robust H_3 : Z_{0,1}^{m×ω} → {0,1}^ω. Let the initial matrices be C_1 = ⋯ = C_ω = 1^m, randomly select v ∈ [m]^ω, and set C_i[v[i]] = 0 for all i ∈ [ω]. Compute G_γ(C_1[v[1]] ‖ ⋯ ‖ C_ω[v[ω]]).

Fig. 12. Parallel comparison of PIR on Mac, where n represents the security parameter; unit is microseconds.
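The matrix encoding used in the hybrids, an m × ω table of ones with one zero per column at positions derived from each element, can be sketched as follows; the position-deriving PRF is a SHA-256 placeholder of ours, not the paper's F_k:

```python
import hashlib

m, w = 64, 8  # toy table height and number of columns (our choice)

def positions(elem: bytes, key: bytes):
    # Placeholder for v = F_k(H_1(y)): derive one position per column.
    h = hashlib.sha256(key + elem).digest()
    return [h[i] % m for i in range(w)]

def encode(elems, key):
    # Hyb2-style matrix D: all ones, with D_i[v[i]] = 0 for each element.
    D = [[1] * m for _ in range(w)]
    for y in elems:
        for i, p in enumerate(positions(y, key)):
            D[i][p] = 0
    return D

key = b"shared-oprf-key"
D = encode({b"dave", b"erin"}, key)

def maybe_member(y, D, key):
    # An element's derived positions all hit zeros iff it was encoded
    # (up to collisions); this is how the intersection is detected.
    return all(D[i][p] == 0 for i, p in enumerate(positions(y, key)))

assert maybe_member(b"dave", D, key)
print(maybe_member(b"frank", D, key))  # almost surely False
```

In the real protocol the key is obtained obliviously via the OPRF, so neither party can evaluate the positions of elements it does not hold.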
Hyb3: Let the initial matrices be C_1 = ⋯ = C_ω = 1^m, and find an appropriate pseudorandom function F̃_k : {0,1}^λ × {0,1}^* → {0,1}^ω. For y ∈ Y, compute ṽ = F̃_k(y), randomly select v ← [m]^ω, and set C_i[v[i]] = 0 for all i ∈ [ω]. Compute G_γ(C_1[v[1]] ‖ ⋯ ‖ C_ω[v[ω]]).

Hyb4: Let the initial matrices be C_1 = ⋯ = C_ω = 1^m, and set a pseudorandom function F : {0,1}^λ × {0,1}^{ℓ_1} → {0,1}^ω, a hash function H_1 : {0,1}^* → {0,1}^{ℓ_1}, and a Hamming correlation robust H_3 : Z_{0,1}^{m×ω} → {0,1}^ω. For y ∈ Y, compute v' = F_k(H_1(y)), randomly select v ← [m]^ω, and set C_i[v[i]] = 0 for all i ∈ [ω]. Compute G_γ(H_3(C_1[v[1]] ‖ ⋯ ‖ C_ω[v[ω]])).

Hyb5: Let the initial matrices be C_1 = ⋯ = C_ω = 1^m, and set a pseudorandom function F : {0,1}^λ × {0,1}^{ℓ_1} → {0,1}^ω, a hash function H_1 : {0,1}^* → {0,1}^{ℓ_1}, and Hamming correlation robust functions H_2 : {0,1}^ω → [m]^ω and H_3 : Z_{0,1}^{m×ω} → {0,1}^ω. For y ∈ Y, compute v' = F_k(H_1(y)) and v = H_2(v'). Set C_i[v[i]] = 0 for all i ∈ [ω]. Compute G_γ(H_3(C_1[v[1]] ‖ ⋯ ‖ C_ω[v[ω]])).

Similarly, it can be proven that Hyb0 ≈_C Hyb5. □

Definition 16 (CPA Security Model of the Protocol in Fig. 7). Assume there exists a perturbed pseudorandom oracle machine PrM_γ (where γ is the upper bound on the norm of the perturbation in PrM_γ), such that for an input x, it outputs two values: one is a random value y_0, and the other is a pseudorandom value y_1 with x as its input.

• Setup: The simulator S generates the necessary parameters for the algorithms. The adversary A chooses s and sends it to the simulator S using OT.

• Hash Queries, PRF Queries and PRG Queries: The adversary A sequentially performs hash function queries, pseudorandom function queries, and pseudorandom generator queries. Here, the adversary cannot learn the key in the pseudorandom function queries.

• Challenge: The adversary A selects a private message m and sends it to the simulator S. The simulator queries the hash function, pseudorandom function, and oblivious transfer values of the real scheme, inputs these results into the pseudorandom oracle machine PrM_γ, obtains two ciphertexts c_0 and c_1, and sends them to the adversary A.

• Setup: The simulator S generates some necessary parameters for the algorithms and selects appropriate functions: a hash function H_1 : {0,1}^* → {0,1}^{ℓ_1}, Hamming correlation robust functions H_2 : {0,1}^ω → [m]^ω and H_3 : Z_{0,1}^{m×ω} → {0,1}^ω, a perturbed pseudorandom generator G_γ : {0,1}^ω → {0,1}^ℓ, and a pseudorandom function F : {0,1}^λ × {0,1}^{ℓ_1} → {0,1}^ω with key k ∈ {0,1}^λ. The adversary P_1 selects s and transmits s to the simulator S using OT.

• H-Query, PRF-Query and PRG-Query: The adversary P_1 makes queries about the hash functions, pseudorandom function, oblivious transfer values, and pseudorandom generator. The simulator S pre-establishes lists for handling the H-Queries, PRF-Queries, and PRG-Queries, respectively.

H_1-Query: For the i-th query x_i ∈ {0,1}^* for the value of H_1, the simulator S answers from the hash value list if available; otherwise it selects a random X_i ∈ {0,1}^{ℓ_1}, sets X_i = H_1(x_i), and updates the list accordingly.

H_2-Query: For the i-th query y_i ∈ {0,1}^ω for the value of H_2, the simulator S answers from the hash value list if available; otherwise it selects a random Y_i ∈ [m]^ω, sets Y_i = H_2(y_i), and updates the list accordingly.

H_3-Query: For the i-th query z_i ∈ Z_{0,1}^{m×ω} for the value of H_3, the simulator S answers from the hash value list if available; otherwise it selects a random Z_i ∈ {0,1}^ω, sets Z_i = H_3(z_i), and updates the list accordingly.

F-Query: For the i-th query u_i ∈ {0,1}^{ℓ_1} for the value of F, the simulator S answers from the pseudorandom function value list if available; otherwise it selects a random U_i ∈ {0,1}^ω, sets U_i = F(u_i, k), and updates the list accordingly.

G_γ-Query: For the i-th query w_i ∈ {0,1}^ω for the value of G_γ, the simulator S answers from the pseudorandom generator value list if available; otherwise it selects a random W_i ∈ {0,1}^ℓ, sets W_i = G_γ(w_i), and updates the list accordingly.

Note that G_γ is not black-box.

• Challenge: P_1 selects m ∈ M and sends it to S. S, using the corresponding hash function queries and pseudorandom function queries,
• Guessing After receiving the two ciphertexts 𝑐0 and 𝑐1 ,  guesses inputs the queried values into the black-box 𝐺𝛾 , obtaining 𝜓0 and 𝜓1 ,
which ciphertext corresponds to the encryption of 𝑚 and sends the and then sends 𝜓0 , 𝜓1 to 𝑃1 .
guess back to the simulator . • Guess Based on the received 𝜓0 and 𝜓1 , 𝑃1 guesses whether 𝜓0 or
The advantage of the adversary  is defined as the advantage of the 𝜓1 is the ciphertext of the encrypted message 𝑚.
simulator  in distinguishing the outputs of 𝑃 𝑟𝑀𝛾 . According to the assumption, if the adversary 𝑃1 can break the
scheme with a non-negligible advantage, then the simulator  can
Note 2. The 𝑃 𝑟𝑀 mentioned in this paper differs from [22]. In [22], also break the black-box 𝐺𝛾 with a non-negligible advantage. This
𝑃 𝑟𝑀 refers to a pseudorandom oracle machine that outputs random contradicts the assumption that 𝐺𝛾 is secure. □
values when the adversary does not know the pseudorandom function key,
and outputs pseudorandom function values based on the key known to the
adversary when the key is known. This is a single-value output. However, the 4.4. Efficiency analysis PSI
𝑃 𝑟𝑀 required in this paper outputs both of these values simultaneously,
making it a multi-value output. This section simulates the PSI computation efficiency of this pa-
per and PSI in [14] on MAC, Pad, and Phone. The PRF of [14] is
Theorem 3. If 1 is a collision resistant hash function, 2 and 3 are instantiated based on LWE.
hamming correlation robustness, then the protocol in Fig. 7 securely realizes
𝑃 𝑆 𝐼 in Definition 16.
4.4.1. Efficiency analysis on MAC
The tools used in the subsection are Python 3.12, the programs are
Proof. Suppose the adversary 𝑃1 can break the scheme with non- performed on MacBook Air MAC Desktop Apple M1, RAM 8.00 GB (see
negligible advantage. Now, the simulator  simulates the scheme. Fig. 8).
Suppose there exists a black-box 𝐺𝛾𝑏𝑙𝑎𝑐 𝑘𝑏𝑜𝑥 such that
𝑦0 = 𝐺𝛾 (𝑥) ∈ {0,1} ,
4.4.2. Efficiency analysis on mobile pad
↗ The tools used in the subsection are Pydriod 3, the programs are
𝐺𝛾𝑏𝑙𝑎𝑐 𝑘𝑏𝑜𝑥 (𝑥) → (𝑦0 , 𝑦1 )
↘ performed on Xiaomi Pad 6 Pro File Explorer 1th Qualcomm(R)AI En-
𝑦1 ∈𝑅 {0,1} . gine(TM) Xiaolong 8+ mobile platform@3.2 GHz, RAM 8.00+3.00 GB
(see Fig. 9).
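As a concrete illustration of the matrix encoding that the hybrids above manipulate, the following minimal Python sketch builds the matrices C1, …, Cω from a set Y and probes membership. HMAC-SHA256 stands in for the PRF F_k and SHA-256 for H1 (both are placeholders, not the paper's ring LPR instantiation), and m, ω are toy parameters:

```python
import hashlib
import hmac

m, omega = 16, 4  # toy parameters: m rows per matrix, omega matrices


def H1(y: bytes) -> bytes:
    # stand-in for the collision-resistant hash H1
    return hashlib.sha256(y).digest()


def F(k: bytes, x: bytes) -> bytes:
    # stand-in PRF (HMAC-SHA256, not the ring LPR PRF of the paper)
    return hmac.new(k, x, hashlib.sha256).digest()


def to_indices(d: bytes) -> list[int]:
    # interpret the PRF output as a vector v in [m]^omega
    return [d[i] % m for i in range(omega)]


def encode(Y: list[bytes], k: bytes) -> list[list[int]]:
    # start from all-ones matrices C_1 = ... = C_omega = 1^m,
    # then zero out position v[i] of C_i for every element of Y
    C = [[1] * m for _ in range(omega)]
    for y in Y:
        v = to_indices(F(k, H1(y)))
        for i in range(omega):
            C[i][v[i]] = 0
    return C


def probe(C: list[list[int]], k: bytes, x: bytes) -> bool:
    # x matches iff the selected bits C_1[v[1]] ‖ ... ‖ C_omega[v[omega]]
    # are all zero
    v = to_indices(F(k, H1(x)))
    return all(C[i][v[i]] == 0 for i in range(omega))


k = b"\x01" * 32
Y = [b"alice", b"bob"]
C = encode(Y, k)
assert probe(C, k, b"alice")  # an element of Y always matches
```

A non-member can still hit all-zero positions with small probability; the actual protocol sizes m and ω so that this probability is negligible.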
Z. Shan et al. Journal of Systems Architecture 160 (2025) 103346
4.5. Analysis of efficiency on mobile phones

The tool used in this subsection is Pydroid 3; the programs are run on a Redmi K30 with a Qualcomm AI Engine Snapdragon 730G mobile platform @ 2.2 GHz and 6.00 GB RAM (see Fig. 10).

4.5.1. Summary of data comparison

From the simulation results, it can be seen that for n ≤ 400 the LWE-based OPRF in [14] is slightly faster, while for n > 400 the ring LPR-based OPRF in this paper is faster. Furthermore, as n increases, the advantages of ring LPR become more pronounced. Based on the simulation results for the Pad, the OPRF in this paper is more stable; although there are fluctuations, they are less significant than those of the LWE-based OPRF in [14].

5. Expansion of this work

Private Information Retrieval (PIR) [23-29] is a technique that enables a client to securely download a specific element, such as a movie or a friend's record, from a database managed by an untrusted server, such as a streaming service or a social network, without disclosing to the server which particular element has been retrieved. Given the functional similarities between PIR and PSI, this paper extends its exploration into the construction of PIR using OPRF (see Fig. 11).

5.1. Efficiency analysis of PIR

This section simulates the PIR computation efficiency of this paper and of the machine learning-based PIR in [30] (DLMI for short) on Mac. The tool used in this subsection is Python 3.12; the programs are run on a MacBook Air with an Apple M1 chip and 8.00 GB RAM.

The OPRF-based PIR proposed in this paper has a runtime that differs from the machine learning-based PIR by no more than approximately 5 × 10^-3 seconds. Additionally, the security of our PIR scheme is theoretically supported in comparison to [30] (see Fig. 12).

6. Conclusion

This paper presents a PSI based on an efficient post-quantum OPRF and proves its security under the semi-honest model, demonstrating security even in the CPA model of Definition 16. The addition of the PPRG enables the PSI to effectively resist probabilistic attacks. In the simulation experiments, the proposed PSI shows greater efficiency compared to post-quantum PSIs represented by LWE.

Although the PIR in this study is not as efficient as the machine learning-based PIR, the gap between the two is already quite small. However, there are also notable shortcomings; the efficiency of the proposed PSI still lags behind that of non-post-quantum PSIs, which will be addressed in future work.

CRediT authorship contribution statement

Zhuang Shan: Writing - original draft, Conceptualization. Leyou Zhang: Writing - review & editing, Writing - original draft. Qing Wu: Conceptualization. Qiqi Lai: Writing - review & editing. Fuchun Guo: Writing - review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61872087 and Grant 51875457; in part by the Key Foundation of National Natural Science Foundation of China under Grant U19B2021; and in part by the Key Research and Development Program of Shaanxi under Program 2022GY-028 and Program 2022GY-050.

Data availability

No data was used for the research described in the article.

References

[1] R. Lei, X. Chen, D. Liu, C. Song, Y. Tan, A. Ren, CEIU: Consistent and efficient incremental update mechanism for mobile systems on flash storage, J. Syst. Archit. 152 (2024) 103151, http://dx.doi.org/10.1016/j.sysarc.2024.103151.
[2] J. Sun, L. Yin, M. Zou, Y. Zhang, T. Zhang, J. Zhou, Makespan-minimization workflow scheduling for complex networks with social groups in edge computing, J. Syst. Archit. 108 (2020) 101799, http://dx.doi.org/10.1016/j.sysarc.2020.101799.
[3] Y. Gao, Y. Luo, L. Wang, X. Liu, L. Qi, W. Wang, M. Zhou, Efficient scalable multi-party private set intersection(-variants) from bicentric zero-sharing, in: Proceedings of the Conference on Computer and Communications Security, CCS, ACM, New York, NY, USA, 2024.
[4] M.O. Rabin, How to exchange secrets with oblivious transfer, 2005, URL: https://eprint.iacr.org/2005/187.
[5] O. Goldreich, S. Goldwasser, S. Micali, How to construct random functions, J. ACM 33 (4) (1986) 792-807, http://dx.doi.org/10.1145/6490.6503.
[6] M. Naor, O. Reingold, Number-theoretic constructions of efficient pseudo-random functions, J. ACM 51 (2) (2004) 231-262, http://dx.doi.org/10.1145/972639.972643.
[7] M.J. Freedman, Y. Ishai, B. Pinkas, O. Reingold, Keyword search and oblivious pseudorandom functions, in: Theory of Cryptography, Springer, Berlin, Heidelberg, 2005, pp. 303-324.
[8] S. Jarecki, X. Liu, Efficient oblivious pseudorandom function with applications to adaptive OT and secure computation of set intersection, in: Theory of Cryptography, Springer, Berlin, Heidelberg, 2009, pp. 577-594.
[9] V.K. Yadav, N. Andola, S. Verma, S. Venkatesan, A survey of oblivious transfer protocol, ACM Comput. Surv. 54 (10s) (2022), http://dx.doi.org/10.1145/3503045.
[10] M.R. Albrecht, A. Davidson, A. Deo, N.P. Smart, Round-optimal verifiable oblivious pseudorandom functions from ideal lattices, in: Public-Key Cryptography - PKC 2021, Springer, Cham, 2021, pp. 261-289.
[11] N. Tyagi, S. Celi, T. Ristenpart, N. Sullivan, S. Tessaro, C.A. Wood, A fast and simple partially oblivious PRF, with applications, in: Advances in Cryptology - EUROCRYPT 2022, Springer, Cham, 2022, pp. 674-705.
[12] S. Casacuberta, J. Hesse, A. Lehmann, SoK: Oblivious pseudorandom functions, in: 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P), 2022, pp. 625-646, http://dx.doi.org/10.1109/EuroSP53844.2022.00045.
[13] D. Boneh, D. Kogan, K. Woo, Oblivious pseudorandom functions from isogenies, in: Advances in Cryptology - ASIACRYPT 2020, Springer, Cham, 2020, pp. 520-550.
[14] M. Chase, P. Miao, Private set intersection in the internet setting from lightweight oblivious PRF, in: Advances in Cryptology - CRYPTO 2020, Springer, Cham, 2020, pp. 34-63.
[15] Z. Shan, L. Zhang, Q. Wu, Q. Lai, Analysis, modify and apply in IIOT form light-weight PSI in CM20, 2024, URL: https://eprint.iacr.org/2024/969.
[16] J. Alwen, S. Krenn, K. Pietrzak, D. Wichs, Learning with rounding, revisited, in: Advances in Cryptology - CRYPTO 2013, Springer, Berlin, Heidelberg, 2013, pp. 57-74.
[17] A. Banerjee, C. Peikert, A. Rosen, Pseudorandom functions and lattices, in: Advances in Cryptology - EUROCRYPT 2012, Springer, Berlin, Heidelberg, 2012, pp. 719-737.
[18] D. Bellizia, C. Hoffmann, D. Kamel, H. Liu, P. Méaux, F.-X. Standaert, Y. Yu, Learning parity with physical noise: Imperfections, reductions and FPGA prototype, IACR Trans. Cryptogr. Hardw. Embed. Syst. 2021 (2021) 390-417.
[19] Y. Yu, J. Zhang, Smoothing out binary linear codes and worst-case sub-exponential hardness for LPN, in: Advances in Cryptology - CRYPTO 2021, Springer, Cham, 2021, pp. 473-501.
[20] V. Kolesnikov, R. Kumaresan, M. Rosulek, N. Trieu, Efficient batched oblivious PRF with applications to private set intersection, in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS '16, ACM, New York, NY, USA, 2016, pp. 818-829, http://dx.doi.org/10.1145/2976749.2978381.
[21] Z. Brakerski, E. Kirshanova, D. Stehlé, W. Wen, Learning with errors and extrapolated dihedral cosets, in: Public-Key Cryptography - PKC 2018, Springer, Cham, 2018, pp. 702-727.
[22] A. Jain, H. Lin, J. Luo, D. Wichs, The pseudorandom oracle model and ideal obfuscation, in: Advances in Cryptology - CRYPTO 2023, Springer, Cham, 2023, pp. 233-262.
[23] S. Angel, H. Chen, K. Laine, S. Setty, PIR with compressed queries and amortized query processing, in: 2018 IEEE Symposium on Security and Privacy (SP), 2018, pp. 962-979, http://dx.doi.org/10.1109/SP.2018.00062.
[24] A. Burton, S.J. Menon, D.J. Wu, Respire: High-rate PIR for databases with small records, in: Proceedings of the Conference on Computer and Communications Security, CCS, ACM, New York, NY, USA, 2024.
[25] J. Dujmovic, M. Hajiabadi, Lower-bounds on public-key operations in PIR, in: Advances in Cryptology - EUROCRYPT 2024, Springer, Cham, 2024, pp. 65-87.
[26] B. Fisch, A. Lazzaretti, Z. Liu, C. Papamanthou, ThorPIR: Single server PIR via homomorphic Thorp shuffles, in: Proceedings of the Conference on Computer and Communications Security, CCS, ACM, New York, NY, USA, 2024.
[27] A. Gascon, Y. Ishai, M. Kelkar, B. Li, Y. Ma, M. Raykova, Computationally secure private information retrieval and aggregation in the shuffle model, in: Proceedings of the Conference on Computer and Communications Security, CCS, ACM, New York, NY, USA, 2024.
[28] A. Ghoshal, M. Zhou, E. Shi, Efficient pre-processing PIR without public-key cryptography, in: Advances in Cryptology - EUROCRYPT 2024, Springer, Cham, 2024, pp. 210-240.
[29] M. Luo, F.-H. Liu, H. Wang, Faster FHE-based single-server private information retrieval, in: Proceedings of the Conference on Computer and Communications Security, CCS, ACM, New York, NY, USA, 2024.
[30] M. Lam, J. Johnson, W. Xiong, K. Maeng, U. Gupta, Y. Li, L. Lai, I. Leontiadis, M. Rhu, H.-H.S. Lee, V.J. Reddi, G.-Y. Wei, D. Brooks, E. Suh, GPU-based private information retrieval for on-device machine learning inference, in: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, ASPLOS '24, ACM, New York, NY, USA, 2024, pp. 197-214, http://dx.doi.org/10.1145/3617232.3624855.

Zhuang Shan received the B.S. degree from Liaoning Institute of Science and Technology, Benxi, China, in 2019, and the M.S. degree from North Minzu University, Yinchuan, China, in 2022. He is currently pursuing the Ph.D. degree in mathematics with Xidian University, Xi'an, China. His current interests include cryptography, reduction of hard problems in lattices, and network security.

Leyou Zhang received the M.S. and Ph.D. degrees from Xidian University, Xi'an, China, in 2002 and 2009, respectively. From 2013 to 2014, he was a visiting scholar at the University of Wollongong, Australia. He currently works at Xidian University as a professor. His current research interests include public key cryptography, network security and computer security. He has over 120 scientific publications in highly ranked cybersecurity journals and conferences.

Qing Wu received the M.S. and Ph.D. degrees from Xidian University, Xi'an, China, in 2006 and 2009, respectively. She currently works with Xi'an University of Posts and Telecommunications, Xi'an, as a Professor. Her current research interests include artificial intelligence security and cloud security.

Qiqi Lai received the B.S. degree from PLA University of Information Engineering, Henan, China, in 2008, and the M.S. and Ph.D. degrees from Xidian University, Xi'an, China, in 2011 and 2015. He currently works with Shaanxi Normal University, Xi'an, as a Professor. His current research interests include the theory of lattice-based public key cryptography and its provable security, as well as the construction and analysis of homomorphic encryption schemes.

Fuchun Guo received the B.S. and M.S. degrees from Fujian Normal University, China, in 2005 and 2008, respectively, and the Ph.D. degree from the University of Wollongong, Australia, in 2013. He is currently an Associate Research Fellow with the School of Computing and Information Technology, University of Wollongong. His primary research interests include public key cryptography, in particular protocols, encryption and signature schemes, and security proofs.
@@ -0,0 +1,846 @@
Journal of Systems Architecture 160 (2025) 103331
Contents lists available at ScienceDirect
Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc
A CP-ABE-based access control scheme with cryptographic reverse firewall
for IoV
Xiaodong Yang a, Xilai Luo a,∗, Zefan Liao a, Wenjia Wang a, Xiaoni Du b, Shudong Li c
a College of Computer Science and Engineering, Northwest Normal University, China
b College of Mathematics and Statistics, Northwest Normal University, China
c Cyberspace Institute of Advanced Technology, Guangzhou University, China
ARTICLE INFO ABSTRACT
Keywords: Attribute-based encryption; Multi-authority; Internet of Vehicles; Cryptographic reverse firewall; Outsource decryption

The convergence of AI and internet technologies has sparked significant interest in the Internet of Vehicles (IoV) and intelligent transportation systems (ITS). However, the vast data generated within these systems poses challenges for onboard terminals and secure data sharing. To address these issues, we propose a novel solution combining ciphertext policy attribute-based encryption (CP-ABE) and a cryptographic reverse firewall (CRF) mechanism for IoV. This approach offers several advantages, including offline encryption and outsourced decryption to improve efficiency. The CRF mechanism adds an extra layer of security by re-randomizing vehicle data, protecting sensitive information. While single-attribute authority schemes simplify access control, they are not ideal for IoV environments. Therefore, we introduce a multi-authority scheme to enhance security. Performance analysis demonstrates our scheme's ability to optimize encryption and decryption while safeguarding vehicle data confidentiality. In summary, our solution improves data management, access control, and security in the IoV, contributing to its safe and efficient development.
1. Introduction

Advances in 5G technology, coupled with the growing volume of vehicular traffic, have intensified concerns regarding traffic safety, travel efficiency, and environmental impact. In response, Intelligent Transport Systems (ITS) and the IoV have emerged as critical components of modern transportation infrastructure. The functionality of the IoV relies on three key elements: the internal vehicle network, the vehicle-to-vehicle communication network, and the in-vehicle mobile internet. These elements integrate technologies such as sensors, RFID (Radio Frequency Identification), and automated control systems, operating under established communication protocols to enable seamless, dynamic data exchange between vehicles and the broader network.

While drivers benefit from applications like navigation and traffic information sharing, the limited computing power of onboard terminals is insufficient for computationally intensive tasks such as autonomous driving and AI-based obstacle avoidance [1]. A potential solution is offloading data processing to cloud servers, but the large volume of vehicle-generated data introduces high latency in communication between the onboard terminal and the cloud, compromising real-time decision-making [2-4]. This latency, coupled with the risks associated with data leakage and theft in semi-trusted cloud environments, raises significant concerns about data security [5]. Therefore, cloud-based solutions alone are insufficient to meet the demands of the IoV. To mitigate these issues, edge computing [6], fog computing [7], and Roadside Units (RSUs) [8] have been proposed. RSUs, with their higher computational capabilities, can process data more efficiently and upload it to cloud servers in real time, addressing the challenges of latency and limited onboard processing power.

However, data security remains a critical issue. One potential solution is encrypting data before transmission, which introduces challenges in ciphertext sharing. Traditional symmetric encryption, requiring a one-to-one correspondence between keys and users, proves inefficient for securing large volumes of data in IoV environments. Conventional asymmetric encryption algorithms also struggle with ciphertext sharing and are ill-suited for the frequent updates characteristic of IoV applications. A more appropriate approach is Attribute-Based Encryption (ABE), which enables fine-grained access control, supports encryption for multiple recipients, and facilitates the creation of complex access policies [9-11]. ABE allows data owners to control who can access their data, but the decryption process is computationally intensive, requiring numerous pairing and exponential operations. This places a significant burden on resource-constrained onboard terminals,
Corresponding author.
E-mail addresses: yangxd200888@163.com (X. Yang), 2023222208@nwnu.edu.cn (X. Luo), lzf0097@163.com (Z. Liao), neuer1130@163.com (W. Wang),
duxiaonwnu@163.com (X. Du), lishudong@gzhu.edu.cn (S. Li).
https://doi.org/10.1016/j.sysarc.2025.103331
Received 11 August 2024; Received in revised form 4 December 2024; Accepted 2 January 2025
Available online 17 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
X. Yang et al. Journal of Systems Architecture 160 (2025) 103331
hindering timely data retrieval and impeding efficient communication. As the number of attributes increases, the decryption complexity grows, leading to slower decryption times and higher resource consumption. To address these challenges, several outsourced ABE schemes have been proposed [12-15], which offload expensive operations to cloud servers, alleviating the computational load on onboard terminals. However, even theoretically secure implementations of ABE are vulnerable to practical attacks. Sophisticated adversaries may exploit backdoors [16], manipulate pseudo-random number generators [17,18], or intercept hardware interactions to gain unauthorized access to sensitive data. To counter these threats, the concept of a Cryptographic Reverse Firewall (CRF) was introduced [19]. The CRF, positioned between the user and the server, intercepts and alters messages to ensure data security, even if the user is compromised.

Moreover, traditional ABE schemes rely on a single attribute authority, which poses a risk of key leakage if the authority colludes with an adversary. To mitigate this, we propose a multi-authority ABE scheme, integrated with a CRF, to enhance security and prevent collusion attacks. The key contributions of this paper are as follows:

1. We propose a CP-ABE-based scheme that enables more granular access control policies, enhancing the system's flexibility. This proves particularly beneficial in IoV scenarios such as IoV communication, where data access can be dynamically adjusted in accordance with the context.
2. The scheme integrates multiple attribute authorities to prevent collusion attacks and guarantee secure key management. Each authority is responsible for managing vehicle attribute keys, enhancing the security and efficiency of key generation, which is ideal for environments like smart cities or autonomous vehicle fleets.
3. We enhance the CRF module by incorporating key parameter re-randomization within the multi-authority ABE framework, strengthening security in IoV communications, even if certain parts of the system are compromised.
4. The scheme optimizes decryption efficiency through the use of online-offline encryption techniques and offloading decryption operations. Decryption time does not increase linearly with the number of attributes, making it suitable for real-time applications like hazard detection and traffic optimization.
5. The scheme also supports message integrity verification, which can be easily carried out by onboard terminals using simple hash functions, ensuring the authenticity of IoV messages and preventing malicious tampering in safety-critical communications.

The paper is organized as follows: Section 2 reviews existing attribute-based encryption schemes and the application of CRFs. Section 3 provides an overview of the system and security models. Section 4 discusses the base scenario and the extended CRF module. Section 5 presents security proofs for the base scheme and the CRF-enhanced scheme. Section 6 reports on experiments and results. Finally, Section 7 concludes the paper.

2. Related work

Sahai [10] introduced fuzzy identity-based encryption, which paved the way for Attribute-Based Encryption (ABE). ABE later branched into two forms: Key-Policy ABE (KP-ABE) [9] and Ciphertext-Policy ABE (CP-ABE) [11]. Initially, both schemes used access trees to define policies. However, the first CP-ABE scheme only provided security under the random oracle model. Waters [20] introduced an LSSS-based CP-ABE scheme that encodes policies using matrices. This foundational model has influenced many subsequent ABE schemes, which have expanded into diverse domains, particularly cloud computing. For example, Yu et al. [21] proposed a KP-ABE scheme enabling data delegation to semi-trusted cloud servers while ensuring confidentiality. Yang et al. [22] introduced a CP-ABE scheme for dynamic big data updates, and Feng et al. [23] developed a CP-ABE scheme for industrial IoT. Other schemes [24,25] have improved security and efficiency, broadening ABE's application to the Internet of Medical Things (IoMT).

CP-ABE enables fine-grained access control, making it highly applicable in sectors such as smart healthcare and intelligent transportation. However, single-attribute-authority ABE schemes are vulnerable to collusion attacks. To address this, it is desirable to delegate each attribute to different attribute authorities. Chase [26] was the first to introduce the concept of multiple attribute authorities within the ABE framework, where various authorities oversee different attributes. Lewko and Waters [27] later introduced the initial decentralized ABE framework with multiple authorities. Following this, Chaudhary et al. [28] proposed a multi-authority CP-ABE scheme tailored for the Internet of Vehicles (IoV) context.

Considering the constrained computing capabilities of user terminals, Green et al. [12] introduced an ABE scheme that delegates decryption computations to the cloud. Lai et al. [13] improved upon this by achieving verifiability of outsourced decryption. Zhong et al. [29] further enhanced the efficiency of outsourced decryption ABE schemes and applied them to smart healthcare scenarios.

Mironov and Stephens-Davidowitz [19] were the first to introduce the concept of a reverse firewall. They proposed a generic architecture to prevent user tampering, which could lead to data leakage. However, the previous approach was found unsuitable for ABE schemes, prompting Ma et al. [30] to introduce a cryptographic reverse firewall utilizing the CP-ABE scheme. Additionally, Hong et al. [31] proposed a KP-ABE scheme with multiple authorities. Due to the limitations of KP-ABE in achieving fine-grained access control, Zhao et al. [32] proposed a CP-ABE scheme incorporating a CRF and leveraged outsourced decryption to alleviate computational burdens. However, these approaches suffer from drawbacks, such as reliance on a single attribute authority or excessive computational overhead. Moreover, there is a risk of system compromise, which could lead to data leakage, especially in the context of IoV, characterized by constrained computational resources and stringent data privacy requirements. At the same time, the development of IoV places higher demands on the security and flexibility of access control. Therefore, the proposed scheme combines CP-ABE, CRF, and multi-authority models to meet the requirements for security, flexibility, and low computational overhead.

3. System model and definitions

3.1. Preliminaries

1. Bilinear Maps: Involve two multiplicative cyclic groups of prime order p, denoted as G and G_T, with g representing a generator of G. A bilinear map e : G × G → G_T must satisfy the following three features:
   (a) Non-degeneracy: e(g, g) ≠ 1.
   (b) Computability: Efficient computation of e(M, N) for any elements M, N ∈ G is achievable through a polynomial-time algorithm.
   (c) Bilinearity: For any a, b ∈ Z_p and any elements M, N ∈ G, we have e(M^a, N^b) = e(M, N)^{ab}.
2. Access Structure: Consider a set P = {P1, P2, …, Pn} representing n users. A collection Q is deemed monotone if, for any subsets K, L: if K ∈ Q and K ⊆ L, then L ∈ Q. Let Q be a nonempty monotone collection of subsets of P, i.e., Q ⊆ 2^{P1, P2, …, Pn} ∖ {∅}; then Q is called a monotone access structure. In the context of access control, sets included in Q are identified as authorized, while those not included are referred to as unauthorized sets.
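The bilinearity property in item 1(c) can be checked numerically in a toy model that represents each group element g^x by its exponent x mod p. This illustrates only the arithmetic identity e(M^a, N^b) = e(M, N)^{ab}, not a cryptographically secure pairing:

```python
# Toy model of a symmetric pairing: an element g^x of G is represented by
# its exponent x mod p, and e(g^x, g^y) = e(g, g)^{x*y} is modeled by the
# product x*y mod p. Purely an arithmetic illustration of bilinearity.
p = 101  # small prime group order, for illustration only


def elem(x: int) -> int:
    # "g^x" in G, stored as its exponent
    return x % p


def power(M: int, a: int) -> int:
    # M^a in G: exponent multiplies by a
    return (M * a) % p


def e(M: int, N: int) -> int:
    # pairing result, as the exponent of e(g, g)
    return (M * N) % p


M, N = elem(7), elem(13)
a, b = 5, 9
# Bilinearity: e(M^a, N^b) = e(M, N)^(a*b)
assert e(power(M, a), power(N, b)) == (e(M, N) * a * b) % p
```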
X. Yang et al. Journal of Systems Architecture 160 (2025) 103331
3. Linear Secret Sharing Scheme (LSSS): Let Ã = {Ã_1, Ã_2, …, Ã_N} be the set of all possible attribute names. Corresponding to each attribute name Ã_i ∈ Ã there is an associated set of attribute values Ã_i = {A_{i,1}, A_{i,2}, …, A_{i,b_i}}, where b_i is the order of Ã_i. The access policy is denoted T = (M, ρ, V). In a linear secret sharing scheme, M is a matrix with l rows and n columns, ρ is a function that associates each row of M with an attribute name in Ã, and V = {v_{ρ(i)}}_{i∈[1,l]} is the set of attribute values associated with (M, ρ). An LSSS comprises the following pair of algorithms:
(a) Distribute: For the secret value s ∈ Z_p, arbitrarily choose a vector f = (s, f_2, …, f_n), where f_2, …, f_n ∈ Z_p. Calculate λ_i = M_i · f, where M_i is the i-th row of the matrix M; λ_i is the share of s that corresponds to ρ(i).
(b) Reconstruct: Let S ⊆ Ã be any authorized set and I = {i | ρ(i) ∈ S} ⊆ {1, 2, …, l}. Then there is a collection of constants {ω_i ∈ Z_p}_{i∈I} satisfying Σ_{i∈I} ω_i M_i = (1, 0, …, 0), and the secret can be reconstructed by calculating Σ_{i∈I} ω_i λ_i = s.
Assume S = {I_u, S'} represents the attribute collection of a user, where I_u ⊆ Ã is the set of the user's attribute names and S' = {s_i}_{i∈I_u} is the set of the user's attribute values. For every i ∈ I, where I = {i | ρ(i) ∈ S} ⊆ {1, 2, …, l}, if i satisfies (M, ρ) and s_{ρ(i)} = v_{ρ(i)}, we say that S matches T.

4. q-BDHE problem: Let G and G_T be two multiplicative cyclic groups of prime order p, let g be a generator of G, and let e : G × G → G_T be a bilinear map. Choose t, f ∈ Z_p at random and compute J = (g, g^t, g^f, g^{f^2}, …, g^{f^q}, g^{f^{q+2}}, …, g^{f^{2q}}). The q-BDHE assumption states that no polynomial-time algorithm, given J, can distinguish e(g, g)^{f^{q+1} t} ∈ G_T from a random K ∈ G_T with non-negligible advantage.

5. Cryptographic Scheme: A cryptographic scheme 𝒫 defines the interaction between stateful parties (P_1, P_2, …, P_l). Scheme establishment is denoted setup(1^λ), where λ is the security parameter. Each party takes the public parameters P_g and the related messages as input and runs the system initialization algorithm to obtain the corresponding state (υ_{P_i})_{i=1}^l. Following the order in which the scheme proceeds, the parties process messages from the other parties in the scheme. Each party P_i also has algorithms next_{P_i}(υ_{P_i}) and receive_{P_i}(υ_{P_i}): next_{P_i}(υ_{P_i}) outputs the updated message, and receive_{P_i}(υ_{P_i}) outputs the party's state after the message update. After the scheme completes, each party's algorithm output_{P_i}(υ_{P_i}) returns the result of the scheme. We assume that the scheme 𝒫 meets a functionality requirement ℱ and a security requirement 𝒮.

6. Cryptographic Reverse Firewall: A cryptographic reverse firewall W is a stateful algorithm. Provided with its current state and an input message, it processes them and outputs an updated state and message. For ease of presentation, the state of W is not written out explicitly in the definition. Given a party P and a firewall W, the composed party W∘P is defined by

receive_{W∘P}(υ, m) = receive_P(υ, W(m)),
next_{W∘P}(υ) = W(next_P(υ)),
output_{W∘P}(υ) = output_P(υ). (1)

When the composed party participates in the scheme, the initial state of the firewall W is set to the public parameter P_g. If W and a party P form a composed party, we call W a cryptographic reverse firewall for P. Next we give definitions of three properties of CRFs:

(a) Function Maintaining: For any reverse firewall W and any party P, let W^1∘P = W∘P and, for k ≥ 2, W^k∘P = W∘(W^{k−1}∘P). For a scheme 𝒫 that adheres to the functionality requirement ℱ, we say that the reverse firewall W maintains functionality if the composed party W∘P guarantees the functionality of the party P under the scheme 𝒫 in polynomial time.
(b) Weakly Security-preserving: Suppose the scheme fulfills the functionality requirement ℱ and the security requirement 𝒮. For any polynomial-time adversary B, we say that the scheme satisfies weak security preservation if the composed party W∘P still satisfies the security requirement 𝒮.
(c) Weakly Exfiltration-resistant: The game Leak(𝒫, P_j, W, λ), depicted in Fig. 1, is due to Mironov and Stephens-Davidowitz [19]. It is a security game between a reverse firewall W for a party P and a scheme 𝒫 containing a tampered party. The adversary may control a party by tampering with the party's algorithms receive, next, and output. The purpose of the game is to let the adversary discern whether the party's actions are honest or tampered with. A leak-resistant reverse firewall therefore makes it impossible for an adversary to tell whether party P has been tampered with, protecting the party's private information. If no adversary B in the Leak(𝒫, P_j, W, λ) game can succeed in polynomial time with a noticeable advantage while the firewall maintains the party's functionality ℱ, we call the reverse firewall W weakly exfiltration-resistant.

Fig. 1. Leak game.

3.2. System model

Fig. 2 depicts the four components that constitute our scheme: Attribute authorities (AA), Cloud server (CS), Data user (DU), and Data owner (DO). In addition, the system contains three reverse firewalls. To implement data re-randomization within the RSU, three firewalls are strategically positioned: W_AA, the reverse firewall for AA; W_DO, the reverse firewall for DO; and W_DU, fulfilling the same role for DU.

CS is mainly deployed to store ciphertexts and conversion keys.
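The Distribute/Reconstruct pair of the LSSS above can be illustrated with a small sketch over the integers mod p. The modulus, the AND-policy matrix, and the recombination constants below are hypothetical toy choices, not the paper's pairing-group parameters:

```python
import random

p = 2**61 - 1  # toy prime modulus (hypothetical; the scheme works in Z_p of a pairing group)

def distribute(s, M):
    """Distribute: share lambda_i = <M_i, f> with f = (s, f_2, ..., f_n)."""
    n = len(M[0])
    f = [s] + [random.randrange(p) for _ in range(n - 1)]
    return [sum(row[j] * f[j] for j in range(n)) % p for row in M]

def reconstruct(shares, I, omega):
    """Reconstruct: s = sum_{i in I} omega_i * lambda_i."""
    return sum(omega[i] * shares[i] for i in I) % p

# Policy "att1 AND att2": the rows of M sum to (1, 0), so omega = (1, 1).
M = [[1, 1], [0, -1]]
omega = {0: 1, 1: 1}
s = 123456789
shares = distribute(s, M)
assert reconstruct(shares, [0, 1], omega) == s
```

With this matrix, the first share is s + f_2 and the second is −f_2, so either share alone reveals nothing about s, while their ω-weighted sum recovers it.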
AA is charged with the responsibility of establishing the public parameters and generating the master secret keys.

DO sets the access policy that guides the encryption process and produces a verification credential; the DO then uploads both the encrypted data and the verification credential to the cloud server.

DU generates a conversion key, which is uploaded to the cloud server. Following this, the DU retrieves the ciphertext and the verification credential from the cloud server to carry out the concluding stages of decryption and integrity verification.

W_AA re-randomizes the public parameters and the secret keys that belong to users. W_DO is responsible for re-randomizing ciphertexts. W_DU is responsible for re-randomizing conversion keys and converted ciphertexts.

3.3. Security model

The DO and the DU in our system are considered completely trustworthy. However, the reverse firewalls and the cloud server are deemed honest-but-curious: they comply with the steps of the algorithms but also endeavor to discover any private information within the data. Furthermore, there is a risk of an attribute authority colluding with an adversary. In response to this challenge, we define a selective CPA security game whose sequence of events is as follows:

1. Init Phase: The adversary B declares a set of malicious attribute authorities R = (Â_i)_{i∈I'} and the access policies (M_i, ρ_i)_{i∈I} to be challenged, where I ⊆ {1, 2, …, N} and I' ⊆ {1, 2, …, N}. Then B sends the (possibly tampered) algorithms GlobalSetup*, AASetup*, KeyGen*, KeyGen.ran*, Enc.Offline*, Enc.Online* to the challenger F.
2. Setup Phase: F executes GlobalSetup* and AASetup* to obtain the public parameter Params, the attribute authorities' public key PK, and the key pairs (PK_i, ASK_i)_{i∈I'}. Subsequently, the reverse firewall runs the W_AA.SetUp algorithm to generate and announce the new public key PK', retaining the corresponding random number f. B receives PK_i from all non-malicious attribute authorities and (PK_i, ASK_i)_{i∈I'} from all malicious attribute authorities.
3. Query Phase 1: B may adaptively request secret keys for attribute sets S_1, S_2, …, S_q. Each queried attribute set must neither satisfy the access structures (M_i, ρ_i)_{i∈I} nor come from a malicious attribute authority in R = (Â_i)_{i∈I'}. For every query S_i, F executes the algorithm KeyGen* and obtains the corresponding secret key SK_i. Then F executes W_AA.KG and gets the re-randomized secret key SK'_i. Subsequently, F executes KeyGen.ran* to get the conversion key TK_i, and then W_DU.TKUpdate to obtain the re-randomized conversion key TK'_i. Eventually, F sends (SK'_i, TK'_i) to B.
4. Challenge Phase: B delivers two equal-length plaintexts m_0, m_1. F randomly chooses b ∈ {0, 1} and executes Enc.Offline*, Enc.Online* to obtain the challenge ciphertext CT_b. Then F calls W_DO.Enc.Offline, W_DO.Enc.Online to get the updated ciphertext CT'_b, and sends CT'_b to B.
5. Query Phase 2: Same as Query Phase 1.
6. Guess Phase: B outputs a guess b' ∈ {0, 1} for b.

Definition 1. The basic scheme is selectively CPA-secure if the advantage of any polynomial-time adversary B in the above game is negligible.

Fig. 2. System model.

4. System construction

4.1. Basic scheme

The scheme contains N attribute authorities, each attribute authority managing one class of attributes Ã_i = {A_{i,1}, A_{i,2}, …, A_{i,b_i}}, A_{i,j} ∈ Z_p, i = 1, 2, …, N, j = 1, 2, …, b_i.

1. Global Setup: Attribute authority AA_1 sets the commonly known parameters Params = {g, u, v, w, h, G, G_T, H_0(·)} and publishes them, where H_0 : {0, 1}* → {0, 1}^ℓ is the designated collision-resistant hash function for generating robust verification credentials within the system.

2. AASetup:
(a) Each attribute authority randomly chooses α_i ∈ Z_p, determines Y_i = e(g, g)^{α_i}, and distributes Y_i to the other attribute authorities. As the process concludes, each attribute authority computes Y = Π_{i=1}^N Y_i = e(g, g)^{Σ_{i=1}^N α_i} = e(g, g)^α, where α = Σ_{i=1}^N α_i.
(b) Each attribute authority Â_i operates as follows:
• Randomly select N − 1 elements s_{ik} ∈ Z_p (k ∈ {1, 2, …, N}\{i}), calculate g^{s_{ik}}, and send it to the other attribute authorities.
• After receiving the N − 1 components g^{s_{ki}} from the other attribute authorities Â_k (k ∈ {1, 2, …, N}\{i}), the master key MK_i is calculated by the following formula:

MK_i = Π_{k∈{1,2,…,N}\{i}} (g^{s_{ik}} / g^{s_{ki}}) = g^{Σ_{k≠i} s_{ik} − Σ_{k≠i} s_{ki}}, (2)

where Π_{i=1}^N MK_i = 1.
• For each attribute A_{i,j} ∈ Ã_i, calculate u^{A_{i,j}}.
The attribute authorities publish the public key PK = (g, u, h, w, v, e(g, g)^α, G, G_T), and each keeps its own private key ASK_i = {α_i, (u^{A_j})_{A_j∈Â_i}, MK_i}.

3. KeyGen: Each attribute authority Â_i executes the algorithm as follows:
(a) Select θ_i ∈ Z_p at random, then derive the secret-key elements MK_i · g^{θ_i}, MK_i · v^{−θ_i}, MK_i · g^{α_i} · w^{θ_i}, and subsequently convey these elements to the pertinent attribute authorities.
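The pairwise blinding shares s_{ik} in Eq. (2) cancel only when all N master keys are combined, which is what makes Π_{i=1}^N MK_i = 1. The check below reproduces this in a toy multiplicative group Z_q^* (hypothetical parameters; the paper uses the pairing group G instead):

```python
import random

q = 1_000_003          # a prime (toy choice, stands in for the group order)
g = 5                  # base element of Z_q^*
N = 4                  # number of attribute authorities

# s[i][k] models the share s_ik that authority i sends to authority k.
s = [[random.randrange(1, q - 1) for _ in range(N)] for _ in range(N)]

def MK(i):
    # MK_i = g^( sum_{k!=i} s_ik  -  sum_{k!=i} s_ki ), Eq. (2)
    e = sum(s[i][k] - s[k][i] for k in range(N) if k != i)
    return pow(g, e % (q - 1), q)

prod = 1
for i in range(N):
    prod = prod * MK(i) % q
# every s_ik appears once with + and once with -, so the product is g^0 = 1
assert prod == 1
```

Any strict subset of the authorities is missing at least one uncancelled s_{ik}, so no coalition of fewer than N authorities can reproduce a valid MK_i, which is the intuition behind Theorem 2.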
(b) Upon obtaining the components from the various attribute authorities, compute the secret key as follows:

K_0 = Π_{i=1}^N MK_i · g^{α_i} · w^{θ_i} = g^{Σ_{i=1}^N α_i} w^r = g^α w^r, (3)
K_1 = Π_{i=1}^N MK_i · g^{θ_i} = g^{Σ_{i=1}^N θ_i} = g^r, (4)
K_v = Π_{i=1}^N MK_i · v^{−θ_i} = v^{−r}, (5)

where r = Σ_{i=1}^N θ_i.
(c) For each attribute σ ∈ [S_ID ∩ Â_i], randomly choose r_σ ∈ Z_p, where σ ≤ N and S_ID denotes the user's attribute set. Calculate K_{i,2} = g^{r_i}, K_{i,3} = (u^{A_i} h)^{r_i} K_v = (u^{A_i} h)^{r_i} v^{−r}. The user then gets the secret key SK = {K_0, K_1, {K_{i,2}, K_{i,3}}_{i∈[1,σ]}, S_ID}.

4. KeyGen.ran: Upon input SK, the data user independently selects a random element τ ∈ Z_p and calculates K'_0 = K_0^{1/τ} = g^{α/τ} w^{r/τ}, K'_1 = K_1^{1/τ} = g^{r/τ}. For i = 1, 2, …, σ, the data user calculates K'_{i,2} = K_{i,2}^{1/τ} = g^{r_i/τ}, K'_{i,3} = K_{i,3}^{1/τ} = (u^{A_i} h)^{r_i/τ} v^{−r/τ}. The transformation key TK = (S_ID, K'_0, K'_1, {K'_{i,2}, K'_{i,3}}_{i∈[1,σ]}) and the recovery key RK = τ serve distinct functions within the cryptographic framework.

5. Enc.Offline: Input PK, and let N denote the upper limit on the number of rows of the secret sharing matrix. The data owner randomly chooses s ∈ Z_p and calculates Ĉ = e(g, g)^{αs}, Ĉ_0 = g^s. For j = 1, 2, …, N, the data owner randomly chooses d_j ∈ Z_p and calculates Ĉ_{j,1} = v^{d_j}, Ĉ_{j,2} = h^{d_j}, Ĉ_{j,3} = g^{d_j}. The intermediate ciphertext is MT = (s, Ĉ, Ĉ_0, {d_j, Ĉ_{j,1}, Ĉ_{j,2}, Ĉ_{j,3}}_{j∈[1,N]}).

6. Enc.Online: Input MT, the plaintext m, and the access structure (M, ρ), where M is a matrix of l rows and n columns (l ≤ N). The data owner randomly chooses a vector y⃗ = (s, y_2, …, y_n) ∈ Z_p^{n×1}; the secret shares are λ⃗ = (λ_1, λ_2, …, λ_l)^T = M y⃗. Then the data owner calculates Token = H_0(m), C = m · Ĉ = m · e(g, g)^{αs}, C_0 = Ĉ_0 = g^s. For j = 1, 2, …, l, the data owner computes C_{j,1} = Ĉ_{j,1} · w^{λ_j} = w^{λ_j} v^{d_j}, C_{j,2} = Ĉ_{j,2} · u^{ρ(j)d_j} = (u^{ρ(j)} h)^{d_j}, C_{j,3} = Ĉ_{j,3} = g^{d_j}. The ciphertext is CT = ((M, ρ), C, C_0, {C_{j,1}, C_{j,2}, C_{j,3}}_{j∈[1,l]}) and the verification credential is Token.

7. Dec.Out: If the user's attribute set S_ID does not satisfy the access structure, the cloud server returns the null value ⊥ and terminates the algorithm. Otherwise, the cloud server collects I = {i | ρ(i) ∈ S_ID} and calculates {ω_i ∈ Z_p}_{i∈I} such that Σ_{i∈I} ω_i · M_i = (1, 0, …, 0), where M_i is the i-th row of the matrix M. Then the cloud server calculates

A = e(C_0, K'_0) / Π_{i∈I} (e(C_{i,1}, K'_1) · e(C_{i,2}, K'_{j,2}) · e(C_{i,3}, K'_{j,3}))^{ω_i} = e(g, g)^{αs/τ}, (6)

where j denotes the position of the attribute value ρ(i) in S_ID.

8. Dec.User: The data user uses the recovery key RK = τ to decrypt as follows:

C / A^τ = m · e(g, g)^{αs} / (e(g, g)^{αs/τ})^τ = m, (7)

then the data user uses the verification credential Token to complete the ciphertext verification: if H_0(m) = Token holds, the ciphertext is correct. Otherwise, the ciphertext may have been tampered with.

4.2. CRF scheme

1. Initialization: The attribute authorities run GlobalSetup and AASetup; each attribute authority sends α_i to W_AA, which then executes the following algorithms.
W_AA.SetUp: Upon receiving the parameters from the AAs, the CRF W_AA calculates α = Σ_{i=1}^N α_i, then randomly chooses a, b, c, d, e, f ∈ Z_p and calculates g' = g^a, u' = u^b, h' = h^c, w' = w^d, v' = v^e, α' = α + f, and e(g', g')^{α'} = e(g, g)^{a²(α+f)}. W_AA stores f and publishes the updated PK' = (g', u', h', w', v', e(g', g')^{α'}, G, G_T).
After receiving PK', the AAs execute KeyGen to generate the secret key SK = {K_0, K_1, {K_{i,2}, K_{i,3}}_{i∈[1,σ]}, S_ID} and send SK to the CRF W_AA, which runs the following re-randomization algorithm.
W_AA.KG: Provide PK', f, and N as input, where N represents the total number of attributes. W_AA randomly selects r', r'_1, r'_2, …, r'_N ∈ Z_p and calculates K̃_0 = g'^f w'^{r'}, K̃_1 = g'^{r'}. For i = 1, 2, …, N, W_AA computes K̃_{i,2} = g'^{r'_i}, K̃_v = v'^{−r'}, K̃_{i,3} = (u'^{A_i} h')^{r'_i} K̃_v = (u'^{A_i} h')^{r'_i} v'^{−r'}. The intermediate key is ZSK = (K̃_0, K̃_1, {r'_i, K̃_{i,2}, K̃_{i,3}}_{i∈[1,N]}).
Eventually, W_AA computes K'_0 = K_0 · K̃_0 = g'^{α+f} w'^{r+r'} = g'^{α'} w'^{r+r'} and K'_1 = K_1 · K̃_1 = g'^{r+r'}. For i = 1, 2, …, σ, where σ ≤ N, W_AA calculates K'_{i,2} = K_{i,2} · K̃_{i,2} = g'^{r_i+r'_i}, K'_{i,3} = K_{i,3} · K̃_{i,3} = (u'^{A_i} h')^{r_i+r'_i} v'^{−(r+r')}. W_AA sends the updated SK' = (K'_0, K'_1, {K'_{i,2}, K'_{i,3}}_{i∈[1,σ]}, S_ID) to the data user.

2. Data Upload: The data owner invokes Enc.Offline and Enc.Online to obtain the ciphertext CT = ((M, ρ), C, C_0, {C_{j,1}, C_{j,2}, C_{j,3}}_{j∈[1,l]}) and the verification credential Token, then sends CT and Token to the CRF W_DO, which executes the following algorithms.
W_DO.Enc.Offline: Input PK' and N, where N represents the highest possible number of rows allowed in the access structure. W_DO randomly chooses s' ∈ Z_p as a secret value and calculates Ĉ' = e(g', g')^{α's'}, Ĉ'_0 = g'^{s'}. For j = 1, 2, …, N, W_DO randomly chooses d'_j ∈ Z_p and calculates Ĉ'_{j,1} = v'^{d'_j}, Ĉ'_{j,2} = h'^{d'_j}, Ĉ'_{j,3} = g'^{d'_j}. The transitional encryption is MT' = (s', Ĉ', Ĉ'_0, {Ĉ'_{j,1}, Ĉ'_{j,2}, Ĉ'_{j,3}}_{j∈[1,N]}).
W_DO.Enc.Online: Input PK', MT', and CT. The CRF W_DO randomly selects a vector y⃗' = (s', y'_2, …, y'_n)^T ∈ Z_p^{n×1}; the secret shares are λ⃗' = (λ'_1, …, λ'_l)^T = M y⃗'. Then W_DO computes C' = C · Ĉ' = m · e(g', g')^{α'(s+s')} and C'_0 = C_0 · Ĉ'_0 = g'^{s+s'}. For j = 1, 2, …, l, where l ≤ N, W_DO calculates

C'_{j,1} = C_{j,1} · Ĉ'_{j,1} · w'^{λ'_j} = w'^{λ_j+λ'_j} v'^{d_j+d'_j}, (8)
C'_{j,2} = C_{j,2} · Ĉ'_{j,2} · u'^{ρ(j)d'_j} = (u'^{ρ(j)} h')^{d_j+d'_j}, (9)
C'_{j,3} = C_{j,3} · Ĉ'_{j,3} = g'^{d_j+d'_j}. (10)

W_DO transmits the re-randomized ciphertext CT' = (C', C'_0, {C'_{j,1}, C'_{j,2}, C'_{j,3}}_{j∈[1,l]}, (M, ρ)), along with the Token, to the cloud server.

3. Data Download: The data user runs KeyGen.ran(SK') and sends TK = (S_ID, K'_0, K'_1, {K'_{i,2}, K'_{i,3}}_{i∈[1,σ]}) to the CRF W_DU, which executes the following algorithm.
W_DU.TKUpdate: W_DU randomly chooses φ ∈ Z_p and calculates

K''_0 = K'_0^{1/φ} = g'^{α'/τφ} w'^{(r+r')/τφ}, (11)
K''_1 = K'_1^{1/φ} = g'^{(r+r')/τφ}, (12)
K''_{i,2} = K'_{i,2}^{1/φ} = g'^{(r_i+r'_i)/τφ}, (13)
K''_{i,3} = K'_{i,3}^{1/φ} = (u'^{A_i} h')^{(r_i+r'_i)/τφ} v'^{−(r+r')/τφ}. (14)
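The blinding idea behind Dec.Out and Dec.User (Eqs. (6) and (7)) is that the cloud only ever sees the key raised to 1/τ, so its partial result A = Y^{αs/τ} reveals nothing about m until the user applies the recovery key τ. A toy sketch in Z_q^* (hypothetical parameters; Y stands in for the pairing value e(g, g), which lives in G_T in the actual scheme):

```python
import math
import random

q = 1_000_003                    # toy prime modulus
y = 5                            # stands in for e(g, g)
alpha, s = 1234, 5678            # master secret exponent and encryption randomness
m = 424242                       # "message" encoded as a group element (m < q)

# recovery key tau must be invertible modulo the group order q - 1
tau = random.randrange(2, q - 1)
while math.gcd(tau, q - 1) != 1:
    tau = random.randrange(2, q - 1)
inv_tau = pow(tau, -1, q - 1)

C = m * pow(y, alpha * s, q) % q           # C = m * Y^{alpha s}, as in Enc.Online
A = pow(y, alpha * s * inv_tau, q)         # cloud's partial decryption, Eq. (6)
recovered = C * pow(pow(A, tau, q), -1, q) % q   # user computes C / A^tau, Eq. (7)
assert recovered == m
```

The heavy work (the product of pairings in Eq. (6)) happens on the cloud side; the user is left with the single exponentiation A^τ, which matches the constant-cost Dec.User column in Table 2.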
W_DU stores φ ∈ Z_p and sends the re-randomized conversion key TK' = (S_ID, K''_0, K''_1, {K''_{i,2}, K''_{i,3}}_{i∈[1,σ]}) to the cloud server.
When receiving a decryption request from a data user, the cloud server performs Dec.Out(TK', CT') to acquire the partially decrypted ciphertext TCT. The cloud server sends TCT = (C', A = e(g', g')^{α'(s+s')/τφ}) and Token to W_DU, which runs the following algorithm.
W_DU.Dec: The CRF W_DU computes A' = A^φ = e(g', g')^{α'(s+s')/τ} and sends TCT' = (C', A') and Token to the data user.
After receiving the re-randomized partially decrypted ciphertext, the data user runs Dec.User to recover the plaintext m. The data user then uses the verification credential Token to finish the ciphertext verification: if H_0(m) = Token holds, the ciphertext is correct.

5. Security analysis

5.1. Security proof

Theorem 1. Given that the q-BDHE assumption holds, the proposed scheme is secure against selective CPA.

Proof. If a polynomial-time adversary B can compromise the proposed scheme with a non-negligible advantage, then we can construct a challenger F that solves the q-BDHE problem with a non-negligible advantage. The process is as follows:

Init Phase: The adversary B submits access policies (M_i, ρ_i)_{i∈I} and a set of malicious attribute authorities R = (Â_i)_{i∈I'}, where M_i is an l × n matrix. Furthermore, the attributes within the access structure must originate from trusted attribute authorities and cannot be maliciously manipulated.

Setup Phase: The challenger F executes the algorithms GlobalSetup and AASetup to generate the public parameter Params = {g, u, v, w, h, G, G_T, H_0(·)} and the key pairs (PK_i, ASK_i)_{i∈I'}. The reverse firewall W_AA executes W_AA.SetUp to re-randomize the public key, then publishes the updated public key PK'.

Query Phase 1: During this phase, B can adaptively request secret keys for attribute sets S_1, S_2, …, S_q. For every query S_i, F executes KeyGen to obtain the corresponding secret key SK_i. Then F executes W_AA.KG to get the re-randomized secret key SK'_i. Subsequently, F executes KeyGen.ran to get the conversion key TK_i, and then runs W_DU.TKUpdate to get the re-randomized conversion key TK'_i. F returns (SK'_i, TK'_i) to B.

Challenge Phase: B provides two messages, m_0 and m_1, of equal length. F randomly selects b ∈ {0, 1} and runs Enc.Offline* and Enc.Online* to get the challenge ciphertext CT_b = ((M, ρ), C, C_0, {C_{j,1}, C_{j,2}, C_{j,3}}_{j∈[1,l]}). Then F executes W_DO.Enc.Offline and W_DO.Enc.Online to obtain the re-randomized ciphertext CT'_b and sends CT'_b to B.

Query Phase 2: The challenger F proceeds as in Query Phase 1.

Guess Phase: B outputs a bit b' ∈ {0, 1}. If b' = b, F outputs 0 (meaning that B received a normally generated ciphertext); if b' ≠ b, F outputs 1 (meaning that B received a randomly selected element). Hence, the advantage ε of the adversary B in the security game translates directly into the probability with which F solves the q-BDHE problem.

5.2. Security analysis

The features of the proposed scheme include:

1. Function Maintaining
If the collection of attributes associated with the secret key constitutes an authorized set, then Σ_{i∈I} ω_i · (λ_i + λ'_i) = s + s' holds. Thus,

A = e(C'_0, K''_0) / Π_{i∈I} (e(C'_{i,1}, K''_1) · e(C'_{i,2}, K''_{j,2}) · e(C'_{i,3}, K''_{j,3}))^{ω_i}
= e(g', g')^{α'(s+s')/τφ} · e(g', w')^{(r+r')(s+s')/τφ} / e(g', w')^{(r+r') Σ_{i∈I} (λ_i+λ'_i) ω_i /τφ}
= e(g', g')^{α'(s+s')/τφ}, (15)-(16)

since the factors e(g', v')^{(r+r')(d_i+d'_i)ω_i/τφ}, e(g', u')^{ρ(i)(d_i+d'_i)(r_i+r'_i)ω_i/τφ}, and e(g', h')^{(d_i+d'_i)(r_i+r'_i)ω_i/τφ} contributed by the ciphertext and key components cancel pairwise, and (r+r') Σ_{i∈I} (λ_i+λ'_i) ω_i = (r+r')(s+s'). Then

C' / A^{φτ} = m · e(g', g')^{α'(s+s')} / (e(g', g')^{α'(s+s')/τφ})^{φτ} = m. (17)

It is evident from the aforementioned equations that the message m remains decryptable under normal circumstances even after the implementation of a cryptographic reverse firewall. Consequently, the functionality of the cryptographic reverse firewalls is preserved.

2. Weakly Security-preserving and Weakly Exfiltration-resistant
We consider the following sequence of security games.
Game 0: The same as the security game of Section 3.
Game 1: In the Init phase, the attribute authorities' PK and ASK_i are generated by the GlobalSetup and AASetup algorithms of the basic scheme, not by GlobalSetup*, AASetup*, and W_AA.SetUp. The subsequent algorithms are carried over unchanged from Game 0.
Game 2: During both Query Phase 1 and Query Phase 2, the secret key SK is derived from the KeyGen algorithm of the basic scheme, rather than being produced by KeyGen* and W_AA.KG. The conversion key TK is produced by the KeyGen.ran algorithm of the basic scheme, not by KeyGen.ran* and W_DU.TKUpdate. The subsequent algorithms mirror those utilized in Game 1.
Game 3: During the Challenge phase, the ciphertext CT_b is constructed by Enc.Offline and Enc.Online, not by Enc.Offline*, Enc.Online*, W_DO.Enc.Offline, and W_DO.Enc.Online. Game 3 is exactly the security game of the basic scheme.

We then demonstrate the indistinguishability between Game 0 and Game 1, followed by Game 1 and Game 2, and finally between Game 2 and Game 3, each in isolation.

Between Game 0 and Game 1: no matter what modifications the tampered GlobalSetup* and AASetup* algorithms introduce, after re-randomization by the W_AA reverse firewall the public key PK' always has the same structure as the PK generated by the standard algorithms. This uniformity is due to the malleability of the key in question. Consequently, there is no distinguishable difference between Game 0 and Game 1.

Given that the secret key SK and the conversion key TK, which are produced for the user by the attribute authority, likewise possess malleability, Game 1 and Game 2 are indistinguishable. As for Game 2 and Game 3, the ciphertext CT undergoes re-randomization by the reverse firewall, resulting in a new ciphertext CT'; this is possible because of the ciphertext's malleable structure. Thus, regardless of how the Enc.Offline* and Enc.Online* algorithms operate, the ultimate configuration of the ciphertext aligns with that of the basic scheme's ciphertext structure. Consequently, there is no distinguishable difference between Game 2 and Game 3. In summary,
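The function-maintaining argument of Eqs. (15)-(17) rests on one mechanism: the firewall multiplies a (possibly tampered) ciphertext by a fresh encryption under independent randomness, so the result is uniformly re-randomized yet decrypts to the same m. A toy ElGamal-style sketch in Z_q^* (hypothetical parameters; the paper applies the same idea to the pairing-based CP-ABE ciphertext components):

```python
import random

q = 1_000_003
g = 5
x = 98765                     # stands in for the master secret alpha
Y = pow(g, x, q)              # public value Y = g^x (models e(g, g)^alpha)

def enc(m, s):
    """Encrypt m with randomness s: (C, C0) = (m * Y^s, g^s)."""
    return m * pow(Y, s, q) % q, pow(g, s, q)

def dec(C, C0):
    """Decrypt: m = C / C0^x."""
    return C * pow(pow(C0, x, q), -1, q) % q

m = 777
C, C0 = enc(m, random.randrange(1, q - 1))
# The firewall's re-randomization: multiply by an encryption of 1 under fresh s'.
s2 = random.randrange(1, q - 1)
C_new, C0_new = C * pow(Y, s2, q) % q, C0 * pow(g, s2, q) % q
assert dec(C_new, C0_new) == m        # functionality maintained
```

Because s' is uniform and independent of the original randomness, (C_new, C0_new) is distributed like a fresh encryption of m regardless of how the original was produced, which is the malleability property the game hops above rely on.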
Table 1
Function comparison.
Scheme | With CRFs | Outsource | Offline encryption | Multi-authority | Ciphertext verification | Access structure
Guo et al. [25] | ✕ | ✓ | ✓ | ✕ | ✕ | Tree
Chaudhary et al. [28] | ✕ | ✓ | ✕ | ✓ | ✕ | LSSS
Hong et al. [31] | ✓ | ✕ | ✕ | ✓ | ✕ | LSSS
Zhong et al. [29] | ✕ | ✓ | ✕ | ✕ | ✕ | Tree
Zhao et al. [32] | ✓ | ✓ | ✓ | ✕ | ✕ | Tree
Jin et al. [33] | ✓ | ✕ | ✕ | ✕ | ✕ | LSSS
Elhabob et al. [34] | ✓ | ✕ | ✕ | ✕ | ✓ | Tree
Ours | ✓ | ✓ | ✓ | ✓ | ✓ | LSSS

we deduce that Game 0 and Game 3 are equivalent in terms of their indistinguishability. Given that the foundational scheme is secure, it follows that the proposed scheme is also secure.

3. Message Verification
The data user (vehicle/RSU) uses the parameters Token, m, and the hash function H_0(·) to check whether the equation H_0(m) = Token holds. With the verification procedure described, the data user can identify any tampering that may have occurred with the message. Additionally, it provides assurance regarding the completeness and dependability of the received message. If the message changes, the equation will not hold. Therefore, the proposed scheme supports message verification.

4. Collusion Resistance

Theorem 2. If the discrete logarithm problem is hard, the proposed scheme can defend against collusion attacks initiated by up to N − 1 attribute authorities.

According to the setup process, each attribute authority randomly chooses s_{ik} ∈ Z_p and sends the value g^{s_{ik}} to all the other attribute authorities involved. Given the difficulty inherent in the discrete logarithm problem, it would be problematic for an adversary B to deduce s_{ik} from g^{s_{ik}} alone. Hence, even with the combined efforts of N − 2 attribute authorities working in tandem with the adversary, guessing a valid MK_i remains an unattainable task, so the adversary cannot devise a valid secret key SK. This renders the proposed scheme resistant to collusion attacks carried out by N − 1 attribute authorities.

5.3. Informal security analysis

1. Side-channel attack defenses
The proposed scheme utilizes CRF technology, which significantly reduces the computational overhead while enhancing security. By leveraging CRFs, it reduces the risk of messages being attacked and complicates potential threats. In addition, the multi-authority design maximizes the security of the entire system, effectively preventing single-point leakage, while balancing power consumption and execution time. These two methods not only improve efficiency but also provide strong protection against side-channel attacks.
In short, the scheme effectively combines efficiency and enhanced security, making it suitable for secure communication in vehicular networks that are susceptible to side channels.

2. Man-in-the-middle attack defense
The proposed scheme uses CP-ABE technology, whose ciphertext policy embeds the access policy into the ciphertext. This improves the security and flexibility of access control and reduces the risk of man-in-the-middle (MITM) attacks due to identity forgery.
In addition, we enhance the CRF module by integrating key-parameter re-randomization within the multi-authority ABE framework. The proposed scheme also supports message integrity verification, easily executable by onboard terminals using simple hash functions.
By combining the above technologies, this method not only protects the communication channel but also improves the security of the information.

6. Performance evaluation

6.1. Experimental setup

The following outlines the hardware and software contexts utilized for conducting the experiment:

• The experimental apparatus consists of a desktop computer equipped with a 3.2 GHz AMD Ryzen 5 5600X CPU and 16 GB of RAM, running the Windows 11 Professional (x64) OS.
• The experimental schemes are realized using Java 8 and the JPBC 2.0.0 library [32]. The prime-order bilinear pairings are constructed upon a 160-bit elliptic curve group, which is founded on the equation y² = x³ + x.

6.2. Theoretical analysis

Table 1 provides a side-by-side comparison of the functionality of our proposed scheme in relation to other schemes. Scheme [25] supports outsourced decryption and online encryption, but the rest of the functionality is not realized. Scheme [28] introduces multiple authorities to protect against collusion attacks. Scheme [29] only provides outsourced decryption, so the efficiency of its encryption phase is limited. Schemes [31-34] add CRF modules between entities on top of the above designs; however, these schemes either lack outsourced decryption or lack multiple attribute authorities. Our scheme provides both of these features, taking into account both efficiency and security. Through the comparison, we can see that the proposed scheme adds cryptographic reverse firewalls between entities; by employing these firewalls, the system is fortified with a layer of defense that maintains its functional integrity against potential subversion attacks and any attempts to tamper with its algorithms.

The introduction of multiple attribute authorities ensures that the system is resistant to collusion attacks. The proposed scheme also provides outsourced decryption as well as offline encryption, which keeps the computation required of users to process the ciphertext low. Additionally, verification credentials empower users to check and ensure the ciphertext's integrity.

The following notations are applied within Tables 2 and 3: E signifies an exponentiation operation and P denotes a bilinear pairing operation. M signifies the number of rows of the matrix as well as the number of leaf nodes of an access tree. The symbol l denotes the total number of attributes possessed by users, while k signifies the minimum number of attributes from the access structure required to fulfill the decryption criteria.

As shown in Table 2, our scheme is mid-range in the KeyGen phase. However, our scheme achieves the lowest computational overhead in the Enc.Online phase. In the Dec.Out phase, our scheme does not achieve significant advantages, but in the Dec.User phase it requires only a single exponentiation, reaching a constant level of computational overhead.
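The Token check described in the message-verification feature above is a plain hash comparison, cheap enough for an onboard terminal. A minimal sketch, with SHA-256 standing in for the paper's collision-resistant H_0 (the function names are illustrative, not from the paper):

```python
import hashlib

def make_token(m: bytes) -> bytes:
    """Token = H_0(m), computed by the data owner at encryption time."""
    return hashlib.sha256(m).digest()

def verify(m: bytes, token: bytes) -> bool:
    """Data user accepts the recovered plaintext m only if H_0(m) == Token."""
    return hashlib.sha256(m).digest() == token

token = make_token(b"road-condition update")
assert verify(b"road-condition update", token)
assert not verify(b"tampered update", token)   # any modification is detected
```

Collision resistance of H_0 is what prevents the cloud (or a man in the middle) from substituting a different plaintext that passes the same Token.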
Fig. 3. Time consumption of basic scheme.

Table 2
Computation comparison.
Scheme | KeyGen | Enc.Offline | Enc.Online | Outsourced decryption | User decryption
Guo et al. [25] | (l+4)E | (3M+1)E | 3E | 2lE + 2lP | E
Chaudhary et al. [28] | (2l+2)E | ✕ | (3M+1)E | (4l+2)E | E
Zhong et al. [29] | (3l+6)E | ✕ | (2M+2)E | ✕ | 2lE + (l+1)P
Hong et al. [31] | (4l+2)E + P | ✕ | (5M+2)E | ✕ | E + (3k+1)P
Zhao et al. [32] | (2l+4)E | 3ME + P | 3E | (3l+1)E + (2l+1)P | 2E
Jin et al. [33] | lE + P | ✕ | 6ME + 3P | ✕ | lE + 2P
Elhabob et al. [34] | (2l+2)E | ✕ | 4E | ✕ | 3E
Ours | (2l+3)E | (2M+2)E | 3E | lE + 3lP | E

Table 3
Time consumption of CRFs.
Scheme | W_AA.SetUp | W_AA.KG | W_DO.Enc.Online
Hong et al. [31] | 2lE + 2lP | (5l+2)E | 2lE + P
Zhao et al. [32] | 2E | (2l+3)E | 4E
Jin et al. [33] | (l+2)E | (2l+2)E | P
Elhabob et al. [34] | 2E | (2l+3)E | 4E
Ours | 5E | (2l+3)E | 2E

In terms of the CRFs' time consumption, our scheme achieves constant-level time consumption in the W_AA.SetUp phase, as illustrated in Table 3: the time overhead does not fluctuate with the count of attributes within the system. Moreover, our scheme achieves the highest efficiency in the W_DO.Enc.Online phase, requiring only two exponentiations.

6.3. Practical analysis

In light of the hardware and software environment described in the Experimental Setup section, Fig. 3 presents a performance comparison of the multiple phases of our scheme.

Fig. 3(a) demonstrates that the computational overhead of our scheme is low. As shown in Fig. 3(b), when comparing the computational overhead of the Enc.Online phase, our scheme, which benefits from the preprocessing performed in the Enc.Offline phase, has the lowest computational overhead of all the schemes evaluated. In terms of Fig. 3(c), the efficiency of our scheme is mid-range in the Dec.Out phase, while in the Dec.User phase our scheme maintains the lowest computational overhead. It is also significant to observe that this overhead does not fluctuate with varying counts of attributes in the system.

As depicted in Fig. 4, there is a performance comparison for the re-randomization of secret keys by the CRF W_AA. Our scheme's computational overhead is similar to that of scheme [32], at the lower level. Moreover, as shown in Fig. 5, the computational overhead of our scheme in the W_DO.Enc.Online phase is the most efficient and does not escalate linearly with an increase in vehicle attributes, which is a distinct advantage over scheme [31]. Compared with [33,34], the proposed scheme also retains an advantage in the computational overhead of the W_AA.SetUp phase.

In summary, our scheme reduces resource consumption on the user side and improves the efficiency of data flow in vehicles with limited computing power.
Fig. 4. Time consumption of 𝐴𝐴.𝑆𝑒𝑡𝑈𝑝.

Fig. 5. Time consumption of 𝐷𝑂.𝐸𝑛𝑐.𝑂𝑛𝑙𝑖𝑛𝑒.

7. Conclusion

In the IoV environment, securing the encryption and sharing of the vast amounts of data generated by vehicles, while preventing data leakage due to device tampering, presents significant challenges. To address these challenges, we propose an advanced attribute-based encryption scheme, enhanced with a cryptographic reverse firewall, specifically designed for the IoV ecosystem. This scheme is supported by multiple attribute authorities, which not only defend against collusion attacks but also enable offline encryption and outsourced decryption. These integrated features greatly improve the computational efficiency of vehicular onboard units. Additionally, we deploy RSUs with CRFs between the entities, ensuring that data remains secure even in the event of device tampering. The proposed attribute-based encryption scheme, combined with the reverse firewall mechanism, shows great promise in securing data transmission and storage within the IoV, while protecting against unauthorized access and data leakage.

CRediT authorship contribution statement

Xiaodong Yang: Writing – review & editing, Writing – original draft. Xilai Luo: Writing – review & editing, Writing – original draft. Zefan Liao: Writing – review & editing, Writing – original draft. Wenjia Wang: Writing – review & editing, Writing – original draft. Xiaoni Du: Writing – review & editing, Writing – original draft. Shudong Li: Writing – review & editing, Writing – original draft.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the Key Project of the Gansu Science and Technology Plan (23YFGA0081), the Gansu Province College Industry Support Plan (2023CYZC-09), and the National Natural Science Foundation of China (No. 62362059).

Data availability

The authors do not have permission to share data.

References

[1] Siyi Liao, Jun Wu, Jianhua Li, Ali Kashif Bashir, Shahid Mumtaz, Alireza Jolfaei, Nida Kvedaraite, Cognitive popularity based AI service sharing for software-defined information-centric networks, IEEE Trans. Netw. Sci. Eng. 7 (4) (2020) 2126–2136.
[2] Rich Miller, Rolling zettabytes: Quantifying the data impact of connected cars, Data Cent. Front. (2020).
[3] Kayhan Zrar Ghafoor, Linghe Kong, Sherali Zeadally, Ali Safaa Sadiq, Gregory Epiphaniou, Mohammad Hammoudeh, Ali Kashif Bashir, Shahid Mumtaz, Millimeter-wave communication for internet of vehicles: status, challenges, and perspectives, IEEE Internet Things J. 7 (9) (2020) 8525–8546.
[4] Soheila Ghane, Alireza Jolfaei, Lars Kulik, Kotagiri Ramamohanarao, Deepak Puthal, Preserving privacy in the internet of connected vehicles, IEEE Trans. Intell. Transp. Syst. 22 (8) (2020) 5018–5027.
[5] Liang Zhao, Hongmei Chai, Yuan Han, Keping Yu, Shahid Mumtaz, A collaborative V2X data correction method for road safety, IEEE Trans. Reliab. 71 (2) (2022) 951–962.
[6] Weisong Shi, Jie Cao, Quan Zhang, Youhuizi Li, Lanyu Xu, Edge computing: Vision and challenges, IEEE Internet Things J. 3 (5) (2016) 637–646.
[7] Zhenyu Zhou, Haijun Liao, Bo Gu, Shahid Mumtaz, Jonathan Rodriguez, Resource sharing and task offloading in IoT fog computing: A contract-learning approach, IEEE Trans. Emerg. Top. Comput. Intell. 4 (3) (2019) 227–240.
[8] Xingwang Li, Zhen Xie, Zheng Chu, Varun G Menon, Shahid Mumtaz, Jianhua Zhang, Exploiting benefits of IRS in wireless powered NOMA networks, IEEE Trans. Green Commun. Netw. 6 (1) (2022) 175–186.
[9] Vipul Goyal, Omkant Pandey, Amit Sahai, Brent Waters, Attribute-based encryption for fine-grained access control of encrypted data, in: Proceedings of the 13th ACM Conference on Computer and Communications Security, 2006, pp. 89–98.
[10] Amit Sahai, Brent Waters, Fuzzy identity-based encryption, in: Advances in Cryptology–EUROCRYPT 2005: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Aarhus, Denmark, May 22-26, 2005. Proceedings 24, Springer, 2005, pp. 457–473.
[11] John Bethencourt, Amit Sahai, Brent Waters, Ciphertext-policy attribute-based encryption, in: 2007 IEEE Symposium on Security and Privacy, SP'07, IEEE, 2007, pp. 321–334.
[12] Matthew Green, Susan Hohenberger, Brent Waters, Outsourcing the decryption of ABE ciphertexts, in: 20th USENIX Security Symposium, USENIX Security 11, 2011.
[13] Junzuo Lai, Robert H. Deng, Chaowen Guan, Jian Weng, Attribute-based encryption with verifiable outsourced decryption, IEEE Trans. Inf. Forensics Secur. 8 (8) (2013) 1343–1354.
[14] Suqing Lin, Rui Zhang, Hui Ma, Mingsheng Wang, Revisiting attribute-based encryption with verifiable outsourced decryption, IEEE Trans. Inf. Forensics Secur. 10 (10) (2015) 2119–2130.
[15] Cong Zuo, Jun Shao, Guiyi Wei, Mande Xie, Min Ji, CCA-secure ABE with outsourced decryption for fog computing, Future Gener. Comput. Syst. 78 (2018) 730–738.
[16] James Ball, Julian Borger, Glenn Greenwald, et al., Revealed: how US and UK spy agencies defeat internet privacy and security, Know Your Neighb. (2013).
[17] Stephen Checkoway, Ruben Niederhagen, Adam Everspaugh, Matthew Green, Tanja Lange, Thomas Ristenpart, Daniel J Bernstein, Jake Maskiewicz, Hovav Shacham, Matthew Fredrikson, On the practical exploitability of dual EC in TLS implementations, in: 23rd USENIX Security Symposium, USENIX Security 14, 2014, pp. 319–335.
[18] Yevgeniy Dodis, Chaya Ganesh, Alexander Golovnev, Ari Juels, Thomas Ristenpart, A formal treatment of backdoored pseudorandom generators, in: Advances in Cryptology–EUROCRYPT 2015: 34th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Sofia, Bulgaria, April 26-30, 2015, Proceedings, Part I 34, Springer, 2015, pp. 101–126.
[19] Ilya Mironov, Noah Stephens-Davidowitz, Cryptographic reverse firewalls, in: Advances in Cryptology–EUROCRYPT 2015: 34th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Sofia, Bulgaria, April 26-30, 2015, Proceedings, Part II 34, Springer, 2015, pp. 657–686.
[20] Brent Waters, Ciphertext-policy attribute-based encryption: An expressive, efficient, and provably secure realization, in: International Workshop on Public Key Cryptography, Springer, 2011, pp. 53–70.
[21] Shucheng Yu, Cong Wang, Kui Ren, Wenjing Lou, Achieving secure, scalable, and fine-grained data access control in cloud computing, in: 2010 Proceedings IEEE INFOCOM, IEEE, 2010, pp. 1–9.
[22] Kan Yang, Xiaohua Jia, Kui Ren, Ruitao Xie, Liusheng Huang, Enabling efficient access control with dynamic policy updating for big data in the cloud, in: IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, IEEE, 2014, pp. 2013–2021.
[23] Jun Feng, Hu Xiong, Jinhao Chen, Yang Xiang, Kuo-Hui Yeh, Scalable and revocable attribute-based data sharing with short revocation list for IIoT, IEEE Internet Things J. 10 (6) (2022) 4815–4829.
[24] Qian Mei, Hu Xiong, Yeh-Cheng Chen, Chien-Ming Chen, Blockchain-enabled privacy-preserving authentication mechanism for transportation CPS with cloud-edge computing, IEEE Trans. Eng. Manage. (2022).
[25] Rui Guo, Geng Yang, Huixian Shi, Yinghui Zhang, Dong Zheng, O3-R-CP-ABE: An efficient and revocable attribute-based encryption scheme in the cloud-assisted IoMT system, IEEE Internet Things J. 8 (11) (2021) 8949–8963.
[26] Melissa Chase, Multi-authority attribute based encryption, in: Theory of Cryptography: 4th Theory of Cryptography Conference, TCC 2007, Amsterdam, The Netherlands, February 21-24, 2007. Proceedings 4, Springer, 2007, pp. 515–534.
[27] Allison Lewko, Brent Waters, Decentralizing attribute-based encryption, in: Annual International Conference on the Theory and Applications of Cryptographic Techniques, Springer, 2011, pp. 568–588.
[28] Chandan Kumar Chaudhary, Richa Sarma, Ferdous Ahmed Barbhuiya, RMA-CPABE: A multi-authority CPABE scheme with reduced ciphertext size for IoT devices, Future Gener. Comput. Syst. 138 (2023) 226–242.
[29] Hong Zhong, Yiyuan Zhou, Qingyang Zhang, Yan Xu, Jie Cui, An efficient and outsourcing-supported attribute-based access control scheme for edge-enabled smart healthcare, Future Gener. Comput. Syst. 115 (2021) 486–496.
[30] Hui Ma, Rui Zhang, Guomin Yang, Zishuai Song, Shuzhou Sun, Yuting Xiao, Concessive online/offline attribute based encryption with cryptographic reverse firewalls—Secure and efficient fine-grained access control on corrupted machines, in: Computer Security: 23rd European Symposium on Research in Computer Security, ESORICS 2018, Barcelona, Spain, September 3-7, 2018, Proceedings, Part II 23, Springer, 2018, pp. 507–526.
[31] Bo Hong, Jie Chen, Kai Zhang, Haifeng Qian, Multi-authority non-monotonic KP-ABE with cryptographic reverse firewall, IEEE Access 7 (2019) 159002–159012.
[32] Yang Zhao, Yuwei Pang, Xingyu Ke, Bintao Wang, Guobin Zhu, Mingsheng Cao, A metaverse-oriented CP-ABE scheme with cryptographic reverse firewall, Future Gener. Comput. Syst. 147 (2023) 195–206.
[33] C. Jin, Z. Chen, W. Qin, et al., Blockchain-based proxy re-encryption scheme with cryptographic reverse firewall for IoV, Int. J. Netw. Manage. (2024) e2305.
[34] R. Elhabob, N. Eltayieb, H. Xiong, et al., Equality test public key encryption with cryptographic reverse firewalls for cloud-based E-commerce, IEEE Trans. Consum. Electron. (2024).

Xiaodong Yang (Member, IEEE) received the M.S. degree in cryptography from Tongji University, Shanghai, China, in 2005, and the Ph.D. degree in cryptography from Northwest Normal University, Lanzhou, China, in 2010. In his role as a Postdoctoral Researcher at China's State Key Laboratory of Cryptology in Beijing during 2016, he played a significant part in advancing the field. Today, he holds the position of Professor at the College of Computer Science and Engineering, Northwest Normal University. The core of his research is anchored in public-key cryptography, information security protocols, and the application of wireless sensor networks.

Xilai Luo is presently a master's degree candidate at the College of Computer Science and Engineering, Northwest Normal University, China. His academic pursuits are focused on the areas of artificial intelligence, information security, and cryptography.

Zefan Liao is working towards his master's degree in the College of Computer Science and Engineering at Northwest Normal University, China. His areas of research interest include edge computing, information security, and cryptography.

Wenjia Wang is pursuing her master's degree within the College of Computer Science and Engineering at Northwest Normal University, China. Her research interests are centered on data security and network security.

Xiaoni Du received the Ph.D. degree in cryptography from Xidian University, Xi'an, China, in 2008. She worked as a Visiting Scholar with the University of Kentucky, Lexington, KY, USA, and Hong Kong University of Science and Technology, Hong Kong, in 2011 and 2014, respectively. She is currently a Professor with the College of Mathematics and Statistics, Northwest Normal University, Lanzhou, China. Her main research interests include information security, cryptography, and coding.

Shudong Li received the M.S. degree in applied mathematics from Tongji University, Shanghai, China, in 2005, and the Ph.D. degree in Posts and Telecommunications from Beijing University, Beijing, China, in 2012. From 2013 to 2018, he held the position of a postdoctoral researcher at the National University of Defense Technology in Changsha, China. He now serves as a Professor at the Cyberspace Institute of Advanced Technology at Guangzhou University. His primary research interests are in the realms of Big Data and its security, malware identification, and cloud computing.
View File
@@ -0,0 +1,965 @@
Journal of Systems Architecture 160 (2025) 103345
Contents lists available at ScienceDirect
Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc
A hash-based post-quantum ring signature scheme for the Internet of Vehicles
Shuanggen Liu a,∗, Xiayi Zhou a, Xu An Wang b, Zixuan Yan a, He Yan a, Yurui Cao a
a School of Cyberspace Security, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi, China
b Key Laboratory of Network and Information Security, Engineering University of People's Armed Police, Shaanxi, China
ARTICLE INFO ABSTRACT
Keywords: With the rapid development of the Internet of Vehicles, securing data transmission has become crucial,
Ring signature especially given the threat posed by quantum computing to traditional digital signatures. This paper presents
Internet of Vehicles a hash-based post-quantum ring signature scheme built upon the XMSS hash-based signature framework,
Merkle tree
leveraging Merkle trees for efficient data organization and verification. In addition, the scheme is applied to
Post-quantum digital signature
the Internet of Vehicles, ensuring both anonymity and traceability while providing robust quantum-resistant
Hash-based signature scheme
security. Evaluation results indicate that, compared to other schemes, the proposed method achieves superior
verification speed while ensuring data security and privacy.
1. Introduction

As a fundamental necessity in modern life, the number of vehicles produced worldwide continues to grow. According to relevant statistics, global vehicle production reached 94 million units in 2023 [1]. Additionally, data from the International Organization of Motor Vehicle Manufacturers indicates that there are now 1.3 billion vehicles in use [2]. However, this growth brings various challenges, including network attacks, unauthorized access, and concerns around road safety and privacy. To address these issues, new research fields, such as intelligent transportation systems (ITS) and the Internet of Vehicles (IoV), have emerged. These fields aim to provide safer, more efficient, and more harmonious vehicular environments. Vehicle-to-Everything (V2X) technology enables the effective use of dynamic information from all networked vehicles via on-board devices, facilitating secure, efficient, intelligent, and comfortable services, thereby contributing to the intelligence of social traffic systems [3]. The typical VANET structure is shown in Fig. 1.

With the increasing number of vehicles and the development of the IoV, ensuring the security of IoV systems has become a vital task. Currently, the security of vehicular networks, whether internal or external, primarily relies on digital signatures or public-key encryption. However, as quantum computing advances, traditional digital signature algorithms are increasingly vulnerable to quantum attacks, making it essential to incorporate post-quantum digital signature algorithms into IoV research. Unlike traditional computers, quantum computers can accelerate the cracking of probabilistic algorithms through parallel computation capabilities [4]. In light of these challenges, post-quantum cryptography has become a critical area of study, with the aim of establishing a resilient foundation for the industry. The National Institute of Standards and Technology (NIST) has been conducting a multi-stage standardization process for post-quantum cryptography. The third round of candidate evaluations has been completed, and algorithms such as SPHINCS+, CRYSTALS-DILITHIUM, and CRYSTALS-KYBER have been standardized. These algorithms achieve varying levels of bit-level security depending on key size and parameter settings, which align with NIST security levels from 1 to 5, representing 128/160/192/224/256-bit security strengths, respectively [5]. A post-quantum digital signature scheme is a digital signature scheme capable of resisting quantum attacks. Among post-quantum digital signature schemes, hash-based schemes are particularly effective and provably secure. Hash-based post-quantum digital signature schemes offer significant advantages over other types of post-quantum schemes due to their high computational efficiency, scalability, maturity, and reliance solely on the preimage resistance of the underlying hash function [6].

In IoV networks, where both privacy and traffic safety are essential, ring signatures are especially suitable. Ring signature schemes offer anonymity by concealing the identity of the signer among a group of participants. Using hash-based post-quantum ring signatures, vehicles can sign messages anonymously within a group, ensuring their identities cannot be traced. These signatures also provide unforgeability, collision resistance, resilience against quantum attacks, and low communication overhead. In densely populated cities, managing keys for secure vehicular communications can be challenging, especially given the limited IoV coverage [7]. The Merkle tree structure effectively compresses keys, reducing key management costs [8]. In this study, we propose a
∗ Corresponding author.
E-mail address: liushuanggen201@xupt.edu.cn (S. Liu).
https://doi.org/10.1016/j.sysarc.2025.103345
Received 11 November 2024; Received in revised form 23 December 2024; Accepted 16 January 2025
Available online 23 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
Fig. 1. VANET structure.

hash-based post-quantum ring signature scheme for IoV applications. The ring signature algorithm of our scheme is based on the XMSS algorithm, aiming to enhance data sharing security and efficiency. Merkle trees are used to organize and verify data efficiently, while ring signatures ensure the authenticity and integrity of data within the IoV network without compromising user anonymity.

1.1. Related works

In recent years, hash-based post-quantum digital signature schemes have garnered significant attention within the cryptography community. Following the fourth round of the NIST post-quantum digital signature standardization process, the SPHINCS+ algorithm was introduced as a supplementary standard, featuring a flexible, tunable hash function structure [9]. As the standardization process progresses, researchers have proposed various adaptations, including SPHINCS-a and SPHINCS+-c, which further compress signature sizes and enhance execution speeds [10,11]. Additionally, Sun, Liu, and colleagues developed a domestic signature algorithm based on the post-quantum hash function SM3 [12]. Hülsing and Kudinov provided a rigorous security proof for the SPHINCS+ algorithm, confirming its robustness in a post-quantum environment [13]. The XMSS algorithm forms the foundation of SPHINCS+, with its architectural design and security proof presented by Hülsing, Butin, and others [14]. Research on hardware implementations of the XMSS algorithm has also advanced, with significant contributions from Thoma and Güneysu [15]. Meanwhile, Sun and Liu investigated the feasibility of replacing the hash function in XMSS with the domestic SM3 hash function [16]. An essential component of XMSS is WOTS+, a one-time signature algorithm; Hülsing provided its security proof [17], while Zhang, Cui, and colleagues evaluated the efficiency of WOTS+ in tree-based one-time signature algorithms [18]. Currently, research on post-quantum digital signatures primarily concentrates on enhancing signature efficiency and replacing the underlying hash functions. However, there is a scarcity of studies that integrate post-quantum digital signatures with specific application scenarios or explore their variants.

The exploration of post-quantum ring signatures is also accelerating in post-quantum digital signature research. Xie, Wang, and colleagues highlighted that traditional signature algorithms are highly susceptible to quantum computing attacks, and noted that ring signatures offer considerable advantages in blockchain applications, including medical data sharing and vehicular networking, due to their unique properties [19]. Chatterjee, Chung, et al. conducted an in-depth analysis of the security of post-quantum ring signatures, re-examined the security of classical signatures and ring signatures in the quantum environment, and proposed two short signature schemes, implemented in the quantum random oracle model and the standard model, respectively [20]. Recent literature has introduced novel architectures, such as linkable ring signatures, threshold ring signatures, and identity-based post-quantum ring signatures, discussing their post-quantum security features [21–23]. Similarly, literature [24] systematically reviews the theory and application of linkable ring signatures, providing an in-depth comparison of anonymization and linkability schemes. However, these studies lack analysis of specific application scenarios (such as the IoV) and do not fully consider resource-constrained environments or the potential of quantum-resistant computing.

In response to the research of NIST on post-quantum algorithms and verifiable ring signatures, a blockchain-based, post-quantum anonymous, traceable, and verifiable authentication scheme was proposed to mitigate quantum attacks while addressing security and privacy concerns, with an evaluation of its feasibility in IoV environments [25]. The IoV faces significant security and privacy challenges, and blockchain technology offers an effective platform to ensure both user privacy and security [26–28]. Literature [29] proposes an identity authentication and signature scheme for UAV-assisted Vehicular Ad Hoc Networks (VANET), focusing on enhancing network anonymity and user privacy through an efficient authentication mechanism. Literature [30] introduces a distributed message authentication scheme combined with a reputation mechanism to improve the security and trust of the IoV. The scheme uses node credit values to authenticate message validity, effectively preventing malicious attacks and forgery. Literature [31] presents an authentication key negotiation protocol for intelligent transportation systems in vehicle networks, strengthening identity authentication and key exchange mechanisms to prevent security threats such as eavesdropping, tampering, and man-in-the-middle attacks. While these studies address key security challenges in vehicular networks, they often focus on specific aspects, lacking comprehensive and scalable frameworks for real-world scenarios. Furthermore, the integration of post-quantum cryptography and scalability in dynamic, large-scale networks remains underexplored, highlighting opportunities for future research into robust and future-proof solutions. Given the inherent advantages of ring signatures, they are particularly well-suited for applications such as the Internet of Vehicles, making further investigation essential.

In order to ensure the post-quantum security of data transmission in the IoV environment, researchers have proposed various solutions. The literature [32] recommends the use of lattice-based post-quantum digital signatures, but the signature algorithm has not been combined with specific scenarios. Another study [33] proposed a ring signature scheme based on lattice hard problems and combined it with the vehicle-connected environment, but the quantum attack-resistance characteristics of the scheme were not explained in detail. In addition, reducing energy consumption in blockchain has also become a research focus [34]: an energy-saving method is adopted to calculate the root of the Merkle tree, a Merkle tree design scheme conforming to the specification is proposed, and the effectiveness of this method is verified through experiments. At the same time, the Merkle tree accumulator algorithm proposed by Derler and Ramacher in [35] builds an accumulator that can resist quantum attacks by using only hash functions and symmetric-key primitives, and gives specific operations and definitions. However, the specific algorithm implementation and its combination with practical application scenarios need to be further studied.

1.2. Contributions

Firstly, building on the Merkle tree accumulator algorithm described in Ref. [35], we propose a hash-based ring signature algorithm specifically designed for the IoV: we improve the Merkle tree accumulator algorithm into an XMSS accumulator algorithm. This algorithm integrates the principles of ring signatures with Merkle tree structures. Unlike
Table 1
Notation for ring signature scheme.
λ  Security parameter
N  The size of the ring
(pk, sk)  Key pair
R  A ring consisting of (pk_1, pk_2, …, pk_l)
m  The message digest
σ  The signature of message

traditional ring signature algorithms, this proposed scheme can resist quantum attacks, thus offering post-quantum security.

Secondly, we construct a new hash-based post-quantum ring signature scheme for application in vehicular networks. This scheme enhances the security of data transmission within the vehicular network, providing robust post-quantum security to effectively protect shared data.

1.3. Structure

The remainder of this paper is organized as follows: Chapter 2 provides the necessary foundational knowledge, along with a review of the background and related work relevant to this study. In Chapter 3, we present a post-quantum ring signature algorithm based on Merkle trees and discuss its application within the IoV environment. Chapter 4 offers a security analysis and proof of the robustness of the proposed scheme. In Chapter 5, we evaluate the performance of the scheme and compare it with existing alternatives. Finally, Chapter 6 concludes the paper and outlines directions for future research.

2. Preliminaries

2.1. Ring signature

Ring signature is a digital signature scheme introduced by Rivest, Shamir, and Tauman in 2001. A ring is composed of a group of members, allowing any member within the group to sign on behalf of the entire group without revealing the identity of the signing member [36]. The main parameters of ring signature are given in Table 1.

Definition 1 (Ring Signature). A ring signature scheme consists of three core algorithms: key generation, signature generation, and signature verification. These algorithms are defined as follows:

Step 1: Key generation. (pk, sk) ← Gen(λ, N): the size of the ring is N; with the security parameter λ and the maximum number of members in the ring N as input, output the public and private key pair.

Step 2: Signature generation. σ ← Sign(sk, R, m): input the private key sk, the set of all public keys R = (PK_1, PK_2, …, PK_L), and a message m ∈ M_λ; output the signature σ.

Step 3: Signature verification. True/False ← Ver(R, m, σ): input the collection of all public keys R, the message m ∈ M_λ, and the signature σ; output True or False.

A ring signature must satisfy two critical security properties: anonymity and unforgeability. Anonymity ensures that while the signature indicates it was generated by a member of the ring, it does not reveal the specific identity of the signer. Unforgeability guarantees that only members of the ring can generate valid signatures; outsiders cannot create valid signatures for the ring.

Definition 2 (Unforgeability). Unforgeability ensures that only members of the ring can generate a valid signature. In the unforgeability model, we assume that the attacker has access to a public key and aims to produce a valid ring signature without authorization.

Let the security parameter be λ and the ring signature scheme RS = (Gen, Sig, Ver); for any PPT adversary A and any integer s, define the following experiment:

Step 1: the challenger generates s key pairs (pk_i, sk_i), i ∈ [1, s], and sends all the public keys as a set PK = (PK_1, PK_2, …, PK_s) to A.

Step 2: the challenger chooses one PK_i and checks whether PK_i belongs to R; the challenger computes Sig(sk_i, R, m) → σ and sends σ to A.

Step 3: the attacker outputs the tuple (R*, m*, σ*), and the challenger checks it. If R* ⊆ PK, A never made a signature query on (sign, R*, m*), and Ver(R*, m*, σ*) = 1, the experiment returns 1, and 0 otherwise.

Adv_UNF^{λ,s}(A) = Pr[Exp_UNF^{λ,s}(A) = 1] ≤ negl(λ)

Definition 3 (Anonymity). Anonymity in a ring signature scheme ensures that the identity of the signer remains concealed among a group of potential signers, making it impossible to determine who specifically generated the signature. This anonymity is achieved through a ring signature generation process that relies on the public keys of all group members, without revealing the identity of the actual signer.

In the anonymity experiment, the adversary is given a ring signature generated under one of two public/private key pairs, both of whose public keys are known to the adversary; the goal of the adversary is to distinguish which of the two private keys was used to generate the ring signature, succeeding with at most negligible advantage.

Let the security parameter be λ, the ring signature RS = (Gen, Sig, Ver), and A a polynomial-time algorithm; for any integer s and any bit b, define the experiment as follows:

Step 1: the challenger generates s key pairs (PK_i, SK_i), i ∈ [1, s], and sends all the public keys PK_i to A.

Step 2: A sends (R, m, i_0, i_1) to the challenger; the challenger checks that pk_{i_0} ∈ R and pk_{i_1} ∈ R, computes σ ← Sig(sk_{i_b}, R, m), and sends σ to A.

Step 3: A returns a guess bit b′; the experiment outputs 1 if b′ = b and 0 otherwise. RS is considered anonymous if, for all s and all polynomial-time algorithms A, the probability of A returning 1 in the (s, 0)-anonymity experiment is negligibly close (in λ) to the probability of A returning 1 in the (s, 1)-anonymity experiment.

Adv_ANON^{λ,s}(A) = |Pr[Exp_ANON^{λ,s}(A) = 1] − 1/2| ≤ negl(λ)

2.2. WOTS+

Ralph Merkle pioneered hash-based signature algorithms, as noted in Ref. [37]. Currently, hash-based signature schemes are categorized into three main types: one-time signature schemes (OTS), few-time signature schemes (FTS), and many-time signature schemes (MTS). Table 2 summarizes some of the most widely used hash-based signature schemes. Research on OTS schemes began with the Lamport–Diffie algorithm. This paper adopts the WOTS+ (Winternitz One-Time Signature Plus) scheme, which comprises three main components: key generation (GEN), signature generation (SIG), and signature verification (VER).

The first step is parameter selection: an integer ω ∈ ℕ with ω ≥ 2 is chosen, determining the number of hash iterations required to construct the public key. Additionally, the hash output length m and the security parameter n need to be defined. Next, parameters l_1 and l_2 are computed and summed to obtain l. The calculation method is as follows:

l_1 = ⌈m / log_2 ω⌉,  l_2 = ⌊(log_2(l_1(ω − 1)) + log_2 ω) / log_2 ω⌋,  l = l_1 + l_2
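As a sanity check, the chain counts can be computed directly from the formulas above. The sketch below is illustrative Python (not from the paper), treating ω as the usual Winternitz parameter w and using a 256-bit digest as an example:

```python
import math

def wots_params(m: int, w: int):
    """Compute the WOTS+ chain counts l1, l2, l from the formulas above.

    m -- bit length of the message digest
    w -- Winternitz parameter (w >= 2)
    """
    l1 = math.ceil(m / math.log2(w))
    # floor((log2(l1*(w-1)) + log2(w)) / log2(w)) equals floor(log_w(l1*(w-1))) + 1
    l2 = math.floor((math.log2(l1 * (w - 1)) + math.log2(w)) / math.log2(w))
    return l1, l2, l1 + l2

# With a 256-bit digest and w = 16: 64 message chains and 3 checksum chains.
print(wots_params(256, 16))  # (64, 3, 67)
```

With m = 256 and w = 16 this yields l = 67 chains, the familiar WOTS+ figure for 256-bit digests.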
Table 2
Classification table for hash-based signature schemes.
OTS  Lamport–Diffie, WOTS, WOTS+
FTS  HORS, HORS-T, PORS, PORS-T
MTS  XMSS, SPHINCS, SPHINCS+

Table 3
Parameter descriptions for the WOTS+ algorithm.
n ∈ ℕ  Security parameter
w ∈ ℕ  Winternitz parameter (w ≥ 2)
m ∈ ℕ  Bit length of the message digest
F_n  A set of functions, F_n = {f_k | k ∈ {0, 1}^n}, f_k: {0, 1}^n → {0, 1}^n
h ∈ ℕ  Height of the tree
H  Hash function, H: {0, 1}* → {0, 1}^m
x ∈ {0, 1}^n  Randomly chosen string x, used to construct a one-time verification key

Fig. 2. Key generation process for WOTS+.
Table 3 gives the meaning of the parameters in the formulas. Next we define the operations. WOTS+ uses the function family F_n: {0, 1}^n → {0, 1}^n.

Fig. 3. Message digest generation graph.

Define the chain function:

c^i(x, r) = F(c^{i−1}(x, r) ⊕ r_i),  i > 0
c^0(x, r) = x

where x ∈ {0, 1}^n, F ∈ F_n: {0, 1}^n → {0, 1}^n, and r = (r_1, r_2, …, r_{2^ω−1}) ∈ {0, 1}^{n×(2^ω−1)}.
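A minimal sketch of this chain function in Python may make the recursion concrete. This is illustrative only: SHA-256 stands in for the family member f_k, and the paper's 1-indexed masks r_1, …, r_{2^ω−1} are stored 0-indexed.

```python
import hashlib

N = 32  # n = 256-bit security parameter, in bytes

def F(x: bytes) -> bytes:
    """One member f_k of the function family F_n (SHA-256 as a stand-in)."""
    return hashlib.sha256(x).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

def chain(x: bytes, r: list, i: int) -> bytes:
    """c^i(x, r): apply F i times, XOR-ing in mask r_i before each step."""
    if i == 0:
        return x  # base case c^0(x, r) = x
    return F(xor(chain(x, r, i - 1), r[i - 1]))  # paper's r_i is r[i-1] here
```

Each step masks the running value with the next r_i before hashing, which is what distinguishes WOTS+ chains from plain iterated hashing.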
Step 1: Key Generation (GEN)
Key generation mainly includes two steps: private key generation and public key generation. The process is shown in Fig. 2.
(1) Private key generation: Use a PRG to generate l + 2^ω − 1 random n-bit strings. The first l strings form the private key sk = (sk_0, sk_1, …, sk_{l−1}), and the last 2^ω − 1 are the masks r = (r_1, r_2, …, r_{2^ω−1}).
(2) Public key generation: The public key consists of l + 1 blocks; the first block is the mask r, and the last l blocks are obtained by advancing each private key block to the end of its chain:

    pk_i = c^{2^ω−1}(sk_{i−1}, r),  i ∈ [1, l]
    pk = (pk_0, pk_1, …, pk_l) = (r, c^{2^ω−1}(sk_0, r), …, c^{2^ω−1}(sk_{l−1}, r))

Fig. 4. WOTS+ signature generation diagram.
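Taken together with the signing and verification steps described next, this key generation can be condensed into a toy sketch. It is not the authors' implementation: SHA-256 stands in for the keyed family F_n, the sizes (n = 256 bits, base 16, l = 67) are illustrative, and verification completes each chain with the remaining masks:

```python
import hashlib, os

N = 32        # hash output length in bytes (n = 256 bits)
W = 16        # chain base; omega-bit chunks in the text correspond to base 2**omega
L1, L2 = 64, 3
L = L1 + L2   # l = l1 + l2 chains for a 256-bit digest in base 16

def F(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def chain(x: bytes, masks: list, steps: int) -> bytes:
    """Iterative form of c^steps(x, r): XOR the next mask, then apply F."""
    out = x
    for r in masks[:steps]:
        out = F(bytes(a ^ b for a, b in zip(out, r)))
    return out

def base_w(digest: bytes) -> list:
    """Split the digest into L1 base-W chunks, then append the checksum chunks."""
    chunks = []
    for byte in digest:
        chunks += [byte >> 4, byte & 0x0F]         # two base-16 chunks per byte
    checksum = sum(W - 1 - m_i for m_i in chunks)  # C = sum(W - 1 - m_i)
    for _ in range(L2):
        chunks.append(checksum % W)
        checksum //= W
    return chunks

def keygen():
    sk = [os.urandom(N) for _ in range(L)]
    masks = [os.urandom(N) for _ in range(W - 1)]
    pk = [chain(s, masks, W - 1) for s in sk]      # run every chain to the end
    return sk, (masks, pk)

def sign(message: bytes, sk, masks):
    b = base_w(hashlib.sha256(message).digest())
    return [chain(sk[i], masks, b[i]) for i in range(L)]

def verify(message: bytes, sig, masks, pk):
    b = base_w(hashlib.sha256(message).digest())
    # complete each chain from step b[i] using the remaining masks
    return all(chain(sig[i], masks[b[i]:], W - 1 - b[i]) == pk[i] for i in range(L))
```

A signature reveals each chain advanced b_i steps; the verifier finishes the remaining W − 1 − b_i steps and compares against the public key, which is why the checksum is needed (it prevents an attacker from advancing chains further on their own).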
Step 2: Message Signature (SIG)
(1) Generate the message digest: Hash the message m with the hash function to obtain the digest M, then divide the digest into l1 parts of ω bits each, where each ω-bit part m_i, i ∈ [0, l1 − 1], is interpreted as an integer. The digest generation process is shown in Fig. 3, and the overall signature generation process in Fig. 4.
(2) Calculate the checksum:

    C = Σ_{i=1}^{l1} (2^ω − 1 − m_i) ≤ l1(2^ω − 1)

Divide C into ω-bit blocks, giving c = (c_0, c_1, …, c_{l2−1}). Let b = (b_0, b_1, …, b_{l−1}) be the concatenation of m and c. Signature generation is then represented by the following formula:

    σ = (σ_0, σ_1, …, σ_{l−1}) = (F^{b_0}(sk_0, r), F^{b_1}(sk_1, r), …, F^{b_{l−1}}(sk_{l−1}, r))

Step 3: Message Verification (VER)
The message M is converted to b = (b_0, b_1, …, b_{l−1}). The received signature σ = (σ_0, σ_1, …, σ_{l−1}) is then processed as follows to obtain pk′; if pk′ is the same as pk, verification succeeds:

    pk′ = (r, pk′_1, pk′_2, …, pk′_l) = (r, F^{2^ω−1−b_0}(σ_0), F^{2^ω−1−b_1}(σ_1), …, F^{2^ω−1−b_{l−1}}(σ_{l−1}))

2.3. XMSS

2.3.1. Merkle tree
The Merkle Signature Scheme (MSS), proposed by Ralph Merkle in 1979, integrates the Merkle tree with an OTS algorithm. A Merkle tree is a hierarchical structure in which leaf nodes contain hash values of data and non-leaf nodes store the hash of the combination of their child nodes. This structure enables efficient data integrity verification, especially for large-scale datasets. The structure of the Merkle tree is shown in Fig. 5. According to Fig. 5, the tree has 3 layers and 2^3 = 8 leaf nodes, each storing the hash of a one-time signature public key. The leaf nodes,
Fig. 5. Merkle tree structure diagram.
labeled node0 to node7, are hashed pairwise to generate the middle nodes. The final root node stores the public key.
The Merkle tree serves two primary functions:
(1) Data integrity verification: users can check whether data has been tampered with by recalculating the root hash.
(2) Public key size compression: the storage requirements for numerous public keys are reduced by consolidating them into a single root key.

2.3.2. Key generation
The XMSS algorithm deploys 2^h WOTS+ instances as the 2^h leaf nodes of a Merkle tree of height h, with the root node authenticating these instances [38]. The XMSS key consists of multiple OTS keys, with the root of the Merkle tree serving as the public key.
Step 1: Select the parameters.
Step 2: Generate the one-time signature key pairs (pk, sk).
Step 3: Build the Merkle tree. Use each OTS public key pk_i as a leaf node of the Merkle tree. The leaf nodes are combined through a hash function into non-leaf nodes, eventually producing the root node. Each parent node in the Merkle tree is generated from the hash of its two child nodes, that is, Node(i) = H(child_left(i) ∥ child_right(i)); the root node Root serves as the XMSS public key.
Step 4: Output the key pair. The public key is pk = (root, seed); the private key consists of the OTS key pairs.

2.3.3. Message signature
To sign a message, an unused WOTS+ private key is selected, and the Merkle tree path proof is generated to output the signature SIG.
Step 1: Select a WOTS+ key. Choose an unused WOTS+ private key sk_i, ensuring it is used only once.
Step 2: Generate the WOTS+ one-time signature. Use the WOTS+ private key to sign the message M, producing the OTS signature Sig_OTS.
Step 3: Merkle tree path proof. Hash the path from leaf node pk_i to the root node; this path proves that the OTS public key is valid.
Step 4: Generate the XMSS signature. The signature includes the serial number i (indicating the i-th OTS key), the OTS signature Sig_OTS, and the authentication path AuthPath for the Merkle tree: Sig_XMSS = (i, Sig_OTS, AuthPath).

2.3.4. Signature verification
The signature verification process ensures the correctness of the OTS signature and validates that the corresponding OTS public key is consistent with the root of the Merkle tree. The main steps are as follows:
Step 1: Extract information. Extract the OTS serial number i, the OTS signature Sig_OTS, and the Merkle tree path proof AuthPath from the XMSS signature Sig_XMSS.
Step 2: Verify the OTS signature. Using the extracted OTS public key, verify the validity of Sig_OTS for the message M. If verification fails, the signature is deemed invalid.
Step 3: Compute the Merkle tree path. Using the OTS public key pk_i and the path proof AuthPath, calculate the hash value of each parent node step by step from the leaf node pk_i until the root node is obtained.
Step 4: Compare root nodes. Compare the reconstructed root node with the root node Root from the XMSS public key. If the values match, the signature is valid; otherwise, it is invalid.

3. Hash-based post-quantum ring signature scheme

In addition to high computational efficiency and excellent scalability, hash-based signature schemes such as XMSS and SPHINCS+ exhibit greater algorithmic maturity than other post-quantum digital signature schemes. Furthermore, post-quantum ring signatures ensure both the anonymity and unforgeability of signatures. Consequently, in light of the security threats posed by the rapid advancement of quantum computing, it is highly significant to integrate the post-quantum ring signature scheme with vehicle networking.

3.1. Design principles

The Merkle tree is an efficient data structure: a binary hash tree in which each node represents the hash value of a data block and the root node represents the hash of the entire data set. These characteristics make the Merkle tree a highly efficient method for storing and verifying large amounts of data. In blockchain, Merkle trees are widely used to store transaction data and block hashes. Ring signatures enable
Table 4
Meaning of parameters in the proposed scheme.

k              Security parameter
t              Maximum number of elements to accumulate
h ∈ N          Height of the tree
H              Hash function, H : {0,1}* → {0,1}^m
(sk_Ω, pk_Ω)   A key pair
X              The set of x_i, i ∈ [0, 2^h − 1]
Ω              The accumulator
aux            The auxiliary information
wit_{x_i}      The certificate for x_i

a message sender to demonstrate possession of at least one public key within a set while concealing the specific public key used, thus providing anonymity and unlinkability. This feature makes ring signatures particularly valuable in applications centered on privacy and secure communication. Within ring signatures, Merkle trees can be employed to organize the hashes of messages or data blocks into a tree structure, facilitating efficient verification of data integrity and authenticity. Furthermore, ring signatures can leverage Merkle trees to obscure the identity of the sender by integrating the public key of the signer with those of other members in a ring. Consequently, the signer can validate ownership of at least one public key in the set without disclosing the specific key used. Even if an attacker intercepts the signed message, they would be unable to ascertain the true identity of the signer.

3.2. Scheme description

This scheme is based on the definition of Merkle tree accumulators as described in [35], with slight modifications to accommodate the proposed post-quantum ring signature scheme utilizing hash functions, specifically designed for vehicular networks. This formalism facilitates the restatement of the Merkle tree accumulator algorithm within the current framework. The main parameters of this scheme are given in Table 4.

Definition 4 (Extend Merkle Tree Accumulator). The Merkle tree accumulator algorithm (Algorithm 1) comprises the following subroutines (Gen, Eval, WitCreate, Verify), defined as follows:
Gen(1^k, t): The key generation algorithm takes a security parameter k and a parameter t, where t is the upper bound on the number of elements to be accumulated, and returns a key pair (sk_Ω, pk_Ω).
Eval((sk_Ω, pk_Ω), X): This algorithm takes the key pair (sk_Ω, pk_Ω) and the set of elements X to be accumulated, returning the accumulator Ω_X and some auxiliary information aux.
WitCreate((sk_Ω, pk_Ω), Ω_X, aux, x_i): This algorithm takes the key pair (sk_Ω, pk_Ω), accumulator Ω_X, auxiliary information aux, and an element x_i. If x_i is not in the set X, it returns false; otherwise, it returns a certificate wit_{x_i} for x_i.
Verify(pk_Ω, Ω_X, wit_{x_i}, x_i): This algorithm takes the public key pk_Ω, accumulator Ω_X, certificate wit_{x_i}, and element x_i. If wit_{x_i} is a valid certificate for x_i, it returns 1; otherwise, it returns 0.

The Merkle tree accumulator ensures both correctness and collision resistance. Collision resistance indicates the difficulty of finding an element that does not belong to X yet possesses a valid certificate.

Definition 5 (Collision Resistance). Collision resistance implies that for an adversary A possessing a valid key pair (sk_Ω, pk_Ω) generated by the Gen algorithm, and under the assumption that intermediate values are correct, the probability of finding an element x_i that is not in the accumulated set X but still produces a verification result of 1 is negligible. Assuming the existence of a negligible function ε(k), collision resistance is formally defined as follows:

    Pr[ Verify(pk_Ω, Ω*_X, wit_{x_i}, x_i) = 1 ∧ x_i ∉ X :
        Gen(1^k, t) → (sk_Ω, pk_Ω), A(pk_Ω) → (wit_{x_i}, x_i, X),
        Eval_r((sk_Ω, pk_Ω), X) → Ω*_X ] ≤ ε(k)

The implementation of the Merkle tree ring signature is described next; the whole process is covered in Algorithm 1.
Step 1: Key generation Gen(1^k, t). First, determine the hash functions {H_k}_{k∈K_κ}, where for any k ∈ K_κ the hash function H_k : {0,1}* → {0,1}^κ. The hash function can be chosen as a SHA function, SM3, etc. Determine the parameter N, which represents the number of ring members, and t, the upper bound on the number of accumulated elements. Then generate the key pair and return (sk_Ω, pk_Ω).
Step 2: Public key evaluation Eval((sk_Ω, pk_Ω), X). Parse the number of ring members N. If N is not a power of 2, the function returns false, as the tree must be a perfect binary tree. If N is a power of 2, begin computation from layer 0 (the leaf nodes at the lowest level) and continue until the root (the single node at the top) is obtained. Let L_{u,v} denote the node at layer v with index u. The auxiliary variable aux stores the hash values corresponding to each layer.
Step 3: Certificate creation Wit((sk_Ω, pk_Ω), Ω_X, aux, x_i). First, parse aux into the nodes at each level of the Merkle tree. Then reconstruct the Merkle tree from bottom to top; the WitCreate algorithm uses the intermediate nodes to build up to the root hash value.
Step 4: Certificate verification Verify(pk_Ω, Ω_X, wit_{x_i}, x_i). The final step is verification. Start by setting the leaves to the hash values of each party and compute hashes from the bottom up. Check whether the final result matches the root node value; if it matches, the member is verified as part of the ring. As an example, Fig. 6 visualizes how node l_{0,2} reconstructs the root node in a Merkle tree with height h = 3 and N = 8 leaf nodes.

Algorithm 1 Extend Merkle tree accumulator
input: k, t, {H_k}_{k∈K_κ}, H_k : {0,1}* → {0,1}^κ
output: (sk_Ω, pk_Ω), L_{u,v}, wit_{x_i}, 0 or 1
1.  k ← K_κ                                  # key generation Gen(1^k, t)
2.  (sk_Ω, pk_Ω) ← {H_k}_{k∈K_κ}
3.  H_k ← pk_Ω                               # public key resolution
4.  (x_0, x_1, …, x_{n−1}) ← X
5.  If n = 2^k with k ∈ N, then for all v ≤ k:
6.      L_{u,v} = H_k(L_{2u,v+1} ∥ L_{2u+1,v+1}) if v < k, else H_k(x_u)
7.  Else return False
8.  (L_{u,v})_{u∈[n·2^{v−k}], v∈[k]} ← aux    # certificate creation WitCreate((pk_Ω, sk_Ω), Ω_X, aux_X, x_i)
9.  wit_{x_i} ← (L_{⌊i/2^v⌋+η, k−v}) for 0 ≤ v < k,
10.     where η = 1 if ⌊i/2^v⌋ (mod 2) = 0, else η = −1
11. H_k ← pk_Ω, L_{0,0} ← Ω_X                # certificate verification Verify(pk_Ω, Ω_X, wit_{x_i}, x_i)
12. recompute L ← H_k(L_{⌊i/2^v⌋,k−v} ∥ L_{⌊i/2^v⌋+1,k−v}) if ⌊i/2^v⌋ (mod 2) = 0,
        else L ← H_k(L_{⌊i/2^v⌋−1,k−v} ∥ L_{⌊i/2^v⌋,k−v})
13. Return 1 if wit_{x_i} is a valid witness for x_i ∈ X, else 0

3.3. Signature algorithm description

The hash-based post-quantum ring signature scheme explored in this work is based on the XMSS algorithm, which incorporates two primary frameworks: the WOTS+ algorithm and the Merkle tree algorithm. Below is an overview of these frameworks.
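Returning to the accumulator of Definition 4 and Algorithm 1, its Eval, WitCreate, and Verify subroutines can be sketched as a toy example (Gen is reduced to fixing SHA-256 as the hash; all names and the layer layout are illustrative, not the authors' implementation):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def eval_acc(X):
    """Eval: build the Merkle tree over X (|X| must be a power of 2).
    Returns the accumulator (root) and aux (all tree layers)."""
    n = len(X)
    assert n > 0 and n & (n - 1) == 0, "number of leaves must be a power of 2"
    layers = [[H(x) for x in X]]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([H(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return layers[-1][0], layers

def wit_create(aux, i):
    """WitCreate: collect the sibling hash on every layer of the path to the root."""
    wit = []
    for layer in aux[:-1]:
        wit.append(layer[i ^ 1])   # sibling index flips the last bit
        i //= 2
    return wit

def acc_verify(root, wit, i, x):
    """Verify: recompute the root from x_i and its witness wit_{x_i}."""
    node = H(x)
    for sibling in wit:
        node = H(node + sibling) if i % 2 == 0 else H(sibling + node)
        i //= 2
    return node == root
```

The witness holds one sibling per layer, so its size grows logarithmically with the number of ring members, which is what keeps the signatures compact.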
Fig. 6. A Merkle tree with height h = 3 and N = 8 leaf nodes, visualizing the reconstruction of the root node from node l_{0,2}.

Definition 6 (Merkle Tree Ring Signature Algorithm). The Merkle tree-based ring signature algorithm comprises four main steps: parameter definition, public key generation, signature generation, and signature verification. These steps are outlined as follows:

Step 1: Parameter Definition
The height h of the tree represents its number of layers; a Merkle tree of height h has 2^h leaf nodes, corresponding to 2^h ring members with key pairs (x_i, y_i), i ∈ [0, 2^h − 1]. In practical application scenarios, if the number of vehicles does not satisfy this condition, it is recommended either to introduce virtual members into the ring or to divide the vehicles into multiple rings.

Step 2: Public Key Generation/Merkle Tree Construction
As shown in Algorithm 2, all leaf nodes of the Merkle tree together constitute the ring. Each member in the ring is represented by a public-private key pair corresponding to a leaf node. Each leaf node holds the hash of the public key derived from a one-time signature (OTS) scheme, while each parent node stores the hash of the concatenation of its two child nodes. This process repeats according to the same generation rule until the final root node is formed. The value of the root node is the final public key, while the private key consists of the 2^h OTS private keys x_i. The number of ring members equals the number of leaf nodes in the Merkle tree, and it is essential to ensure that the number of participating members is a power of 2. The public key of each ring member corresponds to the public key from the one-time signature.

Algorithm 2 Public key generation
input: h, SK
output: PK
1. node_i = Hash(node_{2i} ∥ node_{2i+1}), i ∈ [0, 2^h − 1]
2. Root = Hash(node_1 ∥ node_2)
3. PK = Root

Step 3: Signature Generation
Before executing the ring signature operation, the signer hashes the binary message to generate a message digest m = H(M), where H is the chosen hash function and M represents the original binary message. This digest m is used in the subsequent steps of the signature generation process, shown in Algorithm 3.

Algorithm 3 Signature generation
input: M, H, one-time signature key pairs (x_i, y_i)
output: σ
1. (x_i, y_i), i ∈ [0, 2^h − 1]
2. For x_i:
3.     Select the node and perform a one-time digital signature on message M, generating the signature σ_OTS
4.     Calculate the authentication path aut_i of y_i
5. σ = (i, σ_OTS, Y_i, aut_i)

The formal signing process begins by selecting the corresponding one-time signature (OTS) key pair (x_i, y_i), specifically the i-th OTS key pair. The signer then uses the private OTS key x_i to sign the message, creating a one-time signature σ_OTS and calculating the authentication path. The final signature comprises the index i, the one-time signature σ_OTS, the public key y_i, and the authentication path for y_i, denoted aut_i. The signature is formally represented as σ = (i, σ_OTS, Y_i, aut_i). Fig. 7 illustrates the signing process using leaf node x_2 as the signing node, where the shaded areas represent the authentication path of the signature.

Step 4: Signature Verification
As shown in Algorithm 4, signature verification begins by verifying the one-time signature σ_OTS. If this check is successful, the next step involves reconstructing the Merkle tree root based on the chosen index i and the public key y_i. The reconstructed root is then compared with the stored public key. If the two match, verification is deemed successful.

Algorithm 4 Signature verification
input: σ
output: true or false
1. If VER(M, σ_OTS, Y_i) = true
2.     Reconstruct the root node Root* of the Merkle tree according to i and Y_i
3.     If Root* = PK
4.         return true
5.     Else return false
6. Else return false

To illustrate the reconstruction process, consider node x_2 as an example, assuming i = 2 and Y_2 is known, along with the signature σ = (2, σ_OTS, Y_2, aut_2). Here, aut_2 contains the values stored in nodes 3, 8, and 13. The root node can be reconstructed as follows: node9 = hash(node2 ∥ node3), node12 = hash(node8 ∥ node9), node14 = hash(node12 ∥ node13), where node2 stores the value of Y_2. The computed value of node14 is the reconstructed root Root*, as shown in Fig. 8. By hashing upwards from the leaf nodes, if a match with the stored root node is found, the membership of the signer in the ring is verified.

3.4. Application of the scheme in vehicular networks

The proposed hash-based signature scheme offers post-quantum security, protecting against quantum threats, and is highly efficient with compact signatures, making it ideal for resource-constrained on-board devices in the IoV. It supports fast information exchange and verification in dynamic traffic environments, enhancing security and privacy, for example in accident reporting systems, while maintaining reporter anonymity. Overall, it addresses key security, efficiency, and scalability challenges in connected vehicle networks.
The application of ring signatures in the IoV involves three main stages: the registration stage, the inter-vehicle communication stage, and the signature tracing and broadcast stage.

Step 1: Registration Stage
This stage consists of three main steps. First, the On-Board Unit (OBU) sends a registration request to the Trusted Authority (TA). Upon receiving the request, the TA generates a public-private key pair (PK_OBU, SK_OBU) for the OBU. In the final step, the TA returns the private key to the OBU, along with the public key and identity information bound to the blockchain network. The identity information typically includes vehicle certificates, vehicle identification numbers (VIN), and other vehicle-related data. This process ensures that vehicles
Fig. 7. Diagram of the signature generation process.
Fig. 8. Signature verification diagram.
are properly registered and recognized within the blockchain network, as illustrated in Fig. 9.

Step 2: Inter-Vehicle Communication Stage
At this stage, the OBU utilizes the public key of the Roadside Unit (RSU), PK_RSU, to encrypt its own public key and sends it to the RSU, requesting the creation of a ring. Upon receiving the encrypted message, the RSU decrypts it using its private key to obtain PK_OBU, which is then added to the ring. When the number of ring members reaches the threshold of 2^h, the RSU broadcasts the ring structure, allowing all ring members to participate in signing processes.
If the threshold is not met, virtual members may be added, or the ring may be split into smaller sub-rings to ensure each ring contains 2^h members. Once the ring is established, the OBU can sign messages using a ring signature and forward them to the RSU. The RSU subsequently broadcasts the signed messages to other OBUs, which can request verification from the Verification Node (VN). The VN validates the signatures and returns the verification results to the requesting OBU, enabling secure and authenticated access to the information. This process is further illustrated in Fig. 10.

Step 3: Signature Tracing and Broadcast Stage
In the event of an accident, the OBU sends accident-related information to the RSU, which then processes and broadcasts the information to other OBUs. At the same time, the RSU forwards the signature of the OBU involved in the accident, denoted SIG(OBU_acc), to the TA. The TA uses its private key to identify the relevant vehicle information. If the OBU is determined to be malicious, the TA revokes its identity and public key on the blockchain network. The TA then sends the revoked public key and the adverse record of the malicious OBU to the RSU. The RSU subsequently broadcasts this information to other OBUs, ensuring they are aware of the revoked identity and can exclude the malicious OBU from further network participation. This process is illustrated in Fig. 11.
Fig. 12. IOV model based on post-quantum ring signature.
Fig. 9. Registration phase.
Fig. 10. Information interaction phase.
Fig. 11. Signature tracing phase.

When applying this ring signature scheme to a vehicular network system, the overall model framework is shown in Fig. 12. The primary components of the model include:

[1] On-Board Unit (OBU): Responsible for sending requests to the TA, transferring its public key to the RSU, signing messages with the ring signature, and sharing traffic accident information.
[2] Road-Side Unit (RSU): Organizes received public keys into a ring, broadcasts signatures, accident information, and adverse records to other vehicles, and forwards accident-related signatures to the TA.
[3] Trusted Authority (TA): Generates key pairs for the OBU, uploads these to the blockchain network, and, in the event of an accident, sends the public key and adverse record of the vehicle involved to the RSU.
[4] Verification Node (VN): Responsible for verifying signature requests sent by other vehicles.
[5] Anonymous Blockchain Network (ABN): In this model, vehicle public keys are stored in the blockchain network, providing a secure and anonymous framework for identity management.

In addition to the interactions between the OBU and the TA, as well as between the OBU and the RSU in the aforementioned process, within a specific segment of roadway the OBU is also capable of engaging with pedestrians, road infrastructure, and stations located within that segment.
In general, interactions between vehicles and other vehicles or roadside units place more emphasis on the integrity and privacy protection of data transmission, whereas interactions between vehicles and pedestrians often involve location verification and identity confirmation. In a vehicular networking system, vehicles may need to verify both the identity and location of pedestrians, while using post-quantum ring signatures to ensure the integrity and non-repudiation of pedestrian information.

4. Security analysis

4.1. Safety assessment

The proposed scheme possesses the following characteristics:
(1) Anonymity: Ring signatures inherently support anonymity, protecting the identity of the signer. Assuming an attacker has obtained a valid ring signature generated by a member of the ring, if the ring contains n members, the probability that the attacker identifies the true signer is 1/n. For any ring member other than the signer, the probability of identifying the signer is 1/(n − 1).
(2) Privacy: The generation of a ring signature relies solely on the signer within the ring, with no involvement from other ring members, thus preserving the privacy of the signer.
(3) Post-Quantum Security: This scheme employs a post-quantum ring signature approach based on Merkle trees, leveraging hash-based, post-quantum secure mathematical problems. This design provides robust security against quantum computing threats: the hash-based post-quantum ring signature combines the strong properties of hash functions with quantum-resilient security, maintaining integrity even under potential quantum computing attacks.
(4) Efficiency: The computational efficiency of hash functions makes this scheme suitable for a variety of application scenarios.
(5) Unforgeability: The scheme ensures unforgeability through the one-way and irreversible properties of the hash functions used to construct the hash chains. Thus, it is highly challenging for anyone other than the legitimate signer to forge a signature within this scheme.
Fig. 13. Authentication path diagram of a node with index i = 2.

4.2. Security proof

The following section provides security proofs and discussions for the proposed scheme.

Lemma 1. If a one-time signature scheme passes verification and the reconstructed Merkle root Root* matches the original Merkle root Root, then the signature is valid.

Proof. Suppose the index i = 2 is chosen for the one-time signature key used in the message signature. The path from index i = 2 to the root traverses nodes [2, 9, 12], with sibling nodes [3, 8, 13] forming the verification path. Fig. 13 illustrates the verification pathway of the leaf node indexed at 2, depicted as the gray node. Reconstructing the root Root* follows these steps:

    Node(9) = Hash(node(2) ∥ node(3))
    Node(12) = Hash(node(9) ∥ node(8))
    Node(14) = Hash(node(12) ∥ node(13))

The value of node 9 is computed from nodes 2 and 3, the value of node 12 from nodes 9 and 8, and the value of the root node Root* (node 14) from nodes 12 and 13. This computed Root* value is then compared with the public key. Clearly, the hash of Root* matches the original public key. The proof process for any other node is identical, thus confirming the correctness of the signature.

Theorem 1. The proposed post-quantum ring signature scheme preserves anonymity.

Assuming a valid signature σ = (i, σ_OTS, Y_i, aut_i), where i lies within the appropriate range i ∈ [0, 2^h − 1], the probability that any outside party can identify the true signer is 1/2^h (for a ring with 2^h members). For the other ring members, the probability of identifying the signer is 1/(2^h − 1).

Theorem 2. The proposed ring signature scheme is unforgeable.

Proof. Suppose an attacker A could successfully forge a ring signature with non-negligible probability P within polynomial time. We construct a simulator S to challenge a ring signature algorithm claimed to be secure by a challenger C, as follows:
Step 1: The challenger initializes n signing instances with the MSS signing algorithm, generating n key pairs (sk, pk), and sends all public keys pk to the simulator S.
Step 2: Upon receiving the public keys, S initializes the ring signature algorithm by randomly selecting additional parameters and forwarding the public keys to the attacker A.
Step 3: In the query phase, A selects a message M and sends it to S. Following the ring signature algorithm, S randomly selects a user s to generate the ring signature, computes Y_s, and forwards it to C. C computes the corresponding σ_s, which S returns as a complete ring signature to A.
Step 4: In the challenge phase, A sends M and an unobserved forged ring signature to S, which calculates the corresponding Y_s of the forged signer and submits (Y_s, σ_s) to C. If C verifies Y_s and σ_s as valid, then S has successfully forged a signature, with output 1; otherwise, S fails, outputting 0.
Since A can break the scheme with non-negligible probability P, we deduce that Pr(output(Game) = 1) = p, allowing S to break the post-quantum ring signature algorithm with non-negligible probability. However, this contradicts the assumed security of the scheme, proving that A cannot successfully forge signatures in polynomial time.

Theorem 3. If the underlying hash function family {H_k}, k ∈ K_K, is a collision-resistant family, then the proposed hash-based post-quantum ring signature scheme is collision-resistant.

Proof. During initialization, the reduction interacts with a collision-resistant hash function challenger to acquire H_k and completes initialization per the original protocol. If an attacker generates a collision within the accumulator, this implies that the reduction knows two distinct inputs that collide under H_k, with the collision probability bounded by the collision resistance of the hash function.

Theorem 4. If the employed hash functions are one-way, then the proposed Merkle-tree-based post-quantum ring signature scheme is unforgeable under chosen-message attacks.

Let n, w, m ∈ N with w, m = poly(n), and let the function family F_n = {f_k : {0,1}^n → {0,1}^n | k ∈ {0,1}^n} satisfy second-preimage resistance and one-wayness. The variable t represents the computational time. The term w · InSec^{UD}(F_n; t') reflects the undetectability (UD) security of the function family F_n, while InSec^{OW}(F_n; t') represents its one-way (OW) security. Additionally, the term w · InSec^{SPR}(F_n; t') denotes the second-preimage resistance (SPR) security, scaled by the parameter w. The formal definitions of EU-CMA and SPR are provided in [14] and will not be elaborated on here.
We define the unforgeability insecurity of WOTS+ under chosen-message attack as follows:

    InSec^{EU-CMA}(WOTS+(1^n, w, m); t, 1)
        ≤ w · InSec^{UD}(F_n; t') + w·l · max{ InSec^{OW}(F_n; t'), w · InSec^{SPR}(F_n; t'') }
    with t' = t + 3lw and t'' = t + 3lw + w − 1

For WOTS+ combined with Merkle trees, the unforgeability under chosen-message attacks on the Merkle tree can be bounded as follows:

    InSec^{EU-CMA}(Merkle-tree(1^n, T = 2^h); t, 1)
        ≤ 2^{h + log_2 l1} · InSec^{SPR}(WOTS+(1^n, ω, m); t, 1)

Using the derived insecurity function for the Merkle tree combined with WOTS+, which employs pseudorandom key generation, we arrive at the following result:

    InSec^{EU-CMA}(XMSS(1^n, T = 2^h); t, 1)
        ≤ InSec^{EU-CMA}(WOTS+(1^n, ω, m); t, 1) + InSec^{EU-CMA}(Merkle-tree(1^n, T = 2^h); t, 1)
        = InSec^{PRF}(F_n; t + 2^h, 2^h)
          + 2^h · max{ (2 + log_2 l1) · InSec^{SPR}(H_n; t'),
                       2 · InSec^{PRF}(F_n; t' + l, l)
                       + ω · ( InSec^{UD}(F_n; t') + max{ InSec^{OW}(F_n; t'), InSec^{SPR}(F_n; t') } ) }
To prove XMSS is unforgeable under chosen-message attacks, we consider the following factors:
Random Oracle Model: Assuming the hash function behaves as a random oracle, an attacker has no foreknowledge of input-output pairs.
Irreversibility: WOTS+ security relies on the irreversibility of hash chains; given a hash value H^i(x), finding the predecessor H^{i−1}(x) is infeasible.
Collision Resistance: The hash function must resist collisions, making it nearly impossible for an attacker to produce distinct messages that yield identical hash chains.

Table 5
Test of 16 XMSS-SHA2_10_256 signatures.

Number   Signature time/s   Verification time/s
0        1.990014           0.001119
1        1.980151           0.000947
2        1.969849           0.001210
3        1.965888           0.001184
4        1.969898           0.001056
5        1.980296           0.001144
6        2.017889           0.001093
7        2.054971           0.001101
8        2.016147           0.001241
9        2.020737           0.001267
10       1.954583           0.001016
11       2.021315           0.001060
12       2.029765           0.001043
13       2.057487           0.001016
14       1.958401           0.001081
15       1.990919           0.001053

Fig. 14. Signature generation time of 16 test results.
Fig. 15. Signature verification time of 16 test results.
5. Performance analysis

This study evaluates the performance of the proposed scheme in densely trafficked urban areas, focusing particularly on resistance to quantum attacks. The experiments are based on the Merkle tree ring signature scheme, with a primary emphasis on security strength, as attacks in IoV environments are expected to become increasingly complex, especially with the advent of quantum attacks. Consequently, a high-security, quantum-resistant signature scheme is essential for IoV systems.

The primary operations in the signature scheme include generating public and private keys, measuring the time required for message signing and verification, and instantiating SHA-256 as the underlying hash function. Key parameters include the security parameter n, the Winternitz parameter ω, and the number of ring members, with specific values assigned to each. These operations allow us to measure metrics such as key generation time, signature generation time, and signature verification time.

In this scheme, the digital signature algorithm is set to XMSS-SHA2-10-256, utilizing the SHA-256 hash function with a Merkle tree height of 10, enabling a maximum of 2^10 = 1024 possible ring signatures. The number of signature tests is set to 16 to balance efficiency and data stability, ensuring valid results without excessive resource consumption.

To present the data more intuitively, the experimental results of the 16 tests shown in Table 5 are depicted in graphical form in Fig. 14 and Fig. 15. Fig. 14 illustrates the signature generation times across the 16 tests, while Fig. 15 displays the signature verification times. These figures show that both the signature generation time and the verification time fluctuate within a certain range, indicating variability rather than fixed values. One of the 16 test results is selected for comparison with related studies. The attributes compared include key generation time, signature generation time, signature verification time, resistance to quantum attacks, anonymity, traceability, and applicability to the IoV. The comparison results are given in Tables 6 and 7. In our scheme, we set the parameters as n = 32, ω = 16, the height of the Merkle tree as 10, and the number of ring members as 2^10. In the tables, HBS denotes a hash-based scheme and LBS a lattice-based scheme.

Table 6
Signature efficiency comparison table.

Scheme       Number of members   Key generation time/s   Signature time/s   Verification time/s
OURS (HBS)   2^10                2.06                    1.97               9.47e-04
[33] (LBS)   10                  0.07                    0.06               0.04
[32] (LBS)   –                   34.1e-06                9.59e-05           3.49e-05
[25] (HBS)   2^10                –                       0.16               0.11

Table 7
Function comparison table of the schemes.

Scheme       Post-quantum security   Anonymity   Traceability   Application to IoV
OURS (HBS)   YES                     YES         YES            YES
[33] (LBS)   NO                      YES         YES            YES
[32] (LBS)   YES                     NO          NO             YES
[25] (HBS)   YES                     YES         YES            NO

Comparing the scheme proposed in this paper with the scheme in [33] shows that the Merkle tree based post-quantum ring signature scheme has clear advantages. First, in this evaluation, the number of ring members our scheme can accommodate is 2^10, which is much larger than the number of ring members evaluated in [33]; when the road section is wider and more crowded, the scheme proposed in this paper is therefore more suitable. Second, this scheme has post-quantum security, which makes it more secure. Moreover, although the key generation time of our scheme is longer than that of the scheme with fewer ring members in [33], it is much faster in terms of signature time and verification time; in particular, the verification time is nearly 44 times faster than that of [33].
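The verification-time advantage reported above comes from the Merkle structure: with a tree of height 10, a verifier recomputes the root from a leaf using only 10 sibling hashes. A minimal sketch of this authentication-path verification, assuming SHA-256 and an index-derived left/right direction per level (illustrative only, not the paper's implementation):

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Build a Merkle tree; returns the list of levels, leaves first."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([sha256(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def auth_path(levels, index):
    """Collect the sibling hash at each level from leaf `index` up to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling of the current node
        index //= 2
    return path

def verify_path(leaf, index, path, root):
    """Recompute the root from a leaf and its authentication path."""
    h = leaf
    for sibling in path:
        h = sha256(sibling + h) if index & 1 else sha256(h + sibling)
        index //= 2
    return h == root

# Height-10 tree: 2**10 = 1024 one-time key leaves, as in XMSS-SHA2_10_256.
leaves = [sha256(bytes([i % 256, i // 256])) for i in range(1024)]
levels = build_tree(leaves)
root = levels[-1][0]
assert verify_path(leaves[42], 42, auth_path(levels, 42), root)  # only 10 hashes
```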
Compared with the scheme in [32], the outstanding feature of the scheme in this paper is the ring signature, which has anonymity and traceability, making it more suitable for the Internet of Vehicles environment. In addition, the scheme in this paper uses the Merkle tree structure, which reduces the storage cost of public keys and signatures. In general, lattice signatures may require special optimization in high-performance computing, and their algorithm maturity is not high; the underlying hash function of the post-quantum ring signature scheme in this paper, however, is SHA-256, which has passed the test of time in many practical applications and has high algorithm maturity.

Comparing the scheme in this paper with the scheme in [25], both are based on hash functions. The advantages of the scheme in this paper are as follows: first, although the signature generation time in [25] is nearly 12 times faster than that in this paper, the signature verification time in this paper is nearly 100 times faster than that in [25]. In addition, the scheme in this paper is also applied to the vehicular networking model.

As shown in Table 7, this study compares the attributes of post-quantum security, anonymity, traceability, and application to IoV. The comparison reveals that our scheme offers post-quantum security, anonymity, traceability, and applicability to IoV, with the advantages of our proposed scheme becoming more evident through this comprehensive comparison.

6. Conclusion

The hash-based post-quantum ring signature scheme offers advantages such as high signature efficiency, good scalability, and independence from complex mathematical assumptions. In the context of increasing security threats posed by advancements in quantum computing, applying post-quantum ring signatures in IoV can enhance anonymity and privacy protection while ensuring quantum-resistant security. This paper presents a hash-based post-quantum ring signature scheme built on the XMSS algorithm and demonstrates its application in the IoV system. The proposed scheme is analyzed and proven secure. Performance analysis is conducted following 16 experimental tests, with comparisons made to other similar schemes. The results show that the proposed scheme exhibits significant advantages in signature verification time compared to other approaches. This is due to the efficient hash computations and Merkle tree verification paths, which maintain low time complexity and high efficiency even with large data sets. Moreover, the scheme satisfies the properties of quantum resistance, anonymity, traceability, and applicability to IoV.

Future research will aim, first, to further improve the practicality and security of the scheme in response to the evolving threats posed by quantum computing; second, interdisciplinary collaboration can be strengthened to provide valuable insights for optimizing solutions in real-world scenarios.

CRediT authorship contribution statement

Shuanggen Liu: Conceptualization. Xiayi Zhou: Writing – original draft. Xu An Wang: Supervision. Zixuan Yan: Investigation. He Yan: Formal analysis. Yurui Cao: Resources.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC) under Grant No. 62172436. The first author and the third author are the corresponding authors of this paper.

Data availability

No data was used for the research described in the article.

References

[1] I. Wanger, Car production: Number of cars produced worldwide, Statista (2020).
[2] Patrick Miner, Barbara M. Smith, Anant Jani, Geraldine McNeill, Alfred Gathorne-Hardy, Car harm: A global review of automobility's harm to people and the environment, J. Transp. Geogr. 115 (2024) 103817.
[3] Juan Contreras-Castillo, Sherali Zeadally, Juan Antonio Guerrero-Ibañez, Internet of vehicles: Architecture, protocols, and security, IEEE Internet Things J. 5 (5) (2018) 3701–3709, http://dx.doi.org/10.1109/JIOT.2017.2690902.
[4] David Deutsch, Quantum theory, the Church–Turing principle and the universal quantum computer, Proc. R. Soc. A 400 (1818) (1985) 97–117.
[5] Rasha Shajahan, Kurunandan Jain, Prabhakar Krishnan, A survey on NIST 3rd round post quantum digital signature algorithms, in: 2024 5th International Conference on Mobile Computing and Sustainable Informatics, ICMCSI, IEEE, 2024, pp. 132–140.
[6] David A. Cooper, Daniel C. Apon, Quynh H. Dang, Michael S. Davidson, Morris J. Dworkin, Carl A. Miller, et al., Recommendation for stateful hash-based signature schemes, NIST Spec. Publ. 800 (208) (2020).
[7] Samira El Madani, Saad Motahhir, Abdelaziz El Ghzizal, Internet of vehicles: concept, process, security aspects and solutions, Multimedia Tools Appl. 81 (12) (2022) 16563–16587.
[8] Cesar Castellon, Swapnoneel Roy, Patrick Kreidl, Ayan Dutta, Ladislau Bölöni, Energy efficient merkle trees for blockchains, in: 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom, IEEE, 2021, pp. 1093–1099.
[9] Daniel J. Bernstein, Andreas Hülsing, Stefan Kölbl, Ruben Niederhagen, Joost Rijneveld, Peter Schwabe, The SPHINCS+ signature framework, in: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, 2019, pp. 2129–2146.
[10] Kaiyi Zhang, Hongrui Cui, Yu Yu, SPHINCS-α: A compact stateless hash-based signature scheme, 2022, Cryptology ePrint Archive.
[11] Mikhail Kudinov, Andreas Hülsing, Eyal Ronen, Eylon Yogev, SPHINCS+C: Compressing SPHINCS+ with (almost) no cost, 2022, Cryptology ePrint Archive.
[12] Sun Siwei, Liu Tianyu, Guan Zhi, SM3-based post-quantum digital signature schemes, J. Cryptologic Res. 10 (1) (2023) 46.
[13] Andreas Hülsing, Mikhail Kudinov, Recovering the tight security proof of SPHINCS+, in: International Conference on the Theory and Application of Cryptology and Information Security, Springer, 2022, pp. 3–33.
[14] Andreas Hülsing, Denis Butin, Stefan Gazdag, Joost Rijneveld, Aziz Mohaisen, XMSS: Extended Merkle Signature Scheme, Technical Report, 2018.
[15] Jan Philipp Thoma, Tim Güneysu, A configurable hardware implementation of XMSS, 2021, Cryptology ePrint Archive.
[16] Siwei Sun, Tianyu Liu, Zhi Guan, Yifei He, Jiwu Jing, Lei Hu, Zhenfeng Zhang, Hailun Yan, XMSS-SM3 and MT-XMSS-SM3: Instantiating extended Merkle signature schemes with SM3, 2022, Cryptology ePrint Archive.
[17] Andreas Hülsing, W-OTS+ – shorter signatures for hash-based signature schemes, in: Progress in Cryptology – AFRICACRYPT 2013: 6th International Conference on Cryptology in Africa, Cairo, Egypt, June 22–24, 2013, Proceedings 6, Springer, 2013, pp. 173–188.
[18] Kaiyi Zhang, Hongrui Cui, Yu Yu, Revisiting the constant-sum Winternitz one-time signature with applications to SPHINCS+ and XMSS, in: Annual International Cryptology Conference, Springer, 2023, pp. 455–483.
[19] Xie Jia, Liu Shizhao, Wang Lu, Research progress and prospects of ring signature technology, J. Front. Comput. Sci. Technol. 17 (5) (2023).
[20] Rohit Chatterjee, Kai-Min Chung, Xiao Liang, Giulio Malavolta, A note on the post-quantum security of (ring) signatures, in: IACR International Conference on Public-Key Cryptography, Springer, 2022, pp. 407–436.
[21] Yuxi Xue, Xingye Lu, Man Ho Au, Chengru Zhang, Efficient linkable ring signatures: new framework and post-quantum instantiations, in: European Symposium on Research in Computer Security, Springer, 2024, pp. 435–456.
[22] Abida Haque, Alessandra Scafuro, Threshold ring signatures: new definitions and post-quantum security, in: Public-Key Cryptography – PKC 2020: 23rd IACR International Conference on Practice and Theory of Public-Key Cryptography, Edinburgh, UK, May 4–7, 2020, Proceedings, Part II 23, Springer, 2020, pp. 423–452.
[23] Maxime Buser, Joseph K. Liu, Ron Steinfeld, Amin Sakzad, Post-quantum id-based ring signatures from symmetric-key primitives, in: International Conference on Applied Cryptography and Network Security, Springer, 2022, pp. 892–912.
[24] J. Odoom, X. Huang, Z. Zhou, et al., Linked or unlinked: A systematic review of linkable ring signature schemes, J. Syst. Archit. 134 (2023) 102786.
[25] Shiwei Xu, Tao Wang, Ao Sun, Yan Tong, Zhengwei Ren, Rongbo Zhu, Houbing Herbert Song, Post-quantum anonymous, traceable and linkable authentication scheme based on blockchain for intelligent vehicular transportation systems, IEEE Trans. Intell. Transp. Syst. (2024).
[26] Nyothiri Aung, Tahar Kechadi, Tao Zhu, Saber Zerdoumi, Tahar Guerbouz, Sahraoui Dhelim, Blockchain application on the internet of vehicles (IoV), in: 2022 IEEE 7th International Conference on Intelligent Transportation Engineering, ICITE, IEEE, 2022, pp. 586–591.
[27] Haibin Zhang, Jiajia Liu, Huanlei Zhao, Peng Wang, Nei Kato, Blockchain-based trust management for internet of vehicles, IEEE Trans. Emerg. Top. Comput. 9 (3) (2020) 1397–1409.
[28] Mirador Labrador, Weiyan Hou, Implementing blockchain technology in the internet of vehicle (IoV), in: 2019 International Conference on Intelligent Computing and Its Emerging Applications, ICEA, IEEE, 2019, pp. 5–10.
[29] Y. Liu, Q. Xia, X. Li, et al., An authentication and signature scheme for UAV-assisted vehicular ad hoc network providing anonymity, J. Syst. Archit. 142 (2023) 102935.
[30] X. Feng, X. Wang, K. Cui, et al., A distributed message authentication scheme with reputation mechanism for internet of vehicles, J. Syst. Archit. 145 (2023) 103029.
[31] S. Thapliyal, M. Wazid, D.P. Singh, et al., Robust authenticated key agreement protocol for internet of vehicles-envisioned intelligent transportation system, J. Syst. Archit. 142 (2023) 102937.
[32] Nikhil Verma, Swati Kumari, Pranavi Jain, Post quantum digital signature change in IOTA to reduce latency in internet of vehicles (IoV) environments, in: 2022 International Conference on IoT and Blockchain Technology, ICIBT, IEEE, 2022, pp. 1–6.
[33] Cui Yongquan, Cao Ling, Zhang Xiaoyu, Privacy protection of internet of vehicles based on lattice-based ring signature, Chinese J. Comput. 42 (5) (2019) 980–992.
[34] Cesar Castellon, Swapnoneel Roy, Patrick Kreidl, Ayan Dutta, Ladislau Bölöni, Energy efficient merkle trees for blockchains, in: 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom, IEEE, 2021, pp. 1093–1099.
[35] David Derler, Sebastian Ramacher, Daniel Slamanig, Post-quantum zero-knowledge proofs for accumulators with applications to ring signatures from symmetric-key primitives, in: Post-Quantum Cryptography: 9th International Conference, PQCrypto 2018, Fort Lauderdale, FL, USA, April 9–11, 2018, Proceedings 9, Springer, 2018, pp. 419–440.
[36] Xinyu Zhang, Ron Steinfeld, Joseph K. Liu, Muhammed F. Esgin, Dongxi Liu, Sushmita Ruj, DualRing-PRF: Post-quantum (linkable) ring signatures from Legendre and power residue PRFs, in: Australasian Conference on Information Security and Privacy, Springer, 2024, pp. 124–143.
[37] David A. Cooper, Daniel C. Apon, Quynh H. Dang, Michael S. Davidson, Morris J. Dworkin, Carl A. Miller, et al., Recommendation for stateful hash-based signature schemes, NIST Spec. Publ. 800 (208) (2020).
[38] Ralph C. Merkle, A certified digital signature, in: Conference on the Theory and Application of Cryptology, Springer, 1989, pp. 218–238.
View File

@@ -0,0 +1,929 @@
Journal of Systems Architecture 160 (2025) 103341
Contents lists available at ScienceDirect
Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc
A load-balanced acceleration method for small and irregular batch matrix
multiplication on GPU
Yu Zhang a, Lu Lu a,b,*, Zhanyu Yang a, Zhihong Liang c,d, Siliang Suo c,d
a School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
b Peng Cheng Laboratory, Shenzhen, 518055, China
c Electric Power Research Institute, CSG, Guangzhou, China
d Guangdong Provincial Key Laboratory of Power System Network Security, Guangzhou, China
ARTICLE INFO

Keywords:
Batch GEMM
Thread workload
Multi-thread kernel
Tiling algorithm

ABSTRACT

As an essential mathematical operation, GEneral Matrix Multiplication (GEMM) plays a vital role in many applications, such as high-performance computing, machine learning, etc. In practice, the performance of GEMM is limited by the dimension of the matrix and the diversity of GPU hardware architectures. When dealing with batched, irregular and small matrices, the efficiency of GEMM is usually poor. To this end, a common approach is to segment the matrix into multiple tiles and utilize parallelism between workgroups in the GPU to compute the results. However, previous works only consider tile size and inter-workgroup parallelism and ignore the issues of low computational efficiency and hardware resource utilization caused by the difference in workloads between wavefronts. To address these issues, we propose a load-balanced batch GEMM acceleration method, consisting of a multi-thread kernel design and an efficient tiling algorithm. The multi-thread kernel design can address the workload unbalance between wavefronts in different workgroups, and the efficient tiling algorithm can choose the optimal tiling scheme with the new thread-level parallelism calculation method to achieve load-balanced task allocation. Finally, various comparative experiments were conducted on two GPU platforms: AMD and NVIDIA. Experimental results indicate the proposed method outperforms previous methods.
1. Introduction

GEneral Matrix Multiplication (GEMM) is a standard computing kernel that plays an important role in high-performance computing [1], artificial intelligence [2], image processing [3], and other research fields. With the explosive growth of data volume and the emergence of various algorithms, the demand for high-performance GEMM computing is increasing [4,5]. Additional stream processors and memory are integrated into the GPU to cater to this trend, providing tremendous computational power for GEMM acceleration. To fully utilize this hardware acceleration capability, AMD and NVIDIA provide developers with platforms for parallel computing based on GPU (ROCm and CUDA). Based on these parallel computing acceleration platforms, various optimization algorithms and acceleration libraries have been proposed and demonstrated to have powerful effects, such as rocBLAS [6], cuBLAS [7], MAGMA [8], etc. These methods achieve optimal computational task allocation through hardware resource scheduling and thread parallelism to accelerate the matrix multiplication operation [9,10].

Many real-world applications, such as deep learning, involve irregular, small-size matrix multiplication operations in their computations [11]. For example, in Convolutional Neural Networks (CNN) [12-14], the structure of these models contains a large number of convolutional layers, and the scale of the convolution kernel tends to be small (e.g. 1×1 and 3×3). Convolution operations are converted to GEMM using the Im2col function, and the dimension of the matrix is typically less than 1000 [15,16]. These small GEMM computations prevent the GPU from fully exploiting its hardware computing potential. In this case, the scheduling overhead between batch GEMMs and the irregularity of the matrix pose challenges to computational performance [17,18]. For a GEMM, tiling is a standard solution method. The matrix is segmented into multiple tiles, and a thread block is responsible for computing an individual tile. Since each tile is independent, multiple tiles can be computed in parallel by using multiple threads in the GPU to speed up the computation process of GEMM. A larger tile dimension will increase the Thread-Level Parallelism (TLP) of a single tile and also will
* Corresponding author at: School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China.
E-mail addresses: yuzhang0722@163.com (Y. Zhang), lul@scut.edu.cn (L. Lu), yangzhanyu@hotmail.com (Z. Yang), liangzh@csg.cn (Z. Liang),
suosl@csg.cn (S. Suo).
https://doi.org/10.1016/j.sysarc.2025.103341
Received 3 September 2024; Received in revised form 3 November 2024; Accepted 8 January 2025
Available online 23 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
Y. Zhang et al. Journal of Systems Architecture 160 (2025) 103341
reduce the number of tiles, resulting in the failure to fully utilize the hardware resources of the GPU [19,20]. The Instruction-Level Parallelism (ILP) of a single thread is related to the K-dimension. Generally, for a large enough matrix size, GEMM can fully use GPU hardware resources and achieve higher TLP and ILP [21,22].

To improve computational efficiency, previous studies have proposed several acceleration methods for matrix multiplication. For instance, rocBLAS [6] and cuBLAS [7] provide batch GEMM APIs (rocblasSgemmBatched and cublasSgemmBatched), which support multiple GEMMs being calculated simultaneously on GPUs. However, these APIs support only uniform matrix sizes, which considerably limits their applications. NVIDIA also provides a C++-style template library, CUTLASS [23], which utilizes built-in tile templates and sorting to accelerate matrix multiplication operations. In fact, the size of matrices is variable in many real-world applications [11]. To solve this issue, a Vbatch GEMM routine that supports batch GEMM in various sizes is designed and implemented by MAGMA (magmablas_sgemm_vbatched). It adapts to batch GEMMs with multiple tiling strategies, assigning the appropriate tile to a single GEMM for huge performance gains. Although variable sizes are supported in MAGMA, it still has some limitations. First, MAGMA only supports some coarse-grained tiling strategies that are not appropriate for all GEMMs; coarse-grained tiling results in an unbalanced kernel workload and reduced GPU utilization. Second, the grid size is determined by the tiling of the largest matrix, which leads to idle threads and a waste of GPU computing power. Third, the lack of an evaluation criterion for tiling leads to lower efficiency of strategy choice.

To thoroughly support batch GEMM with variable sizes, it is essential to design a tiling algorithm that can be adapted to all GEMMs and adaptively choose tile sizes, not limited to a single size. The optimal tiling for each GEMM is different, depending on the size of the matrix dimensions (M, N, K). How to choose a suitable tile is a challenge for batch GEMM. At the same time, an evaluation criterion based on the current GPU hardware and tiling strategy is also essential: with it, an appropriate tiling for each GEMM can be chosen to fully utilize the GPU computing capabilities and achieve better computational performance. How to measure the effectiveness of the tiling algorithm on the GPU hardware is thus a challenging problem. Tiles of various sizes can lead to significant differences in computational effort within each workgroup, and further to an unbalanced distribution of computational tasks and excessive load differences between threads. Hence, for tiles of various sizes, balancing thread computation and data loading during computation is also a challenge for batch GEMM.

To address the above challenges, we propose a batch GEMM acceleration method with a multi-thread kernel design. Furthermore, an efficient tiling algorithm is proposed to achieve load balance and higher hardware resource utilization. Our contributions can be summarized as follows:

• A multi-threaded kernel design scheme is proposed to balance thread computation and data loading in different workgroups to compute the various tiles.
• A novel TLP computation method is designed to select the optimal tiling algorithm by combining the kernel occupancy of the GPU and the tiling operation.
• An efficient tiling algorithm is implemented by considering the GPU hardware architecture and the batch GEMM workload.
• The proposed method can efficiently handle batch irregular GEMM and achieve state-of-the-art performance on AMD and NVIDIA GPU platforms.

The rest of the paper is organized as follows. Section 2 provides related work and motivation. Section 3 introduces background on batch GEMM, GPU architecture, and kernel occupancy. Section 4 presents the details of the multi-thread kernel design and load-balanced tiling algorithm. Section 5 demonstrates and evaluates the experimental results. Section 6 provides the conclusions of the paper and future work. The source code of this paper can be obtained in this repository link: https://github.com/zhangyu0722/BatchGEMM.git.

2. Related work and motivation

2.1. Related work

Several approaches have been proposed for batch GEMM computation, which mainly focus on algorithm-level optimization or architecture-level optimization. The former mainly explores lower bounds on the time complexity of GEMM operations at the mathematical level and optimizes the computational effort. The latter is based on different GPU architecture features and uses corresponding optimization techniques to improve the computational efficiency of GEMM. In algorithm-level optimization, Strassen [24] proposed a novel GEMM algorithm based on the property that matrix addition is faster than matrix multiplication, which uses seven multiplications and multiple addition operations instead of eight multiplications. This approach mathematically reduced the time complexity of GEMM to O(n^2.81) for the first time. To reduce the extra memory space required by Strassen's algorithm, three different methods were proposed in [25]: pre-additions, overwriting the input matrix, and recursive scheduling. At the same time, due to the powerful effect of deep neural networks in various domains, Alhussein Fawzi et al. [26] transformed the process of finding the optimal complexity of matrix multiplication into a tensor decomposition problem and used reinforcement learning to explore lower bounds on the complexity of matrix multiplication. In particular, for a 4 × 4 matrix, the number of multiplications was as low as 47. This performance was better than the two-level Strassen's algorithm, which involves 49 multiplications. Although the above approaches reduce the mathematical complexity of matrix multiplication operations, it is difficult to realize their performance benefits on GPU because they neglect computational scheduling strategies and multi-level memory architecture features.

In architecture-level optimization, GPU vendors (NVIDIA and AMD) have designed and implemented computing libraries such as cuBLAS [7] and rocBLAS [6] based on their parallel computing platforms to improve GPU hardware utilization and parallelism. However, due to the restriction of uniform-sized matrices, the performance is poor when faced with small and irregular batch GEMMs. Although NVIDIA provides a C++-style template library, the small size of the matrix and the lack of assembly-level optimizations make it difficult for CUTLASS to fully exploit its performance advantages for irregular and small matrix multiplication [23]. These irregular and small-sized matrices often lead to unbalanced workloads among threads in different workgroups, which can reduce kernel performance. For Sparse GEneral Matrix-Matrix multiplication (SpGEMM), the matrix's sparsity leads to significant differences in thread workloads [27,28]. To address the unbalanced workload, Chen et al. [29] optimized the matrix segmentation by analyzing the distribution of the floating point calculations of the CSR-based SpGEMM, which achieves load balance and performance improvement on Sunway TaihuLight. For the issue of workload unbalance in threads, it is necessary to conduct a detailed analysis of the computation process and hardware platform characteristics to design an efficient parallel framework implementation [30,31]. Xiao et al. [32] introduce a fine-grained partitioning strategy to select appropriate segmentation dimensions, efficiently utilizing multi-thread parallelism and improving the performance of binary sparse tensor contractions. The diversity of matrix sizes makes it difficult to utilize a unified routine for calculations, resulting in some threads being idle in the CU [33,34]. Indeed, the size of matrices is variable and irregular in various scientific computing scenarios. To overcome the restriction of uniform matrix size, MAGMA [8] proposes a Vbatch routine to support batch GEMM with various sizes. In this way, it uses a 3D grid for the batch GEMM kernel design, where grid.z represents the batch size. Each GEMM corresponds to one of the 2D-grid planes, and the size of the two-dimensional plane (grid.x, grid.y) is determined by the largest GEMM. In the case of irregular GEMM, if the dimension
Fig. 1. GEMM and batch GEMM schematic diagram.
difference between the largest GEMM and the rest is too large, a large number of threads and workgroups will be idle, resulting in a waste of GPU computing resources. For various parallel acceleration platforms, different hardware characteristics, such as register size and number of CUs, will affect the allocation of computing resources in the kernel. To ensure kernel performance, it is necessary to flexibly set parameters based on different matrix sizes and hardware architectures [9,35]. To solve this problem, a coordinated tiling and batching strategy is proposed in [21], where a different tiling strategy is used for each GEMM in batch GEMM and appropriate batching is used according to the tile size to improve the computational efficiency of the GPU. Wang et al. [36] proposed a sort-up algorithm based on the GEMM workload and a split-down operation in the tiling process, which can segment large tiles into multiple smaller tiles. This approach achieves better CU utilization when the number of GEMMs is limited.

2.2. Motivation

Although the above-mentioned methods improve the parallel computing efficiency of batch GEMM on GPU from various perspectives, two problems remain. One is that the workload of threads varies significantly across the kernel. In the above approaches, tiles of various sizes are designed, and each tile is computed by a corresponding kernel, where the number of threads is fixed. In general, larger tiles have better TLP. This also increases the workload of each thread for large-size tiles, and the threads responsible for computing large tiles require more hardware resources (VGPR, SGPR, LDS) and computing time. The other is that differences between wavefronts within different workgroups are ignored in the TLP calculations. A workgroup is transformed into multiple wavefronts during GPU computation, which are executed in parallel on the CU. Each CU can run multiple wavefronts simultaneously, and the number of wavefronts depends on the hardware resources required by each wavefront. Thus, the TLP on the GPU should be determined by the number of threads in the wavefronts that can be executed in parallel on the CU.

To solve the above problems, we propose an efficient and load-balanced batch GEMM acceleration method, which consists of two parts: a multi-thread kernel design scheme and an efficient tiling algorithm. The multi-thread kernel design balances the amount of loading and computation in the threads corresponding to each tile. Tiles of various sizes correspond to the number of threads selected. Although, limited by the parallel programming interfaces of the CUDA and ROCm platforms, the number of threads responsible for computing a tile is uniform, we use a corresponding filtering operation in the kernel execution process to effectively alleviate this problem. An efficient tiling algorithm can choose the optimal scheme based on different GEMMs and GPUs. To measure

3. Background

3.1. GEMM and batch GEMM

For a single GEMM, the accumulation routine is C = αAB + βC, where A ∈ R^(M×K), B ∈ R^(K×N) and C ∈ R^(M×N) are dense matrices, M, N, and K represent matrix dimensions, and α and β are constant scalars. A common approach is tiling matrix C into multiple tiles [21,36], which utilizes the parallel computing of threads in the GPU to calculate each tile and splices the results together. As shown in Fig. 1(b), given a GEMM of size M × N × K, the matrix C is segmented into multiple tiles of size T_m × T_n. Each workgroup is responsible for the calculation of one tile and needs to access the row section of matrix A of size T_m × K and the column section of matrix B of size K × T_n. However, the row cross-section of A and the column cross-section of B (represented in Fig. 1(b) by the gray parts of matrices A and B, respectively) are too large to store in shared memory and registers. Hence, the row section of A and the column section of B are segmented into multiple A tiles of size T_m × T_k and B tiles of size T_k × T_n, respectively. A partial result of C can be obtained by calculating with an A tile and a B tile, and accumulating the partial results yields the final result.

To batch-run multiple GEMMs, a naive routine is to compute each GEMM individually. However, when the matrix size is small, a single GEMM does not fully utilize the GPU's computing power, leaving CUs idle [37,38]. To avoid this situation, a batch GEMM method is proposed to design multiple kernels for the various GEMMs on the GPU [36,39]. Compared to GEMM, a batch GEMM is expressed as (M × N × K × B_size), where M, N and K represent the dimensions of the matrix, and B_size represents the batch size. A batch GEMM uses a 3D grid, where grid.z is the batch size, and grid.x and grid.y are the length and width of a two-dimensional plane, respectively [40]. To balance the workload of a batch GEMM, a variety of tile sizes are used for GEMM tiling. The two-dimensional grid size corresponds to the matrix C and the tiling strategy. Each workgroup is responsible for the corresponding tile. A workgroup is decomposed into multiple wavefronts that execute on the CU. The 3D grid of batch GEMM is shown in Fig. 1(a).

3.2. GPU architecture and kernel occupancy

With the improvement of hardware architecture and parallel computing programming platforms (such as ROCm and CUDA), GPUs are becoming the most popular hardware accelerator. The two most commonly used GPUs are AMD and NVIDIA, widely used in various scientific computing platforms. However, some basic concepts of ex-
the effect of tiling, we propose a new way of TLP computation based pression in ROCm and CUDA are different. We chose AMDs official
on wavefronts. The optimal tiling scheme is obtained by adjusting the
tiling strategy according to the TLP. Finally, we obtain an efficient tiling
algorithm based on the new TLP calculation method. In Section 4, the 1
https://rocm.docs.amd.com/en/latest/
2
details of the proposed method are introduced. https://docs.nvidia.com/cuda/
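The tile-and-accumulate decomposition described above — each 𝑇𝑚 × 𝑇𝑛 tile of C built up by stepping through 𝑇𝑚 × 𝑇𝑘 tiles of A and 𝑇𝑘 × 𝑇𝑛 tiles of B — can be sketched in plain Python. This is only an illustrative CPU model of the loop structure, not the paper's GPU kernel; the function names and tile sizes are ours:

```python
# Tiled GEMM sketch: each (Tm x Tn) tile of C plays the role of one workgroup's
# task; the k0 loop accumulates partial results from (Tm x Tk) A tiles and
# (Tk x Tn) B tiles, mirroring the tiling described for Fig. 1.
def tiled_gemm(A, B, Tm, Tn, Tk):
    M, K, N = len(A), len(A[0]), len(B[0])
    C = [[0.0] * N for _ in range(M)]
    for i0 in range(0, M, Tm):              # C tile row
        for j0 in range(0, N, Tn):          # C tile column
            for k0 in range(0, K, Tk):      # accumulate partial results
                for i in range(i0, min(i0 + Tm, M)):
                    for j in range(j0, min(j0 + Tn, N)):
                        acc = C[i][j]
                        for k in range(k0, min(k0 + Tk, K)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C

def naive_gemm(A, B):
    """Reference un-tiled GEMM for checking the tiled version."""
    M, K, N = len(A), len(A[0]), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(K)) for j in range(N)]
            for i in range(M)]
```

The tiled loop order produces the same result as the naive loop; on a GPU it is what allows the current A and B tiles to reside in LDS/registers instead of global memory.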
Y. Zhang et al. Journal of Systems Architecture 160 (2025) 103341
Table 1
ROCm/CUDA terminology.

ROCm                 CUDA                            Description
Compute Unit (CU)    Streaming Multiprocessor (SM)   One of many parallel vector processors in a GPU that contains parallel ALUs. All waves in a workgroup are assigned to the same CU.
Kernel               Kernel                          Functions launched to the GPU that are executed by multiple parallel workers on the GPU. Kernels can work in parallel with the CPU.
Wavefront            Warp                            Collection of operations that execute in lockstep, run the same instructions, and follow the same control-flow path. Individual lanes can be masked off.
Workgroup            Thread block                    Think of this as a vector thread. A 64-wide wavefront is a 64-wide vector op.
Work-item/Thread     Thread                          GPU programming models can treat this as a separate thread of execution, though this does not necessarily get forward sub-wavefront progress.
Global Memory        Global Memory                   DRAM memory accessible by the GPU that goes through some layers of cache.
Local Memory         Shared Memory                   Scratchpad that allows communication between wavefronts in a workgroup.
Private Memory       Local Memory                    Per-thread private memory, often mapped to registers.
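For quick reference, the term pairs in Table 1 can be expressed as a lookup table (a trivial illustrative snippet; the dictionary simply restates the table):

```python
# ROCm -> CUDA terminology, taken from Table 1.
ROCM_TO_CUDA = {
    "Compute Unit (CU)": "Streaming Multiprocessor (SM)",
    "Kernel": "Kernel",
    "Wavefront": "Warp",
    "Workgroup": "Thread block",
    "Work-item/Thread": "Thread",
    "Global Memory": "Global Memory",
    "Local Memory": "Shared Memory",
    "Private Memory": "Local Memory",
}

def to_cuda(rocm_term):
    """Return the CUDA name for a ROCm term from Table 1."""
    return ROCM_TO_CUDA[rocm_term]
```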
terminology for this paper to provide precise specifications. To clarify some differences and relationships between ROCm and CUDA terms, a comparison of terminology is given in Table 1.

A GPU is composed of multiple Shader Engines (SE) and a command processor. Each SE is integrated with multiple CUs and its own workload manager. Each CU contains an enormous number of Arithmetic and Logic Units (ALUs), a small number of control units, and caches. Hence, GPUs are suitable for large numbers of simple parallel computing tasks. A GPU kernel consists of one or multiple workgroups, whose size is determined by the number of wavefronts and threads. In the memory hierarchy, the GPU has global memory, local memory, and private memory, ordered from slow to fast by access speed; local memory and private memory are much smaller than global memory [41,42].

Kernel occupancy represents the actual utilization of compute-unit resources by a kernel function on the GPU, defined as the ratio of activated wavefronts to the maximum number of wavefronts supported by a CU [35,43]. An active wavefront running on a CU requires resources such as Vector General-Purpose Registers (VGPR), Scalar General-Purpose Registers (SGPR), Local Data Share (LDS), etc. A wavefront can be activated and run on a CU when all required resources are available. When the utilization of CU resources is low, the number of active wavefronts is small, which leads to wasted hardware resources and degraded parallel performance of the kernel. On the other hand, when the number of active wavefronts on the CU increases, the resources available to each wavefront and the register storage space available to each work-item in the wavefront decrease [44,45].

The number of active wavefronts on a CU is mainly limited by the following factors: the number of work-items in each workgroup and the sizes of the VGPR, SGPR, and LDS. For example, in AMD's MI100 3 and MI210, 4 a wavefront consists of 64 work-items. When the number of work-items in a workgroup is less than or equal to 64, only one wavefront is included. The VGPR, SGPR, and LDS sizes on the CU have a corresponding upper bound for each work-item. According to the kernel design, the resources on the CU need to be allocated before executing each work-item. When the resource requirements of a work-item are satisfied, the wavefront can be activated and run on the CU. Otherwise, it will not run until other wavefronts accomplish their tasks and release resources. In order to fully utilize the hardware resources of the GPU and improve the efficiency of parallel computing, the kernel occupancy should be improved as much as possible without data overflow [46,47]. In batch GEMM, an efficient kernel design should properly allocate the data loading and computation workload of each work-item in the wavefront, so that the memory space and computing power on the CU can be utilized more efficiently [48,49].

3 https://www.amd.com/system/files/documents/instinct-mi100-brochure.pdf
4 https://www.amd.com/content/dam/amd/en/documents/instinct-business-docs/white-papers/amd-cdna2-white-paper.pdf

4. Overview

4.1. Multi-thread kernel design

Tile size and kernel design are closely related in the design of batch GEMM algorithms, and there are two matrix tile design routes. The first is to design one tile to fit all GEMMs; the second is to design various tiles adapted to different GEMMs. Compared with the first method, the latter is more flexible and utilizes the computing resources of the GPU more efficiently for irregular GEMMs. For GEMMs of various shapes and sizes, using a single tile can easily increase the workload differences between threads in multiple workgroups, affecting the allocation of computing resources. In this paper, we perform a multi-thread kernel design for the second matrix segmentation method. The two different tile design strategies are shown in Fig. 2, which presents their effect on the occupancy of the 3D grid: for a batch GEMM, different tile sizes lead to different numbers of workgroups, resulting in different 3D grid occupancies.

For a single GEMM, matrix C is tiled into multiple tiles. The tile size can be flexibly designed, and each tile can be run in parallel without data interference. Each tile is calculated by a corresponding workgroup and can be represented as a whole by a 2D grid. When the size and number of tiles are large enough, efficient parallel execution can usually be obtained. However, in real-world cases, the matrices in batch GEMM tend to be small and irregular, which leads to poor performance of traditional methods. Therefore, previous methods adopt a variety of tiles to adapt to the corresponding GEMMs, with each tile based on a unified number of threads, which makes the workload of threads in large tiles much larger than that in small tiles. This gap in thread workload results in unbalanced thread loading and reduces GPU parallel computing efficiency. Table 2 lists the detailed parameters for tiles of various sizes based on the same work-item design (the number of work-items in the kernel is 128). 𝑊𝐶𝑃 and 𝑊𝐷𝐿 represent the computation
Fig. 2. Two different tile design strategies for batch GEMMs. (a) All GEMMs adopt the same tiling scheme and are divided into multiple tiles of the same size. (b) Different GEMMs adopt different tiling schemes and are divided into multiple tiles of different sizes.
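The contrast between the two strategies in the figure can be made concrete by counting workgroups (C tiles). The batch and tile sizes below are invented for illustration; `workgroups` simply counts tiles per GEMM under a given scheme:

```python
import math

def workgroups(gemms, scheme):
    """Total number of C tiles (= workgroups) for a batch under a tiling scheme."""
    total = 0
    for (M, N) in gemms:
        Tm, Tn = scheme(M, N)
        total += math.ceil(M / Tm) * math.ceil(N / Tn)
    return total

batch = [(16, 16), (64, 64), (128, 32)]           # irregular (M, N) pairs

uniform = lambda M, N: (64, 64)                   # strategy (a): one tile size for all
adaptive = lambda M, N: (min(M, 64), min(N, 32))  # strategy (b): tile adapted to the GEMM
```

Here the uniform scheme yields 4 workgroups (and pads the 16 × 16 GEMM heavily), while the adaptive scheme yields 5 better-matched workgroups — a different 3D-grid occupancy, as the caption describes.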
amount and data loading amount of a work-item, respectively, and their calculation expressions are:

𝑊𝐶𝑃 = (𝑇𝑚 × 𝑇𝑛) / 𝑊𝑛𝑢𝑚  (1)

𝑊𝐷𝐿 = (𝑇𝑚 × 𝑇𝑛 + 𝑇𝑚 × 𝑇𝑘 + 𝑇𝑘 × 𝑇𝑛) / 𝑊𝑛𝑢𝑚  (2)

where 𝑊𝑛𝑢𝑚 represents the number of work-items responsible for computing the tile.

Table 2
The common kernel design scheme for batch GEMM (there are significant workload gaps between threads).

Tile     𝑇𝑚   𝑇𝑛   𝑇𝑘     𝑊𝐶𝑃   𝑊𝐷𝐿
small    16   16   8/16   2     4/6
medium   32   32   8/16   8     12/16
large    64   64   8/16   32    40/48

For different tiles, there is a significant gap in workload between threads (𝑊𝐶𝑃 ∈ [2, 32] and 𝑊𝐷𝐿 ∈ [4, 48]). The choice of 𝑇𝑘 also has a certain impact on the data load of a work-item: each thread is responsible for more data loads when 𝑇𝑘 is larger. For example, in the large tile, when 𝑇𝑘 is set to 8 or 16, each work-item is responsible for loading 40 or 48 elements, respectively. The workload differences caused by these different tile sizes impact kernel performance.

To explore the impact of the number of work-items in the workgroup and the tile size on the performance of batch GEMM, some experiments were performed, whose results are given in Fig. 3. Under the condition that the number of GEMMs is large and 𝑀, 𝑁, and 𝐾 are large enough, various thread kernels (with 64, 128, 256, and 512 threads) are used to compute multiple tiles (the nine tiles are shown in Fig. 3). The four thread kernels commonly used in previous work are selected as benchmarks [21,34,36], and we investigate their performance under various tiles in comparative experiments. Fig. 3 shows that each kernel's performance first increases and then decreases across the different tiles. When the tile size is small, the threads' workload is also tiny; threads in the kernel only compute a few elements, so their computing power is not fully utilized. As the tile size increases, the number of elements each thread needs to calculate and store also increases, and as long as the register data does not overflow, the computing efficiency of the thread keeps improving. When the tile corresponding to the thread is too large, the register data overflows and is transferred to global memory. For example, for a 64-thread kernel, when computing 8*8 and 32*32 tiles, each thread needs to compute 1 and 32 elements of matrix C, respectively. Obviously, 32*32 requires more register memory. However, the register memory of each thread is precious; when its maximum limit is exceeded, data is transferred to global memory for storage. Because the access speed of global memory is considerably lower than that of registers, the threads' data access efficiency decreases and overall time consumption increases. At the same time, because thread workloads vary, when a thread with a heavy workload runs on the CU, the number of active wavefronts on the CU is smaller, so the CU's kernel occupancy (the ratio between the number of active wavefronts and the maximum number of supported wavefronts) is reduced. This state of low kernel occupancy also lasts longer because of the longer work-item computation time.

To solve this problem, we propose a multi-thread kernel design, which ensures that the workload of each thread is balanced as much as possible. The experimental results in Fig. 3 show that the performance of the kernels varies when calculating the same tile; for example, the 128-thread kernel performs best when calculating a 32*32 tile. This performance gap is mainly caused by the varying workloads of threads under different kernels, which affects the overall performance. For the 128-thread kernel, when calculating a 32*32 tile, each thread needs to complete the calculation of 8 elements and the loading of 16 elements. When calculating a 64*64 tile, the workload of the threads is heavy: each thread needs to complete the calculation of 32 elements and the loading of 64 elements. When calculating larger tiles, the workload of the thread increases significantly. To avoid significant differences in workload between threads, we use a multi-thread kernel to calculate the various tiles by considering the computation amount (𝑊𝐶𝑃) and data loading amount (𝑊𝐷𝐿) of the threads in the kernel. For larger tiles such as 32*64 and 64*64, a 256-thread kernel is used for computation. Increasing the number of threads in this way reduces each thread's computation and data loading amounts, thereby reducing the gaps between threads' workloads and achieving load balancing. There are five tiles and two kernels (𝑊𝑛𝑢𝑚) for small and irregular batch matrix multiplication, as shown in Table 3. Compared to Table 2, we balance the thread workload by setting the tile size and the number of kernel threads so that thread computation and data loading are as consistent as possible across different workgroups. In the calculation process of GEMM, five tile types are designed for GEMMs of different sizes, from small to large. To ensure that the computation and data loading amounts of the work-items responsible for different tiles are as equal as possible, the number of threads varies depending on the tile size. In Table 3, two different thread numbers (128 and 256) are used, and the computation amount (𝑊𝐶𝑃) and data loading amount (𝑊𝐷𝐿) of the work-item in each scheme are given. Although the current ROCm and CUDA platform programming interfaces only support kernel designs with a uniform thread number, we use a screening operation in the early stage of kernel execution to achieve the effect of a multi-thread kernel design. For example, in this paper, the number of kernel threads is set to 256. When the small, small-medium, and medium tiles are executed, the extra threads are terminated immediately and the corresponding computing resources are released, because these tiles only need
Fig. 3. Experimental results of multi-thread kernel.
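The per-thread workloads behind these comparisons follow directly from Eqs. (1) and (2); the helpers below reproduce the 𝑊𝐶𝑃 and 𝑊𝐷𝐿 entries of Tables 2 and 3 (a sanity-check sketch; the tile/thread combinations are those listed in the tables):

```python
# W_CP (Eq. (1)) and W_DL (Eq. (2)): per-work-item computation and data-loading
# amounts for a (Tm x Tn) tile with inner dimension Tk and Wnum work-items.
def w_cp(Tm, Tn, Wnum):
    return Tm * Tn // Wnum

def w_dl(Tm, Tn, Tk, Wnum):
    return (Tm * Tn + Tm * Tk + Tk * Tn) // Wnum

# "large" 64x64 tile, Tk = 16: with 128 threads (Table 2) each thread computes
# 32 elements and loads 48; with 256 threads (Table 3) this drops to 16 and 24.
```

Doubling the thread count for the largest tiles halves both quantities, which is the load-balancing effect the multi-thread kernel design exploits.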
128 threads. Terminating threads early allows for a better allocation of computational resources to the threads responsible for computing other tiles. With this implementation, we can achieve the effect of a multi-threaded kernel. Even though the performance may be degraded in comparison with an actual multi-threaded kernel, the experimental results in Section 5 demonstrate the excellent performance of this method.

Table 3
The multi-thread kernel design scheme with a more balanced workload.

Tile          𝑇𝑚   𝑇𝑛   𝑇𝑘   𝑊𝑛𝑢𝑚   𝑊𝐶𝑃   𝑊𝐷𝐿
small         16   16   16   128    2     6
small-medium  16   32   16   128    4     10
medium        32   32   16   128    8     16
medium-large  32   64   16   256    8     14
large         64   64   16   256    16    24

4.2. Tiling algorithm

4.2.1. Criteria for evaluation

Tiling can be seen as a re-assignment of the GEMM computation task. An efficient tiling algorithm can transform GEMM operations and improve hardware resource utilization. When various kernel designs are implemented, choosing an appropriate tiling scheme becomes a crucial issue. In general, for a GEMM, there is better parallelism within the workgroup when the tile size is larger. However, a larger tile means that the number of tiles is reduced, and if the number of tiles is too small, the CUs cannot be fully utilized, resulting in a waste of computing resources. Therefore, choosing a suitable tiling evaluation criterion is crucial. In previous studies, TLP was used to quantify the parallelism of tiling strategies on GPUs. Given a GEMM and a tiling strategy, its TLP can be calculated as follows:

𝑇𝐿𝑃 = ∑𝑖 (𝑀𝑖 × 𝑁𝑖) / (𝑇𝑚𝑖 × 𝑇𝑛𝑖) × 𝑇𝑤𝑜𝑟𝑘𝑔𝑟𝑜𝑢𝑝  (3)

where 𝑀𝑖 and 𝑁𝑖 are the dimension sizes of matrix C of the 𝑖th GEMM, 𝑇𝑚𝑖 and 𝑇𝑛𝑖 are the tile sizes chosen for matrix C, and 𝑇𝑤𝑜𝑟𝑘𝑔𝑟𝑜𝑢𝑝 is the number of threads in a workgroup. However, the above formulation only considers TLP at the level of the workgroup. Indeed, during the computation of the GEMM, the workgroup needs to be further transformed into wavefronts and run on the CU in the form of wavefronts. The execution process of batch GEMM can be divided into four phases: segmentation, workgroup, wavefront, and execution. In the segmentation phase, the GEMM is tiled into tiles of various sizes, and each tile is computed by a workgroup. Workgroups are further transformed into wavefronts based on their hardware resource requirements and the number of work-items. Finally, these wavefronts are run in parallel on multiple CUs for the batch GEMM calculations. Due to the differences between tile sizes, the computation amount and data loading amount of threads are not uniform across wavefronts, which leads to unbalanced hardware resource requirements, and the execution time of the wavefronts on the CUs also differs. The overall time of the batch GEMM is the maximum of all CU execution times. If the workload difference between wavefronts is too significant, the execution time of one wavefront will be excessive, increasing the overall calculation time.

Therefore, Eq. (3) does not consider the workload gaps between wavefronts. To solve this problem, we propose a new TLP calculation method as follows:

𝑇𝐿𝑃𝑛𝑒𝑤 = 𝜑( ∑𝑖 (𝑀𝑖 × 𝑁𝑖) / (𝑇𝑚𝑖 × 𝑇𝑛𝑖) ) × 𝑇𝑤𝑎𝑣𝑒𝑓𝑟𝑜𝑛𝑡  (4)

where 𝑀𝑖, 𝑁𝑖, 𝑇𝑚𝑖, and 𝑇𝑛𝑖 have the same meaning as in Eq. (3), 𝑇𝑤𝑎𝑣𝑒𝑓𝑟𝑜𝑛𝑡 is the number of work-items in a wavefront, and 𝜑 represents the conversion process from workgroups to wavefronts.

The conversion process mainly considers the following factors: the number of work-items in the workgroup; the sizes of the VGPR, SGPR, and LDS required by a work-item; and the maximum number of wavefronts supported by the CU. These factors are related to the GPU hardware architecture. Next, take AMD's MI210, which is based on the CDNA2.0 architecture, as an example. Under the limitation of the number of work-items in the workgroup, the number of wavefronts can be calculated as follows:

𝑊𝐹𝑤𝑔 = 16 × ceil(𝑊𝐼𝑤𝑔 / 64)  (5)

where 𝑊𝐹𝑤𝑔 is the maximum number of wavefronts under the limit of the number of work-items in the workgroup, and 𝑊𝐼𝑤𝑔 represents the number of work-items in the workgroup. Eq. (5) indicates that when the number of work-items is less than or equal to 64, a workgroup contains only one wavefront, and the number of workgroups is limited to 16 in the CU.

Limited by the sizes of the VGPR, SGPR, and LDS, the number of wavefronts can be calculated as follows:

𝑊𝐹𝑉 = 4 × floor(𝑉𝐺𝑃𝑅𝑚𝑎𝑥 / (𝑉𝐺𝑃𝑅𝑢𝑠𝑒𝑑 × 64))  (6)

where 𝑊𝐹𝑉 is the maximum number of wavefronts under the limit of the size of the VGPR, 𝑉𝐺𝑃𝑅𝑚𝑎𝑥 is the size of the VGPR in the Single Instruction Multiple Data (SIMD) unit, and 𝑉𝐺𝑃𝑅𝑢𝑠𝑒𝑑 is the VGPR size used by a
work-item. In the CDNA2.0 hardware architecture, each CU consists of four SIMDs.

𝑊𝐹𝑆 = floor(𝑆𝐺𝑃𝑅𝑚𝑎𝑥 / 𝑆𝐺𝑃𝑅𝑢𝑠𝑒𝑑)  (7)

where 𝑊𝐹𝑆 is the maximum number of wavefronts under the limit of the size of the SGPR, 𝑆𝐺𝑃𝑅𝑚𝑎𝑥 is the size of the SGPR in the CU, and 𝑆𝐺𝑃𝑅𝑢𝑠𝑒𝑑 is the size of the SGPR used by a wavefront.

𝑊𝐹𝐿 = floor(𝐿𝐷𝑆𝑚𝑎𝑥 / 𝐿𝐷𝑆𝑢𝑠𝑒𝑑) × ceil(𝑊𝐼𝑤𝑔 / 64)  (8)

where 𝑊𝐹𝐿 is the maximum number of wavefronts under the limit of the size of the LDS, 𝐿𝐷𝑆𝑚𝑎𝑥 is the size of the LDS available to a workgroup, 𝐿𝐷𝑆𝑢𝑠𝑒𝑑 is the size of the LDS used by a workgroup, and 𝑊𝐼𝑤𝑔 has the same meaning as in Eq. (5).

To sum up, the number of wavefronts should meet the limitations of all the above factors, and it is calculated as follows:

𝑊𝐹 = min(𝑊𝐹𝑤𝑔, 𝑊𝐹𝑉, 𝑊𝐹𝑆, 𝑊𝐹𝐿, 𝑊𝐹𝐶)  (9)

where 𝑊𝐹 is the number of activated wavefronts and 𝑊𝐹𝐶 is the maximum number of wavefronts supported by the CU.

The number of wavefronts and the corresponding number of threads are introduced into Eq. (4) to compute the TLP more accurately and appropriately. Compared with Eq. (4), Eq. (3) only considers the workload at the workgroup level, which neglects the further conversion between workgroups and wavefronts at runtime. Eq. (3) is valid only if the following two conditions are satisfied: one is that all thread computation and data load amounts are consistent; the other is that the hardware resources required by the activated wavefronts do not exceed the limits of the CU. Note that for GEMMs of different precision, threads have different computing-resource requirements (VGPR, SGPR, LDS) during the computation process. Therefore, for matrices of different precision, the values of 𝑉𝐺𝑃𝑅𝑢𝑠𝑒𝑑, 𝑆𝐺𝑃𝑅𝑢𝑠𝑒𝑑, and 𝐿𝐷𝑆𝑢𝑠𝑒𝑑 in Eqs. (6)–(8) differ, which affects the number of activated wavefronts.

4.2.2. Tiling fine-tuning

For batch GEMM, an initial tiling scheme is first assigned to address the context-switching overhead and low hardware resource utilization caused by the variable scale of the matrices. Then, the tiling scheme is adjusted according to the TLP estimate of the batch GEMM and the hardware architecture of the GPU, and finally the best tiling scheme is obtained. In the first stage, the tile size chosen for each GEMM according to the dimensions of the matrix should meet the following conditions:

𝑇𝑚𝑖 ≤ 𝑀𝑖 and 𝑀𝑖 mod 𝑇𝑚𝑖 = 0
𝑇𝑛𝑖 ≤ 𝑁𝑖 and 𝑁𝑖 mod 𝑇𝑛𝑖 = 0  (10)
𝑇𝑘𝑖 ≤ 𝐾𝑖 and 𝐾𝑖 mod 𝑇𝑘𝑖 = 0

where 𝑇𝑚𝑖 and 𝑇𝑛𝑖 represent the tile dimensions in the tiling scheme, and 𝑇𝑘𝑖 is the sub-tile size along the 𝐾 dimension. There are two issues. (1) After the first phase, the batch GEMM has only an initial scheme, which cannot achieve optimal parallel computing efficiency. (2) Due to the variability of matrix sizes in batch GEMM, one or several of the 𝐵𝑠𝑖𝑧𝑒, 𝑀, 𝑁, and 𝐾 values may be particularly small, which is called an extreme GEMM case. In this case, the initial scheme cannot produce enough tiles, which leaves some CUs idle and wastes GPU computing power.

To solve these problems, the initial scheme is adjusted reasonably and efficiently in the second stage. Larger matrices are segmented with smaller tiles, and the number of tiles is increased by reducing the tile size to avoid idle CUs. The details are as follows: for a GEMM with an appropriate initial scheme, to avoid wasting GPU hardware resources, some larger GEMMs are cut with smaller tiles to ensure that the number of tiles is sufficient. For example, tiles whose initial size is 64*64 are re-segmented with 32*32 tiles. As a result, the number of tiles increases as the tile size decreases. This fine-tuning approach ensures that the CUs are not idle by increasing the utilization of hardware resources at the expense of intra-tile parallelism.

Algorithm 1 The tiling algorithm.
1: Initialize 𝑇𝐿𝑃𝑡ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑, 𝑇𝐿𝑃 = 0, 𝑡𝑜𝑡𝑎𝑙_𝑤𝑜𝑟𝑘𝑔𝑟𝑜𝑢𝑝 = 0, 𝑡𝑜𝑡𝑎𝑙_𝑤𝑎𝑣𝑒𝑓𝑟𝑜𝑛𝑡 = 0;
2: for 𝑖 = 0 to 𝐵𝑠𝑖𝑧𝑒 − 1 do
3:   Calculate 𝑇𝑚𝑖, 𝑇𝑛𝑖 according to Eq. (10);
4:   𝑡𝑜𝑡𝑎𝑙_𝑤𝑜𝑟𝑘𝑔𝑟𝑜𝑢𝑝 += (𝑀𝑖 × 𝑁𝑖) / (𝑇𝑚𝑖 × 𝑇𝑛𝑖);
5: end for
6: 𝑇𝐿𝑃𝑛𝑒𝑤 = 𝜑(𝑡𝑜𝑡𝑎𝑙_𝑤𝑜𝑟𝑘𝑔𝑟𝑜𝑢𝑝) × 𝑇𝑤𝑎𝑣𝑒𝑓𝑟𝑜𝑛𝑡;
7: 𝑇𝑖𝑙𝑒[𝑠𝑖𝑧𝑒] ranges from "large" to "small";
8: while 𝑇𝐿𝑃𝑛𝑒𝑤 < 𝑇𝐿𝑃𝑡ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑 do
9:   for 𝑗 = 0 to 𝐵𝑠𝑖𝑧𝑒 − 1 do
10:    if 𝑇𝑖𝑙𝑒[𝑗] is "large" then
11:      Set 𝑇𝑖𝑙𝑒[𝑗] to "medium-large";
12:    else if 𝑇𝑖𝑙𝑒[𝑗] is "medium-large" then
13:      Set 𝑇𝑖𝑙𝑒[𝑗] to "medium";
14:    else if 𝑇𝑖𝑙𝑒[𝑗] is "medium" then
15:      Set 𝑇𝑖𝑙𝑒[𝑗] to "small-medium";
16:    else if 𝑇𝑖𝑙𝑒[𝑗] is "small-medium" then
17:      Set 𝑇𝑖𝑙𝑒[𝑗] to "small";
18:    end if
19:    𝑡𝑜𝑡𝑎𝑙_𝑤𝑜𝑟𝑘𝑔𝑟𝑜𝑢𝑝 += (𝑀𝑗 × 𝑁𝑗) / (𝑇𝑚𝑗 × 𝑇𝑛𝑗);
20:  end for
21:  𝑇𝐿𝑃𝑛𝑒𝑤 = 𝜑(𝑡𝑜𝑡𝑎𝑙_𝑤𝑜𝑟𝑘𝑔𝑟𝑜𝑢𝑝) × 𝑇𝑤𝑎𝑣𝑒𝑓𝑟𝑜𝑛𝑡;
22: end while

𝑇𝐿𝑃𝑡ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑 is used as a threshold to ensure parallelism among the multiple tiles in the fine-tuning phase. Note that 𝑇𝐿𝑃𝑡ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑 has an important influence on the selection of the tiling scheme for different hardware architectures. As a measure, the TLP values of the batch GEMM vary with the tiling scheme. The setting of the 𝑇𝐿𝑃𝑡ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑 value is related to the architecture of the GPU because it uses the number of wavefronts and the number of threads in the wavefront to measure the parallelism of the tiling scheme. The hardware resources and the maximum number of wavefronts supported by each CU are diverse, so a corresponding 𝑇𝐿𝑃𝑡ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑 should be set for each GPU architecture.

The specific process of selecting a tiling scheme for batch GEMM is given in Algorithm 1: (1) When a batch GEMM is given, an initial scheme is obtained according to Eq. (10). (2) The TLP of this scheme is calculated according to the given batch GEMM and tiling scheme. (3) The TLP of the current tiling scheme is compared with 𝑇𝐿𝑃𝑡ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑: if the threshold is not reached, the fine-tuning operation is performed, the current tiling scheme is changed, and the process returns to step (2); if the current TLP is greater than or equal to the threshold, go to step (4). (4) The batch GEMM is calculated according to the final tiling scheme. In the above procedure, the TLP is used as an evaluation criterion to measure the effectiveness of the tiling scheme on the batch GEMM. If the threshold is not reached, fine-tuning is used to adjust the scheme and improve the utilization of GPU hardware resources. The optimal tiling scheme can thus be obtained to ensure an optimal implementation at the GEMM and workgroup levels. After the final tiling scheme is determined, the multi-thread kernel is selected based on the tile size so that the wavefront and work-item levels can achieve a workload-balanced state.

The proposed method is implemented on both the AMD and NVIDIA GPU platforms. The hardware characteristics of the GPU platform can also significantly impact GEMM performance. For example, on the AMD and NVIDIA platforms, threads are grouped into wavefronts and warps as the basic execution units, containing 64 and 32 threads, respectively. The number of threads in the kernel needs to be an integer multiple of the number of threads in the wavefront or warp to improve kernel occupancy. Meanwhile, the size of registers and shared memory
can affect parameter settings during implementation based on different hardware architectures. Based on this difference, the proposed method considers parallelism at the wavefront or warp level when performing matrix segmentation on the two GPU platforms. In this way, the proposed method can flexibly select tiling schemes based on the hardware characteristics of the GPU to achieve optimal performance, avoid exceeding the maximum register limit, and prevent data overflow, which improves its applicability to various hardware architectures.

5. Evaluation

5.1. Setup

Experiment platform and matrix generation. The overall configuration of the experimental platform and the details of the two GPUs are shown in Tables 4 and 5, respectively.

Table 4
The configuration of platforms for evaluation.

Platform setup   AMD-platform   NVIDIA-platform
CPU              EPYC 7763      Platinum 8358
GPU              MI210          A800
OS               Ubuntu 20.04   Ubuntu 20.04
ROCm/CUDA        ROCm 5.6       CUDA 12.0

Table 5
The configuration of GPUs for evaluation.

Name          MI210                         A800
Architecture  CDNA 2.0                      Ampere
Core          1700 MHz                      1410 MHz
Caches        L1 16 KB (per CU), L2 16 MB   L1 192 KB (per SM), L2 40 MB
Memory        64 GB 3.2 Gbps HBM2           80 GB 2.4 Gbps HBM2
Bandwidth     1.6 TB/s                      2.04 TB/s

To ensure the irregularity and variability of the input matrices, the GEMM size parameters 𝑀, 𝑁, and 𝐾 are randomly generated within corresponding ranges ([𝑀𝑖𝑛, 𝑀𝑎𝑥_𝑀(𝑁)] and [𝑀𝑖𝑛, 𝑀𝑎𝑥_𝐾]). 𝑀𝑎𝑥_𝑀, 𝑀𝑎𝑥_𝑁, and 𝑀𝑎𝑥_𝐾 represent the upper bounds of 𝑀, 𝑁, and 𝐾, respectively, and the lower bound for each experiment is denoted uniformly by 𝑀𝑖𝑛, which is set to 16 in this paper. For example, Max_M(N) = 512 and Max_K = 128 indicate that the matrix dimension ranges are 𝑀 ∈ [16, 512], 𝑁 ∈ [16, 512], and 𝐾 ∈ [16, 128]. Thus, multiple sets of matrix dimension ranges can be obtained, and the parameters needed for GEMM generation are chosen from the different value ranges by random selection.

Comparison method. First, for the two GPU experimental platforms, the default GEMM processing methods rocBLAS [6] and cuBLAS [7], provided by the respective GPU manufacturers, are chosen as the basic comparison methods to demonstrate the effectiveness of the proposed method. Since these methods do not support batch invocation, rocBLAS and cuBLAS compute batch GEMM in a loop in this paper; no stream operations are used during the computation. Meanwhile, we also compare with CUTLASS [23], which supports batch GEMM based on sorting and built-in tiles. We then compare with MAGMA [8], supported by the University of Tennessee ICL Lab, which only extends 𝑔𝑟𝑖𝑑.𝑧 to support batch GEMM but does not have a fine-grained optimization strategy; the MAGMA comparison experiments were run on both GPU platforms. Finally, to show the advancement of our proposed method, we compare with state-of-the-art methods such as Wang [36] and Li [21] on their respective platforms. All of the above methods perform a warm-up operation to eliminate the effect of the first kernel launch.

Evaluation criteria. In the following experiments, there are 12 sets of value ranges. The experimental results are reported as the average value of GFLOPS (giga floating-point operations per second), which is calculated as:

𝐺𝐹𝐿𝑂𝑃𝑆 = ( ∑𝑖=0..𝑛−1 2(𝑀𝑖 × 𝑁𝑖 × 𝐾𝑖) ) / (𝑡𝑜𝑡𝑎𝑙_𝑡𝑖𝑚𝑒 × 1.0e9)  (11)

where 𝑀𝑖, 𝑁𝑖, and 𝐾𝑖 represent the matrix dimensions of the 𝑖th GEMM, 𝑡𝑜𝑡𝑎𝑙_𝑡𝑖𝑚𝑒 represents the running time on the GPU, and 𝑛 represents the batch size. For simplicity, the experimental data are single-precision floating-point values stored in row-major format. The experimental results are averaged over 10 consecutive runs and rounded to two decimal places.

5.2. Speed up

On the two platforms, we first compare with the default methods rocBLAS and cuBLAS. These two methods do not support batch irregular GEMMs, so we convert batch GEMMs into multiple single GEMMs and compute the results. The specific experimental results are shown in Figs. 4–5, which show that the proposed method achieves 5.09× and 7.18× average speedups compared to rocBLAS and cuBLAS, respectively. This result is primarily due to the fact that these methods do not support GEMMs of different scales when computing batch GEMMs, so they can only compute one GEMM at a time. When faced with small matrices, the computational resources of the GPU cannot be fully utilized due to the cost of context switching between multiple GEMMs. As the batch size gradually increases, the advantage of the proposed method becomes more evident. This shows that for batch and irregular GEMMs, rocBLAS and cuBLAS are at a disadvantage in terms of computational efficiency and switching between instances. Meanwhile, we also compare with CUTLASS, which handles batch GEMM using sorting to solve the problem of significant workload differences between multiple matrix multiplications. Fig. 5 shows that the proposed method has a 4.64× speedup, because CUTLASS's built-in tiles are unsuitable when the matrix dimensions are small. Therefore, the proposed method achieves better acceleration than CUTLASS for batch, irregular, and small-size matrix multiplication. We then perform a detailed comparison and analysis of the experimental performance based on MAGMA. The proposed method has 4.37× and 3.36× speed improvements compared to MAGMA. Figs. 4–5 show that the advantage of our method becomes more pronounced as the batch size increases. This is because MAGMA only uses the largest GEMM size in the batch GEMM to set grid.x. Due to the irregularity of the matrix sizes, a large number of computational resources in the grid will be idle. The proposed method, in this case, employs fine-grained filtering operations to ensure further efficient utilization of computational resources, which is more evident when the difference between matrix dimensions is significant.

As shown in Fig. 4, the proposed method achieves an average 1.88× speedup compared to Wang. It is noted that the advantage of the proposed method is more pronounced when 𝑀𝑎𝑥_𝐾 and 𝑀𝑎𝑥_𝑀 are small. For example, in the case of (𝑀𝑎𝑥_𝑀(𝑁) = 128, 𝑀𝑎𝑥_𝐾 = 128), the average speedup reaches 1.95×. This is mainly because when the matrix dimensions are small, there are not enough tiles to cover the time consumption of data loading in the wavefront, which is more pronounced in workgroups with heavy loads. The proposed method adjusts the wavefront workload corresponding to the tiles through the multi-thread kernel and ensures consistent computation and data loading across different workgroups. At the same time, this also shows that load and computation balancing between wavefronts is more conducive to improving the efficiency of GPU parallel computing. On the NVIDIA platform, Fig. 5 shows that the proposed method has an average 1.94× speedup compared to
of GEMM dimension ranges. The experiments with batch sizes 8, 16, Li. The advantage of the proposed method becomes clearer as the batch
32, 64, 128, and 256 were run continuously for ten epochs under each size increases. There are two reasons for this speedup performance :
Y. Zhang et al. Journal of Systems Architecture 160 (2025) 103341
Fig. 4. The comparative results on MI210. (5.09×, 4.37×, 1.88× speedup over rocBLAS, MAGMA, Wang).
Fig. 5. The comparative results on A800. (7.18×, 4.64×, 3.63×, 1.94× speedup over cuBLAS, CUTLASS, MAGMA, Li).
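The random generation of irregular GEMM sizes described under "Experiment platform and matrix generation" can be sketched as follows. The function name and the use of Python are illustrative assumptions, not part of the paper's implementation:

```python
import random

def generate_batch_gemm_sizes(batch, max_mn, max_k, min_dim=16, seed=None):
    """Draw (M, N, K) for each GEMM in the batch: M and N uniformly from
    [min_dim, max_mn], and K uniformly from [min_dim, max_k]."""
    rng = random.Random(seed)
    return [
        (rng.randint(min_dim, max_mn),
         rng.randint(min_dim, max_mn),
         rng.randint(min_dim, max_k))
        for _ in range(batch)
    ]

# The (Max_M(N) = 512, Max_K = 128) configuration mentioned in the paper:
sizes = generate_batch_gemm_sizes(batch=8, max_mn=512, max_k=128, seed=0)
```

Each experiment then draws one such batch per epoch, so every invocation sees a different mix of irregular shapes within the same dimension range.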
Fig. 6. The kernel occupancy on two GPU platforms.
Fig. 7. The time overhead of tiling algorithm.
(1) Li et al. used batching to balance the workload among different blocks but did not consider the difference between the workloads of threads in different tiles. (2) When selecting the tiling scheme, the TLP is calculated only at the block level; the fine-grained warp level is neglected, which leads to an inaccurate calculation of TLP. The proposed method adjusts the wavefront workload corresponding to the tiles through a multi-thread kernel and ensures consistent computation and data loading by different workgroups. This also shows that balancing load and computation between wavefronts is more conducive to improving the efficiency of GPU parallel computing.
5.3. Kernel occupancy
To explore the difference between the proposed method and the comparison methods in terms of GPU resource utilization, we present the kernel occupancy of the various methods on the two GPU platforms. The formula for kernel occupancy can be expressed as:
kernel occupancy = Num_actived / Num_total    (12)
To obtain more accurate performance metrics, we utilize the Omniperf (https://github.com/ROCm/omniperf) and Nsight (https://docs.nvidia.com/nsight-compute/NsightCompute/index.html) profiling tools provided by AMD and NVIDIA to evaluate the resource utilization of the kernel during execution. Kernel occupancy has distinct interpretations owing to the architectural differences between the AMD MI210 and the NVIDIA A800. On the AMD platform, Num_actived is the number of activated wavefronts and Num_total is the theoretical number of wavefronts that a CU can execute simultaneously. On the NVIDIA platform, Num_actived and Num_total represent the number of activated warps and the number of warps that are theoretically parallelizable simultaneously.
The results of the experiment are shown in Fig. 6. Compared with rocBLAS and cuBLAS, the proposed method has a clear advantage in the case of batch GEMM. The proposed method is also in the best position compared to the other methods (CUTLASS, MAGMA, Wang, Li), showing high efficiency in terms of utilization of GPU resources. As shown in Fig. 6, the proposed method consistently maintains the optimal kernel occupancy on both GPU platforms, which indicates that it can better exploit the computing power of the GPU.
5.4. The overhead of tiling algorithm
This section presents the proportion of the runtime that is taken up by the tiling algorithm when executing the proposed method on the two GPU platforms with various batch sizes. The experimental results are presented in Fig. 7: the tiling algorithm's runtime percentage decreases as the batch size increases. When the batch size is 8, the runtime of the tiling algorithm on the two GPU platforms is 6.06% and 6.37%, respectively. As the batch size increases, more and more GEMMs are executed on the GPU, and the execution time of these GEMMs takes up most of the total time, resulting in a smaller runtime share for the tiling algorithm. For example, with a batch size of 1024, the tiling algorithm takes less than 1% of the runtime. The experimental results on the two GPUs indicate that the time overhead of the tiling algorithm in the batch GEMM execution process is negligible, especially when the batch size is large. In real-world scenarios such as deep learning, where a large number of
Fig. 8. The performance improvement of the proposed TLP on MI210. (1.077× average speedup).
GEMM operations are often required, the tiling algorithm will have even less overhead in the execution process.
5.5. The performance benefits of the proposed TLP
This section presents the comparative experimental results on the two GPU platforms to provide a more detailed evaluation of the proposed TLP. The detailed experimental results are shown in Figs. 8–9: the proposed TLP performs better overall than the traditional TLP, with speedups of 1.077× and 1.085× on MI210 and A800, respectively. From Fig. 8, the proposed method improves performance significantly when the batch size is larger. For example, on MI210, the proposed method has an average speedup of 1.04× when the batch size <= 16; when the batch size >= 32, it improves performance by 1.10×. This gap arises because, when the batch size and matrix dimensions are small, it is difficult to fully utilize hardware resources. When there are a large number of tiles, the proposed TLP can more accurately evaluate the threads' workload and select the optimal tiling scheme. The same trend is reflected on the A800 platform, where the proposed TLP yields improvements of 1.04× and 1.11× for batch size <= 16 and batch size >= 32, respectively. The effectiveness of the proposed TLP is thus further demonstrated by the comparative results on both GPU platforms.
5.6. The latency
This section compares kernel latency on the two GPU platforms to provide a more detailed evaluation of the proposed method. We measured kernel latency with different batch sizes in the comparative experiment; the detailed results are shown in Fig. 10. On MI210, the proposed method reduces latency by 3.87×, 4.53×, and 1.62× compared to rocBLAS, MAGMA, and Wang, respectively. The proposed method has the lowest latency on MI210, indicating higher computational efficiency. On A800, the proposed method shows improvements of 3.02×, 2.59×, 2.45×, and 1.89× compared to cuBLAS, MAGMA, CUTLASS, and Li, respectively. Fig. 10 shows that as the batch size gradually increases, the kernel latency increases on both GPU platforms, with rocBLAS and cuBLAS exhibiting the highest latency. This is because the traditional loop scheduling method significantly increases latency due to context switching between kernels when the batch size is large. Fig. 10 also shows that some methods exhibit different latency behavior at various batch sizes. For example, when the batch size <= 16, MAGMA has the highest latency on both GPU platforms; when the batch size is large, its computational performance improves, indicating that MAGMA performs better when there are many matrices. The experimental results on both platforms show that the proposed method has the lowest latency under all batch sizes, indicating better performance and broad applicability.
5.7. The improved performance on inception layers of CNN
Modern CNN architectures often have multiple branches to capture features at different scales. The convolution operations of different scales in each branch can be represented as batch GEMM operations with various dimensions, e.g., GoogleNet [13], DenseNet [50], SqueezeNet [12], etc. To demonstrate the effectiveness of the proposed method in real-world scenarios, we use various Inception modules as a typical application to perform the forward computation on the two GPU platforms. The Inception module involves a large number of irregular, small-size GEMM operations. The deep learning frameworks
Fig. 9. The performance improvement of the proposed TLP on A800. (1.085× average speedup).
Fig. 10. The latency performance of the kernel on two GPU platforms.
MIOpen (https://github.com/ROCm/MIOpen) and cuDNN (https://github.com/NVIDIA/cudnn-frontend) are used as benchmark implementations on both GPU platforms. In this section, we select several commonly used Inception modules to evaluate the proposed method's speedup performance. The GEMM sizes in the Inception modules are shown in Table 6. Fig. 11 shows the speedup performance of the proposed method in each Inception module; the average speedups are 2.88× and 1.87×, respectively, and the gray boxes represent the average speedup ratios of the different Inception modules. The experimental results suggest that the Inception 8–9 series has the highest average speedup ratio (3.68× and 2.66×, respectively) among the Inception modules, because Inception 8–9 contains more matrix shapes than the other Inception modules, and the dimensions of these matrices are smaller than in the former two. Finally, the proposed method has been proven to significantly accelerate CNN models with various branch structures on two different GPU platforms, particularly in scenarios involving multiple branches, irregular shapes, and small dimensions.
6. Conclusion
In this paper, we propose a load-balanced batch GEMM acceleration method for the problem of low parallel computing efficiency and poor hardware resource utilization in batch, irregular, and variable matrix multiplication scenarios. The kernel occupancy and hardware resource utilization can be effectively improved by a multi-thread kernel design that balances the computational and data load in the work-item. A novel approach to TLP computation is devised, where the parallelism of
Fig. 11. The speedup performance on Inception layers.
Table 6
The size of GEMM in various Inception modules.
Inception module    GEMM size (M × N × K)
Inception-1 784 × 96 × 192, 784 × 64 × 192, 784 × 32 × 192, 784 × 16 × 192
Inception-2 784 × 64 × 192, 784 × 32 × 192, 784 × 128 × 192
Inception-3 196 × 192 × 192, 196 × 16 × 192, 196 × 96 × 192, 196 × 64 × 192
Inception-4 196 × 64 × 192, 196 × 24 × 192, 196 × 160 × 192
Inception-5 196 × 64 × 192, 196 × 128 × 192, 196 × 24 × 192
Inception-6 196 × 112 × 192, 196 × 144 × 192, 196 × 32 × 192, 196 × 64 × 192
Inception-7 196 × 256 × 192, 196 × 160 × 192, 196 × 128 × 192
Inception-8 49 × 160 × 192, 49 × 128 × 192, 49 × 256 × 192, 49 × 160 × 192, 49 × 32 × 192
Inception-9 49 × 192 × 192, 49 × 128 × 192, 49 × 384 × 192, 49 × 192 × 192, 49 × 48 × 192
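The GEMM sizes in Table 6 are consistent with the standard im2col lowering of a convolution, where M is the number of output positions (height × width), N is the number of filters, and K is input channels × kernel area. The sketch below reproduces the Inception-1 row under assumed layer parameters (a 28 × 28 feature map with 192 input channels and 1 × 1 kernels, in the style of GoogleNet [13]); the mapping, not the exact network configuration, is the point:

```python
def im2col_gemm_size(out_h, out_w, filters, in_channels, kh=1, kw=1):
    """Map one convolution branch to an (M, N, K) GEMM via im2col:
    M = out_h * out_w, N = filters, K = in_channels * kh * kw."""
    return (out_h * out_w, filters, in_channels * kh * kw)

# Four assumed 1x1 branches on a 28x28 feature map with 192 input channels,
# matching the Inception-1 row of Table 6 (784 x {96, 64, 32, 16} x 192):
sizes = [im2col_gemm_size(28, 28, f, 192) for f in (96, 64, 32, 16)]
```

This is why each Inception module naturally yields a batch of irregular, small-size GEMMs: every branch contributes one GEMM whose N (and K, for larger kernels) differs from its siblings.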
the tiling scheme is measured by the number of activated wavefronts. This approach allows the optimal tiling scheme to be selected based on different GPU architectures. Experiments are conducted on two GPU platforms to validate the effectiveness and progress of our proposed method.
Future work includes exploring batch GEMM with various precision performances. With the development of Transformer-based models, many GEMM operations are involved in the training and inference process of Large Language Models (LLMs), which often use lower-precision formats such as FP16, FP8, etc. For example, quantized LLMs often involve GEMM operations where the weight matrices and activation values have different precisions, e.g. W4A16, W8A8. More complex precisions and storage formats pose challenges to the performance of GEMM operations.
CRediT authorship contribution statement
Yu Zhang: Writing – review & editing, Writing – original draft. Lu Lu: Writing – review & editing, Supervision. Zhanyu Yang: Writing – review & editing. Zhihong Liang: Supervision, Conceptualization. Siliang Suo: Supervision, Conceptualization.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the Natural Science Foundation of Guangdong Province (2024A1515010204) and the Technological Research Project of Southern Power Grid Company (ZBKJXM20232483).
Data availability
No data was used for the research described in the article.
References
[1] P. Valero-Lara, I. Jorquera, F. Lui, J. Vetter, Mixed-precision S/DGEMM using the TF32 and TF64 frameworks on low-precision AI tensor cores, in: Proceedings of the SC'23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, 2023, pp. 179–186.
[2] H. Martínez, S. Catalán, A. Castelló, E.S. Quintana-Ortí, Parallel GEMM-based convolutions for deep learning on multicore ARM and RISC-V architectures, J. Syst. Archit. (2024) 103186.
[3] J. Fornt, P. Fontova-Musté, M. Caro, J. Abella, F. Moll, J. Altet, C. Studer, An energy-efficient gemm-based convolution accelerator with on-the-fly im2col, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 31 (11) (2023) 1874–1878.
[4] H. Kim, W.J. Song, Las: locality-aware scheduling for GEMM-accelerated convolutions in GPUs, IEEE Trans. Parallel Distrib. Syst. 34 (5) (2023) 1479–1494.
[5] W. Yang, J. Fang, D. Dong, X. Su, Z. Wang, Optimizing full-spectrum matrix multiplications on ARMv8 multi-core CPUs, IEEE Trans. Parallel Distrib. Syst. (2024).
[6] AMD, Next generation BLAS implementation for ROCm platform, 2024, https://github.com/ROCm/rocBLAS.
[7] B. Tuomanen, Hands-On GPU Programming with Python and CUDA: Explore High-Performance Parallel Computing with CUDA, Packt Publishing Ltd, 2018.
[8] ICL, Matrix algebra for GPU and multicore architectures, 2024, https://icl.utk.edu/magma/.
[9] T. Faingnaert, T. Besard, B. De Sutter, Flexible performant GEMM kernels on GPUs, IEEE Trans. Parallel Distrib. Syst. 33 (9) (2021) 2230–2248.
[10] W.S. Moses, I.R. Ivanov, J. Domke, T. Endo, J. Doerfert, O. Zinenko, High-performance gpu-to-cpu transpilation and optimization via high-level parallel constructs, in: Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023, pp. 119–134.
[11] H. Kim, H. Nam, W. Jung, J. Lee, Performance analysis of CNN frameworks for GPUs, in: 2017 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS, IEEE, 2017, pp. 55–64.
[12] F.N. Iandola, S. Han, M.W. Moskewicz, K. Ashraf, W.J. Dally, K. Keutzer, SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size, 2016, arXiv preprint arXiv:1602.07360.
[13] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, Going deeper with convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1–9.
[14] G. Pant, D. Yadav, A. Gaur, ResNeXt convolution neural network topology-based deep learning model for identification and classification of pediastrum, Algal Res. 48 (2020) 101932.
[15] S. Barrachina, M.F. Dolz, P. San Juan, E.S. Quintana-Ortí, Efficient and portable GEMM-based convolution operators for deep neural network training on multicore processors, J. Parallel Distrib. Comput. 167 (2022) 240–254.
[16] S. Rajbhandari, Y. He, O. Ruwase, M. Carbin, T. Chilimbi, Optimizing cnns on multicores for scalability, performance and goodput, ACM SIGARCH Comput. Archit. News 45 (1) (2017) 267–280.
[17] C. Rivera, J. Chen, N. Xiong, S.L. Song, D. Tao, Ism2: Optimizing irregular-shaped matrix-matrix multiplication on gpus, 2020, arXiv preprint arXiv:2002.03258.
[18] K. Matsumoto, N. Nakasato, S.G. Sedukhin, Performance tuning of matrix multiplication in opencl on different gpus and CPUs, in: 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, IEEE, 2012, pp. 396–405.
[19] G.E. Moon, H. Kwon, G. Jeong, P. Chatarasi, S. Rajamanickam, T. Krishna, Evaluating spatial accelerator architectures with tiled matrix-matrix multiplication, IEEE Trans. Parallel Distrib. Syst. 33 (4) (2021) 1002–1014.
[20] Q. Han, H. Yang, M. Dun, Z. Luan, L. Gan, G. Yang, D. Qian, Towards efficient tile low-rank GEMM computation on sunway many-core processors, J. Supercomput. 77 (5) (2021) 4533–4564.
[21] X. Li, Y. Liang, S. Yan, L. Jia, Y. Li, A coordinated tiling and batching framework for efficient GEMM on GPUs, in: Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming, 2019, pp. 229–241.
[22] P. Tillet, D. Cox, Input-aware auto-tuning of compute-bound HPC kernels, in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2017, pp. 1–12.
[23] NVIDIA, CUDA templates for linear algebra subroutines, 2024, https://github.com/NVIDIA/cutlass.
[24] J. Huang, C.D. Yu, R.A.v.d. Geijn, Strassen's algorithm reloaded on GPUs, ACM Trans. Math. Softw. 46 (1) (2020) 1–22.
[25] B. Boyer, J.-G. Dumas, C. Pernet, W. Zhou, Memory efficient scheduling of strassen-winograd's matrix multiplication algorithm, in: Proceedings of the 2009 International Symposium on Symbolic and Algebraic Computation, 2009, pp. 55–62.
[26] A. Fawzi, M. Balog, A. Huang, T. Hubert, B. Romera-Paredes, M. Barekatain, A. Novikov, F.J. R Ruiz, J. Schrittwieser, G. Swirszcz, et al., Discovering faster matrix multiplication algorithms with reinforcement learning, Nature 610 (7930) (2022) 47–53.
[27] G. Xiao, C. Yin, T. Zhou, X. Li, Y. Chen, K. Li, A survey of accelerating parallel sparse linear algebra, ACM Comput. Surv. 56 (1) (2023) 1–38.
[28] Y. Chen, G. Xiao, K. Li, F. Piccialli, A.Y. Zomaya, fgSpMSpV: A fine-grained parallel SpMSpV framework on HPC platforms, ACM Trans. Parallel Comput. 9 (2) (2022) 1–29.
[29] Y. Chen, G. Xiao, W. Yang, Optimizing partitioned CSR-based SpGEMM on the sunway TaihuLight, Neural Comput. Appl. 32 (10) (2020) 5571–5582.
[30] Y. Chen, K. Li, W. Yang, G. Xiao, X. Xie, T. Li, Performance-aware model for sparse matrix-matrix multiplication on the sunway taihulight supercomputer, IEEE Trans. Parallel Distrib. Syst. 30 (4) (2018) 923–938.
[31] G. Xiao, K. Li, Y. Chen, W. He, A.Y. Zomaya, T. Li, Caspmv: A customized and accelerative spmv framework for the sunway taihulight, IEEE Trans. Parallel Distrib. Syst. 32 (1) (2019) 131–146.
[32] G. Xiao, C. Yin, Y. Chen, M. Duan, K. Li, Efficient utilization of multi-threading parallelism on heterogeneous systems for sparse tensor contraction, IEEE Trans. Parallel Distrib. Syst. (2024).
[33] D.E. Tanner, Tensile: Auto-tuning gemm gpu assembly for all problem sizes, in: 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW, IEEE, 2018, pp. 1066–1075.
[34] S. Wang, FlexGEMM: A flexible micro-kernel generation framework, in: Proceedings of the 5th International Conference on Computer Information and Big Data Applications, 2024, pp. 164–170.
[35] G. Alaejos, A. Castelló, H. Martínez, P. Alonso-Jordá, F.D. Igual, E.S. Quintana-Ortí, Micro-kernels for portable and efficient matrix multiplication in deep learning, J. Supercomput. 79 (7) (2023) 8124–8147.
[36] R. Wang, Z. Yang, H. Xu, L. Lu, A high-performance batched matrix multiplication framework for gpus under unbalanced input distribution, J. Supercomput. 78 (2) (2022) 1741–1758.
[37] Y. Zhang, Y. Wang, Z. Mo, Y. Zhou, T. Sun, G. Xu, C. Xing, L. Yang, Accelerating small matrix multiplications by adaptive batching strategy on GPU, in: 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application, HPCC/DSS/SmartCity/DependSys, IEEE, 2022, pp. 882–887.
[38] A. Abdelfattah, S. Tomov, J. Dongarra, Matrix multiplication on batches of small matrices in half and half-complex precisions, J. Parallel Distrib. Comput. 145 (2020) 188–201.
[39] A. Abdelfattah, A. Haidar, S. Tomov, J. Dongarra, Novel HPC techniques to batch execution of many variable size BLAS computations on GPUs, in: Proceedings of the International Conference on Supercomputing, 2017, pp. 1–10.
[40] A. Abdelfattah, A. Haidar, S. Tomov, J. Dongarra, Performance, design, and autotuning of batched GEMM for GPUs, in: High Performance Computing: 31st International Conference, ISC High Performance 2016, Frankfurt, Germany, June 19-23, 2016, Proceedings, Springer, 2016, pp. 21–38.
[41] A. Li, G.-J. van den Braak, H. Corporaal, A. Kumar, Fine-grained synchronizations and dataflow programming on GPUs, in: Proceedings of the 29th ACM on International Conference on Supercomputing, 2015, pp. 109–118.
[42] J. Li, H. Ye, S. Tian, X. Li, J. Zhang, A fine-grained prefetching scheme for DGEMM kernels on GPU with auto-tuning compatibility, in: 2022 IEEE International Parallel and Distributed Processing Symposium, IPDPS, IEEE, 2022, pp. 863–874.
[43] Z. Yang, L. Lu, R. Wang, A batched GEMM optimization framework for deep learning, J. Supercomput. 78 (11) (2022) 13393–13408.
[44] H. Mei, H. Qu, J. Sun, Y. Gao, H. Lin, G. Sun, GPU occupancy prediction of deep learning models using graph neural network, in: 2023 IEEE International Conference on Cluster Computing, CLUSTER, IEEE, 2023, pp. 318–329.
[45] I. Masliah, A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, J. Dongarra, Algorithms and optimization techniques for high-performance matrix-matrix multiplications of very small matrices, Parallel Comput. 81 (2019) 1–21.
[46] G. Park, B. Park, M. Kim, S. Lee, J. Kim, B. Kwon, S.J. Kwon, B. Kim, Y. Lee, D. Lee, Lut-gemm: Quantized matrix multiplication based on luts for efficient inference in large-scale generative language models, 2022, arXiv preprint arXiv:2206.09557.
[47] B. Feng, Y. Wang, G. Chen, W. Zhang, Y. Xie, Y. Ding, EGEMM-TC: accelerating scientific computing on tensor cores with extended precision, in: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021, pp. 278–291.
[48] G. Shobaki, A. Kerbow, S. Mekhanoshin, Optimizing occupancy and ILP on the GPU using a combinatorial approach, in: Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization, 2020, pp. 133–144.
[49] A.B. Hayes, L. Li, D. Chavarría-Miranda, S.L. Song, E.Z. Zhang, Orion: A framework for gpu occupancy tuning, in: Proceedings of the 17th International Middleware Conference, 2016, pp. 1–13.
[50] G. Huang, S. Liu, L. Van der Maaten, K.Q. Weinberger, Condensenet: An efficient densenet using learned group convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 2752–2761.
Computer Standards & Interfaces 97 (2026) 104122
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
A multi-criteria process for IT project success evaluation – Addressing a critical gap in standard practices
João Carlos Lourenço a, João Varajão b,*
a CEGIST, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal
b Centro ALGORITMI, Universidade do Minho, Campus de Azurém, 4804-533 Guimarães, Portugal
A R T I C L E I N F O
Keywords:
Project success
Project evaluation
Multi-criteria evaluation
MACBETH
Process
Methodology
A B S T R A C T
The evaluation of project success is widely recognised as valuable for improving IT (Information Technology) project performance and impact. However, many processes fail to adequately address the requirements for a sound evaluation due to their inherent complexity or by not complying with fundamental practical and theoretical concepts. This paper presents a process that combines a problem structuring method with a multi-criteria decision analysis approach to evaluate the success of IT projects. Put into practice in the context of a software development project developed for a leading global supplier of technology and services, it offers a new way of creating a model for evaluating project success and tackling uncertainty, bringing clarity and consistency to the overall assessment process. A strong advantage of this process is that it is theoretically sound and can be easily applied to other evaluation problems involving other criteria. It also serves as a call to action for the development of formal standards in evaluation processes. Practical pathways to achieve such standardization include collaboration through industry consortia, development and adoption of ISO frameworks, and embedding evaluation processes within established maturity models. These pathways can foster consistency, comparability, and continuous improvement across organizations, paving the way for more robust and transparent evaluation practices.
1. Introduction
The sustainable success of virtually any organisation is strongly associated with the success of its projects [1]. A key factor for project success is that project managers clearly understand what success means [2], which is usually not the case [3]. Despite different notions about what constitutes "project success" and the many criteria that can be used for evaluation (e.g., cost, time, and performance, among others) [4], a project must satisfy its clients to be considered successful [5–8].
Given the importance and complexity of the evaluation of projects, companies should define and implement systematic processes for evaluating success to improve project management performance and the impact of deliverables [9]. However, despite the models and techniques that are currently available for assessing project success, they are typically challenging to implement for a variety of reasons, notably the complexity caused by using multiple and often conflicting objectives (e.g., minimise cost and maximise quality), the scarcity of empirical studies reporting their genuine use in projects [10], and the fact that practices employed in companies are generally informal and simplistic [11].
Additionally, several errors identified by the decision analysis literature [12,13] are often made, generating meaningless project success evaluations [14]. Some common mistakes involve not including relevant criteria in the evaluation model, not distinguishing the performance of a project from its value, assigning weights to evaluation criteria without considering the ranges of variation of their performance scales, and making calculations that violate measurement scale properties. In other words, such evaluations are inconsistent with multi-attribute value theory (MAVT) and value measurement foundations.
Considering these limitations, this research proposes a process that combines a problem structuring method with a multi-criteria approach for evaluating the success of information technology (IT) projects, supported by a real-world case. This process was developed and applied in the context of a project of GlobalSysMakers (for confidentiality reasons, the name of the company herein is anonymized), a leading global supplier of technology and services.
In the GlobalSysMakers project, the need for a new process arose because the project management team felt that the scoring model initially defined for success assessment, while helpful, lacked accuracy.
* Corresponding author.
E-mail address: varajao@dsi.uminho.pt (J. Varajão).
https://doi.org/10.1016/j.csi.2025.104122
Received 12 August 2025; Received in revised form 7 November 2025; Accepted 23 December 2025
Available online 24 December 2025
0920-5489/© 2025 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
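One of the mistakes listed in the introduction, assigning weights to evaluation criteria without considering the ranges of variation of their performance scales, can be illustrated numerically. In the sketch below, the criteria, unit values, and ranges are invented for illustration only; it shows that range-aware (swing-style) weights change when the performance ranges change, even when the per-unit "importance" does not:

```python
def range_aware_weights(value_per_unit, ranges):
    """Weight each criterion by the value spanned by its realistic
    performance range (swing-style), then normalize to sum to 1."""
    swings = [v * (hi - lo) for v, (lo, hi) in zip(value_per_unit, ranges)]
    total = sum(swings)
    return [s / total for s in swings]

# Cost is 'more important' per unit (5 vs 1), but if competing options
# differ by at most 1 cost unit while schedules differ by 30 days, the
# schedule criterion receives the larger range-aware weight; widening
# the cost range to 100 units reverses the ordering.
narrow_cost = range_aware_weights([5.0, 1.0], [(0, 1), (0, 30)])
wide_cost = range_aware_weights([5.0, 1.0], [(0, 100), (0, 30)])
```

Weights elicited as abstract "importance" ratings, detached from ranges like these, are therefore arbitrary, which is exactly the point made in the review below.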
J.C. Lourenço and J. Varajão Computer Standards & Interfaces 97 (2026) 104122
Following an appraisal of several methodological alternatives, a new multi-criteria approach combined with a problem structuring method was shown to be the best solution, providing the required precision and transparency to the process, along with a better understanding of the real meaning of the relative importance of each evaluation criterion. This paper describes the process developed in detail so that it can be replicated in other projects. Also, the results are presented and discussed, including contributions to theory and practice.
The proposed process, which combines a problem structuring method with a multi-criteria approach for evaluating IT project success, offers several theoretical implications. First, it advances the conceptualization of project success by integrating both subjective stakeholder perspectives and objective performance criteria, addressing the multidimensional and context-dependent nature of success in IT projects. Second, it contributes to decision theory and the project management literature by demonstrating how problem structuring methods, typically underutilized in IT evaluation, can enhance the clarity and relevance of criteria selection and prioritization. Third, the integration of these methodologies provides a foundation for developing more robust, transparent, and adaptable evaluation frameworks, which can inform future theoretical models and empirical studies. Ultimately, this research supports the movement toward standardization by offering a replicable and theoretically grounded process that can be refined and generalized across different organizational and project contexts.
The remainder of this paper is organised as follows. Section 2 briefly reviews previous related work on project evaluation methods, cases, and multi-criteria evaluation methods. Section 3 describes the case context and the development of the success evaluation model using a process that combines a problem structuring model with a multi-criteria decision analysis approach. Section 4 discusses the results obtained. Finally, Section 5 presents the conclusions and avenues for further work.
[…] weights of several stakeholders without a discussion obliterates their individual differences [26]. Additionally, the "importance of the criteria" should consider their respective performance ranges; otherwise, the resulting weights would be arbitrary [27].
Basar [28] proposes a methodology to evaluate the performance of IT projects in a fuzzy environment. She first identifies the evaluation criteria using the balanced scorecard method. Second, she determines the criteria weights with expert judgments and hesitant fuzzy weights. Then, the weights are used to evaluate the performance of IT projects in a Turkish company. The weighting process described in this paper is difficult for a non-expert evaluator to understand. Additionally, the quantitative performances of projects on the criteria are systematically normalised to scores between 0 and 1 with a linear transformation that may not correspond to the preferences of evaluators (which may be non-linear). The paper does not explain how to address the evaluation of the qualitative criteria.
Ismail [29] applies the Delphi method and conducts a seminar with experts to identify a construction project's potential evaluation criteria and group them into clusters. A relative importance index is calculated for each criterion with a weighted average of the responses to a survey expressed on a Likert scale. In a subsequent step, the experts 1) reduced the number of clusters and criteria and 2) assigned the same weight to the latter. Then, a priority index was calculated for each criterion with the Priority Evaluation Model (PEM) [30], which combines the "satisfaction" rate (assigned by the experts) and the "importance" of the criterion. The overall project success is obtained with a weighted sum of the averages of the priority indexes obtained on each cluster and the clusters' weights. However, the paper does not explain how these weights were assessed. Additionally, Likert scale classifications cannot be used for calculating averages or other arithmetic calculations.
Nguvulu et al. [31] use a Deep Belief Network (DBN) to evaluate eight
IT projects performances after training the DBN with five projects of 12
2. Previous related work months duration. The DPN automatically assigned weights and scores to
the criteria, considering possible interactions between them. The au­
2.1. Success of projects thors stress the advantage of this approach by not considering human
subjectivity. However, from our point of view, this is a weakness
Evaluation can be defined as the assessment and analysis of the ef­ because the subjective preferences of project managers, clients, and
ficiency and effectiveness of the projects activities and results. The other stakeholders should be considered in an evaluation process to
evaluation looks at what is planned to do, what has been achieved, and avoid arbitrary results generated by inadequate analytical approaches.
how it has been achieved [15]. Kahan and Goodstadt [16] conceive Wohlin and Andrews [32] apply principal component analysis and
evaluation as a set of questions and methods properly articulated to subjective evaluation factors to estimate which projects are successful or
review processes, activities, and strategies to achieve better results. unsuccessful out of a set of projects. This statistical approach may be
Therefore, the purpose of an evaluation is not just to find out what used to identify key project characteristics, but it does not allow for
happened but to use that information to make the project better [17,18]. evaluating the projects success according to stakeholders preferences.
There are several evaluation approaches in the literature, some Yan [33] suggests the combined use of the balanced scorecard (BSC)
considerably complex regarding their practical operationalisation and [34], the Analytic Hierarchy Process (AHP), and the Fuzzy Comprehensive
use. Varajão et al. [10] present a comprehensive review of models and Analysis method (FCA), respectively, to construct a performance criteria
methods for evaluating information systems project success. Some ex­ system, assess the criteria weights, and obtain an overall evaluation
amples are described and analysed next. score. The author explains how to obtain the performance criteria sys­
Bannerman and Thorogood [19] propose a framework for defining IT tem, but does not explain the weighting and scoring components.
project success that provides a common language for communication Yang et al. [35] apply a multi-criteria model for evaluating a soft­
and compares what stakeholders perceive as important. The authors list ware development projects success using the Analytical Network Process
the criteria that should be used to assess the success of a project within (ANP) [36] to assess the criteria weights at several hierarchical levels.
five domains (process, project management, product, business, and The scores of a project on a given criterion were obtained by calculating
strategy). However, they do not explain how to consider these domains the average of the scores assigned by five experts using a 5-point Likert
and criteria together. scale. Note that, as mentioned above, averages should not be calculated
Barclay and Osei-Bryson [20] describe a structured framework with ordinal scales. In addition, ANP is based on AHP, a method with
named Project Objectives Measurement Model (POMM) to identify the known issues that affect the validity of the criteria weights (see, e.g.,
criteria for evaluating an information system (IS) project and assigning a [3739]).
performance measure to each criterion. POMM applies value-focused Section 2.2 reviews important concepts and methods related to
thinking principles [21] and goal question metric methods [22]. An multi-criteria evaluation that are needed to create a proper value mea­
illustrative case is presented in which the importance of each criterion is surement model [40,41] to assess the success of a project.
directly assessed using an average of the stakeholders answers based on
a 5-point Likert scale. However, despite its virtues, this operation is 2.2. Multi-criteria evaluation
neither quantitatively nor substantively meaningful [23], respectively,
because a Likert scale is an ordinal scale [24,25] and averaging the In a multi-criteria value model, the measure of success of a project is
2
J.C. Lourenço and J. Varajão Computer Standards & Interfaces 97 (2026) 104122
given by the additive value function model:

    V(x1, x2, …, xn) = Σj=1..n wj vj(xj), with Σj=1..n wj = 1 and wj > 0, for all j    (1)

where V is the overall value score of the success of the project, wj is the weight of criterion j, vj(xj) is the value score on criterion j of the performance xj, and n represents the number of evaluation criteria.

Despite being straightforward in form, this model is often poorly applied. We highlight that the criteria weights wj are scaling constants [42], which represent trade-offs between criteria and not the erroneous notion of criteria's measures of importance [21]. In addition, vj is a measurable value function, which represents both a preference order between performances on criterion j and a strength-of-preference order on differences of performances [43]. Moreover, the model requires the criteria to be mutually preferentially independent [44], which entails special care during the model structuring phase.

There are some fundamental aspects to note regarding the desired properties for each evaluation criterion and also for the whole set of criteria [45]. Each criterion should be essential for the evaluation and controllable in the sense that the performance of the project influences the degree to which the criterion is satisfied, independently of other additional decisions. Also, a family of evaluation criteria should be: complete (the set of criteria should represent all of the relevant consequences of the project); nonredundant (the criteria should not repeat the same concerns); concise (the number of criteria should be kept to the necessary minimum to evaluate the project); specific (each criterion should be able to assess the consequences of the project, instead of being so broad that it compromises this purpose); and understandable (the evaluation criteria should be clear in the eyes of any interested individual).

Depending on the ability to use appropriate numerical principles and fluency to express oneself in words, an evaluator may prefer to apply a numerical method or a non-numerical one [46]. In light of this, the remainder of this section focuses on quantitative and qualitative techniques tailored for these two types of evaluators. Specifically, we delve into methods for criteria weighting and building a value scale for each criterion.

2.2.1. Weighting methods

A theoretically sound weighting method must consider the performance ranges defined by two fixed references on each criterion. Common references are, for example, the "worst" and the "best" performances [39] or "neutral" and "good" performances [47]. Below, we briefly describe two quantitative weighting procedures and one qualitative.

Keeney and Raiffa [48] developed the trade-off procedure, which is a numerical method that requires establishing indifferences between two fictitious projects using two criteria at a time. After establishing n - 1 indifference relationships for the n criteria, a system of equations is solved, including one equation in which the sum of the weights equals 1, to obtain the criteria weights.

Edwards and Barron [49] created the swing weighting method, which is a numerical method that involves measuring the relative importance of the improvements (swings) that can be achieved on the criteria, considering a change from the "worst" to the "best" performance on each of them.

Bana e Costa and Vansnick [50] developed MACBETH [51] to weight the criteria. This procedure requires ranking the worst-best swings and judging them using the qualitative scale of difference in attractiveness: no (difference), very weak, weak, moderate, strong, very strong, or extreme. This qualitative scale is also used to judge the difference in attractiveness between two swings at a time. The elicited judgments are used to fill in the upper triangular part of a matrix in the software tool M-MACBETH, which validates each judgment's consistency with those previously inputted (see [52], pp. 425-443). Then, the software tool generates a proposal of weights compatible with the inputted qualitative judgments by solving the linear programming problem described in Bana e Costa et al. [52]. The evaluators should validate the proposed weighting scale and adjust it if needed.

2.2.2. Methods to build value scales

We must assign fixed scores to the previously defined references to build a criterion value scale. For example, we may assign 100 and 0 value units to the "best" and the "worst" performances in each criterion, respectively, although two other scores could be used so that the highest score is assigned to the most preferred reference. This arbitrary assignment of scores leads to obtaining interval value scales [25]. Additionally, the score of a project on a given criterion should consider the preferences expressed by the evaluators upon performance ranges within the criterion [43] (e.g., the difference in value between performances A and B is worth twice the difference between C and D). Hereinafter, we present two numerical scoring methods and a qualitative one.

Edwards [53] presents the direct rating method. This numerical procedure first requires evaluators to rank the project performances in order of decreasing attractiveness. The highest score (100 units) is assigned to the "best" performance and the lowest score (0 units) to the "worst". Intermediate scores are assigned to other performance levels considering the intensities of preferences between each two of them, knowing that the difference between the "best" and "worst" is worth 100 value units. This method allows scoring a project directly or indirectly using a performance measure (e.g., quantitative continuous, quantitative discrete, or qualitative).

von Winterfeldt and Edwards [54] describe the bisection method, also known as the mid-value splitting technique [55], to create a value scale for a criterion. This numerical method assigns the highest score to the "best" performance (100) on the criterion and the lowest score (zero) to the "worst". Then, it is asked which performance p has a value equally distant from the "best" and the "worst" performances, which means that the ranges "p to best" and "p to worst" have the same strength-of-preference. Therefore, the performance p would get a midpoint score of 50. Similar midpoint questions are asked to identify other points that can be used to form a piecewise linear value function or a curve. This method allows the creation of value functions upon a quantitative and continuous performance measure on the criterion.

Bana e Costa and Vansnick [50] developed MACBETH [51] to create a value scale for a criterion (and to weight criteria, as described in the preceding section). Still, contrary to the above-mentioned methods, it needs only to elicit qualitative judgments. An evaluator judges the difference in attractiveness between two performances at a time, using the qualitative scale presented in the previous section, and inputs them into the software tool M-MACBETH. This tool verifies the consistency of the inputted judgments and generates a proposal of a value scale compatible with them and with the scores assigned to the reference performances "best" and "worst" (or "good" and "neutral") [52]. In the final step, the evaluator must validate and adjust the proposed value scale if needed. As in direct rating, this method allows scoring a project directly or indirectly using any performance measure.

2.3. Review summary

In the project success literature reviewed, most papers address the identification of IT criteria (e.g., Lobato et al. [4] and Assalaarachchi et al. [56]) or success factors (e.g., Pinheiro et al. [57] and Jayakody and Wijayanayake [58]), but only a few present an evaluation approach. In addition, the evaluation methods identified suffer from one or more theoretical errors (e.g., weights used as indicators of importance, averages calculated with ordinal scales, application of techniques with known flaws, and normalisation procedures that do not consider non-linear preferences). Furthermore, as far as we know, there is no description of a formal process that may guide the evaluators from beginning to end, i.e., from identifying the evaluation criteria until
reaching an overall measure of project success. Therefore, a gap in the IT project literature needs to be addressed, which will be done by applying multi-criteria evaluation principles.

Given the characteristics of the evaluators, the simplicity of use of the MACBETH method and its software tool M-MACBETH, including its ability to validate the consistency of the value judgments expressed by evaluators and to work with any performance measure (be it qualitative or quantitative, continuous or discrete), this was the approach selected to weight the criteria and build a value function for each criterion in the real-world case described in this paper.

3. Model development

3.1. Research setting

GlobalSysMakers develops solutions in four business areas: mobility solutions, industrial technology, consumer goods, and energy and building technology. It has several divisions, including automobile multimedia, automobile accessories, electric tools, heating and hot water, and home appliances. It employs roughly 410,000 associates worldwide, has about 440 subsidiaries and regional companies in 60 countries, and employs nearly 70,000 associates in research and development at 125 locations.

The target project, here identified as PROJRD, was part of an R&D program that had the participation of GlobalSysMakers and a university. The project had as its primary goal the development of a software tool to automate the assessment of printed circuit boards (PCBs) design. PCBs are essentially boards that connect electronic components used in all (but the simplest) electronic products, such as household appliances or vehicles. In addition to the software tool, the project deliverables included technical specifications, prototypes, and presentations.

The software development process adopted was based on a hybrid/agile methodology supported by SCRUM [59]. Agile methods for software development have been increasingly used in the IT sector [60] and are now mainstream [61]. In this project, agility enabled greater adaptability of the development phases according to the company's needs and requirements, which evolved along with the project lifecycle. Thus, it was possible to deal with changes in the requirements that were reflected in the final deliverables during the project development. In a later phase of the project, SCRUM was coupled with a waterfall process since the objectives stabilised without needing a periodic update. The project team was multidisciplinary, incorporating engineers from GlobalSysMakers (TEAMGSM) and researchers from the university (TEAMUNI). Together, the teams (TEAMGSM and TEAMUNI) had electronics, software engineering, and project management skills.

On average, the team allocated 1040 h per month to the project (approximately 6.5 Full-Time Equivalent), distributed by the different tasks of the project and according to the functions performed by each element (three of the team members were not full-time in the project). The project had a duration of 36 months.

The project's overall success was first assessed using a simple grid scoring model built by non-specialists in evaluation, which directly scored the project on several criteria and assigned importance weights. However, the project management team felt the need for a more advanced model to improve confidence in the evaluation. More in-depth research on multi-criteria evaluation revealed some misinterpretations in that process, which ultimately led to the development of a new model in line with decision analysis principles. This paper describes the new evaluation model.

3.2. Development tasks

The model development process started by asking the project manager to identify the members who should form the decision-making group [62], i.e., the group in charge of developing the model to evaluate the project's success. It was recommended to select members with different roles in the project; all of them were somehow interested in the project's outcomes. The group had three members: two from TEAMGSM and TEAMUNI, and one external consultant. The team members were selected considering their managerial responsibilities and to ensure representativeness of all the involved parties. All the members agreed to be involved in the model development tasks. Note that larger groups require different group processes, typically having separate meetings with stakeholders of different areas of interest to develop parts of the model, and with merge meetings gathering higher-level representatives of the client to validate the work done by the stakeholders and to finish the overall model [63].

Fig. 1 depicts the model development tasks. The first task involves identifying the aspects of interest for evaluating the project's success ("problem structuring", described in Section 3.3). This is a critical task because it is not possible to develop a proper evaluation model without understanding the problem, which is the reason why several publications have been devoted to identifying the fundamental evaluation concerns to be addressed (e.g., [28,64]). Second, all the relevant evaluation criteria should be included in the model, and a descriptor of performance should be identified for each of them, enabling the assessment of the extent to which each criterion is met ("model structuring", Section 3.4). Third, the evaluation component of the model must be built ("value model building", Section 3.5), which includes the construction of a value function for each criterion to transform the performances of the project into value scores (Section 3.5.1), and weighting the criteria to depict their trade-offs (Section 3.5.2). Last, the evaluation model should be tested for adequacy and consistency (Section 4.1).

Fig. 1. Model development tasks.

3.3. Problem structuring

The problem structuring task aims to identify the fundamental objectives [45] that determine the project's success from the client's perspective. Such objectives are essential reasons for the project's success. Therefore, they should be used as criteria in the evaluation model. However, the identification of these objectives in ill-structured problems may not be easy, which is why we opted to apply a problem structuring method (PSM) known as group map [65], which can be used in combination with a multi-criteria decision analysis approach [66].

To begin structuring the problem, the decision-making group was asked to say which aspects or concerns were relevant to evaluate the project's success. Then, for each of the concerns expressed, it was asked, "Why is that important?" or "What would be the consequences of doing that?", which allowed us to identify other aspects.

Fig. 2 depicts the complete group causal map built with the answers
of the elements of the group using the software tool "Decision Explorer" (from Banxia Software Ltd., https://banxia.com/dexplore), which automatically numbered the concerns for identification purposes. This map results from several iterations, adding some aspects and removing others. Note that a specific concern may be expressed by one statement (e.g., "(33) good requirements definition") or by two statements separated by an ellipsis, which depicts a positive pole and a negative one to clarify the meaning of the concern (e.g., "(15) time fulfilment… time exceeded"). An arrow between two concerns indicates the direction of causality. When an arrow points to a concern with two poles, it means that the concern affected is the one at the positive pole (e.g., a "(29) good contract management" contributes to the positive pole of "(1) cost fulfilment… cost exceeded"; in the reverse case, the arrow would have a negative sign near its head).

Fig. 2. Group map.

In Fig. 2, it is possible to identify chains of means-ends objectives. For example, an "(31) effective change management" contributes to the "(36) deliverables use", which in turn allows to "(41) reduce users' repetitive work", which contributes to "(39) increase users' satisfaction". Although the "(41) reduce users' repetitive work" is a means-objective to the end-objective "(39) increase users' satisfaction", the group considered the former a fundamental objective because it is important in itself and not because of its contribution to the latter. Therefore, "(41) reduce users' repetitive work" will be used as an evaluation criterion. Objective "(39) increase users' satisfaction" was considered too broad to evaluate the project's success and thus will not be used.

3.4. Model structuring

3.4.1. Evaluation criteria

Fig. 3 depicts the seven evaluation criteria that emerged from the concerns highlighted in bold in the group causal map developed in the problem structuring task.

Fig. 3. Project's success evaluation criteria.

The concerns represented by these criteria are as follows:

• Scope/quality fulfilment (ScoQual): the extent to which the planned (functional and non-functional) requirements were fulfilled (this criterion resulted from concern 14 in Fig. 2).

The prime deliverable of the project is a software tool to support the PCBs design assessment, the other deliverables being subsidiary to this tool. In the end, if the software tool does not comply with a minimum set of planned requirements, it will not be able to assess the PCBs design and will compromise the investment objectives.

• Cost fulfilment (Cost): the extent to which the planned cost was fulfilled (this criterion resulted from concern 1 in Fig. 2).

The budget defined for the project needs to be carefully managed due to being financed by an external R&D entity with a very narrow margin of deviation.
• Time fulfilment (Time): the extent to which the planned time was fulfilled (this criterion resulted from concern 15 in Fig. 2).

Since this project is part of a large program, time fulfilment is a significant management aspect because all the program's projects must be finished simultaneously due to the program's constraints. In other words, not meeting the deadline in this project would mean completing it in whatever form it is in when the program reaches its end, complying or not with the scope, and delivering or not what was planned.

• Increase of the number and type of errors identified in each verification cycle (IncNoType): the extent to which the number and type of errors identified in each PCBs verification cycle increase (this criterion resulted from concern 43 in Fig. 2).

Before the project was implemented in the company, the PCB designs had been checked mainly in a semi-automatic way by specialised engineers. Due to the many PCB components, details, and rules to review, it was virtually impossible to check all of the required features. The consequence was the late detection of some errors in more advanced stages of the projects, or, in other words, in later verification cycles. This accounts for the importance of the new software tool to increase the number and type of errors identified early on in each verification cycle, thereby reducing the design costs.

• Reduction of the number of verification cycles (RNVC): the extent to which the number of verification cycles is reduced (this criterion resulted from concern 37 in Fig. 2).

A PCB typically needs to go through several verification cycles until it is free from errors and ready for production. When errors are detected in a verification cycle, the PCB design needs to be corrected and tested again, possibly requiring a new verification cycle. Each verification cycle of a PCB design implies high costs. Furthermore, there is the risk of detecting errors only at the production stage, with even more severe consequences. A primary expected result of the new software tool is to reduce the number of verification cycles by enabling the early detection of errors.

• Improve efficiency (ImpEff): the extent to which the number of verified rules increases in each verification cycle without increasing the involved human resources (this criterion resulted from concern 42 in Fig. 2).

Since the process for verifying the PCBs design rules is semi-automatic, with a substantial part of manual labour, the current number of specialised engineers can only check some of the relevant aspects. With the new software tool, it is expected that the same number of engineers can check a greater number of design rules, not spending more time doing it.

• Reduction of the repetitive work of the users (RRWU): the extent to which the number of rules manually verified is reduced in each verification cycle (this criterion resulted from concern 41 in Fig. 2).

In the semi-automatic verification of PCBs design rules, manual labour is repetitive and prone to errors due to the fatigue of specialists. Automating most of the rules assessment is expected to reduce the repetitive work of these specialists and free them to perform other tasks.

3.4.2. Descriptors of performance

In this task, we associate a descriptor of performance with each evaluation criterion to measure how much the project satisfies the criterion. According to Keeney [45], a descriptor should be unambiguous (to describe the performances on the associated criterion clearly), comprehensive (to cover the range of possible performances on the criterion), direct (the descriptor levels should directly describe the performances on the corresponding criterion), operational (the information concerning the performances of the project can be obtained and value judgments can be made), and understandable (performances and value judgments made using the descriptor can be clearly understood and communicated).

Table 1 presents the list of all the descriptors created to measure the performance of the project, as well as two reference performance levels, "neutral" and "good", for each of them. Note that the definition of two reference performance levels is required to weight the criteria, allowing comparisons between criteria preference ranges and defining two fixed anchors for the value scales (see Section 2.2). Furthermore, the use of a "neutral" performance level (which corresponds to a performance that is neither positive nor negative on the criterion) and of a "good" performance level (which corresponds to a very positive performance on the criterion) increases the understandability of the criterion; these references are thus preferable to the "worst" and the "best" references used as examples in Section 2.2.

As shown in Table 1, the criteria scope/quality fulfilment and increase in the number and type of errors identified in each verification cycle do not have direct descriptors of performance. For these criteria, constructed descriptors were developed combining the characteristics inherent to those criteria, as explained next (Bana e Costa et al. [67] describe a detailed procedure for creating constructed descriptors).

To measure the performance of the project on the scope/quality fulfilment criterion, several requirements that deliver different contributions to the project's success were considered, following the MoSCoW method principles [68]. These requirements were classified into three types ("must have", "important to have", and "nice to have") and combined to obtain the performance levels of the descriptor presented in Table 2.

To measure the performance of the project on the increase of the number and type of errors identified in each verification cycle criterion, several combinations of the number and type of errors identified at each verification cycle (based on a past project) need to be considered (see Table 3). For example, a "5 % increase in the number of identified errors" and a "10 % increase in the type of identified errors" is a performance depicted as level "E5 T10". A verification cycle includes a series of tests to check for errors in the PCBs design or if it is ready for production (free from errors).

We note that the indicators used in the constructed scales presented in Tables 2 and 3 cannot be considered in isolation, as they are mutually preferentially dependent. For example, in Table 3, an increase of 10 % in

Table 1
Descriptors of performance.

Criterion | Descriptor | Neutral | Good
Scope/quality fulfilment (ScoQual) | Constructed descriptor (see Table 2) | L3 | L2
Cost fulfilment (Cost) | Cost of the project (k€) | Planned cost (k€ 500) | 95 % of the planned cost (k€ 450)
Time fulfilment (Time) | Project duration (weeks) | Planned time (96 weeks) | 95 % of the planned time (90 weeks)
Increase in the number and type of errors identified in each verification cycle (IncNoType) | Constructed descriptor (see Table 3) | E5 T0 | E10 T5
Reduction of the number of verification cycles (RNVC) | Number of verification cycles decreased | 1 cycle | 2 cycles
Improve efficiency (ImpEff) | Number of verified rules increased (%) | 0 % | 40 %
Reduction of the repetitive work of the users (RRWU) | Number of rules manually verified reduced (%) | 0 % | 10 %
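To make the mechanics concrete, the additive value function model of Eq. (1) can be sketched in a few lines of Python. The weights and the linear value functions below are illustrative assumptions only (the actual model elicits weights and possibly non-linear value scales with MACBETH); the "neutral" and "good" references follow Table 1, with the conventional fixed scores neutral = 0 and good = 100.

```python
# Sketch of Eq. (1): V = sum_j w_j * v_j(x_j), with the weights summing to 1.
# Value functions are LINEAR through the Table 1 anchors (neutral -> 0,
# good -> 100) purely for illustration; the weights are made up for this example.

def anchored_linear(neutral, good):
    """Value function with v(neutral) = 0 and v(good) = 100."""
    return lambda x: 100.0 * (x - neutral) / (good - neutral)

value_functions = {
    "Cost": anchored_linear(neutral=500, good=450),   # project cost, k-euro
    "Time": anchored_linear(neutral=96, good=90),     # duration, weeks
    "ImpEff": anchored_linear(neutral=0, good=40),    # % of extra rules verified
}
weights = {"Cost": 0.4, "Time": 0.35, "ImpEff": 0.25}  # hypothetical trade-offs
assert abs(sum(weights.values()) - 1.0) < 1e-9         # Eq. (1) constraint

performance = {"Cost": 475, "Time": 93, "ImpEff": 20}  # hypothetical project
V = sum(weights[c] * value_functions[c](performance[c]) for c in weights)
print(round(V, 1))  # -> 50.0
```

Note that a performance better than "good" legitimately scores above 100 (e.g., ImpEff at 60 % scores 150), which is consistent with the interval-scale reading of the anchored scores discussed in Section 2.2.2.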
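As a complement to the MACBETH description in Section 2.2.2, the following toy sketch shows how ordered difference-of-attractiveness judgments can be turned into a proposal of an interval value scale anchored at neutral = 0 and good = 100. It is not the MACBETH linear program (M-MACBETH solves an LP and checks judgment consistency); mapping the qualitative categories to the integers 1 to 6 is an arbitrary, order-preserving assumption, and the resulting proposal would still have to be validated and adjusted by the evaluators.

```python
# Toy proposal of a value scale from qualitative judgments (NOT the
# M-MACBETH linear program). Category strengths 1..6 are an arbitrary
# order-preserving coding of the qualitative scale from Section 2.2.1.
strength = {"very weak": 1, "weak": 2, "moderate": 3,
            "strong": 4, "very strong": 5, "extreme": 6}

# Judged differences between levels of a criterion, read "L1 over L2 is
# weak"; the anchors are L2 = good (100) and L3 = neutral (0).
judgments = [("L1", "L2", "weak"), ("L2", "L3", "moderate")]

unit = 100.0 / strength["moderate"]   # the good-neutral range is worth 100 units
v = {"L3": 0.0, "L2": 100.0}          # fixed anchor scores
for hi, lo, cat in judgments:
    if hi not in v:                   # propagate judged differences upward
        v[hi] = v[lo] + unit * strength[cat]

print({k: round(x, 1) for k, x in sorted(v.items())})
# -> {'L1': 166.7, 'L2': 100.0, 'L3': 0.0}
```

With these inputs the proposal scores L1 at about 166.7 units; in the validated scale of the case study the evaluators settled on a smaller L1-L2 difference of 65 value units, which illustrates why the final validation and adjustment step matters.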
Table 2
Scale for "scope/quality fulfilment" criterion.
Level | The project…
L1 | …satisfied all the requirements "must have" and "important to have" and most of the "nice to have"
L2 = Good | …satisfied all the requirements "must have" and at least 85 % of the "important to have" and at least 20 % of the "nice to have" (or an equivalent performance on the requirements "important to have" and "nice to have")
L3 = Neutral | …satisfied all the requirements "must have" and at least 60 % of the "important to have" and at least 20 % of the "nice to have" (or an equivalent performance on the requirements "important to have" and "nice to have")
L4 | …did not satisfy one requirement "must have", or satisfied less than 60 % of the requirements "important to have"
L5 | …did not satisfy more than one requirement "must have"

Table 3
Constructed scale for "increase of the number and type of errors identified in each verification cycle" criterion.
Increase in the number of identified errors (E) | Increase in the type of identified errors (T) | Level
10 % | 10 % | E10 T10
10 % | 5 % | E10 T5 = Good
10 % | 0 % | E10 T0
5 % | 10 % | E5 T10
5 % | 5 % | E5 T5
5 % | 0 % | E5 T0 = Neutral
0 % | 0 % | E0 T0

…the number of identified errors (E) is valued more highly when the percentage increase in the type of identified errors (T) is greater. Otherwise, the number and the type of identified errors could have been used as indicators for two separate evaluation criteria.
After the seven criteria had been clearly identified and their descriptors of performance established, the decision-making group was asked whether there was any additional aspect that might be considered in assessing the project's success. The negative response indicated that this set of criteria was exhaustive and, consequently, that the value tree presented in Fig. 3 could be considered complete.

3.5. Value model building

3.5.1. Value functions
As previously described, a descriptor of performance provides a way of measuring the project's performance on its associated criterion. However, to build a value model, we also need to obtain the value of each plausible performance of the project (in the form of a value scale or value function), which requires knowing the preferences of the evaluators upon differences in performances on the corresponding criterion.
For that purpose, we applied the MACBETH method [51]. As described in Section 2.2, the questioning procedure of MACBETH requires the evaluators to answer questions of difference in attractiveness between two performance levels at each time, using the qualitative scale: no (difference in attractiveness), very weak, weak, moderate, strong, very strong, and extreme. The answers provided are used for filling in a matrix of judgments in the M-MACBETH software tool, which analyses the consistency of the answers as soon as they are inserted, and then generates (by linear programming) a proposal of a value scale which is compatible with the answers provided, given the fixed value scores assigned to the "neutral" and the "good" performances (0 and 100 value units, respectively).
We present two examples of applying the MACBETH method to build value functions for criteria with different descriptors of performance: the scope/quality fulfilment criterion with a discrete descriptor, and the time fulfilment criterion with a continuous descriptor.
Fig. 4 presents the matrix of judgments for the scope/quality fulfilment criterion. Table 2 shows the constructed descriptor for this criterion, where: L1 means "the project satisfied all the requirements must have and important to have and the majority of the nice to have", L2 means "the project satisfied all the requirements must have and at least 85 % of the important to have and at least 20 % of the nice to have (or an equivalent performance)", and L3 means "the project satisfied all the requirements must have and at least 60 % of the important to have and at least 20 % of the nice to have (or an equivalent performance)". We can see in Fig. 4 that the difference in attractiveness between "L1" and "L2 = Good" was deemed weak by the evaluators, whereas the difference in attractiveness between "L2 = Good" and "L3 = Neutral" was considered moderate. Therefore, the difference in value between "L1" and "L2 = Good" should be lower than the difference between "L2 = Good" and "L3 = Neutral", which can be confirmed in the value scale presented in Fig. 6a, where the former difference corresponds to 65 value units and the latter to 100.
The time fulfilment criterion has the descriptor of performance "project duration (in weeks)" with the references "96 weeks = Neutral" and "90 weeks = Good". To build a value function for this criterion, first, we created three more equally spaced performance levels: one worse than "neutral" (99 weeks), one between "neutral" and "good" (93 weeks), and one better than "good" (87 weeks). Then, the evaluators judged the differences in attractiveness between each two of these levels, together with the "neutral" and the "good" levels, resulting in the matrix of judgments presented in Fig. 5.
Looking at the diagonal (above the grey shaded cells) of the matrix in Fig. 5, we see that the intensities of the differences in attractiveness between each two consecutive levels increase more when the number of weeks exceeds 93 weeks: the evaluators considered weak the differences in attractiveness between "87" and "90 = Good" (and also between "90 = Good" and "93"), whereas they considered moderate the difference in attractiveness between "93" and "96 = Neutral", and very strong the difference between "96 = Neutral" and "99". Therefore, the difference in value between "87" and "90 = Good" (and also between "90 = Good" and "93") should be lower than the difference in value between "93" and "96 = Neutral", and the latter should also be lower than the difference in value between "96 = Neutral" and "99", which can be confirmed in the value function presented in Fig. 6c (each of the first two intervals corresponds to 40 value units, whereas the third and fourth equal 60 and 160 value units, respectively). Therefore, this function shows that the evaluators considered that increments in time after 93 weeks are increasingly penalizing for the project's success.
We emphasize that the decision group made these judgments for each criterion independently of the performance levels or the differences in attractiveness on the remaining criteria, thereby supporting the assumption of mutual preferential independence between criteria.
Fig. 6 (6a–6g) presents the value functions of all the evaluation criteria.

3.5.2. Criteria weighting
Weighting requires establishing trade-offs between criteria, which is typically demanding because it implies comparing performance improvements on different criteria. The improvements (swings) are defined between the two predefined performance references, "neutral" and "good", in each criterion.
According to the MACBETH weighting procedure, the first step was to rank the "neutral–good" swings in order of decreasing preference (Fig. 7). The evaluators considered the swing from "1 to 2 verification cycles decreased" as the most important one (1st in Fig. 7), which implies that the criterion "reduction of the number of verification cycles (RNVC)" will have the highest weight. In contrast, the criterion "reduction of repetitive work of the users (RRWU)" will obtain the lowest weight because it has the least important "neutral–good" swing
Fig. 4. MACBETH judgment matrix for the “Scope/quality fulfilment” criterion.
Fig. 5. MACBETH judgment matrix for the “time fulfilment” criterion.
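The value function of Fig. 6c can be sketched from the figures stated in the text: 90 weeks is fixed at 100 ("good"), 96 weeks at 0 ("neutral"), and the four interval differences are 40, 40, 60, and 160 value units, giving breakpoint values 140, 100, 60, 0, and −160. A minimal Python sketch follows; linear interpolation between the judged levels, the clamping outside them, and the helper name `time_value` are illustrative assumptions, not part of the published model:

```python
# Piecewise-linear value function for "time fulfilment" (cf. Fig. 6c).
# Breakpoint values follow from the stated interval differences:
# v(87)=140, v(90)=100 (good), v(93)=60, v(96)=0 (neutral), v(99)=-160.
BREAKPOINTS = [(87, 140), (90, 100), (93, 60), (96, 0), (99, -160)]

def time_value(weeks):
    """Interpolate the value score for a project duration in weeks."""
    if weeks <= BREAKPOINTS[0][0]:
        return BREAKPOINTS[0][1]
    for (x0, v0), (x1, v1) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if weeks <= x1:  # linear interpolation inside this interval
            return v0 + (v1 - v0) * (weeks - x0) / (x1 - x0)
    return BREAKPOINTS[-1][1]

print(time_value(90), time_value(96), time_value(97.5))  # 100.0 0.0 -80.0
```

The steep last segment reproduces the evaluators' view that delays beyond 93 weeks are increasingly penalizing; how durations beyond 99 weeks would be scored is not stated in the text.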
(7th in Fig. 7).
In the second step, the improvements provided by the criteria swings were judged qualitatively using the MACBETH semantic scale (Fig. 8), which allowed filling in the rightmost column in Fig. 9. For example, the improvement provided by the most important swing [RNVC] was considered extreme, whereas the least important "neutral–good" swing [RRWU] was judged weak.
Then, the differences in attractiveness between each two "neutral–good" swings were assessed to fill in the remaining cells of the first row of the weighting matrix and the diagonal above the shaded cells in Fig. 9. For example, Fig. 10 depicts the comparison of the "neutral–good" swings in the reduction of the number of verification cycles (RNVC) criterion and in the increase in the number and type of errors identified in each verification cycle (IncNoType) criterion, which was deemed as very strong (v. strong in Fig. 9). The other cells with no judgments were filled in automatically (by transitiveness) with "P" (positive) judgments by M-MACBETH.
Finally, the software tool applied the linear programming model described in Bana e Costa et al. [51] to generate a proposal of a weighting scale consistent with the qualitative judgments expressed in the weighting matrix, which were subsequently validated by the evaluators (with some minor adjustments), resulting in the weights presented in Fig. 11.

4. Results and discussion

4.1. Model testing and results
At this point, the actual performances of the project are already known for most of the criteria, but not for the reduction of the number of verification cycles (RNVC) criterion, which will only be identified in the long term. Therefore, three alternative scenarios were created with hypothetical future performances on RNVC: no reduction at all (PCB no red of cycles), a decrease of one verification cycle (PCB red 1 cycle), and a decrease of two verification cycles (PCB red 2 cycles). The performances of these scenarios are shown in Table 4.
Applying the value functions previously defined for each criterion to the performances presented in Table 4, we obtain the partial and the overall value scores of the three scenarios shown in Table 5, using the previously assessed criteria weights.
As seen in Table 5, the most advantageous scenario corresponds to "PCB red 2 cycles" with 94.60 overall value units, followed by "PCB red 1 cycle" with 49.60, and "PCB no red of cycles" with −6.65.
Scenarios "PCB red 2 cycles" and "PCB red 1 cycle" undoubtedly denote a successful project independently of the weights assigned to the criteria, because their performances are not worse than "neutral" in any of the criteria and are better than it in several criteria. Therefore, both scenarios dominate [69] a "neutral project". Additionally, we may see that scenario "PCB red 2 cycles" has an overall score very close to that of a "good project" (100 units), whereas the value of scenario "PCB red 1 cycle" is almost mid-distance from a "neutral project" and a "good project".
However, it is not robust to say that the scenario "PCB no red of cycles" corresponds to an unsuccessful project, looking only at its overall value score. We must determine if its overall result will always be worse than that of a "neutral project" when in the face of the uncertainty defined for the model parameters (i.e., the value scores and criteria weights). In fact, the evaluators considered it plausible that: a) each criterion weight ($w_j$, $j = 1, \dots, 7$) may vary within an interval defined by the lower and upper limits ($\underline{w}_j \le w_j \le \overline{w}_j$, $j = 1, \dots, 7$) shown in Table 6; and b) the value scores of the scenario "PCB no red of cycles" may vary by plus or minus 5 value units (respectively denoted by $\underline{v}_j(y_j)$ and $\overline{v}_j(y_j)$, $j = 1, \dots, 7$) in all the criteria for which this scenario has a performance different from "neutral" and "good"; otherwise it will keep 0 and 100, respectively.
The linear programming (LP) problem (2) was then used to test whether a "neutral project" additively dominates [70] the scenario "PCB no red of cycles", which would require a negative $\max D$. The result $\max D = 9.575$ denotes that there is at least one combination of plausible scores and weights for which scenario "PCB no red of cycles" has a higher overall value than that of a "neutral project".
The worst possible overall value for scenario "PCB no red of cycles" was also calculated, with the LP problem (3), resulting in $\min D = -14.10$. Therefore, in the face of the uncertainty, the overall value score of scenario "PCB no red of cycles" may vary between −14.10 and 9.575.

$$\max D = \sum_{j=1}^{7} w_j \left[ \overline{v}_j(y_j) - v_j(\mathrm{neutral}_j) \right] \qquad (2)$$

Subject to:

$$\sum_{j=1}^{7} w_j = 1$$
$$\underline{w}_j \le w_j \le \overline{w}_j, \quad j = 1, \dots, 7$$

$$\min D = \sum_{j=1}^{7} w_j \left[ \underline{v}_j(y_j) - v_j(\mathrm{neutral}_j) \right] \qquad (3)$$
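Because $v_j(\mathrm{neutral}_j) = 0$, problems (2) and (3) amount to optimising $\sum_j w_j v_j(y_j)$ over the bounded weights, with the uncertain scores pushed to their upper (for the maximum) or lower (for the minimum) limits. The sketch below reproduces the reported bounds in plain Python: since the only coupling constraint is that the weights sum to one, starting every weight at its lower limit and assigning the leftover mass greedily to the best coefficients is equivalent to solving the LP for this particular structure (weight limits taken from Table 6; shifted scores from Table 5 for "PCB no red of cycles"):

```python
# Robustness test of problems (2)-(3): optimise sum_j w_j * v_j over the
# weight intervals of Table 6 with the weights summing to one.
# Scores of "PCB no red of cycles" (Table 5), shifted +/-5 units on every
# criterion whose performance differs from "neutral" (0) and "good" (100).
V_HI = [100, 45, 0, 120, -120, 155, 145]   # upper scores (for max D)
V_LO = [100, 35, 0, 110, -130, 145, 135]   # lower scores (for min D)
W_LO = [0.12, 0.05, 0.08, 0.19, 0.40, 0.03, 0.02]    # lower weight limits
W_HI = [0.18, 0.07, 0.10, 0.25, 0.45, 0.04, 0.025]   # upper weight limits

def optimise(values, maximise):
    """Greedy LP solution: start at the lower weight limits and give the
    leftover mass to the best (or worst) coefficients first."""
    w = list(W_LO)
    slack = 1.0 - sum(W_LO)
    for j in sorted(range(7), key=lambda j: values[j], reverse=maximise):
        add = min(slack, W_HI[j] - W_LO[j])
        w[j] += add
        slack -= add
    return sum(wj * vj for wj, vj in zip(w, values))

max_d = optimise(V_HI, maximise=True)    # problem (2)
min_d = optimise(V_LO, maximise=False)   # problem (3)
print(round(max_d, 3), round(min_d, 2))  # 9.575 -14.1
```

Both values match the paper's reported $\max D = 9.575$ and $\min D = -14.10$; a general-purpose LP solver would give the same result.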
Fig. 6. Value functions of criteria: (a) scope/quality fulfilment, (b) cost fulfilment, (c) time fulfilment, (d) increase in the number and type of errors identified in each
verification cycle, (e) reduction of the number of verification cycles, (f) improve efficiency, (g) reduction of the repetitive work of the users.
Fig. 7. Neutral–good swings ranking.
Fig. 8. Neutral–good swings weighting judgments.
Fig. 9. MACBETH weighting matrix (the "P" and "I" within the matrix respectively mean "positive difference in attractiveness" and "indifference").
subject to:
$$\sum_{j=1}^{7} w_j = 1$$
$$\underline{w}_j \le w_j \le \overline{w}_j, \quad j = 1, \dots, 7$$

After concluding the robustness analysis, the evaluation group revisited the model and considered that it could deal with all the plausible performances and adequately considered the value judgments of its members. Therefore, the model has a form and content sufficient to evaluate the project's success [71].

Fig. 10. Assessment of the difference in attractiveness between the "neutral–good" swings in RNVC and IncNoType.
Fig. 11. Criteria weights.

5. Discussion

The absence of a formal evaluation of project success results in the waste of relevant lessons that can be used to enhance project management practices [9,72]. This is a strong reason for implementing well-structured processes to evaluate project success.
Any evaluation process should start by identifying the success criteria according to the decision-makers' preferences and systems of values, which are inherently subjective. We underscore that an evaluation model has an objective component (factual data) and a subjective one (value judgments), which should be independently addressed. Therefore, subjectivity is a key component in an evaluation process, but it should not be confused with ambiguity, which should be avoided. That is why the success evaluation criteria should be carefully identified, and a measure of the performance of a project on each of those criteria must be operationalised. The "neutral" and "good" references of intrinsic value allow identifying the project's success level.
Throughout the development of the evaluation model, the members of the decision-making group were encouraged to engage in open discussion whenever differences of opinion arose. This approach enabled a better understanding of their points of view and helped the group reach an agreement on the way forward.
In the case described herein, the success of the project may depend on the future performance of the reduction of the number of verification cycles (RNVC) criterion. With "no reduction of verification cycles", the project may be unsuccessful, with −6.65 overall value units, caused by its low performance and corresponding negative score (−125 value units) on this criterion. However, as we have seen, given the uncertainty defined for the partial value scores and the criteria weights, this scenario is not guaranteed to correspond to a negative evaluation. In fact, its overall value may vary between −14.10 and 9.575 units.
With a "reduction of 1 verification cycle", the project would obtain 49.60 overall value units, which is nearly a mid-distance evaluation between a "good project" and a "neutral project". With a "reduction of 2 verification cycles", the project would obtain 94.60 overall value units, which is very close to that of a "good project".
Developing a transparent evaluation process, such as the one described here, will promote the decision-making group's understanding and acceptance of the results. The participation of the decision-makers in all of the process phases is a key element for this purpose, which will allow them to develop a sense of ownership of the model [63]. However, this is not a practice found in the literature related to evaluating project success, which offers an opportunity for improvement.
The proposed process, which integrates a problem structuring

Table 4
Performance profiles of the project's success for the three scenarios.
Scenario | ScoQual | Cost (k€) | Time (weeks) | IncNoType | RNVC | ImpEff (%) | RRWU (%)
PCB no red of cycles | L2 | 480 | 96 | E10 T10 | No decrease | 60 | 15
PCB red 1 cycle | L2 | 480 | 96 | E10 T10 | Decrease 1 cycle | 60 | 15
PCB red 2 cycles | L2 | 480 | 96 | E10 T10 | Decrease 2 cycles | 60 | 15

Table 5
Value scores of the project's success for the three scenarios.
Scenario | ScoQual (15 %) | Cost (5 %) | Time (8 %) | IncNoType (22 %) | RNVC (45 %) | ImpEff (3 %) | RRWU (2 %) | Overall value score
PCB no red of cycles | 100 | 40 | 0 | 115 | −125 | 150 | 140 | −6.65
PCB red 1 cycle | 100 | 40 | 0 | 115 | 0 | 150 | 140 | 49.60
PCB red 2 cycles | 100 | 40 | 0 | 115 | 100 | 150 | 140 | 94.60
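As a cross-check, each overall score in Table 5 is simply the weighted sum of the partial scores under the additive value model, $\sum_{j=1}^{7} w_j v_j$. A minimal sketch:

```python
# Additive value model behind Table 5: overall = sum_j weight_j * score_j.
WEIGHTS = [0.15, 0.05, 0.08, 0.22, 0.45, 0.03, 0.02]  # ScoQual..RRWU

SCENARIOS = {
    "PCB no red of cycles": [100, 40, 0, 115, -125, 150, 140],
    "PCB red 1 cycle":      [100, 40, 0, 115,    0, 150, 140],
    "PCB red 2 cycles":     [100, 40, 0, 115,  100, 150, 140],
}

overall = {name: round(sum(w * v for w, v in zip(WEIGHTS, scores)), 2)
           for name, scores in SCENARIOS.items()}
print(overall)
# {'PCB no red of cycles': -6.65, 'PCB red 1 cycle': 49.6, 'PCB red 2 cycles': 94.6}
```

The large negative RNVC score (−125), carrying the highest weight (45 %), is what pulls the "PCB no red of cycles" scenario below the "neutral project" reference of 0.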
Table 6
Plausible intervals for the criteria weights.
Criterion | ScoQual | Cost | Time | IncNoType | RNVC | ImpEff | RRWU
Index (j) | 1 | 2 | 3 | 4 | 5 | 6 | 7
Current weight ($w_j$) | 15 % | 5 % | 8 % | 22 % | 45 % | 3 % | 2 %
Upper limit ($\overline{w}_j$) | 18 % | 7 % | 10 % | 25 % | 45 % | 4 % | 2.5 %
Lower limit ($\underline{w}_j$) | 12 % | 5 % | 8 % | 19 % | 40 % | 3 % | 2 %

method with a multi-criteria decision analysis (MCDA) approach for evaluating the success of information technology (IT) projects, offers several significant theoretical contributions to the fields of project management, decision sciences, and IS. First, it advances the conceptual understanding of IT project success by addressing its inherently multidimensional and context-dependent nature. Traditional models often rely on narrow success criteria—such as time, cost, and scope—while this research introduces a more holistic and stakeholder-sensitive framework. By incorporating problem structuring methods, the process facilitates the elicitation and organization of the stakeholder perspectives, which are often overlooked or underrepresented in conventional evaluation models. This contributes to theory by emphasizing the social and interpretive dimensions of project success, aligning with contemporary views that success is not an objective outcome but a negotiated construct [73].
Second, the integration of MCDA techniques provides a rigorous and transparent mechanism for prioritizing and aggregating evaluation criteria, thereby enhancing the methodological robustness of success assessment. This methodological synthesis bridges a gap in the literature by demonstrating how qualitative insights from problem structuring can be systematically translated into quantitative decision models. Theoretically, this supports the development of hybrid evaluation frameworks that are both contextually grounded and analytically sound. Third, the application of the proposed process in a real-world case adds empirical depth to the theoretical model, offering evidence of its practical relevance and adaptability. This empirical grounding strengthens the external validity of the framework and encourages further theoretical exploration across different organizational and project contexts.
The MACBETH approach has been successfully employed, with different nuances and across various processes, to evaluate projects or decision alternatives in diverse problem settings and for a wide range of organizations [74]. The process described in this paper, which combines problem structuring with the MACBETH approach and robustness analysis, may also be applied in other contexts, subject to the necessary adjustments.
Our proposed process can also be scaled to the program or portfolio level, although this should be done with caution. In the case presented here, we applied an additive value function model, which is compensatory—meaning that poor performance on one criterion can be offset by good performance on others. However, this assumption may not always hold. In a program or portfolio context, for instance, if a key project performs poorly, that alone may render the entire program or portfolio unsuccessful, regardless of the performance of the remaining projects. In such cases, a mixed model should be adopted, combining classification rules to address the non-compensatory criteria with an additive component for the compensatory ones.
Moreover, the research highlights the absence of standardized approaches for evaluating IT project success, which has long been a limitation in both academic and professional domains. Standardization facilitates the dissemination of knowledge and enhances predictability, thereby minimizing uncertainty and reducing risk [75]. By proposing a replicable and adaptable process, the study lays the groundwork for the development of formalized evaluation standards. This has implications for theory-building, as it suggests a pathway toward unifying fragmented evaluation practices under a coherent, theoretically informed model. In doing so, it contributes to the ongoing discourse on standardization in project management and information systems evaluation, encouraging future research to refine, validate, and extend the proposed framework. Ultimately, this work not only enriches theoretical understanding but also provides a foundation for more consistent, transparent, and stakeholder-aligned evaluation practices in the IT project domain.

6. Conclusions

Evaluating the success of IT projects should be a mandatory project management activity. However, this is not observed in practice [11,72]. There are several contributions given by the process herein described, which can be easily adapted to other evaluation problems:

• It shows how a multi-criteria approach may be used to evaluate IT (software development) projects while avoiding committing critical mistakes.
• It offers a transparent process.
• It involves the decision-makers in all of the model development tasks.
• It identifies the fundamental objectives of decision-makers with the help of a problem structuring method, avoiding ending up solving the wrong problem [76].
• It allows establishing quantitative and substantively meaningful [23] trade-offs between criteria (i.e., mathematically valid and unambiguously understood).
• It allows the management of the project to focus on what matters for the project's success.
• It can be implemented to evaluate the success of other projects, in similar or different contexts.
• The use of descriptors of performance clarifies what is intended to be achieved in each criterion.
• It distinguishes performance from value, instead of directly attributing scores to the project, mixing these two components.
• And, it allows creating value scales adjusted to the preferences of evaluators, upon different types of performance (e.g., qualitative or quantitative, continuous or discrete).

Additionally, it enables the identification of alternative scenarios to deal with unknown future performances and to test the robustness of the conclusions considering uncertainties on the model parameters.
In the target organization, given the shortcomings recognised in a previous "grid scoring model", the multi-criteria evaluation model of the real-world case described in this paper was built during an advanced stage of the project's development. This late development can be considered a threat to internal validity regarding consistency and a limitation, since the evaluation model should be built during the planning phase of a project and revisited during the project development to be improved, if needed, or adjusted to possible changes to the project aim. Another threat, to external validity, should also be disclosed. Namely, concerning scalability, further research is needed to test if the proposed process can be scaled or adapted for different project sizes or types.
In future work, it would be interesting to create a process capable of dealing with all project phases, allowing the evaluation of its development and evolution at several milestones, from the project initiation until its termination. The process described in this paper may be extended to evaluate project success throughout the project lifecycle. This requires developing a model that includes both final and
intermediate objectives (criteria) for measuring project success. The intermediate objectives should be used during project development and later deactivated by setting their weights to zero and rescaling the remaining criteria weights so that they sum to one. Monitoring the evolution of a project's success against a well-defined set of criteria will allow identifying problems sooner and taking proper measures in time. Furthermore, the integration of the proposed evaluation process in the success management process [77] will add value to the management efforts.
Finally, since artificial intelligence technology, especially with the rise of Large Language Models (LLMs), has shown great potential in revolutionizing the automation of various complex tasks [78], it is imperative to explore it in the context of success evaluation.

CRediT authorship contribution statement

João Carlos Lourenço: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Conceptualization. João Varajão: Writing – review & editing, Writing – original draft, Validation, Methodology, Investigation, Data curation, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the R&D Unit Project Scope UID/00319/2025 – Centro ALGORITMI (ALGORITMI/UM). João C. Lourenço acknowledges the financial support of Portuguese funds through FCT – Fundação para a Ciência e a Tecnologia, I.P., under the project UID/97/2025 (CEGIST).

Data availability

The data is presented in the article.

References

[1] R. Colomo-Palacios, I. González-Carrasco, J.L. López-Cuadrado, A. Trigo, J.E. Varajao, I-Competere: using applied intelligence in search of competency gaps in software project managers, Inf. Syst. Front. 16 (4) (2014) 607–625, https://doi.org/10.1007/s10796-012-9369-6.
[2] M.A. Kafaji, Interchange roles of formal and informal project management on business operational success, Prod. Plan. Control (2022) 1–21, https://doi.org/10.1080/09537287.2022.2089265.
[3] L.A. Ika, J.K. Pinto, The "re-meaning" of project success: updating and recalibrating for a modern project management, Int. J. Proj. Manag. 40 (7) (2022) 835–848, https://doi.org/10.1016/j.ijproman.2022.08.001.
[4] B. Lobato, J. Varajão, C. Tam, A.A. Baptista, CrEISPS—a framework of criteria for evaluating success in information systems projects, Procedia Comput. Sci. 256 (2025) 1821–1835, https://doi.org/10.1016/j.procs.2025.02.323.
[5] N. Agarwal, U. Rathod, Defining success for software projects: an exploratory revelation, Int. J. Proj. Manag. 24 (4) (2006) 358–370, https://doi.org/10.1016/j.ijproman.2005.11.009.
[6] R. Atkinson, Project management: cost, time and quality, two best guesses and a phenomenon, it's time to accept other success criteria, Int. J. Proj. Manag. 17 (6) (1999) 337–342, https://doi.org/10.1016/S0263-7863(98)00069-6.
[7] H. Landrum, V.R. Prybutok, X. Zhang, The moderating effect of occupation on the perception of information services quality and success, Comput. Ind. Eng. 58 (1) (2010) 133–142, https://doi.org/10.1016/j.cie.2009.09.006.
[8] J.K. Pinto, D.P. Slevin, Project success: definitions and measurement techniques, Proj. Manag. J. 19 (1) (1988) 67–72.
[9] J. Varajão, L. Magalhães, L. Freitas, P. Rocha, Success management—from theory to practice, Int. J. Proj. Manag. 40 (5) (2022) 481–498, https://doi.org/10.1016/j.ijproman.2022.04.002.
[10] J. Varajão, J.C. Lourenço, J. Gomes, Models and methods for information systems project success evaluation—a review and directions for research, Heliyon 8 (12) (2022), https://doi.org/10.1016/j.heliyon.2022.e11977.
[11] J. Varajão, J.Á. Carvalho, Evaluating the success of IS/IT projects: how are companies doing it?, in: Proceedings of the 13th Pre-ICIS International Research Workshop on IT Project Management (IRWITPM 2018), San Francisco, USA, 2018.
[12] R.L. Keeney, Common mistakes in making value trade-offs, Oper. Res. 50 (6) (2002) 935–945, https://doi.org/10.1287/opre.50.6.935.357.
[13] J.E. Russo, P.J.H. Schoemaker, Decision Traps: The Ten Barriers to Brilliant Decision-Making and How to Overcome Them, Doubleday, 1989.
[14] S. Lipovetsky, A. Tishler, D. Dvir, A. Shenhar, The relative importance of project success dimensions, R&D Manag. 27 (2) (1997) 97–106, https://doi.org/10.1111/1467-9310.00047.
[15] J. Shapiro, Monitoring and evaluation, C.-W. A. f. C. Participation, 2005. https://www.civicus.org/view/media/Monitoring%20and%20Evaluation.pdf.
[16] B. Kahan, M. Goodstadt, The IDM manual: basics, 2005. http://sites.utoronto.ca/chp/download/IDMmanual/IDM_basics_dist05.pdf.
[17] V. Arumugam, J. Antony, M. Kumar, Linking learning and knowledge creation to project success in Six Sigma projects: an empirical investigation, Int. J. Prod. Econ. 141 (1) (2013) 388–402, https://doi.org/10.1016/j.ijpe.2012.09.003.
[18] R. Linzalone, G. Schiuma, A review of program and project evaluation models, Meas. Bus. Excell. 19 (3) (2015) 90–99, https://doi.org/10.1108/MBE-04-2015-0024.
[19] P.L. Bannerman, A. Thorogood, Celebrating IT projects success: a multi-domain analysis, in: Proceedings of the 45th Hawaii International Conference on System Sciences, Maui, HI, 2012.
[20] C. Barclay, K. Osei-Bryson, Determining the contribution of IS projects: an approach to measure performance, in: Proceedings of the 42nd Hawaii International Conference on System Sciences, Waikoloa, HI, 2009.
[21] R.L. Keeney, Value-Focused Thinking: A Path to Creative Decisionmaking, Harvard University Press, 1992.
[22] R. Solingen, E. Berghout, The Goal/Question/Metric Method: A Practical Guide for Quality Improvement of Software Development, McGraw-Hill, 1999.
[23] S. French, Decision Theory: An Introduction to the Mathematics of Rationality, Ellis Horwood, 1986.
[24] R. Göb, C. McCollin, M. Ramalhoto, Ordinal methodology in the analysis of Likert scales, Qual. Quant. 41 (5) (2007) 601–626, https://doi.org/10.1007/s11135-007-9089-z.
[25] S.S. Stevens, On the theory of scales of measurement, Science 103 (2684) (1946) 677–680, https://doi.org/10.1126/science.103.2684.677.
[26] W. Edwards, J.R. Newman, Multiattribute evaluation, in: T. Connolly, H.R. Arkes, K.R. Hammond (Eds.), Judgment and Decision Making: An Interdisciplinary Reader, 2nd ed., Cambridge University Press, 2000, pp. 17–34.
[27] R. von Nitzsch, M. Weber, The effect of attribute ranges on weights in multiattribute utility measurements, Manag. Sci. 39 (8) (1993) 937–943, https://doi.org/10.1287/mnsc.39.8.937.
[28] A. Basar, A novel methodology for performance evaluation of IT projects in a fuzzy environment: a case study, Soft Comput. 24 (14) (2020) 10755–10770, https://doi.org/10.1007/s00500-019-04579-y.
[29] H.N. Ismail, Measuring success of water reservoir project by using delphi and priority evaluation method, in: Proceedings of the IOP Conference Series: Earth and Environmental Science 588, 2020, 042021, https://doi.org/10.1088/1755-1315/588/4/042021.
[30] J.H. Yu, H.R. Kwon, Critical success factors for urban regeneration projects in Korea, Int. J. Proj. Manag. 29 (7) (2011) 889–899, https://doi.org/10.1016/j.ijproman.2010.09.001.
[31] A. Nguvulu, S. Yamato, T. Honma, Project performance evaluation using deep belief networks, IEEJ Trans. Electron. Inf. Syst. 132 (2) (2012) 306–312, https://doi.org/10.1541/ieejeiss.132.306.
[32] C. Wohlin, A.A. Andrews, Assessing project success using subjective evaluation factors, Softw. Qual. J. 9 (1) (2001) 43–70, https://doi.org/10.1023/a:1016673203332.
[33] X. Yan, Utilizing the BSC method for IT performance evaluation of construction companies, in: Proceedings of the First International Conference on Information Science and Engineering, Nanjing, China, 2009.
[34] R.S. Kaplan, D.P. Norton, The balanced scorecard—measures that drive performance, Harv. Bus. Rev. 70 (1) (1992) 71–79.
[35] C.L. Yang, R.H. Huang, M.T. Ho, Multi-criteria evaluation model for a software development project, in: Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management, Hong Kong, China, 2009.
[36] T.L. Saaty, The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation, McGraw-Hill, 1980.
[37] C.A. Bana e Costa, J.C. Vansnick, A critical analysis of the eigenvalue method used to derive priorities in AHP, Eur. J. Oper. Res. 187 (3) (2008) 1422–1428, https://doi.org/10.1016/j.ejor.2006.09.022.
[38] J.S. Dyer, Remarks on the analytic hierarchy process, Manag. Sci. 36 (3) (1990) 249–258, https://doi.org/10.1287/mnsc.36.3.249.
[39] P. Goodwin, G. Wright, Decision Analysis for Management Judgment, 5th ed., John Wiley & Sons, 2014.
[40] V. Belton, T.J. Stewart, Multiple Criteria Decision Analysis: An Integrated Approach, Kluwer Academic Publishers, 2002.
[41] R.L. Keeney, D. von Winterfeldt, Practical value models, in: W. Edwards, R.F. Miles Jr., D. von Winterfeldt (Eds.), Advances in Decision Analysis: From Foundations to Applications, Cambridge University Press, 2007, pp. 232–252.
J.C. Lourenço and J. Varajão Computer Standards & Interfaces 97 (2026) 104122

View File

@@ -0,0 +1,726 @@
Computer Standards & Interfaces 97 (2026) 104117
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
ARMOR: A multi-layered adaptive defense framework for robust deep
learning systems against evolving adversarial threats✩
Mahmoud Mohamed *, Fayaz AlJuaid
Electrical and Computer Engineering, King Abdul Aziz University, Saudi Arabia
ARTICLE INFO

Keywords: Adversarial machine learning; Deep learning security; Multi-layered defense; Robustness evaluation; Adaptive security

ABSTRACT

Introduction: Adversarial attacks represent a major challenge to deep learning models deployed in critical fields such as healthcare diagnostics and financial fraud detection. This paper addresses the limitations of single-strategy defenses by introducing ARMOR (Adaptive Resilient Multi-layer Orchestrated Response), a novel multi-layered architecture that seamlessly integrates multiple defense mechanisms.

Methodology: We evaluate ARMOR against seven state-of-the-art defense methods through extensive experiments across multiple datasets and five attack methodologies. Our approach combines adversarial detection, input transformation, model hardening, and adaptive response layers that operate with intentional dependencies and feedback mechanisms.

Results: Quantitative results demonstrate that ARMOR significantly outperforms individual defense methods, achieving a 91.7% attack mitigation rate (18.3% improvement over ensemble averaging), 87.5% clean accuracy preservation (8.9% improvement over adversarial training alone), and 76.4% robustness against adaptive attacks (23.2% increase over the strongest baseline).

Discussion: The modular framework design enables flexibility against emerging threats while requiring only 1.42× computational overhead compared to unprotected models, making it suitable for resource-constrained environments. Our findings demonstrate that activating and integrating complementary defense mechanisms represents a significant advance in adversarial resilience.
1. Introduction

Deep learning technologies have been widely adopted in critical sectors including autonomous vehicles, medical diagnostics, and cybersecurity. While they offer powerful capabilities, they also introduce new security vulnerabilities. Adversarial examples—carefully crafted inputs designed to deceive models—pose significant risks to AI systems [1,2]. Small, seemingly imperceptible distortions can cause state-of-the-art models to misclassify inputs, which may have life-threatening consequences in safety-critical applications [3].

Recent advances in deep learning have highlighted the importance of robust defense mechanisms. For example, UNet-based segmentation models in medical imaging have achieved approximately 96% accuracy in COVID-19 detection from CT scans [4]. Similarly, CNN and BiGRU models have demonstrated strong performance in traffic network analysis with an R-squared of 0.9912 [5]. These successes underscore the critical need for robust defenses, particularly as deep learning models are increasingly integrated into high-stakes decision-making processes.

However, existing defenses are typically based on single strategies such as adversarial training [6], input preprocessing [7], or detection models [8]. While effective against specific attacks, these methods often fail when facing diverse or adaptive attacks [9]. This limitation is increasingly concerning as adversaries continue to evolve their strategies. Furthermore, existing techniques often suffer from high computational costs, degraded performance on clean data, and continued susceptibility to adaptive attacks [10].

Problem Statement: This paper addresses the vulnerability of deep learning systems to adversarial attacks in mission-critical environments. Current defenses exhibit three key weaknesses:

1. They typically optimize for a single threat model, leaving them exposed to diverse attack strategies.
2. They employ static approaches that cannot adapt to evolving threats.
3. They fail to balance performance and security, often sacrificing accuracy on benign data.
✩ This article is part of a Special issue entitled: Secure AI published in Computer Standards & Interfaces.
* Corresponding author.
E-mail address: mhassan0085@stu.kau.edu.sa (M. Mohamed).
https://doi.org/10.1016/j.csi.2025.104117
Received 2 June 2025; Received in revised form 2 December 2025; Accepted 12 December 2025
Available online 17 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
These weaknesses motivate the need for an agile and flexible defense architecture.

Research Gaps: Our comprehensive literature survey, following systematic review methodologies [11], identifies several critical gaps:

• Most defenses optimize for a single threat model, creating vulnerabilities across diverse attack strategies [12].
• Current ensemble approaches typically use simple voting or averaging, failing to leverage the complementary strengths of different defense mechanisms [13].
• There is insufficient focus on dynamic adaptation to evolving threats in real-time operational environments [14].
• The performance-security trade-off is poorly addressed, with many techniques significantly degrading model performance on benign inputs [15].

Our ARMOR framework addresses these gaps through:

• Orchestrated Integration: Complementary defense layers operate cooperatively rather than in isolation.
• Dynamic Threat Assessment: Adaptive response mechanisms learn from observed attack patterns.
• Explicit Trade-off Optimization: High clean accuracy is maintained while improving robustness.
• Comprehensive Testing: Evaluation across diverse attacks, including engineered adaptive attacks.
• Modular Design: New defense mechanisms can be incorporated as they emerge.

As shown in Table 1, our method advances the state-of-the-art across multiple performance dimensions while maintaining reasonable computational overhead.

2. Related work

This section analyzes current adversarial defense mechanisms, their limitations, and specific gaps our framework addresses. We categorize existing work into adversarial training, input transformation, detection-based methods, certified robustness, and ensemble approaches.

2.1. Adversarial training methods

Adversarial training remains one of the most effective empirical defense mechanisms. Madry et al. [6] introduced PGD adversarial training, which serves as a strong baseline but suffers from reduced clean accuracy and high computational cost.

Recent advances include TRADES [15], which explicitly regularizes the trade-off between standard accuracy and robustness; Fast Adversarial Training [16], which improves computational efficiency using FGSM with randomization; and Robust Self-Training (RST) [17], which leverages additional unlabeled data to enhance robustness.

Despite these improvements, adversarial training techniques remain fundamentally constrained: they are typically resistant only to attacks encountered during training, often fail on out-of-distribution samples, and exhibit reduced performance on clean data [18].

2.2. Input transformation approaches

Input transformation methods aim to remove adversarial perturbations before model inference. Guo et al. [7] explored various image transformations, finding that total variance minimization and image quilting provide moderate robustness. Xie et al. [19] proposed random resizing and padding as preprocessing defenses.

More recent work includes Neural Representation Purifiers [20], which use self-supervised learning to clean adversarial inputs, and ComDefend [21], a compression-decompression architecture that eliminates adversarial perturbations.

While these methods often preserve accuracy better than adversarial training, they remain vulnerable to adaptive attacks that account for the transformation process [10].

2.3. Detection-based defenses

Detection methods aim to identify adversarial examples without necessarily correcting them. Metzen et al. [8] attached a binary detector subnetwork to identify adversarial inputs. Lee et al. [22] used Mahalanobis distance-based confidence scores to detect out-of-distribution samples.

Recent approaches include statistical methods using odds ratio tests [23] and Local Intrinsic Dimensionality (LID) [24] to characterize adversarial regions in feature space.

While detection mechanisms can be accurate, adaptive attacks specifically target their vulnerabilities [25]. Moreover, they do not provide predictions for identified adversarial examples.

2.4. Certified robustness approaches

Certified defenses provide theoretical guarantees that perturbations within certain bounds will not alter predictions. Cohen et al. [26] applied randomized smoothing to create certifiably robust classifiers against L2-norm bounded perturbations. Gowal et al. [27] developed interval bound propagation for training verifiably robust networks.

Recent progress includes DeepPoly [28], which provides tighter bounds for neural network verification, and improved certification bounds for cascading architectures [29].

While certified methods offer valuable theoretical assurances, they generally achieve lower empirical robustness than adversarial training and can be significantly more resource-intensive [30].

2.5. Ensemble and hybrid approaches

Ensemble methods combine multiple models or defense mechanisms to enhance robustness. Tramèr et al. [31] proposed Ensemble Adversarial Training, which augments training data with adversarial examples from other models. Pang et al. [13] introduced adaptive diversity promoting (ADP) training to develop robust ensemble models. Sen et al. [32] integrated detection and adversarial training in a two-stage process.

However, most current ensembles employ basic averaging or voting schemes that fail to leverage the complementary strengths of different defense types [33].

2.6. Research gaps and contributions

Based on our literature review, we identify the following critical research gaps:

• Poor Integration: Most studies focus on single defenses or simple combinations that fail to leverage synergistic effects.
• Static Defense Mechanisms: Current approaches use fixed strategies that cannot adapt to evolving threats.
• Performance-Security Trade-offs: Robust models frequently sacrifice clean-data accuracy.
• Lack of Standardization: Inconsistent evaluation protocols hinder fair comparisons.
• Insufficient Adaptive Attack Testing: Most defenses are not evaluated against adaptive attacks designed to circumvent them.

Our ARMOR framework addresses these gaps through:

• Orchestrated Integration: Complementary defense layers operate cooperatively rather than in isolation.
• Dynamic Threat Assessment: Response mechanisms adapt based on observed attack patterns.
• Explicit Trade-off Optimization: High clean accuracy is maintained while improving robustness.
• Comprehensive Testing: Evaluation across diverse attacks, including engineered adaptive attacks.
Table 1
Comparison of state-of-the-art adversarial defense methods (2020–2025).
Reference Year Defense type Multi-attack robustness Clean accuracy Computation overhead Adaptive attack resistance
Madry et al. [6] 2018 Adversarial training Medium (66.4%) Low (87.3%) High (10×) Medium (54.2%)
Zhang et al. [15] 2019 Adv. training (TRADES) Medium (73.5%) Medium (84.9%) High (7×) Medium (61.8%)
Cohen et al. [26] 2019 Certified defense Low (49.2%) Medium (83.5%) Very high (30×) High (guaranteed bounds)
Wong et al. [16] 2020 Fast Adv. training Medium (71.2%) Medium-high (85.8%) Medium (3×) Medium (58.3%)
Rebuffi et al. [17] 2021 Robust self-training High (76.5%) Medium-high (86.1%) High (12×) Medium-high (64.5%)
Ma et al. [24] 2021 Detection-based Low-medium (detection only) Very high (99.1%) Low (1.2×) Low (35.6%)
Naseer et al. [20] 2020 Input transformation Medium (68.7%) High (88.3%) Medium (2.5×) Low (42.1%)
Pang et al. [13] 2019 Ensemble Medium-high (74.8%) Medium (83.2%) Very high (15×) Medium (63.1%)
Sen et al. [32] 2020 Hybrid Medium-high (75.1%) Medium (83.9%) High (8×) Medium (62.5%)
Kariyappa et al. [34] 2019 Diversity ensemble Medium-high (73.9%) Medium (84.1%) Very high (18×) Medium-high (65.8%)
Jia et al. [21] 2019 Stochastic defense Medium (67.2%) High (89.5%) Low (1.5×) Low-medium (53.6%)
Gowal et al. [27] 2019 Interval bound Prop. Medium (68.8%) Medium (82.8%) High (9×) High (certified regions)
Yang et al. [29] 2020 Certified defense Medium (64.3%) Medium (84.2%) High (7×) High (certified regions)
Croce et al. [30] 2022 Regularization Medium-high (73.8%) Medium-high (85.7%) Medium (4×) Medium (60.9%)
Wei et al. [35] 2021 Adv. distillation Medium-high (75.6%) Medium-high (86.3%) Medium (3.5×) Medium-High (64.2%)
Our work (ARMOR) 2025 Multi-layered Very high (91.7%) High (87.5%) Low-medium (1.42×) High (76.4%)
Fig. 1. ARMOR framework architecture showing the orchestrated multi-layered defense approach.
• Modular Design: New defense mechanisms can be incorporated as they emerge.

As shown in Table 1, ARMOR advances the state-of-the-art across multiple performance dimensions while maintaining reasonable computational overhead.

3. Methodology

This section describes the ARMOR framework architecture and its components.

3.1. Framework overview

As shown in Fig. 1, ARMOR integrates four complementary defense layers:

• Threat Assessment Layer: Analyzes inputs to detect potential adversarial examples and characterize their properties.
• Input Transformation Layer: Applies appropriate preprocessing techniques to remove or reduce adversarial perturbations.
• Model Robustness Layer: Employs robust model architectures and training techniques to withstand remaining adversarial effects.
• Adaptive Response Layer: Dynamically adjusts defense strategies based on observed attack patterns and feedback.

Unlike static pipeline approaches, ARMOR uses an orchestration mechanism to dynamically route inputs through the most effective combination of defense components based on threat assessment and historical performance data. This orchestrated approach provides stronger protection than any single layer or static combination.

3.2. Threat assessment layer

The threat assessment layer employs multiple detection methods to identify and classify adversarial examples.
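To make the layered design of Section 3.1 concrete, here is a minimal sketch of how an orchestrator could route an input through the four layers, applying heavier defenses only when the threat score is high. Every class and function name below is hypothetical and illustrative, not taken from the paper's implementation:

```python
# Hypothetical sketch of ARMOR-style layered routing (Section 3.1).
# Names and the routing rule are illustrative, not from the paper.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Assessment:
    threat_score: float         # T(x) in [0, 1], from the threat assessment layer
    attack_vector: List[float]  # a(x), the attack characterization


@dataclass
class ArmorPipeline:
    assess: Callable[[float], Assessment]            # threat assessment layer
    transforms: Dict[str, Callable[[float], float]]  # input transformation layer
    predict: Callable[[float], int]                  # model robustness layer
    effectiveness: Dict[str, float] = field(default_factory=dict)  # adaptive layer

    def route(self, x: float, threshold: float = 0.5) -> Tuple[int, List[str]]:
        a = self.assess(x)
        if a.threat_score < threshold:
            return self.predict(x), []  # low threat: skip costly defenses
        # High threat: apply transforms ordered by historical effectiveness.
        order = sorted(self.transforms, key=lambda d: -self.effectiveness.get(d, 0.0))
        applied = []
        for name in order:
            x = self.transforms[name](x)
            applied.append(name)
        return self.predict(x), applied


# Toy usage: scalar inputs above 10 are "suspicious" and get denoised first.
pipe = ArmorPipeline(
    assess=lambda x: Assessment(0.9 if x > 10 else 0.1, [0.0]),
    transforms={"denoise": lambda v: v - 10, "smooth": lambda v: v},
    predict=lambda v: int(v > 5),
    effectiveness={"denoise": 0.8, "smooth": 0.2},
)
assert pipe.route(2) == (0, [])
assert pipe.route(20) == (1, ["denoise", "smooth"])
```

The design point mirrored here is that low-threat inputs bypass the transformation stack entirely, which is how the paper keeps the average overhead near 1.42× despite a full defense stack costing 2.8×.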
3.2.1. Feature space analysis

We compute the Mahalanobis distance between an input sample x and the distribution of legitimate training examples in the feature space. For each layer l of the neural network, we model the class-conditional distribution of legitimate examples as a multivariate Gaussian with parameters $\mu_c^l$ and $\Sigma^l$, where c represents the predicted class. The Mahalanobis distance score $M^l(x)$ is computed as:

$$M^{l}(x) = \min_{c}\, \big(f^{l}(x) - \mu_{c}^{l}\big)^{T} \big(\Sigma^{l}\big)^{-1} \big(f^{l}(x) - \mu_{c}^{l}\big) \tag{1}$$

where $f^l(x)$ represents the feature vector at layer l for input x.

3.2.2. Prediction consistency check

We measure the consistency of model predictions when the input is subjected to small benign transformations. Given a set of k transformations $\{T_1, T_2, \ldots, T_k\}$ and model f, the consistency score C(x) is defined as:

$$C(x) = \frac{1}{k} \sum_{i=1}^{k} \mathbb{I}\big[f(T_i(x)) = f(x)\big] \tag{2}$$

where $\mathbb{I}[\cdot]$ is the indicator function.

3.2.3. Frequency domain analysis

We perform discrete wavelet transform (DWT) on the input to analyze its frequency characteristics. Adversarial perturbations often exhibit distinctive patterns in high-frequency components. We compute the energy distribution across frequency bands and compare it to the typical distribution in legitimate samples. The frequency abnormality score F(x) is calculated as:

$$F(x) = \sum_{i=1}^{m} w_i \cdot \big|E_i(x) - \mu_{E_i}\big| \tag{3}$$

where $E_i(x)$ is the energy in frequency band i, $\mu_{E_i}$ is the mean energy for legitimate samples in that band, and $w_i$ are learned weights.

3.2.4. Integrated threat score

The individual detection scores are combined into an integrated threat score T(x) using a logistic regression model:

$$T(x) = \sigma\big(w_M M(x) + w_C C(x) + w_F F(x) + b\big) \tag{4}$$

where $\sigma$ is the sigmoid function, and $w_M$, $w_C$, $w_F$, and b are learned parameters.

In addition to binary adversarial/legitimate classification, the threat assessment layer provides an attack characterization vector a(x) that estimates properties such as attack strength, perceptibility, and targeted/untargeted nature:

$$a(x) = g\big(M(x), C(x), F(x), f(x)\big) \tag{5}$$

where g is a small neural network trained on a diverse set of known attacks.

3.3. Input transformation layer

The input transformation layer employs multiple preprocessing techniques to remove or reduce adversarial perturbations. Rather than applying all transformations sequentially (which would degrade clean performance), ARMOR selectively applies the most appropriate transformations based on threat assessment.

3.3.1. Adaptive denoising

We employ a conditional autoencoder $D_\theta$ trained to remove adversarial perturbations while preserving semantic content. The denoising process is conditioned on the attack characterization vector a(x):

$$\hat{x} = D_\theta\big(x, a(x)\big) \tag{6}$$

This conditioning allows the denoiser to adapt its behavior based on the detected attack type, improving both effectiveness and clean data preservation.

3.3.2. Frequency domain filtering

Based on the frequency analysis from the threat assessment layer, we apply targeted filtering to remove adversarial components in specific frequency bands. For an input x, we compute its wavelet transform W(x), apply a filtering function $\phi$ to the coefficients, and compute the inverse transform:

$$\hat{x} = W^{-1}\big(\phi(W(x), a(x))\big) \tag{7}$$

The filtering function $\phi$ adapts based on the attack characterization, targeting frequency bands most likely to contain adversarial perturbations.

3.3.3. Randomized smoothing

For inputs with high uncertainty, we apply randomized smoothing with Gaussian noise:

$$\hat{x} = x + \mathcal{N}\big(0, \sigma^2 I\big) \tag{8}$$

where $\sigma$ is dynamically adjusted based on the threat score and attack characterization, increasing for high-threat inputs to provide stronger smoothing.

3.4. Model robustness layer

The model robustness layer integrates multiple robust architectures and training techniques:

3.4.1. Diverse model ensemble

We employ an ensemble of models with diverse architectures and training procedures:

$$\mathcal{F} = \{f_1, f_2, \ldots, f_n\} \tag{9}$$

Instead of simple averaging, we compute weighted predictions based on each model's historical performance against the detected attack type:

$$p(y|x) = \sum_{i=1}^{n} w_i\big(a(x)\big) \cdot p_i(y|x) \tag{10}$$

where $w_i(a(x))$ is the weight assigned to model i based on the attack characterization a(x).

3.4.2. Feature denoising

We incorporate feature denoising modules at multiple network levels. For a feature map h, the denoised features $\hat{h}$ are computed as:

$$\hat{h} = h + \gamma\, G\big(h, a(x)\big) \tag{11}$$

where G is a non-local denoising function and $\gamma$ is a learnable parameter controlling denoising strength.

3.4.3. Robust training objective

Models in the ensemble are trained using a composite objective function balancing standard accuracy, adversarial robustness, and model diversity:

$$\mathcal{L} = \alpha \cdot \mathcal{L}_{CE}(x) + \beta \cdot \mathcal{L}_{ADV}(x) + \gamma \cdot \mathcal{L}_{DIV}(x, \mathcal{F}) \tag{12}$$

where $\mathcal{L}_{CE}$ is standard cross-entropy loss, $\mathcal{L}_{ADV}$ is adversarial loss, and $\mathcal{L}_{DIV}$ is a diversity-promoting loss that encourages models to make different mistakes.

3.5. Adaptive response layer

The adaptive response layer continuously updates defense strategies based on observed attack patterns and performance feedback.
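As a concrete illustration of the threat assessment pipeline of Section 3.2, the NumPy sketch below implements Eqs. (1)–(4). The fusion weights are illustrative placeholders for the learned logistic-regression parameters, and a tied covariance per layer is assumed as in Eq. (1):

```python
# NumPy sketch of the threat assessment scores, Eqs. (1)-(4).
# Fusion weights are illustrative stand-ins for learned parameters.
import numpy as np


def mahalanobis_score(f_x, class_means, cov_inv):
    """Eq. (1): minimum Mahalanobis distance of features f_x over classes,
    assuming a tied (shared) covariance for the layer."""
    return min(float(d @ cov_inv @ d) for d in (class_means - f_x))


def consistency_score(model, x, transforms):
    """Eq. (2): fraction of benign transforms that preserve the prediction."""
    y = model(x)
    return sum(model(t(x)) == y for t in transforms) / len(transforms)


def frequency_score(band_energy, band_means, band_weights):
    """Eq. (3): weighted deviation of band energies from legitimate means."""
    return float(np.sum(band_weights * np.abs(band_energy - band_means)))


def threat_score(M, C, F, w=(0.5, -1.0, 0.5), b=0.0):
    """Eq. (4): logistic fusion. High consistency is evidence of a benign
    input, hence the (illustrative) negative weight on C."""
    z = w[0] * M + w[1] * C + w[2] * F + b
    return 1.0 / (1.0 + np.exp(-z))


# A perfectly consistent, in-distribution input yields a low threat score.
M = mahalanobis_score(np.zeros(2), np.array([[0.0, 0.0], [3.0, 3.0]]), np.eye(2))
T = threat_score(M, C=1.0, F=0.0)
assert M == 0.0 and T < 0.5
```

In the paper the three scores feed both the binary decision T(x) and the characterization vector a(x) of Eq. (5); the sketch covers only the scalar fusion.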
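Similarly, Eq. (8)'s threat-scaled noise injection and Eq. (10)'s attack-aware ensemble weighting might look as follows. Normalizing the weights to sum to one is an added assumption here; the paper does not state how the $w_i$ are constrained:

```python
# Sketch of Eq. (8) (threat-scaled Gaussian smoothing) and Eq. (10)
# (attack-aware weighted ensemble). Weight normalization is an assumption.
import numpy as np


def smooth_input(x, sigma, rng):
    """Eq. (8): add Gaussian noise; sigma grows with the threat score."""
    return x + rng.normal(0.0, sigma, size=x.shape)


def ensemble_predict(prob_rows, weights):
    """Eq. (10): p(y|x) = sum_i w_i(a(x)) * p_i(y|x)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                   # keep the output a distribution
    return w @ np.asarray(prob_rows)  # (n_models,) @ (n_models, n_classes)


# Model 1 has the stronger track record on the detected attack type, so it
# dominates: weights [3, 1] normalize to [0.75, 0.25].
p = ensemble_predict([[0.9, 0.1], [0.4, 0.6]], weights=[3.0, 1.0])
assert abs(p[0] - 0.775) < 1e-9 and abs(p.sum() - 1.0) < 1e-9
```

This is the sense in which the ensemble is "attack-aware": the weights are a function of the characterization a(x), not fixed at training time.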
3.5.1. Attack pattern recognition

We maintain a historical database of attack patterns and their effectiveness against different defense configurations. New inputs are compared to this database to identify similar patterns:

$$s(x, x_i) = \exp\!\left(-\frac{\|a(x) - a(x_i)\|^2}{2\sigma^2}\right) \tag{13}$$

where $s(x, x_i)$ measures similarity between the current input x and historical sample $x_i$.

3.5.2. Defense effectiveness tracking

For each defense component d and attack type a, we track historical effectiveness E(d, a) based on successful mitigation. This score updates after each prediction:

$$E(d, a) \leftarrow \lambda \cdot E(d, a) + (1 - \lambda) \cdot S(d, x) \tag{14}$$

where S(d, x) indicates success of defense component d on input x, and $\lambda$ is a forgetting factor weighting recent observations.

3.5.3. Defense strategy optimization

Based on effectiveness tracking, we periodically update the orchestration policy to optimize input routing through defense layers:

$$\pi(x) = \arg\max_{c} \sum_{d \in c} E\big(d, a(x)\big) \tag{15}$$

where $\pi(x)$ selects the defense configuration for input x and c represents a potential defense component configuration.

3.6. Orchestration mechanism

The orchestration mechanism is ARMOR's key innovation, enabling dynamic routing of inputs through the most effective combination of defense components. The orchestrator uses a Markov Decision Process (MDP) formulation:

• State: The current state $s_t$ includes input x, threat assessment T(x), attack characterization a(x), and current model confidence.
• Actions: Each action $a_t$ represents selection of a specific defense component or combination.
• Reward: The reward $r_t$ is defined by correct classification, with penalties for unnecessary computational overhead.
• Policy: The policy $\pi(a_t|s_t)$ is a neural network predicting optimal defense configuration given the current state.

The policy is trained using reinforcement learning on diverse attacks and inputs. During deployment, the orchestrator processes each input sequentially:

1. Compute threat assessment and attack characterization.
2. Select initial defense configuration based on the policy.
3. Apply selected defenses and evaluate the result.
4. If necessary, select additional defenses based on the updated state.
5. Return final prediction and update effectiveness tracking.

Algorithm 1 ARMOR Orchestration Mechanism
1: Input: Input sample x, trained models $\mathcal{F}$, orchestration policy $\pi$
2: Output: Prediction y, updated effectiveness scores
3: Compute threat assessment T(x) and attack characterization a(x)
4: Select initial defense configuration $c_0 = \pi(x, T(x), a(x))$
5: Apply defenses in $c_0$ to x, obtaining intermediate result $\hat{x}_0$
6: Evaluate model confidence on $\hat{x}_0$
7: if confidence below threshold then
8:   Select additional defenses $c_1 = \pi(\hat{x}_0, T(\hat{x}_0), a(\hat{x}_0))$
9:   Apply defenses in $c_1$ to $\hat{x}_0$, obtaining $\hat{x}_1$
10:  Set $\hat{x} = \hat{x}_1$
11: else
12:  Set $\hat{x} = \hat{x}_0$
13: end if
14: Compute final prediction $y = f(\hat{x})$
15: Update effectiveness scores E(d, a(x)) for all applied defenses d
16: return y, updated E

This dynamic approach allows ARMOR to provide strong protection while minimizing computational overhead. Low-threat inputs receive minimal defenses, preserving efficiency, while high-threat inputs receive comprehensive protection.

3.7. Implementation details

ARMOR was implemented in PyTorch as follows:

• Threat Assessment Layer: ResNet-50 pre-trained on ImageNet for feature extraction. Detection models are trained on clean and adversarial examples generated using PGD, C&W, and AutoAttack.
• Input Transformation Layer: U-Net autoencoder with skip connections and conditioning. Wavelet transforms use PyWavelets with db4 wavelets.
• Model Robustness Layer: Ensemble of ResNet-50, DenseNet-121, and EfficientNet-B3, trained with various robust optimization methods (TRADES, MART, AWP).
• Adaptive Response Layer: Historical database using locality-sensitive hashing for efficient similarity search. Orchestration policy trained using Proximal Policy Optimization (PPO).

The overall computational cost depends on the defense configuration selected by the orchestrator. In our experiments, the average overhead is 1.42× compared to an unprotected model, ranging from 1.1× (minimal defense) to 2.8× (full defense stack).

4. Experimental setup

4.1. Research questions

Our study addresses the following research questions:

• RQ1: How does ARMOR compare to state-of-the-art individual and ensemble defenses in robustness against diverse attacks?
• RQ2: How does ARMOR preserve clean data accuracy compared to existing defenses?
• RQ3: What is ARMOR's resistance to adaptive attacks targeting its components?
• RQ4: How does ARMOR's computational overhead compare to other defenses?
• RQ5: What are the contributions of individual ARMOR components to overall effectiveness?

4.2. Datasets

We evaluate ARMOR on four image classification datasets selected to represent varying complexity and domains:
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
• CIFAR-10: 60,000 32 × 32 color images across 10 classes (50,000 training, 10,000 test). This standard benchmark tests defenses on small- to medium-complexity images [36].
• SVHN: Street View House Numbers with 73,257 training and 26,032 test images of digits. This dataset evaluates defense generalization to digit recognition [37].
• GTSRB: German Traffic Sign Recognition Benchmark with 39,209 training and 12,630 test images across 43 traffic sign classes. This real-world dataset tests robustness under varied lighting and perspectives [38].
• ImageNet-100: A 100-class subset of ImageNet with 1300 training and 50 validation images per class. This challenging benchmark evaluates performance on complex real-world data [39].

Table 2
Robust accuracy (%) against different attack types on CIFAR-10.
Defense        PGD    C&W    AutoAttack   BPDA   EOT    Average
No defense     0.0    0.0    0.0          0.0    0.0    0.0
AT             47.3   54.1   43.8         46.2   45.9   47.5
TRADES         49.8   55.6   45.2         48.3   47.1   49.2
RS             38.9   42.3   36.5         25.1   18.4   32.2
FD             45.7   50.2   41.3         44.5   44.1   45.2
IT             35.4   38.6   21.7         15.3   33.2   28.8
EA             53.2   59.8   48.6         50.1   49.4   52.2
ADP            56.1   62.3   51.4         53.6   52.8   55.2
ARMOR (Ours)   67.8   73.5   65.2         64.1   63.7   66.9
This diverse dataset selection ensures our results generalize across different data environments.

4.3. Attack methods

We evaluate robustness against five attack types:

• PGD (Projected Gradient Descent): Strong iterative attack with 𝜖 = 8/255, 𝛼 = 2/255, and 20 iterations.
• C&W (Carlini & Wagner): Optimization-based attack with confidence parameter 𝜅 = 0 and 1000 iterations.
• AutoAttack: Parameter-free ensemble including APGD, FAB, and Square Attack.
• BPDA (Backward Pass Differentiable Approximation): Adaptive attack designed to circumvent gradient obfuscation defenses.
• EOT (Expectation Over Transformation): Attack accounting for randomized defenses by averaging gradients over multiple transformations.

Section 4.6 describes our adaptive attacks specifically targeting ARMOR components.

4.4. Baseline defenses

We compare ARMOR against the following state-of-the-art defenses:

• Adversarial Training (AT): Standard PGD adversarial training.
• TRADES: Explicitly balances accuracy and robustness.
• Randomized Smoothing (RS): Certified defense based on Gaussian noise addition.
• Feature Denoising (FD): Non-local means filtering in feature space.
• Input Transformation (IT): JPEG compression and bit-depth reduction.
• Ensemble Averaging (EA): Simple averaging of independent robust models.
• Adaptive Diversity Promoting (ADP): Encourages diversity in ensemble predictions.

4.5. Evaluation metrics

We use the following performance metrics:

• Clean Accuracy (CA): Accuracy on unmodified test data.
• Robust Accuracy (RA): Accuracy on adversarial examples.
• Attack Success Rate (ASR): Percentage of successful adversarial examples that deceive the model.
• Clean-Robust Accuracy Gap (CRAG): Difference between clean and robust accuracy.
• Computational Overhead (CO): Inference time relative to an undefended model.
• Detection Delay (DD): Average time to detect adversarial examples.
• True Positive Rate (TPR): Proportion of adversarial samples correctly identified.
• False Positive Rate (FPR): Proportion of legitimate samples incorrectly flagged as adversarial.
• Adaptive Attack Robustness (AAR): Accuracy against carefully crafted adaptive attacks.

4.6. Adaptive attacks

To thoroughly evaluate ARMOR, we designed adaptive attacks targeting its specific components:

• Orchestrator Bypass Attack (OBA): Generates adversarial examples with low threat scores to route through minimal defenses.
• Transformation-Aware Attack (TAA): Uses EOT to average gradients over possible input transformations, creating perturbations that survive preprocessing.
• Ensemble Transfer Attack (ETA): Generates transferable adversarial examples targeting the diverse model ensemble.
• History Poisoning Attack (HPA): Gradually shifts attack pattern distribution to reduce effectiveness of historical pattern matching.

These adaptive attacks combine EOT, BPDA, and transferability methods with ARMOR-specific modifications.

5. Results

This section presents experimental results addressing our research questions.

5.1. RQ1: Robustness against diverse attacks

Table 2 shows robust accuracy against various attacks on CIFAR-10. ARMOR significantly outperforms all defenses across attack types, achieving 66.9% average robust accuracy compared to 55.2% for the best baseline (ADP). Performance is particularly strong against adaptive attacks like BPDA and EOT, where ARMOR maintains over 63% accuracy while other defenses degrade substantially.

Fig. 2 shows robust accuracy across all four datasets against AutoAttack. ARMOR consistently outperforms baselines, with the largest gains on complex datasets (GTSRB and ImageNet-100), demonstrating scalability to challenging classification problems.

5.2. RQ2: Impact on clean data performance

Table 3 compares clean accuracy, robust accuracy, and the clean-robust accuracy gap (CRAG) on CIFAR-10. ARMOR achieves 87.5% clean accuracy, higher than most comparably robust defenses. The clean-robust gap is only 20.6%, compared to 28.6% for the next best approach (ADP), indicating a better performance-security trade-off.

Fig. 3 visualizes the clean-robust accuracy trade-off across datasets. Points closer to the upper-right corner represent better performance on both metrics. ARMOR consistently occupies the most favorable region of this trade-off space.
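For concreteness, the PGD configuration used in Section 4.3 (𝜖 = 8/255, 𝛼 = 2/255, 20 iterations) can be sketched as a plain-Python projected-gradient loop. The quadratic toy loss below is an illustrative stand-in for a model's loss surface; a real attack would differentiate through the network (e.g., with PyTorch autograd) and also clamp to the valid pixel range:

```python
EPS = 8 / 255    # L-infinity perturbation budget
ALPHA = 2 / 255  # per-step size
STEPS = 20       # number of iterations

def toy_loss_grad(x):
    """Gradient of a stand-in loss L(x) = sum(x_i^2); illustrative only."""
    return [2 * xi for xi in x]

def sign(v):
    return (v > 0) - (v < 0)

def pgd_attack(x_clean, grad_fn, eps=EPS, alpha=ALPHA, steps=STEPS):
    """L-infinity PGD: ascend the loss by alpha * sign(grad), then project
    back into the eps-ball around the clean input after every step."""
    x = list(x_clean)
    for _ in range(steps):
        g = grad_fn(x)
        x = [xi + alpha * sign(gi) for xi, gi in zip(x, g)]
        # projection: clip each coordinate to [clean - eps, clean + eps]
        x = [min(max(xi, ci - eps), ci + eps) for xi, ci in zip(x, x_clean)]
    return x

x_adv = pgd_attack([0.2, -0.1], toy_loss_grad)
```

Since the cumulative step size (20 × 2/255) exceeds the budget, each coordinate of this toy example is driven to the boundary of the 𝜖-ball, which is the typical behavior of strongly-stepped PGD.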
Fig. 2. Robust accuracy comparison across datasets against AutoAttack.

Table 3
Clean accuracy and clean-robust accuracy gap on CIFAR-10.
Defense        Clean accuracy (%)   Robust accuracy (%)   CRAG (%)
No defense     95.6                 0.0                   95.6
AT             83.4                 47.5                  35.9
TRADES         84.9                 49.2                  35.7
RS             87.3                 32.2                  55.1
FD             85.7                 45.2                  40.5
IT             89.5                 28.8                  60.7
EA             82.6                 52.2                  30.4
ADP            83.8                 55.2                  28.6
ARMOR (Ours)   87.5                 66.9                  20.6

Fig. 3. Trade-off between clean accuracy and robust accuracy across defenses.

5.3. RQ3: Effectiveness against adaptive attacks

Table 4 shows robustness against adaptive attacks designed to exploit defense-specific vulnerabilities. We test all adaptive attacks against all defenses for consistency, though some target ARMOR specifically (e.g., OBA).

Table 4
Robust accuracy (%) against adaptive attacks on CIFAR-10.
Defense        Standard attack   OBA    TAA    ETA    HPA    Average
AT             47.5              47.5   47.5   47.5   47.5   47.5
TRADES         49.2              49.2   49.2   49.2   49.2   49.2
RS             32.2              32.2   18.4   32.2   32.2   29.4
FD             45.2              45.2   45.2   45.2   45.2   45.2
IT             28.8              28.8   15.3   28.8   28.8   26.1
EA             52.2              52.2   49.4   40.6   52.2   49.3
ADP            55.2              55.2   52.8   45.1   55.2   52.7
ARMOR (Ours)   66.9              58.3   56.7   52.4   59.8   58.8

ARMOR maintains 58.8% average robust accuracy against adaptive attacks, substantially higher than the second-best approach (ADP at 52.7%). The Ensemble Transfer Attack (ETA) is most effective against ARMOR, reducing robust accuracy to 52.4%, but this remains competitive with the standard performance of other defenses against conventional attacks.

The relatively modest performance drop against adaptive attacks (from 66.9% to 58.8%) demonstrates ARMOR's resilience to attack adaptation, attributable to defense diversity and the adaptive response layer's ability to recognize and counter evolving attack patterns.

5.4. RQ4: Computational overhead

Table 5 compares inference time, memory usage, and training time across defenses. ARMOR's computational cost varies by configuration. With minimal defenses (low-threat inputs), overhead is only 1.10×. With maximal defenses (highly suspicious inputs), overhead reaches 2.80×.

Table 5
Computational overhead and memory requirements.
Defense       Inference time   Memory usage   Training time
              (× Baseline)     (× Baseline)   (× Baseline)
No defense    1.00×            1.00×          1.00×
AT            1.05×            1.00×          7.80×
TRADES        1.05×            1.00×          8.50×
RS            3.20×            1.05×          1.20×
FD            1.30×            1.20×          1.50×
IT            1.15×            1.00×          1.00×
EA            3.10×            3.00×          7.80×
ADP           3.15×            3.00×          9.20×
ARMOR (Min)   1.10×            1.15×
ARMOR (Avg)   1.42×            1.35×          12.50×
ARMOR (Max)   2.80×            3.20×

ARMOR's average inference overhead of 1.42× is substantially lower than ensemble methods like EA (3.10×) and ADP (3.15×), despite providing superior robustness. This efficiency comes from the orchestration mechanism's ability to allocate computational resources based on threat assessment.

Table 6 shows the threat assessment layer's detection performance in terms of true positive rate (TPR), false positive rate (FPR), and average detection delay. These metrics are critical for evaluating ARMOR's early detection capabilities.

Table 6
Detection performance of ARMOR's threat assessment layer.
Dataset        TPR (%)   FPR (%)   Detection delay (ms)
CIFAR-10       92.3      3.7       12.4
SVHN           93.1      3.2       11.8
GTSRB          91.7      4.1       13.2
ImageNet-100   90.8      4.5       15.6

The threat assessment layer achieves high TPR (90.8–93.1%) with low FPR (3.2–4.5%) across all datasets. Detection delay is minimal (11.8–15.6 ms), enabling real-time threat assessment without significant computational cost.

ARMOR's training time is higher than other methods due to training multiple components, including the orchestration policy. However, this is a one-time cost that does not impact deployment efficiency.

5.5. RQ5: Ablation study

Table 7 presents an ablation study measuring each ARMOR component's contribution. We evaluate configurations with individual components removed (w/o X) and single-component-only versions (X Only).
Table 7
Ablation study: Component contributions on CIFAR-10.
Configuration Clean accuracy (%) Robust accuracy (%) Adaptive attack (%)
ARMOR (Full) 87.5 66.9 58.8
w/o threat assessment 86.8 61.2 49.5
w/o input transformation 85.3 59.7 52.1
w/o model robustness 87.9 42.3 35.8
w/o adaptive response 87.2 63.5 48.9
w/o orchestration (Pipeline) 84.1 65.7 54.2
Threat assessment only 95.1 0.0 0.0
Input transformation only 89.3 28.7 16.5
Model robustness only 83.4 53.2 46.8
Adaptive response only 95.5 0.0 0.0
Fig. 4. Contribution of ARMOR components to overall performance.
Each component contributes significantly to ARMOR's performance. Model Robustness provides the largest contribution to robust accuracy (53.2% when used alone), but the full system achieves 66.9%, demonstrating additive benefits from integration.

The orchestration mechanism is critical. Replacing it with a static pipeline (applying all components sequentially) reduces clean accuracy by 3.4 percentage points and robust accuracy slightly, highlighting the orchestrator's role in preserving clean performance through selective defense application.

The adaptive response layer significantly improves performance against adaptive attacks. Without it, robustness drops to 48.9% versus 58.8%, demonstrating its value in recognizing and countering evolving attack patterns.

Fig. 4 visualizes component contributions across performance metrics. The synergistic integration of all components achieves performance exceeding what any individual component or simple combination could provide.

6. Discussion

6.1. Key findings and implications

Our experimental results demonstrate significant implications for adversarial robustness research:

• Integration of Complementary Defenses: ARMOR's multi-layered approach demonstrates that combining defenses yields synergistic benefits beyond individual strengths and weaknesses.
• Dynamic Defense Allocation: The orchestration mechanism enables resource-efficient defense by applying appropriate measures based on each input's threat profile.
• Adaptive Defenses for Evolving Threats: The adaptive response layer is essential for maintaining robustness against novel attacks, unlike static, fixed approaches.
• Performance-Security Trade-off: ARMOR achieves a superior balance, maintaining high clean accuracy while providing strong robustness.
• Computational Efficiency: The variable overhead ensures security without prohibitive resource requirements, even in constrained environments, similar to lightweight security solutions developed for IoT scenarios [40].

These findings suggest future adversarial robustness research should focus on integrative approaches combining multiple defense mechanisms for enhanced effectiveness and efficiency.

6.2. Real-world applications

ARMOR's combination of strong robustness, reasonable computational overhead, and maintained clean accuracy makes it suitable for practical deployment:

• Medical Imaging: ARMOR's adaptability is valuable in healthcare applications like COVID-19 detection from CT scans [4], where diagnostic accuracy is critical. High clean accuracy (87.5% on CIFAR-10) and robustness help prevent costly false negatives.
• Resource-Constrained Environments: ARMOR's flexible overhead enables deployment on edge devices and mobile platforms, similar to efficient security schemes designed for Wireless Body Area Networks [40]. The minimal configuration achieves only 1.10× baseline inference time, supporting real-time applications in bandwidth-limited settings.
• Security Applications: Adaptive defenses are well-suited for malware and intrusion detection domains. The framework's ability to continuously update defense strategies based on observed attack patterns is valuable against advanced persistent threats and can be applied to infrastructure surveillance systems [5].
ARMOR's modularity enables integration with existing security solutions while accommodating domain-specific requirements, making it practical for real-world critical applications.

7. Conclusion

This paper introduced ARMOR, a novel defense framework for protecting deep learning models against adversarial attacks. Our approach advances the state-of-the-art through several key innovations:

• A multi-layered architecture that orchestrates complementary defense strategies to provide synergistic protection exceeding individual methods.
• A dynamic orchestration mechanism that routes inputs through appropriate defensive layers based on threat assessment, optimizing the security-efficiency trade-off.
• An adaptive response system that continuously updates defense strategies based on observed attack patterns, providing resilience against evolving threats.
• Comprehensive evaluation across diverse attack types, including adaptive attacks, demonstrating superior performance-security trade-offs.

Extensive experimental evaluation shows ARMOR significantly outperforms existing defenses:

• 91.7% attack mitigation rate (18.3% improvement over ensemble averaging)
• 87.5% clean accuracy preservation (8.9% improvement over adversarial training alone)
• 76.4% robustness against adaptive attacks (23.2% increase over the strongest baseline)
• Minimal 1.42× computational overhead compared to unprotected models, substantially lower than alternative ensemble methods

Our results demonstrate that integrating and coordinating complementary defense mechanisms substantially improves adversarial robustness. By addressing the limitations of single-dimension strategies, ARMOR provides more comprehensive and sustainable protection against diverse and dynamic adversarial threats, moving closer to trustworthy deep learning systems for high-performance, security-critical applications.

Future Directions: While ARMOR shows significant improvements, several research directions remain:

• Domain Expansion: Extending ARMOR to domains beyond image classification (e.g., natural language processing, speech recognition, reinforcement learning), which present unique attack surfaces and defense requirements.
• Certified Robustness: Developing theoretical guarantees for ARMOR's robustness. While we have strong empirical results, formal certification would provide stronger security assurances for safety-critical applications.
• Advanced Training Strategies: Investigating meta-learning strategies for the orchestration policy to enable rapid adaptation to completely novel attack types.
• Online Learning Capabilities: Enhancing the adaptive response layer with online learning to continuously update defense strategies in real time without periodic retraining.
• Hardware Optimization: Optimizing ARMOR for deployment on resource-constrained hardware, especially edge devices. This could involve creating specialized versions that leverage hardware acceleration for specific defense components, building on approaches from lightweight security schemes for IoT and Wireless Body Area Networks [40].
• Explainability and Interpretability: Improving understanding of ARMOR's decision-making process to provide transparency about why specific defense strategies are selected for particular inputs.
• Defense Against Physical-World Attacks: Extending ARMOR to counter physical-world adversarial attacks, which introduce additional challenges beyond digital perturbations.

CRediT authorship contribution statement

Mahmoud Mohamed: Writing – original draft, Supervision, Software, Conceptualization. Fayaz AlJuaid: Writing – review & editing, Validation, Resources, Methodology, Formal analysis, Data curation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

[1] I.J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples, in: International Conference on Learning Representations, ICLR, 2015.
[2] N. Carlini, D. Wagner, Towards evaluating the robustness of neural networks, in: IEEE Symposium on Security and Privacy, SP, 2017, pp. 39–57.
[3] N. Akhtar, A. Mian, Threat of adversarial attacks on deep learning in computer vision: A survey, IEEE Access 6 (2018) 14410–14430.
[4] O. Akinlade, E. Vakaj, A. Dridi, S. Tiwari, F. Ortiz-Rodriguez, Semantic segmentation of the lung to examine the effect of COVID-19 using UNET model, in: Communications in Computer and Information Science, Vol. 2440, Springer, 2023, pp. 52–63, http://dx.doi.org/10.1007/978-3-031-34222-6_5.
[5] C. Wang, O. Akinlade, S.A. Ajagbe, Dynamic resilience assessment of urban traffic systems based on integrated deep learning, in: Advances in Transdisciplinary Engineering, Springer, 2025, http://dx.doi.org/10.3233/atde250238.
[6] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning models resistant to adversarial attacks, in: International Conference on Learning Representations, ICLR, 2018.
[7] C. Guo, M. Rana, M. Cisse, L. Van Der Maaten, Countering adversarial images using input transformations, in: International Conference on Learning Representations, ICLR, 2018.
[8] J.H. Metzen, T. Genewein, V. Fischer, B. Bischoff, On detecting adversarial perturbations, in: International Conference on Learning Representations, ICLR, 2017.
[9] F. Tramèr, N. Carlini, W. Brendel, A. Madry, On adaptive attacks to adversarial example defenses, Adv. Neural Inf. Process. Syst. (NeurIPS) 33 (2020) 1633–1645.
[10] A. Athalye, N. Carlini, D. Wagner, Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples, in: International Conference on Machine Learning, ICML, 2018, pp. 274–283.
[11] D. Kalibatiene, J. Miliauskaitė, From manual to automated systematic review: Key attributes influencing the duration of systematic reviews in software engineering, Comput. Stand. Interfaces 96 (2026) 104073, http://dx.doi.org/10.1016/j.csi.2025.104073.
[12] Y. Dong, Q.A. Fu, X. Yang, T. Pang, H. Su, Z. Xiao, J. Zhu, Benchmarking adversarial robustness on image classification, IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR) 32 (2020) 1331.
[13] T. Pang, K. Xu, C. Du, N. Chen, J. Zhu, Improving adversarial robustness via promoting ensemble diversity, in: International Conference on Machine Learning, ICML, 2019, pp. 4970–4979.
[14] G.R. Machado, E. Silva, R.R. Goldschmidt, Adversarial machine learning in image classification: A survey toward the defender's perspective, ACM Comput. Surv. 54 (5) (2021) 1–35.
[15] H. Zhang, Y. Yu, J. Jiao, E. Xing, L. El Ghaoui, M. Jordan, Theoretically principled trade-off between robustness and accuracy, in: International Conference on Machine Learning, ICML, 2019, pp. 7472–7482.
[16] E. Wong, L. Rice, J.Z. Kolter, Fast is better than free: Revisiting adversarial training, in: International Conference on Learning Representations, ICLR, 2020.
[17] S.A. Rebuffi, S. Gowal, D.A. Calian, F. Stimberg, O. Wiles, T. Mann, Fixing data augmentation to improve adversarial robustness, Adv. Neural Inf. Process. Syst. (NeurIPS) 34 (2021) 10213–10224.
[18] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, A. Madry, Robustness may be at odds with accuracy, in: International Conference on Learning Representations, ICLR, 2019.
[19] C. Xie, J. Wang, Z. Zhang, Z. Ren, A. Yuille, Mitigating adversarial effects through randomization, in: International Conference on Learning Representations, ICLR, 2018.
[20] M. Naseer, S. Khan, M. Hayat, F.S. Khan, F. Porikli, A self-supervised approach for adversarial robustness, in: IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2020, pp. 262–271.
[21] X. Jia, X. Wei, X. Cao, H. Foroosh, ComDefend: An efficient image compression model to defend adversarial examples, in: IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2019, pp. 6084–6092.
[22] K. Lee, K. Lee, H. Lee, J. Shin, A simple unified framework for detecting out-of-distribution samples and adversarial attacks, Adv. Neural Inf. Process. Syst. (NeurIPS) 31 (2018) 7167–7177.
[23] K. Roth, Y. Kilcher, T. Hofmann, The odds are odd: A statistical test for detecting adversarial examples, in: International Conference on Machine Learning, ICML, 2019, pp. 5498–5507.
[24] X. Ma, Y. Niu, L. Gu, Y. Wang, Y. Zhao, J. Bailey, F. Lu, Understanding adversarial attacks on deep learning based medical image analysis systems, Pattern Recognit. 110 (2021) 107332.
[25] N. Carlini, D. Wagner, Adversarial examples are not easily detected: Bypassing ten detection methods, in: ACM Workshop on Artificial Intelligence and Security, 2017, pp. 3–14.
[26] J. Cohen, E. Rosenfeld, Z. Kolter, Certified adversarial robustness via randomized smoothing, in: International Conference on Machine Learning, ICML, 2019, pp. 1310–1320.
[27] S. Gowal, K. Dvijotham, R. Stanforth, R. Bunel, C. Qin, J. Uesato, R. Arandjelovic, T. Mann, P. Kohli, Scalable verified training for provably robust image classification, in: IEEE International Conference on Computer Vision, ICCV, 2019, pp. 4842–4851.
[28] G. Singh, T. Gehr, M. Püschel, M. Vechev, An abstract domain for certifying neural networks, Proc. ACM Program. Lang. 3 (POPL) (2019) 1–30.
[29] G. Yang, T. Duan, J. Hu, H. Salman, I. Razenshteyn, J. Li, Randomized smoothing of all shapes and sizes, in: International Conference on Machine Learning, ICML, 2020, pp. 10693–10705.
[30] F. Croce, M. Andriushchenko, V. Sehwag, E. Debenedetti, N. Flammarion, M. Chiang, P. Mittal, M. Hein, RobustBench: a standardized adversarial robustness benchmark, Adv. Neural Inf. Process. Syst. (NeurIPS) 35 (2022) 32634–32651.
[31] F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, P. McDaniel, Ensemble adversarial training: Attacks and defenses, in: International Conference on Learning Representations, ICLR, 2018.
[32] S. Sen, N. Baracaldo, H. Ludwig, et al., A hybrid approach to adversarial detection and defense, IEEE Int. Conf. Big Data 423 (2020) 34242.
[33] T. Pang, C. Du, J. Zhu, et al., Towards robust detection of adversarial examples, Adv. Neural Inf. Process. Syst. (NeurIPS) 33 (2020) 10256–10267.
[34] S. Kariyappa, M. Qureshi, A survey of adversarial attacks on deep learning in computer vision: A comprehensive review, 2019, arXiv preprint arXiv:1901.09984.
[35] X. Wei, B. Liang, Y. Li, et al., Adversarial distillation: A survey, IEEE Trans. Neural Netw. Learn. Syst. (2021).
[36] A. Krizhevsky, et al., CIFAR-10 dataset, 2009, https://www.cs.toronto.edu/~kriz/cifar.html.
[37] Y. Netzer, et al., SVHN dataset, 2011, http://ufldl.stanford.edu/housenumbers/.
[38] J. Stallkamp, et al., GTSRB dataset, 2011, https://benchmark.ini.rub.de/gtsrb_dataset.html.
[39] J. Deng, et al., ImageNet dataset, 2009, https://image-net.org/.
[40] Z. Ali, J. Hassan, M.U. Aftab, N.W. Hundera, H. Xu, X. Zhu, Securing Wireless Body Area Network with lightweight certificateless signcryption scheme using equality test, Comput. Stand. Interfaces 96 (2026) 104070, http://dx.doi.org/10.1016/j.csi.2025.104070.
Computer Standards & Interfaces 97 (2026) 104125
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
AdaTraj-DP: An adaptive privacy framework for context-aware trajectory data publishing✩
Yongxin Zhao a, Chundong Wang a,b,∗, Hao Lin c,∗, Xumeng Wang d, Yixuan Song a, Qiuyu Du c
a Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin, China
b TianJin Police Institute, Tianjin, China
c College of Intelligent Science and Technology (College of Cyberspace Security), Inner Mongolia University of Technology, Inner Mongolia, China
d College of Cryptology and Cyber Science, Nankai University, Tianjin, China
ARTICLE INFO

Keywords: Differential privacy; Trustworthy AI; Trajectory data publishing; Personalized perturbation

ABSTRACT

Trajectory data are widely used in AI-based spatiotemporal analysis but raise privacy concerns due to their fine-grained nature and the potential for individual re-identification. Existing differential privacy (DP) approaches often apply uniform perturbation, which compromises spatial continuity, or adopt personalized mechanisms that overlook structural utility. This study introduces AdaTraj-DP, an adaptive differential privacy framework designed to balance trajectory-level protection and analytical utility. The framework combines context-aware sensitivity detection with hierarchical aggregation. Specifically, a dynamic sensitivity model evaluates privacy risks according to spatial density and semantic context, enabling adaptive allocation of privacy budgets. An adaptive perturbation mechanism then injects noise proportionally to the estimated sensitivity and represents trajectories through Hilbert-based encoding for prefix-oriented hierarchical aggregation with layer-wise budget distribution. Experiments conducted on the T-Drive and GeoLife datasets indicate that AdaTraj-DP maintains stable query accuracy, spatial consistency, and downstream analytical utility across varying privacy budgets while satisfying formal differential privacy guarantees.
1. Introduction

The proliferation of mobile devices, GPS sensors, and intelligent transportation infrastructures has resulted in the large-scale collection of spatiotemporal data. Such data serve as the foundation for numerous Location-Based Services (LBS), including navigation, ride-hailing, and urban planning [1,2]. Trajectory datasets record detailed sequences of individual movements, enabling a wide range of AI applications such as traffic forecasting, mobility prediction, and behavioral modeling. These applications have become indispensable for smart city management and autonomous systems, where the integrity and granularity of trajectory data directly affect analytical and decision-making accuracy.

Despite their utility, trajectory datasets raise critical privacy concerns for trustworthy AI. A single trajectory may expose an individual's home, workplace, or health-related locations, revealing sensitive behavioral patterns and social relationships [3,4]. Even after removing explicit identifiers, re-identification attacks can reconstruct personal traces with minimal auxiliary information [5]. Consequently, ensuring differential privacy for trajectory data has become essential to support reliable and ethically compliant AI development.

Differential Privacy (DP) [6] provides a rigorous mathematical guarantee against information leakage. However, its application to trajectory publishing introduces a persistent trade-off between privacy strength, data utility, and personalization, which conventional mechanisms fail to reconcile. Two primary gaps remain unresolved: (1) the tension between point-level perturbation and structural integrity; (2) the difficulty of adapting privacy budgets to varying contextual sensitivity. Early studies injected uniform Laplace noise into each location point [7,8], which protected individual coordinates but severely distorted the spatiotemporal correlation essential for route-level analysis. Subsequent hierarchical schemes based on prefix trees or space-filling curves [9,10] preserved aggregate statistics but relied on global, fixed privacy parameters, ignoring heterogeneous sensitivity across trajectories. Recent progress in Personalized Differential Privacy (PDP) [11–13] introduced adaptive noise based on semantic or frequency-based sensitivity, yet these methods typically lack integration with hierarchical
✩ This article is part of a Special issue entitled: Secure AI published in Computer Standards & Interfaces.
∗ Corresponding author at: Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin, China.
∗ Corresponding author.
E-mail addresses: zyx4237@163.com (Y. Zhao), michael3769@163.com (C. Wang), suzukaze_aoba@126.com (H. Lin), wangxumeng@nankai.edu.cn (X. Wang), fykatb0824@163.com (Q. Du).
https://doi.org/10.1016/j.csi.2025.104125
Received 29 October 2025; Received in revised form 25 December 2025; Accepted 29 December 2025
Available online 30 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
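The uniform point-level Laplace perturbation that the introduction critiques (independent noise added to every coordinate) can be sketched as follows. The sensitivity value and example coordinates are illustrative assumptions, not values taken from the paper:

```python
import math
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) via inverse-CDF sampling; u lies in (-0.5, 0.5)."""
    u = rng.random() - 0.5
    if u == -0.5:  # nudge the measure-zero endpoint to keep log() in-domain
        u = 0.0
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def perturb_point(lat, lon, epsilon, sensitivity=1.0, rng=random):
    """Classic point-level Laplace mechanism: each coordinate receives
    independent Laplace(sensitivity / epsilon) noise. This satisfies
    epsilon-DP per coordinate but, as the introduction notes, ignores
    the spatiotemporal correlation needed for route-level analysis."""
    scale = sensitivity / epsilon
    return (lat + laplace_noise(scale, rng),
            lon + laplace_noise(scale, rng))

rng = random.Random(0)
noisy = perturb_point(39.9042, 116.4074, epsilon=1.0, rng=rng)
```

A smaller 𝜖 yields a larger noise scale and stronger protection; this is exactly the uniform-budget behavior that AdaTraj-DP replaces with context-aware, per-point budget allocation.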
aggregation, resulting in limited query accuracy and poor scalability for AI model training.

To bridge this gap, we propose AdaTraj-DP, an adaptive differentially private trajectory publishing framework that unifies context-aware sensitivity modeling and hierarchical aggregation. AdaTraj-DP introduces a two-stage protection mechanism. The first stage detects and quantifies sensitivity using contextual and statistical cues, allowing adaptive privacy budget assignment at the point level. The second stage encodes perturbed trajectories into a hierarchical prefix tree, applying layer-wise budget allocation to preserve structural consistency for downstream analysis. This design ensures both localized protection and global analytical utility, addressing the core limitations of prior DP-based trajectory mechanisms.

The main contributions of this work are summarized as follows:

(1) We propose AdaTraj-DP, an adaptive framework that unifies personalized perturbation and hierarchical aggregation. By establishing a mathematical link between local coordinate noise and global prefix-tree structures, the framework ensures that fine-grained point-level protection remains structurally consistent with trajectory-level differential privacy guarantees, enabling high-fidelity reconstruction for downstream tasks.
(2) We design a context-aware sensitivity model that combines spatial density with semantic context to guide adaptive budget allocation. This mechanism quantifies privacy risks at a granular level, enabling the dynamic adjustment of perturbation intensity to balance privacy protection and data fidelity.
(3) We implement a hierarchical aggregation scheme utilizing Hilbert spatial mapping and logarithmic layer-wise budget distribution. Experiments on the T-Drive and GeoLife datasets validate the framework's effectiveness in preserving query accuracy, spatial consistency, and AI model performance under varying privacy budgets.

2. Related work

Existing privacy-preserving trajectory publishing approaches can be broadly categorized into three classes: (1) foundational differential privacy models that ensure privacy but compromise trajectory continuity; (2) structural aggregation mechanisms that enhance data utility via hierarchical organization; and (3) personalized and adaptive privacy protection strategies that tailor noise to sensitivity but often lack integration with structural models. This section reviews these three directions and discusses recent advances that motivate AdaTraj-DP.

2.1. Foundational models for differentially private trajectory publishing

Differential Privacy (DP) [6] is the standard formalism for privacy-preserving data publication. Early approaches discretize continuous spatio-temporal domains and inject Laplace noise into cell counts or simple aggregates [14,15], but such methods often disrupt trajectory continuity and reduce utility for route-level analysis [7]. To address this, research has explored trajectory generalization and synthetic data generation under DP, including clustering-based generalization [16] and GAN-based synthetic trajectory models [17–19]. Work on DP-aware data exploration and visualization (e.g., DPKnob and Defogger) highlights the challenge of configuring DP mechanisms to balance utility and risk in interactive settings and motivates user- or task-guided privacy configuration [20,21].

2.2. Structural aggregation for utility enhancement

Hierarchical structures, such as prefix trees, Hilbert-encoded sequences, and spatial index trees, have been widely adopted to preserve aggregate query utility under DP. Early prefix-tree methods aggregate shared prefixes to reduce noise impact [22,23], while R-tree and quadtree variants support spatial indexing under privacy constraints [7,10]. Recent work improves spatial locality and query accuracy using Hilbert/Geohash encodings and adaptive tree strategies [9]. Zhao et al.'s PerTrajTree-DP further integrates point-level sensitivity with prefix-tree publishing to better support trustworthy AI analytics [24]. Complementary systems research on private data access and explanation (e.g., DPXPlain, Saibot) demonstrates practical techniques for supporting DP-protected analytics and helping users interpret noisy aggregates [25,26].

2.3. Personalized and adaptive privacy protection

Personalized Differential Privacy (PDP) methods adapt protection to varying point- or user-level sensitivity. Semantics-driven approaches use POI categories or external labels to identify sensitive locations [27,28], and movement-model-based frameworks like OPTDP estimate privacy risk from mobility patterns [11]. Statistical personalization methods infer sensitivity from dataset properties; for example, TF-IDF-based approaches quantify local importance and global rarity to guide budget allocation [12,13]. Interactive tools and visual analytics (DPKnob, Defogger) provide practical support for configuring heterogeneous DP strategies according to utility goals [20,21].

In parallel, recent advances in differentially private deep learning and private model training yield methods for improved utility in noisy training regimes (e.g., optimized DP-SGD variants, selective-update training, and heterogeneous-noise schemes) that can inform budget allocation and model-aware privacy strategies in trajectory publishing [25,26,29–31]. These works highlight opportunities to close the gap between personalized point-level protection and structural aggregation, motivating AdaTraj-DP's integration of context-aware sensitivity detection, adaptive perturbation, and hierarchical encoding to support AI-oriented downstream tasks.

3. Preliminaries

Trajectory Representation. A trajectory T_i of user u_i is a temporally ordered sequence of geo-referenced points [32]:

T_i = {(p_{i,1}, t_{i,1}), (p_{i,2}, t_{i,2}), …, (p_{i,L_i}, t_{i,L_i})},  (1)

where p_{i,j} = (lat_{i,j}, lon_{i,j}) denotes the spatial coordinate and t_{i,j} is the timestamp. The trajectory dataset is denoted as 𝒟 = {T_1, T_2, …, T_N}. Each point can be projected into a discrete grid cell c_{i,j} for statistical analysis or further spatial encoding. The dimensionality and sampling irregularity of 𝒟 result in high sparsity and heterogeneous sensitivity among locations, which requires adaptive privacy mechanisms.

Differential Privacy. Let 𝒟_1 and 𝒟_2 be two neighboring datasets differing in at most one trajectory. A randomized mechanism ℳ satisfies ε-differential privacy if for any measurable subset O in the output space:

Pr[ℳ(𝒟_1) ∈ O] ≤ e^ε ⋅ Pr[ℳ(𝒟_2) ∈ O].  (2)

The privacy budget ε > 0 controls the trade-off between privacy protection and data utility. Smaller ε implies stronger privacy guarantees but larger perturbation noise.

For a numerical query f: 𝒟 → R^k with ℓ1 sensitivity Δf = max_{𝒟_1,𝒟_2} ‖f(𝒟_1) − f(𝒟_2)‖_1, the Laplace mechanism adds independent noise drawn from the Laplace distribution:

ℳ(𝒟) = f(𝒟) + Lap(Δf/ε).  (3)

This mechanism provides ε-differential privacy and is used in subsequent trajectory perturbation and aggregation processes.

Geographic Indistinguishability. For any two spatial points x, x′ ∈ R² and any reported location z, a mechanism ℳ achieves ε-geographic indistinguishability if

Pr[ℳ(x) = z] ≤ e^{ε⋅d(x,x′)} ⋅ Pr[ℳ(x′) = z],  (4)
where d(x, x′) is the Euclidean distance between x and x′ [33]. This formulation extends differential privacy to continuous spatial domains and provides distance-dependent protection.

Hierarchical Aggregation Structure. Trajectory data exhibit hierarchical correlations that can be represented through prefix-based aggregation. Let each discretized or encoded trajectory be expressed as a sequence of spatial identifiers S_i = [s_{i,1}, s_{i,2}, …, s_{i,L_i}]. A prefix tree 𝒯 organizes all trajectories in 𝒟 by shared prefixes, where each node v corresponds to a spatial prefix and maintains a count c(v) of trajectories passing through it. The hierarchical form allows noise to be injected at multiple granularities while preserving global spatial consistency. The total privacy budget ε_tree is distributed across tree layers to balance upper-level accuracy and lower-level detail preservation.

Problem Definition. Given a trajectory dataset 𝒟 consisting of N users and a total privacy budget ε_total, the objective is to design a mechanism ℳ_traj that releases a trajectory dataset 𝒟̃ = ℳ_traj(𝒟) satisfying:

(1) ℳ_traj ensures ε_total-differential privacy at the trajectory level;
(2) The released dataset 𝒟̃ preserves statistical and structural properties essential for AI-based spatiotemporal analysis;
(3) The expected analytical error between results obtained from 𝒟̃ and 𝒟 remains bounded.

Let f_AI(⋅) denote an AI model trained or evaluated on trajectory data. The utility preservation objective is formulated as

L_utility = E[ ‖f_AI(𝒟̃) − f_AI(𝒟)‖₂² ],  (5)

subject to 𝒟̃ satisfying ε_total-differential privacy. The goal is to minimize L_utility while maintaining formal privacy guarantees.

4. Proposed framework

Rapid development of AI-driven spatiotemporal analysis has increased the demand for high-quality trajectory data with strong privacy protection. Traditional differential privacy mechanisms often adopt fixed noise scales or uniform budget allocation, which can cause excessive utility degradation in dense areas or insufficient protection in sensitive regions. To address these limitations, this study proposes AdaTraj-DP, a framework that integrates adaptive personalized perturbation with hierarchical aggregation to achieve trajectory-level differential privacy while maintaining analytical utility for AI-based modeling.

As illustrated in Fig. 1, AdaTraj-DP operates in three main phases: (1) trajectory preprocessing and context-aware sensitivity detection; (2) adaptive personalized perturbation guided by local sensitivity and spatial density; (3) hierarchical aggregation using Hilbert encoding and dynamic layer-wise budget allocation.

Fig. 1. Framework of the proposed AdaTraj-DP scheme.

4.1. Context-aware sensitivity detection

Let 𝒟 = {T_1, …, T_N} denote the trajectory dataset after basic preprocessing. Each trajectory T_i = {(p_{i,1}, t_{i,1}), …, (p_{i,L_i}, t_{i,L_i})} consists of temporally ordered spatial points p_{i,j} = (lat_{i,j}, lon_{i,j}). The objective of this phase is to quantify the privacy sensitivity of each spatial point by combining statistical frequency and contextual semantics to guide subsequent adaptive perturbation.

Spatial Discretization. The continuous geographical domain is partitioned into a uniform grid of G × G cells. Each point p_{i,j} is mapped to a corresponding grid cell c_{i,j}. This transformation converts raw coordinates into discrete spatial tokens, enabling frequency-based statistical analysis.

Context-aware Sensitivity Measure. For each cell c_{i,j}, a sensitivity score S(c_{i,j}) is defined as

S(c_{i,j}) = TF(c_{i,j}, T_i) ⋅ IDF(c_{i,j}) ⋅ ω_c,  (6)

where TF(c_{i,j}, T_i) = count(c_{i,j} ∈ T_i) / L_i represents the normalized local frequency of visits within trajectory T_i, and IDF(c_{i,j}) = log( |𝒟| / |{T_k ∈ 𝒟 : c_{i,j} ∈ T_k}| ) denotes the global rarity of the location across the dataset. The term ω_c is a contextual weighting coefficient that quantifies the semantic sensitivity of a location category. Following the semantic sensitivity hierarchy established in [34], we assign higher weights to privacy-critical categories (e.g., ω_healthcare = 1.5, ω_residential = 1.2) to enforce stricter protection, while assigning lower base weights to public infrastructure (e.g., ω_road = 1.0). These semantic categories are mapped from public map services (e.g., OpenStreetMap), ensuring that the sensitivity configuration relies solely on public knowledge and does not consume the private budget.

Normalization and Classification. To unify the sensitivity scale, all scores are normalized into [0, 1]:

Ŝ(c_{i,j}) = (S(c_{i,j}) − min(S)) / (max(S) − min(S)).  (7)

Each point p_{i,j} is then labeled as sensitive or non-sensitive according to a predefined threshold θ_S:

label(p_{i,j}) = 1 if Ŝ(c_{i,j}) ≥ θ_S, and 0 otherwise.  (8)

The resulting annotated dataset is represented as 𝒟′ = {T′_1, T′_2, …, T′_N}, where each T′_i contains the points and corresponding sensitivity labels. The normalized score Ŝ(c_{i,j}) serves as a continuous privacy indicator in the subsequent adaptive perturbation phase.

4.2. Adaptive personalized perturbation

This phase injects controlled noise into all trajectory points in 𝒟′ to ensure trajectory-level differential privacy. All locations are perturbed to avoid inference risks arising from selective protection. The perturbation strength is adaptively adjusted based on the normalized sensitivity Ŝ(c_{i,j}) and local spatial density, allowing the mechanism to preserve analytical fidelity while maintaining formal privacy guarantees.

Adaptive Privacy Budget Allocation. Each trajectory point p_{i,j} is assigned an individual privacy budget ε_{p_{i,j}} determined by both its sensitivity level and spatial context. Let ρ(p_{i,j}) denote the local point density around p_{i,j} within a neighborhood radius r. The adaptive budget is defined as

ε_{p_{i,j}} = ε_max − (ε_max − ε_min) × ( α⋅Ŝ(c_{i,j}) + (1 − α)⋅(1 − ρ(p_{i,j})) ),  (9)

where α ∈ [0, 1] controls the balance between sensitivity-based and density-based adaptation. A higher Ŝ(c_{i,j}) or lower ρ(p_{i,j}) leads to a smaller ε_{p_{i,j}}, introducing stronger noise for privacy-critical or sparsely visited regions. The range [ε_min, ε_max] defines the permissible privacy strength, ensuring stability across heterogeneous data distributions.

Two-Dimensional Laplace Perturbation. For each point p_{i,j} = (lat_{i,j}, lon_{i,j}), independent Laplace noise is applied to both coordinates according to the assigned privacy budget:

p̃_{i,j} = ( lat_{i,j} + Laplace(0, 1/ε_{p_{i,j}}),  lon_{i,j} + Laplace(0, 1/ε_{p_{i,j}}) ).  (10)
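To make the sensitivity-detection phase concrete, the following is a minimal sketch of Eqs. (6)-(8), not the authors' implementation: trajectories are given as lists of grid-cell ids, and the category weights and threshold are illustrative placeholders.

```python
import math

def sensitivity_scores(trajectories, omega, theta_s=0.7):
    """Sketch of Eqs. (6)-(8): score = TF * IDF * omega_c, min-max normalize,
    then threshold into sensitive / non-sensitive labels.
    trajectories: list of trajectories, each a list of grid-cell ids.
    omega: cell id -> semantic weight (missing cells default to 1.0)."""
    n = len(trajectories)
    # document frequency: number of trajectories visiting each cell
    df = {}
    for traj in trajectories:
        for cell in set(traj):
            df[cell] = df.get(cell, 0) + 1
    raw = {}
    for i, traj in enumerate(trajectories):
        for cell in set(traj):
            tf = traj.count(cell) / len(traj)                 # local visit frequency
            idf = math.log(n / df[cell])                      # global rarity
            raw[(i, cell)] = tf * idf * omega.get(cell, 1.0)  # Eq. (6)
    lo, hi = min(raw.values()), max(raw.values())
    span = (hi - lo) or 1.0                                   # guard constant scores
    norm = {k: (v - lo) / span for k, v in raw.items()}       # Eq. (7)
    labels = {k: int(v >= theta_s) for k, v in norm.items()}  # Eq. (8)
    return norm, labels
```

Note how the design behaves on toy data: a cell visited by every trajectory has IDF = 0 and is never flagged, while a rare cell with a high semantic weight (e.g., a healthcare location) approaches the top of the normalized scale.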
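The adaptive budget of Eq. (9) and the coordinate-wise perturbation of Eq. (10) can be sketched as follows. The Laplace sampler (difference of two exponentials) and the default parameter values are illustrative assumptions, not the paper's code; with scale = Δf/ε the sampler is exactly the Laplace mechanism of Eq. (3).

```python
import random

def laplace(scale: float) -> float:
    """Laplace(0, scale) sample as a difference of two i.i.d. exponentials;
    with scale = sensitivity / epsilon this realizes Eq. (3)."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def adaptive_budget(s_hat: float, rho: float,
                    eps_min: float = 0.1, eps_max: float = 1.0,
                    alpha: float = 0.6) -> float:
    """Eq. (9): higher sensitivity or lower density -> smaller budget (more noise)."""
    mix = alpha * s_hat + (1.0 - alpha) * (1.0 - rho)
    return eps_max - (eps_max - eps_min) * mix

def perturb_point(lat: float, lon: float, eps_p: float):
    """Eq. (10): independent Laplace noise with scale 1/eps_p on each coordinate."""
    return lat + laplace(1.0 / eps_p), lon + laplace(1.0 / eps_p)
```

At the extremes, a non-sensitive point in a dense region receives ε_max (weakest noise) and a maximally sensitive point in a sparse region receives ε_min, matching the discussion after Eq. (9).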
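The hierarchical aggregation phase (Section 4.3) relies on a Hilbert space-filling curve H(⋅) for Eq. (11) and the layer-wise budget split of Eq. (12). A compact sketch of both follows, assuming the classic xy2d Hilbert construction on a power-of-two grid; this is a generic illustration, not the authors' implementation.

```python
import math

def hilbert_index(n: int, x: int, y: int) -> int:
    """Classic xy2d Hilbert construction: cell (x, y) of an n x n grid
    (n a power of two) -> curve index in [0, n*n). Plays the role of
    H(.) in Eq. (11)."""
    d, s = 0, n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                       # rotate/reflect the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def encode1d(v: int, l_enc: int = 16) -> str:
    """Fixed-length binary string s_{i,j} used as a prefix-tree symbol."""
    return format(v, "0{}b".format(l_enc))

def layer_budgets(eps_tree: float, variances, a: float = 1.0, gamma: float = 0.5):
    """Eq. (12): weight layer i by log(i + a) * (1 + gamma * sigma_i^2) and
    normalize so the per-layer budgets sum to eps_tree (Eq. (15))."""
    w = [math.log(i + a) * (1.0 + gamma * v) for i, v in enumerate(variances, 1)]
    total = sum(w)
    return [eps_tree * wi / total for wi in w]
```

Two properties worth checking: the Hilbert mapping is a bijection whose consecutive indices are grid neighbors (the locality that Eq. (11) exploits), and the layer budgets always sum to ε_tree regardless of the variance adjustment.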
Algorithm 1 Adaptive Personalized Perturbation under AdaTraj-DP
Input: Annotated dataset 𝒟′, privacy range [ε_min, ε_max], sensitivity scores Ŝ, balance coefficient α
Output: Perturbed dataset 𝒟″
1: 𝒟″ ← ∅
2: for each trajectory T_i ∈ 𝒟′ do
3:   T̃_i ← ∅
4:   for each point p_{i,j} in T_i do
5:     Compute local density ρ(p_{i,j})
6:     ε_{p_{i,j}} ← ε_max − (ε_max − ε_min) × (α⋅Ŝ(c_{i,j}) + (1 − α)⋅(1 − ρ(p_{i,j})))
7:     n_lat ∼ Laplace(0, 1/ε_{p_{i,j}})
8:     n_lon ∼ Laplace(0, 1/ε_{p_{i,j}})
9:     p̃_{i,j} ← (lat_{i,j} + n_lat, lon_{i,j} + n_lon)
10:    Append p̃_{i,j} to T̃_i
11:  end for
12:  Add T̃_i to 𝒟″
13: end for
14: return 𝒟″

The perturbed trajectory T̃_i = {p̃_{i,1}, p̃_{i,2}, …, p̃_{i,L_i}} is constructed by replacing each original point with its perturbed counterpart. The complete differentially private dataset is denoted as 𝒟″ = {T̃_1, T̃_2, …, T̃_N}. Algorithm 1 outlines the adaptive personalized perturbation procedure.

4.3. Hierarchical aggregation with dynamic budget allocation

This phase organizes the perturbed trajectories into a structured form for privacy-preserving analytical querying and AI model training. A hierarchical prefix tree is constructed from the encoded trajectories, where node counts are perturbed under a dynamically adjusted budget to preserve global consistency while mitigating noise propagation.

Spatial Encoding via Hilbert Curve. Each perturbed point p̃_{i,j} ∈ 𝒟″ is mapped into a one-dimensional integer value v_{i,j} using a Hilbert space-filling curve H(⋅), ensuring spatial locality preservation:

v_{i,j} = H(p̃_{i,j}).  (11)

Each integer value v_{i,j} is then converted into a fixed-length binary string s_{i,j} of length L_enc, forming a discretized trajectory representation S_i = [s_{i,1}, s_{i,2}, …, s_{i,L_i}]. The set of all encoded trajectories {S_i} constitutes the input to hierarchical aggregation. The technical details of this Hilbert-to-binary-string encoding, including the relationship between the curve's order and the string length, are elaborated in the Appendix.

Prefix Tree Construction. A prefix tree 𝒯 is built from {S_i}, where each path from the root to a node v represents a spatial prefix, and the node count c(v) indicates the number of trajectories sharing that prefix. The maximum tree depth h corresponds to the maximum trajectory length or encoding depth.

Dynamic Layer-wise Budget Allocation. The total privacy budget ε_tree is distributed across tree layers according to both layer depth and statistical variance. Let σ_i² denote the empirical variance of node counts at layer i. The adaptive allocation for layer i is defined as

ε_{level,i} = [ log(i + a) ⋅ (1 + γσ_i²) / Σ_{j=1}^{h} log(j + a) ⋅ (1 + γσ_j²) ] ⋅ ε_tree,  (12)

where a > 0 is a smoothing parameter and γ ≥ 0 controls the weight of variance-based adjustment. Adopting the logarithmic strategy from [9], the function log(i + a) is selected to smooth the budget decay across layers. Unlike linear or exponential allocation schemes, which might excessively penalize deeper layers and lead to significant information loss in fine-grained trajectories, the logarithmic term ensures that leaf nodes retain sufficient privacy budget to preserve local spatial details.

Differentially Private Node Perturbation. For each node v at layer i, the sensitivity of its count query is Δf = 1. Laplace noise is applied according to its layer-wise budget:

c̃(v) = c(v) + Laplace(0, 1/ε_{level,i}).  (13)

The resulting prefix tree 𝒯 with perturbed counts serves as a privacy-preserving hierarchical representation supporting aggregate analytics and AI-based trajectory modeling. Algorithm 2 summarizes the hierarchical aggregation process with dynamic budget adjustment.

Algorithm 2 Dynamic Hierarchical Aggregation under AdaTraj-DP
Input: Perturbed dataset 𝒟″, total tree budget ε_tree, height h, parameters a, γ, encoding length L_enc
Output: Privacy-aware prefix tree 𝒯
1: Initialize empty tree 𝒯
2: for each trajectory T̃_i = {p̃_{i,1}, …, p̃_{i,L_i}} in 𝒟″ do
3:   Encode trajectory: S_i ← [Encode1D(H(p̃_{i,1})), …, Encode1D(H(p̃_{i,L_i}))]
4:   Insert S_i into 𝒯 and increment node counts along each path
5: end for
6: for layer i = 1 to h do
7:   Compute node count variance σ_i²
8:   ε_{level,i} ← [ log(i + a) ⋅ (1 + γσ_i²) / Σ_{j=1}^{h} log(j + a) ⋅ (1 + γσ_j²) ] ⋅ ε_tree
9:   for each node v at layer i do
10:    c̃(v) ← c(v) + Laplace(0, 1/ε_{level,i})
11:    Update c(v) ← c̃(v)
12:  end for
13: end for
14: return 𝒯

4.4. Privacy analysis

The proposed AdaTraj-DP framework comprises two sequential privacy-preserving mechanisms: adaptive personalized perturbation (with budget ε_point) and hierarchical aggregation (with budget ε_tree). By the sequential composition theorem of differential privacy, the total privacy guarantee satisfies

ε_total = ε_point + ε_tree.  (14)

Privacy of Adaptive Personalized Perturbation (ε_point). The adaptive perturbation mechanism assigns an individual privacy budget ε_{p_{i,j}} to each trajectory point p_{i,j}, derived from its normalized sensitivity Ŝ(c_{i,j}) and local density ρ(p_{i,j}). To ensure rigorous privacy guarantees, it is assumed that the global weighting parameters (e.g., contextual weights ω_c and density thresholds) are computed from public sources, such as map topologies or non-sensitive historical statistics. This reliance on public metadata is a standard practice in privacy-preserving spatial publishing [14,33], ensuring that the sensitivity calibration process itself does not leak private information. Consequently, the allocated budget ε_{p_{i,j}} depends solely on the characteristics of its corresponding trajectory T_i. Under this assumption:

(1) The assignment of ε_{p_{i,j}} relies solely on local statistics within T_i and public constants, which ensures independence among users.
(2) Each trajectory is processed through an independent Laplace mechanism. For any point p_{i,j}, the Laplace mechanism with scale 1/ε_{p_{i,j}} satisfies ε_{p_{i,j}}-differential privacy.
(3) Because the budgets are bounded within [ε_min, ε_max], the overall privacy cost of this phase is dominated by the smallest allocated budget, and the worst-case (strongest) guarantee corresponds to ε_min-DP for each point.
(4) By parallel composition across trajectories, the global privacy consumption of this phase is ε_point = ε_max, representing the maximum privacy loss incurred when the weakest noise is added.

Hence, the adaptive perturbation phase satisfies ε_max-differential privacy.

Privacy of Hierarchical Aggregation (ε_tree). The hierarchical aggregation mechanism constructs a prefix tree and perturbs its node counts with layer-specific noise calibrated by ε_{level,i}. Each trajectory affects exactly one node per layer, implying that the sensitivity of the count query at any layer is Δf = 1. Adding Laplace noise with scale 1/ε_{level,i} guarantees ε_{level,i}-DP for that layer. Because the per-layer budgets ε_{level,i} are partitioned from ε_tree according to

Σ_{i=1}^{h} ε_{level,i} = ε_tree,  (15)

and the layers are sequentially composed along each trajectory path, the entire prefix tree synthesis mechanism satisfies ε_tree-differential privacy. The dynamic allocation factor (1 + γσ_i²) modifies the budget distribution without altering the total privacy bound, ensuring that the overall guarantee remains unchanged.

Overall Privacy Guarantee. Applying the sequential composition theorem to the two phases yields the total privacy protection level:

ε_total = ε_max + ε_tree.  (16)

This ensures that AdaTraj-DP provides formal, trajectory-level differential privacy. The adaptive and hierarchical mechanisms jointly maintain consistent privacy guarantees while supporting utility-preserving analysis for AI-based spatiotemporal modeling.

5. Experimental evaluation

This section presents an extensive empirical evaluation of the proposed AdaTraj-DP framework. The experiments aim to validate both privacy preservation and analytical utility in AI-oriented trajectory publishing. Specifically, we address the following research questions:

• RQ1: How does the total privacy budget ε_total affect the analytical utility of the released trajectories?
• RQ2: How does AdaTraj-DP perform compared to state-of-the-art differential privacy mechanisms in terms of accuracy and computational efficiency?
• RQ3: What are the impacts of the adaptive parameters, including allocation ratio α and variance factor γ, on privacy–utility trade-offs?

5.1. Experimental setup

This subsection introduces the datasets, baseline methods, evaluation metrics, and parameter configurations used in the experiments.

5.1.1. Datasets

Experiments are primarily conducted on the widely used T-Drive dataset, which records GPS trajectories of 10,357 taxis in Beijing over seven days (February 2–8, 2008) [35]. It contains approximately 15 million spatial points after preprocessing. To further verify cross-domain robustness, we additionally include the GeoLife dataset [36], which comprises 17,621 trajectories from 182 users, covering both dense urban and sparse suburban mobility patterns.

Both datasets are preprocessed by: (1) removing sampling intervals exceeding 300 s; (2) filtering out trajectories shorter than 20 points; (3) normalizing all coordinates into a [0, 1] × [0, 1] grid to ensure scale comparability.

These datasets collectively provide both high-density and low-density spatial distributions, enabling a fair evaluation of the proposed context-aware sensitivity modeling.

5.1.2. Baseline methods

To demonstrate the advantages of AdaTraj-DP, we compare it with four representative baselines, each reflecting a distinct privacy design paradigm:

• HA-Tree [9]: A hierarchical aggregation method based on Hilbert mapping and fixed logarithmic budget allocation, representing state-of-the-art static DP trees.
• TFIDF-DP [13]: A personalized perturbation method using TF-IDF-based sensitivity scoring without hierarchical structure, corresponding to point-level DP only.
• QJLP (LDP) [7]: A local differential privacy baseline where each trajectory is perturbed independently on the client side.
• AdaTraj-DP (Ours): The proposed adaptive framework that combines context-aware sensitivity detection, adaptive perturbation, and dynamic hierarchical aggregation.

5.1.3. Evaluation metrics

Performance is evaluated from three complementary perspectives:

Data Utility. We adopt three quantitative metrics: Mean Absolute Error (MAE), Mean Relative Error (MRE), and Hausdorff Distance (HD). MAE and MRE evaluate accuracy for range-count queries on perturbed trajectories, while HD measures spatial fidelity between original and released datasets.

Model Utility. To align with AI-oriented evaluation, we train a downstream trajectory classification model based on a lightweight Mamba encoder [37]. The model predicts driver ID from trajectory segments, and classification accuracy on the perturbed data reflects end-task utility (U_cls).

Computational Efficiency. We report total runtime (T_total) from preprocessing to privacy-protected publication, including all three phases of AdaTraj-DP.

5.1.4. Parameter configuration

Unless otherwise stated, experiments use the following default configuration: the total privacy budget ε_total is divided by an allocation ratio α, where α ∈ [0.3, 0.7] controls the portion used for adaptive perturbation (ε_point), and (1 − α) for hierarchical aggregation (ε_tree):

ε_point = α⋅ε_total,  ε_tree = (1 − α)⋅ε_total.  (17)

We vary ε_total from 0.5 to 3.0 to investigate the privacy–utility trade-off. The variance factor γ controlling dynamic budget adaptation is selected from {0, 0.2, 0.5, 1.0}, and the hierarchical smoothing parameter is set to a = 1.0. The sensitivity threshold θ_S for classifying sensitive points is chosen from {0.6, 0.7, 0.8, 0.9}. The personalized budget range is fixed at [ε_min, ε_max] = [0.1, 1.0]. To ensure comparability, all methods share identical grid resolution (G = 128) and Hilbert encoding length (L_enc = 16). All experiments are implemented in Python 3.8 with PyTorch 2.4 on an NVIDIA RTX 4090 GPU.

5.2. RQ1: Data utility evaluation

This experiment evaluates how AdaTraj-DP preserves the analytical utility of published trajectories under different privacy budgets.
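As a quick illustration of the budget split in Eq. (17) and its recomposition under sequential composition (Eq. (14)), a two-line helper; the default α = 0.6 is only an example value:

```python
def split_budget(eps_total: float, alpha: float = 0.6):
    """Eq. (17): eps_point = alpha * eps_total, eps_tree = (1 - alpha) * eps_total.
    Sequential composition (Eq. (14)) recomposes them to eps_total."""
    return alpha * eps_total, (1.0 - alpha) * eps_total
```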
Fig. 2. Trajectory count query accuracy under varying ε_total on both datasets: (a) MAE of count queries; (b) MRE of count queries.

All evaluations are conducted on both the T-Drive and GeoLife datasets, covering dense and sparse mobility scenarios to ensure cross-domain consistency.

5.2.1. Accuracy of trajectory count queries

We evaluate the ability of each method to answer prefix-based count queries accurately. For each dataset, a query set 𝒬 consisting of 1000 random trajectory prefixes with lengths between 4 and 8 is selected. Let c(q) denote the true count of trajectories matching prefix q ∈ 𝒬, and ĉ(q) be the noisy count returned by the mechanism. The data utility is quantified using Mean Absolute Error (MAE) and Mean Relative Error (MRE), defined as:

MAE = (1/|𝒬|) Σ_{q∈𝒬} |c(q) − ĉ(q)|,  MRE = (1/|𝒬|) Σ_{q∈𝒬} |c(q) − ĉ(q)| / max(c(q), δ),  (18)

where δ is a smoothing parameter (set to 1% of the total dataset size) to prevent division by zero for small counts. The results are averaged over ten repetitions with independent noise realizations.

Effect of Privacy Budget ε_total. Figs. 2(a) and 2(b) illustrate the quantitative relationship between privacy strength and data utility. All methods exhibit a convex error decay curve as ε_total increases from 0.5 to 3.0, reflecting the fundamental differential privacy trade-off. In the strict privacy regime (ε_total ∈ [0.5, 1.5]), our method achieves the steepest marginal reduction in MAE, indicating a high return on privacy budget investment. Specifically, when ε_total increases from 0.5 to 1.0, AdaTraj-DP reduces the MAE by approximately 45.3% (from 18.1 to 9.9), whereas the second-best baseline, HA-Tree, only achieves a 31.4% reduction. This quantitative gap demonstrates that AdaTraj-DP yields a significantly higher marginal utility gain for every unit of privacy budget expended compared to static hierarchical structures.

5.2.2. Preservation of spatial distribution

Spatial fidelity evaluates the geometric similarity between the original and perturbed trajectories. We use two complementary metrics: the Hausdorff Distance (HD) for worst-case deviation and the Mean Displacement (MD) for average positional distortion.

Effect of Privacy Budget ε_total. Fig. 3 and Table 1 summarize the spatial accuracy across privacy levels. For both the T-Drive and GeoLife datasets, AdaTraj-DP consistently achieves smaller deviations, demonstrating its robustness across data densities and spatial patterns. The sensitivity-guided perturbation preserves local consistency, while adaptive budget redistribution reduces distortion in dense urban regions.

Table 1
Spatial fidelity comparison (average over T-Drive and GeoLife datasets). Lower values indicate higher spatial accuracy.

ε_total  Hausdorff Distance (HD)              Mean Displacement (MD)
         AdaTraj-DP  Best Baseline            AdaTraj-DP  Best Baseline
0.5      0.152       0.171 (HA-Tree)          0.098       0.113 (HA-Tree)
1.0      0.096       0.127 (HA-Tree)          0.069       0.087 (HA-Tree)
1.5      0.089       0.125 (TFIDF-DP)         0.063       0.088 (TFIDF-DP)
2.0      0.083       0.118 (TFIDF-DP)         0.059       0.083 (TFIDF-DP)
3.0      0.079       0.130 (QJLP)             0.056       0.094 (QJLP)

Overall, AdaTraj-DP demonstrates consistent spatial and statistical accuracy across both datasets, validating its generalizability to heterogeneous mobility distributions.

5.3. RQ2: Model utility evaluation

This experiment evaluates how the differentially private trajectories generated by AdaTraj-DP retain their utility for AI-based downstream tasks. Two representative learning tasks are considered: (1) trajectory classification, which predicts the semantic category of a movement sequence; (2) destination prediction, which estimates the likely endpoint of an ongoing trajectory. These tasks are evaluated on the T-Drive and GeoLife datasets to reflect both dense and sparse urban mobility environments.

5.3.1. Trajectory classification

A hierarchical Transformer-based model with positional encoding is trained on the published trajectories to perform multi-class trajectory classification. The model architecture follows a standard encoder setup with three attention layers and a hidden size of 256. Each experiment is repeated five times under independent noise realizations, and the average classification accuracy and macro F1-score are reported. The total privacy budget ε_total is varied from 0.5 to 3.0.

Effect of Privacy Budget ε_total. Figs. 4(a) and 4(b) illustrate the influence of ε_total on model performance. As the privacy budget increases, both accuracy and F1-score improve across all methods. AdaTraj-DP consistently maintains the highest model utility on both datasets, demonstrating that adaptive sensitivity control effectively preserves discriminative features. The hierarchical tree representation mitigates local noise accumulation, supporting stable model convergence.

5.3.2. Destination prediction

To evaluate predictive consistency, a sequence-to-sequence neural decoder is trained to predict the destination region of each trajectory prefix. Prediction accuracy is measured by the top-1 hit rate, while spatial accuracy is quantified by the mean geodesic distance between predicted and true destinations.

Effect of Privacy Budget ε_total. Figs. 5(a) and 5(b) illustrate the results of destination prediction across both datasets. AdaTraj-DP maintains stable predictive performance even under strict privacy constraints (ε_total < 1.0), consistently outperforming fixed-budget baselines that cannot adapt to local sensitivity variations. As the privacy budget increases, the prediction accuracy steadily improves, while the mean spatial deviation between predicted and true destinations decreases. This demonstrates that adaptive perturbation and hierarchical encoding together preserve mobility semantics and ensure downstream models can effectively capture trajectory intent despite injected noise.
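The two spatial-fidelity metrics used in Section 5.2.2 can likewise be sketched straight from their definitions (a worst-case set distance and an average paired displacement); this is a generic implementation, not the paper's evaluation code, and it uses Euclidean rather than geodesic distance:

```python
import math

def hausdorff(set_a, set_b):
    """Symmetric Hausdorff distance: worst-case deviation between point sets."""
    h_ab = max(min(math.dist(a, b) for b in set_b) for a in set_a)
    h_ba = max(min(math.dist(a, b) for a in set_a) for b in set_b)
    return max(h_ab, h_ba)

def mean_displacement(original, perturbed):
    """Mean Displacement: average paired positional distortion of a trajectory."""
    return sum(math.dist(p, q) for p, q in zip(original, perturbed)) / len(original)
```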
(a) Hausdorff Distance vs. Privacy (b) Mean Displacement vs. Privacy
Budget Budget
Fig. 3. Spatial fidelity comparison on T-Drive and GeoLife datasets.
Fig. 4. Trajectory classification performance under varying 𝜀total on T-Drive and GeoLife datasets. (a) Classification Accuracy; (b) F1-score.

Fig. 5. Destination prediction accuracy and spatial deviation under varying 𝜀total on T-Drive and GeoLife datasets. (a) Destination Prediction Accuracy (Top-1 Hit Rate); (b) Destination Prediction Mean Distance Error (km).
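The destination metrics in Fig. 5 can be sketched in the same spirit. A hedged example, assuming latitude/longitude coordinates and using the haversine formula as the geodesic distance (the function names and the spherical-Earth approximation are ours, not the paper's):

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def top1_hit_rate(pred_regions, true_regions):
    """Fraction of trajectories whose top-1 predicted region is correct."""
    hits = sum(p == t for p, t in zip(pred_regions, true_regions))
    return hits / len(true_regions)

def mean_distance_error_km(pred_coords, true_coords):
    """Mean geodesic distance between predicted and true destinations."""
    dists = [haversine_km(p[0], p[1], t[0], t[1])
             for p, t in zip(pred_coords, true_coords)]
    return sum(dists) / len(dists)
```

For example, `top1_hit_rate([1, 2, 3], [1, 2, 4])` evaluates to 2/3, and `haversine_km` of a point with itself is zero; the mean distance error aggregates such per-trajectory geodesic deviations into the single km value plotted in Fig. 5(b).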
5.4. RQ3: Parameter sensitivity analysis

This experiment investigates the effect of key parameters in AdaTraj-DP on the privacy–utility balance, focusing on two critical hyperparameters: the budget allocation ratio 𝛼 and the sensitivity threshold 𝜃TFIDF. All experiments are conducted with the total privacy budget 𝜀total = 1.5 on both the T-Drive and GeoLife datasets.

5.4.1. Effect of budget allocation ratio 𝛼

The parameter 𝛼 controls the distribution of the total privacy budget between the point-level perturbation and the hierarchical tree aggregation phases, where 𝜀point = 𝛼𝜀total and 𝜀tree = (1 − 𝛼)𝜀total. A small 𝛼 assigns more budget to aggregation, reducing hierarchical noise, whereas a large 𝛼 increases point-level fidelity at the expense of tree consistency. We vary 𝛼 from 0.1 to 0.9 and evaluate both data utility and model accuracy.

Fig. 6 presents the effect of 𝛼 on count query error (MAE) and trajectory classification accuracy. An optimal trade-off is observed near 𝛼 = 0.6, where both the query error and model accuracy achieve near-balanced performance. When 𝛼 < 0.4, excessive noise in point perturbation causes degraded spatial precision, while 𝛼 > 0.8 reduces the reliability of aggregated counts in the prefix tree, highlighting the necessity of coordinated budget allocation.

In practice, the optimal 𝛼 depends on the specific utility requirements. For applications prioritizing fine-grained point precision (e.g., destination prediction), a larger 𝛼 (e.g., 0.6–0.7) is recommended to allocate more budget to the perturbation phase. Conversely, for range query tasks relying on aggregate statistics, a smaller 𝛼 favors the hierarchical tree structure. An empirical strategy for parameter selection involves using a small, non-sensitive validation set to estimate the inflection point of the loss function. A balanced initialization of 𝛼 = 0.6 is recommended as a default setting, which prioritizes neither point-level perturbation nor structural aggregation excessively. To ensure privacy integrity, this validation set is constructed from public historical trajectory data (e.g., open-source T-Drive samples) or a disjoint subset of historical records that does not overlap with the private dataset. This separation guarantees that the hyperparameter tuning process relies solely on public knowledge and does not consume the privacy budget allocated for the sensitive data.

Fig. 6. Impact of budget allocation ratio 𝛼 on query utility and model performance at 𝜀total = 1.5.

5.4.2. Effect of sensitivity threshold 𝜃TFIDF

The threshold 𝜃TFIDF determines how many trajectory points are classified as sensitive during the TFIDF-based detection process. A smaller threshold labels more points as sensitive, resulting in stronger protection but higher noise magnitude. We vary 𝜃TFIDF from 0.6 to 1.2 and evaluate the mean displacement (MD) and destination prediction accuracy.

Fig. 7 depicts the variation of spatial fidelity and predictive utility under different 𝜃TFIDF values. As 𝜃TFIDF increases, the number of sensitive points decreases, leading to reduced perturbation intensity and smaller average displacement. However, excessively large 𝜃TFIDF weakens privacy coverage and slightly degrades downstream prediction accuracy. The optimal setting is observed around 𝜃TFIDF = 0.9, balancing spatial accuracy with model generalization.

Fig. 7. Effect of the sensitivity threshold 𝜃TFIDF on spatial fidelity and predictive performance at 𝜀total = 1.5.

5.4.3. Generalization and parameter stability

In the ablation studies presented above, we observed that the framework's utility is responsive to variations in the budget allocation ratio 𝛼 and sensitivity threshold 𝜃TFIDF, particularly when these parameters approach the boundaries of their respective ranges. This sensitivity necessitates a discussion on the model's generalization capabilities across different data distributions.

While the framework exhibits sensitivity to extreme parameter variations, it is worth noting that the optimal operating points (𝛼 ≈ 0.6, 𝜃TFIDF ≈ 0.9) remain consistent across both the high-density T-Drive dataset and the sparse, diverse GeoLife dataset. This cross-dataset stability suggests that AdaTraj-DP is robust to heterogeneous spatial distributions, indicating that a standard parameter configuration can yield reliable performance without the need for exhaustive hyperparameter retuning for every new application scenario.

5.5. Scalability analysis

To address practical deployment concerns, particularly for city-wide scenarios, we analyze the scalability of AdaTraj-DP regarding both dataset volume (number of users 𝑁) and temporal duration (trajectory length 𝐿).

Scalability to Large-scale User Datasets. The computational complexity of AdaTraj-DP is dominated by the linear scanning of trajectory points. Specifically, the sensitivity detection and adaptive perturbation phases operate on each trajectory independently, with a time complexity of 𝑂(𝑁𝐿). This independence allows for trivial parallelization across multiple processors, significantly reducing runtime on large-scale datasets. Furthermore, the hierarchical aggregation phase inserts encoded sequences into the prefix tree with a complexity of 𝑂(𝑁𝐿), avoiding the quadratic 𝑂(𝑁^2) pairwise comparisons often required by clustering-based or 𝐾-anonymity approaches. Consequently, the runtime of AdaTraj-DP grows linearly with the number of users, indicating that the framework is scalable to large-scale spatiotemporal datasets typical of modern urban computing.

Robustness for Long Historical Trajectories. For long historical trajectories, the challenge lies in maintaining structural efficiency and data utility as the sequence length increases. AdaTraj-DP addresses this through two mechanisms:

(1) Efficient Encoding: The Hilbert space-filling curve maps high-dimensional spatial points into 1D integers via efficient bit-wise operations. Since the encoding complexity is constant per point, the computational cost scales linearly with the trajectory length, avoiding the performance bottlenecks often associated with complex sequence alignment methods.

(2) Depth-Robust Aggregation: Long trajectories naturally necessitate deeper prefix trees, which typically suffer from severe budget dilution at lower levels. AdaTraj-DP addresses this through its logarithmic layer-wise allocation (Eq. (12)), which dampens the noise increase rate relative to tree depth. This mechanism ensures that the tail ends of extended mobility sequences retain analytical utility, preventing the rapid signal degradation commonly observed in uniform allocation schemes.

Fig. 8. Computational cost decomposition of AdaTraj-DP across three key stages.

Empirical Efficiency Evaluation. To complement the theoretical complexity analysis, Fig. 8 presents the empirical runtime decomposition of AdaTraj-DP on the T-Drive dataset. The total processing time is approximately 250 s. As observed, the TFIDF Analysis phase constitutes the majority of the computational overhead (approx. 60%) due to the necessity of global statistical aggregation across the spatial grid. However, the core privacy mechanisms (Prefix Tree Construction and Perturbation) demonstrate high efficiency. Notably, the adaptive perturbation phase accounts for less than 10% of the total time, confirming that the granular noise injection introduces negligible latency. This performance profile validates that AdaTraj-DP is well-suited for periodic batch publishing scenarios (e.g., releasing trajectory updates every 5–10 min for traffic monitoring). While the current execution time is sufficient for such batch-based near-real-time analytics, we acknowledge that strictly latency-critical streaming applications may require further optimization of the tree construction process. Nevertheless, for the targeted high-utility analysis tasks, this computational cost is a justifiable trade-off for the structural consistency provided by the framework.

6. Conclusion

This study presented AdaTraj-DP, an adaptive privacy-preserving framework for publishing trajectory data with differential privacy guarantees. The framework introduces context-aware sensitivity modeling and adaptive budget allocation to balance privacy protection and analytical utility in AI-based mobility analysis. By integrating personalized perturbation with hierarchical prefix-tree aggregation, AdaTraj-DP enables trajectory-level differential privacy while maintaining spatial fidelity and downstream model performance.

Future work will focus on extending AdaTraj-DP to support multi-modal trajectory data, integrating semantic and temporal context under unified privacy constraints. Additionally, to address the efficiency concerns in high-frequency streaming environments, we plan to investigate incremental tree update algorithms. This would allow the framework to handle real-time data streams with significantly lower latency while maintaining the established privacy guarantees.

CRediT authorship contribution statement

Yongxin Zhao: Writing – review & editing, Writing – original draft, Visualization, Validation, Methodology, Investigation, Data curation, Conceptualization. Chundong Wang: Writing – review & editing, Project administration, Methodology. Hao Lin: Visualization, Validation, Methodology. Xumeng Wang: Writing – review & editing, Methodology, Conceptualization. Yixuan Song: Methodology, Investigation, Conceptualization. Qiuyu Du: Investigation, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

Thanks to the National Key R&D Program of China (2023YFB2703900).

Appendix. Conversion from integer values to binary sequences

Our prefix tree construction necessitates the representation of each geographic coordinate as a character sequence. Although the Hilbert space-filling curve successfully transforms a two-dimensional coordinate 𝑝𝑖,𝑗 into a one-dimensional integer 𝑣𝑖,𝑗, this numerical value cannot be directly incorporated into a conventional prefix tree structure. Consequently, we implement an additional transformation phase that converts this integer into a binary sequence 𝑠𝑖,𝑗 with fixed length.

This transformation is controlled by the Hilbert curve's order parameter, designated as 𝑘. When applying a Hilbert curve with order 𝑘, the two-dimensional space becomes divided into a (2^𝑘) × (2^𝑘) cellular grid. To guarantee that every coordinate within dataset 𝐷 receives a distinct Hilbert index assignment, the order parameter must fulfill the condition 𝑘 ≥ ⌈log2 √|𝐷|⌉. This configuration assigns each cell, including any coordinate it contains, to a unique integer within the interval [0, (2^𝑘)^2 − 1].

The binary sequence length, denoted 𝐿enc, depends on the total count of representable integer values. Representing all (2^𝑘)^2 = 2^(2𝑘) distinct values necessitates a binary sequence of length 𝐿enc = 2𝑘. The transformation consists of a direct conversion from integer 𝑣𝑖,𝑗 to its 𝐿enc-bit binary form, applying leading zero-padding when needed to maintain uniform length.

Consider the following illustration: assume a Hilbert curve with order 𝑘 = 8. Under these conditions, the cellular count equals (2^8)^2 = 65,536, the integer value 𝑣𝑖,𝑗 resides within the interval [0, 65535], and the necessary binary sequence length becomes 𝐿enc = 2 × 8 = 16.

When coordinate 𝑝𝑖,𝑗 maps to integer 𝑣𝑖,𝑗 = 47593, its 16-bit binary sequence representation becomes:

𝑠𝑖,𝑗 = Encode(47593, 16) = "1011100111101001". (A.1)

This sequence 𝑠𝑖,𝑗 serves as the actual element for navigating and constructing the prefix tree. Individual bits within the sequence determine decisions at corresponding tree levels, establishing a multi-level spatial indexing structure. The selection of parameter 𝑘 (and consequently 𝐿enc) represents a crucial design choice that mediates between spatial granularity and the prefix tree's dimensions and computational overhead.

Data availability

Data will be made available on request.
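The appendix's integer-to-binary conversion is straightforward to make concrete. A minimal sketch, assuming only the order rule (2^𝑘)^2 ≥ |𝐷|, i.e. 𝑘 ≥ ⌈log2 √|𝐷|⌉, and the fixed length 𝐿enc = 2𝑘 stated in the appendix (helper names are our own); the worked example reproduces Eq. (A.1):

```python
import math

def min_hilbert_order(num_points):
    # Order k must satisfy (2**k)**2 >= num_points so that every
    # coordinate can receive a distinct cell index, i.e.
    # k >= ceil(log2(sqrt(num_points))).
    return math.ceil(math.log2(math.sqrt(num_points))) if num_points > 1 else 1

def encode(v, l_enc):
    """Fixed-length binary string of integer v, left-padded with zeros."""
    if not 0 <= v < 2 ** l_enc:
        raise ValueError("index out of range for the chosen order")
    return format(v, f"0{l_enc}b")

k = 8                     # Hilbert order: (2**8) x (2**8) = 65,536 cells
l_enc = 2 * k             # sequence length L_enc = 2k = 16 bits
s = encode(47593, l_enc)  # reproduces Eq. (A.1)
print(s)  # "1011100111101001"
```

Each bit of the resulting string then drives one branching decision per level during prefix-tree insertion, so `l_enc` directly bounds the tree depth.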
@@ -0,0 +1,979 @@
Computer Standards & Interfaces 97 (2026) 104116
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
Chaos experiments in microservice architectures: A systematic literature
review
Emrah Esen a, Akhan Akbulut a, Cagatay Catal b,∗
a Department of Computer Engineering, Istanbul Kültür University, 34536, Istanbul, Turkey
b Department of Computer Science and Engineering, Qatar University, Doha 2713, Qatar
ARTICLE INFO

Keywords: Chaos engineering; Microservice; Systematic literature review

ABSTRACT

This study analyzes the implementation of Chaos Engineering in modern microservice systems. It identifies key methods, tools, and practices used to effectively enhance the resilience of software systems in production environments. In this context, our Systematic Literature Review (SLR) of 31 research articles has uncovered 38 tools crucial for carrying out fault injection methods, including several tools such as Chaos Toolkit, Gremlin, and Chaos Machine. The study also explores the platforms used for chaos experiments and how centralized management of chaos engineering can facilitate the coordination of these experiments across complex systems. The evaluated literature reveals the efficacy of chaos engineering in improving fault tolerance and robustness of software systems, particularly those based on microservice architectures. The paper underlines the importance of careful planning and execution in implementing chaos engineering and encourages further research in this field to uncover more effective practices for the resilience improvement of microservice systems.
Contents

1. Introduction
2. Background
2.1. Microservice architecture
2.2. Microservice principles
2.3. Challenges/Troubleshooting/Failures in microservice architecture
2.4. Chaos engineering
3. Review protocol
3.1. Research questions
3.2. Search strategy
3.3. Study selection criteria
3.4. Study quality assessment
3.5. Data extraction
3.6. Data synthesis
4. Results
4.1. Main statistics
4.2. How is Chaos engineering effectively applied in production environments to enhance the resilience of software systems?
4.3. Which platforms have been used for chaos experiments?
4.4. How can Chaos engineering be effectively applied to microservice architecture to ensure successful implementation and enhance system resilience?
4.5. To what extent can the centralized provision of Chaos engineering effectively facilitate the management of chaos experiments across complex systems?
4.6. What are the challenges reported in the relevant papers?
5. Discussion
5.1. General discussion
5.2. Threats to validity
6. Conclusion
CRediT authorship contribution statement
Declaration of competing interest
Data availability
References

∗ Corresponding author.
E-mail address: ccatal@qu.edu.qa (C. Catal).
https://doi.org/10.1016/j.csi.2025.104116
Received 22 September 2024; Received in revised form 28 November 2025; Accepted 12 December 2025
Available online 15 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

E. Esen et al. Computer Standards & Interfaces 97 (2026) 104116
1. Introduction

In recent years, the adoption of microservice architecture has led to the transformation of application infrastructures into distributed systems. These systems are designed to enhance maintainability by decoupling services. The primary benefit of this architecture is the ease of maintenance of individual services within the microservice ecosystem due to their smaller and more modular nature [1]. However, despite these advantages, the distributed nature of microservices introduces significant challenges. Specifically, the complex management of services and their tight integration can considerably complicate software debugging. Debugging becomes complex in this architecture due to its distributed nature, the necessity to pinpoint the exact service causing the problem, and the dynamic characteristics of microservices. Consequently, debugging in microservice architecture demands a greater level of effort and specialized expertise compared to conventional monolithic architectures [2]. Moreover, it becomes quite challenging to predict what will happen if there is an unexpected error or if a service on the network goes out of service. Service outages can be caused by anything from a malicious cyberattack to a hardware failure to simple human error, and they can have devastating financial consequences. Although such unexpected situations are rare, they can interfere with the operation of distributed systems and devastatingly affect the live environment in which the application is located [3]. It is necessary to detect weak points in the system before an error occurs and spreads to the entire system.

Microservice architecture applications undergo testing procedures to ensure their quality and dependability. These include unit testing, service tests, end-to-end tests, behavior-driven tests, integration tests, and regression tests [4]. The comprehensive approach to microservices testing also encompasses live testing strategies for complex systems [5]. This thorough process emphasizes different aspects such as functionality, interoperability, and performance of individual services within the architecture. It aims to detect and resolve issues early to ensure stable and high-quality microservice applications [1,6]. However, considering that microservices consist of multiple services, the application should not have an impact on the user experience in cases such as network failures and suddenly increased service loads. For example, if the microservice that adds a product to favorites on a shopping site fails or responds late, the user should be able to continue the shopping experience. Therefore, testing operations in production-like environments become inevitable. No matter how distributed or complex the system is, there is a need for a method to manage unforeseeable situations that can build trust in the system against unexpected failures. Chaos engineering is defined as the discipline of conducting experiments in a live environment to test or verify the reliability of software [7].

The primary objective of this research is to conduct a thorough investigation into how chaos experiments are performed in the widely used microservices-based systems of today. Microservice architectures

challenges faced, and solutions. In addition, it will assess the effectiveness of chaos experiments in enhancing the reliability and robustness of microservice systems by using data obtained from real-world scenarios to develop strategic recommendations. This study is a critical step in understanding the applicability and impact of chaos engineering within the complexity of microservice architectures and aims to make significant contributions to the body of knowledge in this field. Recent research has applied chaos engineering for this architectural style; however, a systematic overview of the state-of-the-art on the use of chaos engineering in the microservice architecture is lacking. Therefore, a Systematic Literature Review (SLR) has been performed to provide an overview of how chaos engineering was applied.

This article primarily targets peer-reviewed research papers to maintain methodological consistency and ensure scholarly rigor. We specifically chose a systematic literature review (SLR) methodology because peer-reviewed academic studies are subject to rigorous validation processes, which enhance the reliability and validity of our findings [8,9]. Although excluding industry-specific, grey literature may restrict certain practical perspectives, this choice was deliberately made to avoid potential biases and uphold the scientific integrity of our review [10,11]. However, future studies could broaden the scope to incorporate industrial case studies and practical experiences, which would enrich our understanding of chaos engineering's applicability beyond the academic context.

The main contributions of this study are listed as follows:

1. To the best of our knowledge, this is the first study to employ a systematic literature review approach in the field of chaos engineering on microservice architecture applications [12]. The study provides an extensive systematic literature review of how chaos engineering can be applied to enhance the resilience of microservice architectures. It collates findings from various sources to provide insights into the current state of research and practice in this field.

2. The study categorizes and summarizes the range of chaos engineering tools and methods used in industry and academia, highlighting their functionalities in process/service termination, network simulation, load stressing, security testing, and fault injection within application code.

3. This research paper discusses contemporary techniques and approaches for implementing chaos engineering in microservice architectures. It also emphasizes the ongoing work in this field, offering a significant reference for future research endeavors. The paper systematically reviews existing literature to showcase how chaos engineering can enhance system resilience, laying a comprehensive groundwork for further exploration into chaos experimentation strategies and innovating new fault injection methods or tools within microservice architectures.

The rest of the paper is structured as follows: Section 2 explains the background and related work. Section 3 presents the methodology
have come to the forefront in modern software development processes of the research. Section 4 presents the results and Section 5 compre-
due to their advantages such as flexibility, scalability, and rapid de- hensively discusses the presented answers to research questions and
velopment. However, these architectures also bring unique challenges validity threats. Lastly, the conclusion is presented in Section 6.
due to complex service dependencies and dynamic operational environ-
ments. This study aims to comprehensively address the methodologies, 2. Background
application scenarios, and impacts of chaos experiments conducted
to test the resilience of microservice systems and identify potential The microservice approach breaks down a large application into a
weak points. The research intends to present the current state of chaos network of small, self-contained units, each running its own process
engineering practices by analyzing them, highlighting best practices, and often communicating through web APIs. Unlike large, single-piece
2
E. Esen et al. Computer Standards & Interfaces 97 (2026) 104116
monolithic systems, these small services are robust, easy to scale up or down, and can be updated individually using various programming languages and technologies. This structure allows development teams to be smaller and more agile, leading to faster updates and improvements. Yet, managing many interconnected services can become complicated, especially when something goes wrong. To enhance system reliability and resilience, a method known as chaos engineering is employed. This involves deliberately introducing problems into the live system to test its ability to cope and recover. This technique helps to uncover and rectify flaws, thereby making the system stronger overall. Regular and automated tests mimic real-life problems to ensure that the system can handle unexpected challenges and remain stable and efficient.

2.1. Microservice architecture

Microservice architectures have gained significant popularity in the software industry due to their ability to address the challenges and complexities of developing modern applications [6,13].

2.2. Microservice principles

Microservice architectures are based on the concept of decentralization, where each service is independently developed, deployed, and managed. This emphasizes autonomy and minimal inter-service dependencies. Each microservice is designed to focus on a single function or a closely related set of functions, and supports technology heterogeneity by allowing different services to use the technology stacks that best suit their needs. Resilience is a core aspect, with services built to withstand failures without affecting the entire system, while scalability enables services to be scaled independently as per demand. Communication occurs through lightweight mechanisms like HTTP/REST APIs, supporting continuous delivery and deployment practices. Due to the distributed nature of microservice architecture, comprehensive monitoring and logging for observability becomes crucial. Additionally, there is often an alignment between the microservice architecture and the organizational structure, involving small cross-functional teams responsible for individual services [14].

It is helpful to compare the microservice architecture to the monolithic architecture. The main difference between them is the dimensions of the developed applications. The microservice architecture can be thought of as developing an application as a suite of smaller services, rather than as a single, monolithic structure. Enterprise applications usually consist of three main parts: a client-side user interface (i.e., containing HTML pages and Javascript running on the user's machine in a browser), a database (i.e., composed of many tables, commonly relational, managed by a database management system), and a server-side application. In the server-side application, HTTP requests are processed, business logic is executed, and HTML views are prepared that retrieve data from the database, update it, and send it to the browser. This structure is a good example of a monolith. Any changes to the system involve creating and deploying a new version of the server-side application [15]. The cycles of change are interdependent: a change to a small part of the application requires rebuilding and deploying the entire monolith [6].

Microservice architecture, on the other hand, has some common features that distinguish it from monolithic architecture. These are componentization with services, organizing around business capabilities, smart interfaces and simple communication, decentralized governance, decentralized data management, infrastructure automation, and design for failure [16]. Today, although modern internet applications seem like a single application, they use microservice architectures behind the scenes. Microservice architecture basically refers to small, autonomous, and interoperable services. It has emerged due to increasing needs such as technology diversity, flexibility, scaling, ease of deployment, and ease of organization and management, and it provides various advantages in these matters. Its advantages are described as follows [17]:

Technology heterogeneity. Services are treated as small units, each running independently and communicating with each other using open protocols. While monolithic applications are developed with a single programming language and database system, the services included in a microservice ecosystem may use different programming languages and databases. This allows the advantages of each programming language and database to be exploited.

Resilience. When an error occurs in a monolithic application, the whole system is affected. In the microservice architecture, only the part under the responsibility of the relevant service is affected; the parts belonging to other services are not affected, and the user experience continues.

Scalability. While the scaling process on monolithic applications covers the entire application, in applications developed with microservice architecture only the services that are under heavy load need to be scaled. This prevents extra resource costs for partitions that do not need to be scaled and improves the user experience.

Deployment. Microservice architecture facilitates the autonomous deployment of individual services, enabling updates or changes without impacting others. Various deployment strategies, including blue-green, canary, and rolling deployment, minimize disruptions during the deployment process [18]. As a result, microservice architecture provides increased flexibility and resilience in deployment, distinguishing it from monolithic applications.

Organizational alignment. In software development processes, some challenges may be encountered due to large teams and large pieces of code. These challenges become more manageable with smaller teams. At the same time, this indicates that microservice applications allow us to form smaller and more cohesive teams. Each team is responsible for its own microservice and can take action by making improvements when necessary.

2.3. Challenges/Troubleshooting/Failures in microservice architecture

Microservice architectures pose numerous challenges. As the number of services increases, the complexity of service interactions also grows. Reliance on network communication leads to latency and network failure issues, while ensuring data consistency across multiple databases requires careful design and implementation of distributed transactions or eventual consistency models. Microservices bring typical distributed system challenges such as handling partial failures, dealing with latency and asynchrony, complex service discovery, load balancing in dynamic scaling environments, and managing configurations across multiple services and environments. Security concerns are heightened due to the increased inter-service communication surface area. Testing becomes more complex, involving individual service testing along with testing service interactions; deployment is challenging, especially when there are dependencies between services; effective observability and monitoring become crucial for timely issue resolution; versioning management is critical for maintaining system stability; and, lastly, assembling skilled teams proficient in DevOps, cloud computing, and programming languages presents a significant challenge. While adopting a distributed architecture enhances modularity, it inherently introduces operational complexities that differ significantly from monolithic structures. Recent research has also explored the use of hybrid bio-inspired algorithms to optimize this process dynamically. For instance, the Hybrid Kookaburra-Pelican Optimization Algorithm has been shown to improve load distribution and system scalability in cloud and microservice-based environments [19].

In conclusion, while microservices offer numerous advantages such as improved scalability, flexibility, and agility, they also introduce significant challenges in terms of system complexity, operational demands, and the need for skilled personnel and sophisticated tooling [20].
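A common first defense against the partial-failure and latency issues described above is to bound each inter-service call with retries and backoff. The sketch below is a minimal illustration of that pattern; the flaky inventory-lookup stub and all names in it are hypothetical, not drawn from any surveyed system:

```python
import time

def call_with_retry(operation, retries=3, base_delay=0.01):
    """Retry a flaky inter-service call with exponential backoff.
    Bounding retries keeps a slow dependency from stalling the caller
    indefinitely; a real system would also set per-call timeouts."""
    for attempt in range(retries):
        try:
            return operation()
        except ConnectionError:
            if attempt == retries - 1:
                raise  # give up: let the caller degrade or report
            time.sleep(base_delay * (2 ** attempt))  # back off before retrying

# Hypothetical flaky dependency: fails twice, then succeeds.
calls = {"n": 0}
def flaky_inventory_lookup():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("inventory service timed out")
    return {"sku": "p42", "stock": 7}

assert call_with_retry(flaky_inventory_lookup)["stock"] == 7
assert calls["n"] == 3
```

The backoff doubles on each attempt so that a briefly overloaded service is not hammered while it recovers; after the final attempt the error is re-raised so the caller can fall back gracefully.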
2.4. Chaos engineering

Chaos engineering is the discipline of experimenting on a distributed system in order to build confidence in the system's capability to withstand turbulent conditions in a production-like environment [7,21]. It is the careful and planned execution of experiments to show how the distributed system will respond to a failure. It is necessary for large-scale software systems because it is practically impossible to simulate real events in test environments. Experiments based on real events are created together with chaos engineering [22]. By analyzing the test results, improvements are made where necessary, and in this way, it is aimed to increase the reliability of the software in the production environment.

Thanks to an experimental and systems-based approach, confidence is established in the survivability of these systems during collapses. Canary analysis collects data on how distributed systems react to failure scenarios by observing their behavior in abnormal situations and performing controlled experiments [23]. This method involves applying new updates or changes to a specific aspect of the system, enabling early detection of potential problems before they affect a larger scale. Chaos experiments consist of the following principles [24,25]:

• Hypothesize steady state: The first step is to hypothesize the steady state of the system under normal conditions.
• Vary real-world events: The next step is to vary real-world events that can cause turbulence in the system.
• Run experiments in production: Experimenters should run the experiments in a production-like environment to simulate real-world conditions.
• Automate experiments to run continuously: Experimenters should automate the experiments to run continuously, ensuring that the system can withstand turbulence over time.
• Minimize blast radius: The experiments should be designed to minimize the blast radius, i.e., the impact of the experiment on the system should be limited to a small area.
• Analyze results: Experimenters should analyze the results of the experiments to determine the system's behavior under turbulent conditions.
• Repeat experiments: The experiments should be repeated to ensure that the system can consistently withstand turbulence.

When an experiment is finished, information about its actual effect is fed back to the system.

3. Review protocol

Systematic review studies must be conducted using a well-defined and specific protocol. To conduct a systematic review study, all studies on a particular topic must be examined [12]. We followed the systematic review process shown in Fig. 1 and took all the steps to reduce the risk of bias in this study. Multiple reviewers were involved in the SLR process, and in cases of conflict, a brief meeting was organized to facilitate consensus. The first step was to define the research questions. Then, the most appropriate databases were selected. Based on the selected databases, automated searches were conducted and several articles were identified. Selection criteria were then established to determine which studies should be included in and excluded from this research. The titles and abstracts of all studies were reviewed. In cases of doubt, the full text of the publication was reviewed. Then, after the studies were analyzed in detail, the selection criteria were applied. All selected studies were assessed using a quality assessment process. Subsequently, the results were synthesized, listed, and summarized in a clear and understandable manner.

3.1. Research questions

Research Questions (RQs) and their corresponding motivations are presented as follows:

• RQ1: How is Chaos engineering effectively applied in production environments to enhance the resilience of software systems?
Motivation: Understanding the practical implementation of Chaos engineering in production environments is crucial for ensuring the resilience of software systems under real-world operating conditions.
• RQ2: Which platforms have been used for Chaos experiments?
Motivation: Identifying the platforms provides insights into the technological landscape and tools available for conducting Chaos engineering practices.
• RQ3: How is Chaos engineering effectively applied to microservice architectures to ensure its successful implementation in enhancing system resilience?
Motivation: Microservice architectures introduce new challenges in system design. Exploring the application of Chaos engineering in this context can help improve the resilience and fault tolerance of microservice systems.
• RQ4: To what extent can the centralized provision of Chaos engineering effectively facilitate the management of Chaos experiments across complex systems?
Motivation: Understanding the feasibility of providing Chaos engineering as a centralized service enables organizations to coordinate Chaos experiments across complex systems.
• RQ5: What are the challenges reported in the relevant papers?
Motivation: Identifying these challenges provides valuable insights into overcoming obstacles and advancing the adoption of Chaos engineering practices.

3.2. Search strategy

The primary studies were carefully selected from papers published between 2010 and 2022, because the topic has only become relevant in recent years. The databases are IEEE Xplore, ACM Digital Library, Science Direct, Springer, Wiley, MDPI, and Scopus. The initial search involved reviewing the titles, abstracts, and keywords of the studies identified in the databases. The search results obtained from the databases were stored in the data extraction form using a spreadsheet tool. Furthermore, this systematic review was conducted collaboratively by three authors.

The following search string was used to broaden the search scope:

((chaos engineering) OR (chaos experiments)) OR (microservices)

The results of the searches made in the databases mentioned above are shown in Fig. 2.

3.3. Study selection criteria

After applying the exclusion and inclusion criteria, 55 articles were obtained. The exclusion criteria in our study are as follows:

• EC-1: Duplicate papers from multiple sources
• EC-2: Papers without full-text availability
• EC-3: Papers not written in English
• EC-4: Survey papers
• EC-5: Papers not related to Chaos engineering

The inclusion criteria in our study are as follows:

• IC-1: Primary papers discussing the use of Chaos experiments in a microservice architecture
• IC-2: Primary publications that focus on Chaos engineering
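The screening pass implied by EC-1 through EC-5 can be expressed as a simple filter over candidate records. This is only an illustrative sketch; the record fields and example entries are invented for the demonstration and are not the actual extraction data:

```python
# Illustrative screening pass over candidate records (fields invented
# for the sketch); mirrors EC-1..EC-5 from Section 3.3.
papers = [
    {"id": 1, "title": "Chaos engineering in k8s", "lang": "en",
     "full_text": True, "survey": False, "on_topic": True},
    {"id": 1, "title": "Chaos engineering in k8s", "lang": "en",
     "full_text": True, "survey": False, "on_topic": True},   # duplicate (EC-1)
    {"id": 2, "title": "Microservice survey", "lang": "en",
     "full_text": True, "survey": True, "on_topic": True},    # survey paper (EC-4)
    {"id": 3, "title": "Fault injection study", "lang": "de",
     "full_text": True, "survey": False, "on_topic": True},   # not English (EC-3)
]

def passes_exclusion(p):
    """EC-2: full text available; EC-3: English; EC-4: not a survey;
    EC-5: related to chaos engineering."""
    return p["full_text"] and p["lang"] == "en" and not p["survey"] and p["on_topic"]

seen, selected = set(), []
for p in papers:
    if p["id"] in seen:       # EC-1: drop duplicates across sources
        continue
    seen.add(p["id"])
    if passes_exclusion(p):
        selected.append(p["id"])

assert selected == [1]  # only the first record survives all criteria
```

In the actual protocol this filtering was of course done by the reviewers over titles, abstracts, and full texts; the code only makes the decision rules explicit.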
Fig. 1. SLR review protocol. Source: Adapted from [26-28].
Fig. 2. Distribution of selected papers per database.
3.4. Study quality assessment

The assessment of each study's quality is an indicator of the strength of evidence provided by the systematic review. The quality of the studies was assessed using various questions, and studies of poor quality were not included in the present study. These criteria, based on established quality instruments, were adopted from methodological guides and other SLR research [12]. The following questions were used to assess the quality of the studies.

• Q1. Are the aims of the study clearly stated?
• Q2. Are the scope and experimental design of the study clearly defined?
• Q3. Is the research process documented adequately?
• Q4. Are all the study questions answered?
• Q5. Are the negative findings presented?
• Q6. Do the conclusions relate to the aim and purpose of the study, and are they reliable?

In this study, considering all these criteria, a general quality assessment was performed for each paper. The rating was 2 points for the "yes" option, 0 points for the "no" option, and 1 point for the "somewhat" option. The decision threshold for classifying a paper as poor quality was determined based on the mean value, which corresponds to a total of 5 points.

Fig. 2 presents the distribution of papers based on the databases where they were found at different selection stages. After the initial search, 4520 papers were retrieved, of which 55 remained after applying the selection criteria. After quality assessment, 31 papers were selected as primary studies. The 55 papers were carefully read in full, and the data required for answering the research questions were extracted. All the collected articles are listed in Table 1.

3.5. Data extraction

A data extraction form was created to capture, from the selected articles, the data required for answering the research questions. The form consists of several metadata fields, such as the author's first and last name, the title of the study, the publication year, and the type of study. In addition to this metadata, several columns were created to store the required information related to the research questions. By employing a data extraction form, we ensured that the relevant data required to answer each research question were systematically captured from the selected publications. This approach facilitated the subsequent synthesis of the findings. The data extraction process involved meticulous attention to detail and ensured the reliability and integrity of the data used in our systematic literature review.
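The scoring rule above (2, 1, or 0 points per question over six questions, with a poor-quality threshold of 5 points) can be written down directly. The sketch assumes that a total at or below the threshold marks a paper as poor quality; that exact boundary handling is our reading of the protocol, not stated explicitly in it:

```python
# Scores per question: 2 = "yes", 1 = "somewhat", 0 = "no".
POINTS = {"yes": 2, "somewhat": 1, "no": 0}

def assess_quality(answers, threshold=5):
    """Total the Q1-Q6 answers; papers whose total does not exceed the
    threshold (5 points, per the protocol above) are classified as
    poor quality and excluded from the review."""
    total = sum(POINTS[a] for a in answers)
    return total, total <= threshold

# Hypothetical paper: four clear strengths, one partial answer, one gap.
total, poor = assess_quality(["yes", "yes", "yes", "yes", "somewhat", "no"])
assert total == 9 and not poor

total, poor = assess_quality(["no", "somewhat", "no", "yes", "no", "no"])
assert total == 3 and poor
```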
Table 1
Selected primary studies.
ID Reference Title Year Database
S1 [29] Automating Chaos Experiments in Production 2019 ACM
S2 [25] Getting Started with Chaos engineering—design of an implementation framework in practice 2020 ACM
S3 [30] Human-AI Partnerships for Chaos engineering 2020 ACM
S4 [31] 3MileBeach: A Tracer with Teeth 2021 ACM
S5 [32] Service-Level Fault Injection Testing 2021 ACM
S6 [33] A Platform for Automating Chaos Experiments 2016 IEEE Xplore
S7 [34] Automated Fault-Tolerance Testing 2016 IEEE Xplore
S8 [35] Gremlin: Systematic Resilience Testing of Microservices 2016 IEEE Xplore
S9 [36] Fault Injection Techniques - A Brief Review 2018 IEEE Xplore
S10 [37] ORCAS: Efficient Resilience Benchmarking of Microservice Architectures 2018 IEEE Xplore
S11 [38] The Business Case for Chaos engineering 2018 IEEE Xplore
S12 [39] Use of Self-Healing Techniques to Improve the Reliability of a Dynamic and Geo-Distributed Ad Delivery Service 2018 IEEE Xplore
S13 [40] Security Chaos engineering for Cloud Services: Work In Progress 2019 IEEE Xplore
S14 [41] A Framework of Virtual War Room and Matrix Sketch-Based Streaming Anomaly Detection for Microservice Systems 2020 IEEE Xplore
S15 [42] CloudStrike: Chaos engineering for Security and Resiliency in Cloud Infrastructure 2020 IEEE Xplore
S16 [43] Identifying and Prioritizing Chaos Experiments by Using Established Risk Analysis Techniques 2020 IEEE Xplore
S17 [44] Fitness-guided Resilience Testing of Microservice-based Applications 2020 IEEE Xplore
S18 [24] A Chaos engineering System for Live Analysis and Falsification of Exception-Handling in the JVM 2021 IEEE Xplore
S19 [45] A Study on Chaos engineering for Improving Cloud Software Quality and Reliability 2021 IEEE Xplore
S20 [46] Chaos engineering for Enhanced Resilience of Cyber-Physical Systems 2021 IEEE Xplore
S21 [47] ChaosTwin: A Chaos engineering and Digital Twin Approach for the Design of Resilient IT Services 2021 IEEE Xplore
S22 [48] Platform Software Reliability for Cloud Service Continuity—Challenges and Opportunities 2021 IEEE Xplore
S23 [49] Trace-based Intelligent Fault Diagnosis for Microservices with Deep Learning 2021 IEEE Xplore
S24 [50] A Guided Approach Towards Complex Chaos Selection, Prioritization and Injection 2022 IEEE Xplore
S25 [51] Chaos Driven Development for Software Robustness Enhancement 2022 IEEE Xplore
S26 [22] Maximizing Error Injection Realism for Chaos engineering With System Calls 2022 IEEE Xplore
S27 [52] On Evaluating Self-Adaptive and Self-Healing Systems using Chaos engineering 2022 IEEE Xplore
S28 [53] Observability and chaos engineering on system calls for containerized applications in Docker 2021 ScienceDirect
S29 [54] Scalability resilience framework using application-level fault injection for cloud-based software services 2022 Springer
S30 [55] Chaos as a Software Product Line—A platform for improving open hybrid-cloud systems resiliency 2022 Wiley
S31 [56] The Observability, Chaos engineering, and Remediation for Cloud-Native Reliability 2022 Wiley
3.6. Data synthesis

To answer the research questions, the data obtained are collected and summarized in an appropriate manner; this step is called data synthesis. To perform the data synthesis, a qualitative analysis process was conducted on the data obtained. For instance, synonyms used for different categories were identified and merged in the respective fields. This comprehensive data synthesis approach allowed us to derive insights and draw conclusions from the collected information.

4. Results

The results section of the paper provides various insights into how chaos engineering is applied in production environments, particularly its use in improving the resilience and reliability of microservice architecture applications. The section discusses how fault detection is developed using chaos engineering tools and is mainly used in production for troubleshooting. Chaos experiments are usually conducted in the production environment to provide realistic results. The section further enumerates several tools that have been used for chaos experiments, as well as discussing general principles such as defining a steady state, forming a hypothesis, conducting the experiment, and proving or refuting the hypothesis. These principles and tools help detect problems like hardware issues, software errors, network interruptions, security vulnerabilities, and configuration mistakes within their respective contexts.

4.1. Main statistics

Fig. 3 shows the results of the quality assessment. The distribution of the years of publication is shown in Fig. 4. Most of the related studies were conducted in the last year, which shows that researchers' interest in chaos engineering has increased in recent years. Most of the studies included were indexed in the IEEE Xplore database.

Fig. 5 presents the distribution of the types of publications and the corresponding databases. While there are many journal papers, conference proceedings also appear in the selected papers.

Chaos engineering involves several categories of functionality that serve distinct purposes in resilience testing. The first category involves intentionally terminating processes or services to evaluate system behavior and recovery from failures [7]. Another category is network simulation, which allows engineers to replicate adverse network conditions to assess system performance and reliability [25]. In the stressing machine category, engineers subject the system to extreme loads to identify limits and potential bottlenecks [7]. In security testing, engineers simulate breaches or attacks to assess the system's response and enhance defenses [7]. Lastly, engineers use fault application code to inject targeted faults or errors into the codebase, assessing system resilience and error-handling capabilities [24]. These categories help organizations proactively identify weaknesses, strengthen system robustness, and enhance reliability in complex technology landscapes [7]. The functionality categories of the tools are presented in Fig. 6.

The tools utilized in industry settings are not comprehensively addressed in the articles. To provide insights for future research, the tools identified from the additional examination were categorized based on their functionality, as presented in Tables 2 and 3. Table 2 displays the tools obtained from the studies, while Table 3 presents additional tools that have been examined. Tools listed in the tables with corresponding references indicate their inclusion in the referenced articles.

4.2. How is Chaos engineering effectively applied in production environments to enhance the resilience of software systems?

Table 4 examines the successful implementation of Chaos engineering in operational settings, covering different aspects such as goals, techniques and resources, guiding principles, findings, limitations and substitutes, as well as the general strategy.

4.3. Which platforms have been used for chaos experiments?

Table 5 provides a concise summary of the various tools and platforms used in Chaos experiments, along with their specific functionalities or characteristics. It offers comprehensive insights into each platform through detailed descriptions accompanied by the necessary references.
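The experimental cycle summarized above (define a steady state, form a hypothesis, conduct the experiment, prove or refute the hypothesis) can be illustrated with a toy in-process harness. Everything here, from the replicated-service model to the function names, is an invented example rather than one of the surveyed tools:

```python
import random

def run_chaos_experiment(steady_state, inject_fault, restore, trials=20):
    """Minimal chaos-experiment loop: (1) confirm the hypothesized steady
    state, (2) inject a real-world fault, (3) observe whether the steady
    state holds, (4) restore (minimizing blast radius) and report."""
    if not steady_state():
        raise RuntimeError("system not in steady state; abort experiment")
    failures = 0
    for _ in range(trials):
        inject_fault()          # vary real-world events
        if not steady_state():  # did the hypothesis survive the fault?
            failures += 1
        restore()               # always clean up after each trial
    return {"trials": trials, "hypothesis_violations": failures}

# Toy system under test: a replicated service that stays healthy as long
# as at least one replica is up (all names are illustrative).
replicas = {"a": True, "b": True, "c": True}
steady = lambda: any(replicas.values())
kill_random = lambda: replicas.__setitem__(random.choice(list(replicas)), False)
heal_all = lambda: [replicas.__setitem__(k, True) for k in replicas]

report = run_chaos_experiment(steady, kill_random, heal_all)
assert report["hypothesis_violations"] == 0  # one replica always survives
```

Because each trial kills only one of three replicas and heals everything afterwards, the "at least one replica up" hypothesis is never violated; a real experiment would replace these stubs with observations of a live system.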
Fig. 3. Quality assessment scores.
Fig. 4. Year of publication.
Fig. 5. Diagram of the distribution of studies per search database.
Fig. 6. Functionality of chaos engineering tools.
Table 2
Chaos engineering tools from studies.
Chaos engineering tool Termination Network simulating Stressing machine Security Fault application code
Chaos Monkey [57] ×
Gremlin [35] × × × × ×
Chaos Toolkit [45] × × × × ×
Pumba [55] × ×
LitmusChaos [45] × × × ×
ToxiProxy [45] × ×
PowerfulSeal [45] × × × ×
Pod Reaper [25] ×
Netflix Simian Army [36] × × ×
WireMock [25] × ×
KubeMonkey [25] × × ×
Chaosblade [45] × × ×
ChaosTwin [47] × × × ×
Chaos Machine [24] × × ×
Cloud Strike [42] ×
Phoebe [22] ×
Mjolnirr [58] ×
ChaosOrca [37] × × ×
3MileBeach [31] × ×
Muxy [25] × × ×
Blockade [25] ×
Chaos Lambda [25] × ×
Byte-Monkey [25] ×
Turbulence [25] × × ×
Cthulhu [25] × × × ×
Byteman [25] × ×
ChaosCube [55] ×
Chaos Lemur [25] ×
Chaos HTTP Proxy [25] ×
Chaos Mesh [45] × × ×
Istio Chaos [45] ×
ChAP [33] × ×
IntelliFT [44] × × × ×
Table 3
Chaos engineering tools from our search.
Chaos engineering tool Termination Network simulating Stressing machine Security Fault application code
Pod Chaos × × ×
DNS Chaos ×
AWS Chaos × × ×
Azure Chaos × × × ×
GCP Chaos × × × ×
Table 4
Chaos engineering in production environments.
Category Description
Objective The primary objective of applying chaos engineering in production environments is to enhance the
resilience of software systems. This involves troubleshooting to identify and address potential
malfunctions before they occur. The overarching goal is to minimize issues in production through the
use of chaos engineering tools, enabling automatic fault detection [24,53].
Methods and tools Chaos engineering relies on specific tools to facilitate its effective application in production
environments. These tools aid in automatic fault detection, a crucial aspect of troubleshooting to
minimize potential issues in the production environment [24,53].
Principles and considerations The effective application of chaos engineering is closely tied to key principles and considerations.
These include continuous experimentation, serving as a form of robustness testing conducted in
real-world operational conditions. Fundamental principles of Chaos Experiments involve defining a
steady state, hypothesizing about its impact, conducting the experiment, and then demonstrating or
refuting the hypothesis [53].
Insights and results Chaos experiments conducted in the production environment provide valuable insights into the
behavior of the system. This is particularly significant as the production environment may exhibit
unpredictable behavior that differs from staging environments in some cases [24].
Constraints and alternatives While conducting chaos experiments in production is ideal, it is acknowledged that legal or technical
constraints may sometimes prevent this. In such cases, an alternative approach is considered, starting
chaos experiments in a staging environment and gradually transitioning to the production
environment [25].
Overall approach The overall approach for the effective application of chaos engineering in production environments
involves the systematic execution of chaos experiments. This includes leveraging chaos engineering
tools and taking into account the constraints and challenges associated with conducting experiments in
real-world operational settings. The aim is to proactively identify and address potential issues before
they impact the production environment, ultimately enhancing the resilience of software systems.
Table 5
Chaos engineering tools identified from selected papers.

The Chaos Machine: A tool for conducting chaos experiments at the application level on the Java Virtual Machine (JVM), using exception injection to analyze try-catch blocks for error processing [24].
Screwdriver: An automated fault-tolerance testing tool for on-premise applications and services, creating realistic error models and collecting metrics by injecting errors into the system [34].
Chaos Monkey: Designed by Netflix, this tool tests the system's resilience by randomly killing partitions to check system functionality [7,45].
Cloud Strike: A security chaos engineering system for multi-cloud security, extending chaos engineering to security by injecting faults impacting confidentiality, integrity, and availability [42].
ChaosMesh: An open-source chaos engineering platform for testing the resilience and reliability of distributed systems by intentionally injecting failures and disruptions [55].
Powerfulseal: An open-source tool for testing the resilience of Kubernetes clusters by simulating real-world failures and disruptions [55].
IntelliFT: A feedback-based, automated failure testing technique for microservice applications, focusing on exposing defects in fault-handling logic [44].
The Chaos Toolkit: Open-source software that runs experiments against the system to confirm a hypothesis [25,55].
Phoebe: A fault injection framework for reliability analysis concerning system call invocation errors, enabling full observability of system call invocations and automatic experimentation [22].
Mjolnirr: A private cloud platform with a built-in Chaos Monkey service for developing private PaaS cloud infrastructure [58].
ChaosOrca: A tool for chaos engineering on containers, perturbing system calls for processes inside containers and monitoring their effects [37].
Gremlin: Offered as a SaaS technology, Gremlin tests system resilience on various parameters and conditions, with capabilities for automation and integration with Kubernetes clusters and public clouds [35].
3MileBeach: A distributed tracing and fault injection framework for microservices, enabling chaos experiments through message serialization library manipulation [31].
ChAP: A software platform for running automated chaos experiments, simulating various failure scenarios and providing insights into system behavior under stress [29,33].
ChaosTwin: Utilizes a digital twin approach in chaos engineering to mitigate impacts of unforeseen events, constructing models across workload, network, and service layers [47].
Litmus Chaos: An open-source cloud-native framework for chaos engineering in Kubernetes environments, offering a range of chaos experiments and workflows [50].
Filibuster: A testing method in chaos engineering that introduces errors into microservice architecture to validate resilience and error tolerance [32].
E. Esen et al. Computer Standards & Interfaces 97 (2026) 104116
Table 6
Chaos engineering in microservices: approaches, descriptions, and expected outcomes.

Fault injection testing
Description: This method involves intentionally introducing errors into the system to assess its response, particularly in microservices by simulating various failure modes such as network issues, service outages, or resource shortages within or between microservices, to evaluate the system's resilience and stability [52].
Expected impact: Evaluating and enhancing the system's resilience and stability.

Hypothesis-driven experiments
Description: Key to chaos engineering is conducting experiments based on well-defined hypotheses about the normal state of the system and its expected behavior during failure scenarios. This strategic approach enables focused experiments that assess the resilience of both individual microservices and the overall system [45,53].
Expected impact: Identifying system weaknesses and increasing resilience.

Blast radius management
Description: Managing the blast radius of experiments is crucial in microservices. It involves understanding the potential impact of introduced failures, starting with small experiments and then expanding, to manage failure impacts while identifying system vulnerabilities [45].
Expected impact: Better understanding and enhancing the system's resilience.

Resilience requirement elicitation
Description: Utilizing chaos engineering to determine and analyze the resilience requirements of microservice architectures. This process involves observing the system's response to induced faults to identify specific resilience needs of each microservice and their interactions [52].
Expected impact: Understanding specific resilience needs of each microservice and their interactions.

Continuous testing and improvement
Description: Regularly conducting chaos experiments as part of an ongoing testing process ensures that microservices remain resilient against unforeseen issues. This continuous approach aids in proactively finding and fixing potential system weaknesses [56].
Expected impact: Proactive identification and resolution of system weaknesses, leading to continual improvement and increased resilience.

Observability and remediation
Description: Integrating chaos engineering with observability tools enhances the monitoring of microservices during fault injection, allowing for real-time tracking of responses to failures, aiding in the development of effective remediation strategies and overall system resilience improvement [56].
Expected impact: Real-time tracking of responses to failures and development of effective remediation strategies for overall system resilience improvement.
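The fault injection testing approach in Table 6 can be illustrated with a small wrapper that perturbs calls between services. This is a hedged sketch under invented names and probabilities (FaultInjector, get_price), not the API of any surveyed tool.

```python
import random
import time

class FaultInjector:
    """Wraps downstream calls and injects latency and/or connection errors."""

    def __init__(self, latency_s=0.0, error_rate=0.0, rng=None):
        self.latency_s = latency_s      # simulated network delay per call
        self.error_rate = error_rate    # probability of a simulated outage
        self.rng = rng or random.Random()

    def call(self, func, *args, **kwargs):
        if self.latency_s > 0:
            time.sleep(self.latency_s)              # network-issue failure mode
        if self.rng.random() < self.error_rate:     # service-outage failure mode
            raise ConnectionError("injected fault")
        return func(*args, **kwargs)

def get_price(item):
    """Stand-in for a call to another microservice."""
    return {"apple": 3}[item]

# With error_rate=1.0 every call fails, so the caller's fallback path is exercised.
injector = FaultInjector(error_rate=1.0)
try:
    price = injector.call(get_price, "apple")
except ConnectionError:
    price = None  # graceful degradation instead of a crash
```

Production tools inject such faults at the network or kernel level rather than in-process, but the experiment logic, perturb a dependency and observe whether the caller degrades gracefully, is the same.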
4.4. How can chaos engineering be effectively applied to microservice architecture to ensure successful implementation and enhance system resilience?

Table 6 provides a comprehensive overview of the different facets and projected implications of implementing chaos engineering within microservice architecture. By implementing these approaches and strategies, organizations can effectively integrate chaos engineering into their microservice architectures to uncover vulnerabilities and enhance the overall dependability of their systems.

4.5. To what extent can the centralized provision of chaos engineering effectively facilitate the management of chaos experiments across complex systems?

Table 7 provides an overview of the ways in which centralized chaos engineering can simplify experiment management in intricate systems. It emphasizes advantages like standardization, resource utilization, risk mitigation, and more, resulting in enhanced system resilience and performance.

4.6. What are the challenges reported in the relevant papers?

Table 8 concisely presents the primary obstacles in the area of chaos engineering and their respective resolutions. These obstacles encompass system intricacy, hazards to live environments, resource demands, security issues, and automation complexities. The proposed resolutions involve phased implementation, risk assessment, knowledge enhancement, robust security protocols, and automation approaches.

5. Discussion

In this section, we summarize the answers to the research questions. Chaos engineering can improve robustness by simulating real-world failure scenarios and exploring system reactions, especially in microservice architectures. Various tools for implementing chaos engineering were listed and compared. We conclude that the application of chaos engineering requires careful planning due to inherent challenges but has the potential to greatly improve system resilience.

5.1. General discussion

In this article, we reviewed the literature on the application of chaos engineering in microservice architecture to understand the state of the art. For this purpose, six research questions were defined and answered.

In RQ1, we aimed to understand how chaos engineering is applied to production environments. Chaos engineering, when adeptly applied in production settings, serves as a pivotal tool for augmenting the robustness of software systems. This approach entails conducting deliberate and controlled chaos experiments within the production environment, a strategy that is instrumental in uncovering and rectifying potential issues before they escalate into full-blown system failures, thereby bolstering system uptime [38]. Moreover, chaos engineering is characterized by the intentional injection of faults into systems. This methodology is crucial for identifying and addressing security flaws and risks, laying the groundwork for the development of resilient application architectures [56]. By replicating adverse conditions that could naturally arise in production settings, chaos engineering helps detect inherent system vulnerabilities and structural deficiencies, fostering a proactive stance towards issue mitigation [38].

Additionally, this practice involves comprehensive testing of real-world scenarios on operational systems. Such testing is vital for assessing the complete spectrum of software systems, encompassing both hardware malfunctions and software glitches, within their actual deployment contexts. This approach significantly contributes to the enhancement of overall system resilience [38]. To effectively implement chaos engineering, it is recommended to start with less complex experiments, leverage automation for these experiments, and focus on areas with either high impact or high frequency of issues. Observing the system at its limits is also crucial for reinforcing resilience [25].

In RQ2, we discuss various platforms that aim to increase the flexibility and reliability of microservice architectures through chaos experiments. Tools like Gremlin, Chaos Monkey, Chaos Toolkit, Pumba, LitmusChaos, ToxiProxy and PowerfulSeal have been utilized in industry settings to simulate different failure scenarios. These tools provide functions such as terminating processes, simulating network conditions, applying stress tests, testing security measures, and injecting faults to proactively identify weaknesses and strengthen system robustness across different technology landscapes.
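As a toy illustration of the random-termination idea behind Chaos-Monkey-style tools discussed in this review, the following sketch kills a fraction of instances and checks a quorum-based availability hypothesis. All names (chaos_monkey, cluster_available, node labels) and the quorum of three are invented for the example.

```python
import random

def chaos_monkey(instances, kill_fraction=0.2, seed=0):
    """Randomly terminate a fraction of instances; return survivors and victims."""
    rng = random.Random(seed)
    kills = max(1, int(len(instances) * kill_fraction))
    victims = set(rng.sample(instances, kills))
    survivors = [i for i in instances if i not in victims]
    return survivors, sorted(victims)

def cluster_available(survivors, quorum=3):
    """The resilience hypothesis under test: the service stays up while a quorum survives."""
    return len(survivors) >= quorum

nodes = [f"node-{i}" for i in range(5)]
survivors, victims = chaos_monkey(nodes)
# 4 of 5 nodes survive, so a quorum of 3 still holds.
print(cluster_available(survivors))
```

Real tools terminate actual virtual machines or pods and observe live traffic; the sketch only captures the experiment's shape: random disruption followed by a check of the availability hypothesis.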
Table 7
Centralized provision in chaos engineering.

Standardization
Description: Centralized provision allows for the standardization of chaos engineering practices and tools across the organization. This ensures that all teams follow consistent processes and use approved tools, leading to better coordination and more reliable results [42].
Expected impact: Improved coordination and reliability of results.

Resource optimization
Description: Centralized provision enables efficient allocation of resources for chaos experiments. It allows pooling of expertise, tools, and infrastructure, reducing redundancy and optimizing resource utilization [38].
Expected impact: Enhanced resource utilization and reduced redundancy.

Risk management
Description: Centralized provision facilitates better risk management by providing oversight and governance for chaos experiments. It establishes clear guidelines, safety measures, and expected states for running experiments in production environments, ensuring controlled experimentation [42].
Expected impact: Controlled experimentation and effective risk management.

Automation and continuous testing
Description: Centralized provision supports the automation of chaos experiments to run continuously. This ensures regular conduction of experiments, leading to ongoing validation of system resilience and identification of potential issues before they manifest as outages [38,42].
Expected impact: Ongoing validation of system resilience and early identification of potential issues.

Knowledge sharing and collaboration
Description: A centralized approach encourages knowledge sharing and collaboration among teams. It facilitates the dissemination of best practices, lessons learned, and successful experiment designs, fostering a culture of continuous improvement and shared learning [25].
Expected impact: Promotion of a continuous improvement culture and shared learning.

Performance metrics and analysis
Description: Centralized provision enables the establishment of standardized performance metrics and analysis methods for chaos experiments. This allows for consistent measurement of system health and identification of deviations from steady-state, leading to more effective decision-making and system improvements [43].
Expected impact: Consistent system health measurement and more effective decision-making.
Table 8
Challenges and solutions in chaos engineering.

Complexity
Challenge: Designing and executing effective chaos experiments in large systems is complex due to intricate interdependencies within these systems.
Possible solution: To mitigate complexity, it is recommended to start with smaller, more manageable experiments and gradually expand the scope of chaos engineering practices.
References: [25,43]

Risk of impact
Challenge: Concerns about causing disruptions in the production environment, affecting users and business operations.
Possible solution: Implementing risk analysis techniques can help prioritize experiments, focusing on less critical system components first to minimize potential impacts.
References: [45,50]

Resource intensiveness
Challenge: Significant resources needed, including time, expertise, and infrastructure, posing a barrier for many organizations.
Possible solution: Addressing resource intensiveness involves providing comprehensive training and education on chaos engineering best practices and tools to equip teams with the necessary skills and knowledge.
References: [7,47]

Security concerns
Challenge: Introducing controlled failures can raise security issues, potentially exposing vulnerabilities or sensitive data.
Possible solution: To combat security concerns, robust security measures should be implemented during experiments to safeguard sensitive data and prevent unauthorized access.
References: [42,47]

Tooling and automation
Challenge: Developing tools for automated chaos experiments is challenging in heterogeneous and dynamic environments.
Possible solution: Overcoming tooling and automation challenges requires the development and use of automated tools for chaos experiments, which reduce manual efforts and facilitate continuous, unattended testing.
References: [7,33,38,40,42]
Recent studies have emphasized the growing intersection between artificial intelligence and cybersecurity within the context of chaos engineering. AI-driven techniques are nowadays used for real-time threat detection, anomaly prediction, and automated response mechanisms in enterprise systems. For example, generative AI models have been proposed to enhance cybersecurity frameworks by improving data privacy management and identifying potential attack vectors [59].

In RQ3, we focused on understanding how chaos engineering is implemented in microservice architectures. To enhance system resilience in microservice architectures through chaos engineering, organizations should utilize fault injection testing to replicate failures within microservices. They should also conduct hypothesis-driven experiments with a solid comprehension of the normal state and anticipated behavior during disruptions, while managing the scope of these experiments to minimize impact. Additionally, it is essential to identify and analyze resilience requirements, participate in continuous testing and improvement efforts, as well as integrate observability tools for real-time monitoring during fault injection tests. Moreover, organizations need to establish clear communication channels across teams involved in order to ensure effective collaboration and knowledge sharing.

The answer to RQ4 highlights the significance of centralized management and monitoring in conducting chaos experiments within large-scale microservices ecosystems. It discusses the utilization of software solutions like Netflix's Chaos Automation Platform (ChAP) and fault injection techniques such as service call manipulation. The emphasis is placed on the need for careful planning, effective communication, risk management, and continuous learning to ensure comprehensive and valuable chaos experiments for enhancing overall system resilience.

In response to RQ5, our discussion concludes that the practical implementation of chaos engineering, despite its promise to enhance system resilience, presents numerous challenges. These challenges include potential business impacts, difficulty in determining scope, the unpredictability of outcomes, time and resource constraints, system complexities, skill and knowledge prerequisites, interpretation of results, cultural readiness, and selection of appropriate tools. These all necessitate meticulous planning and skilled execution for effectiveness.

Recent studies explore the convergence of chaos engineering and Artificial Intelligence (AI). Large language models (LLMs) have been used to automate the chaos engineering lifecycle, managing phases from hypothesis creation to experiment orchestration and remediation [60]. Meanwhile, advances in applying chaos engineering to multi-agent AI systems suggest new directions: for example, chaos experiments applied to LLM-based multi-agent systems can surface vulnerabilities such as hallucinations, agent failures, or inter-agent communication breakdowns [61]. Together, these works show how intelligent,
adaptive chaos frameworks might evolve in microservice-based systems as well.

Recent research also discusses specific operational challenges such as load balancing and security in the context of chaos engineering. For example, an empirical study applies delay injections under different user loads in cloud-native systems to observe how throughput and latency change under stress, providing insights into how load balancing policies perform under fault conditions [62]. In parallel, several frameworks have begun integrating security-focused chaos tests that intentionally inject faults into authentication, identity management, and access control components to ensure that security mechanisms remain effective under stress conditions [63]. These studies highlight how chaos engineering can be extended beyond performance reliability to proactively strengthen both load distribution and security resilience in microservice environments.

The main challenges faced by previous researchers and possible solutions have been discussed in the paper. The collected challenges were mainly related to the correct interpretation of chaos experiments and making sense of them. There may be more challenges, but if they were not mentioned in these articles, we could not include them. We believe that chaos engineering is still in the early stages and its adoption in the software industry will take some time.

5.2. Threats to validity

Internal validity
The validity of this systematic literature review is threatened by issues related to defining the candidate pool of papers, potential bias in selecting primary studies, data extraction, and data synthesis. The application of exclusion criteria can be influenced by the researchers' biases, posing a potential threat to validity. We compiled a comprehensive list of exclusion criteria, and all conflicts were documented and resolved through discussions among us. Data extraction validity is crucial as it directly impacts the study results. Whenever any of us was uncertain about data extraction, the case was recorded for resolution through discussions with the team. Multiple meetings were held to minimize researcher bias.

External validity
The search for candidate papers involved using general search terms to minimize the risk of excluding relevant studies. Despite using a broad search query to acquire more articles, there remains a possibility that some papers were overlooked in electronic databases or missed due to recent publications. Furthermore, although seven widely used online databases in computer science and software engineering were searched, new papers may not have been included.

6. Conclusion

Our systematic literature review (SLR) on chaos engineering has explored its role in enhancing the resilience of software systems in production environments. Through our review, we have identified several crucial aspects that underline the effective application and challenges of chaos engineering [25].

Firstly, chaos engineering serves as a proactive troubleshooting approach in production environments [25]. By identifying and addressing potential malfunctions before they occur, it effectively preempts system disruptions. This proactive strategy is significantly implemented by chaos engineering tools that assist in automatic fault detection, thereby minimizing potential issues in these critical environments [50].

Secondly, the essence of chaos engineering is rooted in continuous experimentation and robustness testing under real-world operational conditions. The methodology involves a systematic approach: defining a steady state, hypothesizing its impacts, conducting controlled experiments, and subsequently confirming or refuting the hypotheses. These experiments are insightful, as they reveal system behaviors in production environments, which often differ unpredictably from staging environments [36,53].

Furthermore, the effectiveness of chaos engineering is contingent on the systematic execution of chaos experiments. These experiments, utilizing advanced chaos engineering tools, need to navigate the constraints and challenges inherent in real-world operational settings. The main objective is the enhancement of system resilience, achieved by proactively identifying and preemptively addressing potential issues [46].

However, it is acknowledged that conducting chaos experiments directly in production environments might be impeded by legal or technical constraints. In such scenarios, initiating experiments in a staging environment and then gradually transitioning to the production environment offers a viable alternative. This approach ensures that the benefits of chaos engineering can still be realized, but in a more controlled and possibly less direct manner.

Our review highlights that chaos engineering is a critical methodology for ensuring the resilience and robustness of software systems. By following continuous experimentation and proactive troubleshooting, it offers a pathway to address the challenges faced in complex production environments. This SLR contributes to the scientific community by discussing these methodologies and their applications, thereby providing a framework for future research and practical implementation in the field of software system resilience.

CRediT authorship contribution statement

Emrah Esen: Writing - review & editing, Writing - original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation. Akhan Akbulut: Writing - review & editing, Writing - original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Formal analysis, Data curation. Cagatay Catal: Writing - review & editing, Writing - original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

[1] P. Jamshidi, C. Pahl, N.C. Mendonça, J. Lewis, S. Tilkov, Microservices: The journey so far and challenges ahead, IEEE Softw. 35 (3) (2018) 24-35, http://dx.doi.org/10.1109/MS.2018.2141039.
[2] I. Beschastnikh, P. Wang, Y. Brun, M.D. Ernst, Debugging distributed systems, Commun. ACM 59 (8) (2016) 32-37, http://dx.doi.org/10.1145/2909480.
[3] W. Ahmed, Y.W. Wu, A survey on reliability in distributed systems, J. Comput. System Sci. 79 (8) (2013) 1243-1255, http://dx.doi.org/10.1016/j.jcss.2013.02.006.
[4] D. Maruf, S. Sulistyo, L. Nugroho, Applying integrating testing of microservices in airline ticketing system, Ijitee (Int. J. Inf. Technol. Electr. Eng.) 4 (2020) 39, http://dx.doi.org/10.22146/ijitee.55491.
[5] F. Dai, H. Chen, Z. Qiang, Z. Liang, B. Huang, L. Wang, Automatic analysis of complex interactions in microservice systems, Complexity 2020 (2020) 1-12, http://dx.doi.org/10.1155/2020/2128793.
[6] J. Lewis, M. Fowler, Microservices: a definition of this new architectural term (2014), 2014, URL: http://martinfowler.com/articles/microservices.html.
[7] A. Basiri, N. Behnam, R. de Rooij, L. Hochstein, L. Kosewski, J. Reynolds, C. Rosenthal, Chaos engineering, IEEE Softw. 33 (3) (2016) 35-41, http://dx.doi.org/10.1109/MS.2016.60.
[8] R.T. Munodawafa, S.K. Johl, A systematic review of eco-innovation and performance from the resource-based and stakeholder perspectives, Sustainability 11 (2019) 6067, http://dx.doi.org/10.3390/su11216067.
[9] J.M. Macharia, Systematic literature review of interventions supported by integration of ICT in education to improve learners' academic performance in STEM subjects in Kenya, J. Educ. Pract. 6 (2022) 52-75, http://dx.doi.org/10.47941/jep.979.
[10] P. Gerli, J.N. Marco, J. Whalley, What makes a smart village smart? A review of the literature, Transform. Gov.: People Process. Policy 16 (2022) 292-304, http://dx.doi.org/10.1108/tg-07-2021-0126.
[11] R. Coppola, L. Ardito, Quality assessment methods for textual conversational interfaces: a multivocal literature review, Information 12 (2021) 437, http://dx.doi.org/10.3390/info12110437.
[12] B. Kitchenham, O. Pearl Brereton, D. Budgen, M. Turner, J. Bailey, S. Linkman, Systematic literature reviews in software engineering - A systematic literature review, Inf. Softw. Technol. 51 (1) (2009) 7-15, http://dx.doi.org/10.1016/j.infsof.2008.09.009, Special Section - Most Cited Articles in 2002 and Regular Research Papers.
[13] N. Dragoni, S. Giallorenzo, A.L. Lafuente, M. Mazzara, F. Montesi, R. Mustafin, L. Safina, Microservices: yesterday, today, and tomorrow, 2017, arXiv:1606.04036.
[14] P.D. Francesco, I. Malavolta, P. Lago, Research on architecting microservices: Trends, focus, and potential for industrial adoption, in: 2017 IEEE International Conference on Software Architecture, ICSA, 2017, pp. 21-30, http://dx.doi.org/10.1109/ICSA.2017.24.
[15] M. Fowler, Patterns of Enterprise Application Architecture, Addison-Wesley Longman Publishing Co., Inc., USA, 2002.
[16] J. Lewis, M. Fowler, Microservices, 2014, https://martinfowler.com/articles/microservices.html.
[17] S. Newman, Building Microservices: Designing Fine-Grained Systems, O'Reilly Media, Inc., 2021.
[18] C.K. Rudrabhatla, Comparison of zero downtime based deployment techniques in public cloud infrastructure, in: 2020 Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud), I-SMAC, 2020, pp. 1082-1086, http://dx.doi.org/10.1109/I-SMAC49090.2020.9243605.
[19] S.R. Addula, P. Perugu.P, M.K. Kumar, D. Kumar, B. Ananthan, R. R, S. P, S. G, Dynamic load balancing in cloud computing using hybrid Kookaburra-Pelican optimization algorithms, in: 2024 International Conference on Augmented Reality, Intelligent Systems, and Industrial Automation, ARIIA, 2024, pp. 1-7, http://dx.doi.org/10.1109/ARIIA63345.2024.11051893.
[20] M. Waseem, P. Liang, M. Shahin, A systematic mapping study on microservices architecture in DevOps, J. Syst. Softw. 170 (2020) 110798, http://dx.doi.org/10.1016/j.jss.2020.110798.
[21] C. Rosenthal, N. Jones, Chaos Engineering: System Resiliency in Practice, O'Reilly Media, 2020.
[22] L. Zhang, B. Morin, B. Baudry, M. Monperrus, Maximizing error injection realism for chaos engineering with system calls, IEEE Trans. Dependable Secur. Comput. 19 (4) (2022) 2695-2708, http://dx.doi.org/10.1109/TDSC.2021.3069715.
[23] Š. Davidovič, B. Beyer, Canary analysis service, Commun. ACM 61 (5) (2018) 54-62, http://dx.doi.org/10.1145/3190566.
[24] L. Zhang, B. Morin, P. Haller, B. Baudry, M. Monperrus, A chaos engineering system for live analysis and falsification of exception-handling in the JVM, IEEE Trans. Softw. Eng. 47 (11) (2021) 2534-2548, http://dx.doi.org/10.1109/TSE.2019.2954871.
[25] H. Jernberg, P. Runeson, E. Engström, Getting started with chaos engineering - design of an implementation framework in practice, in: Proceedings of the 14th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM '20, Association for Computing Machinery, New York, NY, USA, 2020, http://dx.doi.org/10.1145/3382494.3421464.
[26] A. Alkhateeb, C. Catal, G. Kar, A. Mishra, Hybrid blockchain platforms for the internet of things (IoT): A systematic literature review, Sensors 22 (4) (2022) http://dx.doi.org/10.3390/s22041304.
[27] R. van Dinter, B. Tekinerdogan, C. Catal, Predictive maintenance using digital twins: A systematic literature review, Inf. Softw. Technol. 151 (2022) 107008, http://dx.doi.org/10.1016/j.infsof.2022.107008.
[28] M. Jorayeva, A. Akbulut, C. Catal, A. Mishra, Machine learning-based software defect prediction for mobile applications: A systematic literature review, Sensors 22 (7) (2022) http://dx.doi.org/10.3390/s22072551.
[29] A. Basiri, L. Hochstein, N. Jones, H. Tucker, Automating chaos experiments in production, in: 2019 IEEE/ACM 41st International Conference on Software
[31] J. Zhang, R. Ferydouni, A. Montana, D. Bittman, P. Alvaro, 3MileBeach: A tracer with teeth, in: Proceedings of the ACM Symposium on Cloud Computing, SoCC '21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 458-472, http://dx.doi.org/10.1145/3472883.3486986.
[32] C.S. Meiklejohn, A. Estrada, Y. Song, H. Miller, R. Padhye, Service-level fault injection testing, in: Proceedings of the ACM Symposium on Cloud Computing, SoCC '21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 388-402, http://dx.doi.org/10.1145/3472883.3487005.
[33] A. Blohowiak, A. Basiri, L. Hochstein, C. Rosenthal, A platform for automating chaos experiments, in: 2016 IEEE International Symposium on Software Reliability Engineering Workshops, ISSREW, 2016, pp. 5-8, http://dx.doi.org/10.1109/ISSREW.2016.52.
[34] A. Nagarajan, A. Vaddadi, Automated fault-tolerance testing, in: 2016 IEEE Ninth International Conference on Software Testing, Verification and Validation Workshops, ICSTW, 2016, pp. 275-276, http://dx.doi.org/10.1109/ICSTW.2016.34.
[35] V. Heorhiadi, S. Rajagopalan, H. Jamjoom, M.K. Reiter, V. Sekar, Gremlin: Systematic resilience testing of microservices, in: 2016 IEEE 36th International Conference on Distributed Computing Systems, ICDCS, 2016, pp. 57-66, http://dx.doi.org/10.1109/ICDCS.2016.11.
[36] R.K. Lenka, S. Padhi, K.M. Nayak, Fault injection techniques - a brief review, in: 2018 International Conference on Advances in Computing, Communication Control and Networking, ICACCCN, 2018, pp. 832-837, http://dx.doi.org/10.1109/ICACCCN.2018.8748585.
[37] A. van Hoorn, A. Aleti, T.F. Düllmann, T. Pitakrat, ORCAS: Efficient resilience benchmarking of microservice architectures, in: 2018 IEEE International Symposium on Software Reliability Engineering Workshops, ISSREW, 2018, pp. 146-147, http://dx.doi.org/10.1109/ISSREW.2018.00-10.
[38] H. Tucker, L. Hochstein, N. Jones, A. Basiri, C. Rosenthal, The business case for chaos engineering, IEEE Cloud Comput. 5 (3) (2018) 45-54, http://dx.doi.org/10.1109/MCC.2018.032591616.
[39] N. Brousse, O. Mykhailov, Use of self-healing techniques to improve the reliability of a dynamic and geo-distributed ad delivery service, in: 2018 IEEE International Symposium on Software Reliability Engineering Workshops, ISSREW, 2018, pp. 1-5, http://dx.doi.org/10.1109/ISSREW.2018.00-40.
[40] K.A. Torkura, M.I. Sukmana, F. Cheng, C. Meinel, Security chaos engineering for cloud services: Work in progress, in: 2019 IEEE 18th International Symposium on Network Computing and Applications, NCA, 2019, pp. 1-3, http://dx.doi.org/10.1109/NCA.2019.8935046.
[41] H. Chen, P. Chen, G. Yu, A framework of virtual war room and matrix sketch-based streaming anomaly detection for microservice systems, IEEE Access 8 (2020) 43413-43426, http://dx.doi.org/10.1109/ACCESS.2020.2977464.
[42] K.A. Torkura, M.I.H. Sukmana, F. Cheng, C. Meinel, CloudStrike: Chaos engineering for security and resiliency in cloud infrastructure, IEEE Access 8 (2020) 123044-123060, http://dx.doi.org/10.1109/ACCESS.2020.3007338.
[43] D. Kesim, A. van Hoorn, S. Frank, M. Häussler, Identifying and prioritizing chaos experiments by using established risk analysis techniques, in: 2020 IEEE 31st International Symposium on Software Reliability Engineering, ISSRE, 2020, pp. 229-240, http://dx.doi.org/10.1109/ISSRE5003.2020.00030.
[44] Z. Long, G. Wu, X. Chen, C. Cui, W. Chen, J. Wei, Fitness-guided resilience testing of microservice-based applications, 2020, pp. 151-158, http://dx.doi.org/10.1109/ICWS49710.2020.00027.
[45] S. De, A study on chaos engineering for improving cloud software quality and reliability, in: 2021 International Conference on Disruptive Technologies for Multi-Disciplinary Research and Applications, CENTCON, Vol. 1, 2021, pp. 289-294, http://dx.doi.org/10.1109/CENTCON52345.2021.9688292.
[46] C. Konstantinou, G. Stergiopoulos, M. Parvania, P. Esteves-Verissimo, Chaos engineering for enhanced resilience of cyber-physical systems, in: 2021 Resilience Week, RWS, 2021, pp. 1-10, http://dx.doi.org/10.1109/RWS52686.2021.9611797.
[47] F. Poltronieri, M. Tortonesi, C. Stefanelli, ChaosTwin: A chaos engineering and digital twin approach for the design of resilient IT services, in: 2021 17th International Conference on Network and Service Management, CNSM, 2021, pp. 234-238, http://dx.doi.org/10.23919/CNSM52442.2021.9615519.
[48] N. Luo, Y. Xiong, Platform software reliability for cloud service continuity - challenges and opportunities, in: 2021 IEEE 21st International Conference on Software Quality, Reliability and Security, QRS, 2021, pp. 388-393, http://dx.doi.org/10.1109/QRS54544.2021.00050.
[49] H. Chen, K. Wei, A. Li, T. Wang, W. Zhang, Trace-based intelligent fault diagnosis for microservices with deep learning, in: 2021 IEEE 45th Annual Computers, Software, and Applications Conference, COMPSAC, 2021, pp. 884-893, http://dx.doi.org/10.1109/COMPSAC51774.2021.00121.
[50] O. Sharma, M. Verma, S. Bhadauria, P. Jayachandran, A guided approach
Engineering: Software Engineering in Practice, ICSE-SEIP, 2019, pp. 3140, towards complex chaos selection, prioritisation and injection, in: 2022 IEEE
http://dx.doi.org/10.1109/ICSE-SEIP.2019.00012. 15th International Conference on Cloud Computing, CLOUD, 2022, pp. 9193,
[30] L.B. Canonico, V. Vakeel, J. Dominic, P. Rodeghero, N. McNeese, Human-AI http://dx.doi.org/10.1109/CLOUD55607.2022.00025.
partnerships for chaos engineering, in: Proceedings of the IEEE/ACM 42nd [51] N. Luo, L. Zhang, Chaos driven development for software robustness enhance-
International Conference on Software Engineering Workshops, ICSEW 20, As- ment, in: 2022 9th International Conference on Dependable Systems and their
sociation for Computing Machinery, New York, NY, USA, 2020, pp. 499503, Applications, DSA, 2022, pp. 10291034, http://dx.doi.org/10.1109/DSA56465.
http://dx.doi.org/10.1145/3387940.3391493. 2022.00154.
13
E. Esen et al. Computer Standards & Interfaces 97 (2026) 104116

View File

@@ -0,0 +1,830 @@
Computer Standards & Interfaces 97 (2026) 104113
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
Co-distillation-based defense framework for federated knowledge graph
embedding against poisoning attacks
Yiqin Lu, Jiarui Chen, Jiancheng Qin
School of Electronic and Information Engineering, South China University of Technology, 510641, China
ARTICLE INFO

Keywords:
Federated learning
Knowledge graph
Poisoning attack
Knowledge distillation

ABSTRACT

Federated knowledge graph embedding (FKGE) enables collaborative knowledge sharing without data exchange, but it also introduces risks of poisoning attacks that degrade model accuracy or force incorrect outputs. Protecting FKGE from poisoning attacks has become a critical research problem. This paper reveals the malicious strategy of untargeted FKGE poisoning attacks and proposes CoDFKGE, a co-distillation-based FKGE framework for defending against poisoning attacks. CoDFKGE deploys two collaborative knowledge graph embedding models on clients, decoupling prediction parameters from shared parameters as a model-agnostic solution. By designing distinct distillation loss functions, CoDFKGE transfers clean knowledge from potentially poisoned shared parameters while compressing dimensions to reduce communication overhead. Experiments show CoDFKGE preserves link prediction performance with lower communication costs, eliminates malicious manipulations under targeted poisoning attacks, and significantly mitigates accuracy degradation under untargeted poisoning attacks.
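As background for the distillation machinery the framework relies on (formalized in Section 3.3), the standard temperature-scaled distillation loss can be sketched in a few lines of NumPy. This is a generic illustration of the common formulation, not code from the paper; the function names are invented.

```python
import numpy as np

def softened(logits, tau):
    """Temperature-scaled softmax: sigma(x) = softmax(x / tau)."""
    z = np.asarray(logits, dtype=float) / tau
    e = np.exp(z - z.max())          # shift by the max for numerical stability
    return e / e.sum()

def kd_loss(z_tea, z_stu, tau):
    """tau^2 * KL(softmax(z_tea/tau) || softmax(z_stu/tau)): the usual
    temperature-scaled distillation loss between teacher and student logits."""
    p_tea = softened(z_tea, tau)
    p_stu = softened(z_stu, tau)
    return tau ** 2 * float(np.sum(p_tea * np.log(p_tea / p_stu)))
```

Matching logits give zero loss, and the temperature controls how much probability mass the soft labels spread over non-argmax classes.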
1. Introduction

Knowledge graphs (KGs) are structured representations of real-world entities and their relationships, supporting applications in search engines [1,2], recommendation systems [3,4], and security analysis [5,6]. Knowledge graph embedding (KGE) techniques project entities and relations into low-dimensional vector spaces, enabling efficient knowledge reasoning and completion [7]. Due to privacy regulations and data sensitivity requirements, KGs across organizations within the same domain remain fragmented despite growing data volumes. In this context, federated knowledge graph embedding (FKGE) emerges as a collaborative learning technique for sharing KG embeddings without data exchange. However, the introduction of federation mechanisms brings new security risks: malicious participants can inject poisoned parameters during training or aggregation to launch a poisoning attack, degrading model accuracy or forcing incorrect outputs. Consequently, protecting FKGE systems against poisoning attacks has emerged as a critical research challenge.

Unlike graph neural network (GNN)-based models, KGE models usually rely on the translation-based model [8-11]. The embedding vectors of entities and relations in the KG are directly used as learnable parameters. KGE models utilize different score functions to measure the plausibility of triples (h, r, t). By contrasting the outputs of existing triples and negatively sampled triples, KGE models derive appropriate embeddings for entities and relations. However, real-world KGs of different organizations are often incomplete, making it difficult to train high-quality knowledge graph reasoning models. Moreover, KG data often contains a large amount of private data, and direct data sharing will inevitably lead to privacy leakage. For this reason, federated learning [12] is introduced into knowledge graph reasoning.

FKGE assumes that there are multiple participants with complementary but incomplete KGs, aiming to derive optimal knowledge embeddings for each participant without data exchange. Most existing studies [13-15] model FKGE as multiple clients that maintain local KGE models and a central server. Clients train models locally and upload the model parameters to the central server, which aggregates the parameters and then returns them to the clients.

However, since the embedding vectors are directly the model parameters, FKGE is highly vulnerable to poisoning attacks. With the intent to reduce model performance, steal sensitive information, or disrupt system stability, poisoning attacks refer to malicious modifications of parameters during local training or parameter aggregation on the server. To protect the participants of FKGE, it is necessary to propose a protection mechanism against FKGE poisoning attacks.

Moreover, other related indicators in FKGE deserve attention. For example, the federated learning of KGE requires frequent parameter
Corresponding author.
E-mail addresses: eeyqlu@scut.edu.cn (Y. Lu), ee_jrchen@mail.scut.edu.cn (J. Chen), jcqin@scut.edu.cn (J. Qin).
https://doi.org/10.1016/j.csi.2025.104113
Received 3 June 2025; Received in revised form 8 November 2025; Accepted 8 December 2025
Available online 9 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
Y. Lu et al. Computer Standards & Interfaces 97 (2026) 104113
exchange, and the use of a translation-based model will submit the entity or relation embeddings, which makes the communication overhead greater than that of traditional federated learning.

Knowledge distillation [16] is a model compression technique that improves the performance of a simple (student) model by transferring the knowledge from a complex (teacher) model. Distillation-based methods are considered to be a feasible solution to combat poisoning attacks [17-19]: a teacher model can extract clean knowledge from the poisoned parameters and transfer it to a student model, thereby improving robustness without changing the model structure. Co-distillation [20] is a variant of knowledge distillation that trains two or more models simultaneously, allowing mutual learning and information sharing. This paper aims to design a federated knowledge graph defense framework based on co-distillation, which can enhance the model's resistance to poisoning attacks through collaborative learning without changing the original FKGE architecture.

The rest of this paper is organized as follows. Section 2 reviews the related work on FKGE and knowledge distillation. Section 3 introduces the preliminary concepts and methodologies essential for addressing FKGE poisoning attacks, with the main contributions of this paper summarized at the end of this section. In Section 4, we detail the threat model and malicious strategies for targeted and untargeted poisoning attacks in FKGE. Section 5 presents the CoDFKGE framework for defending against FKGE poisoning attacks, followed by experimental validation in Section 6. Finally, concluding remarks and future research directions are outlined in Section 7.

2. Related work

2.1. Basic FKGE framework

Early research on FKGE mainly focused on how to achieve cross-client knowledge sharing and model aggregation while protecting data privacy. FedE [13] is the first paper to introduce federated learning into KGE. FedE facilitates cross-client knowledge sharing by maintaining an entity table. Nevertheless, the mechanism of sharing entity embeddings in FedE has been proven to contain privacy vulnerabilities [21]: attackers can leverage the embedding information to infer the existence of private triples within client datasets. Based on FedE, FedEC [14] applies embedding contrastive learning for tackling data heterogeneity and utilizes a global update procedure for sharing entity embeddings. In response to the privacy vulnerability of FedE, FedR [15] proposed a privacy-preserving relation embedding aggregation method. By sharing relation embeddings instead of entity embeddings, FedR can significantly reduce the communication overhead and privacy leakage risks while retaining the semantic information of the KG.

2.2. Knowledge distillation in FKGE

Knowledge distillation techniques are widely applied in the FKGE field due to their advantages in model compression and knowledge transfer. To cope with the drift between local optimization and global convergence caused by data heterogeneity, FedLU [22] proposes mutual knowledge distillation; moreover, it contains an unlearning method to erase specific knowledge from local clients. FedKD [23] uses knowledge distillation to reduce communication costs, and proposes to adaptively learn the temperature that scales the scores of triples to mitigate teacher over-confidence issues. In addition to FKGE, the KGE model CoLE [24] proposes co-distillation learning to exploit the complementarity of graph structure and text information. It employs a Transformer and BERT for graph and text respectively, then distills selective knowledge from each other's prediction logits. Overall, existing research on knowledge distillation in FKGE primarily focuses on handling data heterogeneity, with insufficient exploration of its potential value in model security. This paper will explore the application of knowledge distillation in FKGE security to defend against poisoning attacks.

2.3. Poisoning attack in federated learning

Federated learning (FL), due to its distributed training nature, creates favorable conditions for poisoning attacks while protecting data privacy. Poisoning attacks in federated learning have attracted significant attention from researchers [25]. In federated learning scenarios, poisoning attacks pose serious threats to model security by manipulating partial training data or local models to embed malicious behaviors [26]. The literature [27] generates stealthy backdoor triggers by extracting high-frequency features from images using the discrete wavelet transform and introduces an asymmetric frequency confusion mechanism, achieving efficient backdoor attacks on multiple datasets. Meanwhile, many studies have proposed defense methods against poisoning attacks. The literature [28] proposes the Krum method, which selects the most reliable gradient update by evaluating the consistency of gradients, thereby effectively defending against poisoning attacks. The literature [29] proposes FL-Defender, which improves robustness by introducing cosine similarity to adjust the weights of parameter aggregation. The literature [30] proposed a two-stage backdoor defense method called MCLDef based on model contrastive learning (MCL), which can significantly reduce the success rate of backdoor attacks with only a small amount of clean data. In summary, existing research on poisoning attacks in federated learning mainly focuses on traditional deep learning domains. The design ideas of these defense frameworks have laid the foundation for subsequent poisoning attack defense methods for FKGE.

2.4. Security issues in FKGE

With the development of FKGE, its security and privacy issues have attracted increasing attention, with existing research mainly focusing on privacy leakage defense. The literature [31] proposed a decentralized scalable learning framework where embeddings from different KGs can be learned in an asynchronous and peer-to-peer manner while being privacy-preserving. The literature [21] conducts the first holistic study of the privacy threat on FKGE from both attack and defense perspectives; it introduced three new inference attacks and proposed a differentially private FKGE model, DP-Flames, with private selection and an adaptive privacy budget allocation policy. Based on [21], the literature [32] introduces five new inference attacks and proposed PDP-Flames, which leverages the sparse gradient nature of FKGE for a better privacy-utility trade-off.

Compared with privacy leakage issues, research on defending against poisoning attacks in FKGE is still in its early stages. Traditional federated learning typically does not directly transmit original embeddings. However, entity and relation embeddings are core components in translation-based KGE, so direct transmission of embeddings is required during FKGE aggregation. Direct malicious modifications to embeddings are difficult to effectively defend against using traditional federated learning defense methods.

The recent literature [33] is the first work to systematize the risks of FKGE poisoning attacks. However, it primarily focuses on several forms of targeted poisoning attacks in FKGE, without mentioning untargeted poisoning attacks. Although this research provides some defense suggestions, such as zero-knowledge proofs and private set intersection, it does not propose specific defense methods. In summary, the existing research lacks a systematic introduction to the untargeted poisoning attack on FKGE, and there is no complete defense method against FKGE poisoning attacks.

To address the above issues, this paper reveals the malicious strategy of FKGE untargeted poisoning attacks and proposes CoDFKGE, a co-distillation-based federated knowledge graph embedding framework for defending against poisoning attacks. The main contributions of this paper are summarized as follows.
(1) We systematically define untargeted poisoning attacks in FKGE and reveal the poisoning attack's malicious strategy, thereby enhancing threat identification in FKGE and providing a foundation for subsequent defense research.

(2) We propose CoDFKGE, the first co-distillation defense framework against poisoning attacks in FKGE. By deploying bidirectional distillation models with distinct distillation losses at the client side, CoDFKGE, as a model-agnostic solution, decouples prediction parameters from shared parameters, thereby enhancing the model's resistance to poisoning attacks and improving robustness. We designed distinct distillation loss functions for the two models in CoDFKGE, enabling CoDFKGE to transfer clean knowledge from potentially poisoned shared parameters and compress shared parameter dimensions, which reduces communication overhead.

(3) We validated the performance of CoDFKGE against poisoning attacks through experiments. The results show that, without compromising link prediction performance, CoDFKGE can completely eliminate targeted poisoning attacks and significantly mitigate the performance degradation caused by untargeted poisoning attacks, while simultaneously reducing communication overhead. Ablation experiments further confirm the effectiveness of the two distillation loss functions in CoDFKGE.

3. Preliminaries

3.1. Knowledge graph embedding

A KG can be represented as (ℰ, ℛ, 𝒯), where ℰ and ℛ are the entity set and relation set, and 𝒯 is a set of triples; a triple (h, r, t) ∈ 𝒯 indicates that a relationship r ∈ ℛ connects the entities h, t ∈ ℰ.

Translation-based KGE models project the entities and relationships in KGs into a continuous vector space. Models employ the scoring function g(h, r, t; θ) to evaluate the plausibility of triples, where θ represents the embedding parameters. During model training, negative samples (h, r, t') are constructed by randomly replacing the tail entities of positive triples. The training process aims to maximize the score discrepancy between positive and negative samples. Currently, most KGE models [9,11] employ the binary cross-entropy loss to measure the difference between positive and negative samples. Its mathematical expression is given in Eq. (1):

L = -Σ_{(h,r,t)∈𝒯} [ log σ(g(h, r, t; θ) - γ) + Σ_i p(h, r, t'_i; θ) log σ(γ - g(h, r, t'_i; θ)) ]   (1)

Among them, γ represents the margin, and (h, r, t'_i) is the i-th negative triple. p(h, r, t'_i; θ) stands for the occurrence probability of this negative sample given the embedding parameters θ.

3.2. Federated knowledge graph embedding

FKGE is an application of federated learning that aims to fuse and share knowledge vectors from different KGs to enhance the effectiveness of KGE. Currently, most related studies are based on the framework proposed in FedE [13].

The basic framework of FKGE consists of a client set C and a central server S. Each client c ∈ C holds a local KG 𝒢_c(ℰ_c, ℛ_c, 𝒯_c). The entity sets of different KGs are partially overlapping, so the understanding of entities in a certain client can be supplemented by information from other clients. The server has the one-hot existence matrix M ∈ ℝ^{C×N} of all entities in the clients, where N is the number of entities.

In each client, the KGE model parameters consist of local parameters θ_L and shared parameters θ_S. During FKGE training, each epoch progresses through two sequential phases: client update and server aggregation. In the k-th client update stage, client c first trains its local KGE model to update its local embedding θ^k_{L_c} and server-shared embedding θ^k_{S_c}. Then, client c uploads its shared embedding θ^k_{S_c} to the server. In the server aggregation stage, the central server S aggregates the shared embeddings from all clients to obtain the shared parameters θ^{k+1}_S. Finally, the server broadcasts the shared parameters θ^{k+1}_S to all clients. Entity embeddings in KGE are usually shared parameters, while relation embeddings are local parameters. Only rare literature [15] uses relation embeddings as shared parameters.

In FKGE, how the server effectively aggregates shared embeddings from different clients is a common problem. The most common FKGE server aggregation method is FedE [13], which is an improvement on FedAvg [12]. To handle the imbalance in the number of entities across different clients, FedE aggregates the shared entities using the number of occurrences in the local data as the weight w_c. This weight value can be obtained using the existence matrix M mentioned above. The mathematical expression for FedE's server aggregation method is shown in Eq. (2):

θ^{k+1}_S = Σ_c w_c θ^k_{S_c}   (2)

The final target of FKGE is to minimize the loss functions of all clients' local triples simultaneously through federated learning. Its optimization objective can be expressed as Eq. (3):

arg min_{(θ_{L_c}, θ_{S_c})} Σ_{c}^{C} L_c(θ_{L_c}, θ_{S_c})   (3)

3.3. Knowledge distillation

Knowledge distillation is a model compression technique that transfers the knowledge contained in a complex model (teacher) to a simple model (student) to improve the performance of the simple model. In the classic knowledge distillation framework, the student model's training loss comprises two components: the cross-entropy loss L_CE, computed between its output and the true label, and the distillation loss L_KD, computed between its output and the teacher model's output (soft label). In practical applications, the distillation loss is usually quantified using the Kullback-Leibler divergence D_KL between the student model output and the soft label, and its mathematical expression is shown in Eq. (4):

D_KL(p_tea ‖ p_stu) = Σ_i p_tea(i) log ( p_tea(i) / p_stu(i) )
L_KD = τ² D_KL( σ(z^(n)_tea) ‖ σ(z^(n)_stu) ),  where σ(x) = softmax(x / τ)   (4)

Among them, z_tea and z_stu are the logits of the teacher model and student model, respectively. τ is the temperature coefficient, which is used to control the smoothness of the output.

To allow the student model to effectively absorb the knowledge contained in the teacher model while fitting the real data distribution, the final loss function is usually the weighted sum of L_CE and L_KD.

4. Threat model

Poisoning attacks in federated learning can be categorized into targeted poisoning attacks, semi-targeted poisoning attacks, and untargeted poisoning attacks according to the intention of the attackers [34]. In FKGE, a semi-targeted poisoning attack can be regarded as a special case of a targeted poisoning attack. Therefore, this paper focuses on the targeted and untargeted poisoning attack types.

4.1. Targeted poisoning attack

Targeted poisoning attacks are an attack strategy where the attacker crafts specific malicious triples that do not exist in the target system, and manipulates the target model to accept these fake triples by injecting poisoned parameters into the shared parameters. This type of attack poses a serious threat to the application of FKGE, as the false relationships it introduces can lead to reasoning errors and decision-making
Fig. 1. Process of targeted poisoning attack.
Fig. 2. Framework of CoDFKGE model.
biases in downstream tasks. For example, in financial transaction networks, a knowledge graph is constructed with transaction entities as nodes and transaction relationships as edges. Link prediction can then be applied to detect potential transaction relationships (such as money laundering or fraud). If an attacker compromises one of the participants, they can introduce false transaction relationships through targeted poisoning attacks, leading to unreasonable inferences about the victim entity.

To execute such an attack successfully, the attacker typically follows a multi-stage process that begins with gathering the victim's local information. Fig. 1 shows the process of a targeted poisoning attack. In FKGE systems, while the server can observe the entities and relations each client possesses, it lacks visibility into how these elements are structured into specific triples. However, for frameworks that share entity embeddings (such as FedE [13]), recent research [21] has shown that a malicious server can use the KGE scoring function to infer the victim's local relationship patterns and reconstruct the victim's triples 𝒯_v. Armed with this inferred knowledge, the attacker strategically constructs malicious triples 𝒯_m that align with the victim's existing KG schema but represent false information.

The next critical attack phase involves training a shadow model, a surrogate KGE model designed to mimic the victim's learning process. The shadow model is trained on a poisoned dataset 𝒯_p, which combines the inferred victim triples 𝒯_v and the malicious triples 𝒯_m. This training strategy ensures the shadow model learns to generate embeddings that are consistent with both the victim's genuine knowledge and the attacker's deceptive information. The shadow model's parameters include θ_{S_p}, which can be initialized with the victim's shared parameters θ_{S_c}, and θ_{L_p}, which approximates the victim's local model parameters θ_{L_c} from random initial values. To ensure the shadow model effectively bridges both the victim's genuine knowledge and the attacker's malicious objectives, its parameters are optimized to minimize the loss function across all triples in the poisoned dataset, as formalized in Eq. (5):

arg min_{(θ_{S_p}, θ_{L_p})} Σ_{(h,r,t)∈𝒯_p} L(h, r, t; θ_{S_p}, θ_{L_p})   (5)

where L is the loss function of the baseline model.

After training the shadow model, the attacker extracts the poisoned shared parameters θ_{S_p} using the same procedure that legitimate clients employ to prepare parameters for server aggregation. The attacker can aggregate the poisoned parameters θ_{S_p} with the normal clients' shared parameters. The attacker usually operates as a compromised server and assigns a disproportionately high weight to the poisoned parameters during the aggregation process to ensure that the poisoned parameters dominate the aggregated shared parameters.

The final stage of the attack exploits the implicit trust in federated systems. The victim client, unaware of the poisoning, directly incorporates the compromised aggregated parameters into its local training process without validation. As a result, the victim's model gradually learns to accept the malicious triples as valid, ultimately producing incorrect predictions on these non-existent relationships while maintaining seemingly normal performance on other parts of the KG.
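The overweighting step described above can be illustrated with a toy NumPy sketch of FedE-style weighted aggregation (Eq. (2) in Section 3.2). The vectors and weights below are hypothetical, chosen only to show how a disproportionate weight lets a poisoned update dominate the aggregate.

```python
import numpy as np

def aggregate(shared_embs, weights):
    """FedE-style aggregation: theta_S = sum_c w_c * theta_{S_c} (cf. Eq. (2))."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalize the client weights
    return np.tensordot(w, np.stack(shared_embs), axes=1)

honest = [np.array([1.0, 1.0]), np.array([0.9, 1.1])]
poisoned = np.array([-5.0, -5.0])            # attacker-crafted shared parameters

fair = aggregate(honest + [poisoned], [1.0, 1.0, 1.0])
rigged = aggregate(honest + [poisoned], [1.0, 1.0, 20.0])  # compromised server overweights the attacker

# the rigged aggregate is pulled far closer to the poisoned vector
assert np.linalg.norm(rigged - poisoned) < np.linalg.norm(fair - poisoned)
```

With equal weights the poisoned update is diluted by the honest clients; once the compromised server inflates its weight, the aggregate essentially becomes the attacker's vector.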
4.2. Untargeted poisoning attack

The conditions for achieving a targeted poisoning attack are complex. For example, FedR [15] shares only relation embeddings (not entity embeddings), preventing attackers from inferring victim relations via entity matrices and thus avoiding targeted poisoning attacks. Even with relational data leaks, targeted poisoning attacks are difficult: compared with sharing entity embeddings, the sparsity of relation embeddings reduces the shadow model's ability to align parameters with the victim's vector space. However, FedR has almost no defense effect against untargeted poisoning attacks.

An untargeted poisoning attack means that the attacker aims to disrupt victim model convergence or maximize the mispredictions among test cases. By maximizing the victim's loss function during training, attackers can force non-convergent predictions. The attacker can generate the poisoned shared parameter θ*_{S_v} for the victim, which can be formalized in Eq. (6):

arg max_{θ*_{S_v}} Σ_{(h,r,t)∈𝒯_v} L(h, r, t; θ*_{S_v}, θ_{L_v})   (6)

Among them, θ_{L_v} denotes the victim's local parameters and 𝒯_v is the victim's triple set. Since it is difficult for the attacker to obtain these two quantities directly, they can use random values as guesses for θ_{L_v} and use triples formed by random combinations of ℰ_v and ℛ_v as guesses for 𝒯_v.

In particular, for the TransE model [7] with the scoring function g(h, r, t) = |h + r - t|, the attacker can launch an untargeted poisoning attack by setting the shared parameters θ'_{S_v} sent to the victim to identical values or by using negated aggregation parameters. To avoid detection, noise is often added to the poisoned parameters. The prediction performance of the victim model may even be lower than that of standalone training without federated aggregation.

In general, the success of FKGE poisoning attacks relies on victims using attacker-provided aggregate parameters directly for training without validation. To prevent poisoning attacks, it is critical to isolate the parameters of the prediction model from externally provided aggregate parameters. Specifically, potentially poisoned shared parameters must be filtered before training. Meanwhile, minimizing parameter exposure to the external environment is essential. Therefore, we propose CoDFKGE, a defense FKGE framework based on co-distillation.

5. Model design

CoDFKGE is a training framework on the client side. Its training process is shown in Fig. 2. CoDFKGE initializes two baseline models. To facilitate the reproducibility of our CoDFKGE model, we provide the complete training framework pseudocode as shown in Algorithm 1.

Algorithm 1 CoDFKGE Training Framework
Require: Baseline KGE model g, training triples 𝒯, learning rate η, distillation weight β, distillation temperature τ, total iterations K
Initialization:
1: Initialize client-side prediction model with θ^P_0 = (θ^S_0, θ^L_0) ▷ Local parameters randomly initialized
2: Initialize client-side communication model with reduced feature dimensions
3: Initialize server-side aggregated parameters θ^S_1 = θ^S_0 ▷ First-round initialization
Main Training Loop (Iterations k = 1, 2, ..., K):
// Client Update Phase (For each client)
4: for each client c ∈ C do
5:   // Step 1: Communication-to-Prediction Model Distillation
6:   Load server-shared parameters θ^S_k ▷ Latest global shared embeddings
7:   Initialize communication model with θ^C = (θ^S_k, θ^{C_L}_{k-1})
8:   Freeze communication model parameters ▷ Acts as teacher model
9:   Compute distillation loss L^P_{k,KD} using Equation (7) ▷ Only positive samples
10:  Compute KGE loss L^P_{k,KGE} on training triples 𝒯
11:  Update prediction model parameters (θ^{P_S}_k, θ^{P_L}_k) with:
12:  ∇θ^P_k = ∇(β L^P_{k,KGE} + (1 - β) L^P_{k,KD}) ▷ Gradient flows through prediction model only
13:  θ^P_k = θ^P_k - η ∇θ^P_k, where θ^P_k = {θ^{P_L}_k, θ^{P_S}_k} ▷ Update prediction model parameters
14:  Unfreeze communication model parameters
15:  // Step 2: Prediction-to-Communication Model Distillation
16:  Freeze prediction model parameters θ^P_k ▷ Used as teacher model
17:  Compute distillation loss L^C_{k,KD} using Equation (9) ▷ Both samples
18:  Update communication model parameters (θ^{C_S}_k, θ^{C_L}_k) with:
19:  ∇θ^C_k = ∇L^C_{k,KD} ▷ Gradient flows through communication model only
20:  θ^C_k = θ^C_k - η ∇θ^C_k, where θ^C_k = {θ^{C_S}_k, θ^{C_L}_k}
21:  Upload updated shared parameters θ^{C_S}_k to server
22:  Unfreeze prediction model parameters
23: end for
// Server Aggregation Phase
with the same structure and scoring function, but for different purposes. 24: Server aggregates 𝜃𝑘𝑆 + 1 from all clients using baseline federated
The communication model is mainly responsible for receiving and aggregate method.
processing shared parameters, while the prediction model is used for 25: Set 𝑘 = 𝑘 + 1 and repeat main loop until 𝑘 > 𝐾 ⊳ Continue Main
the final embedding and prediction. To minimize potential parameter Training Loop
leakage and communication overhead, the feature dimension of the return Final prediction model parameters of each client.
communication model is intentionally designed to be smaller than that
of the prediction model.
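To make the client-side procedure above concrete, here is a minimal runnable sketch; it is our own toy, not the paper's implementation. Each model is reduced to a vector of triple scores, and one distillation step is a gradient step on τ²·D_KL(σ(teacher) ∥ σ(student)), whose gradient with respect to the student scores is τ(σ(student) − σ(teacher)). All function and variable names are ours.

```python
import numpy as np

def softened(scores, tau):
    """sigma(.): temperature-tau softmax of a vector of triple scores."""
    z = np.asarray(scores, dtype=float) / tau
    e = np.exp(z - z.max())
    return e / e.sum()

def kd_step(student, teacher, lr=0.5, tau=2.0):
    """One gradient step on tau^2 * D_KL(sigma(teacher) || sigma(student));
    its gradient w.r.t. the student scores is tau*(sigma(student) - sigma(teacher))."""
    return student - lr * tau * (softened(student, tau) - softened(teacher, tau))

rng = np.random.default_rng(0)
pred = rng.normal(size=5)     # prediction model "scores" (stay on the client)
shared = rng.normal(size=5)   # server-shared, possibly poisoned, parameters

# Step 1: the communication model loads the shared parameters and, as a
# frozen teacher, distills into the prediction model.
comm = shared.copy()
for _ in range(1000):
    pred = kd_step(pred, comm)

# Step 2: roles reverse; the prediction model teaches the communication
# model, and only the communication model's shared part is uploaded.
for _ in range(1000):
    comm = kd_step(comm, pred)
upload = comm                 # prediction parameters are never uploaded
```

After step 1 the prediction model's softened scores match the communication model's; after step 2 the communication model has re-absorbed the prediction model's knowledge, and only the communication model's state ever reaches the server.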
During the training process, the two models learn collaboratively through knowledge distillation. Once the communication model receives the potentially poisoned shared parameters from the server, it acts as a teacher model to transfer clean knowledge to the prediction model. Following the training of the prediction model, the roles are reversed: the prediction model becomes the teacher, and the communication model serves as the student for distillation. This stage extracts knowledge from the prediction model and compresses it into the communication model, ensuring efficient knowledge sharing while minimizing parameter exposure and communication overhead. By deploying two distinct model instances, the framework physically isolates attacker-injected parameters from the prediction model's parameters, making poisoning attacks significantly more difficult to execute.

CoDFKGE is designed to be model-agnostic, enabling seamless integration with diverse FKGE models based on their shared parameter types. Both the communication and prediction models used by CoDFKGE clients utilize the same scoring function g as the original KGE model. Clients upload and utilize shared parameters identically to the baseline model, with these parameters maintaining the same form and dimensionality as the original implementation. This parameter compatibility enables the server to aggregate updates using existing federated learning aggregation methods without modification. This design ensures that CoDFKGE preserves the original knowledge representation capabilities while maintaining consistent operational semantics with the baseline model.
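The distillation machinery used in both directions (the temperature-softened KL term of Eqs. (7) and (9), and the self-adversarial weights of Eq. (10)) can be sketched as follows. This is a toy reading in which σ(·) is taken as a temperature softmax over a vector of candidate scores; the names and example values are ours, not the paper's code.

```python
import numpy as np

def softened(scores, tau):
    """Our reading of sigma(.): a temperature-tau softmax over candidate scores."""
    z = np.asarray(scores, dtype=float) / tau
    e = np.exp(z - z.max())
    return e / e.sum()

def kd_term(teacher_scores, student_scores, tau=2.0):
    """tau^2 * D_KL(sigma(teacher) || sigma(student)): the building block of
    the distillation losses in Eqs. (7) and (9)."""
    p = softened(teacher_scores, tau)
    q = softened(student_scores, tau)
    return float(tau**2 * np.sum(p * (np.log(p) - np.log(q))))

def self_adv_weight(neg_scores, tau_alpha=1.0):
    """Eq. (10): p(h, r, t_i) = exp(tau_alpha*g_i) / sum_j exp(tau_alpha*g_j)."""
    z = tau_alpha * np.asarray(neg_scores, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

teacher = np.array([2.0, 0.5, -1.0])   # illustrative score vectors
student = np.array([1.0, 1.0, 0.0])
loss = kd_term(teacher, student)       # > 0; shrinks as the student matches
w = self_adv_weight([0.9, 0.1, -2.0])  # higher-scoring negatives weigh more
```

The full losses sum such terms over the training triples: Eq. (7) over positive samples only, Eq. (9) additionally over negatives reweighted by `self_adv_weight`.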
Y. Lu et al. Computer Standards & Interfaces 97 (2026) 104113
5.1. Communication to prediction model distillation

In the first iteration, the model trains the prediction component following the standard procedure. Starting from the second iteration of the training process, the communication model loads the server-shared parameters θ_k^S and initializes itself jointly with the local embeddings θ_{k−1}^L from the previous iteration's local prediction model.

After the communication model receives and applies the server-shared parameters, it filters out potentially poisoned model parameters through knowledge distillation. The communication model acts as a teacher model to transfer clean knowledge to the prediction model, which serves as the student model. During this process, the communication model parameters are frozen to ensure that the knowledge transfer direction is strictly from the communication model to the prediction model: gradients flow only through the prediction model parameters, while the frozen communication model prevents gradient leakage back to the potentially poisoned shared parameters.

If the communication model suffers from a poisoning attack and contains poisoned parameters, its outputs for negative samples are not reliable. Distilling or teaching such uncertain predictions would propagate noise rather than useful knowledge. To exclude the poisoned knowledge, the prediction model should focus on positive samples during distillation, ensuring that only trustworthy knowledge is transferred. The distillation loss of the prediction model in the kth training epoch is given in Eq. (7):

$$L_{KD}^{P_k} = \tau^2 \sum_{(h,r,t)\in\mathcal{T}} D_{KL}\Big(\sigma\big(g(h, r, t;\ \theta_k^{S}, \theta_{k-1}^{PL})\big)\ \Big\|\ \sigma\big(g(h, r, t;\ \theta_k^{PS}, \theta_k^{PL})\big)\Big) \tag{7}$$

Among them, τ is the distillation temperature coefficient, and σ is the softmax function applied to the model outputs divided by τ. g represents the scoring function of the prediction model, which is also used to compute the KGE loss. g(h, r, t; θ_k^S, θ_{k−1}^{PL}) denotes the communication model output under the server-shared parameter θ_k^S and local parameter θ_{k−1}^{PL}, and g(h, r, t; θ_k^{PS}, θ_k^{PL}) denotes the output of the prediction model being trained.

When training with distillation, the model also needs to consider the KGE loss function. The overall loss of the prediction model is the weighted sum of the KGE loss and the distillation loss, given in Eq. (8):

$$L_k^{P} = \beta L_{KGE}^{P_k} + (1-\beta) L_{KD}^{P_k} \tag{8}$$

where L_{KGE}^{P_k} is the KGE loss of the kth epoch of the prediction model defined by Eq. (1), and β is the weight.

5.2. Prediction to communication model distillation

After training the prediction model, we train the communication model through distillation, which extracts and propagates knowledge without directly sharing prediction parameters, thereby avoiding privacy leakage. During the communication model's distillation, the outputs of the prediction model under positive and negative samples serve as soft labels. As Eq. (1) illustrates, the loss function must account for the probability of negative samples when balancing the impact of positive and negative predictions. Therefore, the distillation loss function of the communication model is formalized in Eq. (9):

$$L_{KD}^{C_k} = \tau^2 \sum_{(h,r,t)\in\mathcal{T}} \bigg( D_{KL}\Big(\sigma\big(g(h, r, t;\ \theta_k^{PS}, \theta_k^{PL})\big)\ \Big\|\ \sigma\big(g(h, r, t;\ \theta_k^{CS}, \theta_k^{CL})\big)\Big) + \sum_i p(h, r, t_i)\, D_{KL}\Big(\sigma\big(g(h, r, t_i;\ \theta_k^{PS}, \theta_k^{PL})\big)\ \Big\|\ \sigma\big(g(h, r, t_i;\ \theta_k^{CS}, \theta_k^{CL})\big)\Big) \bigg) \tag{9}$$

Among them, g(h, r, t; θ_k^{CS}, θ_k^{CL}) represents the communication model output, and g(h, r, t; θ_k^{PS}, θ_k^{PL}) represents the prediction model output under the shared parameter θ_k^{PS} and local parameter θ_k^{PL}. The calculation of p follows the approach in [9], with its mathematical formulation given in Eq. (10):

$$p(h, r, t_i) = \frac{\exp\big(\tau_\alpha\, g(h, r, t_i)\big)}{\sum_j \exp\big(\tau_\alpha\, g(h, r, t_j)\big)} \tag{10}$$

where τ_α is the self-adversarial sampling temperature.

After the bidirectional distillation process of CoDFKGE, the communication model parameters are updated to θ_k^{CS} and θ_k^{CL}. The client then uploads θ_k^{CS} to the server, which aggregates these parameters from all clients using federated averaging to generate the next round's shared parameters θ_{k+1}^S.

6. Experiments

Experiments are conducted on the openly available dataset FB15K-237 [35], a subset of Freebase containing 14,505 entities, 544,230 triples, and 474 relations. To perform federated learning, we adopt the relational partitioning method of [22]. This method first partitions the relations through clustering, ensuring that the triple relations within each partition are as close as possible. These partitions are then divided into groups of roughly equal numbers of triples and distributed to the clients. This yields tighter triple relations within each client, better reflecting real-world scenarios.

The TransE model [7] is selected as the KGE model, serving as the foundation for all federated learning methods in the experiments, including the attacker's shadow model. To benchmark CoDFKGE, we select multiple baseline models. First, the locally trained model without federated learning is selected as the KGE baseline; it does not share parameters between clients, so it has no communication overhead and is not vulnerable to poisoning attacks. Then, FedE [13] and FedR [15] are chosen as baseline FKGE models, representing standard approaches in the field. Additionally, we implement a knowledge distillation model, which uses communication and prediction models similar to CoDFKGE but performs only unidirectional knowledge distillation. Specifically, it uses the communication model as the teacher and the prediction model as the student to filter out poisoning knowledge, with the distillation loss function following Eq. (4).

All experiments are performed on a 72-core Ubuntu 18.04.6 LTS machine with an Intel(R) Xeon(R) Gold 5220 CPU @ 2.20 GHz and a V100S-PCIE-32GB GPU. We implemented the proposed FKGE framework and the baseline models based on PyTorch Geometric [36] and the distributed AI framework Ray [37]. We used KGE hyperparameter settings based on [9] and FKGE hyperparameter settings based on FedE [13]. Specifically, we used the Adam [38] optimizer with a learning rate of 1e−3, γ is 10, and the self-adversarial negative sampling temperature τ_α in KGE is 1. The distillation temperature τ is 2, and the weight β balancing the distillation and KGE losses is 0.5. The maximum number of training epochs is 400. In each epoch, a client performs 3 local iterations before uploading its parameters to the server.

We utilize the link prediction task, a sub-task of KGE, to validate the models' accuracy. Following the common implementation of link prediction, we employ the Mean Reciprocal Rank (MRR) and Hits@N as accuracy metrics. The MRR is the average of the reciprocals of the ranks of the predicted triples among all possible triples. Mathematically, if rank_i is the rank of the correct triple for the ith query and n is the total number of queries, then

$$MRR = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{rank_i}.$$

Hits@N is the proportion of query triples for which the correct triple is present among the top N candidates generated by the model. Generally, higher values for both metrics indicate better link prediction performance.

Through the experiments, the following research questions will be verified.

RQ1 Does CoDFKGE maintain KGE prediction performance while reducing FKGE communication overhead?
RQ2 Can CoDFKGE effectively defend against targeted poisoning attacks?
Table 1
Experiment results on normal link prediction.
Fed type Model Mem(MB) CC(MB) MRR Hits@1 Hits@5 Hits@10
Local Local(128) 57.05 0.4081 ± 0.0015 0.3066 ± 0.0014 0.5223 ± 0.0023 0.6077 ± 0.0015
Entity FedE(128) 185.58 42.60 0.4082 ± 0.0004 0.3068 ± 0.0012 0.5232 ± 0.0013 0.6080 ± 0.0018
Entity Distillation (128-128) 356.10 42.60 0.4129 ± 0.0008 0.3118 ± 0.0016 0.5279 ± 0.0008 0.6122 ± 0.0003
Entity CoDFKGE (128-128) 356.10 42.60 0.4109 ± 0.0043 0.3097 ± 0.0041 0.5246 ± 0.0044 0.6087 ± 0.0040
Entity Distillation (32-128) 217.39 10.65 0.3914 ± 0.0011 0.2935 ± 0.0008 0.5005 ± 0.0014 0.5838 ± 0.0032
Entity CoDFKGE (32-128) 217.40 10.65 0.4090 ± 0.0010 0.3079 ± 0.0007 0.5233 ± 0.0019 0.6068 ± 0.0019
Relation FedR(128) 75.49 0.69 0.4085 ± 0.0011 0.3079 ± 0.0021 0.5219 ± 0.0016 0.6066 ± 0.0017
Relation Distillation (128-128) 151.74 0.69 0.4106 ± 0.0013 0.3092 ± 0.0023 0.5242 ± 0.0008 0.6098 ± 0.0009
Relation CoDFKGE (128-128) 150.02 0.69 0.4065 ± 0.0007 0.3056 ± 0.0013 0.5190 ± 0.0023 0.6063 ± 0.0012
Relation Distillation (32-128) 94.53 0.17 0.3920 ± 0.0012 0.2960 ± 0.0007 0.4996 ± 0.0019 0.5807 ± 0.0013
Relation CoDFKGE (32-128) 93.69 0.17 0.4078 ± 0.0009 0.3060 ± 0.0007 0.5224 ± 0.0031 0.6074 ± 0.0015
RQ3 Can CoDFKGE effectively defend against untargeted poisoning attacks?
RQ4 Do the two proposed distillation loss functions individually contribute to poisoning defense?

6.1. Normal link prediction (RQ1)

To explore the performance of the proposed model in normal link prediction, we first test the model on a conventional dataset. The performance of the model is measured using MRR, Hits@1, Hits@5, and Hits@10. Each model is trained by federated learning and evaluated on the local test sets of the clients.

Table 1 lists the performance of the local KGE model, FedE, FedR, and CoDFKGE with different dimensions. The experimental results are grouped according to the type of shared embeddings and the dimension of the prediction model. The parameter dimensions are specified in parentheses within the Model column; for example, CoDFKGE(32-128) denotes the CoDFKGE model with a 32-dimensional communication model and a 128-dimensional prediction model. All link prediction experiments were repeated 5 times with different random seeds, and the accuracy results of all models are reported as (mean ± standard deviation). The best-performing model results in each group (excluding the local model) are bolded. The results of the CoDFKGE(32-128) model that are better than those of Distillation(32-128) are underlined.

The performance of locally trained models is lower than that of most federated learning models, highlighting the advantage of sharing model parameters. High-dimensional Distillation(128-128) models achieve better link prediction performance. Compared to Distillation(128-128), CoDFKGE models show slightly inferior prediction performance; the co-distillation process in CoDFKGE may lead to a loss of generalization accuracy. However, comparing models with the same dimensions, CoDFKGE outperforms both the local baselines and the federated baselines (FedE, FedR). We believe that the main advantage of CoDFKGE is its ability to enhance the security of FKGE: in addition to the security performance demonstrated in Sections 6.2 and 6.3, it maintains link prediction performance comparable to its baseline FKGE models.

Beyond accuracy metrics, the CC (Communication Cost) column reports the communication overhead per training epoch, calculated from the byte size of the PyTorch Embedding used in the implementation. The Mem column shows the GPU memory usage of the federated models in MB. Distillation-based models require maintaining two KGE models, resulting in higher computational resource consumption, and need larger GPU memory to store the parameters of both models. Compared to using model parameters of the same size, distillation-based models allow compressing the parameters of the communication model, achieving significantly lower communication overhead. At this smaller communication overhead, CoDFKGE(32-128) outperforms Distillation(32-128) in link prediction performance. Therefore, we conclude that the CoDFKGE model does not degrade the normal link prediction performance of baseline FKGE models and can effectively reduce the communication overhead of the model.

6.2. Targeted poisoning attack experiment (RQ2)

In the targeted poisoning attack, 32 pairs of non-existent triples are selected as attack targets from the victim's KG through negative sampling to construct a poisoned triple dataset. First, a predetermined number of normal triples are selected from the victim's training triples. Subsequently, the head or tail nodes of these triples are randomly replaced, and any triples already existing in the training set are iteratively removed, until 32 pairs of non-existent triples are successfully constructed. In each epoch, the shadow model undergoes the same number of local training rounds as the legitimate clients on the poisoned dataset to generate poisoned parameters. The malicious server aggregates these poisoned parameters with the parameters of the normal clients into shared parameters and distributes them to all clients. Attackers can assign high weights to the poisoned model parameters during aggregation; following the setup in Ref. [33], we set the weight of the attacker's aggregated poisoned triples to be 256 times that of normal triples. The experiments focus on models with shared entity parameters (required for targeted poisoning attacks) and the non-federated local baseline.

For space considerations, this section reports only the MRR and Hits@10 metrics. Attack effectiveness is measured by the MRR and Hits@10 of the poisoned triples on the victim: higher metrics for the poisoned triples indicate greater vulnerability to poisoning and weaker resistance to targeted poisoning attacks.

Table 2 lists the performance of the baseline models and CoDFKGE under targeted poisoning attacks, grouped by the prediction model dimension. The parameter dimensions are specified in parentheses within the Model column. The All Clients column reports the average performance across all clients' test sets during attacks, while Victim Poisoned measures the victim's performance on predicting the poisoned triples. All experiments were repeated 5 times with different random seeds, and the results are reported as (mean ± standard deviation). The best-performing model results are bolded. Moreover, the Communication Poison column reports the communication model's performance on the poisoned triples for CoDFKGE and the distillation model, showing that both communication models are impacted by targeted poisoning attacks. Through distillation, the prediction accuracy on the poisoned triples by the prediction model decreases in both cases.

For targeted poisoning attacks, the primary evaluation metrics should be the MRR and Hits@10 of the victim model when predicting the poisoned triples. The Local training model, which does not employ federated learning, remains immune to poisoning attacks, resulting in a low MRR for the poisoned triples, with the Hits@10 value being exactly 0. This indicates that the unpoisoned Local model never includes the non-existent poisoned triples among its top-10 candidate results. If a model incorrectly ranks non-existent poisoned test triples among its top-10 candidates, the poisoning attack has successfully manipulated the model's predictions. Therefore, we use Hits@10 as the metric to measure the Attack Success Rate (ASR).
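The poisoned-triple construction just described can be sketched as follows. This is a simplified reading: the toy KG, the 50/50 head-or-tail choice, and all names are our assumptions, not the paper's code.

```python
import random

def make_poisoned_triples(train_triples, entities, n_poisoned=32, seed=0):
    """Corrupt the head or tail of sampled training triples, discarding any
    candidate that already exists, until n_poisoned non-existent triples remain."""
    rng = random.Random(seed)
    existing = set(train_triples)
    poisoned = []
    while len(poisoned) < n_poisoned:
        h, r, t = rng.choice(train_triples)
        if rng.random() < 0.5:
            cand = (rng.choice(entities), r, t)   # replace the head entity
        else:
            cand = (h, r, rng.choice(entities))   # replace the tail entity
        if cand not in existing and cand not in poisoned:
            poisoned.append(cand)
    return poisoned

# Toy KG purely for illustration.
entities = [f"e{i}" for i in range(20)]
train = [(f"e{i}", "rel", f"e{i+1}") for i in range(10)]
targets = make_poisoned_triples(train, entities)
```

The shadow model would then be trained on `targets` to produce the poisoned parameters that the malicious server mixes into the aggregate.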
Table 2
Experiment results under targeted poisoning attacks.
Model All clients Victim poison Communication poison
MRR Hits@10 MRR Hits@10(ASR) MRR Hits@10
Local(128, unpoisoned) 0.4081 ± 0.0015 0.6077 ± 0.0015 0.0003 ± 0.0001 0.0000 ± 0.0000
FedE(128) 0.4034 ± 0.0035 0.6004 ± 0.0029 0.4450 ± 0.0938 0.7857 ± 0.1248
Distillation(128-128) 0.4026 ± 0.0025 0.6006 ± 0.0039 0.0844 ± 0.0552 0.2000 ± 0.1311 0.4999 ± 0.1429 0.7714 ± 0.1046
CoDFKGE(128-128) 0.4086 ± 0.0007 0.6089 ± 0.0012 0.0010 ± 0.0003 0.0009 ± 0.0005 0.4694 ± 0.1511 0.6589 ± 0.1242
Distillation(32-128) 0.3821 ± 0.0022 0.5717 ± 0.0018 0.1511 ± 0.3356 0.1960 ± 0.4362 0.4919 ± 0.2364 0.6625 ± 0.1887
CoDFKGE(32-128) 0.3856 ± 0.0039 0.5740 ± 0.0054 0.0010 ± 0.0001 0.0010 ± 0.0003 0.3794 ± 0.0032 0.5702 ± 0.0050
Fig. 3. Performance degradation comparison.
The FedE model maintains high prediction accuracy on normal test triples when under attack, but exhibits abnormally high MRR and Hits@10 metrics for the targeted poisoned triples, even exceeding those of normal triples. This indicates that targeted poisoning attacks can effectively manipulate the FedE model into generating incorrect predictions. Similarly, in the distillation-based models, the communication models are severely affected by poisoning attacks, while the impact on the prediction models is relatively minor. Although the Distillation(128-128) model can partially eliminate poisoning knowledge, it still remains vulnerable to targeted poisoning attacks. Moreover, as the dimension of the communication model parameters increases, the model's vulnerability to poisoning attacks also grows.

In contrast, CoDFKGE's prediction model performs distillation learning exclusively on verified positive samples, effectively eliminating potential poisoning knowledge that might exist in the negative samples. Similar to the Local training model, CoDFKGE achieves extremely low MRR and Hits@10 metrics for the poisoned triples, which demonstrates that the CoDFKGE model can effectively defend against targeted poisoning attacks in FKGE. Furthermore, due to the compression of the communication model's dimension, the amount of information that attackers can transmit is correspondingly reduced, making the communication model in CoDFKGE(32-128) less susceptible to poisoning attacks.

6.3. Untargeted poisoning attack experiment (RQ3)

In the untargeted poisoning attack experiments, the attacker returns negated aggregate parameters to the victim client, preventing the victim model from converging and degrading its prediction performance. The results presented in this section reflect the average prediction performance on the clients' local test triples.

Table 3 lists the performance of each model under untargeted poisoning attacks, grouped by the prediction model dimension and the federated type. The parameter dimensions are specified in parentheses within the Model column. The All Clients column shows the average performance of all clients under untargeted poisoning attacks, and the Victim column shows the performance of the victim client. To measure the severity of the attack, the MRR of the local model in Table 1 is used as a benchmark: the Decay Ratio column shows the relative performance degradation of the victim client compared to the local model in Table 1. All experiments were repeated 5 times with different random seeds, and the results are reported as (mean ± standard deviation). The best and second-best results in each group are marked in bold and underlined.
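The negative-aggregation attack underlying these experiments can be sketched as follows. This is our own toy formulation: a FedAvg-style mean, a negated return value, and a small noise term standing in for the detection-evading noise mentioned earlier; `noise_scale` and all names are illustrative assumptions.

```python
import numpy as np

def aggregate(client_shared):
    """Benign FedAvg-style mean of the clients' shared embedding matrices."""
    return np.mean(client_shared, axis=0)

def untargeted_poison(agg, noise_scale=0.01, rng=None):
    """Return the negated aggregate to the victim, plus small noise so the
    manipulation is less conspicuous."""
    rng = rng or np.random.default_rng(0)
    return -agg + noise_scale * rng.normal(size=agg.shape)

rng = np.random.default_rng(1)
clients = [rng.normal(size=(4, 3)) for _ in range(5)]  # toy shared embeddings
benign = aggregate(clients)
poisoned = untargeted_poison(benign)   # what the malicious server sends back
```

A victim that loads `poisoned` directly is pulled away from the consensus embedding space every round, which is exactly the non-convergence the experiments measure.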
Table 3
Experiment results under untargeted poisoning attacks.
Fed Type Model All clients Victim Decay ratio (%)
MRR Hits@10 MRR Hits@10 MRR Hits@10
Entity FedE(128) 0.3896 ± 0.0010 0.5939 ± 0.0009 0.3625 ± 0.0102 0.5620 ± 0.0144 11.21 7.58
Entity Distillation(128-128) 0.3900 ± 0.0017 0.5921 ± 0.0007 0.3641 ± 0.0012 0.5664 ± 0.0018 11.82 7.54
Entity CoDFKGE(128-128) 0.4084 ± 0.0007 0.6068 ± 0.0003 0.4017 ± 0.0010 0.6009 ± 0.0005 2.25 1.28
Entity Distillation (32-128) 0.3024 ± 0.0208 0.5422 ± 0.0105 0.2739 ± 0.0264 0.5262 ± 0.0124 30.02 9.49
Entity CoDFKGE (32-128) 0.4093 ± 0.0018 0.6081 ± 0.0014 0.4022 ± 0.0022 0.6023 ± 0.0011 1.66 0.75
Relation FedR(128) 0.3915 ± 0.0010 0.5951 ± 0.0016 0.3637 ± 0.0093 0.5636 ± 0.0150 10.96 7.10
Relation Distillation(128-128) 0.3978 ± 0.0017 0.6022 ± 0.0019 0.3881 ± 0.0023 0.5942 ± 0.0028 5.51 2.56
Relation CoDFKGE(128-128) 0.4086 ± 0.0017 0.6075 ± 0.0029 0.4014 ± 0.0020 0.6018 ± 0.0037 1.24 0.75
Relation Distillation (32-128) 0.3058 ± 0.0079 0.5463 ± 0.0029 0.2787 ± 0.0101 0.5307 ± 0.0038 27.78 8.61
Relation CoDFKGE (32-128) 0.4090 ± 0.0008 0.6066 ± 0.0011 0.4026 ± 0.0008 0.6018 ± 0.0013 1.27 0.92
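One plausible reading of Table 3's Decay Ratio, consistent with the reported figures up to seed variation, is the relative drop of the victim's metric versus the unattacked local baseline of Table 1; this exact formula is our assumption.

```python
def decay_ratio(baseline_metric, attacked_metric):
    """Relative performance drop in percent (our assumed definition)."""
    return 100.0 * (baseline_metric - attacked_metric) / baseline_metric

# Local MRR baseline 0.4081 (Table 1) vs. FedE victim MRR 0.3625 (Table 3):
fede_decay = decay_ratio(0.4081, 0.3625)  # close to the reported 11.21%
```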
Table 4
Ablation study in normal link prediction and under targeted attack.
Model Link prediction Targeted all clients Targeted victim poisoning
MRR Hits@10 MRR Hits@10 MRR Hits@10 (targeted poisoning ASR)
CoDFKGE 0.4112 ± 0.0039 0.6084 ± 0.0036 0.4086 ± 0.0007 0.6089 ± 0.0012 0.0010 ± 0.0003 0.0009 ± 0.0005
Ablation(Comm) 0.4095 ± 0.0016 0.6074 ± 0.0014 0.4086 ± 0.0022 0.6076 ± 0.0021 0.0017 ± 0.0008 0.0013 ± 0.0008
Ablation(Pred) 0.4132 ± 0.0006 0.6116 ± 0.0012 0.4098 ± 0.0011 0.6080 ± 0.0009 0.8086 ± 0.0064 0.9702 ± 0.0228
From the experimental results, it can be observed that, when subjected to untargeted poisoning attacks, the CoDFKGE series models achieve the best MRR and Hits@10 metrics among all models. In this setting, all models exhibit varying degrees of decline both in their overall performance and in their performance on the victims. In Fig. 3, we compare the prediction performance of the various models under normal link prediction and under the untargeted poisoning attack. The Distillation(32-128) model experiences the most significant performance degradation; for the Distillation(128-128), FedE, and FedR models, the degradation is also substantial and cannot be ignored. These models directly incorporate poisoned global knowledge as an integral part of their own models, so their convergence is adversely affected. In contrast, the performance degradation of the CoDFKGE models is fully within 3%. This is because, even in the absence of global knowledge, the prediction model of CoDFKGE still trains on local data knowledge, and its training effectiveness is comparable to that of local KGE models without knowledge sharing.

Baseline models may have their results manipulated or exhibit significant performance degradation when facing poisoning attacks. Although the distillation models showed performance advantages in the link prediction experiments, their defense effectiveness is extremely limited under poisoning attacks. In contrast, CoDFKGE remains unmanipulated under targeted poisoning attacks and does not exhibit significant performance degradation under untargeted poisoning attacks, demonstrating its effective defense capability against poisoning attacks.

6.4. Ablation study (RQ4)

This section evaluates the defensive effect of applying different loss functions in CoDFKGE against poisoning attacks. Specifically, we compare models using 128-dimensional training parameters for both the communication and prediction models across normal link prediction, targeted poisoning attack, and untargeted poisoning attack scenarios. Two ablation baselines are implemented: Ablation(Comm) applies the baseline loss function (Eq. (4)) solely during the communication module's distillation, while Ablation(Pred) uses it exclusively for the prediction module's distillation.

Tables 4 and 5 show the experimental results of the models with different distillation loss functions sharing entity embeddings. All experiments were repeated 5 times with different random seeds, and the results are reported as (mean ± standard deviation). The best results are bolded.

The experimental results demonstrate that while Ablation(Pred) performs well in conventional link prediction, its resistance to poisoning attacks lags behind the other two models because it does not employ a negative-sample exclusion strategy in its loss function. Of the remaining two models, both demonstrate robust resilience against poisoning attacks, but the CoDFKGE model achieves superior link prediction performance compared to Ablation(Comm). Ablation(Comm) employs the baseline loss function during the distillation training of the communication model; in contrast, the CoDFKGE model adopts the approach of [9] and uses the self-adversarial sampling temperature τ_α to reweight negative samples, thereby enhancing the model's ability to distinguish between negative samples. Overall, the ablation experiments demonstrate that applying both proposed distillation loss functions simultaneously enhances the model's capability in both defending against poisoning attacks and link prediction.

7. Conclusion

This paper proposes CoDFKGE, a co-distillation-based defense framework against FKGE poisoning attacks. As the first co-distillation defense framework against poisoning attacks in FKGE, CoDFKGE does have some limitations. First, maintaining two separate models requires higher computational resource consumption on the clients. Second, the bidirectional distillation process may lead to a loss of generalization accuracy. On the other hand, CoDFKGE's advantages lie in its model-agnostic applicability to existing FKGE models without compromising performance. By decoupling clients' prediction models from the shared parameter models, CoDFKGE effectively filters out poisoned knowledge embedded in shared updates. CoDFKGE eliminates malicious manipulation under targeted poisoning attacks and significantly mitigates accuracy degradation under untargeted poisoning attacks. Leveraging distillation, the framework further reduces communication overhead. This work provides new ideas for enhancing the security of FKGE.

The limitations of FKGE poisoning defense research are partially rooted in the unique characteristics of KGE. When considering translation-based KGE models in FKGE, sharing entity or relation embeddings introduces risks for both privacy preservation and poisoning attacks. Employing GNN-based KGE models in FKGE that transmit GNN parameters or gradients can alleviate these concerns. However, owing to their superior robustness to sparse data and lower computational resource requirements, translation-based models still hold unparalleled advantages in specific application scenarios.
Table 5
Ablation study under untargeted attack.
Model Untargeted all clients Untargeted victim Decay ratio (%)
MRR Hits@10 MRR Hits@10 MRR Hits@10
CoDFKGE 0.4084 ± 0.0007 0.6068 ± 0.0003 0.4017 ± 0.0010 0.6009 ± 0.0005 2.25 1.27
Ablation(Comm) 0.4056 ± 0.0017 0.6062 ± 0.0011 0.3996 ± 0.0018 0.6003 ± 0.0013 2.42 1.16
Ablation(Pred) 0.3951 ± 0.0011 0.6022 ± 0.0008 0.3852 ± 0.0009 0.5951 ± 0.0005 6.76 2.69
For future research, we recommend exploring the application of the CoDFKGE framework in more complex real-world scenarios, such as personalized FKGE problems. Additionally, in large-scale dynamic KG environments, the security landscape for FKGE may undergo significant changes, necessitating further investigation into defense methods tailored to these evolving scenarios.

CRediT authorship contribution statement

Yiqin Lu: Supervision. Jiarui Chen: Writing - original draft, Software, Methodology. Jiancheng Qin: Writing - review & editing.

Declaration of Generative AI and AI-assisted technologies in the writing process

During the preparation of this work the author(s) used deepseek in order to improve language and readability. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work is supported by the Special Project for Research and Development in Key Areas of Guangdong Province, under Grant 2019B010137001.

Data availability

Data will be made available on request.
View File
@@ -0,0 +1,875 @@
Journal of Systems Architecture 160 (2025) 103361
Contents lists available at ScienceDirect
Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc
EDF-based Energy-Efficient Probabilistic Imprecise Mixed-Criticality
Scheduling
Yi-Wen Zhang , Jin-Long Zhang
College of Computer Science and Technology, Huaqiao University, Xiamen, 361021, China
ARTICLE INFO

Keywords: Imprecise Mixed-Criticality; Energy management; DVFS; Probabilistic schedulability

ABSTRACT

We focus on Mixed-Criticality Systems (MCS), which involve the integration of multiple subsystems with varying levels of criticality on shared hardware platforms. The classic MCS task model assumes hard real-time constraints and no Quality-of-Service (QoS) for low-criticality tasks in high-criticality mode. Many researchers have put forward a range of extensions to the classic MCS task model to make MCS theory more applicable in industry practice. In this paper, we consider an Imprecise MCS taskset scheduled with the Earliest Deadline First algorithm on a uniprocessor platform, and propose an Energy-Efficient Task Execution Model that guarantees (deterministic or probabilistic) schedulability, allows degraded QoS for low-criticality tasks in high-criticality mode, and applies Dynamic Voltage and Frequency Scaling to save energy.
1. Introduction

Mixed-Criticality Systems (MCS) [1] involve the integration of multiple sub-systems with varying criticality levels on a shared hardware platform, as addressed, for example, by the automotive safety certification standard ISO 26262 and the avionics safety certification standard DO-178C. Since the introduction of the MCS concept by Vestal [2], there has been considerable research conducted on this topic [1,3,4]. Many researchers have put forward a range of extensions to the classic MCS task model to make MCS theory more applicable in industry practice, including:

• To reduce the pessimism in task worst-case execution time (WCET) estimation and system schedulability analysis, researchers have proposed probabilistic schedulability analysis techniques where the task WCETs (and/or periods) are represented by random variables, and the system is allowed to miss deadlines with a small probability [5].
• The original assumption that all low-criticality (LO) tasks are discarded in high-criticality (HI) mode is likely to be undesirable in industry practice, hence researchers have proposed various approaches to allow a certain level of degraded Quality-of-Service (QoS) to LO tasks in HI mode [1].
• To address energy-constrained safety-critical systems, researchers have proposed power- and energy-aware scheduling algorithms with Dynamic Voltage and Frequency Scaling (DVFS) for MCS [6].

In this paper, we consider all the above different aspects within a unified framework. We consider an Imprecise MCS probabilistic taskset scheduled with the Earliest Deadline First (EDF) algorithm on a uniprocessor platform, and propose an Energy-Efficient Task Execution Model that guarantees (deterministic or probabilistic) schedulability, allows degraded QoS to LO tasks in HI mode, and applies DVFS to save energy. Although the work in [7] is the closest to ours, there are several key differences. Firstly, it schedules tasks under a non-preemptive fixed-priority (NPFP) [8] scheduling policy, while our work schedules tasks with preemptive EDF. Secondly, it uses probabilistic WCET (pWCET) to determine the probability of mode transition and applies a deterministic schedulability analysis, while our work includes deterministic and probabilistic schedulability analyses. Finally, it uses response time analysis to determine schedulability, while our work uses the Demand Bound Function (DBF). In short, this work is the first to address the energy issue and the schedulability test of an Imprecise MCS probabilistic taskset scheduled under EDF.

The remainder of the paper is organized as follows. We present background and related work in Section 2. Section 3 presents preliminaries. Section 4 presents our probabilistic IMC scheduling; Section 5 presents the Energy-Efficient Task Execution Model; Section 6 presents experimental results; Section 7 discusses practical issues. Finally, Section 8 presents conclusions and future work.
with Dynamic Voltage and Frequency Scaling (DVFS) for MCS [6].
Corresponding author.
E-mail addresses: zyw@hqu.edu.cn (Y.-W. Zhang), sang_yunl@stu.hqu.edu.cn (J.-L. Zhang).
https://doi.org/10.1016/j.sysarc.2025.103361
Received 11 September 2024; Received in revised form 3 February 2025; Accepted 4 February 2025
Available online 12 February 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
Y.-W. Zhang and J.-L. Zhang Journal of Systems Architecture 160 (2025) 103361
2. Background and related work

2.1. Background and motivation

Resource-constrained embedded systems. In order to motivate the need for probabilistic scheduling and DVFS addressed in this paper, we first discuss the issue of hardware resource constraints in real-time embedded systems, including but not limited to MCS, which are especially pertinent for mass-produced consumer products such as ground vehicles and drones (Unmanned Aerial Vehicles), due to monetary cost as well as Size, Weight, and Power (SWaP) constraints. Automotive Electrical/Electronic (E/E) systems typically have stringent hardware resource constraints. In modern high-end vehicles, there can be up to 100 ECUs (Electronic Control Units) embedded within them, and each model can be sold millions of times. An overall savings of millions of dollars may be achieved by saving a few dollars per ECU. Hence, a designer of E/E systems should choose the cheapest ECU according to their application's needs. The monetary cost pressure on relatively cheap consumer drones is even higher. Next, let us consider the issue of SWaP, which lumps together three factors that are closely correlated due to the same underlying cause of hardware resource constraints. The significance of SWaP is obvious in battery-powered mobile devices like drones and mobile robots, where operating time and physical constraints are limited. However, SWaP considerations are equally applicable to ground vehicles that are equipped with sizable battery systems. Electronics within autonomous vehicles consume substantial power, impacting the range of electric vehicles or the fuel consumption of gasoline vehicles. Size and weight affect consumer acceptance, e.g., an autonomous vehicle with a trunk full of electronics is not likely to be acceptable to the average consumer. The issue of significant hardware resource constraints in MCS has motivated a line of work on processing and memory resource optimization algorithms for MCS [9].

2.2. The classic MCS task model

The MCS taskset Γ includes n independent sporadic tasks Γ = {τ_i | 1 ≤ i ≤ n} [13,14]. Although there may be multiple (4-5) criticality levels in general, we present the task model assuming a dual-criticality system with criticality levels LO and HI for the sake of simplicity. The taskset Γ includes two subsets: LO tasks Γ_LO = {τ_i ∈ Γ | L_i = LO} and HI tasks Γ_HI = {τ_i ∈ Γ | L_i = HI}. Each task τ_i ∈ Γ is described by (L_i, T_i, D_i, C_i^LO, C_i^HI):

• L_i ∈ {LO, HI} denotes its criticality level.
• T_i denotes its period.
• D_i denotes its relative deadline.
• C_i^LO denotes its WCET in LO mode.
• C_i^HI denotes its WCET in HI mode for HI tasks (L_i = HI), with C_i^HI ≥ C_i^LO.

Task execution model of classic MCS. The system is first initialized to be in LO mode. LO tasks τ_i ∈ Γ_LO are monitored at run time and their execution is no more than their C_i^LO. The system is schedulable in LO mode if all tasks τ_i ∈ Γ can complete their LO mode WCETs C_i^LO within their respective deadlines. If any HI task τ_i ∈ Γ_HI executes beyond its C_i^LO, the system enters HI mode and all LO tasks in Γ_LO are abandoned. The system is schedulable in HI mode if all HI tasks τ_i ∈ Γ_HI can complete their HI mode WCETs C_i^HI within their respective deadlines. The system switches back to LO mode at an idle instant if no jobs wait for execution at this time [15]. The system is schedulable if both modes are schedulable.

The state-of-the-art scheduling algorithms for the classic MCS task model include Fixed-Priority scheduling [14] and Earliest-Deadline-First with Virtual Deadline (EDF-VD) [16] for dynamic-priority scheduling on uniprocessor systems. Subsequently, many extensions to the classic MCS task model have been proposed, as discussed next.
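As an illustrative sketch (ours, not code from the paper), the dual-criticality task tuple and the mode-switch rule of the classic model can be written in a few lines of Python; the `Task` class, the `system_mode` helper, and the example parameter values are our own assumptions:

```python
from dataclasses import dataclass

@dataclass
class Task:
    L: str       # criticality level: "LO" or "HI"
    T: float     # period
    D: float     # relative deadline
    C_LO: float  # WCET budget in LO mode
    C_HI: float  # WCET budget in HI mode (C_HI >= C_LO for HI tasks)

def system_mode(observed_exec, tasks):
    """Classic MCS rule: the system starts in LO mode and switches to
    HI mode as soon as any HI task executes beyond its LO-mode budget."""
    for task, e in zip(tasks, observed_exec):
        if task.L == "HI" and e > task.C_LO:
            return "HI"   # in HI mode, all LO tasks are abandoned
    return "LO"

tasks = [Task("LO", 2, 2, 1, 1), Task("HI", 2, 2, 1, 2)]
print(system_mode([1.0, 0.8], tasks))  # no overrun -> "LO"
print(system_mode([1.0, 1.5], tasks))  # HI task exceeds C_LO -> "HI"
```

This only captures the mode-switch condition; the scheduling decisions themselves (FP or EDF-VD) are orthogonal to it.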
Motivation for probabilistic schedulability analysis. Recently, Akesson et al. [10] investigated 120 industry practitioners in real-time embedded systems, and the results indicated that soft or firm real-time constraints are prevalent even in safety-critical application domains. A minority (15%) of the surveyed systems were considered strictly hard real-time (no deadlines to be missed). Thus, the timing behavior of a system function can be designed to ensure a sufficiently low failure rate without affecting the system's overall schedulability.

Industry safety certification standards specify acceptable failure rates depending on the system's criticality level: in the automotive standard ISO-26262, each ASIL has a permitted failure probability of 10^-9 for ASIL D, 10^-8 for ASIL C and B, and 10^-7 for ASIL A [5]. Relaxing the hard real-time assumption can help reduce pessimism in task WCET estimation and system schedulability analysis and increase schedulable utilization significantly. Von der Brüggen et al. [11] demonstrated large gains in processor utilization with experiments using randomly-generated workloads, e.g., a gain of at least 12% schedulable utilization for an acceptable worst-case deadline failure probability of 10^-6. This motivates probabilistic schedulability analysis as an effective technique for reducing analysis pessimism and increasing processor utilization in resource-constrained embedded systems.

Motivation for not dropping LO tasks in HI mode. Consider the automotive standard ISO-26262, where ASIL determination of hazardous events is based on three parameters: severity, probability of exposure, and controllability. An individual's vulnerability to harm in a potentially hazardous situation is determined by severity. Probability is the likelihood that harm will occur, while controllability is the ability to avoid harm or damage through prompt action by the agents involved (e.g., a driver of the vehicle). It cannot always be assumed that a software function that is part of a high-ASIL functionality is more important than one that is part of a lower-ASIL functionality, as both may be safety-critical, and each function's failure may cause severe damage [12].

2.3. Degraded QoS for LO tasks

The degraded QoS of LO tasks in HI mode is achieved by decreasing the execution time budgets [17] or increasing the task periods [18] of LO tasks. Liu et al. [17] proposed the Imprecise Mixed-Criticality (IMC) task model in which a HI task τ_i (L_i = HI) is assigned a greater estimated WCET in HI mode compared to its estimation in LO mode (C_i^LO ≤ C_i^HI), while a LO task τ_i (L_i = LO) is assigned a smaller estimated WCET in HI mode compared to the estimation in LO mode (C_i^LO ≥ C_i^HI). They considered EDF-VD scheduling on a single processor system, and presented two schedulability tests, one based on a utilization bound and the other based on the Demand Bound Function (DBF). Davis et al. [19] addressed the IMC task model under fixed-priority scheduling, and presented a Compensating AMC Scheduling scheme and two schedulability tests. Jiang et al. [20] presented a concrete implementation of the IMC task model in the form of a configurable processor floating point unit hardware design, as well as schedulability analysis and optimized priority assignment algorithms based on fixed-priority scheduling.

2.4. Energy-aware scheduling for MCS

DVFS dynamically adjusts the processor supply voltage and speed (frequency) based on the system's workload, which is an effective energy-saving technique [21]. Most modern microprocessors, including those used in embedded systems, provide support for DVFS. Our recent survey paper [6] provided an overview of recent developments in energy-aware real-time scheduling for MCS, predominantly focusing on DVFS.

Recently, power- and energy-aware real-time scheduling for MCS has attracted significant attention [6]. Huang et al. [22] proposed a scheduling algorithm for MCS based on EDF-VD [16]. This scheduling algorithm reduces energy consumption by optimizing virtual deadlines
and processor speeds. Zhang [23] used the dynamic slack time generated from late arrival tasks to reduce energy consumption. This work is extended to MCS with fixed-priority preemptive scheduling [24] and dynamic-priority non-preemptive scheduling [25]. Zhang et al. [26] tackled the issue of MCS with shared resources and proposed a dual-speed scheduling algorithm. This algorithm ensured both system schedulability and mutually exclusive access to shared resources. However, it assumed that all tasks execute with their WCET. Zhang [27] used the difference between the actual execution time and the WCET to save energy. These works focus on the classic MCS task model. Zhang [28] focused on the IMC task model, in which LO tasks are allowed QoS in HI mode, and proposed an energy-aware scheduling algorithm (EA-IMC).

Table 1
Related work on probabilistic scheduling for MCS. Abbreviations: Prob. (Probabilistic); S.A. (Schedulability Analysis).

Work                              | Sched. Algo. | Prob. S.A. | Energy-Aware | LO tasks dropped in HI mode
Santinelli and George (2015) [33] | EDF          | Y          | N            | Y
Maxim et al. (2017) [34]          | FP           | Y          | N            | Y
Singh et al. (2020) [35]          | NPFP         | Y          | N            | Y
Draskovic et al. (2021) [36]      | FP           | Y          | N            | N
Guo et al. (2021) [37]            | EDF          | Y          | N            | Y
Bhuiyan et al. (2020) [7]         | NPFP         | N          | Y            | Y
This work                         | EDF          | Y          | Y            | N
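To make the DVFS rationale concrete, here is a small sketch of ours using the standard convex dynamic-power model P(s) = s^α that much DVFS work assumes (the function name, the value α = 3, and the example numbers are our illustration, not the paper's model):

```python
def dynamic_energy(C, s, alpha=3.0):
    """Dynamic energy to run C cycles-worth of work at normalized
    speed s in (0, 1]: execution time is C/s and power is s**alpha,
    so E = (C/s) * s**alpha = C * s**(alpha - 1)."""
    return C * s ** (alpha - 1)

# Slowing the processor down saves dynamic energy, as long as the
# stretched execution time C/s still meets every deadline.
print(dynamic_energy(10, 1.0))  # 10.0 at full speed
print(dynamic_energy(10, 0.5))  # 2.5 at half speed
```

This convexity is why DVFS-based MCS schedulers search for the lowest speed that keeps the schedulability test satisfied.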
There has been a small number of recent works on energy-aware MCS on multiprocessors. Narayana et al. [29] considered the energy minimization problem for multiprocessor MCS based on DVFS. They first proposed an optimal solution and an effective lightweight heuristic on a uniprocessor, then extended these results to multicore systems. Ranjbar et al. [30] proposed a heuristic algorithm for online peak power and thermal management of a multicore MCS by using the slack time and per-cluster DVFS. Recently, some researchers [31] studied the IMC task model on multiprocessors, in which LO tasks are allowed QoS in HI mode, and proposed a partitioned scheduling algorithm. In addition, this work is extended to shared resource scheduling [32]. However, the above studies assume that tasks execute with their deterministic WCET.

2.5. Probabilistic scheduling for MCS

Santinelli and George [33] presented an initial solution to probabilistic schedulability analysis for EDF scheduling of MCS based on the concept of probabilistic C-Space. Maxim et al. [34] presented a probabilistic fixed-priority schedulability analysis [14]. Singh et al. [35] considered a novel MCS task model with job-level mode switching, and presented a graph-traversal-based analytic framework for non-preemptive job-level fixed-priority probabilistic schedulability analysis. Draskovic et al. [36] proposed metrics that are inspired by industry safety standards, including the probability of deadline miss per hour, the expected time before degradation happens, and the duration of the degradation, and presented a system-wide approach to probabilistic scheduling of MCS. Guo et al. [37] proposed a new task model in which a new parameter is added to characterize the distribution of the WCET estimations for each task. They presented efficient algorithms for MCS scheduling under this task model for both independent tasks and failure-dependent tasks.

We are aware of only one related work that addressed energy-aware scheduling in MCS assuming probabilistic task execution times. Bhuiyan et al. [7] proposed a probabilistic technique to derive an energy-efficient processor speed that minimized the average energy consumption with DVFS, while ensuring the deadlines of all tasks in MCS. This work used non-preemptive fixed-priority scheduling and a deterministic schedulability test based on Worst-Case Response Time analysis, instead of probabilistic schedulability analysis. It is not directly comparable to our work due to the different task models and analysis techniques.

Table 1 summarizes related work on probabilistic scheduling for MCS.

3. Preliminaries

3.1. Task model

Our task model is inspired by the IMC task model [17], with extensions to the probabilistic scheduling scenario. We first introduce some basic notations for probabilistic scheduling. A task τ_i's probabilistic WCET (pWCET) 𝒞_i is a random variable characterized by a Probability Mass Function (PMF) f_i(·), where f_i(et) = P(𝒞_i = et) denotes the probability that its WCET is equal to et.¹ Given the PMF f_i(·), we can easily obtain the corresponding Cumulative Distribution Function (CDF) F_i(·), where F_i(et) = P(𝒞_i ≤ et) = Σ_{x ≤ et} f_i(x). The Complementary Cumulative Distribution Function (1-CDF) is defined as F̄_i(et) = P(𝒞_i > et) = 1 − F_i(et).

We consider the MCS taskset Γ including n independent periodic tasks Γ = {τ_i | 1 ≤ i ≤ n} scheduled with preemptive EDF on a single processor platform. (It is a special case of EDF-VD with a deadline scaling factor x = 1.) We assume a dual-criticality system with criticality levels LO and HI for the sake of simplicity. The taskset Γ consists of two subsets: LO tasks Γ_LO = {τ_i ∈ Γ | L_i = LO} and HI tasks Γ_HI = {τ_i ∈ Γ | L_i = HI}. Each task τ_i ∈ Γ is described by a tuple of parameters ⟨L_i, T_i, D_i, 𝒞_i, 𝒞_i^LO, 𝒞_i^HI, C_i^deg, C_i^tr⟩:

• L_i ∈ {LO, HI} denotes its criticality level.
• T_i denotes its period.
• D_i denotes its constrained deadline (D_i ≤ T_i).
• 𝒞_i is its nominal pWCET, a discrete random variable with K discrete values characterized by PMF f_i(·) and CDF F_i(·). It has the minimum value C_i^min with index ind(C_i^min) = 0 and the maximum value C_i^max with index ind(C_i^max) = K − 1 among the K discrete values of 𝒞_i.
• 𝒞_i^LO is its pWCET in LO mode, characterized by PMF f_i^LO(·) and CDF F_i^LO(·).
• 𝒞_i^HI is its pWCET in HI mode, characterized by PMF f_i^HI(·) and CDF F_i^HI(·).
• C_i^deg is valid for LO tasks (L_i = LO), and denotes its Degraded WCET in HI mode, with index ind(C_i^deg) ∈ [0, K − 1].
• C_i^tr is valid for HI tasks (L_i = HI), and denotes its Threshold WCET in LO mode, with index ind(C_i^tr) ∈ [0, K − 1].

Task execution model. The system is first initialized to be in LO mode. If any HI task τ_i ∈ Γ_HI executes beyond its C_i^tr, the system switches from LO mode to HI mode. At the mode-switch instant t_s, if jobs of LO tasks have run for longer than their C_i^deg, any such jobs will be dropped, without suppressing future arrivals thereof. In addition, if a LO job has executed for less than C_i^deg by the switch time instant, such carry-over jobs that have an arrival time before t_s and an absolute deadline after t_s will continue to execute the leftover execution up to C_i^deg. While in HI mode, each LO task τ_i ∈ Γ_LO executes no more than its C_i^deg, i.e., it is dropped if its execution time exceeds C_i^deg. The system switches from HI mode to LO mode at an idle instant if no jobs wait for execution at this time. Moreover, incomplete tasks are dropped at their deadlines, hence there does not exist a backlog of outstanding execution at the end of each hyper-period (this is a common assumption in industry practice [10]).

The pWCET of a LO task in LO mode, or the pWCET of a HI task in HI mode, is the same as its nominal pWCET 𝒞_i. The pWCET of a HI

¹ Calligraphic letters are used to represent distributions while non-calligraphic letters are for scalars.
task τ_i in LO mode is trimmed with the upper bound C_i^tr to have the conditional PMF f_i^LO(et) = P(𝒞_i = et | et ≤ C_i^tr). The pWCET of a LO task τ_i in HI mode is trimmed with the upper bound C_i^deg to have the conditional PMF f_i^HI(et) = P(𝒞_i = et | et ≤ C_i^deg). In other words, C_i^deg is LO task τ_i's execution time budget in HI mode, and C_i^tr is HI task τ_i's execution time budget in LO mode. This is inspired by the IMC task model [17,19,20]. They are computed with Eqs. (1) and (2):

\forall \tau_i \in \Gamma_{LO}: \quad f_i^{LO}(et) = f_i(et), \qquad
f_i^{HI}(et) = \begin{cases} \sum_{et' \ge C_i^{deg}} f_i^{LO}(et'), & et = C_i^{deg} \\ f_i^{LO}(et), & et < C_i^{deg} \\ 0, & et > C_i^{deg} \end{cases} \tag{1}

\forall \tau_i \in \Gamma_{HI}: \quad f_i^{HI}(et) = f_i(et), \qquad
f_i^{LO}(et) = \begin{cases} \sum_{et' \ge C_i^{tr}} f_i^{HI}(et'), & et = C_i^{tr} \\ f_i^{HI}(et), & et < C_i^{tr} \\ 0, & et > C_i^{tr} \end{cases} \tag{2}

Since task τ_i's period T_i is a constant in both LO and HI modes, its probabilistic Worst-Case Utilization (pWCU) can be obtained by dividing its pWCET by its period: 𝒰_i = 𝒞_i / T_i, 𝒰_i^LO = 𝒞_i^LO / T_i in LO mode, and 𝒰_i^HI = 𝒞_i^HI / T_i in HI mode. The pWCU of a taskset can be obtained by summing the pWCUs of all tasks in the taskset.

Example 1. A taskset Γ₁ with two tasks is shown in Table 2. Each task τ_i's nominal pWCET 𝒞_i is shown in the matrix form defined in Eq. (3): the first row denotes each discrete value of 𝒞_i; the second row denotes the probability values of the PMF f_i(·); and the third row denotes the cumulative probability values of the CDF F_i(·).

\begin{pmatrix} C_0 & C_1 & \dots & C_{K-1} \\ f_i(C_0) & f_i(C_1) & \dots & f_i(C_{K-1}) \\ F_i(C_0) & F_i(C_1) & \dots & F_i(C_{K-1}) \end{pmatrix} \tag{3}

Table 2
Taskset parameters of Γ₁, with C_1^deg = 1, C_2^tr = 1. Each matrix lists the values (row 1), the PMF (row 2), and the CDF (row 3).

Task | L_i | T_i = D_i | 𝒞_i                        | 𝒞_i^LO                     | 𝒞_i^HI                     | 𝒰_i^LO                           | 𝒰_i^HI
τ₁   | LO  | 2         | (1, 2; 0.5, 0.5; 0.5, 1.0) | (1, 2; 0.5, 0.5; 0.5, 1.0) | (1; 1.0; 1.0)              | (0.5, 1.0; 0.5, 0.5; 0.5, 1.0)   | (0.5; 1.0; 1.0)
τ₂   | HI  | 2         | (1, 2; 0.5, 0.5; 0.5, 1.0) | (1; 1.0; 1.0)              | (1, 2; 0.5, 0.5; 0.5, 1.0) | (0.5; 1.0; 1.0)                  | (0.5, 1.0; 0.5, 0.5; 0.5, 1.0)

The PMF of τ_i's pWCET in LO mode 𝒞_i^LO is obtained by Eq. (2); the PMF of its pWCET in HI mode 𝒞_i^HI is obtained by Eq. (1). For the toy example, the LO task τ₁'s nominal pWCET 𝒞₁ has two possible values, 1 and 2, each with probability 0.5; its pWCET in LO mode 𝒞₁^LO is the same as 𝒞₁; its pWCET in HI mode 𝒞₁^HI is obtained by trimming 𝒞₁ with the upper bound C₁^deg = 1 and ind(C₁^deg) = 0 (assuming the index starts from 0), yielding one possible value of 1 with probability 1.0. The HI task τ₂'s nominal pWCET 𝒞₂ has two possible values, 1 and 2, each with probability 0.5; its pWCET in LO mode 𝒞₂^LO is obtained by trimming 𝒞₂ with the upper bound C₂^tr = 1 and ind(C₂^tr) = 0, yielding one possible value of 1 with probability 1.0; its pWCET in HI mode 𝒞₂^HI is the same as 𝒞₂. The matrix that denotes τ_i's pWCU is obtained by dividing each term in the first row of its pWCET matrix by its period T_i.

Eq. (4) shows the definitions of pWCU for the subsets of LO tasks Γ_LO in LO mode and HI tasks Γ_HI in HI mode. (As mathematical background, the addition of two discrete random variables 𝒳 and 𝒴 results in a new random variable 𝒵 with PMF computed by the convolution of the two PMFs, i.e., 𝒵 = 𝒳 ⊗ 𝒴, where P(𝒵 = z) = Σ_{k=−∞}^{∞} P(𝒳 = k) P(𝒴 = z − k).)

\mathcal{U}^{LO}(\Gamma_{LO}) = \bigotimes_{\tau_i \in \Gamma_{LO}} \mathcal{U}_i^{LO}, \qquad \mathcal{U}^{HI}(\Gamma_{HI}) = \bigotimes_{\tau_i \in \Gamma_{HI}} \mathcal{U}_i^{HI}, \tag{4}

where 𝒰^LO(Γ_LO) denotes the pWCU of Γ_LO in LO mode and 𝒰^HI(Γ_HI) denotes the pWCU of Γ_HI in HI mode.

3.2. Existing deterministic IMC scheduling

Liu et al. [17] have studied the schedulability test for the deterministic IMC task model and proposed sufficient conditions for schedulability under EDF-VD. We first introduce the following notations.

• [[A]]_0 stands for max(A, 0).
• t_s stands for the mode-switch time.
• m_i = ⌊(t − D_i)/T_i⌋ and k_i = ⌊t_s/T_i⌋ are the numbers of jobs of τ_i in the intervals [0, t) and [0, t_s), respectively.
• DBF_L(τ_i, t) stands for the processor demand of any task τ_i ∈ Γ within [0, t) in LO mode.
• DBF(J_L, t) and DBF(J_H, t) stand for the processor demand within [0, t) of a carry-over job released by a task τ_i ∈ Γ_LO and τ_i ∈ Γ_HI, respectively.
• r_i stands for the arrival time of the carry-over job that arrives before t_s and has a deadline after t_s.
• DBF_L^H(τ_i, t) stands for the processor demand of a LO task τ_i within [0, t) in HI mode, while DBF_H^H(τ_i, t) stands for the processor demand of a HI task τ_i within [0, t) in HI mode.

Fig. 1 illustrates a carry-over job and the mode switch. The downward arrow represents the job arrival time. If the execution time of τ_i exceeds C_i^LO without signaling completion, the system switches from LO mode to HI mode. J_H is a carry-over job.

According to the task execution model, the processor demand of LO carry-over jobs is always less than or equal to C_i^LO, while the processor demand of HI carry-over jobs is always less than or equal to C_i^HI. Therefore, DBF(J_L, t) can be calculated as follows:

DBF(J_L, t) = \begin{cases} C_i^{LO}, & r_i + D_i \le t \\ 0, & \text{otherwise,} \end{cases} \tag{5}

and DBF(J_H, t) can be calculated as follows:

DBF(J_H, t) = \begin{cases} C_i^{HI}, & r_i + D_i \le t \\ 0, & \text{otherwise.} \end{cases} \tag{6}

From [3,17], we have the following theorems.

Theorem 1. A deterministic IMC taskset Γ is schedulable under EDF in LO mode if, for all t with 0 < t ≤ t_max,

\sum_{\tau_i \in \Gamma} DBF_L(\tau_i, t) \le t, \tag{7}

where DBF_L(τ_i, t) = [[m_i + 1]]_0 · C_i^LO, and t_max is a hyper-period.

Theorem 2. A deterministic IMC taskset Γ is schedulable under EDF in HI mode if, for all t and t_s with 0 < t ≤ t_max and 0 < t_s < t,

\sum_{\tau_i \in \Gamma_{LO}} DBF_L^H(\tau_i, t_s, t) + \sum_{\tau_j \in \Gamma_{HI}} DBF_H^H(\tau_j, t_s, t) \le t, \tag{8}

where DBF_L^H(τ_i, t_s, t) = k_i C_i^LO + DBF(J_L, t) + c_i C_i^HI, and DBF_H^H(τ_i, t_s, t) can be determined as follows:

DBF_H^H(\tau_i, t_s, t) = \begin{cases} DBF(1), & D_i \le t - t_s; \\ \max\{DBF(1), DBF(2)\}, & \text{otherwise,} \end{cases} \tag{9}

where DBF(1) = b_i C_i^LO + DBF(J_H, t) + a_i C_i^HI, DBF(2) = k_i C_i^LO + DBF(J_H, t), a_i = [[m_i − b_i]]_0, b_i = [[⌊(t_s − (t − D_i − m_i T_i))/T_i⌋]]_0, and c_i = [[m_i − k_i]]_0.
4
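The LO-mode test of Theorem 1 can be sketched as follows (a minimal sketch with our own function names, not the authors' code; [[x]]_0 = max(x, 0) is assumed, and the toy taskset is ours):

```python
# Minimal sketch of Theorem 1's LO-mode demand bound test:
# DBF_L(tau_i, t) = [[m_i + 1]]_0 * C_i_LO with m_i = floor((t - D_i)/T_i),
# checked for every integer t up to a hyper-period.
import math

def dbf_lo(c_lo, period, deadline, t):
    # [[m_i + 1]]_0 * C_i_LO
    m = math.floor((t - deadline) / period)
    return max(m + 1, 0) * c_lo

def edf_lo_schedulable(tasks, t_max):
    """tasks: list of (C_LO, T_i, D_i) tuples."""
    return all(sum(dbf_lo(c, T, D, t) for c, T, D in tasks) <= t
               for t in range(1, t_max + 1))

# Two implicit-deadline tasks checked over their hyper-period of 10:
tasks = [(1, 5, 5), (2, 10, 10)]
print(edf_lo_schedulable(tasks, 10))   # True
```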
Y.-W. Zhang and J.-L. Zhang Journal of Systems Architecture 160 (2025) 103361
Fig. 1. Carry-over job.
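The trimming used in Example 1 to derive 𝒞_i^LO / 𝒞_i^HI from a nominal pWCET, and the PMF convolution behind the ⊗ operator of Eq. (4), can be sketched as follows (helper names are ours; PMFs are value-to-probability dicts):

```python
# Minimal sketch: trimming accumulates all probability mass above the
# bound at the bound itself; convolution gives the PMF of a sum of
# independent discrete random variables.

def trim(pmf, bound):
    """Trim a pWCET PMF at `bound` (mass above the bound moves to it)."""
    out = {}
    for value, p in pmf.items():
        v = min(value, bound)
        out[v] = out.get(v, 0.0) + p
    return out

def convolve(pmf_a, pmf_b):
    """PMF of the sum of two independent discrete random variables."""
    out = {}
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out

# Example 1: both nominal pWCETs take values 1 and 2 with probability 0.5.
c1 = {1: 0.5, 2: 0.5}
c2 = {1: 0.5, 2: 0.5}

c1_hi = trim(c1, 1)       # trimming at C_1^deg = 1 gives {1: 1.0}
c2_lo = trim(c2, 1)       # trimming at C_2^tr = 1 gives {1: 1.0}
total = convolve(c1, c2)  # {2: 0.25, 3: 0.5, 4: 0.25}
```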
4. Probabilistic IMC scheduling

4.1. Schedulability analysis

Before presenting the schedulability analysis, let us introduce a few notations.

• max{𝒳} stands for the maximum value of the random variable 𝒳.
• 𝒩(x) denotes the single-valued distribution (x; 1; 1), where x is a constant.
• 𝒟ℬℱ_L(τ_i, t) stands for the probabilistic processor demand of any task τ_i within [0, t) in LO mode.
• 𝒟ℬℱ(J_L, t) and 𝒟ℬℱ(J_H, t) stand for the probabilistic processor demand of a carry-over job released by a task τ_i ∈ Γ_LO and τ_i ∈ Γ_HI within [0, t), respectively.
• 𝒟ℬℱ_L^H(τ_i, t) stands for the probabilistic processor demand of a LO task τ_i within [0, t) in HI mode, while 𝒟ℬℱ_H^H(τ_i, t) stands for the probabilistic processor demand of a HI task τ_i within [0, t) in HI mode.
• 𝒟ℬℱ_L(t) stands for the probabilistic processor demand of all tasks within [0, t) in LO mode.
• 𝒟ℬℱ_H(t) stands for the probabilistic processor demand of all tasks within [0, t) in HI mode.
• Π_{t=1}^{t_max} 𝒳_t = 𝒳_1 × 𝒳_2 × ⋯ × 𝒳_{t_max}.

According to [3,17,33], the probabilistic processor demand of any task τ_i ∈ Γ within [0, t) in LO mode can be calculated as follows:

    𝒟ℬℱ_L(τ_i, t) = 𝒩([[m_i + 1]]_0) ⊙ 𝒞_i^LO,        (10)

where ⊙ denotes the Hadamard product: each element in the i-th row of the right matrix is multiplied by the element in the i-th row of the left vector.

In addition, the probabilistic processor demand of all tasks within [0, t) in LO mode can be calculated as follows:

    𝒟ℬℱ_L(t) = ⊗_{τ_i ∈ Γ} 𝒟ℬℱ_L(τ_i, t).        (11)

The probabilistic processor demand of a carry-over job released by a LO task τ_i within [0, t) can be calculated as follows:

    𝒟ℬℱ(J_L, t) = 𝒞_i^LO if r_i + D_i ≤ t, and 𝒩(0) otherwise.        (12)

The probabilistic processor demand of a carry-over job released by a HI task τ_i within [0, t) can be calculated as follows:

    𝒟ℬℱ(J_H, t) = 𝒞_i^HI if r_i + D_i ≤ t, and 𝒩(0) otherwise.        (13)

The probabilistic processor demand of any task τ_i ∈ Γ_LO within [0, t) in HI mode can be calculated as follows:

    𝒟ℬℱ_L^H(τ_i, t) = (𝒩(k_i) ⊙ 𝒞_i^LO) ⊗ 𝒟ℬℱ(J_L, t) ⊗ (𝒩(c_i) ⊙ 𝒞_i^HI).        (14)

According to [3,17], we should consider two cases to determine the probabilistic processor demand of any task τ_i ∈ Γ_HI within [0, t) in HI mode.

Case 1: D_i ≤ t - t_s. The maximum demand of a job released by the HI task τ_i is generated when its deadline coincides with t. According to Eq. (9) in Theorem 2, the probabilistic processor demand of any task τ_i ∈ Γ_HI within [0, t) in HI mode is equal to 𝒟ℬℱ(1) = (𝒩(b_i) ⊙ 𝒞_i^LO) ⊗ 𝒟ℬℱ(J_H, t) ⊗ (𝒩(a_i) ⊙ 𝒞_i^HI).

Case 2: D_i > t - t_s. The HI task τ_i has at most one job with a processor demand C_i^HI. If the deadline of this job is D_i, the probabilistic processor demand is the same as 𝒟ℬℱ(1). Moreover, the only way to increase the demand of the HI task τ_i is to add a new job in the interval; in other words, the first job of the HI task τ_i arrives at time 0. Therefore, the processor demand includes two parts: one part is the demand of all jobs before t_s, and the other part is the demand of a carry-over job J_H. In this case, the probabilistic processor demand is equal to 𝒟ℬℱ(2) = (𝒩(k_i) ⊙ 𝒞_i^LO) ⊗ 𝒟ℬℱ(J_H, t).

In short, the probabilistic processor demand of any task τ_i ∈ Γ_HI within [0, t) in HI mode can be determined as follows:

    𝒟ℬℱ_H^H(τ_i, t) = 𝒟ℬℱ(1) if D_i ≤ t - t_s, and 𝒵 otherwise,        (15)

where 𝒵 can be determined as follows:

    𝒵 = 𝒟ℬℱ(1) if max{𝒟ℬℱ(2)} ≤ max{𝒟ℬℱ(1)}, and 𝒟ℬℱ(2) otherwise.        (16)

Therefore, the probabilistic processor demand of all tasks within [0, t) in HI mode is determined by the following:

    𝒟ℬℱ_H(t) = (⊗_{τ_i ∈ Γ_LO} 𝒟ℬℱ_L^H(τ_i, t)) ⊗ (⊗_{τ_i ∈ Γ_HI} 𝒟ℬℱ_H^H(τ_i, t)).        (17)

Theorem 3. An IMC taskset Γ is deterministically schedulable under EDF, if for all 0 < t ≤ t_max and 0 < t_s < t,

    max{𝒟ℬℱ_L(t)} ≤ t,  and  max{𝒟ℬℱ_H(t)} ≤ t.        (18)

It is probabilistically schedulable if the maximum probability that the processor demand of all tasks in both LO mode and HI mode exceeds t does not exceed the permitted system failure probability F_s,² expressed as:

    1 - Π_{t_k=t}^{t_max} F_{𝒟ℬℱ_L(t_k)}(t_k) ≤ F_s,  and
    1 - Π_{t_k=t}^{t_max} F_{𝒟ℬℱ_H(t_k)}(t_k) ≤ F_s.        (19)

² Chen et al. [38] pointed out that there are certain flaws in probabilistic WCRT based on critical instants. However, our work focuses on the overall distribution of all task behaviors within a task's hyper-period, rather than relying solely on a single critical instant, and considers the probability distribution of all possible processor demands throughout the hyper-period.
Table 3
Taskset parameters of Γ_2, with C_1^deg = 3, C_2^tr = 1, C_3^deg = 3. Each distribution is written as (values; PMF; CDF).

    τ_1: L_1 = LO, T_1 = D_1 = 10,
        𝒞_1^LO = (1, 3, 4, 5; 0.455, 0.54, 0.004, 0.001; 0.455, 0.995, 0.999, 1.0),
        𝒞_1^HI = (1, 3; 0.455, 0.545; 0.455, 1.0).
    τ_2: L_2 = HI, T_2 = D_2 = 20,
        𝒞_2^LO = (0.5, 1; 0.49, 0.51; 0.49, 1.0),
        𝒞_2^HI = (0.5, 1, 2, 3; 0.49, 0.5, 0.009, 0.001; 0.49, 0.99, 0.999, 1.0).
    τ_3: L_3 = LO, T_3 = D_3 = 10,
        𝒞_3^LO = (2, 3, 4, 5; 0.019, 0.6, 0.38, 0.001; 0.019, 0.619, 0.999, 1.0),
        𝒞_3^HI = (2, 3; 0.019, 0.981; 0.019, 1.0).

Proof. The IMC taskset Γ is deterministically schedulable under EDF if it is deterministically schedulable in both LO mode and HI mode. The condition for deterministic schedulability in LO mode and HI mode, Eq. (18), is self-evident, because it can be directly derived from Theorems 1 and 2. In addition, the IMC taskset Γ is probabilistically schedulable under EDF if it is probabilistically schedulable in both LO mode and HI mode. The condition for probabilistic schedulability, Eq. (19), states that the probability that the processor demand of all tasks in both LO mode and HI mode exceeds t is less than or equal to F_s; hence the taskset is probabilistically schedulable with a system failure probability not exceeding F_s. (Note that the condition of deterministic schedulability in Eq. (18) is a special case of the condition of probabilistic schedulability in Eq. (19), with the permitted system failure probability equal to 0 (F_s = 0).) Q.E.D.

In the deterministic analysis, the processor demand grows in a stepwise manner based on the interval length: the processor demand changes only when the increase in interval length is a multiple of a task period. When we switch to the probabilistic analysis, the probability distribution of processor demand also increases in a stepwise manner, maintaining consistency. In other words, just as the deterministic processor demand does not change within the given time intervals, the values in the probability distribution of processor demand also remain unchanged. Specifically, there are some t_k values that generate the same probability distribution of processor demand. The values of F_{𝒟ℬℱ_L(t_k)}(t_k) and F_{𝒟ℬℱ_H(t_k)}(t_k) that correspond to the same probability distribution of processor demand should not be computed repeatedly in Eq. (19); therefore, we calculate them only once. In addition, if t_1, t_2 and t_l (t_1 < t_2 < t_l) generate the same probability distribution of the processor demand for all tasks in both modes, we choose the minimum value t_1 among these values, which corresponds to F_{𝒟ℬℱ_L(t_1)}(t_1) and F_{𝒟ℬℱ_H(t_1)}(t_1). This is because it is the value that maximizes the probability of the processor demand exceeding the interval length.

4.2. Example 2

We present a taskset Γ_2 with the parameters shown in Table 3. (The nominal pWCET 𝒞_i is omitted for brevity.) We assume that F_s = 1.0 × 10⁻⁶.

In this example, t_max = 20. For 0 < t < 10 and 0 < t_s < t, we have m_i = -1 (m_i = ⌊(t - D_i)/T_i⌋), k_i = 0 (k_i = ⌊t_s/T_i⌋), a_i = 0, c_i = 0, and b_i = 0 (i = 1, 2, 3). According to Eq. (10), 𝒟ℬℱ_L(τ_i, t) = 𝒩(0). In addition, we have 𝒟ℬℱ_L(t) = 𝒩(0) from Eq. (11). From Eq. (12), we have 𝒟ℬℱ(J_L, t) = 𝒩(0) for the LO tasks τ_1 and τ_3. Moreover, we have 𝒟ℬℱ(J_H, t) = 𝒩(0) for the HI task τ_2 from Eq. (13). Therefore, we have 𝒟ℬℱ_L^H(τ_1, t) = 𝒩(0) and 𝒟ℬℱ_L^H(τ_3, t) = 𝒩(0) from Eq. (14). Due to k_2 = 0, a_2 = 0, b_2 = 0 and D_2 > t - t_s, we have 𝒟ℬℱ(1) = 𝒩(0), 𝒟ℬℱ(2) = 𝒩(0), and max{𝒟ℬℱ(2)} ≤ max{𝒟ℬℱ(1)}. According to Eq. (15), we have 𝒟ℬℱ_H^H(τ_2, t) = 𝒩(0). We calculate 𝒟ℬℱ_H(t) = 𝒩(0) from Eq. (17). Therefore, we have max{𝒟ℬℱ_L(t)} ≤ t and max{𝒟ℬℱ_H(t)} ≤ t.

When t = 10, m_1 = 0, m_2 = -1, m_3 = 0, k_i = 0, a_i = 0, c_i = 0, and b_i = 0 (i = 1, 2, 3). According to Eq. (10), we have 𝒟ℬℱ_L(τ_1, t) = 𝒞_1^LO, 𝒟ℬℱ_L(τ_2, t) = 𝒩(0), and 𝒟ℬℱ_L(τ_3, t) = 𝒞_3^LO. In addition, from Eq. (11) we have

    𝒟ℬℱ_L(t) = ℳ = (3, 4, …, 8, 9, 10; 0.008645, 0.273, …, 0.00266, 0.000384, 0.000001; 0.008645, 0.281645, …, 0.999615, 0.999999, 1.0).

Moreover, from Eq. (17), we have 𝒟ℬℱ_H(t) = ℳ.

When 10 < t < 20, m_1 = 0, m_2 = -1, m_3 = 0, a_i = 0, c_i = 0, and b_i = 0 (i = 1, 2, 3). According to Eq. (11), we have 𝒟ℬℱ_L(t) = ℳ. If t_s < 10, then k_i = 0 (i = 1, 2, 3); according to Eq. (17), we have 𝒟ℬℱ_H(t) = ℳ and max{𝒟ℬℱ_H(t)} ≤ t. If 10 ≤ t_s < t, we have k_1 = 1, k_2 = 0, and k_3 = 1. According to Eq. (14), we have 𝒟ℬℱ_L^H(τ_1, t) = 𝒞_1^LO and 𝒟ℬℱ_L^H(τ_3, t) = 𝒞_3^LO. We calculate 𝒟ℬℱ_H^H(τ_2, t) = 𝒩(0) from Eq. (15). In addition, we have 𝒟ℬℱ_H(t) = ℳ from Eq. (17). Therefore, we have max{𝒟ℬℱ_H(t)} ≤ t and max{𝒟ℬℱ_L(t)} ≤ t.

When t = 20, m_1 = 1, m_2 = 0, m_3 = 1. According to Eq. (10), we have 𝒟ℬℱ_L(τ_1, t) = 𝒩(2) ⊙ 𝒞_1^LO, 𝒟ℬℱ_L(τ_2, t) = 𝒞_2^LO, and 𝒟ℬℱ_L(τ_3, t) = 𝒩(2) ⊙ 𝒞_3^LO. In addition, from Eq. (11) we have

    𝒟ℬℱ_L(t) = (6.5, …, 19, 20.5, 21; 0.00423605, …, 0.00019584, 0.00000049, 0.00000051; 0.00406315, …, 0.999999, 0.99999949, 1.0).

If t_s < 10, then a_1 = 1, a_2 = 0, a_3 = 1, c_1 = 1, c_2 = 0, c_3 = 1, k_i = 0, and b_i = 0 (i = 1, 2, 3). From Eq. (17), we have max{𝒟ℬℱ_H(t)} = 19. If 10 ≤ t_s < t, then k_1 = 1, k_2 = 0, k_3 = 1, b_1 = 1, b_2 = 0, b_3 = 1, a_i = 0 and c_i = 0 (i = 1, 2, 3). According to Eq. (17), we have max{𝒟ℬℱ_H(t)} = 23. Therefore, we have max{𝒟ℬℱ_L(t)} > t and max{𝒟ℬℱ_H(t)} > t (10 ≤ t_s < t), but 1 - F_{𝒟ℬℱ_L(t)}(t) ≤ F_s and 1 - F_{𝒟ℬℱ_H(t)}(t) ≤ F_s. According to Theorem 3, the taskset Γ_2 is probabilistically schedulable.

5. Energy-efficient task execution model

We present in sequence the power model, the calculation of energy-efficient processor speeds in LO mode, and the energy-efficient task execution model.

5.1. Power model

We adopt the state-of-the-art processor power model [39-41]:

    P = P_s + ℏ(P_ind + C_ef s^m),        (20)

where P_s is the static power and P_ind is the frequency-independent active power; ℏ = 1 if the system is active (defined as having computation in progress) and ℏ = 0 otherwise; C_ef is an effective switching capacitance; m is a system/application-dependent constant; and s is the normalized processor speed (frequency). Like [39], we ignore the static power (P_s = 0) and set P_ind = 0.01, C_ef = 1, m = 3.

Considering our task model, the expected energy consumption of a single job of task τ_i is [42-44]:

    E_i = (P_ind + C_ef s^m) ⋅ x̄_i / s,        (21)

where x̄_i = Σ_{k=0}^{K-1} C_i^k ⋅ f_i^LO(C_i^k), with the normalized processor speed S_max = 1. In addition, the processor speed s should not be lower than S_crit, where S_crit (S_crit < S_max) is the energy-efficient (critical) speed, which can be computed as S_crit = (P_ind / ((m - 1) ⋅ C_ef))^{1/m} [39].

To facilitate comparisons between task sets with varying hyper-periods, we utilize the definition of the normalized energy consumption of a task set Γ within its hyper-period [22] (i.e., its power consumption):

    NE(Γ) = (1/HP(Γ)) Σ_{i=1}^{n} Σ_{j=1}^{ℓ_i} (P_ind + C_ef s^m) ⋅ x̄_i / s
          = Σ_{i=1}^{n} (P_ind + C_ef s^m) ⋅ x̄_i / (s T_i),        (22)

where ℓ_i = HP(Γ)/T_i is the number of jobs of task τ_i ∈ Γ released in the hyper-period HP(Γ).
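A minimal numeric sketch of Eqs. (20)-(21) and the critical speed from [39], with the paper's settings P_ind = 0.01, C_ef = 1, m = 3 (function names are ours, and the example execution time 2.0 is arbitrary):

```python
# Power/energy model sketch: per-job energy and the critical speed
# below which slowing down wastes energy on the frequency-independent
# active power P_ind.

P_IND, C_EF, M = 0.01, 1.0, 3

def job_energy(x_bar, s):
    """Expected energy of one job (Eq. (21)): x_bar is the expected
    execution time at speed 1, s the normalized processor speed."""
    return (P_IND + C_EF * s**M) * x_bar / s

# S_crit = (P_ind / ((m - 1) * C_ef)) ** (1/m)
s_crit = (P_IND / ((M - 1) * C_EF)) ** (1.0 / M)   # roughly 0.171

for s in (0.2, 0.5, 0.8, 1.0):
    print(s, job_energy(2.0, s))
```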
5.2. Calculating energy-efficient processor speeds

We determine the energy-efficient processor speed S_L in LO mode, and schedule the tasks with S_max = 1 in HI mode, if the IMC taskset Γ is deterministically schedulable by EDF on a single processor.

A taskset Γ running on a processor with speed S_L is equivalent to the taskset Γ running on a processor with speed S_max = 1 with proportionally scaled execution times (1/S_L times those of each task in Γ). Therefore, the probabilistic processor demand of any task τ_i ∈ Γ with speed S_L within [0, t) in LO mode can be calculated as follows:

    𝒟ℬℱ_L(τ_i, t) = 𝒩([[m_i + 1]]_0) ⊙ (𝒩(1/S_L) ⊙ 𝒞_i^LO).        (23)

The probabilistic processor demand of a carry-over job released by a LO task τ_i with speed S_L within [0, t) can be calculated as follows:

    𝒟ℬℱ(J_L, t) = 𝒩(1/S_L) ⊙ 𝒞_i^LO if r_i + D_i ≤ t, and 𝒩(0) otherwise.        (24)

The probabilistic processor demand of any task τ_i ∈ Γ_LO with speed S_L within [0, t) in HI mode can be calculated as follows:

    𝒟ℬℱ_L^H(τ_i, t) = (𝒩(k_i) ⊙ (𝒩(1/S_L) ⊙ 𝒞_i^LO)) ⊗ 𝒟ℬℱ(J_L, t) ⊗ (𝒩(c_i) ⊙ 𝒞_i^HI).        (25)

In addition, when the system schedules tasks with S_L in LO mode and S_max = 1 in HI mode, 𝒟ℬℱ(1) and 𝒟ℬℱ(2) in Eq. (16) are calculated by Eqs. (26) and (27), respectively:

    𝒟ℬℱ(1) = (𝒩(b_i) ⊙ (𝒩(1/S_L) ⊙ 𝒞_i^LO)) ⊗ 𝒟ℬℱ(J_H, t) ⊗ (𝒩(a_i) ⊙ 𝒞_i^HI),        (26)

    𝒟ℬℱ(2) = (𝒩(k_i) ⊙ (𝒩(1/S_L) ⊙ 𝒞_i^LO)) ⊗ 𝒟ℬℱ(J_H, t).        (27)

Theorem 4. Given an IMC taskset Γ that is deterministically schedulable by EDF on a single processor, it remains deterministically schedulable with the energy-efficient processor speed S_L in LO mode and S_max = 1 in HI mode if, for all 0 < t ≤ t_max and 0 < t_s < t,

    max{𝒟ℬℱ_L(t)} ≤ t,  and  max{𝒟ℬℱ_H(t)} ≤ t,        (28)

where S_crit ≤ S_L ≤ 1, and 𝒟ℬℱ_L(τ_i, t), 𝒟ℬℱ(J_L, t), 𝒟ℬℱ_L^H(τ_i, t), 𝒟ℬℱ(1) and 𝒟ℬℱ(2) are given in Eqs. (23)-(27), respectively.

Proof. Theorem 4 can be directly derived from Theorem 3.

5.3. Example 3

Table 4
Taskset parameters of Γ_3, with C_1^deg = 1.5, C_2^tr = 2, C_3^deg = 2. Each distribution is written as (values; PMF; CDF).

    τ_1: L_1 = LO, T_1 = D_1 = 10,
        𝒞_1^LO = (1, 1.5, 2, 2.5; 0.1, 0.4, 0.35, 0.15; 0.1, 0.5, 0.85, 1.0),
        𝒞_1^HI = (1, 1.5; 0.1, 0.9; 0.1, 1.0).
    τ_2: L_2 = HI, T_2 = D_2 = 20,
        𝒞_2^LO = (1, 2; 0.01, 0.99; 0.01, 1.0),
        𝒞_2^HI = (1, 2, 4, 5; 0.01, 0.49, 0.45, 0.05; 0.01, 0.5, 0.95, 1.0).
    τ_3: L_3 = LO, T_3 = D_3 = 10,
        𝒞_3^LO = (1.5, 2, 2.5, 3; 0.2, 0.3, 0.4, 0.1; 0.2, 0.5, 0.9, 1.0),
        𝒞_3^HI = (1.5, 2; 0.2, 0.8; 0.2, 1.0).

Let us consider the task set Γ_3 that consists of the tasks with the parameters presented in Table 4. The processor has ten discrete normalized processor speeds, i.e., [0.1, 0.2, …, 1.0] [45]. According to Theorem 3, the taskset is deterministically schedulable in both modes. We calculate S_L = 0.8 on the basis of Theorem 4, by iteratively trying out the available speeds, from lowest to highest, until we find the minimum speed that satisfies all constraints. According to Eq. (21), we have x̄_1 = 1.775, x̄_2 = 1.99, x̄_3 = 2.2. We can then use Eq. (22) to obtain the taskset's normalized energy consumption: 0.3242925 with processor speed S_L = 0.8 under DVFS, versus 0.50197 with processor speed S_max = 1 for EDF without DVFS, which represents significant energy savings.

5.4. Energy-efficient task execution model

Assuming that the system is deterministically schedulable in both modes, we can use DVFS to reduce the processor speed to S_L in LO mode and set it to S_max = 1 in HI mode, while maintaining schedulability in both modes. We modify the task execution model in Section 3.1 into the energy-efficient task execution model based on DVFS, as shown below.

Energy-efficient task execution model in probabilistic IMC. The system is first initialized to LO mode with processor speed S_L. If any HI task τ_i ∈ Γ_HI executes beyond its C_i^tr/S_L, the system switches into HI mode, with processor speed S_max = 1. At the mode-switch instant, if jobs of LO tasks have run for longer than their C_i^deg/S_L, these jobs are stopped until newly released. In addition, if the execution time of a LO job is less than C_i^deg/S_L at the switch instant, this carry-over job continues to execute its leftover execution, up to C_i^leftover, after the switch instant and before its deadline, where C_i^leftover is the leftover execution time at the nominal processor speed S_max = 1. While in HI mode, each LO task τ_i ∈ Γ_LO executes no more than its C_i^deg if it is started in HI mode, or its C_i^leftover if it is a leftover job started in LO mode. The system switches back to LO mode, with processor speed S_L, at an idle instant when no jobs are waiting for execution. In addition, incomplete tasks are dropped at their deadlines; hence there is no backlog of outstanding execution at the end of each hyper-period.

6. Experimental evaluation

We evaluate our approach based on two performance metrics: the schedulability ratio, which represents the proportion of schedulable task sets (either deterministically or probabilistically schedulable) out of all task sets; and the normalized energy consumption of each task set, as defined in Eq. (22).

We generate synthetic tasksets based on the following experiment settings:

• The number of tasks in each taskset Γ is set to n = 4.
• The number of HI tasks in Γ is set to n ⋅ CP, where the criticality proportion CP is set to CP = 0.5.
• The number of discrete values of each task τ_i's nominal pWCET 𝒞_i is set to K = 4.
• Each of the K probability values in the PMF of 𝒞_i is selected randomly from [0, 1) while ensuring that they sum to 1 (similar to [46,47]).
• For each LO task τ_i ∈ Γ_LO, the index of the Degraded WCET C_i^deg among the K discrete values of 𝒞_i is set to ind(C_i^deg) = 0.5K - 1 = 1.
• For each HI task τ_i ∈ Γ_HI, the index of the Threshold WCET C_i^tr among the K discrete values of 𝒞_i is set to ind(C_i^tr) = 0.5K - 1 = 1.
• T_i is randomly selected from the set {10, 20, 40, 50, 100, 200, 400, 500, 1000} [48].
• To control the taskset processor utilization, max{𝒰_LO(Γ)} is varied from 0.1 to 0.9, in steps of 0.1, while max{𝒰_HI(Γ)} is chosen randomly from the range [0.1, 1.0].

(Each task τ_i's pWCET 𝒞_i and period T_i are implicit, since both system schedulability and normalized energy consumption depend on the utilization values only, i.e., pWCU equals pWCET divided by period.) Note that the time overhead of the proposed method is mainly
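As a numeric cross-check of Example 3 under Eqs. (21)-(22), a minimal script (helper names are ours; the LO-mode distributions and periods are copied from Table 4):

```python
# Reproduces the x_bar values and the normalized energy consumption
# reported in Example 3 for S_L = 0.8 (DVFS) and S_max = 1 (no DVFS).

P_IND, C_EF, M = 0.01, 1.0, 3

tasks = [  # (LO-mode pWCET PMF, period T_i), from Table 4
    ({1: 0.1, 1.5: 0.4, 2: 0.35, 2.5: 0.15}, 10),   # tau_1
    ({1: 0.01, 2: 0.99}, 20),                       # tau_2
    ({1.5: 0.2, 2: 0.3, 2.5: 0.4, 3: 0.1}, 10),     # tau_3
]

def x_bar(pmf):
    """Expected execution time at nominal speed (x_bar_i under Eq. (21))."""
    return sum(v * p for v, p in pmf.items())

def ne(s):
    """Normalized energy consumption of the taskset (Eq. (22))."""
    return sum((P_IND + C_EF * s**M) * x_bar(pmf) / (s * T)
               for pmf, T in tasks)

print([x_bar(pmf) for pmf, _ in tasks])  # approx. 1.775, 1.99, 2.2
print(ne(0.8))   # approx. 0.3242925 with DVFS at S_L = 0.8
print(ne(1.0))   # approx. 0.50197 for EDF without DVFS
```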
Fig. 2. Impact on the schedulability ratio by varying the permitted system failure probability F_s and max{𝒰_LO(Γ)}.
spent on the schedulability test, with significant time consumption arising from the calculation of the probabilistic processor demands for the task set, which involves a large number of convolution operations. As the number of tasks increases, the time overhead grows exponentially. To maintain the accuracy of the schedulability test, we have not yet identified better methods to reduce the time overhead; hence, we have limited the number of tasks to four. In the future, we will strive to reduce the time overhead associated with convolutions.
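The taskset-generation step above that draws K random probability values summing to 1 for each nominal pWCET (similar to [46,47]) can be sketched as follows; normalizing uniform draws is our assumption, not necessarily the authors' exact procedure:

```python
# Sketch: one random PMF over K discrete pWCET values, with
# probabilities drawn from [0, 1) and normalized to sum to 1.
import random

def random_pmf(values, seed=None):
    """Return {value: probability} with probabilities summing to 1
    (normalization scheme is an assumption)."""
    rng = random.Random(seed)
    raw = [rng.random() for _ in values]   # K draws from [0, 1)
    total = sum(raw)
    return {v: r / total for v, r in zip(sorted(values), raw)}

pmf = random_pmf([1, 2, 4, 5], seed=42)
assert abs(sum(pmf.values()) - 1.0) < 1e-12
```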
In the first experiment, we vary F_s from 10⁻¹ to 10⁻⁹ with a multiplicative step of 10, i.e., F_s is plotted on a log scale. The value F_s = 10⁻⁹ is based on the permitted failure probability of 10⁻⁹ for ASIL D, the highest safety certification level in ISO 26262. The additional case of F_s = 0 is the special case of deterministic schedulability only, for hard real-time systems. Fig. 2 shows the results, where each data point represents the average outcome obtained from a variable number of task sets selected from 500 synthetic tasksets generated for each value of max{𝒰_LO(Γ)}, using different seeds for the pseudo-random number generator.

We make the following observations from Fig. 2:

• The schedulability ratio is positively correlated with F_s, confirming the significant advantages of considering probabilistic schedulability compared to considering deterministic schedulability only, even at very small values of F_s for high levels of safety certification.
• The schedulability ratio is negatively correlated with max{𝒰_LO(Γ)}, since both max{𝒟ℬℱ_L(t)} and max{𝒟ℬℱ_H(t)} increase with increasing max{𝒰_LO(Γ)}, which reduces system schedulability.

In the second experiment, we fix the permitted system failure probability at F_s = 10⁻⁷ (based on the requirement for ASIL A in ISO 26262). We vary each HI task's C_i^tr through its index ind(C_i^tr), from 0 to K - 1 with step size 1, i.e., the sequence {0, 1, 2, 3}. (The case of ind(C_i^tr) = 3 is the special case where each HI task τ_i has the same WCET in both modes.) Each LO task's C_i^deg is fixed at the default value of ind(C_i^deg) = 1. The results are shown in Fig. 3, including both the schedulability ratio and the normalized energy consumption (NE(Γ) defined in Eq. (22)). Each data point represents the average outcome obtained from a variable number of task sets selected from 500 synthetic tasksets generated for each value of max{𝒰_LO(Γ)}, depending on the value of ind(C_i^tr).

Fig. 3. Varying each HI task's Threshold WCET index ind(C_i^tr) and max{𝒰_LO(Γ)}.

We make the following observations from Fig. 3:

• The schedulability ratio is negatively correlated with max{𝒰_LO(Γ)}, as expected.
• The schedulability ratio is negatively correlated with C_i^tr. With increasing C_i^tr, HI tasks have larger WCETs (both expected and maximum) in LO mode according to the trimming operation for pWCET defined in Eq. (2), causing max{𝒟ℬℱ_L(t)} and max{𝒟ℬℱ_H(t)} to increase, which reduces system schedulability.
• The average normalized energy consumption NE(Γ) is positively correlated with max{𝒰_LO(Γ)}. From Eq. (22), NE(Γ) depends on each task's expected pWCET x̄_i and the energy-efficient processor speed in LO mode S_L. With increasing max{𝒰_LO(Γ)}, both x̄_i and S_L increase, causing NE(Γ) to increase.
• NE(Γ) is positively correlated with C_i^tr. With increasing C_i^tr, a HI task τ_i has a larger expected pWCET in LO mode, causing both x̄_i and S_L to increase, which in turn causes NE(Γ) to increase.

Averaged over all cases, our approach achieves an average reduction of 33.49% in normalized energy consumption compared to EDF without DVFS.

7. Practical considerations

In this section, we address some practical considerations in transposing our proposal into industry practice.

Timing analysis for pWCET. Task τ_i's pWCET 𝒞_i, as specified by its PMF, may be obtained via static, dynamic, measurement-based, or hybrid timing analysis methods, as discussed in the survey paper [49].
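The selection of S_L used throughout the evaluation (Example 3, Theorem 4), trying the available discrete speeds from lowest to highest until the schedulability test passes, can be sketched as follows; `is_schedulable` is a stand-in for the Theorem 4 check, and the utilization predicate below is only a toy example:

```python
# Sketch: pick the minimum discrete speed that keeps the taskset
# schedulable (the search order of Example 3).

def pick_speed(speeds, is_schedulable):
    """Return the lowest available speed passing the test, else None."""
    for s in sorted(speeds):
        if is_schedulable(s):
            return s
    return None   # not schedulable even at S_max = 1

# Toy stand-in predicate: total utilization scaled by 1/s must fit.
speeds = [round(0.1 * k, 1) for k in range(1, 11)]
s_l = pick_speed(speeds, lambda s: 0.62 / s <= 1.0)
print(s_l)   # 0.7
```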
Static Probabilistic Timing Analysis (SPTA) is based on the analysis of the program code, along with an abstract model of the hardware behavior. Measurement-Based Probabilistic Timing Analysis (MBPTA) typically applies Extreme Value Theory (EVT) to make a statistical estimate of the pWCET distribution of a program. Hybrid Probabilistic Timing Analysis (HyPTA) combines both statistical and analytical approaches, e.g., by taking measurements at the level of basic blocks or sub-paths, and then composing the results using structural information obtained from static analysis of the code.

Number of discrete values (K) of pWCET 𝒞_i. The value of K determines the granularity of modeling the pWCET's PMF: a larger K implies finer-granularity modeling, but may not be well supported by timing analysis techniques, and also leads to higher computational costs in the schedulability analysis. The typical value of K is 2-8 [5], although there is no hard lower or upper bound on its value. Our experiments with K varying from 4 to 8 indicate that its value does not affect system schedulability and power consumption significantly, indicating that K = 4 already provides sufficiently fine-granularity modeling under our experimental setup.

PMF of pWCET 𝒞_i. In the absence of real industry tasksets, we need to generate each task's pWCET 𝒞_i synthetically, as defined by its PMF. There is no clear consensus on the generation method in the literature on probabilistic schedulability analysis. An early work by Edgar and Burns [50] used the trimmed and scaled Gumbel distribution to model likely WCET values; Draskovic [36] used the Weibull distribution with an upper bound, which was used for modeling the distribution of long but unlikely execution times based on EVT [51] (the log of a Weibull distribution is a Gumbel distribution); Wang et al. [46] and Markovic et al. [47] adopted the uniform random distribution; Bozhko et al. [52] assumed two execution modes for each task in an MCS, a typical mode and a rare exceptional mode, with pWCET equal to c with probability 0.95 (the typical mode) and 4c with probability 0.05 (the exceptional mode), where c was scaled to match the expected task utilization. In this paper, we adopt the simple approach of the uniform random distribution, similar to [46,47].

Runtime overhead of DVFS. The overhead of varying the processor speed with DVFS is assumed to be zero. This is a common assumption adopted in the DVFS literature [7]. We can determine through offline measurement an upper bound on the processor speed transition overhead, which is typically small compared to the WCET of a task; hence it can be added to each task's execution time without a significant impact on the solution.

Multiprocessor platforms. Our work can be easily extended to multiprocessor platforms by a partitioned scheduling approach [31,32,53]. In partitioned scheduling, tasks are statically assigned to processors, with each processor managed by a local scheduler. We can use simple allocation methods, e.g., criticality-unaware worst-fit decreasing (CU-WFD) and criticality-aware first-fit decreasing (CA-FFD), to allocate tasks to each processor, while using the energy-efficient task execution model to schedule the tasks on each processor.

8. Conclusions and future work

The classic MCS task model has several restrictive assumptions, including hard real-time constraints, dropping LO tasks in HI mode, and a lack of consideration of power/energy consumption issues. In this paper, we relax these assumptions to make the MCS task model more practically applicable. We consider an IMC taskset scheduled with the EDF algorithm on a uniprocessor platform, and propose an energy-efficient task execution model that guarantees (deterministic or probabilistic) schedulability, allows degraded QoS for LO tasks in HI mode, and applies DVFS to save energy.

In this paper, we have considered EDF-based uniprocessor scheduling, dual-criticality MCS, and task execution times as probabilistic variables. As part of future work, these assumptions can be further relaxed to fixed-priority scheduling, multiprocessor platforms, multiple criticality levels, and multiple task parameters (e.g., task periods) represented by random variables.

CRediT authorship contribution statement

Yi-Wen Zhang: Writing - review & editing, Writing - original draft, Methodology, Funding acquisition, Formal analysis, Conceptualization. Jin-Long Zhang: Writing - original draft, Visualization, Software, Data curation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work has been supported by the Natural Science Foundation of Fujian Province of China under Grant 2023J01139 and the Fundamental Research Funds for the Central Universities, China under Grant ZQN-1009.

Data availability

No data was used for the research described in the article.

References

[1] Alan Burns, Robert Ian Davis, Mixed criticality systems - a review (February 2022), 2022, pp. 1-97, https://eprints.whiterose.ac.uk/183619/.
[2] Steve Vestal, Preemptive scheduling of multi-criticality systems with varying degrees of execution time assurance, in: 28th IEEE International Real-Time Systems Symposium, RTSS 2007, IEEE, 2007, pp. 239-243.
[3] Yi-Wen Zhang, Jin-Peng Ma, Hui Zheng, Zonghua Gu, Criticality-aware EDF scheduling for constrained-deadline imprecise mixed-criticality systems, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 43 (2) (2024) 480-491.
[4] Yi-Wen Zhang, Hui Zheng, Slack time management for imprecise mixed-criticality systems with reliability constraints, IEEE Trans. Comput. (2025).
[5] Robert I. Davis, Liliana Cucu-Grosjean, A survey of probabilistic schedulability analysis techniques for real-time systems, Leibniz Trans. Embed. Syst. 6 (1) (2019) 04:1-04:53.
[6] Yi-Wen Zhang, Rong-Kun Chen, A survey of energy-aware scheduling in mixed-criticality systems, J. Syst. Archit. 127 (2022) 102524.
[7] Ashikahmed Bhuiyan, Federico Reghenzani, William Fornaciari, Zhishan Guo, Optimizing energy in non-preemptive mixed-criticality scheduling by exploiting probabilistic information, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 39 (11) (2020) 3906-3917.
[8] Yi-Wen Zhang, Chen Ouyang, Semi-clairvoyant scheduling in non-preemptive fixed-priority mixed-criticality systems, J. Syst. Archit. 159 (2025) 103332.
[9] Qingling Zhao, Mengfei Qu, Zonghua Gu, Haibo Zeng, Minimizing stack memory for partitioned mixed-criticality scheduling on multiprocessor platforms, ACM Trans. Embed. Comput. Syst. (TECS) 21 (2) (2022) 1-30.
[10] Benny Akesson, Mitra Nasri, Geoffrey Nelissen, Sebastian Altmeyer, Robert I. Davis, A comprehensive survey of industry practice in real-time systems, Real-Time Syst. (2021) 1-41.
[11] Georg von der Brüggen, Nico Piatkowski, Kuan-Hsun Chen, Jian-Jia Chen, Katharina Morik, Björn B. Brandenburg, Efficiently approximating the worst-case deadline failure probability under EDF, in: 2021 IEEE Real-Time Systems Symposium, RTSS, IEEE, 2021, pp. 214-226.
[12] Alexandre Esper, Geoffrey Nelissen, Vincent Nélis, Eduardo Tovar, An industrial view on the common academic understanding of mixed-criticality systems, Real-Time Syst. 54 (3) (2018) 745-795.
[13] Sanjoy Baruah, Alan Burns, Implementing mixed criticality systems in Ada, in: International Conference on Reliable Software Technologies, Springer, 2011, pp. 174-188.
[14] Sanjoy K. Baruah, Alan Burns, Robert I. Davis, Response-time analysis for mixed criticality systems, in: 2011 IEEE 32nd Real-Time Systems Symposium, IEEE Computer Society, 2011, pp. 34-43.
[15] François Santy, Gurulingesh Raravi, Geoffrey Nelissen, Vincent Nelis, Pratyush Kumar, Joël Goossens, Eduardo Tovar, Two protocols to reduce the criticality level of multiprocessor mixed-criticality systems, in: Proceedings of the 21st International Conference on Real-Time Networks and Systems, 2013, pp. 183-192.
Yi-Wen Zhang (Senior Member, IEEE) received his Ph.D. in Computer Application Technology from the University of Chinese Academy of Sciences in 2016. He was a Post-doctoral Fellow with the Shenyang Institute of Computing Technology, Chinese Academy of Sciences, from 2017 to 2019. He has been an associate professor since 2020. He is named in the world's top 2% of Scientists List 2023 and 2024 by Stanford University. His current research interests include real-time systems and low-power design.

Jin-Long Zhang received the B.E. degree in Software Engineering from Jiangxi Agricultural University in 2023. He is currently pursuing the M.S. degree at Huaqiao University. His current research interests include real-time systems and low-power design.

View File

@@ -0,0 +1,70 @@
Embedded Software Design
Journal of Systems Architecture
The EUROMICRO Journal

Editor-in-Chief
Dr. Zonghua Gu, Department of Computer Science, Hofstra University, USA

Subject Area Editors
L. Almeida, Faculdade de Engenharia, Dept. of Electrical and Computer Engineering, Universidade do Porto, Porto, Portugal
J.H. Anderson, Dept. of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
P. Bellavista, Dept. Computer Science and Engineering (DISI), Alma Mater Studiorum, Università di Bologna, Bologna, Italy
C.-S. Bouganis, Department of Electrical and Electronic Engineering, Imperial College London, South Kensington Campus, London, England, UK
L. Cassano, Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Italy
G. Chen, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
M. García-Valls, Departamento de Ingeniería Telemática, Universidad Carlos III de Madrid, Leganés, Madrid, Spain
C. Gill, Department of Computer Science and Engineering, Washington University, USA
A. Gokhale, Dept. of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, Tennessee, USA
N. Guan, Dept. of Computing, The Hong Kong Polytechnic University, Hong Kong
J. Hu, Department of Electrical and Computer Engineering, University of Pittsburgh, USA
Y. Jiang, School of Software, Tsinghua University, China
H. Kapoor, Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, India
A. Kritikakou, University of Rennes, Inria, Irisa and CNRS, France
F. Li, School of Computer Science and Engineering, University of Electronic Science and Technology of China, China
S. Li, College of Computer Science, Zhejiang University, Hangzhou, China
G. Lima, Instituto de Matematica, Departamento de Ciencia da Computacao, Federal University of Bahia, Salvador, Bahia, Brazil
M. Lin, Department of Computer Science, St. Francis Xavier University, Canada
G. Lipari, Ecole Normale Superieure (ENS) de Cachan, Cachan, France
D. Liu, College of Computer Science and Technology, Chongqing University, Chongqing, China
W. Liu, School of Computer Science and Engineering, Nanyang Technological University, Singapore
L. Lo Bello, Dipart. di Ingegneria Elettrica Elettronica e Informatica (DIEEI), Università degli Studi di Catania, Catania, Italy
W. Meng, Technical University of Denmark, Lyngby, Denmark
M. Nasri, Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, the Netherlands
G. Palermo, Department of Electronics Information and Bioengineering, Polytechnic University of Milan, Italy
L. Palopoli, Dipartimento di Ingegneria e Scienza dell'Informazione (DISI), Università di Trento, Povo (Trento), Italy
S. Ren, Department of Electrical and Computer Engineering, San Diego State University, USA
S. Sarangi, Department of Computer Science and Engineering, Indian Institute of Technology Delhi, India
M. Schoeberl, DTU Informatics, Danmarks Tekniske Universitet (DTU), Richard Petersens Plads, Kongens Lyngby, Denmark
Z. Shao, Dept. of Computing, The Hong Kong Polytechnic University, Hong Kong
M. Staron, Computer Science and Engineering, University of Gothenburg, Gothenburg, Sweden
F. Tramarin, Dip. Gestione e Tecnica dei Sistemi Industriali (DTG), Università degli Studi di Padova, Vicenza, Italy
M.A. Vega-Rodriguez, ARCO Research Group, Dept. Technologies of Computers & Communications, Universidad de Extremadura, Escuela Politecnica, Campus Universitario, Cáceres, Spain
S. Wan, School of Information and Safety Engineering, Zhongnan University of Economics and Law, China
H. Wu, Center for Applied Mathematics, Tianjin University, China
G. Xie, College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
W. Xu, Zhejiang University College of Electrical Engineering, Hangzhou, China
H. Zeng, Virginia Tech, Blacksburg, Virginia, USA
Y. Zhang, Department of Computer Science, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
Q. Zhao, Nanjing University of Science and Technology, Nanjing, China
N. Zheng, Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou, China
J. Zhou, Department of Computer Science and Technology, Nanjing University of Science and Technology, China
D. Zhu, Dept. of Computer Science, University of Texas at San Antonio, San Antonio, Texas, USA

View File

@@ -0,0 +1,999 @@
Computer Standards & Interfaces 97 (2026) 104112
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
Efficient and secure multi-user 𝑘NN queries with dynamic POIs updating
Yining Jia a,b,c, Yali Liu a,b,c,*, Congai Zeng a,b,c, Xujie Ding a,b,c, Jianting Ning d,e

a School of Artificial Intelligence and Computer Science, Jiangsu Normal University, Xuzhou, Jiangsu Province, 221116, China
b State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu Province, 210023, China
c Guangxi Key Laboratory of Cryptography and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi Province, 541004, China
d School of Cyber Science and Engineering, Wuhan University, Wuhan, Hubei Province, 430072, China
e Faculty of Data Science, City University of Macau, 999078, Macao Special Administrative Region of China
ARTICLE INFO

Keywords:
Cloud computing
Security
kNN queries
Dynamic POIs updating

ABSTRACT

The 𝑘-nearest neighbors (𝑘NN) query is a key operation in spatial and multimedia databases, widely applied in fields such as electronic healthcare and Location-Based Services (LBS). With the rapid development of cloud computing, uploading the private data of a Data Owner (DO) to Cloud Servers (CS) has become a trend. However, existing 𝑘NN query schemes are not designed for multi-user environments, cannot update the points of interest (POIs) stored on the CS in a timely manner, and suffer from low query efficiency. Therefore, this paper proposes an efficient and secure multi-user 𝑘NN query scheme with dynamic POIs updating, named DESM𝑘NN, which achieves secure multi-user 𝑘NN queries. To improve query efficiency, DESM𝑘NN adopts a two-stage search framework, which consists of an initial filtering stage based on hierarchical clustering to effectively constrain the search range, followed by a more efficient precise search stage. Based on this framework, DESM𝑘NN designs a set of security protocols for efficient query processing and enables dynamic POIs updates. Meanwhile, DESM𝑘NN not only utilizes the Distributed Two Trapdoors Public-Key Cryptosystem (DT-PKC) to enable multi-user queries but also ensures data privacy, query privacy, result privacy and access pattern privacy. Moreover, DESM𝑘NN can verify the correctness and completeness of query results. Finally, security analysis proves that DESM𝑘NN meets the formal security definition of multiparty computation, and experimental evaluation shows that DESM𝑘NN improves query efficiency by up to 45.5% compared with existing 𝑘NN query schemes.
1. Introduction

LBS [1–3] are increasingly integrated into real-world applications, such as ride-hailing platforms (e.g., Uber, DiDi), navigation systems (e.g., Google Maps, Baidu Maps), and online food delivery services. These services heavily rely on POIs databases to provide personalized and efficient responses to the queries of a query user (QU). Among various query types, the 𝑘NN query [4,5] is one of the most fundamental methods, which aims to find the 𝑘 nearest POIs to a given query point. With the rapid development of cloud computing [6,7], DOs increasingly outsource their POIs databases to CS, which provide scalable storage and massive computing resources. Well-known commercial platforms, such as Amazon Web Services and Google Cloud Platform, already provide such services to support efficient 𝑘NN queries in LBS. Although outsourcing databases to CS improves data accessibility and flexibility, it makes data more susceptible to unauthorized access threats. In practice, POIs often contain sensitive or private information. For instance, POIs databases may include the locations of hospitals, government facilities, or user-related activity areas in intelligent transportation and LBS systems. Once such information is exposed, it can lead to privacy leakage, commercial losses, or even public security risks [4]. Therefore, to protect POIs from malicious access or theft by the CS and unauthorized users, the DO needs to encrypt them before outsourcing to the CS. In addition, security needs to be considered in query processing to maintain efficiency and protect the confidentiality of POIs databases.

Although 𝑘NN queries have been widely studied in recent years, several limitations still hinder their applicability in practice. First, most existing schemes [8,9] for 𝑘NN queries are based on static spatial data [10], where the database remains unchanged within a certain time interval. Consistent with this common setting, DESM𝑘NN also assumes that POIs are static during query processing to enable fair performance comparison. However, in practice, POIs may change over time, and their insertion or deletion frequency varies across different areas because these updates are driven by real-world change. In rapidly developing areas where new facilities emerge or existing ones close frequently, POI updates occur more frequently, whereas in more stable regions, such updates tend to be infrequent. This dynamic updating of
Corresponding author at: School of Artificial Intelligence and Computer Science, Jiangsu Normal University, Xuzhou, Jiangsu Province, 221116, China.
E-mail address: liuyali@jsnu.edu.cn (Y. Liu).
https://doi.org/10.1016/j.csi.2025.104112
Received 12 June 2025; Received in revised form 18 November 2025; Accepted 8 December 2025
Available online 11 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
POIs reflects the continuous changes in the physical environment. As shown in Fig. 1, 𝑈0 searches for the two nearest neighbors (𝑘 = 2) in a POIs database 𝐷 = {𝑝0, …, 𝑝7}. The original 2NN query result 𝑄 was {𝑝0, 𝑝1}. When a new and closer point 𝑝8 is inserted, the correct 2NN result becomes {𝑝1, 𝑝8}. This example shows that any update to the POI database, such as the insertion, modification, or deletion of POIs, may change the query results. Therefore, dynamic updates must be supported in outsourced POI databases. Second, existing schemes mostly use Asymmetric-Scalar-Product-Preserving Encryption (ASPE) [11,12] or pure homomorphic encryption algorithms to encrypt outsourced data. Unfortunately, ASPE has been demonstrated to be insecure under known-plaintext attacks [13], and homomorphic operations lead to a significant computational cost. These limitations raise the challenge of designing an efficient and secure query mechanism. Finally, most solutions [14,15] assume a single-user setting, where all QUs share the same secret key to enable computability of encrypted data across multiple users. In practice, the single-user setting has obvious flaws. Once the unique key of any QU is leaked, the entire encrypted database can be completely decrypted, and the query content may also be intercepted by the adversary. As illustrated in Fig. 1, in such a single-user setting, 𝑈1 and 𝑈2 can capture the query content and result of 𝑈0 and decrypt them using the same secret key as 𝑈0. This highlights the need for secure multi-user queries.

Fig. 1. Sample of the 𝑘NN query (𝑘 = 2).

To resolve the aforementioned challenges, this paper proposes DESM𝑘NN. The contributions of DESM𝑘NN are as follows:

(1) Dynamic POIs Updating: DESM𝑘NN innovatively designs secure insertion and deletion protocols, which avoids the problem of incorrect and incomplete query results.
(2) Efficient Query: DESM𝑘NN proposes an efficient two-stage search framework, which improves the query performance.
(3) Multi-User Query: DESM𝑘NN designs a series of secure protocols based on DT-PKC, which achieves secure multi-user 𝑘NN queries.
(4) Security & Performance: Security analysis shows that the proposed DESM𝑘NN is secure. Additionally, experimental evaluation shows that DESM𝑘NN improves query efficiency by up to 45.5% compared with existing 𝑘NN query schemes on two real datasets (California Road Network and Points of Interest, San Francisco Road Network1).

The rest of this paper is structured as follows. Section 2 presents related work. Section 3 describes preliminaries. The architecture and security model of DESM𝑘NN is defined in Section 4. In Section 5, the system construction is introduced. Section 6 presents the specific query procedure for DESM𝑘NN. Next, Section 7 analyzes computational complexity, communication complexity, and security. Section 8 provides an experimental evaluation of DESM𝑘NN. Section 9 concludes this paper.

2. Related work

Secure Key-Sharing Query: Wong et al. [11] introduced a 𝑘NN query scheme for encrypted data based on ASPE. However, ASPE relied on a secret matrix to transform data points and query points, which required the secret key to be shared among all QUs and the DO. Additionally, ASPE has been proven insecure against known-plaintext attacks [13]. To enhance query security, Elmehdwi et al. [15] developed a set of two-party computation protocols based on the Paillier cryptosystem. Although scheme [15] preserved the privacy of query results, QUs hold the DO's private key, and the query efficiency remains low. Moreover, scheme [16] employed Delaunay triangulation and order-preserving encryption [18] to accurately solve the secure 𝑘NN problem. Nevertheless, the encryption schemes in [16] are symmetric, which also required the DO and QUs to share the key. Cui et al. [8] proposed an efficient, secure, and verifiable 𝑘NN query scheme, which employed a secure index structure to ensure data security and result integrity, along with a set of novel protocols and verification strategies for various index operations. However, the search complexity of scheme [8] was linearly related to the database size, which led to a lack of scalability. To address the efficiency issues in [8], Liu et al. [14] introduced a two-stage search framework for secure and verifiable 𝑘NN queries, which integrated Edge Servers (ES) into the classic Twin-Cloud model by leveraging adaptive encryption strategies and secure data partitioning to optimize query performance. However, neither scheme [8] nor scheme [14] could resolve the key-sharing issue.

Secure Multi-User Query: To support multi-user 𝑘NN queries, researchers first focused on multi-key queries. Cheng et al. [17] implemented 𝑘NN queries with multi-key support, where the DO and QUs had their own keys, and each QU's key was not shared with others. However, scheme [17] incurred high computational cost and lacked result verification. Subsequently, Liu et al. proposed the DT-PKC [19], which also allowed different QUs to use different keys during queries. Building on the DT-PKC, Cheng et al. [20] and Nayak et al. [21] explored range queries and keyword queries, respectively. Nevertheless, scheme [20] and scheme [21] still suffered from high computational cost and the inability to verify results. Cui et al. [9] introduced a method for secure and verifiable 𝑘NN queries by utilizing DT-PKC, which encrypted grid and bucket divisions within the Voronoi diagram to maintain data security, while also introducing a verification strategy to ensure the correctness and completeness of the query results. However, scheme [9] relied heavily on homomorphic encryption and data packing techniques, which led to high computational cost and search complexity. Moreover, scheme [9] fails to address the issue of dynamic updates for POIs.

In summary, the limitations of the existing 𝑘NN query schemes are as follows: (1) Single-user query schemes carry a risk of key leakage. (2) Multi-user query schemes have low efficiency. (3) Most existing query schemes are unable to achieve dynamic updates of POIs. For ease of exhibition, we summarize the above works in Table 1.

3. Preliminaries

3.1. Voronoi diagram

The Voronoi diagram [22] partitions the plane according to a set of points. Each Voronoi Cell (VC) corresponds to a point and contains all locations that are closer to this point than to any other. Two points are Voronoi neighbors if their cells share an edge, and the neighbor set of a point is denoted as 𝑉𝑁(𝑝).

1 https://users.cs.utah.edu/~lifeifei/SpatialDataset.htm.
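Both the nearest-neighbor property of Voronoi cells (𝑞 ∈ 𝑉𝐶(𝑝) exactly when 𝑝 minimizes the distance to 𝑞) and the update scenario of Fig. 1 reduce to brute-force distance ranking on small inputs. The sketch below uses made-up coordinates, since the paper gives no numeric POIs; only the qualitative behaviour matters: inserting a POI closer to 𝑞 displaces a member of the old 2NN set.

```python
# Illustration of the Fig. 1 scenario with made-up coordinates:
# inserting a closer POI p8 changes the 2NN answer from {p0, p1}
# to {p1, p8}.
from math import dist

def knn(db, q, k):
    """Brute-force kNN: sort POI names by Euclidean distance to q."""
    return sorted(db, key=lambda name: dist(db[name], q))[:k]

db = {"p0": (1.5, 0.0), "p1": (1.0, 0.0), "p2": (4.0, 1.0),
      "p3": (3.0, 3.0), "p4": (0.0, 5.0), "p5": (5.0, 5.0),
      "p6": (6.0, 2.0), "p7": (2.0, 4.0)}
q = (0.0, 0.0)

before = set(knn(db, q, 2))   # {"p0", "p1"}
db["p8"] = (0.5, 0.0)         # a new POI closer to q is inserted
after = set(knn(db, q, 2))    # {"p1", "p8"}
```

The same argmin ranking doubles as a cell-membership test: the first element returned by `knn` is the point whose Voronoi cell contains 𝑞.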
Table 1
Summary of existing 𝑘NN query works.

Method | Data privacy | Query privacy | Result privacy | Access patterns | Verifiable | Multi-user | POIs updating
Wong [11] | √ | √ | × | × | × | × | ×
Elmehdwi [15] | √ | √ | √ | √ | × | × | ×
Choi [16] | √ | √ | √ | × | × | × | ×
Cheng [17] | √ | √ | √ | × | × | √ | ×
Cui [8] | √ | √ | √ | √ | √ | × | ×
Liu [14] | √ | √ | √ | × | √ | × | ×
Cui [9] | √ | √ | √ | √ | √ | √ | ×

Notations: √ represents the approach satisfies the condition; × represents it fails to satisfy the condition.
Fig. 2. An example of Voronoi diagram.

For example, given a dataset 𝐷 that contains 16 POIs as shown in Fig. 2-(b), the Voronoi diagram is shown in Fig. 2-(a). Since 𝑉𝐶(𝑝8) shares a common edge with 𝑉𝐶(𝑝𝑖) for 𝑖 ∈ {3, 4, 9, 11, 12, 13}, the Voronoi neighbors of 𝑝8 are 𝑉𝑁(𝑝8) = {𝑝3, 𝑝4, 𝑝9, 𝑝11, 𝑝12, 𝑝13}. Therefore, the search result of a 3NN query is 𝑅𝑒𝑠𝑢𝑙𝑡 = {𝑝9, 𝑝11, 𝑝13}.

The Voronoi diagram has two useful properties for 𝑘NN verification:

(1) Given a query point 𝑞, the nearest neighbor of 𝑞 is the data point 𝑝 if 𝑞 ∈ 𝑉𝐶(𝑝).
(2) If data points 𝑝1, …, 𝑝𝑘 are the 𝑘 (𝑘 > 1) nearest neighbors of the query point 𝑞, then 𝑝𝑖 belongs to 𝑉𝑁(𝑝1) ∪ ⋯ ∪ 𝑉𝑁(𝑝𝑖−1), for 𝑖 = 2, …, 𝑘.

3.2. R-tree index based on hierarchical clustering

The R-tree index [23] organizes spatial objects into nested rectangles, known as Minimum Bounding Rectangles, to enable efficient querying of spatial data, such as range queries [24] and nearest neighbor searches. However, the efficiency of the R-tree strongly depends on how the data are grouped during construction. To address this, DESM𝑘NN introduces hierarchical clustering, which improves both the organization of spatial objects and the performance of query processing.

As shown in Fig. 3, an R-tree with a fanout of 𝑓 = 2 is built from the POIs in 𝑅𝑒𝑐𝑡1. In this construction, the data are first grouped by applying hierarchical clustering based on the Euclidean distance. This process is performed in two rounds, and the resulting clusters naturally determine the partitioning of the dataset, which is then used to build the tree structure.

Fig. 3. R-tree structure based on hierarchical clustering.

3.3. Distributed two trapdoors public-key cryptosystem

The DT-PKC [19] is a variant of the traditional double trapdoor decryption cryptosystem. Given a public key 𝑝𝑘, a private key 𝑠𝑘, and a strong private key 𝑆𝐾, the cryptosystem supports several algorithms that enable encryption, decryption, and collaborative key operations.

First, encryption is carried out by the algorithm 𝐸𝑛𝑐. Given a message 𝑝 ∈ Z𝑁 and the public key 𝑝𝑘, the algorithm outputs the ciphertext 𝐸𝑝𝑘(𝑝). The system then allows two types of decryption:

(1) With the private key (𝑠𝑘), the algorithm 𝑊𝐷𝑒𝑐 takes 𝐸𝑝𝑘(𝑝) as input and recovers 𝑝.
(2) With the strong private key (𝑆𝐾), the algorithm 𝑆𝐷𝑒𝑐 also decrypts 𝐸𝑝𝑘(𝑝) to obtain 𝑝.

A distinctive feature of DT-PKC lies in the management of the strong private key. The algorithm 𝑆𝑘𝑒𝑦𝑆 enables the strong private key 𝑆𝐾 to be split into two partial strong private keys, 𝑆𝐾1 and 𝑆𝐾2. This splitting supports a collaborative decryption mechanism in two steps:

(1) In step 1, 𝑃𝑆𝐷𝑒𝑐1 takes 𝐸𝑝𝑘(𝑝) and 𝑆𝐾1 as input, which results in a partially decrypted ciphertext 𝐶𝑇1.
(2) In step 2, 𝑃𝑆𝐷𝑒𝑐2 completes the process by using 𝐶𝑇1 and 𝑆𝐾2, which ultimately recovers 𝑝.

3.4. Advanced comparable inner product encoding

The CIPE𝑠 scheme [25] allows edge servers to determine whether a value lies within a query range based on encrypted data. Compared to the original CIPE scheme, CIPE𝑠 enhances security by extending query vectors into random query matrices, which makes it more resilient to chosen-plaintext attacks.

CIPE𝑠 supports several key algorithms for encryption and range query evaluation. First, the key generation algorithm 𝐺𝑒𝑛𝐾𝑒𝑦 takes a security parameter 𝜅 ∈ N as input and outputs a secret key 𝑠𝑘𝑐. The data encryption algorithm 𝐸𝑛𝑐𝐼 encrypts a plaintext 𝑥 into ciphertext 𝐸𝑐(𝑥) with 𝑠𝑘𝑐. To perform queries, the query encryption algorithm 𝐸𝑛𝑐𝑄 transforms a query range 𝑄 = [𝑏𝑙, 𝑏𝑢] into an encrypted range 𝐸𝑐(𝑄). Finally, the calculation algorithm 𝐶𝑎𝑙 compares the encrypted value 𝐸𝑐(𝑥) with the encrypted query range 𝐸𝑐(𝑄) and outputs a comparison result: −1 if 𝑥 < 𝑏𝑙, 1 if 𝑥 > 𝑏𝑢, and 0 if 𝑥 ∈ [𝑏𝑙, 𝑏𝑢].

4. System architecture and security model

This section introduces the system architecture and security model of DESM𝑘NN. A summary of notations is given in Table 2.
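The two-step collaborative decryption of DT-PKC (𝑃𝑆𝐷𝑒𝑐1 followed by 𝑃𝑆𝐷𝑒𝑐2, with the shares held by different servers) can be mimicked with textbook ElGamal whose secret exponent is additively split. This is only a sketch of the split-key flow, not the actual DT-PKC construction, and the parameters (a Mersenne-prime modulus, base 3) are toy choices for illustration.

```python
# Toy two-step collaborative decryption in the style of PSDec1/PSDec2.
# NOT DT-PKC: plain multiplicative ElGamal over Z_P^*, with the secret
# exponent additively split into shares sk1 + sk2 (mod P-1), so that
# neither share alone can recover the plaintext.
import random

P = 2**127 - 1   # Mersenne prime; toy modulus, not a secure group choice
G = 3

def keygen():
    sk = random.randrange(2, P - 2)      # the "strong" secret key SK
    pk = pow(G, sk, P)
    sk1 = random.randrange(2, P - 2)     # share for the first server
    sk2 = (sk - sk1) % (P - 1)           # share for the second server
    return pk, sk1, sk2

def enc(pk, m):
    """ElGamal encryption of 1 <= m < P under public key pk."""
    r = random.randrange(2, P - 2)
    return pow(G, r, P), (m * pow(pk, r, P)) % P

def psdec1(sk1, ct):
    """Step 1: the first server strips its key share (multiply by c1^-sk1)."""
    c1, c2 = ct
    return c1, (c2 * pow(c1, (P - 1) - sk1, P)) % P

def psdec2(sk2, ct1):
    """Step 2: the second server removes the remaining share, revealing m."""
    c1, c2 = ct1
    return (c2 * pow(c1, (P - 1) - sk2, P)) % P

pk, sk1, sk2 = keygen()
ct = enc(pk, 123456789)
m = psdec2(sk2, psdec1(sk1, ct))   # recovers 123456789
```

Because c1^(P-1) = 1 (Fermat), raising to (P−1)−sk𝑖 is the same as raising to −sk𝑖, and the two partial decryptions cancel g^(r·sk) exactly when both shares are applied.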
Table 2 verification object 𝑉 𝑂 to the QU (Step 7). The QU then verifies the
Summary of notations. correctness of the result before finalizing the query.
𝐷 A spatial dataset that includes 𝑛 points {𝑃1 , … , 𝑃𝑛 }
𝑉𝐷 Voronoi diagram built from 𝐷
𝑠𝑘𝑐 The secret key for CIPE𝑠 scheme 4.2. Security model
𝑠𝑘0 , 𝑝𝑘0 The secret/public key for DO
𝑠𝑘𝑢 , 𝑝𝑘𝑢 The secret/public key for users DESM𝑘NN is designed to address three security threats. First, CS
𝑆𝐾, 𝑆𝐾1 , 𝑆𝐾2 Strong private key and partial ones
𝑃 𝑆𝐷𝑒𝑐1(𝑆𝐾1 , ) The first step of partial decryption
cannot be fully trusted and may tamper with query results. Second, CS
𝑃 𝑆𝐷𝑒𝑐2(𝑆𝐾2 , , ) The second step of partial decryption may act as honest-but-curious adversaries that attempt to infer sensitive
𝑄, 𝐸𝑐 (𝑄) A query coverage and its encrypted range information from the encrypted data. Third, QUs themselves may be
𝑞, 𝐸𝑝𝑘0 (𝑞) A query point and its encrypted coordinates curious and try to learn the query information of others.
𝑃𝑖 , 𝐸𝑝𝑘0 (𝑃𝑖 ) A POI and its encrypted coordinates
To counter the risk of result tampering, DESM𝑘NN incorporates a
𝑇̂𝑟𝑒𝑒𝑅 , 𝑇 𝑟𝑒𝑒𝑅 The encrypted/clear R-tree index built from 𝐷
̂
𝑃 𝐷, 𝑃 𝐷 The encrypted/clear preprocessed data built from 𝑉 𝐷 verification mechanism that ensures both correctness and complete-
̂ 𝑄 , 𝑅𝑒𝑐𝑡𝑄
𝑅𝑒𝑐𝑡 The encrypted/clear range query generated for 𝑄 ness [27]. Correctness requires that every returned point 𝑝𝑅𝑒𝑠𝑢𝑙𝑡
𝐼𝑅 The immediate result remains unmodified and originates from the authentic database, while
̂ 𝑅𝑒𝑠𝑢𝑙𝑡
𝑅𝑒𝑠𝑢𝑙𝑡, The encrypted/clear result in the exact search phase completeness guarantees that all true 𝑘NN results are included and no
𝐻() A hash function
𝑉𝑂 The verification object
irrelevant points are omitted.
The other two threats are addressed by designing a secure index and
a set of novel secure protocols that jointly preserve multiple dimensions
of privacy [4,28]. Specifically, data privacy ensures that the database
𝐷 remains hidden from the CS; query privacy requires that the content
of a QUs query 𝑆𝑄 is concealed from both the CS and other QUs; result
privacy guarantees that only the QU can access the returned 𝑅𝑒𝑠𝑢𝑙𝑡; and
access-pattern privacy prevents the CS from learning which database
entries satisfy a given query.
It is noteworthy that during system setup stage, CCS is prevented
from compromising or collaborating with CSS. Furthermore, collusion
between CS and QUs must be prevented throughout the query process.
Fig. 4. System architecture.

4.1. System architecture

DESM𝑘NN employs a two-stage framework: an initial filtering stage on ESs and a precise search stage on dual cloud servers. To protect privacy, the system adopts a dual-cloud architecture [8,9,14,26], where collusion-resilient protocols ensure both efficiency and security beyond traditional single-cloud settings. As shown in Fig. 4, the architecture involves several entities with distinct roles.

In the setup phase (Step 1), the Certified Authority (CA) generates cryptographic keys: (𝑝𝑘0, 𝑠𝑘0) for the DO, (𝑝𝑘𝑖𝑢, 𝑠𝑘𝑖𝑢) for each QU, and a split strong key (𝑆𝐾1, 𝑆𝐾2), whose shares are respectively assigned to the two cloud servers (CSS and CCS). All public keys are shared among the entities. The DO then prepares the dataset. For sensitive data (Step 2), it preprocesses 𝑉𝐷 into 𝑃𝐷, encrypts 𝑃𝐷 with DT-PKC to obtain 𝑃̂𝐷, and uploads it to CSS. For less sensitive data (Step 3), it builds an R-tree index 𝑇𝑟𝑒𝑒𝑅, encrypts it with CIPE𝑠, and distributes the encrypted index 𝑇̂𝑟𝑒𝑒𝑅 to ESs for efficient query filtering.

When a QU issues a query (Step 4), it constructs 𝑆𝑄 = (𝑅𝑒𝑐𝑡̂𝑞, 𝐸𝑝𝑘0(𝑞), 𝑘) and sends it to a nearby ES. The ES evaluates 𝑅𝑒𝑐𝑡̂𝑞 over 𝑇̂𝑟𝑒𝑒𝑅, filters candidate results 𝐼𝑅, and forwards them together with (𝐸𝑝𝑘0(𝑞), 𝑘) to CSS (Step 5). Next, CSS and CCS jointly execute secure protocols (Step 6), and return the final result set 𝑅𝑒𝑠𝑢𝑙𝑡, along with a verification object, to the QU.

5. DESM𝑘NN construction

This section first introduces an optimized two-stage search framework that supports efficient and secure multi-user 𝑘NN queries with dynamic POI updating. Subsequently, several well-designed secure protocols are proposed to enable private 𝑘NN search operations on the two-stage search framework.

5.1. Two-stage search framework

DESM𝑘NN adopts a two-stage search framework, which consists of an initial filtering stage based on hierarchical clustering to effectively constrain the search range, followed by a precise search stage to achieve efficient querying.

Initial Filtering Stage: DO first preprocesses the dataset by using hierarchical clustering to construct a suitable 𝑇𝑟𝑒𝑒𝑅. Each node in the tree is encrypted by using the CIPE𝑠.EncI algorithm to ensure security. The 𝑇̂𝑟𝑒𝑒𝑅 is then uploaded to ESs. When a QU at position (𝑥𝑞, 𝑦𝑞) initiates a query, they define a scope 𝐿 and construct a rectangle 𝑅𝑒𝑐𝑡𝑞 centered at (𝑥𝑞, 𝑦𝑞) with edge length 𝐿. Each dimension of 𝑅𝑒𝑐𝑡𝑞 is encrypted by using the CIPE𝑠.EncQ algorithm and sent to the nearby ES. The ES evaluates 𝑅𝑒𝑐𝑡̂𝑞 over 𝑇̂𝑟𝑒𝑒𝑅 to generate 𝐼𝑅, which efficiently narrows down the candidate objects.

Precise Search Stage: Once receiving (𝐸𝑝𝑘0(𝑞), 𝑘) and 𝐼𝑅 from the ES, the dual-cloud servers collaboratively execute secure protocols over the preprocessed dataset to obtain the exact 𝑘 nearest neighbors (𝑅𝑒𝑠𝑢𝑙𝑡). The servers also generate a verification object (𝑉𝑂) and send it with the 𝑅𝑒𝑠𝑢𝑙𝑡 back to the QU for checking. This stage ensures both accuracy and security of the 𝑘NN search.

5.2. Data pre-processing

To support DESM𝑘NN, DO preprocesses the dataset before outsourcing, which aims to protect sensitive information while retaining the structural relationships required for queries. First, DO constructs a
Y. Jia et al. Computer Standards & Interfaces 97 (2026) 104112
Voronoi diagram 𝑉𝐷 from the dataset 𝐷, and encrypts the coordinates of each POI and the query point 𝑞 using DT-PKC. For every POI 𝑝𝑖 ∈ 𝑉𝐷, a unique label ℓ𝑖 = 𝐻(𝑥𝑖|𝑦𝑖) is generated through the SHA-256 hash function, which serves as a compact identifier. Subsequently, DO obtains the neighborhood 𝑉𝑁(𝑝𝑖) and its corresponding label set ℓ𝑉𝑁(𝑝𝑖), then employs DT-PKC to encrypt the packaged 𝑉𝑁(𝑝𝑖) after applying data packaging technology [29]. This technique handles multiple values together, which makes encryption more straightforward. To guarantee integrity, a signature 𝑆𝐼𝐺𝑝𝑖 = 𝐻(𝐻(𝑝𝑖)|𝐻(𝑉𝑁(𝑝𝑖))) is created, where 𝐻(𝑉𝑁(𝑝𝑖)) is obtained by hashing all neighbors together as

𝐻(𝑉𝑁(𝑝𝑖)) = 𝐻(𝐻(𝑝𝑉𝑁1)|𝐻(𝑝𝑉𝑁2)|⋯|𝐻(𝑝𝑉𝑁𝑚𝑎𝑥)).

Intuitively, this signature ensures that any tampering with 𝑝𝑖 or its neighbors can be detected. Since homomorphic encryption requires uniform input length, DO also performs incremental obfuscation: if a POI has fewer neighbors than the maximum in 𝑉𝐷, dummy neighbors are added to conceal the actual degree. Afterward, each POI is represented by a sextuple

(𝐸𝑝𝑘0(𝑖𝑑), 𝐸𝑝𝑘0(𝑝𝑖), 𝐸𝑝𝑘0(𝑉𝑁(𝑝𝑖)), ℓ𝑖, ℓ𝑉𝑁(𝑝𝑖), 𝑆𝐼𝐺𝑝𝑖),

which combines encrypted attributes, hashed labels, and a verifiable signature.

To further protect access-pattern privacy, DO divides the sextuple table into buckets [8,9] of size 𝑤, which ensures that queries operate over fixed-size groups instead of revealing individual record accesses. Since the final bucket may not be completely filled, DO pads it with randomly generated dummy records, which prevents inference attacks [30,31] in which an adversary could deduce whether two queries target the same bucket based on its record count. At this point, DO completes preprocessing and securely outsources the bucketized sextuples to CSS.

5.3. Secure Squared Distance Computation (SSDC)

The goal of SSDC is to compute the secure squared distance without revealing any valid coordinate information to CSS and CCS. The process is shown in Algorithm 1.

Algorithm 1 Secure Squared Distance Computation
Require: CSS has 𝐸𝑝𝑘0(𝑥1), 𝐸𝑝𝑘0(𝑦1), 𝐸𝑝𝑘0(𝑥2), 𝐸𝑝𝑘0(𝑦2); CSS has 𝑆𝐾1, 𝑝𝑘0; CCS has 𝑆𝐾2, 𝑝𝑘0;
Ensure: 𝐸𝑝𝑘0(|𝑥1 − 𝑥2|² + |𝑦1 − 𝑦2|²);
// Calculation in CSS:
1: Choose 4 random numbers 𝑟1, 𝑟2, 𝑟4, 𝑟5 ∈ Z𝑁;
2: Randomly choose the functionality 𝐹 ∈ {0, 1};
3: if 𝐹 = 1 then
4:   𝐸𝑝𝑘0(𝐴) ← 𝐸𝑝𝑘0(𝑥1) · 𝐸𝑝𝑘0(𝑥2)^(𝑁−1);
5:   𝐸𝑝𝑘0(𝐵) ← 𝐸𝑝𝑘0(𝑦1) · 𝐸𝑝𝑘0(𝑦2)^(𝑁−1);
6: else if 𝐹 = 0 then
7:   Swap 𝑥1 with 𝑥2 and 𝑦1 with 𝑦2;
8: 𝑎′ ← 𝐸𝑝𝑘0(𝐴)^𝑟1, 𝑏′ ← 𝐸𝑝𝑘0(𝐵)^𝑟2;
9: 𝑎″ ← 𝑃𝑆𝐷𝑒𝑐1(𝑆𝐾1, 𝑎′), 𝑏″ ← 𝑃𝑆𝐷𝑒𝑐1(𝑆𝐾1, 𝑏′);
10: Send 𝑎′, 𝑏′, 𝑎″, 𝑏″ and 𝐸𝑝𝑘0(𝐴), 𝐸𝑝𝑘0(𝐵) to CCS;
// Calculation in CCS:
11: Choose a random number 𝑟3 ∈ Z𝑁;
12: 𝑎 ← 𝑃𝑆𝐷𝑒𝑐2(𝑆𝐾2, 𝑎′, 𝑎″), 𝑏 ← 𝑃𝑆𝐷𝑒𝑐2(𝑆𝐾2, 𝑏′, 𝑏″);
13: if 𝑎 > 0 then
14:   𝐸1 ← 𝐸𝑝𝑘0(𝐴);
15: else if 𝐸𝑝𝑘0(𝑟3) · 𝐸𝑝𝑘0(𝐴)^(𝑁−1) = 𝐸𝑝𝑘0(𝑟3) then
16:   𝐸1 ← 𝐸𝑝𝑘0(𝑟3)^0;
17: else
18:   𝐸1 ← 𝐸𝑝𝑘0(𝐴)^(𝑁−1);
19: Apply the same steps to 𝑏 to obtain 𝐸2;
20: Send 𝐸1, 𝐸2 to CSS;
// Calculation in CSS:
21: 𝑐′ ← 𝐸1 · 𝐸𝑝𝑘0(𝑟4);
22: 𝑐″ ← 𝑃𝑆𝐷𝑒𝑐1(𝑆𝐾1, 𝑐′);
23: Apply the same steps to 𝐸2, 𝑟5 to obtain 𝑑′, 𝑑″;
24: Send 𝑐′, 𝑐″, 𝑑′, 𝑑″ to CCS;
// Calculation in CCS:
25: 𝑐 ← 𝑃𝑆𝐷𝑒𝑐2(𝑆𝐾2, 𝑐′, 𝑐″);
26: 𝑠 ← 𝑐 · 𝑐;
27: Apply the same steps to 𝑑′, 𝑑″ to obtain 𝑑, 𝑧;
28: Send 𝐸𝑝𝑘0(𝑠), 𝐸𝑝𝑘0(𝑧) to CSS;
// Calculation in CSS:
29: ℒ1 ← 𝐸𝑝𝑘0(𝑠) · 𝐸1^(𝑁−𝑟4) · 𝐸1^(𝑁−𝑟4) · 𝐸𝑝𝑘0(𝑟4 · 𝑟4)^(𝑁−1);
30: ℒ2 ← 𝐸𝑝𝑘0(𝑧) · 𝐸2^(𝑁−𝑟5) · 𝐸2^(𝑁−𝑟5) · 𝐸𝑝𝑘0(𝑟5 · 𝑟5)^(𝑁−1);
31: 𝐷𝑖𝑠𝑡𝑎𝑛𝑐𝑒 = 𝐸𝑝𝑘0(|𝑥1 − 𝑥2|² + |𝑦1 − 𝑦2|²) ← ℒ1 · ℒ2;

Initially, CSS chooses 4 random numbers 𝑟1, 𝑟2, 𝑟4, 𝑟5 ∈ Z𝑁 and chooses the functionality 𝐹 ∈ {0, 1} (lines 1–2). If 𝐹 = 1, CSS calculates the encrypted coordinate differences 𝐸𝑝𝑘0(𝐴), 𝐸𝑝𝑘0(𝐵) (lines 3–5). If 𝐹 = 0, the procedure is the same except that the positions of 𝑥1 and 𝑥2, as well as 𝑦1 and 𝑦2, are swapped when computing the differences (lines 6–7). To mask these values and avoid direct leakage, CSS applies randomization with 𝑟1 and 𝑟2 (line 8). Subsequently, CSS partially decrypts the masked values 𝑎′, 𝑏′ by using the PSDec1 function to get 𝑎″, 𝑏″ (line 9). Eventually, CSS sends 𝑎′, 𝑏′, 𝑎″, 𝑏″ and 𝐸𝑝𝑘0(𝐴), 𝐸𝑝𝑘0(𝐵) to CCS (line 10).

Upon receiving this series of encrypted values from CSS, CCS chooses a random number 𝑟3 ∈ Z𝑁 and decrypts the received values to obtain 𝑎 and 𝑏 (lines 11–12). To conceal the sign information of the differences, CCS applies a randomized comparison procedure (lines 13–18). Specifically, depending on the outcome of comparing 𝑎 with 0 and related conditions, CCS distinguishes three possible cases and outputs 𝐸1 accordingly; this step prevents CSS from learning whether 𝑥1 − 𝑥2 or 𝑦1 − 𝑦2 is positive or negative. The same process is repeated for 𝑏 to obtain 𝐸2 (line 19). Finally, CCS returns 𝐸1, 𝐸2 to CSS (line 20).

Upon receiving 𝐸1, 𝐸2 from CCS, CSS further randomizes them with 𝑟4 and 𝑟5, then partially decrypts the results to produce (𝑐′, 𝑑′) and (𝑐″, 𝑑″), and sends these values to CCS (lines 21–24). CCS completes the decryption (line 25), squares the plaintexts to derive 𝑠 = 𝑐² and 𝑧 = 𝑑² (lines 26–27), and sends back 𝐸𝑝𝑘0(𝑠), 𝐸𝑝𝑘0(𝑧) (line 28). Finally, CSS combines these ciphertexts through homomorphic operations to obtain ℒ1 and ℒ2, and computes the secure squared distance as 𝐷𝑖𝑠𝑡𝑎𝑛𝑐𝑒 = ℒ1 · ℒ2 (lines 29–31).

5.4. Secure Minimum Computation (SMC)

The goal of SMC is to compare two secure squared distances obtained by SSDC, determine the smaller one, and also obtain the corresponding 𝑖𝑑𝑚𝑖𝑛 and ℓ𝑚𝑖𝑛. The process is shown in Algorithm 2.

Algorithm 2 Secure Minimum Computation
Require: CSS has 𝐸𝑝𝑘0(𝑑1), 𝐸𝑝𝑘0(𝑑2), 𝐸𝑝𝑘0(𝑖𝑑1), 𝐸𝑝𝑘0(𝑖𝑑2), 𝐸𝑝𝑘0(ℓ1), 𝐸𝑝𝑘0(ℓ2); CSS has 𝑆𝐾1, 𝑝𝑘0; CCS has 𝑆𝐾2, 𝑝𝑘0;
Ensure: 𝐸𝑝𝑘0(𝑑𝑚𝑖𝑛), 𝐸𝑝𝑘0(𝑖𝑑𝑚𝑖𝑛), 𝐸𝑝𝑘0(ℓ𝑚𝑖𝑛);
// Calculation in CSS:
1: Choose 7 random numbers 𝑟𝛼, 𝑟𝛽, 𝑟𝛾, 𝑟𝛿, 𝑟𝜖, 𝑟𝜁, 𝑟𝜂 ∈ Z𝑁;
2: Randomly choose the functionality 𝐹 ∈ {0, 1};
3: if 𝐹 = 1 then
4:   𝐸𝑝𝑘0(𝛼) ← (𝐸𝑝𝑘0(𝑑1) · 𝐸𝑝𝑘0(𝑑2)^(𝑁−1))^𝑟𝛼;
5:   𝐸𝑝𝑘0(𝛽) ← 𝐸𝑝𝑘0(𝑑1) · 𝐸𝑝𝑘0(𝑑2)^(𝑁−1) · 𝐸𝑝𝑘0(𝑟𝛽);
6:   𝐸𝑝𝑘0(𝛾) ← 𝐸𝑝𝑘0(𝑑2) · 𝐸𝑝𝑘0(𝑑1)^(𝑁−1) · 𝐸𝑝𝑘0(𝑟𝛾);
7:   𝐸𝑝𝑘0(𝛿) ← 𝐸𝑝𝑘0(𝑖𝑑1) · 𝐸𝑝𝑘0(𝑖𝑑2)^(𝑁−1) · 𝐸𝑝𝑘0(𝑟𝛿);
8:   𝐸𝑝𝑘0(𝜖) ← 𝐸𝑝𝑘0(𝑖𝑑2) · 𝐸𝑝𝑘0(𝑖𝑑1)^(𝑁−1) · 𝐸𝑝𝑘0(𝑟𝜖);
9:   𝐸𝑝𝑘0(𝜁) ← 𝐸𝑝𝑘0(ℓ1) · 𝐸𝑝𝑘0(ℓ2)^(𝑁−1) · 𝐸𝑝𝑘0(𝑟𝜁);
10:  𝐸𝑝𝑘0(𝜂) ← 𝐸𝑝𝑘0(ℓ2) · 𝐸𝑝𝑘0(ℓ1)^(𝑁−1) · 𝐸𝑝𝑘0(𝑟𝜂);
11: else if 𝐹 = 0 then
12:  Swap the roles of 𝑑1, 𝑖𝑑1, ℓ1 with 𝑑2, 𝑖𝑑2, ℓ2;
13: 𝛼1 ← 𝑃𝑆𝐷𝑒𝑐1(𝑆𝐾1, 𝐸𝑝𝑘0(𝛼));
14: Send 𝛼1, 𝐸𝑝𝑘0(𝛼), 𝐸𝑝𝑘0(𝛽), 𝐸𝑝𝑘0(𝛾), 𝐸𝑝𝑘0(𝛿), 𝐸𝑝𝑘0(𝜖), 𝐸𝑝𝑘0(𝜁), 𝐸𝑝𝑘0(𝜂) to CCS;
// Calculation in CCS:
15: 𝛼2 ← 𝑃𝑆𝐷𝑒𝑐2(𝑆𝐾2, 𝐸𝑝𝑘0(𝛼), 𝛼1);
16: if 𝐿𝑒𝑛𝑔𝑡ℎ(𝛼2) > 𝐿𝑒𝑛𝑔𝑡ℎ(𝑁)∕2 then
17:  𝑤 ← 1;
18: else
19:  𝑤 ← 0;
20: 𝐸𝑝𝑘0(𝜃) ← (𝐸𝑝𝑘0(𝛽)^(1−𝑤) · 𝐸𝑝𝑘0(𝛾)^𝑤)^(𝑁−1);
21: 𝐸𝑝𝑘0(𝜗) ← (𝐸𝑝𝑘0(𝛿)^(1−𝑤) · 𝐸𝑝𝑘0(𝜖)^𝑤)^(𝑁−1);
22: 𝐸𝑝𝑘0(𝜄) ← (𝐸𝑝𝑘0(𝜁)^(1−𝑤) · 𝐸𝑝𝑘0(𝜂)^𝑤)^(𝑁−1);
23: Send 𝑤, 𝐸𝑝𝑘0(𝜃), 𝐸𝑝𝑘0(𝜗), 𝐸𝑝𝑘0(𝜄) to CSS;
// Calculation in CSS:
24: if 𝑠 = 𝑤 then
25:  𝐸𝑝𝑘0(𝑑𝑚𝑖𝑛) = 𝐸𝑝𝑘0(𝑑2) · 𝐸𝑝𝑘0(𝜃) · 𝐸𝑝𝑘0(𝑤)^𝑟𝛾 · (𝐸𝑝𝑘0(1 − 𝑤))^𝑟𝛽;
26:  𝐸𝑝𝑘0(𝑖𝑑𝑚𝑖𝑛) = 𝐸𝑝𝑘0(𝑖𝑑2) · 𝐸𝑝𝑘0(𝜗) · 𝐸𝑝𝑘0(𝑤)^𝑟𝜖 · (𝐸𝑝𝑘0(1 − 𝑤))^𝑟𝛿;
27:  𝐸𝑝𝑘0(ℓ𝑚𝑖𝑛) = 𝐸𝑝𝑘0(ℓ2) · 𝐸𝑝𝑘0(𝜄) · 𝐸𝑝𝑘0(𝑤)^𝑟𝜂 · (𝐸𝑝𝑘0(1 − 𝑤))^𝑟𝜁;
28: else
29:  Swap the roles of 𝑑2, 𝑖𝑑2, ℓ2 with 𝑑1, 𝑖𝑑1, ℓ1.

To start with, CSS generates 7 random numbers and randomly selects a functionality 𝐹, in a manner similar to SSDC (lines 1–2). If 𝐹 = 1, CSS masks the differences between the distances, identifiers, and location labels by incorporating random numbers either as multiplicative factors or as exponents (lines 3–10). For example, the key design

𝐸𝑝𝑘0(𝛼) ← (𝐸𝑝𝑘0(𝑑1) · 𝐸𝑝𝑘0(𝑑2)^(𝑁−1))^𝑟𝛼

ensures that CCS cannot infer the exact magnitudes of 𝑑1 and 𝑑2, and, owing to the random choice of 𝐹, cannot guess which operand is smaller with probability better than 1∕2; this preserves the magnitude relationship while retaining semantic security. If 𝐹 = 0, the roles of 𝑑1 and 𝑑2 are swapped, and the same randomization procedure follows (lines 11–12). After randomization, CSS partially decrypts one of the masked values to obtain 𝛼1 and sends it together with the corresponding encrypted terms to CCS (lines 13–14).

Upon receiving these values, CCS decrypts 𝛼1 to obtain 𝛼2 (line 15). By checking whether the bit-length of 𝛼2 exceeds half the modulus size, CCS decides whether 𝑑1 or 𝑑2 is smaller, and records this decision in a flag 𝑤 (lines 16–19). Using 𝑤 and the remaining encrypted values from CSS, CCS computes three encrypted auxiliary terms that encode the correct selection of the minimum distance, identifier, and label (lines 20–22). These results, along with 𝑤, are then sent back to CSS (line 23).

At the end of Algorithm 2, CSS computes 3 encrypted values, 𝐸𝑝𝑘0(𝑑𝑚𝑖𝑛), 𝐸𝑝𝑘0(𝑖𝑑𝑚𝑖𝑛), 𝐸𝑝𝑘0(ℓ𝑚𝑖𝑛), via homomorphic operations. The computation distinguishes the cases 𝑠 = 𝑤 and 𝑠 ≠ 𝑤 (lines 24–29). In this way, the protocol securely determines the minimum distance and its associated information without revealing any intermediate values.

5.5. Secure Set Difference (SSD)

The goal of SSD is to securely compute the set difference between two encrypted sets, which allows CSS to obtain the elements in 𝑆1 that are not in 𝑆2 without exposing any plaintext values. To achieve this, CSS holds the encrypted sets 𝑆̂1 and 𝑆̂2 together with 𝑆𝐾1, while CCS holds 𝑆𝐾2. The protocol begins with CSS initializing an empty table and iteratively processing each encrypted element in 𝑆̂1 (lines 1–2). For each comparison with an element in 𝑆̂2, CSS generates a random blinding factor and constructs a masked comparison token that conceals the difference between the two values (lines 3–6). This token is then partially decrypted using 𝑆𝐾1, producing an auxiliary value that, together with the token, is stored in a permuted list under a pseudo-random permutation to prevent linkability (lines 7–9). After completing all comparisons, CSS sends the resulting table to CCS for further processing (line 10).

On the CCS side, the server initializes an empty set and parses the received tokens (lines 11–12). Each token is decrypted with 𝑆𝐾2, and whenever a decryption reveals equality between an element of 𝑆̂1 and 𝑆̂2, the corresponding index is added to the set (lines 13–15). This set, containing the indices of overlapping elements, is then returned to CSS (line 16). Finally, CSS uses the inverse permutation to locate the original positions and removes the identified elements from 𝑆̂1 (lines 17–19). The remaining encrypted elements constitute the secure set difference 𝑆̂′, which represents all values in 𝑆1 but not in 𝑆2 (line 20).

Algorithm 3 Secure Set Difference
Require: CSS has two sets of encrypted values 𝑆̂1 = {𝐸𝑝𝑘0(𝑥1), ..., 𝐸𝑝𝑘0(𝑥𝑀)} and 𝑆̂2 = {𝐸𝑝𝑘0(𝑦1), ..., 𝐸𝑝𝑘0(𝑦𝑇)}; CSS has 𝑆𝐾1; CCS has 𝑆𝐾2;
Ensure: CSS obtains an encrypted difference set 𝑆̂′;
// Calculation in CSS:
1: Initialize 𝑇 to an empty table;
2: for the 𝑖th element 𝐸𝑝𝑘0(𝑥𝑖) ∈ 𝑆̂1 do
3:   Initialize 𝑡 to an empty list;
4:   for all 𝐸𝑝𝑘0(𝑦𝑗) ∈ 𝑆̂2 in random order do
5:     Generate a random number 𝑟𝑖,𝑗;
6:     𝑡𝑖,𝑗[0] ← (𝐸𝑝𝑘0(𝑥𝑖) · 𝐸𝑝𝑘0(𝑦𝑗)^(𝑁−1))^𝑟𝑖,𝑗;
7:     𝑡𝑖,𝑗[1] ← 𝑃𝑆𝐷𝑒𝑐1(𝑆𝐾1, 𝑡𝑖,𝑗[0]);
8:     Append 𝑡𝑖,𝑗 to 𝑡;
9:   𝑇[𝜋(𝑖)] ← 𝑡;
10: Send 𝑇 to CCS;
// Calculation in CCS:
11: Initialize 𝑉 to an empty set;
12: for 𝑖 ∈ [𝑀] do
13:   Parse 𝑇[𝑖] as (𝑡𝑖,1, ..., 𝑡𝑖,𝑇);
14:   if ∃ 𝑡𝑖,𝑗 ∈ 𝑇[𝑖] such that 𝑃𝑆𝐷𝑒𝑐2(𝑆𝐾2, 𝑡𝑖,𝑗[0], 𝑡𝑖,𝑗[1]) reveals equality then
15:     Add 𝑖 into set 𝑉;
16: Send 𝑉 to CSS;
// Calculation in CSS:
17: for each element 𝑖 in 𝑉 do
18:   𝑗 ← 𝜋⁻¹(𝑖);
19:   Remove the 𝑗th element 𝐸𝑝𝑘0(𝑥𝑗) from 𝑆̂1;
20: 𝑆̂′ ← 𝑆̂1;

5.6. Secure Insertion (SI)

To support secure data insertion in databases, DESM𝑘NN innovatively proposes a secure insertion protocol. When DO inserts a new POI into the database, two key problems must be addressed.

• How to determine the insertion position of the POI?
• How to update 𝑇𝑟𝑒𝑒𝑅 and 𝑉𝐷?

The first problem can be effectively resolved by CIPE𝑠. First, DO generates an insertion query rectangle 𝑅𝑒𝑐𝑡𝑖𝑛𝑠 for the POI to be inserted, similar to generating a query rectangle 𝑅𝑒𝑐𝑡𝑞 for the query point 𝑞 in the initial filtering stage, where the 𝐿 of the rectangle can be customized. Then, DO encrypts each dimension of 𝑅𝑒𝑐𝑡𝑖𝑛𝑠 with the CIPE𝑠.EncQ algorithm and sends 𝑅𝑒𝑐𝑡̂𝑖𝑛𝑠 to the ES near the inserted POI. The ES evaluates the obtained 𝑅𝑒𝑐𝑡̂𝑖𝑛𝑠 over 𝑇̂𝑟𝑒𝑒𝑅 to obtain the insertion position. Once the insertion position is determined, the label of the inserted POI can be added to 𝑇𝑟𝑒𝑒𝑅, thus completing the update of 𝑇𝑟𝑒𝑒𝑅.

To address the problem of how to update 𝑉𝐷, the Bowyer-Watson algorithm [32,33] is introduced. The Bowyer-Watson algorithm is an
incremental method that updates 𝑉𝐷 by progressively updating the Delaunay triangulation. When inserting a new point, the algorithm first identifies all the affected triangles, then removes them and reconstructs the triangulation mesh by using the new point and the boundary of the cavity, which ensures that the new Delaunay triangulation is valid. Since 𝑉𝐷 and the Delaunay triangulation are duals, when the Delaunay triangulation is updated by using the Bowyer-Watson algorithm, 𝑉𝐷 is updated accordingly. When a new generating point is inserted, the shapes and boundaries of the Voronoi cells are adjusted. Therefore, DO obtains the updated Voronoi diagram based on the Bowyer-Watson algorithm and can obtain the encrypted id of the newly inserted POI, 𝐸𝑝𝑘0(𝑖𝑑𝑖𝑛𝑠); the encrypted inserted POI, 𝐸𝑝𝑘0(𝑝𝑖𝑛𝑠); the label of the newly inserted POI, ℓ𝑖𝑛𝑠; the encrypted Voronoi neighbors, 𝐸𝑝𝑘0(𝑉𝑁(𝑝𝑖𝑛𝑠)); the encrypted labels of the Voronoi neighbors, 𝐸𝑝𝑘0(ℓ𝑉𝑁(𝑝𝑖𝑛𝑠)); and the signature 𝑆𝐼𝐺𝑖𝑛𝑠 used for verification. Finally, these six values are organized into a tuple and sent to CSS for storage. As shown in Fig. 5, the secure insertion in the R-tree is highlighted with green lines.

Fig. 5. Secure insertion and deletion in R-tree. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

5.7. Secure Deletion (SD)

To support secure data deletion in databases, DESM𝑘NN innovatively proposes a secure deletion protocol. First, DO generates a deletion query rectangle 𝑅𝑒𝑐𝑡𝑑𝑒𝑙 for the POI to be deleted, where the 𝐿 of the rectangle can be customized. Then, DO encrypts each dimension of 𝑅𝑒𝑐𝑡𝑑𝑒𝑙 with the CIPE𝑠.EncQ algorithm and sends 𝑅𝑒𝑐𝑡̂𝑑𝑒𝑙 to the ES near the deleted POI. The ES evaluates the obtained 𝑅𝑒𝑐𝑡̂𝑑𝑒𝑙 over 𝑇̂𝑟𝑒𝑒𝑅 to obtain the deletion position.

Once the deletion position is determined, DO sends ℓ𝑑𝑒𝑙, the label of the POI, to the ES near the deleted POI. The ES deletes the POI label from the data at the deletion location based on the ℓ𝑑𝑒𝑙 sent by DO. At this point, the deletion update of 𝑇𝑟𝑒𝑒𝑅 is completed.

Similar to the SI protocol, DESM𝑘NN introduces a Delaunay triangulation-based dynamic deletion and update algorithm for Voronoi diagrams. The key idea behind the dynamic deletion and update algorithm is that Voronoi diagrams and Delaunay triangulations are dual to each other: the vertices of Delaunay triangles correspond to the vertices of the Voronoi diagram, and the edges of Delaunay triangles correspond to the edges of the Voronoi diagram. The Delaunay triangulation-based dynamic deletion and update algorithm leverages this duality to efficiently update the Voronoi diagram. When a point is deleted, the corresponding Delaunay triangles are removed, and the algorithm updates the connectivity of the affected neighboring triangles to maintain the Delaunay condition, which ensures that the triangulation is reconstructed. Then, based on the new Delaunay triangulation, the Voronoi diagram's boundaries are updated to ensure the correct topological structure of the diagram.

Similarly, DO obtains the updated 𝑉𝐷 along with the labels of the affected POIs ℓ𝑎𝑓𝑓𝑒𝑐𝑡𝑖, the encrypted Voronoi neighbors 𝐸𝑝𝑘0(𝑉𝑁(𝑝𝑎𝑓𝑓𝑒𝑐𝑡𝑖)), the encrypted labels of the Voronoi neighbors 𝐸𝑝𝑘0(ℓ𝑉𝑁(𝑝𝑎𝑓𝑓𝑒𝑐𝑡𝑖)), and the signature 𝑆𝐼𝐺𝑎𝑓𝑓𝑒𝑐𝑡𝑖 used for verification. Finally, these four values are organized into a quadruple and sent to CSS, which updates the database based on the labels of the affected POIs. As shown in Fig. 5, the secure deletion in the R-tree is highlighted with red lines.

Algorithm 4 Secure 𝑘NN Query
Require: CSS has 𝐼𝑅, 𝐸𝑝𝑘0(𝑞), 𝑆𝐾1; CCS has 𝑆𝐾2;
Ensure: CSS obtains the encrypted search result 𝑅𝑒𝑠𝑢𝑙𝑡;
// Calculations in CSS and CCS:
1: CSS initializes 𝑅, 𝐶, 𝐷𝑒 to empty sets;
2: for each triple (𝐸𝑝𝑘0(𝑝𝑖), 𝐸𝑝𝑘0(𝑖𝑑𝑖), 𝐸𝑝𝑘0(ℓ𝑖)) ∈ 𝐼𝑅 do
3:   CSS appends 𝐸𝑝𝑘0(𝑝𝑖) to 𝐶;
4: CSS with input (𝐶, 𝐸𝑝𝑘0(𝑞), 𝑆𝐾1, 𝑝𝑘0) and CCS with input (𝑆𝐾2, 𝑝𝑘0) run the SSDC protocol, and CSS obtains {𝐷𝑖𝑠𝑡𝑎𝑛𝑐𝑒1, ..., 𝐷𝑖𝑠𝑡𝑎𝑛𝑐𝑒|𝐶|};
5: if |𝐶| ≥ 𝑘 then
6:   CSS with input ({𝐷𝑖𝑠𝑡𝑎𝑛𝑐𝑒𝑖, 𝐸𝑝𝑘0(𝑖𝑑𝑖), 𝐸𝑝𝑘0(ℓ𝑖)} for 𝑖 = 1, ..., |𝐶|, and 𝑆𝐾1, 𝑝𝑘0) and CCS with input (𝑆𝐾2, 𝑝𝑘0) run the SMC protocol, and CSS puts 𝐸𝑝𝑘0(𝑖𝑑𝑖) for 𝑖 = 1, ..., 𝑘 into 𝑅𝑒𝑠𝑢𝑙𝑡;
7: else
8:   CSS with input ({𝐷𝑖𝑠𝑡𝑎𝑛𝑐𝑒𝑖, 𝐸𝑝𝑘0(𝑖𝑑𝑖), 𝐸𝑝𝑘0(ℓ𝑖)} for 𝑖 = 1, ..., |𝐶|, and 𝑆𝐾1, 𝑝𝑘0) and CCS with input (𝑆𝐾2, 𝑝𝑘0) run the SMC protocol, and CSS puts 𝐸𝑝𝑘0(𝑖𝑑1) into 𝑅𝑒𝑠𝑢𝑙𝑡 and puts 𝐸𝑝𝑘0(ℓ1) into 𝐷𝑒;
9:   CSS and CCS collaborate to run the SCR protocol to get the row corresponding to 𝐸𝑝𝑘0(𝑖𝑑1);
10:  CSS with input (𝐸𝑝𝑘0(𝑉𝑁(𝑝1)), 𝐷𝑒, 𝑆𝐾1) and CCS with input 𝑆𝐾2 run the SSD protocol, and CSS obtains 𝑉𝑁′(𝑝1);
11:  for 𝐸𝑝𝑘0(𝑝𝑗) ∈ 𝐸𝑝𝑘0(𝑉𝑁(𝑝1)) ∩ 𝑉𝑁′(𝑝1) do
12:    CSS puts 𝐸𝑝𝑘0(𝑝𝑗) into 𝐶 and 𝐸𝑝𝑘0(ℓ𝑗) into 𝐷𝑒;
13:  CSS and CCS collaborate to run the SSD and SMC protocols to select the POI closest to 𝑞 from 𝐶 again, removing it from 𝐶;
14:  CSS inserts 𝐸𝑝𝑘0(𝑖𝑑2) into 𝑅𝑒𝑠𝑢𝑙𝑡;
15:  while |𝑅| < 𝑘 do
16:    Repeat lines 9–14;

Algorithm 5 Secure Transformation
Require: CSS has 𝐸𝑝𝑘0(𝑎), 𝑆𝐾1; CCS has 𝑆𝐾2;
Ensure: CSS obtains 𝐸𝑝𝑘𝑢(𝑎);
// Calculations in CSS:
1: Choose one random number 𝑟 ∈ Z𝑁;
2: 𝐸𝑝𝑘0(𝛼) = 𝐸𝑝𝑘0(𝑎) · 𝐸𝑝𝑘0(𝑟);
3: 𝛼′ ← 𝑃𝑆𝐷𝑒𝑐1(𝑆𝐾1, 𝐸𝑝𝑘0(𝛼));
4: Send 𝐸𝑝𝑘0(𝛼), 𝛼′ to CCS;
// Calculations in CCS:
5: 𝛼 ← 𝑃𝑆𝐷𝑒𝑐2(𝑆𝐾2, 𝐸𝑝𝑘0(𝛼), 𝛼′);
6: Send 𝐸𝑝𝑘𝑢(𝛼) to CSS;
// Calculations in CSS:
7: 𝐸𝑝𝑘𝑢(𝑎) = 𝐸𝑝𝑘𝑢(𝛼) · 𝐸𝑝𝑘𝑢(𝑟)^(𝑁−1);

6. DESM𝑘NN query processing

This section provides a detailed introduction to DESM𝑘NN query processing, which consists of two parts: secure 𝑘NN query processing and verification processing.
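The blind/re-encrypt/unblind pattern of the Secure Transformation step (Algorithm 5) can be sketched over a toy additively homomorphic scheme. This is a conceptual model only, not DT-PKC: ciphertexts are modeled as tagged plaintexts modulo a toy modulus, and the two-step partial decryption (PSDec1/PSDec2) is collapsed into a single simulated decryption. All names here are illustrative:

```python
import secrets

N = 2**64 - 59  # toy modulus standing in for the DT-PKC modulus

def enc(pk, m):            # toy "encryption": tag the plaintext with its key
    return ("ct", pk, m % N)

def dec(pk, c):
    tag, key, m = c
    assert key == pk       # can only decrypt under the matching key
    return m

def h_add(c1, c2):         # homomorphic addition (ciphertext "multiplication")
    assert c1[1] == c2[1]
    return ("ct", c1[1], (c1[2] + c2[2]) % N)

def h_neg(c):              # exponent N-1 in the paper, i.e. homomorphic negation
    return ("ct", c[1], (-c[2]) % N)

def secure_transform(c0, pk0, pku):
    # CSS blinds a with random r; the servers jointly "decrypt" the blinded
    # value (simulated here); CCS re-encrypts it under pk_u; CSS then
    # homomorphically removes the blinding: E_pku(a + r - r) = E_pku(a).
    r = secrets.randbelow(N)
    blinded = h_add(c0, enc(pk0, r))   # E_pk0(a + r)
    alpha = dec(pk0, blinded)          # stands in for PSDec1 + PSDec2
    cu = enc(pku, alpha)               # CCS re-encrypts under pk_u
    return h_add(cu, h_neg(enc(pku, r)))

c = secure_transform(enc("pk0", 42), "pk0", "pku")
assert dec("pku", c) == 42
```

The point of the blinding is that the party performing the re-encryption only ever sees 𝑎 + 𝑟, never 𝑎 itself.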
6.1. Secure 𝑘NN query processing

Based on the comprehensive search framework, DESM𝑘NN proposes a secure and verifiable query processing strategy, which is divided into three steps as follows:

• Step 1. Calculating 𝑘 nearest neighbors: The specific details and procedures are illustrated in Algorithm 4. First, CSS creates three new sets: the result set 𝑅𝑒𝑠𝑢𝑙𝑡, the candidate set 𝐶, and the deduplication set 𝐷𝑒 (line 1). After the initial filtering stage, CSS has 𝐼𝑅 = {(𝐸𝑝𝑘0(𝑝𝑖), 𝐸𝑝𝑘0(𝑖𝑑𝑖), 𝐸𝑝𝑘0(ℓ𝑖))}. Next, CSS inserts each encrypted POI 𝐸𝑝𝑘0(𝑝𝑖) from 𝐼𝑅 into 𝐶 (lines 2–3). Since CSS has already stored the encrypted query point 𝐸𝑝𝑘0(𝑞), the SSDC protocol is executed for each intermediate POI to obtain the secure squared distance between each POI and the query point (line 4). If |𝐶| ≥ 𝑘, which means that the required 𝑘 POIs can be found in 𝐼𝑅, CSS and CCS collaborate to execute the SMC protocol to obtain the desired 𝑘 POIs (lines 5–6). If |𝐶| < 𝑘, CSS and CCS collaborate to execute the SMC protocol to obtain the nearest POI, insert the corresponding 𝐸𝑝𝑘0(𝑖𝑑1) into 𝑅𝑒𝑠𝑢𝑙𝑡, and insert the corresponding 𝐸𝑝𝑘0(ℓ1) into 𝐷𝑒 (lines 7–8). To further get the next nearest neighbor, CSS and CCS collaborate to execute the SCR protocol [8,9] to get the row corresponding to 𝐸𝑝𝑘0(𝑖𝑑1): 𝐸𝑝𝑘0(𝑉𝑁(𝑝1)), ℓ𝑉𝑁(𝑝1), 𝑆𝐼𝐺𝑝1 (line 9). CSS and CCS then collaborate to execute the SSD protocol with the two input sets ℓ𝑉𝑁(𝑝1) and 𝐷𝑒, and CSS obtains 𝑉𝑁′(𝑝1) (line 10). If a POI 𝐸𝑝𝑘0(𝑝𝑗) in 𝐸𝑝𝑘0(𝑉𝑁(𝑝1)) also exists in 𝑉𝑁′(𝑝1), 𝐸𝑝𝑘0(𝑝𝑗) is added to 𝐶 and 𝐸𝑝𝑘0(ℓ𝑗) is added to 𝐷𝑒 (lines 11–12). CSS and CCS collaborate to execute the SSD protocol and SMC protocol, which select the POI closest to the query point from 𝐶 again and remove it from 𝐶 (line 13). CSS inserts 𝐸𝑝𝑘0(𝑖𝑑2), which corresponds to the obtained point, into 𝑅𝑒𝑠𝑢𝑙𝑡 and checks whether the content of 𝑅𝑒𝑠𝑢𝑙𝑡 meets the requirements of the 𝑘NN query. If not, S𝑘Q repeats lines 9–14.

• Step 2. Generating the verification object: During secure 𝑘NN queries, DESM𝑘NN also needs to generate 𝑉𝑂. By collaborating to execute the SCR protocol, CSS and CCS can obtain 𝐸𝑝𝑘0(𝑉𝑁(𝑝𝑖)) and 𝑆𝐼𝐺𝑝𝑖 from the row which corresponds to 𝑝𝑖. Additionally, Algorithm 5 enables key conversion, which transforms 𝐸𝑝𝑘0(𝑉𝑁(𝑝𝑖)) into 𝐸𝑝𝑘𝑢(𝑉𝑁(𝑝𝑖)). At last, CSS adds the 𝐸𝑝𝑘𝑢(𝑉𝑁(𝑝𝑖)) and 𝐸𝑝𝑘𝑢(𝑆𝐼𝐺𝑝𝑖) of each result point into 𝑉𝑂.

• Step 3. Returning results and the verification object to QU: Based on the secure protocols we proposed, CSS can directly retrieve the final results encrypted with 𝑝𝑘𝑢 in order, without needing an additional transformation process. Therefore, CSS puts the final points into 𝑅𝑒𝑠𝑢𝑙𝑡 and sends it, along with 𝑉𝑂, to QU.

6.2. Verification processing

QU utilizes 𝑅𝑒𝑠𝑢𝑙𝑡 and 𝑉𝑂 to authenticate the correctness and completeness of 𝑅𝑒𝑠𝑢𝑙𝑡.

• Verifying correctness: Recall the definition of correctness described in the security model, which means that each returned point 𝑝 ∈ 𝑅𝑒𝑠𝑢𝑙𝑡 must remain unmodified and be an authentic entry in the original database. To verify the correctness of 𝑅𝑒𝑠𝑢𝑙𝑡, QU first decrypts 𝑉𝑂 by using his private key 𝑠𝑘𝑢 to obtain {𝑉𝑁(𝑝𝑖), 𝑆𝐼𝐺𝑝𝑖}. Next, QU uses the obtained 𝑉𝑁(𝑝𝑖) to compute 𝐻(𝑉𝑁(𝑝𝑖)) and further calculates 𝐻(𝐻(𝑝𝑖)|𝐻(𝑉𝑁(𝑝𝑖))) (the specific method has been detailed in Data Pre-processing). Finally, QU only needs to check whether 𝑆𝐼𝐺𝑝𝑖 matches the computed 𝐻(𝐻(𝑝𝑖)|𝐻(𝑉𝑁(𝑝𝑖))) to verify correctness.

• Verifying completeness: Similar to correctness, completeness is defined as follows: all the points returned are valid solutions to the 𝑘NN query, while the points not returned do not correspond to the actual answers. First, assume that 𝑝𝑖 represents the 𝑖th nearest point to the query point 𝑞 in 𝑅𝑒𝑠𝑢𝑙𝑡. Subsequently, based on the properties of the Voronoi diagram, 𝑉𝐶(𝑝𝑖) can be derived from 𝑉𝑁(𝑝𝑖) and 𝑝𝑖. The specific process is divided into four steps: (1) determine the coordinates of the neighboring points; (2) calculate the perpendicular bisectors between 𝑝𝑖 and each neighboring point; (3) identify the intersection points of all these perpendicular bisectors; these intersection points form the vertices of the polygon, which represents the Voronoi cell; (4) connect these vertices in either clockwise or counterclockwise order to form the Voronoi cell surrounding the point 𝑝𝑖. Thereafter, the final verification is conducted based on two important properties of the Voronoi diagram. The first step is to determine whether 𝑞 lies within 𝑉𝐶(𝑝1). If it does, 𝑝1 is confirmed as the nearest POI; otherwise, the verification process is terminated immediately. The second step is to test each point (except for 𝑝1) in 𝑅𝑒𝑠𝑢𝑙𝑡 individually, which determines whether 𝑝𝑖 ∈ {𝑉𝑁(𝑝1) ∪ ⋯ ∪ 𝑉𝑁(𝑝𝑖−1)}, 𝑖 > 1. If it does, 𝑝𝑖 is confirmed as the 𝑖th nearest POI.

7. Analysis

7.1. Computational complexity

To verify the efficiency of DESM𝑘NN, we analyze the computational complexity of all four entities involved in the system: DO, QU, ESs, and the dual-cloud servers. Let 𝑒𝑐 and 𝑑𝑐 denote the encryption and decryption operations of CIPE𝑆, and let 𝑒𝑑𝑡 and 𝑑𝑑𝑡 represent the encryption and decryption operations of DT-PKC.

(1) DO: In the data pre-processing stage, DO needs to generate 𝑇𝑟𝑒𝑒𝑅 and 𝑉𝐷 based on the database 𝐷. 𝑇𝑟𝑒𝑒𝑅 and the 𝑃𝐷 generated from 𝑉𝐷 are encrypted by using CIPE𝑆 and DT-PKC, respectively. Therefore, the total computational complexity is 𝑂(𝑛)𝑒𝑐 + 𝑂(𝑛 · 𝑀)𝑒𝑑𝑡, where 𝑀 represents the maximum number of neighbors in 𝑉𝐷.

(2) QU: Due to the key conversion mechanism in Algorithm 5, QU only needs to perform a single DT-PKC decryption to obtain the final result and 𝑉𝑂. Thus, the computational cost is 𝑂(1)𝑑𝑑𝑡.

(3) ESs: The ESs perform initial filtering by evaluating the encrypted query rectangle 𝑅𝑒𝑐𝑡̂𝑞 over the encrypted R-tree 𝑇̂𝑟𝑒𝑒𝑅 to generate the intermediate result set 𝐼𝑅. Their total computational complexity is 𝑂(log 𝑛)𝑑𝑐.

(4) Dual-Cloud Servers: The dual-cloud servers undertake the precise search stage and therefore incur the highest computational complexity, as this stage requires executing several secure subprotocols. Specifically, the SSDC protocol is used to compute the secure squared distance between the query point 𝑞 and each POI in the intermediate result set 𝐼𝑅. The SMC protocol is responsible for comparing encrypted distance values and obtaining the corresponding encrypted identifiers and location records; to determine the nearest POI among 𝑛 candidates, the SMC protocol must be executed 𝑛 − 1 times. In addition, the SSD protocol computes the set difference between two encrypted sets and must perform DT-PKC decryption |𝑆̂1| · |𝑆̂2| times. The overall complexity depends on whether the number of candidates in 𝐼𝑅 is greater than or smaller than 𝑘. When |𝐼𝑅| > 𝑘, the S𝑘Q protocol repeatedly invokes the SMC protocol to iteratively determine the top-𝑘 POIs, which requires (|𝐼𝑅| − 1 + |𝐼𝑅| − 𝑘) · 𝑘∕2 executions in total. In this case, the computational complexity of the precise search stage is 𝑂(|𝐼𝑅| · 𝑘)(𝑒𝑑𝑡 + 𝑑𝑑𝑡).
When |𝐼𝑅| < 𝑘, the nearest POI is first identified by using |𝐼𝑅| − 1 SMC comparisons. Next, the SCR protocol is executed to locate the bucket row containing this POI, after which the remaining 𝑘 − 1 POIs are obtained through the subsequent steps of S𝑘Q. In this case, the computational complexity of the precise search stage is

𝑂(|𝐼𝑅| + 𝑘² · 𝑀)𝑒𝑑𝑡 + 𝑂(|𝐼𝑅| + 𝑘 · (√𝑛 + 𝑘 · 𝑀))𝑑𝑑𝑡,

where 𝑀 denotes the maximum number of neighbors in the Voronoi diagram. The comparison results between DESM𝑘NN and existing secure 𝑘NN query schemes are summarized in Table 3.

Moreover, the computational complexity of POI insertion and deletion in DESM𝑘NN is 𝑂(log 𝑛 + log(𝑀1)) on average, which is asymptotically equivalent to 𝑂(log(𝑀1 · 𝑛)). Here, 𝑀1 represents the number of neighboring POIs affected by the local Voronoi diagram update. This complexity arises from updating the encrypted R-tree and locally maintaining the Voronoi diagram.

Table 3
Computational complexity of existing approaches and DESM𝑘NN.

DESM𝑘NN — DO: 𝑂(𝑛)𝑒𝑐 + 𝑂(𝑛 · 𝑀)𝑒𝑑𝑡; QU: 𝑂(1)𝑑𝑑𝑡; ES: 𝑂(log 𝑛)𝑑𝑐; Dual-cloud servers: 𝑂(|𝐼𝑅| · 𝑘)(𝑒𝑑𝑡 + 𝑑𝑑𝑡) when |𝐼𝑅| > 𝑘, and 𝑂(|𝐼𝑅| + 𝑘² · 𝑀)𝑒𝑑𝑡 + 𝑂(|𝐼𝑅| + 𝑘 · (√𝑛 + 𝑘 · 𝑀))𝑑𝑑𝑡 when |𝐼𝑅| < 𝑘.
MSV𝑘NN [9] — DO: 𝑂(𝑚² · 𝑔 + 𝑛 · 𝑀)𝑒𝑑𝑡; QU: 𝑂(1)𝑑𝑑𝑡; Dual-cloud servers: 𝑂(𝑘 · (𝑛 + 𝑀))𝑒𝑑𝑡 + 𝑂(𝑘 · (√𝑛 + 𝑀))𝑑𝑑𝑡.
SecVKQ [14] — DO: 𝑂(𝑛)𝑒𝑐 + 𝑂(𝑛 · 𝑀)𝑒𝑝; QU: 𝑂(1)(𝑒𝑐 + 𝑒𝑝); ES: 𝑂(log 𝑛)𝑑𝑐; Dual-cloud servers: 𝑂(|𝐼𝑅| · 𝑘)(𝑒𝑝 + 𝑑𝑝) when |𝐼𝑅| > 𝑘, and 𝑂(|𝐼𝑅| + 𝑘² · 𝑀)(𝑒𝑝 + 𝑑𝑝) when |𝐼𝑅| < 𝑘.
SV𝑘NN [8] — DO: 𝑂(𝑚² · 𝑔 + 𝑛 · 𝑀)𝑒𝑝; QU: 𝑂(1)𝑑𝑝; Dual-cloud servers: 𝑂(𝑘 · (𝑛 + 𝑀))𝑒𝑝 + 𝑂(𝑘 · (√𝑛 + 𝑀))𝑑𝑝.

Notations: 𝑛 represents the size of dataset 𝐷, 𝑘 represents the search parameter for the 𝑘NN search, and 𝑀 represents the maximal number of Voronoi neighbors. 𝑚 refers to the number of grids, while 𝑔 represents the maximum number of grid points, as discussed in [8,9].

7.2. Communication complexity

In this subsection, the communication cost incurred during the entire query processing is evaluated. Table 4 presents the communication cost of DESM𝑘NN compared with MSV𝑘NN. It is observed that DESM𝑘NN consistently incurs the lowest communication cost. These experimental results align well with the theoretical analysis.

Table 4
Comparison of communication costs (MB) under the setting of 𝐾 = {1024, 2048}.

                 DESM𝑘NN                              MSV𝑘NN
𝑛         California       San Francisco       California       San Francisco
         1024    2048     1024    2048        1024    2048     1024    2048
1024     6.1     12.7     5.9     12.3        6.5     13.1     6.1     12.4
2048     12.8    27.8     11.9    25.6        14.3    31.4     13.9    30.7

7.3. Security analysis

To establish the security of the proposed subprotocols, it is important to highlight that the semantic security of the DT-PKC cryptosystem has been proven in [19]. Additionally, in accordance with the formal security definition of multiparty computation introduced in [29] and [34], the framework of the simulation paradigm proposed in [35] is adopted. Specifically, the simulation paradigm requires that the view of each participant in the protocol can be simulated based solely on its input and output, which ensures that no participant gains any additional information from the protocol. In other words, the real execution of each subprotocol is computationally indistinguishable from its simulated counterpart. For clarity, SSDC and SMC are formally demonstrated as examples; the other protocols we proposed can be proven in a similar manner.

Theorem 1. The DT-PKC cryptosystem described in Section 3 is semantically secure under the assumed intractability of the DDH problem over Z_{𝑁²}. This ensures that ciphertexts produced by DT-PKC reveal no information about the underlying plaintexts, even to computationally bounded adversaries (the details of the proof can be found in [19]).

Theorem 2 (Composition Theorem [35]). If a protocol is composed of multiple subprotocols, each of which is secure under the simulation paradigm, and all intermediate values are either random or pseudorandom, the composed protocol is secure. This theorem allows the security of DESM𝑘NN to be deduced from the security of its individual subprotocols.

Theorem 3 (Security of SSDC). Assuming DT-PKC is semantically secure, the SSDC subprotocol securely computes encrypted squared distances between the query point and candidate points in 𝐼𝑅 against semi-honest adversaries.

Proof. In SSDC, the cloud server's view consists of the values 𝑎′, 𝑏′, 𝑎″, 𝑏″, which are derived from plaintext differences scaled by random factors, and the encrypted comparison results 𝐸1, 𝐸2. The simulated view Π^𝑠_𝐶𝐶𝑆(𝑆𝑆𝐷𝐶) is constructed by sampling all elements uniformly at random from the appropriate domain. The semantic security of DT-PKC ensures that 𝑎′, 𝑏′, 𝑎″, 𝑏″ are computationally indistinguishable from the corresponding simulated values (𝑎′𝑠, 𝑏′𝑠, 𝑎″𝑠, 𝑏″𝑠). Similarly, the randomized encryption of the comparison outcomes 𝐸1, 𝐸2 ensures that these values are indistinguishable from their simulated counterparts 𝐸1𝑠, 𝐸2𝑠. This demonstrates that the real execution reveals no additional information beyond what is contained in the input and output, which confirms the security of SSDC. For CSS, the execution image is Π_𝐶𝑆𝑆(𝑆𝑆𝐷𝐶) = {𝐸1, 𝐸2}, and the simulated image is Π^𝑠_𝐶𝑆𝑆(𝑆𝑆𝐷𝐶) = {𝐸1𝑠, 𝐸2𝑠}. Since 𝐸1, 𝐸2 are produced by randomized procedures, they are computationally indistinguishable from 𝐸1𝑠, 𝐸2𝑠, which further supports the security argument.

Theorem 4 (Security of SMC). Assuming DT-PKC is semantically secure, the SMC protocol securely compares encrypted distance values and returns the corresponding encrypted identifiers and labels.

Proof. In SMC, the server's view contains the ciphertexts (𝐸𝑝𝑘0(𝛼), 𝛼1, 𝛼2) and a local output bit 𝑤. The simulated view Π^𝑠_𝐶𝐶𝑆(𝑆𝑀𝐶) is obtained by sampling all elements randomly. Semantic security guarantees that (𝐸𝑝𝑘0(𝛼), 𝛼1) are indistinguishable from their simulated counterparts (𝐸𝑝𝑘0(𝛼)𝑠, 𝛼1𝑠). Additionally, 𝛼2 is derived from random coin flips and is indistinguishable from 𝛼2𝑠. The local output bit 𝑤 also matches the distribution of the simulated 𝑤𝑠. Hence, the simulated view is computationally indistinguishable from the real view, which confirms the security of SMC.

Theorem 5 (Security of DESM𝑘NN). If DT-PKC is semantically secure, DESM𝑘NN is secure under the semi-honest model.

Proof. Since each subprotocol (SSDC, SMC, SSD, and others) produces views indistinguishable from their respective simulated views, and all
9
Y. Jia et al. Computer Standards & Interfaces 97 (2026) 104112
Fig. 6. The data processing time with varying parameters.
Fig. 7. Comparison of search time between MSV𝑘NN and DESM𝑘NN on two datasets (𝑘 = 1 to 10).
intermediate values are either DT-PKC ciphertexts or explicitly ran- 8.1. Parameter setting
domized, the composition theorem applies. Consequently, the overall
DESM𝑘NN protocol is secure, ensuring confidentiality of the database, The evaluation of DESM𝑘NN is carried out on a system equipped
privacy of queries, and integrity of computation. with an Intel Core i7-14650HQ processor, clocked at 2.80 GHz, and
16 GB of RAM, which runs Windows 11. For this purpose, the DT-
In DESM𝑘NN, a quantitative security comparison across existing
PKC cryptosystem is implemented by using the JAVA development kit,
methods is not conducted due to significant differences in their threat
models, cryptographic assumptions, and supported functionalities, which forms the core element of the proposed protocol.
which make such evaluation extremely difficult. Instead, DESM𝑘NN In the experiment, the dataset size 𝑛 ranges from 1024 to 2024. The
focuses on formally achieving and proving multiple security properties search parameter 𝑘 is set between 1 and 10. The key size 𝐾 of the DT-
that prior methods do not simultaneously provide. DESM𝑘NN ensures PKC cryptosystem are selected from {1024, 2048, 3072}. These settings
data privacy, query privacy, result privacy, and access patterns privacy, apply to all values of 𝑛, 𝑘, 𝐾 in the experiment. While implementing the
while also supporting result verification, multi-user querying, and MSV𝑘NN and SV𝑘NN schemes, the grid granularity is fixed at 90 and
dynamic updates to the encrypted POIs database in outsourced POIs the cryptographic hash functions are implemented via HMAC-SHA-256.
queries, which prior methods cannot achieve simultaneously.
8.2. Experiment results
8. Experimental evaluation
The following analysis of the experimental results will focus on DO
This section evaluates the computational cost of DESM𝑘NN by us- and Dual-Cloud Servers. It should be noted that the experiment results
ing real-world datasets for spatial databases: California Road Network for the CIPE𝑠 scheme are not included, as its execution time is negligible
and San Francisco Road Network. A comparison is made between compared to the DT-PKC cryptosystem. For example, the CIPE𝑠 scheme
DESM𝑘NN and scheme MSV𝑘NN [9] in different phases. takes less than 1 s to retrieve 𝐼𝑅 from 1 million POIs.
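The parameter settings above note that the grid-based comparison schemes (MSV𝑘NN, SV𝑘NN) realize their cryptographic hash functions with HMAC-SHA-256. A minimal sketch of such keyed grid-cell labeling, with a hypothetical key and cell-identifier format (the schemes' actual encoding is not specified here):

```python
import hashlib
import hmac

def grid_tag(key: bytes, cell_id: str) -> str:
    """Keyed label for a grid cell: deterministic for the key holder,
    unpredictable without the key."""
    return hmac.new(key, cell_id.encode(), hashlib.sha256).hexdigest()

key = b"demo-key"                                  # hypothetical secret key
assert grid_tag(key, "row12:col34") == grid_tag(key, "row12:col34")
assert grid_tag(key, "row12:col34") != grid_tag(key, "row12:col35")
assert len(grid_tag(key, "row12:col34")) == 64     # hex-encoded SHA-256 digest
```

Because the label is keyed, an outsourced server can match equal cells by tag without learning which physical grid cell a tag denotes.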
Fig. 8. Comparison of search time between MSV𝑘NN and DESM𝑘NN on two datasets (𝐾 = 1024 to 3072).
Fig. 9. Comparison of search time between MSV𝑘NN and DESM𝑘NN on two datasets (𝑛 = 1024 to 2024).
Fig. 10. The search time of DESM𝑘NN on two datasets (𝐾 = 1024 to 3072).
• DO: The execution time in data preprocessing is shown in Fig. 6. The computational cost includes two components: the cost of encrypting 𝑉𝐷 and the cost of generating 𝑆𝐼𝐺. Experiment results show that MSV𝑘NN and SV𝑘NN require additional operations such as grid partition, grid padding, and grid encryption, and thus perform worse in this stage.

• Dual-Cloud Servers: As shown in Section 7, the execution time in the search stage is influenced by the parameters 𝑛, 𝑘, 𝐾. Experiments are conducted under different parameter settings to demonstrate the effectiveness of DESM𝑘NN. We can observe that the search time of DESM𝑘NN is significantly shorter than MSV𝑘NN, as shown in Figs. 7–9, primarily because MSV𝑘NN incurs a high computational cost when executing the critical SGC protocol. Please note that in Fig. 7, both datasets (California Road Network and Points of Interest, San Francisco Road Network) are real-world datasets, where realistic POI distributions result in consistent performance gaps between DESM𝑘NN and MSV𝑘NN. Moreover, real-world datasets often exhibit a high density of POIs. Due to the grid partitioning mechanism, MSV𝑘NN tends to be inefficient when handling real-world datasets. For example, in the California road network dataset, when setting the fine-grained grid parameter 𝑚 in MSV𝑘NN to 32 (which is the optimal parameter for MSV𝑘NN), the number of POIs contained within each grid reaches as high as 108. To utilize data packing techniques, the parameter 𝐾 needs to be adjusted to no less than 4096, which results in extremely high computational costs. However, in DESM𝑘NN, well-designed data structures are employed to regulate the number of POIs
per partition, which keeps 𝐾 within a reasonable range and prevents excessive computational overhead. As shown in Fig. 10, when 𝐼𝑅 is smaller than the query parameter 𝑘, the query time is significantly higher compared to when 𝐼𝑅 exceeds 𝑘, since the CS need to perform more calculations related to homomorphic encryption. For a given scheme, larger values of 𝑘 and 𝑛 increase query time by expanding the search space and raising computational demands. Likewise, a larger 𝐾 leads to longer plaintexts for encryption, which adds overhead from cryptographic operations.

In general, it can be concluded that DESM𝑘NN not only meets the security requirements mentioned in Section 4 but also achieves higher efficiency than scheme MSV𝑘NN in all stages of POI queries, with an improvement of up to 45.5%.

9. Conclusion

This paper proposes efficient and secure multi-user 𝑘NN queries with dynamic POI updating, which preserves the privacy of data, queries, results, and access patterns, and ensures that the results are correct and complete in a multi-user environment. Firstly, DESM𝑘NN proposes a two-stage search framework to accelerate query speed. Secondly, DESM𝑘NN designs a series of novel secure protocols and a compact verification strategy to facilitate the operation over the two-stage search framework. Finally, computational complexity, security analysis and experimental evaluation demonstrate that DESM𝑘NN improves query efficiency by up to 45.5% compared to MSV𝑘NN. In future research, we plan to study 𝑘NN queries for multi-type POIs to address the limitation of single-type POI scenarios, where query results are too homogeneous. Moreover, we will focus more on exploring the balance between security and efficiency.

CRediT authorship contribution statement

Yining Jia: Writing – original draft, Software, Methodology, Investigation, Conceptualization. Yali Liu: Writing – review & editing, Resources. Congai Zeng: Writing – review & editing. Xujie Ding: Writing – review & editing. Jianting Ning: Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors thank the editor and the reviewers for their comments and suggestions. This work was supported by the National Natural Science Foundation of China under Grant No. 61702237, No. 62425205, and No. 12441101, the Opening Foundation of the State Key Laboratory for Novel Software Technology, Nanjing University under Grant No. KFKT2025B54, the Science and Technology Planning Foundation of Xuzhou City under Grant No. KC22052, the Opening Foundation of the Guangxi Key Laboratory of Cryptography and Information Security, Guilin University of Electronic Technology under Grant GCIS202114, the Postgraduate Research & Practice Innovation Program of Jiangsu Normal University under Grant 2024XKT2579, and the University-Industry Collaborative Education Program of China under Grant No. 202101374001. All authors have read and approved the final version of the manuscript.

Data availability

Data will be made available on request.

References

[1] R. Li, A. Liu, A. Wang, Fast and scalable range query processing with strong privacy protection for cloud computing, IEEE/ACM Trans. Netw. 24 (4) (2015) 2305–2318.
[2] G. Xiao, F. Wu, X. Zhou, K. Li, Probabilistic top-k range query processing for uncertain databases, J. Intell. Fuzzy Syst. 31 (2) (2016) 1109–1120.
[3] K. Xue, S. Li, J. Hong, Y. Xue, N. Yu, P. Hong, Two-cloud secure database for numeric-related SQL range queries with privacy preserving, IEEE Trans. Inf. Forensics Secur. 12 (7) (2017) 1596–1608.
[4] Y. Miao, Y. Yang, X. Li, K.-K.R. Choo, X. Meng, R.H. Deng, Comprehensive survey on privacy-preserving spatial data query in transportation systems, IEEE Trans. Intell. Transp. Syst. 24 (12) (2023) 13603–13616.
[5] Y. Zhang, B. Wang, Z. Zhao, Verifiable and privacy-preserving 𝑘-NN query scheme with multiple keys, IEEE Trans. Big Data 11 (3) (2024) 1434–1446.
[6] Q. Liu, Y. Peng, J. Wu, T. Wang, G. Wang, Secure multi-keyword fuzzy searches with enhanced service quality in cloud computing, IEEE Trans. Netw. Serv. Manag. 18 (2) (2021) 2046–2062.
[7] Q. Liu, Y. Peng, Q. Xu, H. Jiang, J. Wu, T. Wang, T. Peng, G. Wang, S. Zhang, MARS: Enabling verifiable range-aggregate queries in multi-source environments, IEEE Trans. Dependable Secur. Comput. 21 (4) (2024) 1994–2011.
[8] N. Cui, X. Yang, B. Wang, J. Li, G. Wang, SVkNN: Efficient secure and verifiable k-nearest neighbor query on the cloud platform, in: Proc. of ICDE, 2020, pp. 253–264.
[9] N. Cui, K. Qian, T. Cai, J. Li, X. Yang, J. Cui, H. Zhong, Towards multi-user, secure, and verifiable 𝑘NN query in cloud database, IEEE Trans. Knowl. Data Eng. 35 (9) (2023) 9333–9349.
[10] H. Xie, Y. Guo, X. Jia, A privacy-preserving online ride-hailing system without involving a third trusted server, IEEE Trans. Inf. Forensics Secur. 16 (2021) 3068–3081.
[11] W. Wong, D. Cheung, B. Kao, N. Mamoulis, Secure kNN computation on encrypted databases, in: Proc. of SIGMOD, 2009, pp. 139–152.
[12] Y. Zhu, R. Xu, T. Takagi, Secure k-NN computation on encrypted cloud data without sharing key with query users, in: Proc. of IWSEC, 2013, pp. 55–60.
[13] B. Yao, F. Li, X. Xiao, Secure nearest neighbor revisited, in: Proc. of ICDE, 2013, pp. 733–744.
[14] Q. Liu, Z. Hao, Y. Peng, H. Jiang, J. Wu, T. Peng, G. Wang, S. Zhang, SecVKQ: Secure and verifiable kNN queries in sensor-cloud systems, J. Syst. Archit. 120 (2021) 102300.
[15] Y. Elmehdwi, B.K. Samanthula, W. Jiang, Secure k-nearest neighbor query over encrypted data in outsourced environments, in: Proc. of ICDE, 2014, pp. 664–675.
[16] S. Choi, G. Ghinita, H.-S. Lim, E. Bertino, Secure kNN query processing in untrusted cloud environments, IEEE Trans. Knowl. Data Eng. 26 (11) (2014) 2818–2831.
[17] K. Cheng, L. Wang, Y. Shen, H. Wang, Y. Wang, X. Jiang, H. Zhong, Secure k-NN query on encrypted cloud data with multiple keys, IEEE Trans. Big Data 7 (4) (2021) 689–702.
[18] A. Boldyreva, N. Chenette, Y. Lee, A. O'Neill, Order-preserving symmetric encryption, in: Proc. of EUROCRYPT, 2009, pp. 224–241.
[19] X. Liu, R.H. Deng, K.-K.R. Choo, J. Weng, An efficient privacy-preserving outsourced calculation toolkit with multiple keys, IEEE Trans. Inf. Forensics Secur. 11 (11) (2016) 2401–2414.
[20] K. Cheng, Y. Shen, Y. Wang, L. Wang, J. Ma, X. Jiang, C. Su, Strongly secure and efficient range queries in cloud databases under multiple keys, in: Proc. of INFOCOM, 2019, pp. 2494–2502.
[21] S.K. Nayak, S. Tripathy, SEMKC: Secure and efficient computation over outsourced data encrypted under multiple keys, IEEE Trans. Emerg. Top. Comput. 9 (1) (2018) 414–428.
[22] A. Okabe, B. Boots, K. Sugihara, S. Chiu, Spatial tessellations: Concepts and applications of Voronoi diagrams, College Math. J. (2001).
[23] Y. Manolopoulos, A. Nanopoulos, A.N. Papadopoulos, Y. Theodoridis, R-Trees: Theory and Applications, Springer Science & Business Media, 2006.
[24] N. Cui, D. Wang, H. Zhu, J. Li, J. Xu, X. Yang, Enabling verifiable and secure range query in multi-user setting under cloud environments, IEEE Trans. Knowl. Data Eng. 36 (12) (2024) 8148–8163.
[25] Q. Liu, S. Wu, S. Pei, J. Wu, T. Peng, G. Wang, Secure and efficient multi-attribute range queries based on comparable inner product encoding, in: Proc. of CNS, 2018, pp. 1–9.
[26] Y. Zhang, B. Wang, Z. Zhao, Secure k-NN query with multiple keys based on random projection forests, IEEE Internet Things J. 11 (9) (2023) 15205–15218.
[27] S. Wu, Q. Li, G. Li, D. Yuan, X. Yuan, C. Wang, ServeDB: Secure, verifiable, and efficient range queries on outsourced database, in: Proc. of ICDE, 2019, pp. 626–637.
[28] H.-I. Kim, H.-J. Kim, J.-W. Chang, A secure kNN query processing algorithm using homomorphic encryption on outsourced database, Data Knowl. Eng. 123 (2019) 101602.
[29] A. Liu, K. Zheng, L. Li, G. Liu, L. Zhao, X. Zhou, Efficient secure similarity computation on encrypted trajectory data, in: Proc. of ICDE, 2015, pp. 66–77.
[30] P. Williams, R. Sion, B. Carbunar, Building castles out of mud: Practical access pattern privacy and correctness on untrusted storage, in: Proc. of CCS, 2008, pp. 139–148.
[31] M.S. Islam, M. Kuzu, M. Kantarcioglu, Access pattern disclosure on searchable encryption: Ramification, attack and mitigation, in: Proc. of NDSS, vol. 20, 2012, p. 12.
[32] A. Bowyer, Computing Dirichlet tessellations, Comput. J. 24 (2) (1981) 162–166.
[33] D.F. Watson, Computing the n-dimensional Delaunay tessellation with application to Voronoi polytopes, Comput. J. 24 (2) (1981) 167–172.
[34] J. Liu, J. Yang, L. Xiong, J. Pei, Secure skyline queries on cloud platform, in: Proc. of ICDE, 2017, pp. 633–644.
[35] A.C.-C. Yao, How to generate and exchange secrets, in: Proc. of SFCS, 1986, pp. 162–167.

Yining Jia received his B.Sc. in Computer Science and Technology in 2023 from Nanjing Forestry University, China. Currently, he is pursuing the M.Sc. degree in the School of Artificial Intelligence and Computer Science at Jiangsu Normal University, China. His research interests include data privacy, query processing, and information security.

Yali Liu received her Ph.D. in 2014 from Nanjing University of Aeronautics and Astronautics, China. She is a senior member of the China Computer Federation (CCF). She has been a Research Scientist at Nanyang Technological University, Singapore. She is currently a Professor in the School of Artificial Intelligence and Computer Science at Jiangsu Normal University, China. Her research interests include information security, authentication and privacy-preserving technology, blockchain security and privacy, vehicular ad hoc networks, and cryptographic algorithms and protocols and their applications in the Internet of things and mobile communication.

Congai Zeng received her M.Sc. in Electronic Information in 2024 from Jiangsu Normal University, China. Currently, she is pursuing the Ph.D. degree in the Faculty of Information Technology at Beijing University of Technology, China. Her research interests include Internet of Vehicles security and privacy.

Xujie Ding received his B.Sc. in Software Engineering in 2023 from Jiangsu Normal University, China. Currently, he is pursuing the M.Sc. degree in the School of Artificial Intelligence and Computer Science at Jiangsu Normal University, China. His research interests include privacy preservation and secure data sharing technology in smart healthcare.

Jianting Ning received his Ph.D. in 2016 from Shanghai Jiao Tong University, China. He has been a Research Scientist at the School of Computing and Information Systems, Singapore Management University, and a Research Fellow at the National University of Singapore. His research interests include applied cryptography and information security. He is currently a Professor with the School of Cyber Science and Engineering, Wuhan University, China, and with the Faculty of Data Science, City University of Macau, China. He has published papers in major conferences/journals, such as ACM CCS, NDSS, ASIACRYPT, ESORICS, ACSAC, IEEE Transactions on Information Forensics and Security, and IEEE Transactions on Dependable and Secure Computing.
View File
@@ -0,0 +1,654 @@
Journal of Systems Architecture 160 (2025) 103347
Contents lists available at ScienceDirect
Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc
Eliminating duplicate writes of logging via no-logging flash translation layer
in SSDs
Zhenghao Yin a, Yajuan Du a,∗, Yi Fan a, Sam H. Noh b
a Wuhan University of Technology, Wuhan, 430070, Hubei Province, China
b Virginia Tech, Blacksburg, 24061-0326, VA, USA
ARTICLE INFO

Keywords:
Flash memory
Transaction
Flash translation layer
Duplicate writes

ABSTRACT

With the development of high-density flash memory techniques, SSDs have achieved high performance and large capacity. Databases often use logging to ensure transactional atomicity of data updates. However, it introduces duplicate writes because of multi-versioning, which significantly weakens the performance and endurance of SSDs. This is also often considered as the main reason for slow response of databases. This paper proposes a novel flash translation layer (FTL) for SSDs, which we refer to as NoLgn-FTL, to reduce the overhead of logging-induced duplicate writes by exploiting the inherent multi-version feature of flash memories. Specifically, during a transaction, NoLgn-FTL retains the old data as valid and establishes the mapping between the new physical addresses and the old physical addresses. Thus, the database can easily roll back to the old-version data to maintain system consistency when a power failure occurs. To evaluate NoLgn-FTL, we implement it within FEMU and modify the SQLite database and the file system to make them compatible with the extended abstractions provided by NoLgn-FTL. Experimental results show that, in normal synchronization mode, NoLgn-FTL can reduce SSD writes by 20% and improve database performance by 15% on average.
1. Introduction

Solid-state drives (SSDs) have been widely adopted in database systems due to their high performance. Databases employ logging-based methods, such as write-ahead logging (WAL) and rollback journals, to ensure the transactional atomicity of multiple data updates. In these methods, data is first written to persistent logs before updating the original data, which induces duplicate writes [1]. For SSDs, duplicate writes occur in the following manner. First, the updated data and metadata are written into log files in flash memory. Then, due to the inherent out-of-place update nature of the SSD [2], the updated data is written into new flash pages rather than overwriting the original ones [3]. Thus, one user data write induces two SSD internal writes onto two different flash pages, incurring extra program/erase (P/E) cycles. This reduces SSD lifespan and degrades overall performance by consuming write throughput.

To address the issue of SSD duplicate writes in logging-based databases, researchers have proposed data remapping methods. These methods aim to convert logs directly into new data by modifying the mapping between logical pages (LPs) and physical pages (PPs) in flash memory [4,5]. However, dealing with the inconsistency of logging and data LPs is challenging during power failures.

To investigate the performance of database logging in SSDs, this paper first performs a preliminary study to collect the latency incurred during WAL-based data updates. We find that WAL takes a larger proportion of latency than regular data updates, especially for small data updates. This inspires us to design a direct update scheme to alleviate the overhead of duplicate writes by leveraging the out-of-place update feature of flash memory. This feature inherently maintains multiple versions of data upon updates, allowing the database to easily roll back to the previous version of the data in the event of a power failure or system crash, ensuring data consistency without the need for explicit logging.

This paper proposes a no-logging flash translation layer (NoLgn-FTL) by reusing old flash data pages. The key idea is to keep the mapping information of old data during transactions, eliminating the need for separate log writes. We establish a mapping table between new and old physical addresses (called a P2P table) in the RAM of the flash controller. Meanwhile, the old physical address is written into the out-of-band area of new flash pages, providing a backup of the mapping information. In this way, uncommitted transactions can be rolled back to the old data version upon power failure, thus maintaining consistency. We implement NoLgn-FTL within FEMU and
Corresponding author.
E-mail address: dyj@whut.edu.cn (Y. Du).
https://doi.org/10.1016/j.sysarc.2025.103347
Received 31 October 2024; Received in revised form 15 December 2024; Accepted 18 January 2025
Available online 25 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
Z. Yin et al. Journal of Systems Architecture 160 (2025) 103347
evaluate it with the SQLite database. Experimental results show that, The write overhead incurred by WAL cannot be overlooked com-
in normal synchronization mode, NoLgn-FTL can reduce SSD writes by pared to directly updating the page. Multiple update operations may
20% and improve database performance by 15% on average, compared be performed on the same data page in the buffer, but during a
to existing methods. Our paper makes the following contributions. checkpoint, the storage engine writes the latest data page to a database
file. Fig. 2 illustrates the storage engine layer writing process. In the
• We conduct a preliminary study that reveals the significant la- example, two concurrent transactions, Transaction1 and Transaction2,
tency impact of logging, compared to pure data updates in modify the database. Transaction1 updates A and B with values 2
databases, motivating the need for a more efficient approach to and 4, while Transaction2 updates A and C with values 3 and 7.
handling duplicate writes. During the first step of the write merging process, the modifications
• We propose a novel SSD FTL, called NoLgn-FTL, which fully made by both transactions are recorded in the WAL file. The WAL file
utilizes the out-of-place update nature of flash memory to largely maintains separate regions for each transaction, capturing the updated
remove duplicate writes caused by database logging. page identifiers and their corresponding values. Consequently, the WAL
• We modify SQLite and integrate NoLgn-FTL in the FEMU simula- file contains two distinct entries: one for Transaction1, documenting
tor. We verify the efficiency of NoLgn-FTL in reducing duplicate the updates to pages A(2) and B(4), and another for Transaction2,
writes and improving database performance through extensive recording the updates to pages A(3) and C(7). In the second step, the
experiments. changes recorded in the WAL file are applied to the database during the
checkpointing process. As both transactions modify page A, the WAL
The rest of this paper is organized as follows. Section 2 introduces
mechanism merges these updates into a single write operation. The
the basics of SSDs and logging methods as well as the motivation of
WAL mechanism consolidates the updates and writes the final value
this paper. Section 3 presents the design of NoLgn-FTL. Section 4 shows
of page A(3) to the database file. A contains the merged value of 3,
the experimental setup and evaluation results of NoLgn-FTL. Section 5
while B and C hold 4 and 7.
reviews existing work, and Section 6 concludes this paper.
2.3. Existing solutions
2. Background and motivation
Existing works propose to exploit data remapping to eliminate
This section begins by introducing the basics of SSDs, with a focus
duplicate writes in SSDs [810]. The key design is not to remove the
on logging methods. Then, we present existing remapping-based meth-
out-of-place data update but to directly remap the WAL file to the
ods. Finally, we present the preliminary study as the motivation for this
new-version data, as shown in Fig. 1b.
paper.
However, address remapping can lead to mapping inconsistency.
Flash pages are divided into a data area for storing user data and
2.1. Basics of SSD an OOB area for maintaining metadata. The OOB area contains the
physical-to-logical (P2L) mappings, which are crucial for maintaining
Flash memory utilizes a flash translation layer (FTL) to store and data consistency during garbage collection and database recovery.
manage a logical-to-physical address translation, called L2P mapping. During garbage collection, the P2L mappings enable quick identifica-
This mapping is often stored in the SRAM internal to the SSD to achieve tion of the logical address corresponding to a physical address, which
high access performance. Meanwhile, the logical address is also stored accelerates the update of L2P mappings during data migration. During
in the out-of-band (OOB) area of physical flash pages. Upon a data recovery upon a system crash, the FTL can reconstruct the lost L2P
update request, the FTL first stores the new data in new flash pages and mapping table using the P2L mapping stored within the page.
invalidates the old flash pages. Meanwhile, the L2P mapping is directed Without remapping, the P2L mappings in the OOB area directly
to the new physical page addresses, and the requested logical addresses correspond to the LPN in the L2P mapping table. However, mapping
are also stored in the OOB areas as the new flash pages are written. The inconsistencies may arise after remapping because remapping opera-
invalidated old pages are reclaimed during garbage collection (GC). tions do not simultaneously update the related P2L mappings in the
As shown in Fig. 1a, when data with physical addresses P1, P2, and OOB area.
P3 need to be updated, new data would eventually be stored in new
physical pages P1 , P2 , and P3 . (Note L𝑖 and P𝑖 in the figure represent 2.4. Preliminary study and motivation
the logical address and physical addresses).
To investigate the performance of database transactions, we conduct
2.2. Write ahead logging preliminary experiments using the FEMU simulator [11], which is
discussed in more detail in Section 4.
Relational databases are typically run in rollback mode or write- We run the SQLite database, perform 1 million overwrite operations
ahead log mode in order to support atomic execution of transactions [1, for each fixed value size, and collect the transaction latency under four
6,7]. New updates are first written in a dedicated log, and the data value sizes. In Fig. 3, the 𝑥-axis represents the transaction value size and
is kept consistent by rolling back or forwarding to the log. How- the 𝑦-axis represents the percentage of the time spent on WAL writes,
ever, using logs often generates write amplification, affecting database WAL synchronization, data writes, and data synchronization.
performance. Write-ahead logging (WAL) serves as an example. A From Fig. 3, we observe that WAL (WAL write and WAL synchro-
WAL-based transaction update includes three steps: WAL writing, WAL nization) takes up a significant portion of the total transaction latency.
synchronization, and database writing, as shown in Fig. 1a. First, when Compared to the data (data write and data synchronization) operations,
a transaction is initiated, the new data are written into the page cache the proportion is significantly higher for small value sizes, while for the
of WAL files (Step 1). Upon transaction commit, the WAL files are 16 KB size, the two are comparable.
physically written to flash memory (WAL synchronization) (Step 2). Two main factors contribute to this phenomenon. Firstly, WAL
Finally, the database data is updated during system checkpointing. As introduces additional overhead by writing an extra frame header for
this checkpoint is performed at the database software level, WAL data each transaction. This header contains essential recovery information
cannot be directly moved into the database data. Thus, the WAL file is and is stored alongside the normal data. Consequently, the relative
Z. Yin et al. Journal of Systems Architecture 160 (2025) 103347

read again into the page cache (Step 3) and written into flash memory upon database synchronization (Step 4). Duplicated writes introduced by WAL are detrimental to flash memory endurance and performance.

Fig. 1. Existing write-ahead logging schemes in SSDs.
Fig. 2. Multi-version pages in the WAL.
Fig. 3. Transaction latency distribution in SQLite database.

overhead of the frame header becomes more significant for smaller transactions. Secondly, although WAL consolidates multiple updates to the same data pages into a single write operation during checkpointing, the logging mechanism still necessitates storing multiple versions of the same data in log files. This results in increased storage requirements, particularly affecting smaller transactions with frequent updates on the same page, as the overhead of maintaining multiple versions becomes more significant relative to the size of the transactions.

This paper proposes a novel approach by directly updating data and leveraging the inherent multi-version characteristic of flash memory. Shifting the focus of transaction support to flash can reduce the reliance on logs and frequent file synchronization operations in the database. This leads to faster application response times, as it reduces the need for excessive logging and synchronization.

3. The proposed NoLgn-FTL

We first introduce the overview of the whole system flow using a no-logging flash translation layer, which, hereafter, we simply refer to as NoLgn-FTL. Then, we delve into the design details of NoLgn-FTL, including old page information storage, transaction processing, garbage collection (GC), and data recovery. Without loss of generality, the SQLite database is used in discussing the use of NoLgn-FTL. Finally, we analyze and discuss the overhead associated with NoLgn-FTL.

3.1. Overview

We propose NoLgn-FTL, a novel approach that optimizes both software and hardware architectures to efficiently manage transactions and data version control at the FTL layer, thereby avoiding the overhead of logs in databases. At the core of NoLgn-FTL is the novel FTL, where transaction information is utilized to perform mapping conversion of logical and physical addresses in the L2P and P2P tables only when data is written, minimizing overhead. However, the use of NoLgn-FTL starts at the database layer, where the transaction information is attached to write requests. The file system layer also plays a crucial role by providing transaction-related interfaces and transmitting necessary transactional metadata.

Fig. 4 shows the overall workflow with an example of a transactional data update on three pages L1, L2, and L3. The process is divided into three key stages: transaction delivery, transaction persistence, and GC. These stages can be further subdivided into six steps.

First, the database assigns transaction flags to each transaction (① in Fig. 4) to indicate the completion status of the transaction. Then, a transaction ID is added to the original transactional data request (②). To retain transaction flags and IDs, we design new interfaces in the file system (③).
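The flag and ID tagging in the first two steps can be sketched as follows; this is a minimal illustration, and the structure and function names are our own assumptions, not the paper's code:

```python
from dataclasses import dataclass

@dataclass
class TaggedWrite:
    lpn: int     # logical page number targeted by the write
    data: bytes  # page payload
    tid: int     # transaction ID attached to the request
    flag: str    # 'S'tart, 'M'iddle, or 'E'nd page of the transaction

def tag_transaction(tid, writes):
    """Attach a transaction ID and S/M/E flags to each page write."""
    tagged = []
    last = len(writes) - 1
    for i, (lpn, data) in enumerate(writes):
        # The end flag matters most: recovery later checks it to decide
        # whether the transaction completed.
        flag = 'E' if i == last else ('S' if i == 0 else 'M')
        tagged.append(TaggedWrite(lpn, data, tid, flag))
    return tagged

# A transaction updating pages L1, L2, and L3, as in Fig. 4.
reqs = tag_transaction(7, [(1, b'a'), (2, b'b'), (3, b'c')])
assert [w.flag for w in reqs] == ['S', 'M', 'E']
```

The tagged requests then travel through the modified file system interfaces down to the flash controller.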
Fig. 4. Overview of NoLgn-FTL.
In the second stage, which occurs within the SSDs, the flash controller identifies transaction data by transaction flags and IDs. Data and transaction information are persisted, obtaining their corresponding physical addresses. The old addresses and transaction information are written in the OOB area of the corresponding flash pages, as well as in the P2P table in DRAM (④). The old pages remain valid in this step but will be invalidated only after the transaction is committed (⑤). As transactions are continuously executed, a large amount of invalid data accumulates in the flash memory. The GC process (⑥) reclaims the invalid data. The collaboration between the database, file system, and flash controller in NoLgn-FTL ensures data consistency and integrity throughout the transactional data update process.

The modified file system interfaces play a crucial role in preserving the necessary transaction metadata. The design of NoLgn-FTL in the above-mentioned three main stages will be presented in Sections 3.2, 3.3, and 3.4.

3.2. Metadata management in transaction delivery

In the transaction delivery process, we introduce additional metadata to facilitate the implementation of the no-logging scheme. This metadata is passed along with the transactional data requests to ensure proper handling and management of transactions throughout the system.

In the FTL, we establish a physical-to-physical (P2P) table that stores the mapping between new and old physical pages (i.e., their old versions). In detail, one entry in the P2P table includes the transaction ID, the physical page number (PPN) of the new page, and the PPN of the corresponding old page. To ensure persistent P2P mappings, the PPNs of the old pages are also stored in the OOB area of the new flash pages. The primary purposes of the P2P table are twofold: firstly, to facilitate the management of transactional information by the underlying FTL, and secondly, to enhance performance during GC and transaction operations. Note that locating old pages can be accelerated by using the P2P table, thereby avoiding frequent accesses to the OOB area of flash pages. This table does not need to be written to flash memory and can be recovered through a full scan even after a sudden power failure, thus avoiding frequent writes of transaction information to flash memory.

Furthermore, transaction information, including transaction IDs and flags, is stored in the OOB area of new flash pages. In detail, flags S, M, and E represent the starting page, the middle pages, and the end page of a transaction, respectively. In the implementation of transaction flags, since we are only concerned with whether the transaction has ended, we use only one bit to mark the transaction's completion. By storing transaction information alongside the corresponding pages, the progress and state of transactions can be more effectively tracked, enabling data recovery in case of unexpected failures or interruptions. Database recovery will be explained in Section 3.5.

In addition to transaction information, one extra bit, referred to as the lock bit, is used to indicate the block lock state. A lock bit value of 1 signifies that valid old pages exist in the current block, while 0 indicates the block holds no valid old pages and can be reclaimed during GC. By embedding the lock bit within the FTL, blocks containing valid old pages and normal blocks can be efficiently distinguished, allowing for GC optimization. The GC process under NoLgn-FTL will be presented in Section 3.4.
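The P2P table and per-block lock bits can be sketched as follows; the structures are illustrative only, though the entry layout follows the text (transaction ID, new PPN, old PPN):

```python
class P2PTable:
    """Sketch of the P2P table plus lock-bit bitmap kept in controller DRAM."""

    def __init__(self, pages_per_block=1024):
        self.pages_per_block = pages_per_block
        self.entries = {}   # new PPN -> (transaction ID, old PPN)
        self.lock_bit = {}  # block number -> 1 if block holds valid old pages

    def record(self, tid, new_ppn, old_ppn):
        """On transactional write: map new page to its old version, lock block."""
        self.entries[new_ppn] = (tid, old_ppn)
        self.lock_bit[old_ppn // self.pages_per_block] = 1

    def commit(self, tid):
        """On commit: invalidate old pages, unlock blocks, drop entries."""
        done = [p for p, (t, _) in self.entries.items() if t == tid]
        for new_ppn in done:
            _, old_ppn = self.entries.pop(new_ppn)
            # Simplification: assumes at most one transaction's old pages
            # per block; a real FTL would track remaining old pages.
            self.lock_bit[old_ppn // self.pages_per_block] = 0

t = P2PTable()
t.record(tid=7, new_ppn=4096, old_ppn=100)  # old page 100 lives in block 0
assert t.lock_bit[0] == 1                   # block 0 locked against GC
t.commit(7)
assert t.lock_bit[0] == 0 and not t.entries
```

Because the same (old PPN, transaction ID, flag) tuple is also written to the OOB area of the new page, this in-DRAM table can be rebuilt by a full scan after power loss, as the text notes.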
3.3. Transaction persistence in NoLgn-FTL

To ensure transaction persistence, the transaction needs to do the following during its write and commit process. During transaction writing, NoLgn-FTL first looks up the original L2P table to find the old PPNs corresponding to the requested logical addresses. As shown in Fig. 4, the old PPNs are P1, P2, and P3 for the requested L1, L2, and L3, respectively. Then, the updated data are written into the new pages P1′, P2′, and P3′, respectively. At the same time, transaction information and the old PPNs are written into the OOB area of these new pages. Finally, NoLgn-FTL stores the mapping entries for P1, P2, and P3 into the P2P table. Different from the original flash write, the old pages remain valid. Meanwhile, the lock state of the blocks containing valid old pages is set to 1.

During transaction commit, NoLgn-FTL first searches the P2P table to find old valid pages and then invalidates them. Then, the lock state of the blocks containing these old valid pages is set to 0. Finally, the corresponding entries in the P2P table are deleted.

3.4. Garbage collection with NoLgn-FTL

GC in NoLgn-FTL requires handling valid old pages temporarily generated during transaction processing. Selecting a victim block for GC involves several steps to ensure data integrity and efficient space reclamation.

When selecting a victim block for GC, the first step is to check the block's lock state. If the lock state is 1, valid old pages still exist within the block, and therefore, the block cannot be reclaimed. In this case, the next victim block in the queue is selected until the selected block's lock state is 0. Then, whether there is a transaction page in the block must be checked. As the transaction information and old PPNs are stored in the OOB area of the new valid pages, GC in NoLgn-FTL deals with them differently depending on the transaction state. That is, before the transaction is committed, GC will migrate these valid pages together with their OOB areas. However, after a commit has occurred, GC only migrates valid page data, removing the extra metadata of NoLgn-FTL that resides in the OOB area.

3.5. Database recovery with NoLgn-FTL

In the event of a power-off or system crash, data stored in the flash controller's RAM is lost, and only the OOB area of flash pages can be used for system recovery. One solution is to recover to the consistent state in the latest checkpoint, which requires periodically storing checkpoints. The other solution involves a full flash scan to rebuild mappings, as shown in Step 1 of Fig. 5. Physical pages and their OOB areas would be read one by one (Step 2). For pages that do not have transaction information in the OOB area, NoLgn-FTL can directly recover the L2P table of PPNs based on the LPNs in their OOB area. Otherwise, NoLgn-FTL decides whether to recover old-version pages according to the transaction information. NoLgn-FTL would first obtain pages with the same transaction ID. If the page with the end flag bit can be found, these pages would be directly put into the L2P table together with their LPNs (Step 3). Otherwise, if all pages have the flag bit 0, which indicates that the current transaction is not committed, the old-version pages would be first read out (Step 4), and only the L2P mappings of the old-version pages would then be put into the L2P table.

Fig. 5. Recovery with NoLgn-FTL.

3.6. Discussion and overhead analysis

Compared to existing logging methods that store extra logs for each transaction, the use of NoLgn-FTL allows normal data updates without the need for additional logging. The overhead of NoLgn-FTL is due to the storage of extra metadata, including the P2P table, transaction information, and the block lock state.

P2P Table Storage and Overhead: The P2P table is stored in the RAM of the flash controller. The number of entries in the P2P table depends on the number of concurrent transactions. In our experiment, the table contains 10 000 entries. Each P2P entry takes 12 bytes, including a 4-byte transaction ID and 4 bytes each for the new page PPN and the old page PPN. The total size of the P2P table is about 120 KB. The DRAM size is usually around 1/1024 of the SSD capacity. For an SSD with a 1 TB capacity, the DRAM size will be 1 GB, and the P2P table will be 0.12 MB, which is only 0.012% of the DRAM size and is negligible. The block lock state is stored in the metadata of data blocks as a bitmap, with each block requiring only 1 bit, which is insignificant in terms of overhead. This lock bit is loaded into the SSD's DRAM during startup.

Transaction Information Storage in OOB Area: Transaction information is stored in the OOB area of flash pages. NoLgn-FTL uses 4 bytes for old PPNs and 4 bytes for transaction information (comprising the transaction ID and 1 bit for the transaction flag). In current flash chips, the ratio of the OOB area size to the data area size is about 1/8 [12]. Therefore, the OOB area has enough space to store transaction information.

4. Evaluation

In this section, we present a comprehensive evaluation of NoLgn-FTL, using an SQLite and Ext4 combination as a case study. We first describe the experimental setup. Then, we present the sqlite-bench experimental results, focusing on two key aspects: flash writes and database performance. We also investigate the impact of NoLgn-FTL on GC. Furthermore, we show the performance of real-world workloads with the YCSB and TPC-C benchmarks.

4.1. Experimental setup

NoLgn-FTL is implemented on FEMU [13–15], a QEMU-based NVMe SSD emulator. The host system kernel of FEMU is Linux 5.15, and the file system is Ext4. To ensure a representative and consistent setup, the simulated SSD has a 16 GB logical capacity, with 1024 pages per flash block and a 4 KB page size. The flash latency for read, write, and erase operations is 50 μs, 500 μs, and 5 ms, respectively [16]. To ensure the GC mechanism is appropriately triggered during our experiments, we conducted 4 million 4 KB write operations on the SSD in each test. This setup guarantees that GC operations occur as part of the evaluation.

For the logging database, we make use of SQLite. We make necessary modifications to the Linux kernel to receive and process transaction information from the SQLite database. To enable SQLite to transmit transaction information to the kernel, we utilize the ioctl system call to change database write, commit, and abort operations into write, commit, and abort commands. As SQLite does not automatically generate unique transaction IDs for each transaction, the transaction IDs are generated in the kernel after each transaction is committed. Upon receiving the write information from SQLite, the kernel first assigns flags to the requested transaction pages. This enables the kernel to keep track of the transaction status and perform the necessary operations accordingly. Approximately 150 lines of code were modified in SQLite, around 100 lines in the file system, and about 300 lines in FEMU.

Hereafter, NoLgn-FTL will refer to the entire SQLite–Ext4–SSD system stack modified to ensure the seamless integration and functionality of NoLgn-FTL within the existing software and hardware stack. The newly introduced commands, which are based on the ioctl system call, are as follows.

write(page p, tid t, flag f). This command adds a transaction ID (tid), t, and a transaction flag, f, to the original write operation. It is the beginning of a transaction and corresponds to Step 4 in Fig. 4. The inclusion of the transaction ID and flag enables the FTL to track and manage the transaction.

commit(tid t). This command with the parameter of transaction ID t is sent to NoLgn-FTL along with the original fsync command in the
Linux kernel. It indicates the successful completion of a transaction and aligns with Step 5 in Fig. 4. Upon receiving this command, NoLgn-FTL finalizes the transaction and ensures the durability of the associated data.

abort(tid t). This command is invoked to terminate an ongoing transaction t before it commits. It indicates a rollback operation, reverting the data pages to their previous versions, akin to the data recovery process for uncommitted transactions as described in Section 3.5.

We compare NoLgn-FTL with Base-WAL, the original SQLite, which uses the native logging scheme, and SW-WAL [4], which reduces duplicate writes by SSD remapping as shown in Fig. 1a. For each transaction size, the database runs separately, but these transactions share the same SSD storage. It is important to consider that in real-world scenarios, particularly in mobile environments, the characteristics of write requests can significantly impact the performance of storage systems. SQLite is a lightweight, embedded database commonly used in mobile devices for local data storage, making it highly relevant to our analysis. Studies have shown that approximately 90% of write requests in Android applications, such as Facebook and Twitter, are related to SQLite databases and journal files. In environments like these, the data items stored in the database are typically small, often below 4 KB. These small data items, such as individual records or key–value pairs, are frequently written to the storage medium in the form of random write operations. These operations usually target data blocks ranging from 64 B to 4 KB, and such small writes often involve heavy interaction with the underlying file system, such as Ext4, which is commonly used in Android devices [17,18]. Therefore, we set different transaction sizes from 256 B to 16 KB in the experiment to observe their impact on performance.

We conduct experiments in both the FULL and NORMAL synchronous modes of the database. In FULL mode, synchronization is triggered after each transaction is committed. This forces all transaction data to be written into SSDs, thus providing the highest atomicity and durability. Conversely, in NORMAL mode, synchronization is not triggered immediately after the transaction is committed. Typically, transactions are synchronized into SSDs only when a certain number of frames (including transaction headers and data) are accumulated. Note that NoLgn-FTL has no explicit WAL synchronization operation. In NORMAL mode, we manually control the frequency of commits in NoLgn-FTL to keep consistent with the synchronization operation of the other two existing methods. In NoLgn-FTL, a synchronization operation will be triggered every 1000 data pages.

4.2. Results of flash page writes

We used sqlite-bench with 200 thousand overwrite operations to observe the effect of NoLgn-FTL on flash memory page writes. Fig. 6 shows the normalized number of writes in flash memory compared to Base-WAL under the two synchronization modes. In NORMAL mode, SW-WAL reduces writes by 35% compared to Base-WAL, as it eliminates the extra writes caused by out-of-place updates through WAL file remapping. On average, NoLgn-FTL reduces the flash page writes by 55% and 20% compared to Base-WAL and SW-WAL, respectively. The superior performance of NoLgn-FTL is due to its elimination of WAL writes and WAL synchronization, resulting in a greater reduction of writes compared to SW-WAL. Specifically, there are two reasons for NoLgn-FTL's write reduction. First, as WAL has to write an extra log header, a WAL write involves more data than a normal data write. Second, since synchronization does not happen immediately after each transaction in NORMAL mode, updates to the same page are serviced from the cache. NoLgn-FTL combines several updates into a single update, thereby reducing writes. However, this combination cannot be realized in SW-WAL, as it uses different LPNs for data updates and WAL writes.

In FULL mode, NoLgn-FTL reduces flash page writes by 35% and 2% compared to Base-WAL and SW-WAL, respectively. Both methods show reductions in page writes compared with Base-WAL, similar to the NORMAL mode. However, the improvement brought by NoLgn-FTL is smaller than in NORMAL mode. As each transaction is forcibly synchronized to flash memory after committing, there is no chance for NoLgn-FTL to combine updates on the same page, and the reduction from avoided log header writes is limited. Thus, in this mode, NoLgn-FTL behaves similarly to SW-WAL.

4.3. Results of database performance

We used sqlite-bench to observe SQLite performance. Fig. 7 shows the normalized throughput results of SQLite under the three compared methods. In NORMAL mode, NoLgn-FTL achieves an average performance improvement of 51% and 15% over Base-WAL and SW-WAL, respectively. NoLgn-FTL performs particularly well compared to SW-WAL for small-sized transactions, due to the reasons described earlier.

In FULL mode, we observe that NoLgn-FTL outperforms Base-WAL and SW-WAL by an average of 26% and 4%, respectively. This performance improvement is primarily due to the reduction in the number of writes achieved by NoLgn-FTL. Meanwhile, we find that both SW-WAL and NoLgn-FTL demonstrate a gradual performance improvement as the transaction size increases. This is because, for large-sized transactions, Base-WAL spends more latency on writing flash pages and GC. Since SW-WAL and NoLgn-FTL reduce the number of data writes, this degradation is mitigated. Even in this situation, the performance of SW-WAL is still inferior to that of NoLgn-FTL, as it maintains header information that consumes data write latency.
Fig. 6. Results of flash page writes.
Fig. 7. SQLite database performance.
Fig. 8. SQLite database latency.
Besides, we also evaluated database latency under different conditions. Fig. 8 illustrates the normalized latency results under the three compared methods, Base-WAL, SW-WAL, and NoLgn-FTL, in both NORMAL and FULL modes.

In NORMAL mode, NoLgn-FTL demonstrates the lowest latency among the three methods, achieving an average reduction of 34.4% compared to Base-WAL and 11% compared to SW-WAL. The latency advantage of NoLgn-FTL is particularly pronounced for small-sized transactions (e.g., 256 B and 512 B). This stems from its ability to reduce the number of writes and optimize metadata updates, minimizing the overhead typically associated with WAL. SW-WAL also shows improved latency compared to Base-WAL, with an average reduction of approximately 26.2%, thanks to its selective write strategy. However, its performance is still limited by the additional overhead introduced by writing the WAL, which becomes increasingly noticeable for smaller transactions. In FULL mode, the latency reduction achieved by NoLgn-FTL remains significant. Compared to Base-WAL, NoLgn-FTL reduces latency by an average of 16.4%, and compared to SW-WAL, the reduction is 3.7%. Both NoLgn-FTL and SW-WAL exhibit a gradual latency improvement as transaction size increases, which aligns with the behavior observed in the throughput analysis. For larger transactions (e.g., 8 KB and 16 KB), Base-WAL experiences higher latency due to more extensive flash page writes and garbage collection overhead. In contrast, NoLgn-FTL and SW-WAL effectively mitigate this degradation by reducing the volume of writes.

4.4. Results of GC overhead

We used sqlite-bench to investigate the impact of block locking on GC performance by collecting write distribution results under different transaction sizes. Fig. 9 shows the write distribution of host requests, GC migration, and block locking (denoted as additional pages) under different transaction sizes.
Fig. 9. Results of GC overhead. NoLgn-FTL would lock certain blocks, which would affect victim block selection and induce more migrations.
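The lock-aware victim selection from Section 3.4, whose cost is measured here, can be sketched as follows (the queue and bitmap structures are hypothetical):

```python
def select_victim(candidates, lock_bit):
    """Return the first candidate block whose lock bit is 0.

    Blocks with lock bit 1 still hold valid old pages of uncommitted
    transactions and are skipped, which can force a less ideal victim
    and cause the extra valid-page migration ("additional pages" above).
    """
    for block in candidates:        # candidates ordered by the GC policy
        if lock_bit.get(block, 0) == 0:
            return block
    return None                     # every candidate is currently locked

# Block 5 would be the policy's first choice, but it is locked,
# so the next block in the queue is reclaimed instead.
assert select_victim([5, 9, 2], {5: 1}) == 9
```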
Table 1
YCSB workloads.

Workload  Description
A         50% read and 50% update, Zipfian distribution
B         95% read and 5% update, Zipfian distribution
C         100% read, Zipfian distribution
D         95% read and 5% insert, latest read
E         95% scan and 5% insert, Zipfian distribution
F         50% read and 50% read–modify–write, Zipfian distribution

Two key observations can be made from Fig. 9. First, as the transaction value size increases, the proportion of valid page migration involved in GC also increases, reaching a maximum of 62%. This trend can be attributed to the fact that larger transaction sizes require more frequent GC to accommodate new content. Second, the block locking mechanism impacts the number of valid pages migrated. The maximum proportion of additional migration pages due to block locking is 6%, with an average increase of 3.5% in total write pages. This impact is more significant for smaller transaction sizes, as updates may be concentrated in fewer blocks, preventing them from being chosen as optimal victim blocks for GC and leading to suboptimal data migration with more valid pages.

Despite the extra page writes caused by block locking, these overheads are acceptable compared to the significant reduction in duplicate writes achieved by NoLgn-FTL. The benefits of eliminating duplicate writes and improving overall write performance outweigh the relatively minor increase in valid page migrations caused by locking SSD blocks.

4.5. Results of YCSB and TPC-C performance

We also evaluate NoLgn-FTL using the YCSB benchmark to assess its performance under various realistic workloads. YCSB provides six core workloads, as summarized in Table 1. To evaluate the long-term impact of NoLgn-FTL, we use the TPC-C benchmark with four warehouses [19] tested under different SSD free space conditions. TPC-C contains the following five transaction types: 45% new order, 43% payment, 4% delivery, 4% order status, and 4% stock level. The number of database connections was set to 1 to avoid frequent aborts of update transactions.

Fig. 10 shows the normalized throughput results of SQLite under the YCSB benchmarks in NORMAL mode. On average, SW-WAL shows a 10% performance improvement over Base-WAL, while NoLgn-FTL achieves a 17% improvement. For write-intensive workloads (A and F), both SW-WAL and NoLgn-FTL exhibit significantly better performance than Base-WAL. However, for read-intensive workloads (B, D, and E), the improvements from both methods are not significant. This is mainly because both methods only enhance write performance and have little impact on read performance. Meanwhile, NoLgn-FTL still outperforms SW-WAL due to its greater write performance benefits. In the case of workload C, which only contains read requests, there are no obvious differences among the three methods. This is because the remap-based logging in SW-WAL and the no-logging scheme in NoLgn-FTL are not triggered. The slight performance fluctuations arise from the random nature of read operations.

Fig. 11 shows the performance of SQLite in terms of transactions per minute (tpmC) with different amounts of SSD free space. To obtain SSDs with varying free space, sufficient random overwrite iterations are performed before each of the experiments. TPC-C is a write-intensive workload with operations such as new order, payment, and delivery, with an average of two pages updated per transaction. The results show that when SSD free space is 75%, the performance differences among the three methods are relatively small. However, as SSD free space decreases, the performance gap widens. Overall, NoLgn-FTL significantly outperforms Base-WAL and SW-WAL. On average, SW-WAL improves transaction throughput by 20% compared to Base-WAL, while NoLgn-FTL improves throughput by 38%. Notably, the performance gains of SW-WAL and NoLgn-FTL become more pronounced when SSD free space is limited. When the SSD remaining space is 25%, NoLgn-FTL's throughput is 81% higher than Base-WAL's. This is mainly because when SSD free space is low, there may be a lack of free blocks, requiring frequent GC to accommodate new writes. Additionally, TPC-C's transaction data size is relatively small, allowing multiple data items to be stored in a single page. Therefore, NoLgn-FTL effectively reduces write operations and GC needs by minimizing duplicated writes.

5. Related works

Research addressing duplicate writes can be divided into two directions: optimization of atomic writes and remapping-based methods. An atomic write interface was initially proposed by Park et al. [20], which achieved atomicity for multi-page writes. Prabhakaran et al. [21] further introduced a transactional FTL called txFlash, which provides a transaction interface (WriteAtomic) to higher-level software. It provides isolation among multiple atomic write calls by ensuring that no conflicting writes are issued. Xu et al. [22] used the native off-site update feature of NAND flash memory to simulate copy-on-write technology and, at the same time, used NVM to store the FTL mapping table. However, these methods mostly supported atomicity for multi-page writes only. Kang et al. presented X-FTL [23], aiming to support general transactional atomicity, allowing data pages in a transaction to be written to flash at any time.
Fig. 10. SQLite performance on YCSB benchmarks.
Fig. 11. SQLite performance on TPC-C benchmark.
However, it requires an additional X-L2P table and needs to persist it to flash upon transaction commit.

Address remapping is another extensively researched method that modifies the mapping table directly without performing actual writes. Wu et al. [24] proposed KVSSD, which exploits the FTL mapping mechanism to implement copy-free compaction of LSM trees, and it enables direct data allocation in flash memory for efficient garbage collection. However, address remapping may suffer from mapping inconsistencies due to the inability of flash memory to perform in-place updates. Hahn et al. [25] use the address remapping operation for file system defragmentation. However, after remapping, it uses file system logs to deal with mapping inconsistencies. The larger log size results in longer search times and increased memory consumption when performing read operations. As the number of remappings escalates, the log can become several hundred MB or even GB. Therefore, these methods may incur significant lookup overhead. Zhou et al. [26] address this issue by storing the new mapping table in non-volatile memory, reducing lookup overhead. Besides, Wu et al. [4] proposed SW-WAL, a novel approach that emulates the maintenance of a mapping table by inscribing transaction information directly into the OOB area of flash pages. This strategy markedly reduces the footprint of the search table and concurrently boosts search efficiency. Additionally, to deal with the heavy query latency during WAL checkpointing, Yoon et al. [27] proposed Check-In to align journal logs to the FTL mapping unit. The FTL creates a checkpoint by remapping the journal logs to the checkpoint, effectively reducing the checkpointing overhead and WAL's duplicate writes.

6. Conclusion

In this paper, we presented NoLgn-FTL to directly update the database in a no-logging way by reusing old flash pages. NoLgn-FTL uses a P2P table and the OOB area of flash pages to keep old page information and transaction information. Thus, systems can recover to a consistent state when a crash happens. As there is no need to store logging files in NoLgn-FTL, duplicate writes can be avoided. We implemented a prototype of NoLgn-FTL on the FEMU SSD emulator and integrated it with the SQLite database. The file system is modified to enable SQLite to use the provided interface and transfer transaction information. Experimental results demonstrate that NoLgn-FTL can significantly reduce writes to SSDs and improve the performance of SQLite, while still ensuring atomicity.

CRediT authorship contribution statement

Zhenghao Yin: Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation. Yajuan Du: Writing – review & editing, Supervision, Project administration, Conceptualization. Yi Fan: Visualization. Sam H. Noh: Writing – review & editing.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

References

[1] C. Mohan, D. Haderle, B. Lindsay, H. Pirahesh, P. Schwarz, ARIES: A transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging, ACM Trans. Database Syst. 17 (1) (1992) 94–162.
[2] S. Lee, D. Park, T. Chung, D. Lee, S. Park, H. Song, A log buffer-based flash translation layer using fully-associative sector translation, ACM Trans. Embed. Comput. Syst. (TECS) 6 (3) (2007) 18–es.
[3] L. Shi, J. Li, C.J. Xue, C. Yang, X. Zhou, ExLRU: A unified write buffer cache management for flash memory, in: Proceedings of the Ninth ACM International Conference on Embedded Software, 2011, pp. 339–348.
[4] Q. Wu, Y. Zhou, F. Wu, K. Wang, H. Lv, J. Wan, C. Xie, SW-WAL: Leveraging address remapping of SSDs to achieve single-write write-ahead logging, in: 2021 Design, Automation & Test in Europe Conference & Exhibition, DATE, 2021, pp. 802–807.
[5] F. Ni, X. Wu, W. Li, L. Wang, S. Jiang, Leveraging SSD's flexible address mapping to accelerate data copy operations, in: 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2019, pp. 1051–1059.
[6] J. Coburn, T. Bunker, M. Schwarz, R. Gupta, S. Swanson, From ARIES to MARS: Transaction support for next-generation, solid-state drives, in: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, 2013, pp. 197–212.
[7] J. Arulraj, M. Perron, A. Pavlo, Write-behind logging, Proc. VLDB Endow. 10 (4) (2016) 337–348.
[8] K. Han, H. Kim, D. Shin, WAL-SSD: Address remapping-based write-ahead-logging solid-state disks, IEEE Trans. Comput. 69 (2) (2019) 260–273.
[9] G. Oh, C. Seo, R. Mayuram, Y.-S. Kee, S.-W. Lee, SHARE interface in flash storage for relational and NoSQL databases, in: Proceedings of the 2016 International Conference on Management of Data, 2016, pp. 343–354.
[10] Q. Wu, Y. Zhou, F. Wu, H. Jiang, J. Zhou, C. Xie, Understanding and exploiting the full potential of SSD address remapping, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 41 (11) (2022) 5112–5125.
[11] H. Li, M. Hao, M.H. Tong, S. Sundararaman, M. Bjørling, H.S. Gunawi, The CASE of FEMU: Cheap, accurate, scalable and extensible flash emulator, in: 16th USENIX Conference on File and Storage Technologies (FAST 18), 2018,
[23] W.-H. Kang, S.-W. Lee, B. Moon, G.-H. Oh, C. Min, X-FTL: transactional FTL for SQLite databases, in: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, 2013, pp. 97–108.
[24] S.-M. Wu, K.-H. Lin, L.-P. Chang, KVSSD: Close integration of LSM trees and flash translation layer for write-efficient KV store, in: 2018 Design, Automation & Test in Europe Conference & Exhibition, DATE, IEEE, 2018, pp. 563–568.
[25] S.S. Hahn, S. Lee, C. Ji, L. Chang, I. Yee, L. Shi, C.J. Xue, J. Kim, Improving file system performance of mobile storage systems using a decoupled defragmenter, in: 2017 USENIX Annual Technical Conference (USENIX ATC 17), 2017, pp. 759–771.
[26] Y. Zhou, Q. Wu, F. Wu, H. Jiang, J. Zhou, C. Xie, Remap-SSD: Safely and efficiently exploiting SSD address remapping to eliminate duplicate writes, in: 19th USENIX Conference on File and Storage Technologies (FAST 21), 2021, pp. 187–202.
[27] J. Yoon, W.S. Jeong, W.W. Ro, Check-In: In-storage checkpointing for key-value store system leveraging flash-based SSDs, in: 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture, ISCA, 2020, pp. 693–706, http://dx.doi.org/10.1109/ISCA45697.2020.00063.

Zhenghao Yin received the BS degree in Computer Science from Wuhan University of Technology, Wuhan, China, in 2022, and is currently pursuing the MS degree in Computer Science, expected to graduate in 2025. His research interests include flash memory and database technologies.

Yajuan Du received the joint Ph.D. degrees from the City University of Hong Kong and the Huazhong University of Science and Technology, in December 2017 and February 2018, respectively. She is currently an Assistant Professor with the School of Computer Science and Technology, Wuhan University of Technology. Her research interests include optimizing access performance, data reliability, and persistency of flash memories and non-volatile memories.

Yi Fan received the BS degree in Computer Science from Wuhan University of Technology, Wuhan, China, in 2022, and is currently pursuing the MS degree in Computer Science, expected to graduate in 2025. His research interests include key–value databases and flash memory technologies.
pp. 8390.
[12] Y. Zhou, F. Wu, Z. Lu, X. He, P. Huang, C. Xie, SCORE: A novel scheme to
efficiently cache overlong ECCs in NAND flash memory, ACM Trans. Archit.
Code Optim. ( TACO) 15 (4) (2018) 125.
Sam H. (Hyuk) Noh received his BE in Computer Engineer-
[13] L. Long, S. He, J. Shen, R. Liu, Z. Tan, C. Gao, D. Liu, K. Zhong, Y. Jiang, WA-
ing from Seoul National University in 1986 and his Ph.D. in
Zone: Wear-aware zone management optimization for LSM-Tree on ZNS SSDs,
Computer Science from the University of Maryland in 1993.
ACM Trans. Archit. Code Optim. 21 (1) (2024) 123.
He held a visiting faculty position at George Washington
[14] D. Huang, D. Feng, Q. Liu, B. Ding, W. Zhao, X. Wei, W. Tong, SplitZNS: Towards
University (19931994) before joining Hongik University,
an efficient LSM-tree on zoned namespace SSDs, ACM Trans. Archit. Code Optim.
where he was a professor in the School of Computer and
20 (3) (2023) 126.
Information Engineering until 2015. From 2001 to 2002, he
[15] S.-H. Kim, J. Shim, E. Lee, S. Jeong, I. Kang, J.-S. Kim, NVMeVirt: A versatile
was a visiting associate professor at UM IACS, University of
software-defined virtual NVMe device, in: 21st USENIX Conference on File and
Maryland. In 2015, Dr. Noh joined UNIST as a professor
Storage Technologies (FAST 23), 2023, pp. 379394.
in the Department of Computer Science and Engineering.
[16] B.S. Kim, J. Choi, S.L. Min, Design tradeoffs for SSD reliability, in: 17th USENIX
He became the inaugural Dean of the Graduate School
Conference on File and Storage Technologies (FAST 19), 2019, pp. 281294.
of Artificial Intelligence and previously served as Dean of
[17] Z. Shen, Y. Shi, Z. Shao, Y. Guan, An efficient LSM-tree-based sqlite-like database
the School of Electrical and Computer Engineering (2016
engine for mobile devices, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.
2018). He has contributed to numerous conferences, serving
38 (9) (2018) 16351647.
as General Chair, Program Chair, or committee member
[18] A. Mäkinen, Tracing Android applications for file system optimization.
for events like ACM SOSP, USENIX FAST, ACM ASPLOS,
[19] S.T. Leutenegger, D. Dias, A modeling study of the TPC-C benchmark, ACM
and USENIX OSDI. He also chaired the ACM HotStorage
Sigmod Rec. 22 (2) (1993) 2231.
Steering Committee and serves on the Steering Committees
[20] S. Park, J.H. Yu, S.Y. Ohm, Atomic write FTL for robust flash file system, in:
for USENIX FAST and IEEE NVMSA. Dr. Noh was Editor-
Proceedings of the Ninth International Symposium on Consumer Electronics,
in-Chief of ACM Transactions on Storage (20162022) and
2005.(ISCE 2005), 2005, pp. 155160.
is now co-Editor-in-Chief of ACM Transactions on Computer
[21] V. Prabhakaran, T.L. Rodeheffer, L. Zhou, Transactional flash, in: OSDI, Vol. 8,
Systems. His research focuses on system software and storage
2008.
systems, emphasizing emerging memory technologies like
[22] Y. Xu, Z. Hou, NVM-assisted non-redundant logging for Android systems, in:
flash and persistent memory.
2016 IEEE Trustcom/BigDataSE/ISPA, 2016, pp. 14271433.
Computer Standards & Interfaces 97 (2026) 104120
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
Energy consumption assessment in embedded AI: Metrological
improvements of benchmarks for edge devices
Andrea Apicella b, Pasquale Arpaia a,*, Luigi Capobianco d, Francesco Caputo a, Antonella Cioffi d, Antonio Esposito a, Francesco Isgrò a, Rosanna Manzo c, Nicola Moccaldi a, Danilo Pau e, Ettore Toscano d

a Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, Università degli Studi di Napoli Federico II, Naples, Italy
b Dipartimento di Ingegneria dell'Informazione ed Elettrica e Matematica applicata (DIEM), Università degli Studi di Salerno, Fisciano, Italy
c Dipartimento di Sanità Pubblica e Medicina Preventiva, Università degli Studi di Napoli Federico II, Naples, Italy
d Software Design Center, STMicroelectronics, Marcianise, Italy
e System Research and Applications, STMicroelectronics, Agrate Brianza, Italy
ARTICLE INFO

Keywords:
Energy assessment
Embedded AI
Tiny-ML
Uncertainty analysis
Edge device benchmark

ABSTRACT

This manuscript proposes a new method to improve the MLCommons protocol for measuring power consumption on Microcontroller Units (MCUs) when running edge Artificial Intelligence (AI). In particular, the proposed approach (i) selectively measures the power consumption attributable to the inferences (namely, the predictions performed by Artificial Neural Networks — ANNs), preventing the impact of other operations, (ii) accurately identifies the time window for acquiring the current samples thanks to the simultaneous measurement of power consumption and inference duration, and (iii) precisely synchronizes the measurement windows and the inferences. The method is validated on three use cases: (i) Rockchip RV1106, a neural MCU that implements ANNs via a hardware neural processing unit through a dedicated accelerator, and (ii) STM32 H7 and (iii) STM32 U5, high-performance and ultra-low-power general-purpose microcontrollers, respectively. The proposed method returns higher power consumption for the two devices with respect to the MLCommons approach. This result is compatible with an improvement of selectivity and accuracy. Furthermore, the method reduces measurement uncertainty on the Rockchip RV1106 and STM32 boards by factors of 6 and 12, respectively.
1. Introduction

The rapid expansion of Internet of Things (IoT) devices has ushered in a new era of connected intelligence at the edge, where data processing, low latency, and real-time decision making can take place directly at the edge [1]. These IoT devices cover a variety of applications, from smart home sensors [2], to industrial automation [3], and health monitoring systems [4], where low-latency responses and energy efficiency are essential.

Extending computation to more peripheral network nodes enhances all key aspects of edge computing, including energy efficiency, carbon footprint reduction, security, latency, privacy, offline functionality, and data management costs [5]. However, deploying intelligence at the end nodes requires careful consideration of the IoT devices' inherent limitations, such as memory and computational resources impacting time performances, and energy constraints. For Microcontroller Units (MCUs), widely used in IoT, this is particularly true. Many IoT applications, such as autonomous driving [6], demand low-latency responses to be effectively reactive. Moreover, several IoT devices often operate under very limited power sources. Promising energy-efficient strategies aim to minimize consumption. For instance, index modulation [7,8] is a transmission technique that conveys additional information through the indices of available resources such as antennas, subcarriers, or time slots, and it can significantly reduce energy usage while maintaining data throughput. Nevertheless, even with advanced optimization strategies, the repetitive and frequent processing required by many applications can rapidly deplete power resources, thereby limiting device lifetime.

In recent years, Machine Learning (ML) methods [9], particularly Artificial Neural Networks (ANNs), have been increasingly deployed on IoT devices to enhance localized data processing capabilities and reduce
Corresponding author.
E-mail addresses: andapicella@unisa.it (A. Apicella), pasquale.arpaia@unina.it (P. Arpaia), luigi.capobianco@st.com (L. Capobianco),
francesco.caputo3@unina.it (F. Caputo), antonella.cioffi@st.com (A. Cioffi), antonio.esposito9@unina.it (A. Esposito), francesco.isgro@unina.it (F. Isgrò),
rosanna.manzo@unina.it (R. Manzo), nicola.moccaldi@unina.it (N. Moccaldi), danilo.pau@st.com (D. Pau), ettore.toscano@st.com (E. Toscano).
https://doi.org/10.1016/j.csi.2025.104120
Received 10 January 2025; Received in revised form 2 September 2025; Accepted 21 December 2025
Available online 22 December 2025
0920-5489/© 2025 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
dependency on cloud infrastructures [10,11]. It is common to refer to
these devices as tiny devices [12] and embedded ML as tiny machine
learning or tiny ML [5].
Consequently, assessing the inference time provided by the IoT
hardware for a specific ANN model is crucial to ensure that the em-
bedded system can satisfy real-time processing requirements. In this
context, inference refers to the process of an ANN generating outputs
based on its trained model parameters and given inputs.
Therefore, tailored energy consumption metrics are essential to
ensure the alignment between the ANN implementation and the en-
ergy constraints of the targeted IoT application. To this aim, Neural
MCUs are new edge devices embedding ANN accelerators, specifically
designed to manage the trade-off between reliability, latency, cost,
and power consumption [13]. Therefore, adopting standardized metrics
and procedures is essential for assessing the actual performance gains
achieved by neural MCUs in the context of embedded AI. Although several frameworks and tools have been proposed to facilitate the benchmarking of tinyML models [14-16], no standardized metrics and procedures are currently defined.
Among the proposed benchmarking protocols, MLPerf Tiny Benchmark (MLPTB) [17] is developed by the MLCommons Association, the largest and most authoritative community aimed at improving the industrialization and standardization process of machine learning [18]. MLPTB provides protocols and AI components, namely datasets and pre-trained ML models. These can act as metrological references when implemented on different hardware to assess their performance, such as the inference time and the power consumption under real-world conditions. However, the MLPTB protocols exhibit some metrological weaknesses: (i) both the assessment of time performance and energy consumption is realized without measurement uncertainty computation, (ii) the energy consumption analysis is performed based on an approximate estimate of the average inference duration, and (iii) the impact on consumption caused by inferences is not isolated with respect to other processes.

In this paper, a new method is proposed and validated to improve the MLPTB protocols to measure power consumption in MCUs running ANNs, in a rigorous metrological framework. Specifically, in Section 2 the MLPTB framework is reported, then the proposed method is presented in Section 3. Experiments and results are reported in Section 4 and discussed in Section 5.

2. Background

Several frameworks and tools have been introduced to support the benchmarking of tinyML models [14-16]. Among the available benchmarking protocols, the MLPerf Tiny Benchmark (MLPTB) [17], developed by the MLCommons Association [18], emerges as a key initiative.

MLPTB proposes two modalities of assessment: (i) Performance and (ii) Energy. The former measures latency (inferences per second — IPS) and accuracy (ratio of correct predictions to all predictions) through a direct USB connection between a Device Under Test (DUT) and a host computer, while the latter measures energy (micro-joules per inference). In the remainder of this section, the energy configuration mode is detailed, as it represents the central focus of this study. In the energy configuration mode (Fig. 1), an Energy Monitor is proposed to supply power to the DUT while measuring the current consumption. An Input/Output Manager is introduced to interface the Host Computer with the DUT, serving as an electrical-isolation proxy. Furthermore, MLPTB requires level shifters to adapt the power supply in input to the DUT (not reported in Fig. 1 to simplify the schematic, as they are not essential to the discussion).

In addition to defining assessment procedures, MLPTB provides some firmware and software [19] for ML tasks on the DUT. In particular, the provided firmware to be loaded onto the DUT ensures the following functionalities: (i) sending a trigger signal, (ii) enabling UART communication, (iii) generating and feeding random input data to the ANN, (iv) performing inferences, and (v) printing the prediction results. The software includes a graphical user interface that can be run on the Host Computer, allowing the initiation of the measurement and the monitoring of input data. It is important to emphasize that in phase (iii) random data are generated to feed the ANN. This operation, however, does not reflect real-world applications, where the network processes sensor data in real time. Although not an intrinsic part of ANN inference, MLPTB includes this step in the performance and energy measurements. Throughout this paper, phase (iii) is explicitly distinguished from phase (iv) (i.e., inference) and is referred to as the pre-inference phase.

Fig. 1. Energy measurement setup proposed by MLPerf Tiny Benchmark [17,19]. The DUT is powered by the Energy Monitor. The IO Manager serves as an electrical-isolation proxy.

The energy per inference ($E_{inf}$) is calculated using latency information determined in the Performance phase. Specifically, the IPS is determined by taking the median value across five experiments. In each experiment, input data is provided for a duration of at least 10 s, and the number of inferences is recorded via a direct connection between the Host Computer and the DUT. Given the IPS, $E_{inf}$ is computed as:

$E_{inf} = \frac{I_m \times V_n}{\tau \times IPS}$  (1)

where $V_n$ is the nominal voltage and $I_m$ is the current averaged over the fixed period $\tau$.

3. Proposed method

The MLCommons pre-inference phase generates random numbers as input to the ANN in order to perform inference (in addition to the memory operations needed to provide the input to the network). However, random number generation is hardly reproducible across different devices under test, since both the libraries and the hardware resources available on the microcontrollers for random number generation vary. In contrast, the proposed work selectively excludes the pre-inference phase from the performance and energy measurements, ensuring greater reproducibility while also providing closer adherence to the actual operation of the device in real-world scenarios. In the remainder of this section, the proposed method is described. In paragraph 3.1 the circuit solution for the joint measurement of time and energy consumption is described. In paragraph 3.2 the expected impact of the method on selectivity, accuracy, and uncertainty during the energy measurement is highlighted.
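For reference, the MLPTB energy computation of Eq. (1) can be sketched in a few lines. The sketch implements the formula literally as printed; every numeric input (run counts, current, voltage, window length) is an invented illustration, not a value measured in this work.

```python
from statistics import median

def mlptb_energy_per_inference(i_m: float, v_n: float, tau: float, ips: float) -> float:
    """Eq. (1): E_inf = (I_m * V_n) / (tau * IPS), with I_m the current
    averaged over the fixed period tau and V_n the nominal voltage."""
    return (i_m * v_n) / (tau * ips)

# IPS is the median over five Performance-mode runs, each lasting at
# least 10 s (hypothetical inference counts).
runs = [(1021, 10.0), (1018, 10.0), (1030, 10.0), (1025, 10.0), (1019, 10.0)]
ips = median(count / duration for count, duration in runs)

# Hypothetical electrical values: 12 mA average current, 3.3 V nominal supply.
e_inf = mlptb_energy_per_inference(i_m=0.012, v_n=3.3, tau=10.0, ips=ips)
```

Taking the median across runs, as MLPTB prescribes, makes the IPS figure robust to a single outlier run.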
Fig. 2. Proposed energy measurement setup. The Host Computer powers the DUT and an ammeter is connected in series along the power line on the DUT (e.g., an MCU).

3.1. Circuit diagram and measurement procedure

The proposed method utilizes an ammeter that does not require powering the DUT to measure the absorbed current. The ammeter is connected in series to the microprocessor on the MCU powered by the Host Computer through the USB port (Fig. 2). This approach allows the Host Computer to perform both latency and energy measurements simultaneously. Indeed, the firmware provided by MLPTB enables the DUT to update the Host Computer on the number of completed inferences through the USB connection. Instead of computing the energy per inference as the ratio between the total energy measured in a specific time window and the number of inferences (the MLPTB method), the proposed method computes the energy for each inference without considering the impact of the pre-inference phase. This is obtained by modifying the firmware provided by MLPTB: the trigger is replaced by a logic signal (inference status) that goes high during an ongoing inference and returns low otherwise. The inference status signal output from the device under test is sampled by the Measurement Board (ammeter) in parallel with the current (Fig. 3(a)). Two vectors of synchronously sampled data (current and inference status signal) are sent to the Host Computer. The current samples are processed, and the energy consumption is calculated only when the inference status samples indicate a low logic signal. Additionally, before and after each inference, the DUT reads the values of the Clock and Reset Management Unit (CRMU) and transmits them to the Host Computer to determine the duration of the inference. Finally, the software on the Host Computer computes the mean value of N inferences with associated uncertainty. In this work, N is set to 100. Similar to the MLPTB, the proposed firmware runs as the sole program on the MCU, with fully sequential execution and no concurrency or interrupts. Furthermore, in the proposed method, the inference status signal is set high immediately after the pre-inference phase, and the CRMU is queried right before the inference execution. As soon as the inference completes, the CRMU is queried again, and finally the inference status is set low to signal to the ammeter that the inference has finished. In Fig. 4, a flowchart describing the customized firmware behavior is reported.

3.2. Accuracy improvements

In the MLPTB, the number of inferences during the measurement time in energy mode is calculated using the IPS obtained from the previous latency measurement. This approach introduces accuracy issues because an estimator is used instead of the actual time of each inference. Furthermore, it is assumed with a non-negligible degree of approximation that the inferences are executed consecutively by the MCU, disregarding the impact of inter-inference operations that are still present. Finally, the delays in the transmission of the command for starting the measurement have a further impact on the accuracy, albeit to a very small extent. Specifically, this refers to the time taken by the CPU on the DUT to generate the trigger signal and by the Measurement Board to handle the interrupt triggered at its input pin (see Fig. 3).

In the proposed method, limiting the observation to a single inference at a time eliminates the approximation inherent in MLPTB, where the inference duration is estimated through the average of multiple successive inferences executed within a known time window. Specifically, the proposed method allows the exclusion of all energy contributions unrelated to the inference itself (e.g., data transfer operations to memory during the pre-inference phase). However, in the proposed method, the repetition of the measurement for each inference amplifies the impact of inaccuracies caused by the delay in transmitting the status signal. In contrast, the MLPTB approach mitigates this effect because the delay only occurs at the start of the measurement for multiple inferences. To address this issue, the inference duration ($\Delta t$) measurement is also performed. In the firmware for the DUT, the onboard counter is read immediately before and after the inference execution. The $\Delta t$ is used to appropriately resize the current sample vector acquired while the inference status signal is active. The current sample vector is trimmed at both ends by a number of elements ($N_{trim}$), calculated as follows:

$N_{trim} = \frac{f_c}{2}\left(\frac{N_{cs}}{f_c} - \Delta t\right)$  (2)

where $f_c$ is the sampling frequency of the ammeter, $N_{cs}$ is the number of current samples acquired when the inference status signal is high, and $\Delta t$ is the inference duration.

3.3. Uncertainty improvements

Two distinct phases should be addressed in the evaluation of uncertainty: (i) the inference time measurement, and (ii) the energy consumption assessment. In particular, an important source of uncertainty in MLPTB is due to the counting of inferences during the IPS measurement, affecting the inference time measurement and, consequently, also the energy consumption assessment. More deeply, the measurement window is not an integer multiple of the inference period; therefore, there is no synchronization between the end of the last inference and the end of the measurement window. This contribution can be modeled by a uniform random variable whose domain is equal to the central value of the inference duration $\Delta t_m$, with a standard deviation $\sigma_{1cont}$ computed as:

$\sigma_{1cont} = u_{t1} = \frac{\Delta t_m}{2\sqrt{3}}$  (3)

The uncertainty of the MLPTB method is assessed by assuming the median inference duration approximately equal to the mean. Differently, in the proposed method the counting uncertainty is determined by the fact that the inference duration is not an integer multiple of the counter period ($T_c$). Again, a random variable with uniform probability distribution effectively describes this aspect. The standard deviation $\sigma_{2cont}$ is computed as:

$\sigma_{2cont} = u_{t2} = \frac{T_c}{2\sqrt{3}}$  (4)

Assuming that $\Delta t_m \gg T_c$, it follows that $u_{t1} \gg u_{t2}$, and the proposed method improves the measurement uncertainty due to counting.

Then there is the uncertainty due to the variability of the duration of the processes between the inferences (pre-inference phase). The proposed method is not affected by this source of uncertainty because it excludes from the energy measurement all the processes outside the inference. Finally, both methods are exposed to the uncertainty
Fig. 3. Comparison between the block diagram of the proposed method (a) and the MLCommons-Tiny approach (b) for energy consumption measurement. The added blocks and signals are reported in red. In the proposed method, the Device Under Test stops the power consumption computation after each inference. Differently, in the MLCommons-Tiny approach, the Host Computer stops the acquisition of current samples after a fixed time window, without distinguishing between pre-inference and inference phases. Furthermore, it computes the energy consumption (μJ per inference) based on the Inferences per Second measured exploiting the Performance mode (see Section 2). The Counter and the Time Calculator blocks are used for the measurement of the duration of each inference, while an Inference Status ADC minimizes the latency between the inference start and current sample consideration. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
of the stability of the DUT (jitter) and ammeter precision, as well as to the uncertainty of the signal transmission times between the devices involved in the measurement process. For the calculation of the measurement uncertainty, the combined standard uncertainty $u_c$ is adopted, where the contribution from the type A evaluation ($u_A$) is integrated with the $K$ contributions from the type B evaluations ($u_{B_k}$), according to the following formula [20]:

$u_c = \sqrt{u_A^2 + u_{B_1}^2 + u_{B_2}^2 + \cdots + u_{B_K}^2}$  (5)

Fig. 4. Flowchart of the proposed firmware. The pre-inference phase (in red) is excluded from both time (CRMU timestamp read) and energy assessment (Inference Status digital signal setting and unsetting). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

4. Experiments and results

In this section, a comparison between the application of the proposed and MLPTB methods is presented. In paragraph 4.1 the experimental procedure is described. The DUTs and the ammeter are presented in paragraph 4.2. Results are reported in paragraph 4.3.

4.1. Experimental procedure

The MLPTB method was implemented using two different circuit configurations for measuring inference duration and energy per inference, as described in [17]. Instead, in the proposed method the two measures were realized with the same circuit solution shown in Fig. 2. The firmware used for the MLPTB measurement was modified to allow the measurement of the single inference, as described in paragraph 3.1. The four MLPerf benchmarks were retained: (i) Anomaly Detection, (ii) Keyword Spotting, (iii) Image Classification, and (iv) Visual Wake Words. Each benchmark targets a specific use case and specifies a dataset, a model, and a quality target [17].

4.2. Experimental setup

Both methods are applied on three different MCUs: STMicroelectronics STM32-H7 (clock frequency = 280 MHz), STMicroelectronics STM32-U5 (clock frequency = 160 MHz), and Rockchip RV1106 (clock frequency = 1200 MHz). The STM32H7 and the STM32U5 are general-purpose microcontrollers, the former designed for high-performance applications and the latter for ultra-low-power operation, both produced by STMicroelectronics. These devices do not have any dedicated Neural Processing Unit (NPU) hardware for ANN computation, so this part is commonly implemented in firmware that runs on the main Central Processing Unit (CPU). The firmware is automatically deployed using ST EdgeAI Core Technology and compiled through the STM32CubeIDE [21] compiler, implementing all needed tools to convert, optimize, and implement ANN models on the DUT.

The evaluation boards of the STMicroelectronics Nucleo-STM32H7 with STM32H7 microcontroller and B-U585I-IOT02A Discovery Kit with STM32U5 microcontroller were chosen for the experimental setup
(a) (b) (c)
(d)
Fig. 5. Hardware components used in the experiments: (a) H7 board with STM32H7 MCU, (b) Luckfox Pico Pro Max with Rockchip RV1106 SoC, (c) B-U585I-
IOT02 A Discovery Kit with STM32U5 MCU, and (d) Power Profiler Kit II ammeter.
(Figs. 5(a), 5(c)). They include a connector in series to the MCUs power counter values returned by two consecutive CRMU readings. On each
supply line allowing an ammeter to be inserted to assess the power board, 30 experiments were performed, each providing two latency
consumption of the DUT under operating conditions. values. For each board, the mean value and type A uncertainty were
The RV1106 is a System on Chip (SoC) produced by Rockchip Elec- computed. In the worst case, namely the Rockchip, the latency was
tronics. This device has a dedicated NPU hardware, so the computation found to be 7 ± 4 CPU clock cycles (2 ± 1 for the other two boards),
of ANN models are made by hardware, and the software shall only which corresponds to only a few nanoseconds. Tables 1, 2, and 3
allocate necessary data into a dedicated memory area. While STM32 present the results of inference duration (𝛥𝑡) assessments conducted
microcontrollers operate without an operating system, RV1106 requires using both the MLPTB and the proposed methods. The results are
the use of an operating system given its CPU architecture. Ubuntu reported for the Rockchip RV1106, STM32H7, and STM32U5, respec-
22.04 RT [22] was therefore installed to minimize execution timing tively, with varying ANN models. Concerning uncertainty computation,
uncertainties. the MLPTB method does not provide strategies for calculating mea-
The software is deployed using RKNN Toolkit compiler that im- surement uncertainty and, in this work, it was computed by referring
plements all needed tools to convert, optimize, and implement ANN to the sole contribution of the counting inferences (Eq. (2)). In the
models on the device. The evaluation board with Rockchip RV1106 proposed method, since the Clock and Reset Management Unit (CRMU)
chosen for the experimental setup is the Luckfox Pico Pro Max (Fig. of the MCUs is employed for inference time measurement, the type
5(b)). The ammeter is inserted between USB-C main supply and the A uncertainty is combined with type B contributions arising from
SoCs power supply line in order to assess the power consumption of counting uncertainty, system clock stability (jitter), and the response
device under operative conditions. time required by the CRMU to be queried and to return a value.
The measurement board used for the power assessment is the Power For all the considered microcontrollers, the type B contribution was
Profiler Kit II (PPKII) produced by Nordic Semiconductor (Fig. 5(d)).
This device is composed of an ammeter and an 8-bit digital sampler
synchronized with the same time base. It can operate in two different
modes, which affect only the ammeter component:

• Source Meter: in this mode, the internal ammeter is linked to a power
supply generator that can be used to provide the power supply to the
DUT. This mode was adopted for the MLPTB implementation.
• Ammeter Mode: in this mode, the instrument works as a pure ammeter
and the power supply of the DUT can be provided externally. This mode
was employed in the proposed method.

For both modes, the device was metrologically characterized under
operating conditions of 20–30 °C (the same conditions used for all
experiments), exhibiting an uncertainty of less than 2%.

4.3. Results

For the proposed method, a characterization of the CRMU query latency
was carried out on all devices. A modified version of the same firmware
used for the energy consumption assessment was employed. Specifically,
an additional CRMU query was appended directly after the preceding one,
making it consecutive to the two already present. The CRMU query
latency was measured as the difference between the

The uncertainty of the CRMU query latency was found to be dominated by
the counting uncertainty, computed using formula (4), and equal to
289 ns. The jitter contribution is at least three orders of magnitude
smaller at room temperature (between 20 °C and 30 °C) [23–25].
Similarly, the uncertainty related to the CRMU response time,
characterized in this work for all three microcontrollers, was found to
be equal to 1 CPU clock cycle. In the worst case, i.e., considering the
STM32U5 device with the lowest CPU clock frequency, this contribution
was on the order of nanoseconds. Therefore, the overall evaluated
uncertainty corresponds to the joint contribution of type A and type B,
with the latter coinciding with the counting uncertainty, according to:

u_t = sqrt(u_A^2 + u_B^2)    (6)

To propagate the measurement uncertainty of Δt to the energy per
inference (E_inf) measurement, a constant power P is assumed during the
inference time, obtaining the following propagation formula:

E_inf = P · Δt  ⇒  u_e = P · u_t    (7)

where u_e is the energy per inference measurement uncertainty. With
respect to the energy consumption estimation, an additional uncertainty
source arises from the measuring instrument, i.e., the ammeter
employed. For both methods, an instrumental uncertainty of 2% was
considered, after a metrological characterization performed under
operational conditions at room temperature (between 20 °C and 30 °C).
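The uncertainty budget above, Eqs. (6) and (7) together with the later combination of the 2% instrumental term in Eq. (8), can be checked numerically. The sketch below is illustrative only: the 50 ns type A scatter and the 150 mW power are hypothetical placeholders, not measurements from this work; only the 289 ns counting term comes from the text.

```python
import math

def combined_uncertainty(u_a, u_b):
    # Eq. (6): root-sum-of-squares of the type A (statistical) and
    # type B (counting) contributions to the duration uncertainty.
    return math.sqrt(u_a ** 2 + u_b ** 2)

def propagate_to_energy(power_w, u_t_s):
    # Eq. (7): with constant power P during the inference, E_inf = P * dt,
    # so the duration uncertainty propagates linearly: u_e = P * u_t.
    return power_w * u_t_s

def total_energy_uncertainty(u_tp, u_s):
    # Eq. (8): combine the propagated timing term with the ammeter's
    # instrumental uncertainty in quadrature.
    return math.sqrt(u_tp ** 2 + u_s ** 2)

# Hypothetical numbers: 50 ns type A scatter plus the 289 ns counting
# uncertainty quoted in the text; 150 mW assumed constant power.
u_t = combined_uncertainty(50e-9, 289e-9)   # dominated by the 289 ns term
u_e = propagate_to_energy(0.150, u_t)       # energy uncertainty, joules
```

As expected, the counting term dominates: the combined u_t is only about 1.5% larger than the 289 ns type B contribution alone.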
A. Apicella et al. Computer Standards & Interfaces 97 (2026) 104120
Table 1
Comparison of central value (m_t) and uncertainty^a (u_t) of inference duration (expressed in ms) assessed by MLCommons and proposed methods on Rockchip RV1106 for varying neural models.

Method     Visual Wake Words   Image Classification   Keyword Spotting   Anomaly Detection
           m_t      u_t        m_t      u_t           m_t      u_t       m_t      u_t
Proposed   0.820    0.006      0.415    0.012         0.400    0.008     0.558    0.033
MLPTB      0.815    0.235      0.414    0.120         0.371    0.107     0.350    0.101

^a In MLPTB, the counting uncertainty was taken into account.
Table 2
Comparison of central value (m_t) and uncertainty^a (u_t) of inference duration (expressed in ms) assessed by MLCommons and proposed methods on STM32H7 microcontroller for varying neural models.

Method     Visual Wake Words   Image Classification   Keyword Spotting   Anomaly Detection
           m_t      u_t        m_t      u_t           m_t      u_t       m_t      u_t
Proposed   29.656   0.003      49.941   0.001         14.860   0.001     1.690    0.002
MLPTB      29.600   8.545      51.900   14.982        15.400   4.446     1.800    0.520

^a In MLPTB, the counting uncertainty was taken into account.
Table 3
Comparison of central value (m_t) and uncertainty^a (u_t) of inference duration (expressed in ms) assessed by MLCommons and proposed methods on STM32U5 microcontroller for varying neural models.

Method     Visual Wake Words   Image Classification   Keyword Spotting   Anomaly Detection
           m_t      u_t        m_t      u_t           m_t      u_t       m_t      u_t
Proposed   78.447   0.002      133.280  0.002         48.060   0.001     4.910    0.002
MLPTB      71.600   20.669     128.200  37.008        38.600   11.143    4.800    1.386

^a In MLPTB, the counting uncertainty was taken into account.
Table 4
Comparison of central value (m_t) and uncertainty^a (u_e) of energy (expressed in μJ) assessed by MLCommons and proposed methods on Rockchip RV1106 for varying neural models.

Method     Visual Wake Words   Image Classification   Keyword Spotting   Anomaly Detection
           m_t      u_e        m_t      u_e           m_t      u_e       m_t      u_e
Proposed   380      13         193      15            165      9         222      11
MLPTB      373      108        183      53            159      46        148      43

^a In MLPTB, the counting uncertainty was propagated into the energy measurements.
Table 5
Comparison of central value (m_t) and uncertainty^a (u_e) of energy (expressed in μJ) assessed by MLCommons and proposed methods on STM32H7 microcontroller for varying neural models.

Method     Visual Wake Words   Image Classification   Keyword Spotting   Anomaly Detection
           m_t      u_e        m_t      u_e           m_t      u_e       m_t      u_e
Proposed   4386     88         7536     151           2202     44        236      6
MLPTB      3699     1068       6311     1822          1870     540       221      64

^a In MLPTB, the counting uncertainty was propagated into the energy measurements.
The final uncertainty was thus obtained by applying the following formula:

u_e = sqrt(u_tp^2 + u_s^2)    (8)

where u_tp denotes the inference time measurement uncertainty u_t
propagated through the functional relation used for energy computation
(see formula (7)), and u_s represents the instrumental uncertainty of
the ammeter. The measurement uncertainty obtained for the proposed
method appears for all tested devices to be very low compared to the
uncertainty of the MLPTB method.

In Tables 4, 5, and 6, a comparison between the results of the energy
per inference assessment by the MLPTB and proposed methods is reported
for the three DUTs. On the Rockchip RV1106, the proposed method
measures an inference energy value that is, on average, 15% higher than
that obtained with MLPTB, while improving the uncertainty by a factor
of 6. In the case of the STM32H7, the inference energy assessment grows
by 16% while the uncertainty improves by a factor of 12. Notably, the
inference energy assessment on the STM32U5 shows contrasting trends:
for two networks, the measured consumption is higher with the proposed
method, while for the other two networks it is higher with MLCommons.
Regarding the uncertainty, the proposed method reduces it by a factor
of 12.

5. Discussion

The contrasting trends from the energy assessment on the STM32U5
provide an opportunity to discuss the relationship between the two
methods in terms of metrological accuracy. The MLCommons method
extracts a central Inference Per Second value based on five
experiments, whereas our method computes a central value as the mean
over 100 acquisitions. Given the large uncertainty of the MLPTB method
and the limited number of experiments, the calculated central value is
unlikely to be a reliable estimator of the true value of the measured
quantity [26]. The comparison of mean values obtained with the two
methods is limited by the large difference in their associated
uncertainties. The less precise method exhibits an uncertainty up to two orders
Fig. 6. Temporal diagram of current values acquired from MCU during ANN operations. Orange traces represent (a) the inference status signal in the proposed
method and (b) the trigger signal in the MLPTB method. The windows used for energy consumption estimation are highlighted in light blue. Specifically, the
proposed method (a) considers only the current samples acquired during each neural network inference phase, whereas the MLPTB method (b) also includes the
energy contribution of pre-inference phases (light yellow window). (For interpretation of the references to color in this figure legend, the reader is referred to
the web version of this article.)
Fig. 7. Comparison between the proposed method (orange) and MLPTB (green) in energy per inference assessment on the Rockchip RV1106, for the models provided by MLCommons. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Table 6
Comparison of central value (m_t) and uncertainty^a (u_e) of energy (expressed in μJ) assessed by MLCommons and proposed methods on STM32U5 microcontroller for varying neural models.

Method     Visual Wake Words   Image Classification   Keyword Spotting   Anomaly Detection
           m_t      u_e        m_t      u_e           m_t      u_e       m_t      u_e
Proposed   2362     47         3249     65            1184     27        116      3
MLPTB      1921     556        3384     980           1004     291       121      35

^a In MLPTB, the counting uncertainty was propagated into the energy measurements.
of magnitude higher than the other, rendering direct statistical
comparisons of the means largely insignificant. Observed differences
may therefore primarily reflect the inherent variability of the less
accurate method rather than genuine differences in the measured
phenomenon. However, it is important to note that the proposed method
provides greater selectivity by excluding the pre-inference phase
(characterized by low energy consumption) from the calculation
(Fig. 6). This prevents underestimation of the actual energy
consumption, which may occur when using the MLPTB method.

Finally, Figs. 7, 8, and 9 present the histograms of the energy per
inference assessment with the two methods on the Rockchip RV1106,
STM32H7, and STM32U5, respectively. The orange bars (proposed
Fig. 8. Comparison between the proposed method (orange) and MLPTB (green) in energy per inference assessment on the STM32H7, for the models provided by MLCommons. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 9. Comparison between the proposed method (orange) and MLPTB (green) in energy per inference assessment on the STM32U5, for the models provided by MLCommons. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
method) are generally higher than the green bars (MLPTB). However,
comparing the mean values measured by the two methods is challenging
due to the large uncertainty intervals (error bars) associated with
MLPTB. Nevertheless, the differences in error bar lengths confirm the
improved precision of the proposed method.

The metrological improvements introduced in this work have direct
consequences for the practical adoption of embedded AI. First, more
accurate and reproducible energy assessments enhance the reliability of
benchmarking, enabling fair comparisons among devices and supporting
informed selection of hardware for battery-powered applications, where
autonomy is a critical design constraint. Second, the improved accuracy
in energy characterization facilitates more precise sizing of power
supply components, which is essential for ensuring efficiency,
stability, and cost-effectiveness in embedded deployments. Finally, the
refined timing characterization allows designers to better estimate
inference latency, a key parameter for real-time and safety-critical
applications.

6. Conclusions

A new method for assessing the power consumption of edge devices such
as MCUs running ANNs is presented, claiming metrological improvements
over the MLPerf Tiny Benchmark. Unlike MLPTB, the proposed method
calculates the duration and energy consumption of each individual
inference performed by the Device Under Test. Through an appropriate
circuit and firmware design, the method measures only the energy
consumed by the inference, excluding other operations from the
computation. This approach not only enhances the selectivity and
accuracy of the measurement process but also reduces measurement
uncertainty. Instead of counting the number of inferences over a fixed
interval, as MLPTB does, the proposed method counts the number of ticks
from the counter of the DUT during a single inference execution. On an
NPU-powered microcontroller, the proposed method improves measurement
uncertainty by a factor of 6. In the case of two general-purpose
microcontrollers (high-performance and ultra-low-power), the
measurement uncertainty improves by a factor of 12.
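The tick-counting principle summarized in the conclusions reduces to a simple conversion from counter ticks to time, which also shows why the counting quantization is one clock period. The tick count below is a hypothetical reading, not a reported measurement; 480 MHz is the STM32H7 clock frequency from its datasheet [23].

```python
def ticks_to_duration_ms(n_ticks, f_clk_hz):
    # One inference is timed by counting DUT counter ticks; the duration
    # is the tick count divided by the counter clock frequency.
    return n_ticks / f_clk_hz * 1e3

def counting_resolution_ns(f_clk_hz):
    # The count is quantized to whole ticks, so the resolution (and hence
    # the counting uncertainty) is on the order of one clock period.
    return 1e9 / f_clk_hz

# Hypothetical reading: 14,236,800 ticks at 480 MHz -> 29.66 ms,
# with a quantization of about 2.08 ns per tick.
duration_ms = ticks_to_duration_ms(14_236_800, 480e6)
resolution_ns = counting_resolution_ns(480e6)
```

The single-tick quantization is what makes the counting uncertainty of the proposed method orders of magnitude smaller than MLPTB's fixed-interval inference counting.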
CRediT authorship contribution statement

Andrea Apicella: Writing – review & editing, Methodology, Conceptualization. Pasquale Arpaia: Writing – review & editing, Methodology, Conceptualization. Luigi Capobianco: Writing – review & editing, Methodology, Conceptualization. Francesco Caputo: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Antonella Cioffi: Writing – review & editing, Methodology, Conceptualization. Antonio Esposito: Writing – review & editing, Methodology, Conceptualization. Francesco Isgrò: Writing – review & editing, Methodology, Conceptualization. Rosanna Manzo: Writing – review & editing, Methodology, Conceptualization. Nicola Moccaldi: Writing – review & editing, Methodology, Conceptualization. Danilo Pau: Writing – review & editing, Methodology, Conceptualization. Ettore Toscano: Writing – review & editing, Methodology, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was carried out within the DHEAL-COM project (ID: PNC-E3-2022-23683267 PNC HLS DH; CUP: E63C22003790001), which was financially supported by the Italian Ministry of Health through the Complementary National Plan (CNP) to the PNRR. This publication reflects only the authors' view and the Italian Ministry of Health is not responsible for any use that may be made of the information it contains.

Data availability

Data will be made available on request.

References

[1] R. Chataut, A. Phoummalayvane, R. Akl, Unleashing the power of IoT: A comprehensive review of IoT applications and future prospects in healthcare, agriculture, smart homes, smart cities, and industry 4.0, Sensors 23 (16) (2023) 7194.
[2] Q. Ma, H. Tan, T. Zhou, Mutual authentication scheme for smart devices in IoT-enabled smart home systems, Comput. Stand. Interfaces 86 (2023) 103743.
[3] C.-W. Shih, C.-H. Wang, Integrating wireless sensor networks with statistical quality control to develop a cold chain system in food industries, Comput. Stand. Interfaces 45 (2016) 62–78.
[4] S.B. Baker, W. Xiang, I. Atkinson, Internet of things for smart healthcare: Technologies, challenges, and opportunities, IEEE Access 5 (2017) 26521–26544.
[5] Y. Abadade, A. Temouden, H. Bamoumen, N. Benamar, Y. Chtouki, A.S. Hafid, A comprehensive survey on TinyML, IEEE Access (2023).
[6] M. Cunneen, M. Mullins, F. Murphy, Autonomous vehicles and embedded artificial intelligence: The challenges of framing machine driving decisions, Appl. Artif. Intell. 33 (8) (2019) 706–731.
[7] J. Li, S. Dang, M. Wen, Q. Li, Y. Chen, Y. Huang, W. Shang, Index modulation multiple access for 6G communications: Principles, applications, and challenges, IEEE Netw. 37 (1) (2023) 52–60.
[8] M. Wen, B. Zheng, K.J. Kim, M. Di Renzo, T.A. Tsiftsis, K.-C. Chen, N. Al-Dhahir, A survey on spatial modulation in emerging wireless systems: Research progresses and applications, IEEE J. Sel. Areas Commun. 37 (9) (2019) 1949–1972.
[9] M.I. Jordan, T.M. Mitchell, Machine learning: Trends, perspectives, and prospects, Science 349 (6245) (2015) 255–260.
[10] S. Mishra, J. Manda, Improving real-time analytics through the internet of things and data processing at the network edge, J. AI Assist. Sci. Discov. 4 (1) (2024) 184–206.
[11] M. De Donno, K. Tange, N. Dragoni, Foundations and evolution of modern computing paradigms: Cloud, IoT, edge, and fog, IEEE Access 7 (2019) 150936–150948.
[12] D.P. Pau, P.K. Ambrose, F.M. Aymone, A quantitative review of automated neural search and on-device learning for tiny devices, Chips 2 (2) (2023) 130–141.
[13] C.-T. Lin, P.X. Huang, J. Oh, D. Wang, M. Seok, iMCU: A 102-μJ, 61-ms digital in-memory computing-based microcontroller unit for edge TinyML, in: 2023 IEEE Custom Integrated Circuits Conference, CICC, IEEE, 2023, pp. 1–2.
[14] S. Gal-On, M. Levy, Exploring CoreMark: a benchmark maximizing simplicity and efficacy, Embed. Microprocess. Benchmark Consortium (2012).
[15] P. Torelli, M. Bangale, Measuring Inference Performance of Machine-Learning Frameworks on Edge-Class Devices with the MLMark Benchmark, Technical Report, 2021, Available online: https://www.eembc.org/techlit/articles/MLMARK-WHITEPAPERFINAL-1.pdf. (Accessed on 5 April 2021).
[16] B. Sudharsan, S. Salerno, D.-D. Nguyen, M. Yahya, A. Wahid, P. Yadav, J.G. Breslin, M.I. Ali, TinyML benchmark: Executing fully connected neural networks on commodity microcontrollers, in: 2021 IEEE 7th World Forum on Internet of Things, WF-IoT, IEEE, 2021, pp. 883–884.
[17] C. Banbury, V.J. Reddi, P. Torelli, J. Holleman, N. Jeffries, C. Kiraly, P. Montino, D. Kanter, S. Ahmed, D. Pau, et al., MLPerf Tiny benchmark, 2021, arXiv preprint arXiv:2106.07597.
[18] MLCommons, 2024, URL: https://mlcommons.org/benchmarks/inference-tiny/.
[19] Performance mode vs. Energy mode, 2022, URL: https://github.com/eembc/energyrunner?tab=readme-ov-file#performance-mode-vs-energy-mode.
[20] B.N. Taylor, C.E. Kuyatt, Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, NIST Technical Note 1297, National Institute of Standards and Technology (NIST), Gaithersburg, MD, 2020, http://dx.doi.org/10.6028/NIST.TN.1297-2020.
[21] STMCubeIDE, 2022, URL: https://stm32ai.st.com/stm32-cube-ai/.
[22] Ubuntu 12 RT, 2012, Real-time variant of Ubuntu 12, Canonical Ltd., https://ubuntu.com/real-time.
[23] STMicroelectronics, STM32H753xI - 32-bit Arm® Cortex®-M7 480 MHz MCUs, 2 MB flash, 1 MB RAM, 46 com. and analog interfaces, crypto - Datasheet - Production data, Datasheet DS12117 Rev 9, STMicroelectronics, 2023, p. 358, URL: https://www.st.com/resource/en/datasheet/stm32h753vi.pdf. (Accessed 21 August 2025).
[24] STMicroelectronics, STM32U575xx - Ultra-low-power Arm® Cortex®-M33 32-bit MCU+TrustZone®+FPU, 240 DMIPS, up to 2 MB Flash memory, 786 KB SRAM - Datasheet - Production data, Datasheet DS13737 Rev 10, STMicroelectronics, 2024, p. 346, URL: https://www.st.com/resource/en/datasheet/stm32u575ag.pdf. (Accessed 21 August 2025).
[25] UEC Electronics, AR4236-AR4237 Luckfox Pico Pro/Max Datasheet, Datasheet, UEC Electronics, 2024, URL: https://uelectronics.com/wp-content/uploads/2024/07/AR4236-AR4237-Luckfox-Pico-Pro-Max-Datasheet.pdf. (Accessed 21 August 2025).
[26] BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP, OIML, Evaluation of measurement data - Guide to the expression of uncertainty in measurement, JCGM 100:2008, GUM 1995 with minor corrections, Jt. Comm. Guides Metrol. 98 (2008).
OPRF and PSI. It achieves a time complexity of O(n log n), superior to existing well-known fast post-quantum PSI protocols operating at O(mn log(mn)), where m is the bit length of the cryptographic modulus and n represents the dimension of the security parameter. Simulation experiments and security analyses demonstrate that our proposal effectively preserves user privacy, ensures collusion resilience, verifies computation results, and maintains low computational costs. Finally, as an extension of our OPRF, we also give a fast private information retrieval (PIR) protocol.
1. Introduction

Mobile social networks have greatly enriched the ways people
communicate and enhanced the convenience of social interactions. With
the development of technology, users generate a large amount of useful
and sensitive personal data within mobile social networks. This data
often needs to be stored and processed to provide more personalized
services and experiences [1,2]. However, due to the limited storage
capacity of mobile social network devices, it is impossible to store
all the data generated at any given moment, which presents challenges
for data storage and privacy protection.

To address this issue while ensuring data confidentiality and security,
many mobile social network platforms have started adopting advanced
privacy-preserving technologies, such as private set intersection
(PSI). The technology allows two or more parties to securely compute
the intersection of their datasets without disclosing their respective
data sets. This way, even if data is stored in distributed systems, it
can effectively prevent data breaches and violations of user privacy,
such as those caused by data leaks or unauthorized access. The
application of PSI in mobile social networks not only enhances data
security but also strengthens user trust in the platform, which is
crucial for protecting user privacy and improving the platform's
competitiveness. In this way, mobile social networks can continue to
provide a rich and vibrant social experience and efficient information
services while safeguarding personal privacy. Furthermore, as an
important application in the field of privacy computing, PSI has
recently garnered widespread attention due to its efficiency and
practicality, jointly promoting the rapid implementation of privacy
computing technology and ensuring the secure flow and value extraction
of data elements.
✩ This document is the result of a research project funded by the National Science Foundation.
∗ Corresponding author.
E-mail addresses: arcsec30@stu.xidian.edu.cn (Z. Shan), lyzhang@mail.xidian.edu.cn (L. Zhang), xiyouwuq@126.com (Q. Wu), laiqq@snnu.edu.cn (Q. Lai), fuchun@uow.edu.au (F. Guo).
https://doi.org/10.1016/j.sysarc.2025.103346
Received 3 November 2024; Received in revised form 24 December 2024; Accepted 16 January 2025
Available online 25 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
There are many common construction tools for PSI [3], and oblivious
transfer (OT) is one of them. An OT [4] is a crucial tool used for
secure multiparty computation. In this tool, the sender transmits data
from a set of messages to the receiver but remains oblivious to which
specific message was sent, while the receiver is unaware of the other
messages they did not receive. This protocol is also known as the
oblivious transfer protocol. The essence of an oblivious pseudorandom
function is a pseudorandom function (PRF) enhanced with oblivious
transfer capabilities.

In 1986, Goldreich, Goldwasser, and Micali introduced a new
cryptographic primitive known as the pseudorandom function, whose
output appears to be randomly chosen [5]. Two decades later, Naor and
Reingold [6] noticed that their number-theoretic PRF allows for an
interactive and oblivious evaluation, where a client with input x
obtains F_k(x) for a function F_k(x) that is contributed by a server.
Neither does the client learn the function (i.e., its key k), nor does
the server learn x or F_k(x). Freedman et al. later called such a
two-party protocol an OPRF and gave the first formal definitions and
two OPRFs based on the Naor–Reingold PRF [7]. In 2009, Jarecki and Liu
presented an efficient OPRF for securing intersection data [8].

Oblivious pseudorandom functions have been utilized in PSI [9]. The
additional functionalities of oblivious pseudorandom functions also
exhibit diversity, such as verifiable oblivious pseudorandom functions
(VOPRF, [10]) and partially oblivious pseudorandom functions
(POPRF, [11]).

Currently, OPRFs still face challenges, as summarized by Casacuberta,
Hesse, and Lehmann [12]. Efficient OPRF constructions often rely on
discrete-log or factoring-type hardness assumptions, which are
vulnerable to quantum computers. This paper aims to address this by
constructing OPRFs based on lattice-hardness assumptions and improving
their efficiency (see Figs. 1 and 2).

1.1. Contributions

Regarding the open problem proposed by Casacuberta, there are currently
quantum-resistant OPRFs, namely Albrecht et al.'s lattice-based
VOPRF [10] and Boneh et al.'s isogeny-based OPRF [13]. Both
constructions represent significant feasibility results but require
further research to improve their efficiency [12]. So, a fast
post-quantum private set intersection from an oblivious pseudorandom
function is proposed in this paper, and it has the following
advantages:

• Asymmetric encryption is adopted, which is efficient and reduces the
risk of privacy leakage. The PSI in this paper is constructed based on
OPRF, which belongs to asymmetric encryption, thus reducing the number
of interactions between users and lowering the risk of user privacy
leakage. Compared to symmetric encryption, asymmetric encryption
reduces reliance on authoritative institutions.
• The structure of the OPRF is simple, and it is relatively efficient
among post-quantum OPRFs. The OPRF used to construct the PSI in this
paper is based on a new lattice problem, namely the learning parity
with rounding over rings problem (Ring-LPR). The Ring-LPR problem not
only has a simple structure but also possesses the capability to resist
quantum attacks.
• A perturbed pseudorandom generator (PPRG) can withstand probabilistic
attacks. In addition to the OPRF, the PSI in this paper also includes a
structure with a perturbed pseudorandom generator, which can overcome
the weakness of weak encryption in symmetric encryption, thereby
preventing adversaries from guessing the corresponding plaintext using
statistical methods on the ciphertext ratios.

Fig. 1. Mobile social networks.

Fig. 2. Private set intersection.

1.2. Technical overview

We adopted the oblivious transfer technique and Hamming correlation
robustness, both of which are used in the OPRF construction presented
in this paper. For the underlying pseudorandom function, we initially
aimed to use learning parity with noise (LPN) over rings. However, this
approach results in varying encryption outcomes for the same private
data, preventing the recipient from matching the private data. Thus, we
sought to make LPN over rings behave consistently like learning with
rounding (LWR), leading to the introduction of the concept of learning
parity with rounding over rings (LPR over rings) in this paper.

To prove that LPR over rings is quantum-resistant, we established a
reduction bridge between LPR over rings and LWR. Yes, LPR over rings is
reduced to LWR, not LPN over rings. For (q = 2^n, p)-LWR instances, we
demonstrated the hardness of (q = 2, p = 1)-LWR instances and
(q = 2, p = 1)-LWR over rings, where (q = 2, p = 1)-LWR over rings
corresponds to LPR over rings. To verify that the computational
efficiency of the post-quantum OPRF in this paper is quite fast, we
compared the OPRF with the LWE-instantiated OPRF from [14]. The results
showed that, as theoretical analysis suggested, the computation
efficiency improves with the increase of security parameters.

Based on the OPRF, we constructed a private set intersection (PSI)
protocol. Since the paper [15] analyzed that PSI based on symmetric
encryption does not resist probabilistic attacks and proposed the
concept of a perturbed pseudorandom generator, we used LPN over rings
to construct a pseudorandom generator and proved that it satisfies the
definition of PPRG as given in [15].

1.3. Organizations

The structure of this paper is as follows. Section 3 provides the
necessary definitions and lemmas as a foundation for the reader's
knowledge. Section 4 presents the construction and efficiency analysis
of the OPRF, along with the definition and reduction of Ring-LPR.
Section 5 details the construction of the PSI in this paper, security
proofs, and LWE-based efficiency analysis, as well as the construction
of the PPRG and the proof of its pseudorandomness. Finally, Section 6
summarizes the advantages and limitations of the PSI presented in this
paper, as well as the extension of the OPRF to PIR.
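The determinism gap described in the technical overview, where noise-based LPN re-randomizes each evaluation while rounding-based LWR maps equal inputs to equal outputs, can be seen in a toy sketch. The moduli, dimensions, and noise rate below are illustrative placeholders far below anything cryptographic, and this is not the paper's Ring-LPR construction.

```python
import random

def lwr_round(x, q=256, p=16):
    # LWR-style output: deterministic rounding floor((p/q) * x) mod p,
    # so the same input always yields the same value.
    return (p * x // q) % p

def lpn_sample(a, s, rng, noise_rate=0.3):
    # LPN-style output over Z_2: <a, s> + e mod 2 with Bernoulli noise e,
    # so repeated evaluations of the same input need not agree.
    inner = sum(ai * si for ai, si in zip(a, s)) % 2
    e = 1 if rng.random() < noise_rate else 0
    return (inner + e) % 2

rng = random.Random(0)
a, s = [1, 1, 0, 1], [1, 0, 1, 1]
lpn_outputs = {lpn_sample(a, s, rng) for _ in range(200)}  # noise flips some bits
lwr_outputs = {lwr_round(200) for _ in range(200)}         # always the same value
```

This is exactly why the paper replaces the noise term with rounding: a recipient can only match private data if equal inputs produce equal encodings.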
2. Preliminary

Each element of a lattice L in R^n can be expressed as an
integer-coefficient linear combination of n linearly independent
vectors. This set of linearly independent vectors is called a lattice
basis, and we know that the lattice basis is not unique. Given a
lattice basis (v_1, ..., v_n) of the lattice L, the fundamental
parallelepiped is

P(v_1, ..., v_n) = { sum_{i=1}^{n} k_i v_i | k_i ∈ [0, 1) }.

If the lattice basis (v_1, ..., v_n) is fixed, we write P(L) for
P(v_1, ..., v_n). For every x ∈ R^n, project it onto P(L); by the
properties of projection, there is a unique y ∈ P(L) such that
y - x ∈ L.

The symbol det(L) denotes the volume of the fundamental parallelepiped
of the lattice L. In other words, det(L) is the determinant of the
matrix composed of a lattice basis (v_1, ..., v_n). For a given
n-dimensional lattice, det(L) is the same for every choice of lattice
basis.

Given an n-dimensional lattice L, let (v_1, ..., v_n) and
(u_1, ..., u_n) be two arbitrary lattice bases of L. Then
v_i = sum_{j=1}^{n} m_{ij} u_j and u_i = sum_{j=1}^{n} m'_{ij} v_j for
i ∈ {1, ..., n}, i.e., there are two integer matrices M and M' such
that

(v_1, ..., v_n)^T = M (u_1, ..., u_n)^T  and
(u_1, ..., u_n)^T = M' (v_1, ..., v_n)^T.

It is easy to prove that M and M' are inverses of each other; since M
and M' are both integer matrices, det(M) · det(M') = 1 and
det(M) = det(M') = ±1, so

det(v_1, ..., v_n) = ± det(u_1, ..., u_n).

Definition 1. An ideal is a subset of a ring (or domain) that satisfies
the following two properties:

1. Additive closure: if any two elements of the ideal are added, the
result is still in the ideal. In other words, for any elements a and b
in the ideal, a + b also belongs to that ideal.
2. Multiplicative absorptivity: if an element of the ideal is
multiplied by any element of the ring (or field), the result is still
in the ideal. In other words, for any element a in the ideal and any
element r in the ring (or field), ar and ra belong to that ideal.

For a commutative ring, we further require that the ideal be closed
under both addition and multiplication. Such an ideal is called a true
ideal.

Definition 2. Referring to the definition of an ideal, the ideal
lattice L~ is a subset of the lattice L that satisfies the following
two properties:

1. Additive closure: if any two elements of an ideal lattice are added,
the result is still in the ideal lattice. In other words, for any
elements a and b in an ideal lattice, a + b also belongs to that ideal
lattice.
2. Multiplicative absorptivity: if an element of an ideal lattice is
multiplied by an element of any other ideal lattice, the result remains
in the ideal lattice. In other words, for any element a in the ideal
and any element r in another ideal lattice, both ar and ra belong to
that ideal lattice.

Corollary 1. The ideal lattice L~ is a true ideal of the lattice L.

The polynomial f(x) = a_0 + a_1 x + ... + a_{n-1} x^{n-1} is mapped to

Rot(f) = a_0 I + a_1 X + ... + a_{n-1} X^{n-1} ∈ L~,

where L~ is the image in the ideal lattice L of the elements of
Z[x]/<x^n + 1>, and

X =
⎛ 0 0 0 ⋯ 0 1 ⎞
⎜ 1 0 0 ⋯ 0 0 ⎟
⎜ 0 1 0 ⋯ 0 0 ⎟
⎜ 0 0 1 ⋯ 0 0 ⎟
⎜ ⋮ ⋮ ⋮ ⋱ ⋮ ⋮ ⎟
⎝ 0 0 0 ⋯ 1 0 ⎠

So there is

Rot(f) =
⎛ a_0      a_{n-1}  ⋯  a_1 ⎞
⎜ a_1      a_0      ⋯  a_2 ⎟
⎜ ⋮        ⋮        ⋱  ⋮   ⎟
⎝ a_{n-1}  a_{n-2}  ⋯  a_0 ⎠ ,

and it is easy to prove that this mapping relationship is an
isomorphism.

Definition 3 (Learning with Rounding, [16,17]). Let λ be the security
parameter and let n = n(λ), m = m(λ), q = q(λ), p = p(λ) be integers.
The LWR problem states that for A ∈ Z_q^{m×n}, s ∈ Z_q^n, u ∈ Z_q^m,
the following distributions are computationally indistinguishable:
(A, ⌊As⌋_p) ≈_C (A, ⌊u⌋_p). Here ⌊x⌋_p = ⌊(p/q) x⌋, where ⌊·⌋ denotes
the floor function, which rounds down to the nearest integer; for
example, ⌊3.14⌋ = 3 and ⌊3⌋ = 3.

Definition 4 (Learning Parity with Noise, [18,19]). Let λ be the
security parameter and let n = n(λ), m = m(λ) be integers. The LPN
problem states that for A ∈ Z_2^{m×n}, s ∈ Z_2^n, u, e ∈ Z_2^m, the
following distributions are computationally indistinguishable:
(A, As + e) ≈_C (A, u).

Definition 5 (Hamming Correlation Robustness, [14]). For a hash
function H(·) and a pseudorandom function F_k(·) with key k, H(·) is
Hamming correlation robust if H(x) ≈_C F_k(x).

Definition 6 (OT_1). The message sender sends data to the receiver
from a set of pending messages but remains oblivious to which specific
message was sent. Meanwhile, the receiver is unaware of the other
messages they did not receive. This protocol is also known as oblivious
transfer.

Definition 7 (OPRF, [20]). Let the PRF key k consist of two bit-strings
q, s ∈ {0, 1}^λ. Let F(·) be a pseudorandom code that produces a
pseudorandom string and let H be a hash function. The pseudorandom
function is computed as

OPRF_k(x) = H(q ⊕ [F(x) · s]),

where · denotes bitwise AND and ⊕ denotes bitwise XOR. For a randomly
generated s, if F(x) has enough Hamming weight, then the function
OPRF_k(x) is pseudorandom, assuming the hash function H is correlation
robust.

Definition 8 (PSI, [14]). PSI enables two parties, each holding a
private set of elements, to compute the intersection of the two sets
while revealing nothing more than the intersection itself.

Definition 9 (Dihedral Coset Problem). Given a security parameter κ, an
instance of the DCP_q^ℓ problem, where q denotes the modulus and ℓ
represents the number of states, consists of states of the form

|0⟩|x_i⟩ + |1⟩|(x_i + s) mod q⟩,  i ≤ ℓ,

each stored in 1 + ⌈log_2 q⌉ bits, where x_i ∈_R Z_q^n and s ∈ Z_q^n.
If s can be computed with probability poly(1/log q) in time
poly(log q), then the DCP_q^ℓ problem is considered to be broken.
3
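Definition 7 can be made concrete with a short toy sketch. Here SHA-256 stands in for the hash ℋ, a hash-derived string stands in for the pseudorandom code F, and the bit-length LAM is an illustrative assumption; none of these choices are the paper's actual instantiation.

```python
import hashlib
import secrets

LAM = 128  # toy bit-length; an illustrative assumption, not the paper's parameters

def F(x: bytes, lam: int = LAM) -> int:
    """Toy stand-in for the pseudorandom code F(.): a hash-derived lam-bit string."""
    d = hashlib.sha256(b"prc|" + x).digest()
    return int.from_bytes(d, "big") >> (256 - lam)

def H(v: int, lam: int = LAM) -> bytes:
    """Stand-in for the hash function H, applied to a lam-bit value."""
    return hashlib.sha256(v.to_bytes((lam + 7) // 8, "big")).digest()

def oprf(k: tuple, x: bytes) -> bytes:
    """OPRF_k(x) = H(q XOR (F(x) AND s)) as in Definition 7; the key k = (q, s)."""
    q, s = k
    return H(q ^ (F(x) & s))

# Key generation: two random lam-bit strings q and s.
key = (secrets.randbits(LAM), secrets.randbits(LAM))
t1 = oprf(key, b"alice@example.com")
t2 = oprf(key, b"alice@example.com")
t3 = oprf(key, b"bob@example.com")
assert t1 == t2 and t1 != t3  # deterministic per input, distinct across inputs
```

The bitwise-AND with s masks roughly half the bits of F(x), which is why the definition requires F(x) to have enough Hamming weight.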
Note 1. The Dihedral Coset Problem is a difficult problem in quantum computing, and solving it has a time complexity of O(eⁿ) or O(n!).

Lemma 1. If an efficient algorithm 𝒜 can solve DCP^ℓ_2 in polynomial time, then there exists an efficient algorithm ℬ that can solve DCP^ℓ_q in polynomial time.

Proof. Suppose q = 2ⁿ and there exists an efficient algorithm 𝒜 that can solve DCP^ℓ_2 in polynomial time. For instances of DCP^ℓ_4, we have
$$|0\rangle|x_i\rangle + |1\rangle|(x_i + s) \bmod 4\rangle = |0\rangle|x_i'\rangle + |1\rangle|(x_i' + s') \bmod 2\rangle + 2\big(|0\rangle|x_i''\rangle + |1\rangle|(x_i'' + s'') \bmod 2\rangle\big), \quad i \in [\ell],$$
so running the algorithm 𝒜 twice will solve DCP^ℓ_{4=2²}. Similarly, running 𝒜 four times will solve DCP^ℓ_{16=2⁴}, and continuing in this manner, running the algorithm 𝒜 n times will solve DCP^ℓ_q. Let O(𝒜) represent the time complexity of the algorithm 𝒜. Thus, we have O(ℬ) ≤ nO(𝒜), and algorithm ℬ is an efficient algorithm. □

Definition 10 (Extrapolated Dihedral Coset Problem with modulus 2, [21]). Given a security parameter κ, an instance of EDCP^ℓ_{n,2,ρ} is provided, where 2 denotes the modulus, ρ represents the probability density function, and ℓ denotes the number of states. Each state is expressed as
$$\sum_{j \in \mathrm{supp}(\rho)} \rho(j)\,|j\rangle|(x_i + j \cdot s) \bmod 2\rangle, \quad i \in [\ell],$$
and stores 2 bits, where x_i ∈_R Z₂ⁿ and s ∈ Z₂ⁿ. If s can be determined with probability poly(1/(n log 2)) in time poly(n log 2), then the EDCP^ℓ_{n,2,ρ} problem is considered to be broken.

Lemma 2. If there exists an algorithm for solving EDCP^ℓ_{n,4,ρ}, then this algorithm can also solve DCP^ℓ_4.

Proof. Let
$$|b\rangle = \frac{1}{\sqrt{2}}|0\rangle|x_i\rangle + \frac{1}{\sqrt{2}}|1\rangle|(x_i + s) \bmod 4\rangle.$$
Thus, ρ(0)|0⟩ = (1/√2)|0⟩ and ρ(1)|1⟩ = (1/√2)|1⟩. Hence, DCP^ℓ_2 is a special case of EDCP^ℓ_{n,2,ρ}. Therefore, if there exists an algorithm for solving EDCP^ℓ_{n,2,ρ}, this algorithm can also solve DCP^ℓ_2. □

Lemma 3 ([21]). Let (n, q, r = Ω(√κ)) be an instance of G-EDCP and (n, q, α) be an instance of LWE. If there exists an algorithm for solving LWE_{n,q,α}, then there exists an algorithm for solving G-EDCP^ℓ_{n,q,ρ_r}.

Corollary 2. Let (n, 2, r = Ω(√κ)) be an instance of G-EDCP and (n, 2, α) be an instance of LPN. If there exists an algorithm for solving LPN_{n,α}, then there exists an algorithm for solving G-EDCP^ℓ_{n,2,ρ_r}.

3. Ring-LPR based OPRF

3.1. Constructing OPRF

Fig. 3 presents the ring LPR-based oblivious pseudorandom function. In the next section, we will prove the security of the oblivious pseudorandom function.

3.2. Security proof of OPRF

In this subsection, we provide the definition of the underlying lattice problem for the OPRF, learning parity with rounding, and its reduction proof.

Definition 11 (Learning Parity with Rounding). Let λ be the security parameter, and let n = n(λ), m = m(λ) be integers. The LPR problem states that for A ∈ Z₂^{m×n}, s ∈ Z₂ⁿ, u ∈ Z₂ᵐ, the following distributions are computationally indistinguishable: (A, ⌊As mod 4⌋₁) ≈_C (A, ⌊u⌋₁).

Definition 12 (Learning Parity with Rounding over Ring). The Ring-LPR problem states that for a, s, u ∈ ℛ₂, the following distributions are computationally indistinguishable: (a, ⌊as mod 4⌋₁) ≈_C (a, ⌊u⌋₁).

Lemma 4. For an LWR problem instance ⌊As⌋_p, if there exists an algorithm 𝒜 for solving s from ⌊As⌋₁, then there also exists an algorithm ℬ for solving the LWR problem.

Proof. Given that there exists an algorithm 𝒜 that can solve ⌊As⌋₁ = ⌊As/q⌋, for an LWR problem instance ⌊As⌋_p we have:
$$\lfloor As \rfloor_p = \frac{1}{p}\left\lfloor \frac{pAs}{q} \right\rfloor = \frac{1}{p}\left( \frac{pAs}{q} + e \right) \;\; (e \in (-1, 0]^m) = \frac{As}{q} + e' \;\; \left(e' \in \left(-\frac{1}{p}, 0\right]^m\right) \approx \lfloor As \rfloor_1.$$
Thus, the algorithm 𝒜 can be used to solve the LWR problem. □

We obtain the next corollary by Lemma 3.

Corollary 3. Let (n, 2, r = Ω(√κ)) be an instance of G-EDCP and (n, 2, α) be an instance of 2-LWR. If there exists an algorithm for solving 2-LWR, then there exists an algorithm for solving G-EDCP^ℓ_{n,2,ρ_r}.

Corollary 4. Let (n, 2, r = Ω(√κ)) be an instance of G-EDCP and (n, 2, α) be an instance of LPR. If there exists an algorithm for solving LPR, then there exists an algorithm for solving G-EDCP^ℓ_{n,2,ρ_r}.

Lemma 5. If there exists an algorithm 𝒜 for solving the Ring-LPR problem, then there also exists an algorithm ℬ for solving the LPR problem.

Proof. For an instance of the inner product Ring-LPR
$$b = \lfloor a \cdot s \rfloor_1,$$
where a = a₀ + a₁x + ⋯ + a_{n−1}x^{n−1}, we can represent a as a circulant matrix, specifically
$$A_1 = \begin{pmatrix} a_0 & a_{n-1} & \cdots & a_1 \\ a_1 & a_0 & \cdots & a_2 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n-1} & a_{n-2} & \cdots & a_0 \end{pmatrix}.$$
Thus,
$$b = \lfloor a \cdot s \rfloor_1 \;\Rightarrow\; \vec{b} = \lfloor A_1 \vec{s} \rfloor_1,$$
where ā = (a₀, a₁, …, a_{n−1}) ← a = a₀ + a₁x + ⋯ + a_{n−1}x^{n−1}. We use a proof by contradiction. Suppose there exists an efficient algorithm 𝒜 that can solve Ring-LPR in polynomial time. We take the first row of A₁, denote it as α₁, and have ⌊α₁s⌋₁ = b₁, where b₁ is the first component of b. For the LWR problem instance β⃗ = ⌊Λs⃗⌋₁, assume
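The rounding step in the proof of Lemma 4 can be checked numerically. The sketch below adopts the convention ⌊x⌋_p = ⌊(p/q)x⌋ (an assumption consistent with the displayed derivation; all parameters are toy values) and verifies that ⌊As⌋_p/p deviates from As/q only by a per-coordinate error in (−1/p, 0].

```python
import numpy as np

rng = np.random.default_rng(0)
q, p, n, m = 1024, 16, 32, 64  # toy LWR parameters (illustrative, not the paper's)

def round_p(x: np.ndarray, p: int, q: int) -> np.ndarray:
    """Rounding ⌊x⌋_p = ⌊(p/q)·x⌋ applied coordinate-wise."""
    return np.floor(p * x / q).astype(np.int64)

A = rng.integers(0, q, size=(m, n))
s = rng.integers(0, 2, size=n)
As = (A @ s) % q

b_p = round_p(As, p, q)  # LWR sample ⌊As⌋_p
# Dividing ⌊As⌋_p by p approximates (1/q)·As up to an error in (-1/p, 0],
# which is exactly the step used in the proof of Lemma 4.
approx = b_p / p
err = approx - As / q
assert np.all(err > -1.0 / p) and np.all(err <= 0)
```

The error interval follows from ⌊y⌋ − y ∈ (−1, 0] applied to y = pAs/q, scaled down by 1/p.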
Fig. 3. Oblivious Pseudorandom Function (OPRF).

$$\Lambda^T = (\alpha_1, \alpha_2, \ldots, \alpha_m).$$
Thus, we use the algorithm 𝒜 m times to find β_i such that ⌊γ_i⌋₁ = β_i = ⌊α_i s⌋₁, and thus we can solve the equation
$$\gamma = \Lambda \vec{s}, \quad \gamma^T = (\gamma_1, \ldots, \gamma_m).$$
Assume that the time complexity of solving s from the LWR problem instance is O(Λ, β). According to Corollary 3, letting O(γ = Λs⃗) be the computational complexity of solving the equation γ = Λs⃗, we have
$$m \cdot O(\mathcal{A}) + O(\gamma = \Lambda \vec{s}) \ge O(\Lambda, \beta) \ge O(n!) \text{ or } O(e^n).$$
Let m = n; then
$$O(\mathcal{A}) \ge \frac{O(\Lambda, \beta) - O(\gamma = \Lambda \vec{s})}{n} \ge \frac{O(n!) - O(\gamma = \Lambda \vec{s})}{n} \text{ or } \frac{O(e^n) - O(\gamma = \Lambda \vec{s})}{n}.$$
This contradicts the assumption that there is an efficient algorithm 𝒜 that can solve the inner product Ring-LPR in polynomial time, thus the theorem holds. □
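The circulant representation used in the proof of Lemma 5 can be exercised directly. The sketch below builds A₁ from the coefficient vector of a and checks that A₁s collects the coefficients of the cyclic convolution of a and s, i.e. multiplication mod xⁿ − 1, which is the plain circulant convention shown in the displayed matrix; the ring Z[x]/⟨xⁿ + 1⟩ would instead need the sign-flipped negacyclic variant. All values are illustrative.

```python
import numpy as np

def circulant(a: np.ndarray) -> np.ndarray:
    """Circulant matrix A1 with first column (a_0, ..., a_{n-1}) and first
    row (a_0, a_{n-1}, ..., a_1), as displayed in the proof of Lemma 5."""
    n = len(a)
    return np.array([[a[(i - j) % n] for j in range(n)] for i in range(n)])

# Row i of A1 is a cyclic shift of the coefficients, so A1 @ s yields the
# coefficient vector of a(x)*s(x) reduced mod x^n - 1.
a = np.array([1, 2, 3, 4])
A1 = circulant(a)
s = np.array([1, 0, 1, 1])

# Cyclic convolution computed directly for comparison.
conv = np.zeros(4, dtype=int)
for i, ai in enumerate(a):
    for j, sj in enumerate(s):
        conv[(i + j) % 4] += ai * sj
assert np.array_equal(A1 @ s, conv)
```

Each row α_i of A₁ then gives one inner-product sample ⌊α_i s⌋₁, which is how the proof turns a single ring sample into m LWR-style samples.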
3.3. Efficiency analysis

This section simulates the OPRF computation efficiency of this paper and the OPRF in [14] on MAC, Pad, and Phone. The PRF of [14] is instantiated based on LWE.

3.3.1. Efficiency analysis on MAC

The tools used in this subsection are Python 3.12; the programs are run on a MacBook Air desktop with an Apple M1 and 8.00 GB RAM (see Fig. 4).

3.3.2. Efficiency analysis on mobile pad

The tools used in this subsection are Pydroid 3; the programs are run on a Xiaomi Pad 6 Pro (Qualcomm(R) AI Engine(TM) Snapdragon 8+ mobile platform @ 3.2 GHz, RAM 8.00+3.00 GB) (see Fig. 5).

Fig. 4. Parallel comparison of OPRF on MAC, where n represents the security parameter; the unit is microseconds.

3.3.3. Summary of data comparison

From the simulation results, it can be seen that for n ≤ 250 the LWE-based OPRF in [14] is slightly faster, while for n > 250 the ring LPR-based OPRF in this paper is faster. Furthermore, as n increases, the advantages of ring LPR become more pronounced. Based on the simulation results for the Pad, the OPRF in this paper is more stable; although there are fluctuations, they are less significant compared to the LWE-based OPRF in [14].

4. PSI based on OPRF

In this paper, apart from the OPRF, another tool used in the construction of the PSI is a perturbed pseudorandom generator [15]. The perturbed pseudorandom generator in this paper is constructed from Ring-LPN. Next, we will present the reduction process for Ring-LPN.
Fig. 5. Parallel comparison of OPRF on mobile pads, where n represents the security parameter; the unit is microseconds.

4.1. Reduction of ring-LPN

Definition 13 (Learning Parity with Noise over Ring). The learning parity with noise over ring problem states that for a, s, e, u ∈ ℛ_{{0,1}}, the following distributions are computationally indistinguishable: (a, as + e) ≈_C (a, u).

Corollary 5. If there exists an efficient algorithm 𝒜 that can solve the Ring-LPN problem in polynomial time, then there also exists an algorithm ℬ that can solve the LPN problem.

Proof. The proof method is similar to that of Lemma 5, but in this case the computational complexity of ℬ will decrease. If we want the Ring-LPN problem to be approximately as hard as the LPN problem, then for the security parameter κ₁ of the Ring-LPN problem and κ₂ of the LPN problem, we have
$$\frac{e^{\kappa_1}}{\kappa_1^2} \ge e^{\kappa_2}, \quad \text{or} \quad \frac{(\kappa_1)!}{\kappa_1^2} \ge (\kappa_2)!.$$
Thus, we can roughly obtain κ₁ ≥ 1.5κ₂ and κ₂ ≥ 12. Note that O(n) is an asymptotically large quantity with respect to n. We use the most extreme case to determine the relationship between κ₁ and κ₂. □

4.2. Perturbed pseudorandom generator

Definition 14. Let a = a₀ + a₁x + ⋯ + a_{n−1}x^{n−1} ∈ ℛ_{{0,1}}. Define the norm of a as ‖a‖, where
$$\|a\| = \sqrt{\sum_{i=0}^{n-1} |a_i|^2}.$$

Fig. 6. Pseudorandom generator with perturbation G_γ(·).

Definition 15 ([15]). A pseudorandom generator with perturbation, denoted G_γ(·), is defined such that for x₁, x₂ ∈ ℛ, there exists γ satisfying the following conditions:

1. When x₁ = x₂, Pr(G_γ(x₁) = G_γ(x₂)) ≤ O(exp(−n));
2. When x₁ = x₂, ‖G_γ(x₁) − G_γ(x₂)‖ < γ; when x₁ ≠ x₂, there exists N such that ‖G_γ(x₁) − G_γ(x₂)‖ ≥ γ/N, where clearly N = 1 is optimal.

Theorem 1. The Ring-LPN problem itself can be viewed as a pseudorandom function with perturbations.

Proof. We prove each statement separately. First, when x₁ = x₂, we have
$$\Pr\big(G_\gamma(x_1) = G_\gamma(x_2)\big) = \Pr(e_1 = e_2) = \frac{1}{2^n}.$$
Additionally, set γ = √n + 1, so
$$\|(Ax_1 + e_1) - (Ax_2 + e_2)\| = \|e_1 - e_2\| < \gamma.$$
When x₁ ≠ x₂, set v₁ = G_γ(x₁), v₂ = G_γ(x₂), and note that
$$\Pr\big(\|v_1 - v_2\| \le \sqrt{n}\big) = \sum_{k=0}^{n} C_n^k \left(\frac{1}{3}\right)^k \left(\frac{1}{2}\right)^{n-k} + \sum_{k=0}^{\lfloor n/2 \rfloor} C_n^k \left(\frac{1}{3}\right)^k \left(\frac{1}{6}\right)^k \left(\frac{1}{2}\right)^{n-2k}.$$
Because
$$\sum_{k=0}^{n} C_n^k \left(\frac{1}{3}\right)^k \left(\frac{1}{2}\right)^{n-k} = \frac{1}{2^n}\left(\frac{2}{3} + \left(\frac{2}{3}\right)^2 + \cdots + \left(\frac{2}{3}\right)^n\right) = \frac{3}{2^n}\left(1 - \left(\frac{2}{3}\right)^n\right),$$
and
$$\sum_{k=0}^{\lfloor n/2 \rfloor} C_n^k \left(\frac{1}{3}\right)^k \left(\frac{1}{6}\right)^k \left(\frac{1}{2}\right)^{n-2k} \le \frac{3 \cdot 6}{17} \cdot \frac{1}{2^n}\left(1 - \frac{1}{(3 \cdot 6)^{2n}}\right),$$
therefore
$$\Pr\big(\|v_1 - v_2\| \le \sqrt{n}\big) < \frac{\sqrt{n} + 1}{2^n}.$$
Thus, there is a very high probability that ‖v₁ − v₂‖ ≥ √n + 1, and N = 1 (see Fig. 6). □

4.3. PSI based on OPRF

Lemma 6. Assuming f(y) ≈_C u₁ and g(u₁) ≈_C u₂, then (g∘f)(y) ≈_C u₂.
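The first two conditions of Definition 15 can be observed on toy parameters. The sketch below implements an LPN-style generator G_γ(x) = Ax + e over the integers (so the norm bound of Theorem 1 is visible) with a fresh binary error per call; the matrix size, seed, and integer arithmetic are illustrative assumptions, not the paper's ring construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64
A = rng.integers(0, 2, size=(n, n))  # public matrix, fixed per generator instance

def G(x: np.ndarray) -> np.ndarray:
    """Perturbed PRG sketch: G_gamma(x) = A x + e with a fresh binary error e."""
    e = rng.integers(0, 2, size=n)
    return A @ x + e

x1 = rng.integers(0, 2, size=n)
x2 = x1.copy()

# Equal inputs: the outputs differ only by e1 - e2 (entries in {-1, 0, 1}),
# so their distance is at most sqrt(n) < gamma = sqrt(n) + 1, matching
# condition 2 of Definition 15.
d_same = np.linalg.norm(G(x1) - G(x2))
gamma = np.sqrt(n) + 1
assert d_same < gamma
# For x1 != x2 the distance is typically much larger, which is the
# separation Theorem 1 argues holds with overwhelming probability.
```

The bound for equal inputs is unconditional here, since ‖e₁ − e₂‖ ≤ √n always holds for binary errors.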
Fig. 7. PSI based on OPRF.

Fig. 8. Parallel comparison of PSI on MAC, where n represents the security parameter; the unit is microseconds.

Fig. 9. Parallel comparison of PSI on mobile pads, where n represents the security parameter; the unit is microseconds.

Fig. 10. Comparison of PSI on mobile phones, where n represents the security parameter; the unit is microseconds.
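The high-level flow of a PSI from an OPRF (Definition 8 together with Fig. 7) can be illustrated with a toy script: once both parties hold consistent OPRF outputs on their elements, P₁ recovers the intersection by matching tags. A keyed hash stands in for the OPRF evaluation, and the OT-based key derivation of the actual protocol is omitted; every name below is a hypothetical placeholder.

```python
import hashlib
import secrets

# Stand-in for the jointly derived OPRF key; in the real protocol neither
# party learns the full key in the clear.
key = secrets.token_bytes(16)

def oprf_tag(key: bytes, item: str) -> bytes:
    """Hypothetical stand-in for an OPRF evaluation: a keyed hash tag."""
    return hashlib.sha256(key + item.encode()).digest()

set_p1 = {"alice", "bob", "carol"}
set_p2 = {"bob", "dave", "carol"}

# P2 sends only pseudorandom tags, which reveal nothing about non-matching items.
tags_p2 = {oprf_tag(key, y) for y in set_p2}

# P1 evaluates its own elements and keeps those whose tags match.
inter = {x for x in set_p1 if oprf_tag(key, x) in tags_p2}
assert inter == {"bob", "carol"}
```

Because the tags are pseudorandom, elements outside the intersection look like random strings to the other party, which is the intuition the hybrid proof below formalizes.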
Fig. 11. PIR based on OPRF.

Fig. 12. Parallel comparison of PIR on MAC, where n represents the security parameter; the unit is microseconds.

Lemma 7. Find a suitable pseudorandom function F̃_k: {0,1} × {0,1} → {0,1}. Assuming that the pseudorandom function F_k: {0,1} × {0,1} → {0,1} and the hash function ℋ₁: {0,1} → {0,1} are indistinguishable from uniform, we have
$$\tilde{F}_k(y) \approx_C F_k(\mathcal{H}_1(y)).$$

Proof. On one hand, because of the pseudorandom function F̃_k: {0,1} × {0,1} → {0,1}, for any k ∈ {0,1} and y ∈ 𝒴 ⊂ {0,1}, we have F̃_k(y) ≈_C u_ω ∈ {0,1}.

On the other hand, due to the pseudorandom function F_k: {0,1} × {0,1} → {0,1}, for u_{ℓ₁} ∈ {0,1} we have F_k(u_{ℓ₁}) ≈_C u_ω. According to the property of the hash function, we have ℋ₁(y) ≈_C u_{ℓ₁}. Combining with Lemma 6, one obtains F_k(ℋ₁(y)) ≈_C u_ω. Consequently, F̃_k(y) ≈_C F_k(ℋ₁(y)). □

Theorem 2. If ℋ₁ is a collision-resistant hash function and ℋ₂, ℋ₃ are Hamming correlation robust, then the protocol in Fig. 7 securely realizes PSI in the semi-honest model when the parameters m, ω are chosen as described in [14].

Proof. Perspective from P₁.

Hyb0: P₁'s view and P₂'s output in the real protocol.

Hyb1: Same as Hyb0 except that on P₂'s side, for each i ∈ [ω], if s[i] = 0, then sample A_i ← {0,1}^m and compute B_i = A_i ⊕ D_i; otherwise sample B_i ← {0,1}^m and compute A_i = B_i ⊕ D_i. This hybrid is identical to Hyb0.

Hyb2: Initialize an m × ω binary matrix D to all 1s. Denote its column vectors by D₁, …, D_ω; then D₁ = ⋯ = D_ω = 1^m. For y ∈ 𝒴, randomly select v ← [m]^ω and set D_i[v[i]] = 0 for all i ∈ [ω].

Hyb3: Find a suitable pseudorandom function F̃_k: {0,1} × {0,1} → {0,1}. For y ∈ 𝒴, compute ṽ = F̃_k(y), randomly select v ← [m]^ω, and set D_i[v[i]] = 0 for all i ∈ [ω].

Hyb4: Let there be a pseudorandom function F: {0,1} × {0,1} → {0,1} and a hash function ℋ₁: {0,1} → {0,1}. For y ∈ 𝒴, compute v′ = F_k(ℋ₁(y)), randomly select v ← [m]^ω, and set D_i[v[i]] = 0 for all i ∈ [ω].

Hyb5: Let there be a pseudorandom function F: {0,1} × {0,1} → {0,1}, Hamming correlation robustness ℋ₂: Z^{m×ω}_{{0,1}} → {0,1}, and a hash function ℋ₁: {0,1} → {0,1}. For y ∈ 𝒴, compute v′ = F_k(ℋ₁(y)) and v = ℋ₂(v′), and set D_i[v[i]] = 0 for all i ∈ [ω].

Given that Hyb0 ≈_C Hyb1 ≈_C Hyb2 ≈_C Hyb3 and Hyb4 ≈_C Hyb5, and since according to Lemma 7 it is known that Hyb3 ≈_C Hyb4, we therefore have Hyb0 ≈_C Hyb5.

Perspective from P₂.

Hyb0: P₂'s view in the real protocol.

Hyb1: ψ ← {0,1}; all other aspects are consistent with the real protocol.

Hyb2: Introduce G_γ: {0,1} → {0,1} and Hamming correlation robustness ℋ₃: Z^{m×ω}_{{0,1}} → {0,1}; let the initial matrices be C₁ = ⋯ = C_ω = 1^m, randomly select v ∈ [m]^ω, and set C_i[v[i]] = 0 for all i ∈ [ω]. Compute G_γ(C₁[v[1]] ‖ ⋯ ‖ C_ω[v[ω]]).
Hyb3: Let the initial matrices be C₁ = ⋯ = C_ω = 1^m; find an appropriate pseudorandom function F̃_k: {0,1} × {0,1} → {0,1}. For y ∈ 𝒴, compute ṽ = F̃_k(y), randomly select v ← [m]^ω, and set C_i[v[i]] = 0 for all i ∈ [ω]. Compute G_γ(C₁[v[1]] ‖ ⋯ ‖ C_ω[v[ω]]).

Hyb4: Let the initial matrices be C₁ = ⋯ = C_ω = 1^m; set a pseudorandom function F: {0,1} × {0,1} → {0,1}, a hash function ℋ₁: {0,1} → {0,1}, and Hamming correlation robustness ℋ₃: Z^{m×ω}_{{0,1}} → {0,1}. For y ∈ 𝒴, compute v′ = F_k(ℋ₁(y)), randomly select v ← [m]^ω, and set C_i[v[i]] = 0 for all i ∈ [ω]. Compute G_γ(ℋ₃(C₁[v[1]] ‖ ⋯ ‖ C_ω[v[ω]])).

Hyb5: Let the initial matrices be C₁ = ⋯ = C_ω = 1^m; set a pseudorandom function F: {0,1} × {0,1} → {0,1}, a hash function ℋ₁: {0,1} → {0,1}, and Hamming correlation robustness ℋ₂: Z^{m×ω}_{{0,1}} → {0,1} and ℋ₃: Z^{m×ω}_{{0,1}} → {0,1}. For y ∈ 𝒴, compute v′ = F_k(ℋ₁(y)) and v = ℋ₂(v′). Set C_i[v[i]] = 0 for all i ∈ [ω]. Compute G_γ(ℋ₃(C₁[v[1]] ‖ ⋯ ‖ C_ω[v[ω]])).

Similarly, it can be proven that Hyb0 ≈_C Hyb5. □

Definition 16 (CPA Security Model of the Protocol in Fig. 7). Assume there exists a perturbed pseudorandom oracle machine PrM_γ (where γ is the upper bound on the norm of the perturbation in PrM_γ), such that for an input x it outputs two values: one is a random value y₀, and the other is a pseudorandom value y₁ with x as its input.

• Setup: The simulator 𝒮 generates the necessary parameters for the algorithms. The adversary 𝒜 chooses s and sends it to the simulator 𝒮 using OT.
• Hash Queries, PRF Queries and PRG Queries: The adversary 𝒜 sequentially performs hash function queries, pseudorandom function queries, and pseudorandom generator queries. Here, the adversary cannot learn the key through the pseudorandom function queries.
• Challenge: The adversary 𝒜 selects a private message m and sends it to the simulator 𝒮. The simulator queries the hash function, pseudorandom function, and oblivious transfer values of the real scheme, inputs these results into the pseudorandom oracle machine PrM_γ, obtains two ciphertexts c₀ and c₁, and sends them to the adversary 𝒜.
• Guessing: After receiving the two ciphertexts c₀ and c₁, 𝒜 guesses which ciphertext corresponds to the encryption of m and sends the guess back to the simulator 𝒮.

The advantage of the adversary 𝒜 is defined as the advantage of the simulator 𝒮 in distinguishing the outputs of PrM_γ.

Note 2. The PrM mentioned in this paper differs from that of [22]. In [22], PrM refers to a pseudorandom oracle machine that outputs random values when the adversary does not know the pseudorandom function key, and outputs pseudorandom function values based on the key when the key is known to the adversary; this is a single-value output. However, the PrM required in this paper outputs both of these values simultaneously, making it a multi-value output.

Theorem 3. If ℋ₁ is a collision-resistant hash function and ℋ₂, ℋ₃ are Hamming correlation robust, then the protocol in Fig. 7 securely realizes PSI in the sense of Definition 16.

Proof. Suppose the adversary P₁ can break the scheme with non-negligible advantage. Now, the simulator 𝒮 simulates the scheme. Suppose there exists a black box G_γ^blackbox such that
$$G_\gamma^{\mathrm{blackbox}}(x) \to (y_0, y_1), \qquad y_0 = G_\gamma(x) \in \{0,1\}, \quad y_1 \in_R \{0,1\}.$$

• Setup: The simulator 𝒮 generates the necessary parameters for the algorithms and selects appropriate hash functions ℋ₁: {0,1} → {0,1}, Hamming correlation robustness ℋ₂: {0,1} → [m]^ω and ℋ₃: Z^{m×ω}_{{0,1}} → {0,1}, a G_γ: {0,1} → {0,1}, and a pseudorandom function F: {0,1} × {0,1} → {0,1} with key k ∈ {0,1}. The adversary P₁ selects s and transmits s to the simulator 𝒮 using OT.
• H-Query, PRF-Query and PRG-Query: The adversary P₁ makes queries about the hash functions, pseudorandom function, oblivious transfer values, and pseudorandom generator. The simulator 𝒮 pre-establishes lists for handling H-Query, PRF-Query, and PRG-Query, respectively.
  - ℋ₁-Query: For the i-th query x_i ∈ {0,1} to ℋ₁, the simulator 𝒮 answers from the hash value list if available; otherwise it selects a random X_i ∈ {0,1}, sets X_i = ℋ₁(x_i), and updates the list accordingly.
  - ℋ₂-Query: For the i-th query y_i ∈ {0,1} to ℋ₂, the simulator 𝒮 answers from the hash value list if available; otherwise it selects a random Y_i ∈ [m]^ω, sets Y_i = ℋ₂(y_i), and updates the list accordingly.
  - ℋ₃-Query: For the i-th query z_i ∈ Z^{m×ω}_{{0,1}} to ℋ₃, the simulator 𝒮 answers from the hash value list if available; otherwise it selects a random Z_i ∈ {0,1}, sets Z_i = ℋ₃(z_i), and updates the list accordingly.
  - F-Query: For the i-th query u_i ∈ {0,1} to F, the simulator 𝒮 answers from the pseudorandom function value list if available; otherwise it selects a random U_i ∈ {0,1}, sets U_i = F(u_i, k), and updates the list accordingly.
  - G_γ-Query: For the i-th query w_i ∈ {0,1} to G_γ, the simulator 𝒮 answers from the pseudorandom generator value list if available; otherwise it selects a random W_i ∈ {0,1}, sets W_i = G_γ(w_i), and updates the list accordingly. Note that the G_γ queried here is not G_γ^blackbox.
• Challenge: P₁ selects a message m and sends it to 𝒮. Using the corresponding hash function queries and pseudorandom function queries, 𝒮 inputs the queried values into the black box G_γ^blackbox, obtains ψ₀ and ψ₁, and sends ψ₀, ψ₁ to P₁.
• Guess: Based on the received ψ₀ and ψ₁, P₁ guesses whether ψ₀ or ψ₁ is the ciphertext of the encrypted message m.

According to the assumption, if the adversary P₁ can break the scheme with a non-negligible advantage, then the simulator 𝒮 can also break the black box G_γ^blackbox with a non-negligible advantage. This contradicts the assumption that G_γ is secure. □

4.4. Efficiency analysis of PSI

This section simulates the PSI computation efficiency of this paper and the PSI in [14] on MAC, Pad, and Phone. The PRF of [14] is instantiated based on LWE.

4.4.1. Efficiency analysis on MAC

The tools used in this subsection are Python 3.12; the programs are run on a MacBook Air desktop with an Apple M1 and 8.00 GB RAM (see Fig. 8).

4.4.2. Efficiency analysis on mobile pad

The tools used in this subsection are Pydroid 3; the programs are run on a Xiaomi Pad 6 Pro (Qualcomm(R) AI Engine(TM) Snapdragon 8+ mobile platform @ 3.2 GHz, RAM 8.00+3.00 GB) (see Fig. 9).
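The query lists maintained by the simulator in the proof of Theorem 3 follow the standard lazy-sampling pattern: a fresh query is answered with a random value, and repeated queries replay the stored answer. The sketch below is a generic illustration of that bookkeeping, not the paper's exact simulator; all names are hypothetical.

```python
import secrets

class LazyOracle:
    """Query-list simulation: answer each fresh query with a random value
    and replay stored answers, so repeated queries remain consistent."""

    def __init__(self, out_bytes: int = 32):
        self.table = {}          # the simulator's pre-established list
        self.out_bytes = out_bytes

    def query(self, x: bytes) -> bytes:
        if x not in self.table:
            # Fresh query: sample a random answer and record it.
            self.table[x] = secrets.token_bytes(self.out_bytes)
        return self.table[x]

# One such oracle per list: H1-Query, H2-Query, H3-Query, F-Query, G_gamma-Query.
H1 = LazyOracle()
r1 = H1.query(b"y0")
r2 = H1.query(b"y0")  # replayed: identical to r1
r3 = H1.query(b"y1")  # fresh: independent of r1
assert r1 == r2 and r1 != r3
```

Consistency across repeated queries is exactly what makes the simulated view indistinguishable from one produced by a fixed random function.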
4.5. Analysis of efficiency on mobile phones

The tools used in this subsection are Pydroid 3; the programs are run on a Redmi K30 (Qualcomm(R) AI Engine(TM) Snapdragon 730G mobile platform @ 2.2 GHz, RAM 6.00 GB) (see Fig. 10).

4.5.1. Summary of data comparison

From the simulation results, it can be seen that for n ≤ 400 the LWE-based PSI in [14] is slightly faster, while for n > 400 the ring LPR-based PSI in this paper is faster. Furthermore, as n increases, the advantages of ring LPR become more pronounced. Based on the simulation results for the Pad, the PSI in this paper is more stable; although there are fluctuations, they are less significant compared to the LWE-based PSI in [14].

5. Expansion of this work

Private Information Retrieval (PIR) [23-29] is a technique that enables a client to securely download a specific element, such as a movie or a friend's record, from a database managed by an untrusted server, such as a streaming service or a social network, without disclosing to the server which particular element has been retrieved. Given the functional similarities between PIR and PSI, this paper extends its exploration into the construction of PIR using OPRF (see Fig. 11).

5.1. Efficiency analysis of PIR

This section simulates the PIR computation efficiency of this paper and the machine learning-based PIR in [30] (DLMI for short) on MAC. The tools used in this subsection are Python 3.12; the programs are run on a MacBook Air desktop with an Apple M1 and 8.00 GB RAM.

The OPRF-based PIR proposed in this paper has a runtime that differs from the machine learning-based PIR by no more than approximately 5 × 10⁻³ seconds. Additionally, the security of our PIR scheme is theoretically supported in comparison to [30] (see Fig. 12).

6. Conclusion

This paper presents a PSI based on an efficient post-quantum OPRF and proves its security under the semi-honest model, demonstrating security even in the CPA model of Definition 16. The addition of the PPRG enables the PSI to effectively resist probabilistic attacks. In the simulation experiments, the proposed PSI shows greater efficiency compared to post-quantum PSIs represented by LWE.

Although the PIR in this study is not as efficient as the machine learning-based PIR, the gap between the two is already quite small. However, there are also notable shortcomings; the efficiency of the proposed PSI still lags behind that of non-post-quantum PSIs, which will be addressed in future work.

CRediT authorship contribution statement

Zhuang Shan: Writing - original draft, Conceptualization. Leyou Zhang: Writing - review & editing, Writing - original draft. Qing Wu: Conceptualization. Qiqi Lai: Writing - review & editing. Fuchun Guo: Writing - review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Nature Science Foundation of China under Grant 61872087 and Grant 51875457; in part by the Key Foundation of National Natural Science Foundation of China under Grant U19B2021; and in part by the Key Research and Development Program of Shaanxi under Program 2022GY-028 and Program 2022GY-050.

Data availability

No data was used for the research described in the article.

References

[1] R. Lei, X. Chen, D. Liu, C. Song, Y. Tan, A. Ren, CEIU: Consistent and efficient incremental update mechanism for mobile systems on flash storage, J. Syst. Archit. 152 (2024) 103151, http://dx.doi.org/10.1016/j.sysarc.2024.103151.
[2] J. Sun, L. Yin, M. Zou, Y. Zhang, T. Zhang, J. Zhou, Makespan-minimization workflow scheduling for complex networks with social groups in edge computing, J. Syst. Archit. 108 (2020) 101799, http://dx.doi.org/10.1016/j.sysarc.2020.101799.
[3] Y. Gao, Y. Luo, L. Wang, X. Liu, L. Qi, W. Wang, M. Zhou, Efficient scalable multi-party private set intersection(-variants) from bicentric zero-sharing, in: Proceedings of the Conference on Computer and Communications Security, CCS, ACM, New York, NY, USA, 2024.
[4] M.O. Rabin, How to exchange secrets with oblivious transfer, 2005, URL: https://eprint.iacr.org/2005/187.
[5] O. Goldreich, S. Goldwasser, S. Micali, How to construct random functions, J. ACM 33 (4) (1986) 792-807, http://dx.doi.org/10.1145/6490.6503.
[6] M. Naor, O. Reingold, Number-theoretic constructions of efficient pseudo-random functions, J. ACM 51 (2) (2004) 231-262, http://dx.doi.org/10.1145/972639.972643.
[7] M.J. Freedman, Y. Ishai, B. Pinkas, O. Reingold, Keyword search and oblivious pseudorandom functions, in: J. Kilian (Ed.), Theory of Cryptography, Springer Berlin Heidelberg, Berlin, Heidelberg, 2005, pp. 303-324.
[8] S. Jarecki, X. Liu, Efficient oblivious pseudorandom function with applications to adaptive OT and secure computation of set intersection, in: O. Reingold (Ed.), Theory of Cryptography, Springer Berlin Heidelberg, Berlin, Heidelberg, 2009, pp. 577-594.
[9] V.K. Yadav, N. Andola, S. Verma, S. Venkatesan, A survey of oblivious transfer protocol, ACM Comput. Surv. 54 (10s) (2022), http://dx.doi.org/10.1145/3503045.
[10] M.R. Albrecht, A. Davidson, A. Deo, N.P. Smart, Round-optimal verifiable oblivious pseudorandom functions from ideal lattices, in: J.A. Garay (Ed.), Public-Key Cryptography - PKC 2021, Springer International Publishing, Cham, 2021, pp. 261-289.
[11] N. Tyagi, S. Celi, T. Ristenpart, N. Sullivan, S. Tessaro, C.A. Wood, A fast and simple partially oblivious PRF, with applications, in: O. Dunkelman, S. Dziembowski (Eds.), Advances in Cryptology - EUROCRYPT 2022, Springer International Publishing, Cham, 2022, pp. 674-705.
[12] S. Casacuberta, J. Hesse, A. Lehmann, SoK: Oblivious pseudorandom functions, in: 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P), 2022, pp. 625-646, http://dx.doi.org/10.1109/EuroSP53844.2022.00045.
[13] D. Boneh, D. Kogan, K. Woo, Oblivious pseudorandom functions from isogenies, in: S. Moriai, H. Wang (Eds.), Advances in Cryptology - ASIACRYPT 2020, Springer International Publishing, Cham, 2020, pp. 520-550.
[14] M. Chase, P. Miao, Private set intersection in the internet setting from lightweight oblivious PRF, in: D. Micciancio, T. Ristenpart (Eds.), Advances in Cryptology - CRYPTO 2020, Springer International Publishing, Cham, 2020, pp. 34-63.
[15] Z. Shan, L. Zhang, Q. Wu, Q. Lai, Analysis, modify and apply in IIOT form light-weight PSI in CM20, 2024, URL: https://eprint.iacr.org/2024/969.
[16] J. Alwen, S. Krenn, K. Pietrzak, D. Wichs, Learning with rounding, revisited, in: R. Canetti, J.A. Garay (Eds.), Advances in Cryptology - CRYPTO 2013, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013, pp. 57-74.
[17] A. Banerjee, C. Peikert, A. Rosen, Pseudorandom functions and lattices, in: D. Pointcheval, T. Johansson (Eds.), Advances in Cryptology - EUROCRYPT 2012, Springer Berlin Heidelberg, Berlin, Heidelberg, 2012, pp. 719-737.
[18] D. Bellizia, C. Hoffmann, D. Kamel, H. Liu, P. Méaux, F.-X. Standaert, Y. Yu, Learning parity with physical noise: Imperfections, reductions and FPGA prototype, IACR Trans. Cryptogr. Hardw. Embed. Syst. 2021 (2021) 390-417.
Z. Shan et al. Journal of Systems Architecture 160 (2025) 103346
[19] Y. Yu, J. Zhang, Smoothing out binary linear codes and worst-case sub- Leyou Zhang received the M.S. and Ph.D. degrees from Xid-
exponential hardness for LPN, in: T. Malkin, C. Peikert (Eds.), Advances in ian University, Xian, China, in 2002 and 2009, respectively.
Cryptology CRYPTO 2021, Springer International Publishing, Cham, 2021, pp. From 2013 to 2014, he served as a visiting scholar at the
473501. University of Wollongong, Australia. He currently worked
[20] V. Kolesnikov, R. Kumaresan, M. Rosulek, N. Trieu, Efficient batched oblivious in Xidian University as a professor.
PRF with applications to private set intersection, in: Proceedings of the 2016 His current research interests include public key cryp-
ACM SIGSAC Conference on Computer and Communications Security, CCS 16, tography, network security and computer security. He has
Association for Computing Machinery, New York, NY, USA, 2016, pp. 818829, over 120 scientific publications in many highly ranked
http://dx.doi.org/10.1145/2976749.2978381. cybersecurity journals and conferences.
[21] Z. Brakerski, E. Kirshanova, D. Stehlé, W. Wen, Learning with errors and
extrapolated dihedral cosets, in: Public-Key Cryptography PKC 2018, Springer
International Publishing, 2018, pp. 702727.
[22] A. Jain, H. Lin, J. Luo, D. Wichs, The pseudorandom oracle model and ideal
obfuscation, in: H. Handschuh, A. Lysyanskaya (Eds.), Advances in Cryptology
CRYPTO 2023, Springer Nature Switzerland, Cham, 2023, pp. 233262.
Qing Wu received the M.S. and Ph.D. degrees from Xidian University, Xi'an, China, in 2006 and 2009, respectively. She currently works with Xi'an University of Posts and Telecommunications, Xi'an, as a Professor. Her current research interests include artificial intelligence security and cloud security.

Qiqi Lai received the B.S. degree from PLA University of Information Engineering, Henan, China, in 2008, and the M.S. and Ph.D. degrees from Xidian University, Xi'an, China, in 2011 and 2015. He currently works with Shaanxi Normal University, Xi'an, as a Professor. His current research interests include the theory of lattice-based public key cryptography and its provable security, as well as the construction and analysis of homomorphic encryption schemes.

Fuchun Guo received the B.S. and M.S. degrees from Fujian Normal University, China, in 2005 and 2008, respectively, and the Ph.D. degree from the University of Wollongong, Australia, in 2013. He is currently an Associate Research Fellow with the School of Computing and Information Technology, University of Wollongong. His primary research interests include public key cryptography, in particular protocols, encryption and signature schemes, and security proofs.

Zhuang Shan received the B.S. degree from Liaoning Institute of Science and Technology, Benxi, China, in 2019, and the M.S. degree from North Minzu University, Yinchuan, China, in 2022. He is currently pursuing the Ph.D. degree in mathematics with Xidian University, Xi'an, China. His current interests include cryptography, reduction of hard problems in lattices, and network security.
Computer Standards & Interfaces 97 (2026) 104097
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
Fully decentralized period k-times anonymous authentication with access criteria
Hongyan Di a, Yinghui Zhang a,*, Ziqi Zhang a, Yibo Pang a, Rui Guo a, Yangguang Tian b
a School of Cyberspace Security, Xi'an University of Posts & Telecommunications, 710121, Xi'an, China
b University of Surrey, GU2 7XH, Surrey, UK
ARTICLE INFO

Keywords: Fully decentralized; Publicly auditable; Access criteria; Anonymous authentication; Signature proof of knowledge

ABSTRACT

The explosive growth of Internet user devices highlights the strong and urgent need for digital identity infrastructure. However, existing decentralized identity schemes are still not fully decentralized, and there remains a contradiction between publicly auditable credentials and maintaining anonymity. Therefore, using advanced cryptographic techniques such as signature proof of knowledge, Pedersen commitment, and Merkle tree, this paper proposes a fully decentralized period k-times anonymous authentication scheme with access criteria. The scheme allows user credentials to be publicly audited, lets users manage their identities independently, and enables the verifier not only to verify the user's identity but also to implement access control. The issuer does not need to hold a key or maintain a list, authentication remains possible even after the trusted center is attacked, and only three zero-knowledge proofs are needed for registration and verification. The security analysis indicates that this scheme satisfies unforgeability, anonymity, unlinkability and attribute privacy. Performance evaluation shows significant improvements in both computational and communication efficiency over existing schemes.
1. Introduction

With the surge in digital services accessed through network connections, the number of digital identities has seen an unprecedented increase. Therefore, the vast majority of the global population has at least one digital identity, which becomes the key to unlocking a variety of online functions and services. However, the concept of digital identity goes far beyond human identity recognition [1]. With the wide adoption of IoT and the powerful functions of the 5th Generation Mobile Communication Technology (5G) network, as well as the upcoming 6th Generation Mobile Communication Technology (6G), the number of connected devices has increased significantly [2]. These devices require unique digital identities to enable their participation in digital ecosystems, such as establishing secure communications.

Authentication and authorization are crucial security-related core tasks in the digital world. Their purpose is to ensure the authenticity of the identities of the communicating parties and implement access control over digital resources such as services. The core of this system is the concept of digital identity. The evolution of digital identity has gone through multiple eras, during which digital identity recognition has gradually shifted from centralized to decentralized identity models [3]. In fact, the way entities prove the ownership of digital identities may be affected by various vulnerabilities [4]. The current Internet ecosystem generally adopts the centralized Identity Provider (IdP) model, with tech giants such as Google and Facebook (e.g., Meta) serving as the custodians of digital identities. Other services can directly rely on the identity information provided by the IdP. Although this architecture simplifies the authentication process by achieving single sign-on through protocols such as OAuth, it has fundamental flaws when examined from the perspective of privacy protection: users lose control over their digital identities [5], and all their identity attributes are centrally stored in the IdP's servers. Users neither know the specific usage of these data nor can they effectively manage their flow. More seriously, this architecture has created a dangerous "data island" phenomenon: the IdP can fully
✩ This article is part of a Special issue entitled: "Information Security and Privacy" published in Computer Standards & Interfaces.
✩✩ This work is supported by the National Cryptologic Science Fund of China (2025NCSF02037), the National Natural Science Foundation of China (62072369), the Youth Innovation Team of Shaanxi Universities (23JP160), the Shaanxi Special Support Program Youth Top-notch Talent Program, the Technology Innovation Leading Program of Shaanxi (2023-YD-CGZH-31), the Technology Innovation Guidance Special Fund of Shaanxi Province (2024QY-SZX-17), and the Graduate Innovation Fund of Xi'an University of Posts and Telecommunications (CXJJBDL2024004).
* Corresponding author.
E-mail addresses: 15029659213@163.com (H. Di), yhzhaang@163.com (Y. Zhang), qiqizhang0408@163.com (Z. Zhang), ybpang1998@163.com (Y. Pang), guorui@xupt.edu.cn (R. Guo), yangguang.tian@surrey.ac.uk (Y. Tian).
URLs: https://www.xiyou.edu.cn/ (Y. Zhang), http://www.surrey.ac.uk (Y. Tian).
https://doi.org/10.1016/j.csi.2025.104097
Received 12 July 2025; Received in revised form 26 September 2025; Accepted 11 November 2025
Available online 19 November 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
grasp the cross-platform service usage trajectory and behavioral characteristics of users, essentially constructing a panoramic user profile. The IdP, on the other hand, can obtain information about all the network services used by users (and related usage data). When the server storing user data is invaded, sensitive personal information may be obtained by malicious attackers, causing significant loss of personal data and damaging the reputation of stakeholders [6]. In 2022 alone, there were over 1800 major data breaches worldwide, involving more than 400 million user records. The increasing number of data breach cases has raised significant concerns about data confidentiality and transparency in the field of digital identity management. In addition, centralized identity management systems rely on specific identity service nodes, making them vulnerable to the single point of failure problem [7].

Therefore, the increasing popularity of online services, the growing trend of decentralization, and the rising awareness of the shortcomings of traditional methods are paving the way for more secure and privacy-protecting approaches. Under this trend, supported by current laws and regulations (such as the General Data Protection Regulation (GDPR) of the European Union) [8], the concept of Self-Sovereign Identity (SSI) [9] has attracted significant attention from both academia and industry. SSI is based on the idea that individuals should have full control over their information without being forced to outsource data to any centralized institution or third party. Such technologies play a crucial role in establishing trust among entities (including humans and non-human entities such as IoT devices) and ensuring communication security through digital identities. Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs), as effective solutions for enhancing privacy and security, have been promoted in multiple application fields such as intelligent transportation and smart healthcare. These standards can be extended to anyone or anything, covering cloud, edge, and IoT resources. It is worth noting that several institutions, including industry giants such as Microsoft, have recently developed and released a variety of implementation plans to support these technologies. In addition, global government agencies are also actively promoting the widespread application of DIDs and VCs. For instance, the European Union promulgated Regulation 2024/1183 [10] in May 2024, establishing the European digital identity framework, aiming to provide European citizens with digital passes for cross-border access to public and private services through the SSI system. This represents a significant milestone in the development of digital identity solutions.

However, current decentralized anonymous authentication schemes still face significant challenges. These include the inability to achieve full decentralization, a lack of mutual trust between users and issuers, and the persistent contradiction between public verifiability and true anonymity. Against this backdrop, AI-driven identity threat analysis has become a new focus of security research. Initiatives such as the Global Digital Identity Wallet (GDIW) have launched cross-border interoperability tests, while Digital Identity Chain has completed the integration of DIDs with the national government service platform; these efforts represent preliminary but critical explorations in addressing the underlying issues.

2. Related work

2.1. Decentralized anonymous credential (DAC)

In the 1980s, David Chaum [11,12] introduced privacy-preserving cryptographic techniques, aiming to create a more privacy-focused and user-centered authentication and authorization solution. They enable users to prove their membership, identity, or any other arbitrary attribute in a group in a privacy-preserving manner. Such techniques are often referred to as anonymous credentials (ACs), and various methods for building AC systems have been widely studied in the academic community. Since Camenisch and Lysyanskaya [13] first proposed a completely anonymous credential scheme in 2001, a large number of anonymous credential construction schemes suitable for various scenarios have emerged. These include zero-knowledge credentials, lightweight anonymous credentials without heavy zero-knowledge proofs and other computationally intensive operations, self-blinding credentials, group signatures, AC schemes without unlinkability, and post-quantum AC schemes. In order to reduce the trust dependence of the credential issuance process on a central authority in traditional anonymous credential schemes, Garman et al. [14] proposed the concept of decentralized anonymous credentials (DAC), which allows users to construct and manage credentials in a completely anonymous manner. Derler et al. [15] designed a new revocable multi-show attribute anonymous credential based on previous work, which has good scalability and constant operations for the two roles. Bui and Aura [16] developed a distributed access control revocation framework to facilitate the manipulation of revocation methods. Subsequently, Sonnino et al. [17] proposed a selective disclosure credential solution based on blind signatures and bilinear pairing, which yields short and highly efficient credentials. Inspired by Sonnino's work, Halpin [18] redesigned the tagging mechanism to improve scalability and support embedding arbitrary attributes. Cui et al. [19] constructed a Blockchain Digital Identity Management System (BDIdM) by extending the functional features of the DAC scheme [14], which enabled limited reusability of specific credentials on the premise of maintaining the security of the DAC scheme. In addition, decentralized anonymous credentials are widely integrated with other scenarios. Lin et al. [20] applied the DAC scheme to the smart grid scenario and enhanced the privacy protection mechanism. Solutions combined with blockchain-based Internet of Vehicles application scenarios include [21-25], and Zeng et al. [26] also applied anonymous credentials to cross-domain authentication in IIoT.

2.2. k-Time anonymous authentication (k-TAA)

k-Period anonymous authentication allows users to be authenticated up to k times within a certain time period while remaining anonymous. Teranishi et al. [27] introduced the first k-TAA scheme, allowing the identification of users who exceed the authentication limit. Nguyen and Safavi-Naini [28] extended this concept to dynamic k-TAA, enabling each authenticator to independently grant or revoke access rights. Au et al. [29] proposed a fixed-size dynamic k-TAA scheme. Chatterjee et al. [30] proposed a k-TAA scheme based on physically unclonable functions (PUFs), which is applicable to trusted platform modules (TPMs). Huang et al. [31] designed an efficient k-TAA system tailored for pay-as-you-go pricing, facilitating multiple service accesses and related payments within each certification cycle. However, many existing k-TAA schemes fail to provide periodic anonymous authentication. Although the existing schemes [32,33] support periodic anonymous authentication, they have deficiencies in supporting the selective disclosure of credential attributes to achieve fine-grained authentication. In addition, they require a large number of pairing operations, resulting in significant verification delays. In contrast, the schemes [34,35] support periodic k-times anonymous authentication while reducing cumbersome pairing operations. However, the scheme [34] does not support credential revocation. As shown in Table 1, our scheme, while meeting the above requirements, supports full decentralization and access control.

• Research Contributions

Next, we list the main research contributions of this paper.

− The Proposed Scheme: We propose a fully decentralized k-times period anonymous authentication scheme with access control. The scheme enforces both access criteria and authentication during the verification process, while eliminating the need for issuers to hold keys or maintain lists, thus remaining secure even if the trusted center is compromised. Only three zero-knowledge proofs are required for registration and verification.

− Security Analysis: We conducted a correctness and theoretical security analysis based on the game definition of the proposed
Table 1
Function comparison.
Security features [29] [30] [31] [33] [19] [34] [35] Our Scheme
Anonymity ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Unlinkability ✓ N.A ✓ N.A ✓ ✓ ✓ ✓
𝑘-times period anonymous authentication × × ×× ✓ N.A ✓
Publicly auditable N.A × N.A N.A ✓ ✓ ✓ ✓
Select attribute disclosure × × × × ✓ ✓ N.A ✓
Key forward and backward secure ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Reveal violator's identity without TTP ✓ ✓ × ✓ ✓ ✓ ×
Issuer not hold key and identity list × × × × × × ×
Support credential revocation ✓ ✓ ✓ ✓ ✓ × ✓ ✓
Note*: ✓: Support this feature; ×: Does not support this feature; N.A: No applicable; TTP: Trusted third party.
scheme. By simulating games and invoking programmable random oracles and forking lemmas, among other techniques, we demonstrate that the scheme meets the requirements of unforgeability, anonymity, unlinkability, and attribute privacy. This analysis emphasizes that the scheme protects the integrity and validity of the data.

− Performance Evaluation: We conducted a detailed analysis of this authentication scheme, demonstrating its efficiency advantages over existing authentication schemes. Tests were also carried out on the secp256k1 and BLS12-381 curves, verifying that the proposed algorithm performs better on lightweight curves.

• Structure of Paper

The remainder of the paper is structured as follows: Section 3 introduces the problem assumptions and fundamentals. Section 4 defines the syntax, security model, and detailed construction of the scheme. Section 5 analyzes its correctness and theoretical security. Section 6 evaluates performance in terms of computation and communication overhead, and Section 7 concludes the paper.

3. Preliminaries

3.1. Group description and hardness assumptions

A group generator GGen(1^κ) → (G, q) takes a security parameter κ and outputs a cyclic group G of prime order q. This scheme is based on the following hardness assumptions.

Definition 2.1 (Discrete Logarithm Problem (DLP) Assumption). Let g be a generator of a group G. Given a tuple (g, g^a) ∈ G^2, where a ∈ Z_q, the Discrete Logarithm Problem is to output a. The DLP assumption holds if for every PPT adversary A the advantage

Adv_A^DLP(κ) = Pr[A(g, g^a) = a] ≤ negl(κ)

is negligible.

Definition 2.2 (Decisional Diffie-Hellman (DDH) Assumption). Let G be a group of large prime order q and let g be a generator of G. The input is either a random quadruple D = (g, g^x, g^y, g^xy) ∈ G^4 or a quadruple R = (g, g^x, g^y, g^z) ∈ G^4, where x, y, z ← Z_q. It is computationally hard for an adversary A to distinguish the two tuples; that is, the advantage of every PPT adversary A

Adv_A^DDH(κ) = |Pr[A(D) = 1] − Pr[A(R) = 1]| ≤ negl(κ)

is negligible.

Definition 2.3 (Computational Diffie-Hellman (CDH) Assumption). Let G be a cyclic group of order q with generator g. Given the tuple D = (g, g^a, g^b), where a, b ← Z_q, computing g^ab is hard. For every probabilistic polynomial-time (PPT) algorithm A, the probability of solving the CDH problem

Adv_A^CDH(κ) = Pr[A(g, g^a, g^b) = g^ab] ≤ negl(κ)

is negligible, where κ is a security parameter and negl(κ) denotes a negligible function.

3.2. Zero-knowledge proof

A signature proof of knowledge (SPK) is a non-interactive zero-knowledge proof (ZKP) technique that enables a prover to demonstrate knowledge of a secret value without revealing it, while also signing a message. We construct a cyclic group G of prime order q and employ the Fiat-Shamir heuristic [36] to convert an interactive proof into a non-interactive one. These non-interactive constructs are precisely referred to as signature proofs of knowledge (SPKs). All the signatures of knowledge are secure in the random oracle model. Following the notation introduced by Camenisch and Stadler [37], PoK{(x) : y = g^x} denotes the zero-knowledge proof protocol between the prover and the verifier, in which the prover knows x ∈ Z_q such that y = g^x ∈ G. The corresponding non-interactive signature proof of knowledge on a message m is written SPK{(x) : y = g^x}(m). It can be regarded as a signature on the message m, produced with a key pair (g^x, x) based on discrete logarithms.

3.3. Pedersen commitment

Literature [38] uses Poseidon to realize the hashes of the Merkle tree and the commitment. Our scheme instead instantiates Pedersen hashing and perfectly hiding commitments. The Pedersen commitment algorithm is as follows:

• Gen(1^κ) → ck: Select a finite group G with a large prime order q, and choose two generators g and h from G. The parameters of this commitment scheme are ck = (G, q, g, h).
• Commit(ck, u) → c: Generate a commitment c to a secret value u. The committer randomly selects a blinding factor r and computes c = g^u h^r.
• OpenCom(ck, c, u, r) → 0/1: The verifier checks whether c is equal to g^u h^r.

3.4. Merkle tree

In the proposed scheme, the Merkle tree T is used to represent the membership of the set. The root of the tree T is denoted T_root. The Merkle tree supports the following functions:

• T.Insert(v) → T: Inserts the value v into the next available leaf of T and returns the modified tree.
• T.Remove(v) → T: Removes v from the tree, if it exists, and returns the modified tree T.
• T.AutPat(v) → θ: Generates an authentication path θ that proves v ∈ T. The size of θ is proportional to the height of the tree, ensuring efficient verification in cryptographic protocols.
Table 2
Summary of notations.

Symbol | Description
U, I, V | User, Issuer, Verifier
λ | Security parameter
h | The maximum height of the Merkle tree
m | The maximum number of attributes
n | The number of access criteria the verifier is allowed to define
ι_pub, ι_zk | Issuance criteria over auxiliary information, verified when a request is issued
iaux_zk, iaux_pub | Auxiliary information when requesting registration
φ_i | The i-th access criterion defined by the verifier
aux_i | Auxiliary information for showing proofs
Attrs = {attr_i}_{i=1}^m | The i-th attribute of the user and the attribute set
w | Witness collection
ctx | Context information
I, V | Collections of issuance criteria and access criteria
Π_U^1, Π_V^1, Π̃ | Zero-knowledge proofs generated by the user and the issuer
s ← Z_q | A secret random number randomly selected by the issuer
θ | The authentication path generated by the Merkle tree
T_root, T_κ, T'_κ | Merkle tree root, Merkle tree, updated Merkle tree

Note*: ι, φ → {0, 1} are predicates over the user's attributes that need to be satisfied in order to pass verification, i.e., verification passes only if ι_pub(iaux_pub) = 1 and φ(Attrs, aux) = 1.
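The Pedersen commitment of Section 3.3 (Gen / Commit / OpenCom) can be sketched as follows. The toy Schnorr group below uses illustrative sizes only, and the assumption that log_g(h) is unknown is what gives the binding property.

```python
import secrets

# Toy Schnorr group: p = 2q + 1 with both prime; the squares mod p form the
# subgroup of prime order q. Sizes here are illustrative only -- a real
# deployment uses cryptographic parameters or an elliptic-curve group.
p, q = 1019, 509
g, h = 4, 9          # two subgroup generators; log_g(h) is assumed unknown

def gen():
    """Gen(1^kappa) -> ck."""
    return (p, q, g, h)

def commit(u, r=None):
    """Commit(ck, u) -> (c, r): c = g^u * h^r with random blinding factor r."""
    if r is None:
        r = secrets.randbelow(q)
    return pow(g, u, p) * pow(h, r, p) % p, r

def open_com(c, u, r):
    """OpenCom(ck, c, u, r) -> 0/1: check c == g^u * h^r."""
    return c == pow(g, u, p) * pow(h, r, p) % p

c, r = commit(42)
assert open_com(c, 42, r)          # a correct opening verifies
assert not open_com(c, 43, r)      # a different value fails to open
```

Since Commit(u1, r1) · Commit(u2, r2) = Commit(u1 + u2, r1 + r2), the commitment is additively homomorphic, which is what allows the registration phase (Section 4.2.2) to commit to several values under one group element.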
3.5. Pseudo-Random Function (PRF)

A Pseudo-Random Function (PRF) is a family of computable functions {F_k}, where k is a key and F_k maps the input space to the output space. For an ideal PRF, when the key k is unknown, its output is computationally indistinguishable from that of a truly random function. We require a PRF with an efficient proof of correct evaluation, and adopt the specific PRF construction proposed by Dodis and Yampolskiy [39] (DY-PRF). The DY-PRF is defined by the tuple (G, q, g, s), where G = ⟨g⟩ is a cyclic group of prime order q and s ∈ Z_q. For an input k, the value is defined as PRF_{g,s}(k) := g^{1/(s+k+1)}. There exists an efficient proof of correct formation for the output, and as long as the q-DDHI assumption holds, the output PRF_{g,s}(k) is indistinguishable from a random element of G.

4. Proposed scheme

In this section, we first collect in Table 2 all the symbols involved together with their meanings, and then define the syntax and design the scheme.

4.1. Syntax and security model

4.1.1. Security definition

The security of the system is defined by the standard properties of anonymous credentials, including unforgeability, anonymity, unlinkability, and attribute privacy. In our model, the attacker is assumed to have only polynomial-time computational capability, and all communications occur over open channels.

Threat Model. Our model considers external attackers intercepting or modifying communications without breaking hard cryptographic problems; internal attackers misusing valid credentials for forgery, transfer, or linking attacks; semi-honest verifiers inferring user identities or attributes while following the protocol; and trusted-but-curious issuers complying with the protocol but attempting to snoop on user data.

4.1.2. Syntax definition

Referring to the ideal functionality F in [38], the zk-creds anonymous credential approach realizes F using Groth16 [40], which is not suitable for authentication. In this work, F is instantiated using signatures of knowledge, resulting in an algorithm that meets the authentication requirements. The specific algorithms are as follows:

• Setup(1^λ, 1^h, 1^m) → pp: The algorithm takes the security parameter λ, the maximum height h of the Merkle tree, and the maximum number m of attributes in a credential, and generates the system parameters pp.
• IssueSetup_I(pp) → (I, ι_pub): The algorithm takes the public parameters pp and outputs the issuance criteria set I and the issuance criterion ι_pub for verifying public auxiliary information.
• SowSetup_V(pp) → V: The verifier sets up n access criteria to define the user's access policy. This algorithm outputs a collection of access criteria V = {φ_1, φ_2, ..., φ_n}, where each φ_i represents an access criterion.
• IssueReq_U(pp, I, Attrs, w, ctx, (iaux_zk, iaux_pub)) → ((Cm, Π_U^1, iaux_zk), iaux_pub): The issue request algorithm takes the public parameters pp, the issuance criteria I, the attribute set Attrs of U, the secret value w, the context ctx, and the auxiliary information (iaux_zk, iaux_pub). U generates the proof Π_U^1 associated with iaux_zk and outputs ((Π_U^1, iaux_zk), iaux_pub).
• IssueGrant_I(pp, (I, ι_pub), (Π_U^1, iaux_zk), iaux_pub) → (s'', (θ, T_root), k, T_κ): The algorithm takes the zero-knowledge signature Π_U^1 and the auxiliary information (iaux_zk, iaux_pub). Then I returns the random value s'', the authentication path θ and the number of times k to U, and keeps the locally generated Merkle tree T_κ.
• SowCred_U(pp, V, T_root, cred, θ, {w_i, aux_i}_{i=1}^n) → (Π̃, {aux_i}_{i=1}^n): U takes the root T_root of the membership tree, the credential cred, and the authentication path θ. U shows that the presented credential satisfies the access criteria φ_i and proves that the presented credential belongs to the tree T_κ. The algorithm then outputs (Π̃, {aux_i}_{i=1}^n).
• VerifySow_V(pp, V, (cred, T_root), (Π̃, {aux_i}_{i=1}^n)) → 0/1: V verifies that the credential cred presented by U meets the access criteria and that cred belongs to the Merkle tree T_κ, outputting 0/1.
• RevokeCred_I(pp, T_κ, cred) → T'_κ: I revokes the cred registered by a dishonest user and updates the Merkle tree T_κ to T'_κ.

4.1.3. Security requirements

The scheme is required to satisfy the following security requirements:

− Unforgeability: Attackers cannot forge valid credentials and deceive verifiers into accepting them. This game is reduced to the discrete logarithm or CDH problem.
− Anonymity: Credentials are presented without revealing the user's identity. This game is reduced to the DDH problem.
− Unlinkability: Different presentations of the same credential cannot be linked, even if the Merkle path remains identical across multiple authentications.
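A minimal sketch of the DY-PRF variant of Section 3.5, PRF_{g,s}(k) = g^{1/(s+k+1)}, over a toy Schnorr group (illustrative sizes only; the inverse is taken modulo the group order q, and evaluation is undefined in the negligible case s + k + 1 ≡ 0 mod q):

```python
# DY-PRF sketch: PRF_{g,s}(k) = g^(1 / (s + k + 1) mod q) in a group of
# prime order q. Toy parameters for illustration only.
p, q = 1019, 509     # p = 2q + 1, both prime
g = 4                # generator of the order-q subgroup of Z_p^*

def dy_prf(s: int, k: int) -> int:
    e = pow((s + k + 1) % q, -1, q)   # inverse of (s + k + 1) modulo the group order
    return pow(g, e, p)

# The "efficient proof of correct formation" rests on the algebraic check
# that raising the output back to (s + k + 1) recovers g: PRF(k)^(s+k+1) = g.
s, k = 123, 7
y = dy_prf(s, k)
assert pow(y, s + k + 1, p) == g
```

In the scheme, evaluations of this shape produce the per-epoch tags η and Γ, and the relation PRF(k)^{s+k+1} = g is the kind of statement the signatures of knowledge in Section 4.2 can prove without revealing the key.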
Fig. 1. System Model.
− Attribute Privacy: Attributes are hidden when credentials are presented, unless the access policy requires them to be disclosed.

Security is analyzed using a formal game-based model [41] under the random oracle assumption [42]. The games are defined as follows:

Game 1: Unforgeability Game

Setup. The challenger C_1 runs the system initialization algorithm Setup(1^λ, 1^h, 1^m) to generate pp and sends pp to the adversary A_1. C_1 keeps the issuer private key isk.

Query. In this phase, the adversary A_1 can issue three kinds of queries, as follows:

1. H-Query: A_1 queries the random oracles H_1, H_2, H_3; C_1 responds randomly and records the answers.
2. Query2: A_1 queries the issuer to register a credential; C_1 uses the simulator S to simulate the interaction between IssueReq and IssueGrant, using the programmability of the random oracle to generate a valid SPK_2.
3. Query3: A_1 queries credential presentation; C_1 simulates the interaction between SowCred and VerifySow, and simulates SPK_3 using a zero-knowledge simulator.

Forgery. A_1 outputs a forged credential cred* and a corresponding Merkle tree path θ*, such that cred* is not on the list of previously issued credentials and VerifySow accepts cred* and θ*. A_1 wins conditioned on outputting a valid forged credential.

Game 2: Anonymity and Unlinkability Game

Setup. The challenger C_2 runs the system initialization algorithm Setup(1^λ, 1^h, 1^m) to generate pp and sends pp to the adversary A_2. C_2 keeps the issuer private key isk.

Query. The adversary A_2 can continue to query issuance and presentation, but cannot query revocation or presentation of the challenge credentials.

Challenge. The adversary A_2 selects the identities and attribute sets of two users, (I_0, Attrs_0) and (I_1, Attrs_1), which satisfy the same access policy, and sends them to the challenger C_2. C_2 randomly selects b ← {0, 1}, generates a credential for I_b and presents it (i.e., runs SowCred to generate Π_b), and then gives Π_b to A_2.

Guess. A_2 outputs b' and wins if b' = b.

4.2. Scheme construction

In this scheme, the user is untrusted, the issuer is semi-trusted, the channel between the verifier and the issuer is trusted, and the remaining channels are untrusted. Attackers can steal information from untrusted channels, forge information and impersonate users. Therefore, this paper adopts zero-knowledge proofs to let the user verify the credential sent by the issuer and to prove to the verifier that the credential is the user's own, while also reducing the risk of privacy leakage, as shown in Fig. 1.

• Issuer: The issuer is the party that issues the credential, usually an authority or trusted entity (such as a government, enterprise, or decentralized organization), which is responsible for verifying the identity or attributes of the user and generating the encrypted credential. Before sending the credential, the issuance criteria are verified.
• User: The user is the holder of the credential, requests the credential from the issuer and, upon receipt, verifies the credential.
• Verifier: The verifier is the receiver of credentials, who receives the user's credentials, downloads the criteria and auxiliary verification data through a secure channel, verifies the access criteria, and then verifies the user's identity.

4.2.1. System initialization

Setup(1^λ, 1^h, 1^m) → pp

− I selects a cyclic group G of order q and generates generators (g_0, g_1, g_2, γ, h_0, h_1, h_2, ũ, {u_i}_{i∈[0,n]}) ∈ G, along with hash functions H_1: {0,1}* → Z_q and H_2: {0,1}* × {0,1}* → Z_q;
− Define a Merkle tree of height h, where for public input (T_root, cred) it can be proved that cred ∈ T_κ through an authentication path θ;
− Define the global period epoc and the pseudorandom function PRF_{g,s}(k) := g^{1/(s+k+1)};
− I selects random numbers y_1, y_2 ← Z_q, computes Y_1 = h_1^{y_1} and Y_2 = h_2^{y_2}, and sets the issuer secret key isk = (y_1, y_2) and the issuer public key ipk = (Y_1, Y_2);
− Set the public parameters pp = (q, G, g_0, g_1, g_2, γ, h_0, h_1, h_2, {u_i}_{i∈[0,n]}, H_1, H_2, T_κ, T_root, epoc, ũ, ipk).

IssueSetup_I(pp) → (I, ι_pub)

− Define the relevant issuance criteria ι = (ι_zk, ι_pub) and set IssueCriteria[I] = IssueCriteria[I] ∪ ι;
− For the public input auxiliary information iaux_zk, prove: ι_zk(Attrs, iaux_zk) = 1;
− Publish (I, ι_pub).

SowSetup_V(pp) → V

− V defines access criteria φ for user attributes Attrs (multiple access criteria φ_i can be defined), and sets AccessCriteria[V] = AccessCriteria[V] ∪ {φ_i};
− For public input (T_root, cred, aux), prove: φ(Attrs, aux) = 1 ∧ cred ∈ T_κ;
− Publish the access criteria set V.
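The SPK notation of Section 3.2 can be illustrated with its simplest instance, SPK{(x) : y = g^x}(m): a Schnorr proof made non-interactive with the Fiat-Shamir heuristic. The group parameters below are toy values for illustration only.

```python
import hashlib
import secrets

p, q, g = 1019, 509, 4   # toy Schnorr group (p = 2q + 1); illustrative only

def _hash(*parts) -> int:
    """Fiat-Shamir challenge: hash the statement and message into Z_q."""
    data = b"|".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def spk_sign(x: int, m: bytes):
    """SPK{(x) : y = g^x}(m), i.e. a Schnorr signature under key pair (g^x, x)."""
    a = secrets.randbelow(q)          # prover's ephemeral secret
    A = pow(g, a, p)                  # commitment
    c = _hash(g, pow(g, x, p), A, m)  # challenge binds statement and message
    z = (a + c * x) % q               # response
    return A, z

def spk_verify(y: int, m: bytes, proof) -> bool:
    A, z = proof
    c = _hash(g, y, A, m)
    return pow(g, z, p) == A * pow(y, c, p) % p   # g^z ?= A * y^c

x = secrets.randbelow(q)
y = pow(g, x, p)
proof = spk_sign(x, b"some message")
assert spk_verify(y, b"some message", proof)
```

The SPK_1 and SPK_3 used in Section 4.2 are conjunctions of several such discrete-logarithm relations sharing one challenge, but each clause follows the same commit-challenge-respond shape.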
4.2.2. Credential registration

IssueReq_U(pp, I, Attrs, w, ctx, (iaux_zk, iaux_pub)) -> (Cm, Pi^1_U, iaux_zk, iaux_pub)

 Generate the anonymous key nk and the rate-limiting key rk using the pseudorandom function PRF and the context ctx: nk = PRF(ctx), rk = PRF(epoch ∥ ctx). Define m attributes Attrs = {attr_1, attr_2, ..., attr_m};
 Select a random blind factor r <- Zq and compute the Pedersen commitment Cm ∈ G:

   Cm = Commit(nk, rk, Attrs; r) = g1^{nk} g2^{rk} (∏_{i=1}^{m} u_i^{H1(attr_i)}) u_0^{r};

 Set w = (r, nk, rk, Attrs) (the collected private witness w), select x_u, s', t <- Zq and generate Pi^1_U:

   Pi^1_U = SPK1{ (x_u, s', t, r, nk, rk, Attrs) :
       X_u = g1^{x_u} g2^{s'}
     ∧ zeta = Y1^{x_u} Y2^{s'} · Cm^{t}
     ∧ iota_zk(Attrs, iaux_zk) = 1 }(X_u, zeta, iaux_zk, iaux_pub);

 Send (Pi^1_U, X_u, zeta, iaux_zk, iaux_pub) to the issuer I;
 On receiving Pi^1_V: if verification passes, receive the returned authentication path theta together with s'' and k;
 Locally store (nk, rk, r, Attrs, theta, s, t, epoch, k), where s = s' + s'' and k is the maximum number of accesses allowed within epoch.

IssueGrant_I(pp, (I, iota_pub), (Pi^1_U, iaux_zk), iaux_pub) -> (cred, s'', (theta, T_root), k, T_kappa)

 Verify iota_pub(iaux_pub): iota_pub checks the public auxiliary information iaux_pub;
 Verify Pi^1_U = SPK1, where Pi^1_U proves the correctness of (zeta, X_u, iaux_zk, iaux_pub) and that the hidden attributes satisfy the issuance criteria iota_zk. If verification fails, reject issuance and abort (⊥);
 Otherwise, I randomly selects s'' <- Zq, defines the maximum number of accesses k allowed within epoch, and calculates cred = (zeta · Y2^{s''}) · u_0^{H2(epoch ∥ k)}. It then runs T_kappa = T.Insert(cred) to register the anonymous credential, where the registered cred is known only to the issuer. Next, it runs theta = T_kappa.AuthPath(cred) to generate the authentication path, updates the Merkle tree root T_root, and uploads it to a public panel such as a blockchain;
 Next, select z0, z1 <- Zq and generate Pi^1_V:

   Pi^1_V = SPK2{ (z0, z1, y1, y2) :
       Y_u = g1^{y1} g2^{y2}
     ∧ U = (zeta · Y2^{s''})^{z1} · u_0^{H2(epoch ∥ k) · z0} }(Y_u, s'', k, U);

 I stores the Merkle tree T_kappa and sends (Pi^1_V, s'', k, theta) to the user.

4.2.3. Show and verification of the credential

ShowCred_U(pp, V, T_root, cred, theta, {w_i}, {aux_i}_{i=1}^{n}) -> (Pi~, {aux_i}_{i=1}^{n})

 The user sends an access request message msg, and the verifier returns a random number R = H2(nonce ∥ msg);
 The user locally retrieves the verifier's access criteria V and the root node T_root of the tree containing cred;
 Upon receiving (nonce, R), verify R ?= H2(nonce ∥ msg), then randomly select alpha0 <- Zq. For the n access criteria Phi' = {phi_1, phi_2, ..., phi_n}, partition the attribute set into public attributes ATTR and secret attributes {attr_{j ∉ ATTR}}. Compute the commitment using the blind factor r:

   Cm = Commit(nk, rk, {attr_{j ∉ ATTR}}; r)
      = ( g1^{nk} g2^{rk} ∏_{attr_j ∉ ATTR} u_j^{H1(attr_j)} u_0^{r} ) · ∏_{attr_i ∈ ATTR} u_i^{H1(attr_i)};

 Next, the number of credential displays is initialized to n_j = 1, and n_j = n_j + 1 (0 <= n_j < k) is set for each generation of the zero-knowledge proof Pi~ = SPK3. The generation of Pi~ = SPK3 is as follows:

   Pi~ = SPK3{ (nk, rk, Attrs, alpha0, x_u, s, t, n_j, {attr_{j ∉ ATTR}}) :
       X0 = g0^{alpha0} gamma^{H1(theta)}
     ∧ zeta = Y1^{x_u} Y2^{s} · Cm^{t}
     ∧ eta = PRF_{rk,u~}(n_j) = u~^{1/(rk + n_j + 1)}
     ∧ Gamma = u_0^{x_u · R} · PRF_{nk,u~}(n_j) = u_0^{x_u · R} · u~^{1/(nk + n_j + 1)}
     ∧ 0 <= n_j < k
     ∧ phi_1(Attrs, aux_1) = 1
     ∧ ...
     ∧ phi_i(Attrs, aux_i) = 1 }({aux_i}, X0, zeta, eta, Gamma, T_root);

 Send (Pi~, {aux_i}_{i=1}^{n}, X0, zeta, eta, Gamma, (theta, T_root), Phi', {attr_{i ∈ ATTR}}) to the verifier V.

VerifyShow_V(pp, (V, cred, T_root), (Pi~, {aux_i}_{i=1}^{n})) -> 0/1

 V checks whether the user's submitted Phi' matches its defined access criteria set Phi. Using theta, it verifies the path and checks cred ?= zeta · u_0^{H2(epoch ∥ k)};
 If (eta, Gamma) is valid, this proves that n_j is within the range allowed to be displayed within epoch;
 If verification succeeds, accept the request; otherwise reject it and invoke the RevokeCred function to revoke cred. For the specific process, please refer to Fig. 2.

4.2.4. Credential revocation

RevokeCred(pp, T_kappa, cred) -> T_kappa'

 Search for cred ∈ T_kappa; if cred is not found, terminate the process;
 Otherwise run T_kappa' = T_kappa.Remove(cred), then store and update the Merkle tree T_kappa';
 Return T_kappa' and publicly announce that cred has been revoked.

5. Analysis of correctness and security

5.1. Correctness analysis

5.1.1. Details of SPK1

SPK1 can be implemented using standard discrete-logarithm proof techniques.

1. (Commitment.) The user randomly selects s1, s2, s3 ∈_R Zq and computes:

   T1 = g1^{s1} g2^{s2},   T2 = Y1^{s1} Y2^{s2} · Cm^{s3} = (g^{y1})^{s1} (g^{y2})^{s2} · Cm^{s3}.

2. (Challenge.) The scheme uses a non-interactive zero-knowledge proof, where the user generates the challenge c:

   c = H(T1 ∥ T2 ∥ X_u ∥ zeta ∥ iaux_zk ∥ iaux_pub).

3. (Proof.) The user generates the proof Pi^1_U that satisfies the issuer policy iota_zk, i.e., iota_zk(Attrs, iaux_zk) = 1, and computes S1 = s1 - c·x_u, S2 = s2 - c·s', S3 = s3 - c·t. The proof is Pi^1_U = (c, S1, S2, S3), and the user sends ((Pi^1_U, iaux_zk), iaux_pub) to the issuer I.

4. (Verify.) I computes T1' = X_u^{c} g1^{S1} g2^{S2} and T2' = zeta^{c} Y1^{S1} Y2^{S2} · Cm^{S3}, and verifies c ?= H(T1' ∥ T2' ∥ X_u ∥ zeta ∥ iaux_zk ∥ iaux_pub). If verification passes, then Pi^1_U is correct; otherwise abort.
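The SPK1 steps above follow the usual commit/challenge/response pattern with a Fiat–Shamir challenge. The following is a minimal runnable sketch for the first clause of the statement (knowledge of (x_u, s') with X_u = g1^{x_u} g2^{s'}); the toy Schnorr group and the bases g1, g2 are illustrative assumptions, not the paper's parameters.

```python
import hashlib
import secrets

# Toy Schnorr group (p = 2q + 1); stands in for the paper's group G of prime order q.
p, q = 2039, 1019
g1, g2 = 4, 9  # two order-q bases (hypothetical toy values)

def fiat_shamir(*vals) -> int:
    """Non-interactive challenge c = H(T1 || Xu), reduced into Zq."""
    data = b"|".join(str(v).encode() for v in vals)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def spk1_prove(xu: int, s: int):
    """Prove knowledge of (xu, s) with Xu = g1^xu * g2^s (steps 1-3 of SPK1)."""
    Xu = pow(g1, xu, p) * pow(g2, s, p) % p
    r1, r2 = secrets.randbelow(q), secrets.randbelow(q)  # commitment randomness
    T1 = pow(g1, r1, p) * pow(g2, r2, p) % p             # commitment
    c = fiat_shamir(T1, Xu)                              # challenge
    S1, S2 = (r1 - c * xu) % q, (r2 - c * s) % q         # responses
    return Xu, (c, S1, S2)

def spk1_verify(Xu: int, proof) -> bool:
    """Recompute T1' = Xu^c * g1^S1 * g2^S2 and re-derive the challenge (step 4)."""
    c, S1, S2 = proof
    T1 = pow(Xu, c, p) * pow(g1, S1, p) * pow(g2, S2, p) % p
    return c == fiat_shamir(T1, Xu)

Xu, proof = spk1_prove(secrets.randbelow(q), secrets.randbelow(q))
assert spk1_verify(Xu, proof)
```

The correctness argument is the same as in step 4 of SPK1: T1' = g1^{c·xu + S1} g2^{c·s + S2} = g1^{r1} g2^{r2} = T1, so the recomputed challenge matches.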
Fig. 2. System flowchart.

5.1.2. Details of SPK2

SPK2 can also be implemented using standard discrete-logarithm proof techniques.

1. (Commitment.) The issuer/trust authority randomly selects t1, t2, t3, t4 ∈_R Zq and computes:

   C1 = g1^{t1} g2^{t2},   C2 = (zeta · Y2^{s''})^{t3} · u_0^{H2(epoch ∥ k) · t4}.

2. (Challenge.) The scheme uses a non-interactive zero-knowledge proof, where the issuer generates the challenge c:

   c = H(C1 ∥ C2 ∥ Y_u ∥ U ∥ s'' ∥ k).

3. (Proof.) The issuer generates the proof Pi^1_V by computing the responses C_1 = t1 - c·y1, C_2 = t2 - c·y2, C_3 = t3 - c·z1, C_4 = t4 - c·z0. The proof is Pi^1_V = (c, C_1, C_2, C_3, C_4), and I sends (Pi^1_V, s'', k) to the user.

4. (Verify.) The user computes C1' = Y_u^{c} g1^{C_1} g2^{C_2} and C2' = U^{c} (zeta · Y2^{s''})^{C_3} · u_0^{H2(epoch ∥ k) · C_4}, and verifies c ?= H(C1' ∥ C2' ∥ Y_u ∥ U ∥ s'' ∥ k). If verification passes, then Pi^1_V is correct; otherwise abort.

5.1.3. Details of SPK3

The construction of SPK3 includes a zero-knowledge proof and a range proof. We divide SPK3 into two parts, SPK3A and SPK3B. The specific details are as follows:

   SPK3A{ (nk, rk, alpha0, x_u, s, t, n_j, rho1) :
       X0 = g0^{alpha0} gamma^{H1(theta)}
     ∧ zeta = Y1^{x_u} Y2^{s} · Cm^{t}
     ∧ D = g1^{n_j} g2^{rho1}
     ∧ u~ / eta = eta^{rk} eta^{n_j}
     ∧ u~^{R} u_0 / Gamma = u_0^{nk} u_0^{n_j} u_0^{x_u} Gamma^{nk} Gamma^{n_j} }({aux_i}, X0, zeta, eta, Gamma, T_root),

   SPK3B{ (n_j, rho1) : D = g1^{n_j} g2^{rho1} ∧ 0 <= n_j < k }(m).

SPK3B is instantiated as a simple range proof, which will be discussed later. Next, we demonstrate how to implement SPK3A.

1. (Commitment.) The user randomly selects rho'1, rho'2, t3, t4, t5, t6, n7, n8 ∈_R Zq and computes:

   A1 = g0^{t3} gamma^{H1(theta)},   A2 = Y1^{t4} Y2^{t5} Cm^{t6},   A3 = g1^{n7} g2^{n8},
   A4 = eta^{rho'2} eta^{n7},   A5 = u_0^{rho'1} u_0^{n7} u_0^{t4} Gamma^{rho'1} Gamma^{n7}.

2. (Challenge.) Using a non-interactive zero-knowledge proof, the user generates the challenge c:

   c = H(A1 ∥ A2 ∥ A3 ∥ A4 ∥ A5 ∥ X0 ∥ zeta ∥ eta ∥ Gamma ∥ T_root ∥ aux_i).

3. (Proof.) The user generates the proof Pi~ by computing the responses:

   a1 = t3 - c·alpha0,   a2 = t4 - c·x_u,   a3 = t5 - c·s,
   a4 = t6 - c·t,        a5 = n7 - c·n_j,   a6 = n8 - c·rho1,
   a7 = rho'2 - c·rk,    a8 = rho'1 - c·nk.

   The proof is Pi~ = (c, a1, a2, a3, a4, a5, a6, a7, a8), and the user sends (Pi~, aux_i, X0, zeta, eta, Gamma, T_root) to the verifier V.

4. (Verify.) V computes:

   A1' = (X0 · gamma^{-H1(theta)})^{c} g0^{a1} gamma^{H1(theta)},   A2' = zeta^{c} Y1^{a2} Y2^{a3} Cm^{a4},
   A3' = D^{c} g1^{a5} g2^{a6},   A4' = (u~ / eta)^{c} eta^{a7} eta^{a5},
   A5' = (u~^{R} u_0 / Gamma)^{c} u_0^{a8} u_0^{a5} u_0^{a2} Gamma^{a8} Gamma^{a5},

   and verifies c ?= H(A1' ∥ A2' ∥ A3' ∥ A4' ∥ A5' ∥ X0 ∥ zeta ∥ eta ∥ Gamma ∥ T_root ∥ aux_i).

In groups of unknown order, the range proofs currently widely recognized by academia and industry are based on the square-decomposition assumption [43] and n-ary decomposition [40], which can achieve secure and efficient range proofs. However, we note that the range proofs required in authentication protocols always take the form 0 <= n < k. If we set k = 2^kappa, we can easily construct a simple range proof with complexity O(kappa), as shown in Eq. (1):

   POK_RANGE{ (n, r) : C_n = g0^{n} g1^{r} ∧ 0 <= n < 2^{kappa} }.   (1)
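The k = 2^kappa trick behind Eq. (1) reduces the range claim to proving that n has a valid kappa-bit decomposition. The vector bookkeeping can be sketched as follows; this is a minimal illustration of the decomposition step only, not the full committed range proof.

```python
def bit_vectors(n: int, kappa: int):
    """Binary decomposition for the range statement 0 <= n < 2^kappa:
    a_L holds the bits of n, and a_R = a_L - 1^kappa (componentwise)."""
    assert 0 <= n < 2 ** kappa
    a_L = [(n >> i) & 1 for i in range(kappa)]
    a_R = [b - 1 for b in a_L]
    return a_L, a_R

a_L, a_R = bit_vectors(13, 5)  # 13 = 1 + 4 + 8
# <a_L, (1, 2, 4, ...)> reconstructs n; a_L ∘ a_R = 0 certifies every entry is a bit.
assert sum(b << i for i, b in enumerate(a_L)) == 13
assert all(l * r == 0 for l, r in zip(a_L, a_R))
```

These two constraints (the weighted inner product equals n, and the Hadamard product a_L ∘ a_R vanishes) are exactly what the Bulletproofs-style instantiation described next proves in zero knowledge.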
In this scheme, we use a Bulletproofs-based instantiation of SPK3B. Here we briefly describe the proof process; please refer to Refs. [29,43] for details.

1. (Prove.) First, perform a binary decomposition of n, n = Σ_{i=0}^{k-1} b_i 2^i, where b_i ∈ {0,1}. Construct the vectors a_L = (b_0, b_1, ..., b_{k-1}) and a_R = a_L - 1^k (a_{R,i} = b_i - 1). Next, choose blind factors alpha, rho <- Zq and s_L, s_R <- Zq^k, and compute the initial commitments A = h^{alpha} g^{a_L} h^{a_R}, S = h^{rho} g^{s_L} h^{s_R}. Then, construct the non-interactive proof challenges y = H(A, S, C_n) and z = H(y, A, S) based on Fiat–Shamir, and from the polynomials l(x) = (a_L - z·1^k) + s_L·x, r(x) = y^k ∘ (a_R + z·1^k + s_R·x) + z^2·2^k, calculate the inner product t(x) = <l(x), r(x)>; choose tau_x <- Zq and set T = g^{t} h^{tau_x}. The final challenge is x = H(z, y, T); generate the responses l = l(x), r = r(x), t^ = <l, r>, tau = tau_x + x^2·rho, mu = alpha + x·rho. Finally, output the proof pi = (A, S, T, t^, tau, mu, l, r).

2. (Verify.) Upon receiving the commitment C_n and the proof pi, recalculate the challenges y = H(A, S, C_n), z = H(y, A, S), x = H(z, y, T). Next, compute the offset value delta_y = <y^k, z·1^k + z^2·2^k>, and reconstruct the commitment P = A·S^{x}·h^{-mu}·g^{-z·1^k}·h'^{z·1^k + z^2·2^k}, where h' = h ∘ y^{-k}. Then, verify the inner-product relation g^{t^} h^{tau} ?= T^{x} · C_n^{z^2} · g^{delta_y}. If it passes, accept; otherwise, reject.

5.2. Theoretical security analysis

5.2.1. Proof of Game 1

Theorem 1. The scheme is unforgeable if the DLP and CDH assumptions hold.

Proof. Suppose that the adversary A1 forges a credential with non-negligible probability eps; we construct a reduction algorithm B that solves the DLP or CDH problem with non-negligible advantage. The reduction algorithm B embeds the group parameter tuple (g, g^a, g^b) into the problem instance; B can control and program the random oracle, and simulates the whole system:

Setup. The challenger C1 runs the system initialization algorithm Setup(1^lambda, 1^ell, 1^m) to generate pp, and sends pp to the simulator B. C1 saves the issuer private key isk = (y1, y2).

Query. In this phase, A1 makes random-oracle queries (H-Query), Query2, and Query3; C1 responds randomly and records them.

H-Query: The adversary A1 can query the random oracles H1, H2, H3. Before any hash query, B prepares three empty hash lists L1, L2, L3, and defines the query bounds q_{H1}, q_{H2}, q_{H3} to record the query responses.

H1-Query: Before the H1 queries, B randomly selects i1 ∈ [1, q_{H1}]. On input attribute attr_i, B records all queries in the list L1 and responds. If i = i1, B returns the value in the list; otherwise B generates H1(attr_i) and records (i, attr_i, H1(attr_i)) in L1.

H2-Query: Before the H2 queries, B randomly selects i2 ∈ [1, q_{H2}]. After entering each user time period epoch_i and the maximum number k_i of credentials to be initialized, B records all queries in the list L2 and responds. If i = i2, B returns the value in the list; otherwise B generates H2(epoch_i ∥ k_i) according to Eq. (2):

   H2(epoch_i ∥ k_i) = { w*, if i = i2;  w_i, otherwise }.   (2)

Then B records (i, epoch_i ∥ k_i, H2(epoch_i ∥ k_i)) in the list L2.

H3-Query: Before the H3 queries, B randomly selects i3 ∈ [1, q_{H3}]. On input a random nonce_i and message msg_i, B records all queries in the list L3 and responds. If i = i3, B returns the value in the list; otherwise B generates H2(nonce_i ∥ msg_i) according to Eq. (3):

   H2(nonce_i ∥ msg_i) = { r*, if i = i3;  r_i, otherwise }.   (3)

Then B records (i, nonce_i ∥ msg_i, H2(nonce_i ∥ msg_i)) in the list L3, where the oracles H2 and H3 share a hash function.

Query2: In this phase, the adversary A1 forges the parameters (ctx*, nk*, rk*, Attrs*), selects a random blind factor r* ∈ Zq, makes an H1-Query, and generates Cm* = Commit(nk*, rk*, Attrs*; r*). Next, it chooses x_u*, s'*, t* <- Zq and calculates Pi^{1*}_U:

   Pi^{1*}_U = SPK1{ (x_u*, s'*, t*, r*, nk*, rk*, Attrs*) :
       X_u* = g1^{x_u*} g2^{s'*}
     ∧ zeta* = (g^a)^{x_u*} (g^b)^{s'*} · Cm*^{t*}
     ∧ iota_zk(Attrs*, iaux_zk) = 1 }(X_u*, zeta*, iaux_zk, iaux_pub).

Sending (Pi^{1*}_U, iaux_zk, iaux_pub) to the issuer, B checks iota_pub(iaux_pub) and validates Pi^{1*}_U, aborting if it fails; otherwise it selects a random number s'' ∈ Zq and performs an H2-Query. It embeds the tuple (g, g^a, g^b), registers cred* = (zeta* · (g^b)^{s''}) · u_0^{w*}, generates the forged Merkle tree T*, updates the root node to T*_root, selects z0*, z1* <- Zq, and calculates

   Pi^{1*}_V = SPK2{ (z0*, z1*, a, b) : Y_u = g^a g^b ∧ U* = (zeta* · (g^b)^{s''})^{z1*} · u_0^{w* · z0*} }(Y_u, s'', k*, U*),

and sends (Pi^{1*}_V, s'', k*, theta*) to the adversary A1; A1 calculates s = s' + s'' and saves it locally.

Query3: In this phase, A1 shows the proof: using the zero-knowledge simulator S, it runs the algorithm ShowCred to forge a token and interacts with VerifyShow. The adversary A1 forges the message msg* requesting access to V. V selects nonce*, conducts an H3-Query, calculates r*, and returns it to the adversary A1. After the adversary passes the H3-Query hash verification, it selects the public attributes attr_{i ∈ ATTR} and the secret attributes attr*_{j ∉ ATTR}, calculates Cm* = Commit(nk*, rk*, {attr*_{j ∉ ATTR}}; r*), selects n_j (0 <= n_j < k*) and alpha0 <- Zq, generates Pi~*, and sends (Pi~*, {aux_i}_{i=1}^{n}, theta*, T*_root, Phi', attr*_{i ∈ ATTR}) to V.

Forgery. The adversary A1 outputs the forged credential cred* and the corresponding authentication path theta*, which meet the condition that cred* was not generated through legal issuance, yet V, running the algorithm VerifyShow, obtains VerifyShow(pp, (V, cred*, T*_root), (Pi~*, {aux_i}_{i=1}^{n})) = 1. Then, B re-queries H3 by the rewinding technique to obtain r**, modifies the new challenge to c* ≠ c, computes the response, and uses the output Pi~* to extract the witness w* = (x_u*, s*, t*, r*, nk*, rk*, attr*_{j ∉ ATTR}); from the witness it separates zeta* = (g^a)^{x_u*} (g^b)^{s*} · Cm*^{t*} = (g^{ab})^{x_u* · s*} · Cm*^{t*}.

According to the above proof, since computing g^{ab} on G from the forged credential cred* and the corresponding authentication path theta* is difficult, the probability that the adversary A1 successfully forges a credential on the first attempt is eps, and the probability of a single retry is about eps^2. By the general forking lemma, since the adversary A1 performs q_{H3} queries, the probability of success is eps^2 / q_{H3}; the advantage of the simulator B in breaking the CDH hard problem is then eps^2 / q_{H3} - negl.

5.2.2. Proof of Game 2

Theorem 2. The scheme is anonymous and unlinkable if the DDH assumption holds.

Proof. Suppose that the adversary A2 distinguishes credentials with a non-negligible advantage eps; we construct a reduction algorithm B that solves the DDH problem with a non-negligible advantage eps - negl. The reduction algorithm B embeds the group parameter tuple (g, g^a, g^b, g^c) into the DDH problem instance; the adversary A2 must determine whether g^c = g^{ab} or g^c is random. B simulates the whole process:

Setup. The same as the initialization of Game 1.

Query. The adversary A2 can continue to query issuance and show, but cannot query revocation or presentation of challenge credentials. At the same time, it can also make H1-Queries.

Challenge. The adversary A2 submits two attribute sets Attrs0 and Attrs1 that satisfy the same access policy to the challenger C2. Since the parameter related to the attribute set in the zero-knowledge proof is zeta, the challenger C2 calls the simulator S to simulate the SPK with the embedded group parameter tuple (g, g^a, g^b, g^c): it randomly selects a, b <- Zq and calculates zeta1, then selects c <- Zq and calculates zeta2.
Table 3
Average times of cryptographic and Merkle tree operations.

Symbol      Definition                                  secp256k1 (128-bit security)      BLS12-381 (128-bit security)
                                                        100 s/Leaves    1000 s/Leaves     100 s/Leaves      1000 s/Leaves
T_bp        Bilinear pairing operation time                                           0.9162 ms         0.9466 ms
T_h         Hash computation time                       0.0003 ms       0.0000 ms         0.0001 ms         0.0000 ms
T_ep        Exponentiation time in group G              0.0211 ms       0.0314 ms         0.2606 ms         0.2677 ms
T_mp-ec     Elliptic curve point multiplication time    0.0254 ms       0.0234 ms         G1: 0.3958 ms     G1: 0.2686 ms
                                                                                          G2: 0.8140 ms     G2: 0.8009 ms
T_add-ec    Elliptic curve point addition time          0.0462 ms       0.0530 ms         G1: 0.0007 ms     G1: 0.0006 ms
                                                                                          G2: 0.0018 ms     G2: 0.0018 ms
T_kappa_G   Generation algorithm of tree T_kappa        0.0025 ms       0.0024 ms         0.0029 ms         0.0023 ms
T_kappa_V   Verification algorithm of tree T_kappa      0.0004 ms       0.0002 ms         0.0020 ms         0.0002 ms
T_kappa_U   Update algorithm of tree T_kappa            0.0002 ms       0.0002 ms         0.0003 ms         0.0003 ms
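The three Merkle-tree operations timed in Table 3 (generation T_kappa_G, verification T_kappa_V, update T_kappa_U) can be sketched as follows. This is a minimal illustrative implementation with SHA-256; the class and method names are assumptions for this sketch, and the paper's benchmarked implementation may differ.

```python
import hashlib

def H(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

class MerkleTree:
    """Toy credential tree: insert / auth_path / remove roughly mirror the
    T_kappa.Insert, T_kappa.AuthPath and T_kappa.Remove operations of the scheme."""
    def __init__(self, creds=()):
        self.leaves = [H(c) for c in creds]

    def insert(self, cred: bytes):           # registration: T_kappa.Insert(cred)
        self.leaves.append(H(cred))

    def remove(self, cred: bytes):           # RevokeCred: T_kappa.Remove(cred)
        self.leaves.remove(H(cred))

    def _levels(self):
        level = list(self.leaves) or [H(b"")]
        yield level
        while len(level) > 1:
            if len(level) % 2:
                level = level + [level[-1]]  # duplicate last node on odd levels
            level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
            yield level

    def root(self) -> bytes:                 # published as T_root
        *_, top = self._levels()
        return top[0]

    def auth_path(self, cred: bytes):        # theta = T_kappa.AuthPath(cred)
        idx, path = self.leaves.index(H(cred)), []
        for level in self._levels():
            if len(level) == 1:
                break
            if len(level) % 2:
                level = level + [level[-1]]
            path.append((level[idx ^ 1], idx % 2))  # (sibling, is-right-child)
            idx //= 2
        return path

def verify_path(cred: bytes, path, root: bytes) -> bool:
    node = H(cred)
    for sibling, is_right in path:
        node = H(sibling + node) if is_right else H(node + sibling)
    return node == root

tree = MerkleTree([b"cred0", b"cred1", b"cred2", b"cred3"])
theta = tree.auth_path(b"cred2")
assert verify_path(b"cred2", theta, tree.root())
old_root = tree.root()
tree.remove(b"cred1")                        # revocation re-roots the tree
assert tree.root() != old_root
```

Revocation is what keeps the scheme decentralized here: removing a leaf changes the published root, so previously issued paths for the revoked credential stop verifying.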
Table 4
Computation and communication cost analysis.

Algorithm       Parameter            Phase     Computation cost                     Communication cost
Setup           pp                             2T_ep                                (13 + m)|G|
IssueSetup_I    (I, iota_pub)                                                      
ShowSetup_V     V                                                                  
IssueReq_U      Cm                             (3 + m)T_ep + mT_h + 3T_mp-ec        |G|
                Pi^1_U               Proof     (16 + m)T_ep + 3T_mp-ec              2|G| + 5|Zq|
                                     Verify    7T_ep                                
IssueGrant_I    cred                           1T_ep + 2T_mp-ec + 1T_h              
                T_kappa                        T_kappa_G                            
                Pi^1_V               Proof     8T_ep + 1T_h + 3T_mp-ec              2|G| + 6|Zq|
                                     Verify    6T_ep                                
ShowCred_U      Pi~                  Proof     25T_ep                               5|G| + 7|Zq|
                {aux_i}_{i=1}^{n}                                                   ℶ|Zq|
VerifyShow_V                         Verify    26T_ep + T_kappa_V                   
RevokeCred      T_kappa'                       T_kappa_U                            

Note: ℶ is the number of access criteria defined per verifier.
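Using the secp256k1 element sizes reported in the experiments of this section (|G| = 33 bytes, |Zq| = 32 bytes), the communication entries of Table 4 translate into concrete byte counts. A small helper (illustrative, assuming uncompressed scalars and 33-byte compressed points) makes the conversion explicit:

```python
# secp256k1 element sizes used in the paper's tests (bytes).
G_SIZE, ZQ_SIZE = 33, 32

def comm_bytes(n_group: int, n_scalar: int) -> int:
    """Bytes on the wire for n_group group elements plus n_scalar Zq scalars."""
    return n_group * G_SIZE + n_scalar * ZQ_SIZE

# A few Table 4 entries in concrete bytes:
assert comm_bytes(2, 5) == 226   # Pi^1_U : 2|G| + 5|Zq|
assert comm_bytes(2, 6) == 258   # Pi^1_V : 2|G| + 6|Zq|
assert comm_bytes(5, 7) == 389   # Pi~    : 5|G| + 7|Zq|
```

So even the largest proof message in Table 4 stays under 400 bytes on the lightweight curve.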
simulator  selects 𝑏 ← ( {0, 1}, and uses 𝐴𝑡𝑡𝑟𝑠𝑏 to generate the cre- ) 6.2. Algorithm computation and communication cost analysis
{ } ( )
dential display 𝛱̃ 𝑏 . Send 𝛱 ̃ 𝑏 , 𝑎𝑢𝑥𝑖 𝑖=𝑖 , 𝜃, 𝑇𝑟𝑜𝑜𝑡 , 𝛷′ , 𝑎𝑡𝑡𝑟𝑖𝐴𝑇 𝑇 𝑅
𝑖=1
to adversary 2 . Table 4 shows the computational cost and communication cost
Guess. 2 guesses 𝑏 from the output 𝛱 ̃ 𝑏 , and the advantage is of the proposed algorithm in the scheme. The algorithm includes
| [ ] |
defined as: |Pr 𝑏 = 𝑏 12 |. 8 algorithms as follows. 𝑆𝑒𝑡𝑢𝑝, 𝐼𝑠𝑠𝑢𝑒𝑆𝑒𝑡𝑢𝑝𝐼 , 𝑆𝑜𝑤𝑆𝑒𝑡𝑢𝑝𝑉 , 𝐼𝑠𝑠𝑢𝑒𝑅𝑒𝑞𝑈 ,
| |
𝐼𝑠𝑠𝑢𝑒𝐺𝑟𝑎𝑛𝑡𝐼 , 𝑆𝑜𝑤𝐶𝑟𝑒𝑑𝑈 ,
According to the above proof, if two attribute sets satisfying the
𝑉 𝑒𝑟𝑖𝑓 𝑦𝑆𝑜𝑤𝑉 and 𝑅𝑒𝑣𝑜𝑘𝑒𝐶𝑟𝑒𝑑. The computational cost increases
same access policy are (submitted 𝐴𝑡𝑡𝑟𝑠0 , 𝐴𝑡𝑡𝑟𝑠 ̃
) 1 . It( is difficult for 𝛱)𝑏 linearly with the number of attributes 𝑚. We compared the single user
to distinguish between 𝑎 , 𝑏 , 𝑎⋅𝑛𝑘+𝑏⋅𝑟𝑘+𝑎𝑏⋅𝑟 and 𝑎 , 𝑏 , 𝑎⋅𝑛𝑘+𝑏⋅𝑟𝑘+𝑐⋅𝑟
in Table 4 cases for each verifier ℶ access criteria general computation
on G, then adversary 2 succeeds in distinguishing credentials with
and communication costs. Respectively, (94 + 2 𝑚)𝑇𝑒𝑝 + (𝑚 + 2)𝑇 +
non-negligible probability 𝜖𝑞𝐻1 . Then the advantage of the simulator
11𝑇𝑚𝑝𝑒𝑐 + 𝑇𝜅𝐺 + 𝑇𝜅𝑉 and (22 + 𝑚)|G| + (18 + ℶ)|Z𝑞 |. The cost of a single
 to break the DDH hard problem successfully is 𝜖𝑞𝐻1 𝑛𝑒𝑔𝑙.
algorithm is shown in Table 4 below:
Note that even if the underlying Merkle path remains the same
for repeated authentications, the simulator ensures that each creden-
6.3. Computation and communication cost comparison
tial presentation is randomized. Therefore, the adversarys advantage
does not increase by observing identical path values, which remain
In Table 1 of Section 2, we have compared the functions of the ex-
computationally indistinguishable across sessions.
isting schemes [19,2931,3335]. The scheme [3234] satisfies the 𝑘-
times period anonymous authentication function. Since the scheme [32]
Theorem 3. The Scheme is attribute Privacy if the CDH assumption hold.
is constructed based on bilinear pairing. Here, we compare the scheme
Similar anonymity, but in view of the properties rather than identity.
[33,34] with the proposed scheme in the computation cost processes of
6. Performance analysis issuance, show and verification. Using the lightweight curve secp256k1
environment, as shown in Table 5 and Fig. 3. In Table 1, the scheme
6.1. Experimental setup [33] does not support the attribute selection disclosure function and
does not increase with the increase of the number of attributes 𝑚.
The scheme is based on AMD Ryzen9 7945HX processor, Rust 1.75 Therefore, the data results in Fig. 3 show that our scheme is better
and Ubuntu 22.04 LTS environment, and the error is controlled within than the scheme [33] when the number of attributes 𝑚 is small.
5%. The test program is written in 𝑅𝑢𝑠𝑡 and performs benchmark Throughout the entire process, the overall performance was superior
evaluations on SHA-256 hacks, elliptic curve operations, and Merkle to the scheme [34]. Finally, the data results show that our scheme
tree operations with the 128-bit security secp256k1, BLS12-381, and is superior to the existing schemes under the condition of similar
sha2 libraries. The experiment measured the average time of 100 and functions.
1000 operations (as shown in Table 3). All tests were compiled based In addition to the above experimental comparison, we also added
on release optimization to ensure accurate and reliable performance the proposed scheme to test the computational overhead under two
results. different curve environments, BLS12-381 supporting bilinear pairing
9
Table 5
Computation cost comparison.

Scheme        Credential issuance                                 Certificate showing         Authentication credentials
[33]          15T_ep + 10T_mp-ec + 2T_add-ec                      31T_ep + 6T_mp-ec + T_h     20T_ep + 9T_mp-ec + T_h
[34]          (5m + 40)T_ep + (3m + 4)T_h                         (m + 22)T_ep + T_h          (m + 23)T_ep
Our scheme    (m + 35)T_ep + (m + 2)T_h + 11T_mp-ec + T_kappa_G   (16 + m)T_ep + mT_h         19T_ep + T_h + T_kappa_V

Fig. 3. Computation cost comparison.
Fig. 4. Computation cost comparison of different curves.
Fig. 5. Communication cost comparison.
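The communication comparison of Fig. 5 can be reproduced numerically from the per-scheme formulas and the secp256k1 element sizes given in this section (|G| = 33 bytes, |Zq| = 32 bytes); the calculation below assumes the same test setting of one attribute (m = 1) and one access criterion (ℶ = 1):

```python
G_BYTES, ZQ_BYTES = 33, 32       # secp256k1: |G| = 264 bits, |Zq| = 256 bits

def cost(n_group: int, n_scalar: int) -> int:
    """Total bytes for a message of n_group group elements and n_scalar scalars."""
    return n_group * G_BYTES + n_scalar * ZQ_BYTES

m = beth = 1
scheme_33 = cost(8, 11)          # [33]: 8|G| + 11|Zq|
scheme_34 = cost(m + 14, 8)      # [34]: (m + 14)|G| + 8|Zq|
proposed  = cost(4, 9 + beth)    # this paper: 4|G| + (9 + ℶ)|Zq|
assert (scheme_33, scheme_34, proposed) == (616, 751, 452)
```

Under these assumptions the proposed presentation message is the smallest of the three, consistent with the trend shown in Fig. 5.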
In addition to the above comparison, we also tested the computational overhead of the proposed scheme under two different curve environments, BLS12-381 (which supports bilinear pairing) and the lightweight curve secp256k1, as shown in Fig. 4. The experimental results show that the scheme has more advantages under the lightweight curve; it is therefore suggested to deploy the proposed scheme on the curve secp256k1.

Finally, the communication cost of the existing schemes [33,34] is compared, calculated from the size of the data to be transmitted during the anonymous credential display process. We test the communication efficiency on the curve secp256k1, where the group element and integer sizes are |G| = 264 bits = 33 bytes and |Zq| = 256 bits = 32 bytes, respectively. In the test, it is assumed that the access criterion count ℶ is 1 and the number of user attributes is 1. The communication costs of the schemes [33,34] are respectively 8|G| + 11|Zq| and (m + 14)|G| + 8|Zq|. The parameters that our scheme needs to transmit for presentation are (Pi~, {aux_i}_{i=1}^{n}, X0, zeta, eta, Gamma, theta), where Pi~ = (c, A1, A2, A3, A4, A5, A6, A7, A8). Therefore, the total communication cost during transmission is 4|G| + (9 + ℶ)|Zq|, as shown in Fig. 5.

7. Conclusion

In this paper, we propose a k-times periodic anonymous authentication scheme that does not require the issuer to hold a key and that supports access criteria. Compared with other existing k-times periodic anonymous authentication schemes, the proposed scheme not only has a lower computational cost, but also eliminates the need for the issuer to hold the issuing information or the user key, and only needs to upload the root path of the Merkle tree to the blockchain or a public panel, which ensures that subsequent authentication can still be carried out even if the issuing center fails. In terms of security, it satisfies a series of DAC security properties, including anonymity, unlinkability, unforgeability and attribute privacy. The limitation of the current scheme is that it relies on classical cryptography, which cannot resist quantum computing attacks.
H. Di et al. Computer Standards & Interfaces 97 (2026) 104097
as lattice-based signature, coding cryptography, or multivariate poly- [14] C. Garman, M. Green, I. Miers, Decentralized anonymous credentials, in: Proceed-
nomial encryption in future research to construct periodic 𝑘-times ings of the 21st NDSS, 2014, URL: https://www.ndss-symposium.org/ndss2014/
authentication schemes with post-quantum security. decentralized-anonymous-credentials.
[15] D. Derler, C. Hanser, D. Slamanig, A new approach to efficient revocable
attribute-based anonymous credentials, in: Cryptography and Coding, 2015, pp.
CRediT authorship contribution statement 5774.
[16] T. Bui, T. Aura, Application of public ledgers to revocation in distributed access
Hongyan Di: Writing original draft, Methodology, Formal analy- control, in: Information and Communications Security, 2018, pp. 781792.
[17] A. Sonnino, M. Al-Bassam, S. Bano, S. Meiklejohn, G. Danezis, Coconut: Thresh-
sis, Data curation, Conceptualization. Yinghui Zhang: Writing review
old issuance selective disclosure credentials with applications to distributed
& editing, Supervision, Project administration, Methodology, Funding ledgers, in: 26th Annual Network and Distributed System Security Symposium,
acquisition. Ziqi Zhang: Writing original draft, Formal analysis, Data NDSS, 2019, URL: https://arxiv.org/pdf/1802.07344.
curation. Yibo Pang: Project administration, Formal analysis, Data [18] H. Halpin, Nym credentials: Privacy-preserving decentralized identity with
curation. Rui Guo: Writing original draft, Methodology, Formal anal- blockchains, in: 2020 Crypto Valley Conference on Blockchain Technology,
ysis. Yangguang Tian: Writing original draft, Project administration, CVCBT, 2020, pp. 5667, http://dx.doi.org/10.1109/CVCBT50464.2020.00010.
[19] H. Cui, M. Whitty, A. Miyaji, Z. Li, A blockchain-based digital identity manage-
Methodology, Funding acquisition. ment system via decentralized anonymous credentials, in: Proceedings of the 6th
ACM International Symposium on Blockchain and Secure Critical Infrastructure,
Declaration of competing interest 2025, pp. 111, http://dx.doi.org/10.1145/3659463.3660027.
[20] C. Lin, D. He, H. Zhang, L. Shao, X. Huang, Privacy-enhancing decentralized
anonymous credential in smart grids, Comput. Stand. Interfaces 75 (2021)
The authors declare that they have no known competing finan-
103505, http://dx.doi.org/10.1016/j.csi.2020.103505.
cial interests or personal relationships that could have appeared to [21] Z. Ma, J. Zhang, Y. Guo, Y. Liu, X. Liu, W. He, An efficient decentralized key
influence the work reported in this paper. management mechanism for VANET with blockchain, IEEE Trans. Veh. Technol.
69 (2020) 58365849, http://dx.doi.org/10.1109/TVT.2020.2972923.
Data availability [22] J. Zhang, J. Cui, H. Zhong, I. Bolodurina, L. Liu, Intelligent drone-assisted
anonymous authentication and key agreement for 5G/B5G vehicular ad-hoc
networks, IEEE Trans. Netw. Sci. Eng. 8 (2021) 29822994, http://dx.doi.org/
Data will be made available on request. 10.1109/TNSE.2020.3029784.
[23] D. Liu, H. Wu, C. Huang, J. Ni, X. Shen, Blockchain-based credential management
for anonymous authentication in SAGVN, IEEE J. Sel. Areas Commun. 40 (2022)
References 31043116, http://dx.doi.org/10.1109/JSAC.2022.3196091.
[24] D. Liu, H. Wu, J. Ni, X. Shen, Efficient and anonymous authentication with
[1] K.Y. Lam, C.H. Chi, Identity in the internet-of-things (IoT): New challenges and succinct multi-subscription credential in SAGVN, IEEE Trans. Intell. Transp. Syst.
opportunities, in: Information and Communications Security, 2016, pp. 1826. 23 (2022) 28632873, http://dx.doi.org/10.1109/TITS.2022.3147354.
[2] K. Shafique, B.A. Khawaja, F. Sabir, S. Qazi, M. Mustaqim, Internet of things [25] L. Wei, Y. Zhang, J. Cui, H. Zhong, I. Bolodurina, D. He, A threshold-based full-
(IoT) for next-generation smart systems: A review of current challenges, future decentralized authentication and key agreement scheme for VANETs powered
trends and prospects for emerging 5G-IoT scenarios, IEEE Access 8 (2020) by consortium blockchain, IEEE Trans. Mob. Comput. 23 (2024) 1250512521,
2302223040, http://dx.doi.org/10.1109/ACCESS.2020.2970118. http://dx.doi.org/10.1109/TMC.2024.3412106.
[3] L. Ante, C. Fischer, E. Strehle, A bibliometric review of research on digital [26] M. Zeng, J. Cui, Q. Zhang, H. Zhong, D. He, Efficient revocable cross-domain
identity: Research streams, influential works and future research paths, J. Manuf. anonymous authentication scheme for IIoT, IEEE Trans. Inf. Forensics Secur. 20
Syst. 62 (2022) 523538, http://dx.doi.org/10.1016/j.jmsy.2022.01.005. (2025) 9961010, http://dx.doi.org/10.1109/TIFS.2024.3523198.
[4] M.A. Olivero, A. Bertolino, F.J.D. Mayo, M.J.E. Cuaresma, I. Matteucci, Digital [27] I. Teranishi, J. Furukawa, K. Sako, K-times anonymous authentication (extended
persona portrayal: Identifying pluridentity vulnerabilities in digital life, J. Inf. abstract), in: Advances in Cryptology - ASIACRYPT 2004, 2004, pp. 308322.
Secur. Appl. 52 (2020) 102492, URL: https://api.semanticscholar.org/CorpusID: [28] L. Nguyen, R. Safavi-Naini, Dynamic k-times anonymous authentication, in:
215881538. Applied Cryptography and Network Security, 2005, pp. 318333.
[29] M.H. Au, W. Susilo, Y. Mu, Constant-size dynamic k-TAA, in: Security and
[5] M.S. Ferdous, F. Chowdhury, M.O. Alassafi, In search of self-sovereign identity
Cryptography for Networks, 2006, pp. 111125.
leveraging blockchain technology, IEEE Access 7 (2019) 103059103079, http:
[30] U. Chaterjee, D. Mukhopadhyay, R.S. Chakraborty, 3PAA: A private PUF protocol
//dx.doi.org/10.1109/ACCESS.2019.2931173.
for anonymous authentication, IEEE Trans. Inf. Forensics Secur. 16 (2021)
[6] A. Shabtai, Y. Elovici, L. Rokach, List of data breaches and cyber attacks in 2023.
756769, http://dx.doi.org/10.1109/TIFS.2020.3021917.
Media report. IT governance, 2023, URL: https://www.itgovernance.co.uk/blog/
[31] J. Huang, W. Susilo, F. Guo, G. Wu, Z. Zhao, Q. Huang, An anonymous
list-of-data-breaches-andcyber-attacks-in-2023.
authentication system for pay-as-you-go cloud computing *, IEEE Trans. Depend-
[7] P.C. Bartolomeu, E. Vieira, S.M. Hosseini, J. Ferreira, Self-sovereign identity:
able Secur. Comput. 19 (2) (2022) 12801291, http://dx.doi.org/10.1109/TDSC.
Use-cases, technologies, and challenges for industrial IoT, in: 2019 24th IEEE
2020.3007633.
International Conference on Emerging Technologies and Factory Automation,
[32] J. Camenisch, S. Hohenberger, M. Kohlweiss, A. Lysyanskaya, M. Meyerovich,
ETFA, 2019, pp. 11731180, http://dx.doi.org/10.1109/ETFA.2019.8869262.
How to win the clonewars: efficient periodic n-times anonymous authentication,
[8] European Union, Regulation (EU) 2016/679 of the European parliament and of
in: Proceedings of the 13th ACM Conference on Computer and Communications
the council of 27 april 2016 on the protection of natural persons with regard
Security, 2006, pp. 201210, http://dx.doi.org/10.1145/1180405.1180431.
to the processing of personal data and on the free movement of such data,
[33] B. Lian, G. Chen, M. Ma, J. Li, Periodic 𝐾 -times anonymous authentication with
and repealing directive 95/46/EC (general data protection regulation), 2016,
efficient revocation of violators credential, IEEE Trans. Inf. Forensics Secur. 10
[Online] Available: URL: https://eur-lex.europa.eu/eli/reg/2016/679/oj/eng.
(3) (2015) 543557, http://dx.doi.org/10.1109/TIFS.2014.2386658.
[9] A. Mühle, A. Grüner, T. Gayvoronskaya, C. Meinel, A survey on essential components of a self-sovereign identity, Comput. Sci. Rev. 30 (2018) 80–86, http://dx.doi.org/10.1016/j.cosrev.2018.10.002.
[10] European Union, Regulation (EU) 2024/1183 of the European Parliament and of the Council of 5 June 2024 on European digital identity wallets, 2024, URL: https://eur-lex.europa.eu/eli/reg/2024/1183/oj. (Accessed 13 October 2024).
[11] D. Chaum, Security without identification: transaction systems to make big brother obsolete, Commun. ACM 28 (1985) 1030–1044, http://dx.doi.org/10.1145/4372.4373.
[12] D. Chaum, Showing credentials without identification. Signatures transferred between unconditionally unlinkable pseudonyms, in: Proc. of a Workshop on the Theory and Application of Cryptographic Techniques on Advances in Cryptology — EUROCRYPT '85, 1986, pp. 241–244.
[13] J. Camenisch, A. Lysyanskaya, An efficient system for non-transferable anonymous credentials with optional anonymity revocation, in: Advances in Cryptology — EUROCRYPT 2001, 2001, pp. 93–118.
[34] Y. Yang, W. Xue, J. Sun, G. Yang, Y. Li, H. Hwa Pang, R.H. Deng, PkT-SIN: A secure communication protocol for space information networks with periodic k-time anonymous authentication, IEEE Trans. Inf. Forensics Secur. (2024) 6097–6112, http://dx.doi.org/10.1109/TIFS.2024.3409070.
[35] C. Wiraatmaja, S. Kasahara, Scalable anonymous authentication scheme based on zero-knowledge set-membership proof, Distrib. Ledger Technol. 4 (2025), http://dx.doi.org/10.1145/3676285.
[36] R. Canetti, Y. Chen, J. Holmgren, A. Lombardi, G.N. Rothblum, R.D. Rothblum, D. Wichs, Fiat-Shamir: from practice to theory, 2019, http://dx.doi.org/10.1145/3313276.3316380.
[37] J. Camenisch, M. Stadler, Efficient group signature schemes for large groups, in: Advances in Cryptology — CRYPTO '97, 1997, pp. 410–424.
[38] M. Rosenberg, J. White, C. Garman, I. Miers, zk-creds: Flexible anonymous credentials from zkSNARKs and existing identity infrastructure, in: 2023 IEEE Symposium on Security and Privacy, SP, 2023, pp. 790–808, http://dx.doi.org/10.1109/SP46215.2023.10179430.
H. Di et al. Computer Standards & Interfaces 97 (2026) 104097
[39] Y. Dodis, A. Yampolskiy, A verifiable random function with short proofs and keys, 2004, URL: https://eprint.iacr.org/2004/310. Cryptology ePrint Archive, Paper 2004/310.
[40] J. Groth, On the size of pairing-based non-interactive arguments, in: Advances in Cryptology — EUROCRYPT 2016, 2016, pp. 305–326.
[41] V. Shoup, Sequences of games: a tool for taming complexity in security proofs, IACR Cryptol. ePrint Arch. (2004) 332, URL: http://eprint.iacr.org/2004/332.
[42] M. Bellare, P. Rogaway, Random oracles are practical: a paradigm for designing efficient protocols, in: Proceedings of the 1st ACM Conference on Computer and Communications Security, 1993, pp. 62–73, http://dx.doi.org/10.1145/168588.168596.
[43] B. Bünz, J. Bootle, D. Boneh, A. Poelstra, P. Wuille, G. Maxwell, Bulletproofs: Short proofs for confidential transactions and more, in: 2018 IEEE Symposium on Security and Privacy, SP, 2018, pp. 315–334, http://dx.doi.org/10.1109/SP.2018.00020.

Hongyan Di is currently studying for a master's degree in Cyberspace and Information Security at Xi'an University of Posts and Telecommunications. Her research interests include cross-domain authentication and digital signature security.

Yinghui Zhang received his Ph.D. degree in Cryptography from Xidian University, China, in 2013. He is a professor at the School of Cyberspace Security, National Engineering Research Center for Secured Wireless (NERCSW), Xi'an University of Posts & Telecommunications. He was a research fellow at the School of Information Systems, Singapore Management University. He has published over 100 research articles in ACM CSUR, IEEE TDSC, IEEE TCC, Computer Networks, etc. He has served on the program committees of several conferences and on the editorial boards of several international journals in information security. His research interests include public key cryptography, cloud security, and wireless network security.

Yibo Pang received the B.S. degree in Information Security from the School of Cyberspace Security, Xi'an University of Posts and Telecommunications, Xi'an, China, in 2020, and the M.S. degree in Cyberspace Security from the School of Cyberspace Security, Xi'an University of Posts and Telecommunications, Xi'an, China, in 2023. He is currently pursuing a Ph.D. at Xi'an University of Posts and Telecommunications. His research interests include multimedia security and privacy.

Rui Guo is an associate professor and master's supervisor at Xi'an University of Posts and Telecommunications. He has presided over a total of 9 scientific research projects, including those funded by the National Natural Science Foundation of China, the Key Research and Development Program of Shaanxi Province, and the Basic Research Program of Shaanxi Province. As a major participant, he has participated in and completed more than 10 projects, such as the National Key Research and Development Plan and the National Natural Science Foundation of China. As the first author, he has published over 20 academic papers, among which 12 are indexed by SCI (including 1 TOP 1% ESI highly cited paper).

Dr. Yangguang Tian received his Ph.D. degree in applied cryptography from the University of Wollongong, Australia. After his Ph.D., he did post-docs at the School of Information Systems, Singapore Management University, and at iTrust, Singapore University of Technology and Design. Before Surrey, he was a research-based assistant professor at Osaka University, Japan. He is currently a lecturer at the University of Surrey, UK. His research interests include applied cryptography, network security, blockchain technologies, and privacy-preserving technologies. Dr. Tian's recent research works have been published in cybersecurity-related international conferences and journals, such as USENIX '24, AsiaCCS '24, IEEE TIFS '23, IEEE TDSC '24, etc.

Ziqi Zhang is currently studying for a master's degree in Cyberspace and Information Security at Xi'an University of Posts and Telecommunications. Her research interests include digital signature security and its applications.

View File

@@ -0,0 +1,897 @@
Computer Standards & Interfaces 97 (2026) 104094
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
How AI agents transform reflective practices: A three-semester comparative
study in socially shared regulation of learning
Yumin Zheng a, Fengjiao Tu b, Fengfang Shu a,c, Chaowang Shang a,*, Lulu Chen a, Jiang Meng a
a Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China
b Department of Information Science, University of North Texas, 3940 North Elm, Denton, Texas, 76203, USA
c Institute of Open Education, Wuhan Vocational College of Software and Engineering, Wuhan Open University, Wuhan, China

A R T I C L E  I N F O

Keywords:
Artificial intelligence agent
Socially shared regulation of learning
Reflection quality
Collaborative learning
Generative artificial intelligence

A B S T R A C T

High-quality reflection has been a challenging barrier in the socially shared regulation of learning (SSRL). Especially with the emergence of generative artificial intelligence (GAI), traditional methods such as reflection reports may increase the students' risk of superficial reflection. This study uses an artificial intelligence agent (AI agent) to design a reflection assistant, which aims to enhance students' reflection ability through continuous questioning and real-time, content-specific feedback based on their written reflections. Through a comparative experiment conducted over three semesters, this study demonstrates the different impacts of three reflection methods (reflection reports, reflection short-answer questions, and AI agents) on the quality of university students' reflections. The results indicate that there is a significant difference in the quality of reflection among the three reflection methods. Students using AI agents show the highest levels of reflection, characterized primarily by connective reflection and critical reflection. Epistemic network analysis further reveals that the AI agent reflection method is more effective in improving the reflection quality of low-performance teams than that of high-performance teams. This expands AI agents' use in SSRL reflection, introduces new methods for the GAI era, and provides practical experience and reflection intervention strategies for teachers and instructional designers in SSRL.
1. Introduction

With the rapid advancement of generative artificial intelligence (GAI), numerous challenges in collaborative learning have been addressed with innovative solutions [1,2]. GAI applications, represented by artificial intelligence agents (AI agents), have introduced revolutionary transformations to education. These transformations are mainly due to the powerful expert-level conversational abilities and user-friendly accessibility [3].

The socially shared regulation of learning (SSRL) strategy serves as a crucial mechanism for enhancing learning outcomes in collaborative learning [4]. Through the SSRL strategy, learners collaboratively set goals and monitor progress, thereby improving their performance [5]. Reflection is a critical component of SSRL, aiding learners in recognizing and refining their learning processes [6]. However, achieving high-quality reflection remains a challenge [7].

There are various methods to enhance reflection quality in SSRL, such as providing prompts and templates in reflection reports [8]. Nowadays, these traditional methods fall short of addressing the challenges posed by GAI [9]. Students may easily rely on tools like ChatGPT to complete short-answer questions, journals, and reports. Kiy [10] has shown that 76% of university students use ChatGPT for their assignments, with the percentage being even higher among software engineering students, reaching 93% [11]. The widespread use of GAI has profoundly transformed traditional methods of learning and teaching, and this era calls for new approaches to reflection.

AI agents are computing systems with capabilities for autonomous perception, decision making, and action [12]. They use GAI to learn, reason, and perform corresponding tasks or actions based on the surrounding environment and input information. To enable practical implementation, rule-based AI agents have been developed that require no programming and can be deployed simply by defining task objectives and roles via prompts. In educational contexts, these rule-based AI agents are commonly used for personalized instruction and intelligent tutoring due to their ability to engage in real-time dialogue and provide immediate feedback [13].
* Corresponding author.
E-mail address: phdzhengyumin@mails.ccnu.edu.cn (C. Shang).
https://doi.org/10.1016/j.csi.2025.104094
Received 1 February 2025; Received in revised form 28 October 2025; Accepted 10 November 2025
Available online 11 November 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
The rule-based AI agent provides an effective approach for supporting SSRL reflection. Instructors can set specific SSRL task directions, and the agent guides students based on the reflection checklist while adaptively generating questions according to students' responses. Each follow-up question is dynamically generated based on the student's prior answers and the specific SSRL task, making it difficult for students to rely on external AI tools like ChatGPT to provide generic responses. This continuous dialogue mechanism supports deeper, more analytical reflection and reduces the risk of superficial reflection [14]. Despite AI agents having broad application prospects, current research on improving learners' reflection quality through AI agents remains limited and requires further in-depth exploration.

Against this backdrop, this study introduces a rule-based AI agent reflection assistant within the SSRL framework to help learners enhance their reflection quality. This study aims to examine the impact of the AI agent on SSRL reflection quality by comparing three reflection methods: reflection reports, short-answer reflection questions, and AI agent-based reflection. In addition, different methods may lead to different reflection qualities among learners in high and low-performance teams [15]. Therefore, we further explored the differences in reflection quality between high and low-performance teams when using these three reflection methods. We proposed the following research questions:

RQ1: How does the AI agent reflection assistant affect learners' reflection quality in SSRL?
RQ2: What differences do high and low-performance teams show in reflection quality when using the three reflection methods?

This study conducted a three-semester comparative teaching experiment to evaluate the impact of AI agents and two traditional reflection methods (reflection reports and short-answer questions) on university students' reflection quality. Using statistical analysis, content analysis, and epistemic network analysis (ENA), this study examines the effectiveness of AI agents in enhancing university students' reflection quality in SSRL.

The main contributions of this study are summarized as follows:

- We introduce a practical SSRL activity, providing educators with a valuable instructional framework for facilitating collaborative learning.
- We integrated an AI agent reflection assistant in SSRL and provided a comprehensive debugging process, offering instructors examples and considerations of AI agent implementation.
- We revealed the reflection quality differences between high and low-performance teams in various reflection approaches and demonstrated the advantages of the AI agent for low-performance teams.

The research is organized as follows: Section 2 reviews prior research on AI agents in education, SSRL theory, and reflection. Section 3 describes the participants, research design, and methods for data collection and analysis. Section 4 compares reflection quality across the three methods and examines differences between high and low-performance teams using ENA. Section 5 discusses the results and implications. The paper concludes with a summary and potential directions for future research.

2. Literature review

To explore the impact of AI agents on learning processes, it is essential to examine their application in education, followed by a discussion on SSRL and reflection.

2.1. AI agents in teaching

Generative Artificial Intelligence (GAI), defined as AI systems capable of autonomous learning and content generation, has been widely applied in education [16]. It can support collaborative learning through personalized instruction, real-time feedback, and intelligent assessment [17]. AI agents, a form of GAI equipped with autonomous learning and decision-making capacities, have emerged as key instructional tools in global educational research.

Empirical studies have shown that AI agents significantly improve student engagement [18,19], learning motivation [20,21], and academic performance [22]. AI agents exist in various forms, such as chatbots [23], intelligent tutoring systems (ITS; [24]), embodied conversational agents (ECA; [25,26]), and intelligent virtual assistants (IVA; [13,27]). Among these, GAI-based chatbots have been widely adopted in education due to their customizable roles and flexible deployment. The present study focuses on this type of conversational AI agent.

In higher education, AI agents have been shown to support higher-order thinking skills, such as critical thinking, metacognition, and problem-solving [23,28,29]. In these studies, GAI was embedded within structured reflection activities, allowing students to engage in guided reflective processes targeting specific cognitive skills. For example, Hong et al. [29] employed AI to handle lower-level tasks in essay writing, enabling students to focus on evaluation and reflection, thereby enhancing critical thinking. Chen et al. [28] implemented metacognitive strategy-supported AI agents that prompted process-oriented reflection and multi-perspective discussion, improving metacognitive skills. Zhou et al. [23] situated reflection within a self-regulated learning framework, showing that GAI-supported reflection indirectly benefits critical thinking and problem-solving.

Although these studies demonstrate that AI agents can enhance higher-order thinking, reflection itself has often been treated merely as a learning process rather than a measurable skill. Reflection is a core component of higher-order thinking and an essential learning competency for 21st-century university students. Empirical evidence directly examining the impact of AI agents on learners' reflective abilities, particularly in collaborative learning environments, remains scarce. Investigating this relationship is therefore necessary to understand how AI agents can effectively support the development of reflection.

2.2. Socially shared regulation of learning and reflection

Collaborative learning includes three primary types of regulation: self-regulation (SR), co-regulation (CoR), and socially shared regulation (SSR) [30,31]. Based on SSR theory, socially shared regulation of learning (SSRL) is an emerging collaborative learning strategy emphasizing mutual support and feedback among team members. The strategy consists of four key stages: goal setting, task distribution, progress monitoring, and reflection evaluation [32–35]. Research indicates that the SSRL strategy has a positive impact on collaborative learning [36]. Learners may enhance their awareness of the collaborative process and facilitate the activation of regulatory processes through SSRL [4]. SSRL also helps to enhance learners' cognitive and metacognitive abilities, boosting learning motivation and engagement [37,38]. Additionally, SSRL fosters communication among team members, improving collaborative efficiency [39]. Thus, SSRL has been widely incorporated into collaborative learning and plays a significant role in enhancing various learner abilities.

Reflection quality is a key indicator for assessing the success of SSRL [39]. High-quality reflection is an indispensable component of SSRL, as it enables learners to examine and evaluate their learning processes and outcomes [40]. Unlike conventional collaborative learning, the reflection content in SSRL emphasizes the process of mutual regulation and monitoring among group members. However, since reflection is the final stage of SSRL, educators often overlook its significance [41]. Teachers' lack of emphasis on the reflection stage may lead to low-quality reflection among students [42]. Achieving high-quality SSRL reflection remains a persistent challenge for educators and students [43].

To enhance students' reflective abilities, it is essential to focus on the
definition of reflection. Dewey [44] defined reflection as a continuous process of exploring and evaluating experiences, which helps individuals gain a deeper understanding of their behaviors and outcomes. Zimmerman [45] further emphasized that self-reflection is a complex learning process involving various aspects of self-monitoring, such as self-assessment and feedback on contributions. In the theory of SSRL, reflection encompasses not only self-assessment but also shared monitoring processes with others [39]. These theories provide support for exploring and promoting the reflective process.

In reflective activities, teachers can support students' deep learning and reflective abilities through various intervention strategies, such as scaffolding, reflective prompts, and feedback [46]. Reflective scaffolding involves providing structured guidance to help students more effectively review and analyze their learning experiences [47]. When designing reflection tasks for SSRL, teachers often utilize the SSRL reflection scaffolds developed by Panadero et al. [48]. Additionally, reflective prompts and guiding questions steer students toward specific directions for reflection, assisting them in identifying potential barriers and challenges in their learning [49]. Feedback provides learners with suggestions or information to improve task performance, helping them optimize both their reflection and learning processes [50]. From a cognitive perspective, feedback serves as guidance to enhance students' task performance [51]. Timely feedback on students' reflections not only improves the quality of subsequent reflections but also deepens their understanding of reflective concepts [52].

Reflection journals, reflection reports, and reflection short-answer questions have been explored to improve reflection quality [53,54]. However, these traditional methods may not adapt to the advancements of GAI. They require students to submit longer texts, which inevitably carries a risk of superficial reflections due to the use of GAI. Some scholars have also modified reflection methods from a technological perspective by using various reflection platforms, such as Google Docs [55], Flipgrid [56], the VEO app [57], and Wiki [58]. However, these platforms primarily offer static or limited interaction, which constrains students' ability to adaptively engage in reflective processes. The low-quality reflection issues in SSRL urgently require new solutions.

Although GAI poses challenges to traditional reflection methods, it also offers new solutions. AI agents are increasingly regarded as effective tools for supporting reflection practices. Research indicates that the use of AI agents in reflection activities may enhance students' learning motivation and engagement [59]. Teachers can use AI agents to design reflection scaffolding, assisting learners in conducting more in-depth and systematic reflections [60]. In addition, AI agents may enhance reflection quality through data analysis and intelligent feedback [61]. Therefore, AI agents demonstrate potential in addressing the issue of improving SSRL reflection quality.

Thus, this study designed a reflection assistant with AI agents to enhance university students' reflection quality in SSRL. Statistical analysis, content analysis, and ENA were employed to collect and analyze textual data related to reflection quality. By comparing the AI agent reflection assistant with traditional SSRL strategy reflection scaffolding, this study analyzed the differences in reflection content and reflection levels among university students across the three methods. Additionally, previous research suggests that high and low-performance teams may experience different effects from various reflection methods [62]. Therefore, this study further explores the differences between high and low-performance teams when using the three reflection methods. This study provides new theoretical evidence for using AI agents in SSRL reflection practices.

3. Methodology

This study employed a quasi-experiment to explore the differences among three reflection methods in SSRL and to examine whether AI agents improve the reflection quality of university students. Firstly, we provided information about the participants and the course. Then, we elaborated on the activities of SSRL and the design process of the AI agent. Lastly, we discussed the coding scheme for reflection quality and provided the methodology for data collection and analysis.

3.1. Participants

The participants were from the course "Internet Thinking and Digital Self-Learning" over three semesters: Spring 2023, Fall 2023, and Spring 2024. A total of 97 undergraduate students, aged 18 to 22, took part in this study (Table 1).

At the beginning of each semester, students completed a pre-test using the CThQ [63], which assesses six cognitive dimensions: memory, comprehension, application, analysis, evaluation, and creation (overall reliability α = 0.87). According to Dewey [64], critical thinking is a deepening and extension of reflective thinking, with high consistency in cognitive processing, reasoning, and evidence evaluation. The CThQ pre-test therefore provides a valid proxy for students' baseline reflection levels. One-way ANOVA indicated no significant differences in pre-test total scores among the three groups (Group 1: M = 105.07, SD = 6.13; Group 2: M = 103.72, SD = 4.19; Group 3: M = 105.22, SD = 4.24), F(2, 86) = 1.33, p = 0.27, suggesting comparable reflection abilities across groups prior to the intervention.

Participants were divided into 3 groups, each employing a different reflection method, and within each group, students were further divided into teams using random assignment to minimize potential biases arising from prior academic performance, familiarity, or interpersonal preference. Random assignment was chosen over self-selection or instructor-based grouping to ensure group equivalence and to enhance the internal validity of the comparative analysis [65].

The first group (G1), consisting of 31 students from the Spring 2023 semester, conducted reflection reports and were further divided into 7 teams. The second group (G2), consisting of 30 students from the Fall 2023 semester, conducted short-answer reflections and were divided into 7 teams. The third group (G3), consisting of 36 students from the Spring 2024 semester, conducted reflections through continuous questioning by an AI agent and were divided into 9 teams. Additional information about the participants is provided in Table 1.

Table 1
Participant and group information.

Group | Course       | Reflection method        | Teams | Participants | Female | Male
G1    | Spring 2023  | Report                   | 7     | 31           | 17     | 14
G2    | Fall 2023    | Short-answer questions   | 7     | 30           | 19     | 11
G3    | Spring 2024  | AI reflection assistant  | 9     | 36           | 20     | 16

3.2. Design of socially shared regulation of learning activities

During the 4-week activity, students collaborated in teams to produce micro-lesson videos lasting 5 to 8 min. The activity was divided into 4 stages, each lasting one week (Table 2).

In the first week (goal setting), students were required to establish a common goal, select the video's theme, and outline the content framework. Then, they submitted a project proposal detailing the topic, objectives, task distribution, and timeline. In the second week (task distribution), the teams followed their project plan to allocate tasks and began executing the project. The instructor provided guidance and suggestions throughout this process. In the third week (progress monitoring), each team submitted a video sample that was between 1 and 2 min long. The instructor conducted an initial evaluation based on the sample and suggested improvements. Students refined and adjusted their video production based on the feedback. In the fourth week (reflection evaluation), students submitted their completed micro-lesson videos
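The between-groups pre-test comparison reported above can be recomputed from summary statistics alone. A minimal sketch in pure Python (the equal group sizes of 30 are an illustrative assumption, not the study's exact cell counts, so the resulting F will not exactly match the reported F(2, 86) = 1.33):

```python
# One-way ANOVA F statistic from per-group summary statistics
# (n, mean, sd). Group sizes below are illustrative assumptions.
groups = [
    (30, 105.07, 6.13),  # G1: reflection reports
    (30, 103.72, 4.19),  # G2: short-answer questions
    (30, 105.22, 4.24),  # G3: AI reflection assistant
]

N = sum(n for n, _, _ in groups)
k = len(groups)
grand_mean = sum(n * m for n, m, _ in groups) / N

# Between-groups sum of squares: weighted squared deviations of
# group means from the grand mean; within-groups sum of squares:
# pooled from the group standard deviations.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
ss_within = sum((n - 1) * s ** 2 for n, _, s in groups)

df_between, df_within = k - 1, N - k
F = (ss_between / df_between) / (ss_within / df_within)
print(f"F({df_between}, {df_within}) = {F:.2f}")
```

With these illustrative sizes the statistic stays well below conventional critical values, consistent with the paper's conclusion that the groups were comparable at baseline.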
and individual reflection assignments (employing different reflection methods for each of the three semesters). Finally, a reflection-sharing session was held in class, where students exchanged learning experiences and insights.

Table 2
The stages of SSRL.

Week | SSRL stage            | Description
1    | Goal setting          | Students discuss the goal, theme, and framework.
2    | Task distribution     | Students allocate tasks and make the micro-lesson videos.
3    | Progress monitoring   | Students monitor the task and submit a video sample.
4    | Reflection evaluation | Students submit completed micro-lesson videos and individual reflection assignments.

3.3. Design of the three reflection methods

Prior to the reflection phase, all students completed a four-week SSRL activity in which the instructor introduced and practiced the four SSRL stages. Consequently, all reflections were anchored in the teams' performance across these four stages. In G1, the reflection remained open-ended within this framework and only specified a minimum length of at least 200 words (no SSRL question list was provided).

In G2, students conducted individual reflections through short-answer questions. The guiding questions were derived from the SSRL reflection scaffolding [48]. For example, questions included "What is the group's current assignment?" and "What obstacles might the group encounter?"

G3 students used the AI agent reflection assistant for their reflections. After the SSRL task, the instructor provided students with a quick response (QR) code linking to the AI agent's website. Students scanned the QR code with their phones to initiate a conversation with the AI agent. Each student completed the reflection task through the dialogue.

The development process of the AI agent is illustrated in Fig. 1. The AI agent reflection assistant, Crystal, was developed using the Coze platform (https://www.coze.cn/). The AI agent consists of 4 core components: Part A is the AI agent's name, Part B defines the role setting and response logic, Part C specifies the conversational experience, such as the opening dialogue, and Part D serves as the preview interface. Developing the AI agent requires following these operational steps.

Fig. 1. AI agent development interface on the Coze platform.

Step 1: Create the AI agent and assign it the name Crystal (as shown in Fig. 1, Part A). Define it as the reflection assistant for the course "Internet Thinking and Digital Self-Learning". Set its duty to guide students in completing tasks (as shown in Fig. 1, Part B) and design the opening statement (as shown in Fig. 1, Part C).
Step 2: Set up the reflection task (as shown in Fig. 1, Part B). Input all the questions from the SSRL reflection scaffolding developed by Panadero et al. [48] into the AI agent as the question base. This ensures a logical flow of questions from the AI agent to the students, preventing task misdirection. In addition, the AI agent was not restricted to this fixed list but generated follow-up questions, particularly "Why" questions, based on the student's specific answers, which reflected its adaptiveness.
Step 3: Set up the response rules (as shown in Fig. 1, Part B). Establish the response rules for the AI agent:
a. Ask only one reflection question per interaction.
b. Provide encouraging feedback that adapts dynamically after each response (e.g., "You did a great job", "Your reflection is very insightful").
c. Avoid using academic terms.
d. Use only special interrogative questions (e.g., "What", "Why"), with follow-up questions adjusted according to students' responses.
e. After all questions are answered, conclude the conversation and express gratitude.
Step 4: Testing and deployment (as shown in Fig. 1, Part D). Check the conversation flow and ensure the AI agent's smooth and effective interactions. Select 5 students for a second round of testing to ensure
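The response rules in Step 3 amount to a small turn-taking policy: one scaffold question per turn, encouraging feedback after each answer, and a closing thank-you once the question base is exhausted. A minimal sketch of that policy in Python (the class name, question texts, and feedback phrases are hypothetical illustrations, not the Coze platform's configuration format; a real deployment would let the LLM generate the adaptive "Why" follow-ups rather than cycling canned feedback):

```python
# Hypothetical sketch of the Step 3 dialogue policy.
FEEDBACK = ["You did a great job!", "Your reflection is very insightful!"]

class ReflectionAssistant:
    def __init__(self, questions):
        self.questions = list(questions)  # SSRL scaffold question base
        self.turn = 0

    def opening(self):
        # Opening statement (Step 1) followed by the first question.
        return "Hi, I am Crystal, your reflection assistant. " + self.questions[0]

    def reply(self, student_answer):
        """Rule a: one question per turn. Rule b: encouraging feedback.
        Rule e: after the last question, thank the student and stop."""
        self.turn += 1
        feedback = FEEDBACK[self.turn % len(FEEDBACK)]
        if self.turn < len(self.questions):
            return f"{feedback} {self.questions[self.turn]}"
        return f"{feedback} That was my last question. Thank you for reflecting!"

# Sample items in the spirit of the SSRL scaffolding (illustrative).
questions = [
    "What was your group's goal for the micro-lesson video?",
    "What obstacles did the group encounter?",
    "Why do you think those obstacles appeared?",
]
bot = ReflectionAssistant(questions)
print(bot.opening())
print(bot.reply("We wanted to explain recursion in 5 minutes."))
print(bot.reply("We ran out of time while recording."))
print(bot.reply("Because we did not rehearse the script."))
```

Keeping the question base explicit, as in Step 2, is what prevents task misdirection: the agent can vary its feedback, but the sequence of scaffold questions is fixed.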
the conversation flows smoothly. Once confirmed, the AI agent can Table 3
be deployed and available to all students. Learner reflection quality coding scheme.
Categories Coding Description
3.4. Experimental procedure Reflection NOR Lacking a reflection mindset.
level LOWR Having a reflective mindset involves reviewing
experiences, describing facts and feelings, and
The experimental procedure is illustrated in Fig. 2. As described in
reflecting on what has been learned. It also
the Participants section, all students completed the CThQ [63] as a encompasses the ability to connect new knowledge
pre-test before the course. They then attended a 16-week course with existing knowledge and to improve learning
covering basic concepts. All students were taught by the same instructor, strategies.
with the course content, teaching methods, and learning resources HIGHR Critically analyzing the current situation, attempting to
view problems from different perspectives, forming
remaining entirely consistent across the three semesters. Students new viewpoints from available resources, and seeking
participated in a 4-week group collaboration activity, “creating micro to test hypotheses.
lesson videos”, conducted using the SSRL strategy. After the group ac­ Reflection DESR A description of “what” the object of reflection is.
tivity finished, each student was assigned an individual reflection task. content EXPR An explanation of the causes behind the object of
reflection, addressing the “why” often indicated by
G1 and G2 used traditional reflection methods, with G1 completing
keywords such as “in order to”, "due to", or "so as to".
reflection reports and G2 answering short-answer questions. G3 CONR Understanding whether the object of reflection has
employed a new reflection method, utilizing the AI agent reflection changed across different times and contexts, coupled
assistant. with an analysis of the reasons for these changes and
their impact on behavior, represents a higher level of
analysis concerning the “what” and “why”.
3.5. Data collection and analysis CRIR It identifies personal or team issues and analyzes them
with theory and practice to solve problems, focusing on
“how” to achieve self-reconstruction. This may include
After the three semesters, the reflection texts of all students were keywords like “needs improvement” or “next stage”.
collected and anonymized. G1 produced 31 reflection reports totaling
8032 words. G2 submitted 30 reflection short-answer texts, totaling
15,468 words. G3's AI agent reflection assistant dialogues comprised 36 submissions, totaling 16,801 words (excluding the AI agent's questions).
Content analysis was used to process the reflection texts. Through systematic coding rules, this method reduced the influence of subjective judgment and personal bias, thereby providing more objective results. The coding scheme consists of two parts: reflection level and reflection content, as shown in Table 3. The reflection level coding scheme is based on Plack et al. [66], and it is used to assess the overall reflection level of learners, categorized into no reflection (NOR), low reflection (LOWR), and high reflection (HIGHR). The reflection content coding scheme is based on Wang et al. [67] and is used to explore the differences in the types of learners' reflection content. The reflection content is categorized into 4 types: descriptive reflection (DESR), explanatory reflection (EXPR), connected reflection (CONR), and critical reflection (CRIR), with reflection quality progressively increasing across these categories.
The reflection texts in the reflection reports and short-answer reflections were relatively longer, while those in the AI agent dialogues were shorter. To mitigate the differences caused by these length discrepancies, this study used a single complete sentence as the minimum coding unit. For example, the statement “As the group leader, I am quite decisive. I directly assigned tasks to everyone, and the group was supportive.” should be coded as two separate sentences.
To ensure reliability, a coding discussion group comprised two experts and two professional coders. First, the two coders preliminarily coded the first 10 % of the reflection texts. In cases of disagreement, they consulted with the experts to reach a consensus. After training and repeated practice, the coders achieved a high level of consistency. The coders strictly adhered to the revised coding scheme during the formal coding process. After coding, inter-coder reliability was calculated, yielding a Cohen's kappa coefficient of 0.87, indicating that the coding process had a high level of reliability. The coders consulted with experts for different coding results and ultimately reached an agreement.
After coding the reflection texts using the content analysis, ENA was employed to conduct a fine-grained analysis of the reflection data. Content analysis excels at systematically and objectively analyzing large volumes of textual content. ENA focuses on uncovering the complex relational networks between elements, such as reflection levels. The combination of the two methods allows for attention to both the characteristics of the text itself and the internal relationships between the content elements. Additionally, the ENA Webkit (http://www.epistemicnetwork.org/) provides a stable environment for data analysis.
To investigate the differences in reflection quality between the high and low-performance teams, we assessed the micro lesson videos completed by students in SSRL. The videos were assessed by two experts
Fig. 2. Experimental procedure.
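The inter-coder reliability reported above is Cohen's kappa. As an illustrative sketch of how that coefficient is computed (the coder labels below are hypothetical, not the study's coding data):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders labeling the same items:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement rate and p_e is the agreement expected by chance
    from each coder's marginal label distribution."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum((count_a[c] / n) * (count_b[c] / n)
              for c in set(count_a) | set(count_b))
    return (p_o - p_e) / (1 - p_e)

# Hypothetical sentence-level codes from two coders (8 coding units):
coder1 = ["DESR", "DESR", "EXPR", "CONR", "CRIR", "DESR", "EXPR", "CONR"]
coder2 = ["DESR", "DESR", "EXPR", "CONR", "CRIR", "EXPR", "EXPR", "CONR"]
print(round(cohens_kappa(coder1, coder2), 2))  # 7/8 raw agreement -> kappa = 0.83
```

Because kappa discounts chance agreement, it is lower than the raw agreement rate; under the common Landis-and-Koch benchmarks, the reported 0.87 falls in the "almost perfect" band.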
Y. Zheng et al. Computer Standards & Interfaces 97 (2026) 104094
in education, each with over 10 years of teaching experience. The evaluation criteria included the following categories, with topic selection worth 10 points, instructional design 40 points, content completeness 20 points, audio-visual quality 20 points, and artistry 10 points. Each group received a score ranging from 0 to 100 points. The two experts thoroughly discussed the evaluation criteria to ensure consistency in scoring and then individually assessed all instructional designs and materials. The scoring consistency between the two experts (Spearman correlation coefficient) was 0.86 (p < 0.01).
The average score from both experts was used as the final score for each group (Table 4). The grouping criteria for high and low-performing teams proposed by Hou [68] have been widely adopted by scholars [69]. In this study, based on those criteria, the top 15 % of teams were classified as the high-performance teams, including G1-team7, G2-team2, and G3-team1. The bottom 15 % of teams were classified as the low-performance teams, including G1-team5, G2-team6, and G3-team4. Using ENA, we further explored the differences between the high and low-performance teams of students.

Table 4
Scores of the SSRL performance for the 3 groups.
Group  team1  team2  team3  team4  team5  team6  team7  team8  team9
G1     86.0   90.0   88.5   76.5   68.5   87.0   92.0   NA     NA
G2     83.5   93.5   87.0   90.5   81.0   71.5   76.5   NA     NA
G3     94.0   84.5   75.0   71.5   88.0   76.0   89.5   84.5   90.5

3.6. IRB approval and AI agent data privacy

This study has received approval from the Institutional Review Board (IRB) of the university, ensuring that all ethical standards are met. All students participated voluntarily, fully aware of the study's purpose and procedures, and signed informed consent forms prior to the commencement of the experiment. In addition, to protect participants' privacy, all data collected during the study were anonymized.
All conversations on the Coze platform were fully anonymized, and students were reminded before using the platform not to enter any personal or sensitive information (such as name, student ID, gender, or school). Data was labeled only with class sequence numbers (e.g., Student 1, Student 2), and access was strictly limited to the research team. In addition, all students signed the Coze platform's privacy protection agreement, and the platform further ensures data security through anonymization and encryption techniques.

4. Results

The results are organized to address the key research questions regarding the effectiveness of the AI agent and the differences in reflection quality across various reflection methods.

4.1. How does the AI agent reflection assistant affect learners' reflection quality in SSRL?

A Kruskal-Wallis H test was conducted to assess the differences in SSRL reflection scores among the 3 groups of students using different reflection methods, as shown in Table 5. The test compares independent samples without assuming a normal data distribution. This makes it highly suitable for analyzing the multiple groups of non-normally distributed reflection data in this study.
For this analysis, an overall reflection quality score was calculated for each student by taking the mean of all seven reflection codes (NOR, LOWR, HIGHR, DESR, EXPR, CONR, CRIR). This composite score was used for the Kruskal-Wallis H test, while the mean scores for individual codes presented in Table 5 are provided only for descriptive purposes.

Table 5
The result of the Kruskal-Wallis H test.
Codes                      Mean score             χ²     p
                           G1     G2     G3
Reflection quality  NOR    0.018  0.005  0.088  6.557  0.038
                    LOWR   0.267  0.163  0.232
                    HIGHR  0.018  0.044  0.218
                    DESR   0.229  0.197  0.262
                    EXPR   0.100  0.103  0.264
                    CONR   0.038  0.028  0.221
                    CRIR   0.006  0.037  0.200

The results showed a chi-square value of 6.557 and an asymptotic significance of 0.038. The mean ranks for the 3 groups were G1 = 9.14, G2 = 8.00, and G3 = 15.86. The results indicate a statistically significant difference in reflection scores between the groups (p = 0.038). Specifically, G3's mean rank was significantly higher than G1 and G2, indicating that using the AI agent is associated with higher performance.
To further investigate the observed differences, we applied ENA for a fine-grained analysis of the students' reflections across the 3 reflection methods. This analysis aims to uncover the epistemic structures and patterns, providing deeper insights into how different reflection methods influence the quality and complexity of students' reflection processes. By analyzing epistemic networks, we may better understand the specific epistemic factors and relationships underlying the differences observed in the statistical results.
Fig. 3 presents a comparative ENA network model of reflection content for the three groups using different reflection methods. In this model, nodes represent individual reflection codes, and edges indicate the co-occurrence of codes within each unit of analysis. Blue, red, and purple dots denote the centroids of students in G1, G2, and G3, respectively, while the four black dots represent the four categories of reflection content (DESR, EXPR, CRIR, CONR). ENA applies singular value decomposition (SVD) to reduce the network model to two dimensions, which together account for 70.1 % of the variance (SVD1 = 51.5 %, SVD2 = 18.6 %). The x-axis in the ENA space (SVD1) defines the dimension of reflection content, with the right side (higher x-values) representing DESR codes and the left side (lower x-values) representing CONR codes. The y-axis (SVD2) also defines a dimension of reflection content, where the CRIR and EXPR codes are positioned higher (with higher y-values) and the DESR code is located lower (with lower y-values). This model allows comparison across students and groups, showing which types of reflection are more dominant and how reflection content patterns differ between groups.
The right side of Fig. 3 displays the mean networks of the 3 groups. Overall, the reflection content of all 3 groups predominantly features EXPR and DESR, with a strong association observed between these two points. The reflection content network of G1 is the sparsest, with only a few occurrences of CRIR, aside from the relatively frequent appearances of EXPR and DESR. The network of G2 is more concentrated, with distribution across all 4 reflection types and a stronger CRIR-DESR connection (value of 0.10). The reflection content of G3 is the most densely connected, with all 4 types having a relatively high proportion of representation. The CRIR-CONR (0.23) and CONR-EXPR (0.13) connections are relatively strong. In contrast, the other pairs based on traditional SSRL reflection did not exhibit strong correlations.
Table 6 demonstrates how the AI agent, through guided dialogue, facilitated the transition of G3 students from connective reflection (CONR) to critical reflection (CRIR), thereby deepening the SSRL
reflection process. Under the guidance of the AI agent, student 9 and student 31 shifted from describing the current state of teamwork and time management, such as “We solved problems through communication with team members” (CONR), to deeper reflections on self-improvement and future learning plans, exemplified by “I hope for more team projects to utilize everyone's potential” (CRIR). Prompted by the AI agent's questioning, student 10 and student 36 reflected on the shortcomings of the SSRL tasks, noting that “The resources were not comprehensive, and most content lacked innovation” (CONR), and further analyzed the root causes of these issues, along with potential improvement measures (CRIR). Inspired by the AI agent, student 18 first identified the issue of inadequate presentation in the task (CRIR) and then proposed a concrete plan for deeper learning of AI technology (CONR). The AI agent's continuous questioning and feedback prompted students to progress from simple descriptive reflection to more critical and in-depth reflection throughout the reflection process.

Fig. 3. Comparison of reflection content.

Table 6
Transcribed example of the G3 reflection content.
Student  Transcript
1  AI agent: What problems did you encounter, and how did you solve them?
   Student 9: We initially struggled, but by discussing with our team leader, we solved the problems and saw how teamwork improved our process over time. (CONR)
   AI agent: What are your plans moving forward?
   Student 9: I hope for more team projects to use everyone's strengths and improve how I contribute next time. (CRIR)
2  AI agent: What were the shortcomings of this micro-lecture task?
   Student 18: Our PPT didn't cover everything, so next time I'll make it more complete. (CRIR)
   AI agent: I see, that's a very insightful analysis. What plans do you have for your future studies?
   Student 18: I plan to learn more about AI video production to improve my skills for future jobs. (CONR)
3  AI agent: Did you encounter any difficulties and challenges while completing the task? If so, what were they, and how did you solve them?
   Student 31: Our biggest challenge was managing time, but adjusting our schedule helped us complete the tasks. (CONR)
   AI agent: Great. How did you overcome these difficulties and challenges?
   Student 31: We held weekly meetings to discuss. (DESR)
4  AI agent: Please summarize the successful aspects of this task.
   Student 36: The tasks were well-organized, and because our team cooperated closely, we were able to complete the work more efficiently than at the beginning. (CONR)

Table 7
Transcript example of the G1 and G2 reflection content.
Group  Transcript
G1  G1-Student 4: Our group chose a radio show format for this Himalaya assignment. (DESR) I've always been a fan of radio shows, so I was very happy to have the opportunity to create one this time. (DESR) Of course, I also faced some challenges during the production process (DESR), such as the tone not fitting the storyline and the quality of the program needing to be better. (EXPR)
G1  G1-Student 30: Regarding this task, firstly, we didn't do well in the presentation aspect. The presentation was only in the form of a document, which needed to ensure a smooth connection between the presentation and the work, making it difficult to access the content. (CONR) Secondly, the content presentation was poorly executed and lacked a logical structure. (EXPR) Finally, the speech was not coherent during the presentation, and the preparation was insufficient. (EXPR)
G2  G2-Student 6:
    Task: We approached the task mainly in two aspects. (DESR) The first part determined the theme and type of work, and the second part recorded the work. (DESR)
    Division of labor: Our division of labor and cooperation were very reasonable, and each member completed their assigned tasks. (EXPR)
    Self-evaluation: Very successful. (DESR)
    Outlook: We plan to work more collaboratively on each task and strive to do our best. (CRIR)
G2  G2-Student 27:
    Task: This task enhanced our understanding of content production and strengthened the collaboration among team members. (EXPR)
    Division of labor: Our team had a clear division of responsibilities, and everyone had their tasks (EXPR). I was responsible for the recording, which was quite challenging. (EXPR)
    Self-evaluation: Although our team may not have been the best among all the teams, we had unique messages to convey. (CONR) If there is a next time, we will strive to improve it. (CRIR)
    Outlook: We should promote our work more effectively. (CRIR)

Table 7 presents reflection examples from some G1 and G2 students, highlighting the impact of different reflection forms and guidance
methods on students' reflection quality. Two G1 students (student 4 and student 30) conducted their reflections in the form of reports. Due to the lack of specific guidance from the instructor, who only provided general requirements, their reflections remained superficial, primarily involving DESR and EXPR. For example, student 4 wrote, “I have always enjoyed radio shows, so I was very pleased to have the opportunity to create one this time.” Student 30 mentioned, “The tone did not match the storyline, and the sound quality of the program was poor.” These reflections remain limited to mere descriptions of the phenomena, lacking in-depth analysis of the underlying causes and offering no insights for future improvement. This tendency may be related to the relatively broad scope of the reports. These examples demonstrate that structured guidance exerts a positive effect on the quality of reflection. In addition, they highlight the importance of timely feedback and question prompting. Providing students with immediate feedback based on their responses and guiding them toward more elaborated answers contributes to fostering deeper levels of reflection.
In contrast, two students from Group G2 (student 6 and student 27), guided by the 4 aspects provided by the instructor and reflecting through short-answer questions, demonstrated a higher reflection quality. The instructor guided students to reflect on four dimensions, including task, division of labor, self-evaluation, and outlook. This approach, particularly in the latter two areas, effectively fostered CRIR and CONR. For example, student 6 mentioned, “We plan to collaborate more effectively in completing each future learning task, striving to achieve the best outcome” (CRIR). At the same time, student 27 stated, “Although our team may not be the best among all teams, we conveyed our unique message. If there is a next time, we will work harder to improve” (CONR and CRIR). This structured guidance enhanced the depth of reflection. However, since short-answer questions are a one-way form of reflection for students, the instructor cannot intervene in their responses. As a result, there may be instances where students provide irrelevant answers or overly brief responses, which can affect the overall reflection quality. For instance, student 6 responded with “Very successful” in the self-evaluation section (DESR), which lacked depth in reflection. The AI agent could address this shortcoming by facilitating continuous interaction and feedback, encouraging students to engage in deeper reflection.
When comparing the effectiveness of the reflection methods in G1, G2, and G3, G1's reflection reports were of lower quality, primarily focusing on DESR and EXPR. Due to the absence of specific guidance, the reflections lacked depth. The short-answer questions format in G2 improved reflection quality to some extent. Students' reflections became more focused with the instructor's guidance, particularly improving CRIR and CONR. However, this approach is still constrained by the limitations of outcome-based assessment. The AI agent guidance in G3 further enhanced reflection quality. Through real-time feedback and targeted questioning, students could engage in deeper levels of CRIR and CONR.
To quantify these differences, the Mann-Whitney U test was employed to evaluate the distribution of the projection points of the 3 groups of students within the ENA space. The results indicated that at the α = 0.05 significance level, G1 and G2 showed significant differences in both the first dimension (U = 147,537, p = 0.01, r = 0.09) and the second dimension (U = 147,204, p = 0.01, r = 0.08). This suggests that the structured guidance provided by short-answer questions enhances reflection quality. G1 and G3 also showed a significant difference in the first dimension (U = 99,595.5, p = 0.00, r = 0.34), highlighting the impact of integrating the AI agent in G3 to enhance reflection quality. However, no difference was observed in the second dimension (U = 147,049.5, p = 0.42, r = 0.03). Additionally, G2 and G3 exhibited differences in both the first dimension (U = 127,246.5, p = 0.00, r = 0.36) and the second dimension (U = 215,386.5, p = 0.01, r = 0.08), further demonstrating the effectiveness of the AI agent in fostering deeper reflection. This effect surpasses that of the structured short-answer questions approach alone. Notably, due to the large sample size in this study, the U values are relatively high; however, they remain within the acceptable range for statistical analysis. Some of these differences showed relatively small effect sizes, which will be further addressed in the discussion section.

4.2. What differences do high and low-performance teams show in reflection quality when using the three reflection methods?

Fig. 4 illustrates the distribution of students from the 3 reflection methods (G1, G2, G3) along the two principal component axes (SVD1 and SVD2). The points of different colors and shapes in the figure represent high and low-performance teams within each group, indicating their performance across various reflection categories, such as DESR, EXPR, CONR, and CRIR. The SVD1 axis accounts for 77.3 % of the total variance, while the SVD2 axis explains 16.8 %. The position of each point represents the students' tendencies in reflection content, with points closer to a specific reflection category indicating that the group's performance is more concentrated in that category.

Fig. 4. The centroid distribution of high and low group students across the three reflection methods.

In Fig. 4, the centroids of the low-performance teams in G1 and G2 are positioned relatively close to each other, with the low-performance teams located higher, near DESR. Conversely, the high-performance teams are situated lower, closer to CRIR. This indicates a certain degree of similarity in the reflection content between the low-performance teams in G1 and G2. G3 is distributed on the right side of the figure, with a greater distance between the high and low-performance teams, indicating a more pronounced difference in reflection content than in the other groups. Unlike G1 and G2, the G3 high-performance teams are positioned at the top, closer to CONR, while the low-performance teams are located at the bottom, near CRIR and EXPR. This suggests that the high-performance teams in G3 tend to engage more in connective reflection, whereas the low-performance teams focus more on critical and explanatory reflection.
The study employed the Mann-Whitney U test to further elucidate the magnitude of the differences in reflection content between the high and low-performance teams across the 3 cohorts (Table 8). According to the results of the Mann-Whitney U test, there are differences in the reflection content performance between the high and
Table 8
The reflection content distribution of high and low-performance teams across the three methods.
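The U statistics and effect sizes r reported in Table 8 and Section 4.2 follow the convention r = |Z| / sqrt(N), with Z from the normal approximation to the Mann-Whitney U distribution. A minimal sketch of that computation, using hypothetical ENA coordinates rather than the study's data:

```python
import math

def mann_whitney_u_r(x, y):
    """Mann-Whitney U via average ranks, with the normal-approximation
    Z statistic and the effect size r = |Z| / sqrt(N)."""
    n1, n2 = len(x), len(y)
    pooled = sorted(x + y)
    rank = {}
    i = 0
    while i < len(pooled):            # assign average ranks over tied runs
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2
        i = j
    u = sum(rank[v] for v in x) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return u, abs(z) / math.sqrt(n1 + n2)

# Hypothetical SVD1 coordinates for a high- and a low-performance team:
u, r = mann_whitney_u_r([0.2, 0.5, 0.3, 0.7], [-0.1, 0.0, 0.1, 0.4])
print(u, round(r, 2))  # U = 14.0, r = 0.61
```

Because U grows with n1 * n2, large samples produce large U values even for modest effects, which is why the paper reads the comparisons through r (roughly, r near 0.1 is a small effect and near 0.3 a medium one under Cohen's benchmarks) rather than through U alone.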
low-performance teams across different reflection approaches. In G1, the high and low-performance teams did not exhibit significant differences in either dimension (MR1: U = 4932.00, p = 0.41, r = 0.05; MR2: U = 5463.00, p = 0.44, r = 0.05). In G2, the high and low-performance teams showed a significant difference in the MR1 dimension (U = 3303.00, p = 0.03, r = 0.19) but no difference in the MR2 dimension (U = 3051.00, p = 0.26, r = 0.10). For G3 (students using AI agent-driven continuous questioning), the high and low-performance teams showed a significant difference in the MR1 dimension (U = 1136.50, p < 0.001, r = 0.45). In contrast, the difference in the MR2 dimension was insignificant (U = 2187.50, p = 0.54, r = 0.06).
In G3, the differences between the high and low-performance teams were the most pronounced, particularly on the MR1 dimension. Further analysis of the ENA diagram revealed that low-performance teams exhibited stronger connections in EXPR-CRIR (0.46) and EXPR-CONR (0.61). This suggests that the AI agent-driven reflection method may help low-performance teams focus more on specific reflection content.

5. Discussion

This section analyzes the findings based on the research questions. It covers the positive impact of AI agents on students' SSRL reflection, differences in reflection quality between high and low-performance teams, and key considerations for using AI agents effectively in SSRL.

5.1. The positive role of AI agents in students' SSRL reflection

In SSRL, the AI agent reflection assistant enhanced the quality of students' reflections. This outcome aligns with previous research [70,71]. For instance, Maedche et al. [70] demonstrated the positive role of AI agents in fostering deeper reflection among students. Sigman et al. [71] also found that AI assistants emulate and augment human cognition, thereby promoting reflection. These studies provide more evidence of the positive impact AI agents have on facilitating reflective practices in education.
This study further clarifies how AI agents enhance the quality of student reflection in the SSRL process through ENA. In these activities, student reflections guided by AI agents exhibited higher levels of critical thinking and coherence. In contrast, the other two traditional reflective texts displayed lower levels of reflection, focusing primarily on descriptive and exploratory reflection. As Rusandi et al. [72] highlighted, AI may assist learners in constructing their learning processes, thereby enhancing critical thinking. In higher education, Xia and Li [73] also suggested that AI assistants have a positive impact on students' imagination, creativity, critical thinking, and autonomous learning. Zang et al. [69] experimentally confirmed the role of AI agents in enhancing students' critical thinking in English learning. However, the systematic review by Mohamud et al. [74] indicated that the introduction of AI in higher education may diminish students' critical thinking. This conclusion contradicts the findings of this study. The differences may be due to a lack of proper instructional design by teachers when using AI [74]. Cronje [75] argued that AI may serve as a teaching assistant to facilitate learning, but it should be integrated with instructional design and necessary prompts. In this study, the SSRL reflection checklist was operationalized as structured prompts to calibrate the AI agent, enabling it to scaffold students' reflections across the four phases of SSRL. By embedding SSRL principles into its dialogic design, the agent acted as both a facilitator of reflection and a medium for delivering theoretical scaffolds. This underscores the importance for educators and researchers to apply instructional theory and design thoughtfully when integrating AI into the classroom.
In addition to SSRL theoretical guidance, the AI agent leveraged its technological capabilities, including continuous questioning and real-time feedback, to actively scaffold deeper student reflections. Wolfbauer et al. [76] noted that continuous dialogue with intelligent assistants enhances students' levels of reflection. In the G3 group, the AI agents not only guided students to explore the root causes of issues but also helped them develop specific improvement plans. This guiding process is similar to the “Socratic method” in educational psychology. Through a series of targeted questions, students are encouraged to engage in deep thinking and gain a more profound understanding of the knowledge [77]. In addition, the timely feedback function of AI agents plays a crucial role in enhancing the quality of students' SSRL reflections. Self-determination theory suggests that providing positive emotional support through feedback helps students gain a sense of belonging, thereby enhancing their motivation to learn and willingness to reflect [78]. Uygur et al. [79] suggested that timely feedback enhanced students' reflection and learning. However, traditional SSRL reflection reports and short-answer questions are one-way reflective activities, lacking immediate feedback and guidance. The AI agent reflection assistant compensates for the shortcomings of teachers in providing timely feedback, enhancing the effectiveness of collaborative
learning.
This study indicates that the level of reflection guidance directly affects learners' reflection quality, which is consistent with previous research [80-82]. G1, with minimal guidance, showed the lowest quality, while G2, guided by the SSRL reflection checklist, exhibited higher-quality reflections, demonstrating the importance of SSRL scaffolds. G3 combined SSRL scaffolding with real-time feedback and encouragement for deeper reflection. Comparisons suggest that while structured short-answer questions had a limited impact, the AI agent provided a practically meaningful enhancement of students' reflective practices. However, these findings are based primarily on qualitative data, and further quantitative research is needed to validate them.
In summary, AI agents play a substantial role in promoting student reflection. The comparison between structured short-answer questions and traditional reflective reports showed statistically significant but very small effects, suggesting that short-answer questions alone had a limited impact on enhancing students' reflection quality. In contrast, the AI agent had a substantially greater impact on students' reflective practices. It is essential for educators and instructional designers to integrate AI agents into classrooms and develop more instructional design case studies. Moreover, teachers should prioritize the importance of instructional theories and provide essential design guidance when applying AI agents.

5.2. Differences between high and low-performance teams under various SSRL reflection methods

The results indicate a significant difference between the high and low-performance teams that utilized reflective short-answer questions and the AI agent reflection assistant. In short-answer questions, high-performance teams performed better. This aligns with the conclusions of Knight et al. [83], who found that high-performance students outperformed low-performance students in reflective questions. The disparity in reflection between high and low-performance learners is primarily attributed to their metacognitive levels and learning strategies [84-86]. For instance, Safari and Fitriati [85] found that high-performance learners were able to use all strategies equally, but low-performance learners more frequently relied on metacognitive and social strategies. These differences may impact learners' outcomes, including their learning effectiveness and reflection [84].
In contrast, the reflection quality of low-performance teams using the AI agent reflective assistant was better than that of the high-performance teams. This is a novel finding of the study, suggesting that the AI reflective assistant played a positive role in guiding low-performance learners through the reflection process. This finding aligns with previous evidence showing that AI technologies tend to provide greater benefits for lower performers [87-90]. Prior studies have suggested that such differential effects often occur because an AI chatbot can use adaptive strategies and personalized feedback to address the strategic gaps of low performers [88]. AI tutoring can also offer both cognitive and emotional support [89]. Xu et al. [90] further found that low-performing learners become more engaged when they receive immediate feedback and external help. This engagement encourages them to apply higher-order thinking strategies more actively.
These mechanisms may also explain the current results in our SSRL reflection task. The AI reflection assistant provided structured guidance in real time and reduced the cognitive load of producing reflections. This

examine how to fine-tune AI guidance so that it benefits high performers without disrupting their existing strategies.
Additionally, there was no significant difference in performance between high and low-performance student teams in reflective reports, with both showing low-quality reflections. This may be due to learners lacking clear guidance in the reflection process. Maedche et al. [70] found that in reflective environments lacking external feedback or structured guidance, the quality of students' reflections is constrained. This suggests that instructors should provide the necessary scaffolding when designing reflective tasks. The SSRL scaffolding demonstrated significant value in this study and is well-suited for broader application in collaborative settings.

5.3. Considerations for the effective use of AI agents in SSRL

Although experiments have demonstrated that AI agents enhance SSRL reflection quality, there are several limitations in their usage. To better promote the outcomes of this study, we offer considerations for teachers and instructional designers regarding the use of AI agents.
Firstly, the quality and reliability of feedback provided by AI agents still present limitations. This finding aligns with the studies of Maloney et al. [91] and Fedus et al. [92], which suggest that the accuracy and effectiveness of AI agents depend on algorithm design and data quality. In this study, the AI agent exhibited two primary issues: repeated questioning and unexpected interruptions during conversations. To address the issue of repeated questioning, adjustments to the prompt design can be implemented. For example, the prompts can specify that each question should be asked only once and repeated only if the student responds off-topic or does not answer. For unexpected interruptions, teachers need to guide students in testing their network environment and re-engaging with the task. These observations show that AI agents need improvement in handling complex contexts and dynamic learning needs.
In addition, data privacy and ethical concerns pose another challenge in the application of AI agents. AI agents require extensive data collection, including students' reflection content, behavioral patterns, and learning habits [93]. To mitigate this issue, this study incorporated an opening message in the AI agent's script. The message advised students: “Please do not disclose personal sensitive information, such as your name or school, during the interaction.” Furthermore, before implementing the AI agent, teachers need to raise students' awareness of data security and privacy protection [94].
The risks associated with over-reliance on AI technology should also be carefully evaluated. Although AI agents can provide personalized support, they cannot fully replace the role of human teachers, particularly in offering emotional support and fostering social interaction [95]. In this study, AI agents were utilized exclusively in the post-class reflection phase. The remaining instructional time relied on face-to-face interactions between teachers and students. As GAI technology becomes increasingly accessible, preventing students from developing dependency behaviors may become more challenging. Future research could explore strategies to prevent learners from becoming overly reliant on GAI technologies.
While AI agents have demonstrated advantages in enhancing students' SSRL reflection quality, their widespread applicability is constrained by feedback quality, data privacy, and ethical considerations. Future research should emphasize these limitations, refining the appli-
allowed low-performing learners to focus more on critical and creative cation framework of AI to ensure its effectiveness and sustainability in
thinking. In contrast, high-performing learners may already have the educational domain.
established reflection routines. Extra guidance could interfere with these
processes, leading to smaller gains in reflection quality [87]. 6. Conclusion, limitations, and future research
This study, therefore, not only confirms that differential effects exist
in reflection tasks but also highlights the potential of AI support to This study explores methods to enhance student reflection quality by
promote higher-order thinking in low-performing learners. In educa­ designing an AI agent that supports reflection through continuous
tional practice, this suggests that AI reflection assistants could be stra­ questioning and real-time feedback. Using content analysis and ENA,
tegically deployed to close performance gaps. Future research could this study conducted a three-semester experiment comparing reflection
10
Y. Zheng et al. Computer Standards & Interfaces 97 (2026) 104094
reports, short-answer questions, and an AI agent reflection assistant. The results indicate that AI agents improve reflection quality, particularly for low-performance teams. The study offers practical guidance for integrating AI into SSRL-based instruction.

Although this study contributes to understanding students' reflection behaviors in SSRL, several limitations remain. The first limitation arises from the study participants. Conducted within a higher education setting, this research primarily examines the effectiveness of using AI agents to facilitate reflection among university students. Only 97 students from the "Internet Thinking and Digital Self-Learning" course participated, so the findings may not be generalizable to other courses or age groups. Further research is needed to explore the potential impact and adaptability of AI agents in secondary and primary education settings [96]. Secondly, the AI agent still has limitations in the quality and reliability of feedback, which may affect the depth and quality of students' reflections. Addressing this issue relies on rapidly updating and optimizing large AI model algorithms to provide higher-quality and more targeted feedback. The third limitation is that the three reflection methods used in this experiment all fall under outcome-based reflection, overlooking the dynamic process of students' reflections at different stages of collaborative learning. Additionally, the proposed mechanisms underlying the AI agent's impact on reflection quality, particularly for low-performance teams, remain hypothetical and require further empirical validation through quantitative studies. Lastly, this study did not differentiate the specific contributions of individual design elements in the AI agent's interaction strategy (e.g., sequential questioning, encouraging feedback, simplified language). More research could adopt ablation analysis to examine how these elements independently influence students' reflective practices.

Based on the limitations identified in this study, future research could expand the study to more diverse educational contexts, including secondary and primary education, to examine the generalizability and adaptability of AI agents. Incorporating multi-modal data, such as students' facial expressions, gestures, and dialogue, may offer a more comprehensive understanding of reflective behaviors in SSRL. Improvements in AI models are needed to enhance the quality and reliability of feedback, supporting deeper and higher-quality student reflections. In addition, investigating the individual contributions of specific design elements in AI agents' interaction strategies, for example, through ablation-style comparisons, could clarify which features most effectively promote high-order reflection, particularly among low-performance teams. We therefore urge more researchers to focus on this area of study, exploring the impact of GAI on educational outcomes to better understand and harness its potential for improving educational practices.

Declaration of generative AI in the writing process

During the preparation of this work, the authors used Kimi (https://kimi.moonshot.cn/) to improve language and readability. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

CRediT authorship contribution statement

Yumin Zheng: Writing – original draft, Conceptualization. Fengjiao Tu: Investigation, Data curation. Fengfang Shu: Investigation, Data curation. Chaowang Shang: Formal analysis, Data curation. Lulu Chen: Writing – review & editing, Formal analysis. Jiang Meng: Investigation.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Chaowang Shang acknowledges the financial support from the National Natural Science Foundation of China (Grant Number: 62577035). The other authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. The Critical Thinking Questionnaire (CThQ)

Instructions: For each statement below, please indicate how much you agree using a 5-point Likert scale (1 = Strongly disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly agree).

1. After reading a text, I check important information, even if it seems to be true.
2. I like combining information from different texts.
3. I am willing to share newly acquired information.
4. In-depth analyses of reality are a waste of time.
5. After reading a text, I can recall important points.
6. The same content can be expressed in many different ways.
7. I can understand texts from various fields.
8. I form my impressions based on various pieces of information that I combine.
9. Everything already exists, so nothing completely new can be created.
10. When I talk, I give many examples.
11. In discussions, I care about justifying my stance while understanding the other party.
12. I like finding connections between seemingly different phenomena.
13. I can see the structure of a text, and I could reorganize it.
14. When discussing, I try to use practical examples to justify my stance.
15. If necessary, I can recall information I have read before.
16. I do not remember much of what I learned at school.
17. When I am interested in some information, I try to verify whether it is true.
18. I can extract the most relevant parts of a text.
19. To evaluate information, I check multiple sources.
20. I like discussing new interpretations of texts I already know.
21. I like to collate different opinions and compare them.
22. I have difficulties with paraphrasing.
23. I try to apply the information I have learned in everyday life.
24. When I read, I look for relationships between its information and other texts I have read.
25. I pay attention to the contexts, nuances, and overtones of statements.

Data availability

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

References

[1] S. Ahmad, M. Rahmat, M. Mubarik, M. Alam, S. Hyder, Artificial intelligence and its role in education, Sustainability 13 (22) (2021) 12902.
[2] X. Gong, Z. Li, A. Qiao, Impact of generative AI dialogic feedback on different stages of programming problem solving, Educ. Inf. Technol. 30 (7) (2025) 9689–9709.
[3] O. Tapalova, N. Zhiyenbayeva, D. Gura, Artificial intelligence in education: AIEd for personalised learning pathways, Electron. J. e-Learn. 20 (5) (2022) 639–653.
[4] S. Järvelä, P. Kirschner, E. Panadero, J. Malmberg, C. Phielix, J. Jaspers, M. Koivuniemi, H. Järvenoja, Enhancing socially shared regulation in collaborative learning groups: designing for CSCL regulation tools, Educ. Technol. Res. Dev. 63 (2014) 125–142.
[5] D. Bransen, M.J.B. Govaerts, E. Panadero, et al., Putting self-regulated learning in context: integrating self-, co-, and socially shared regulation of learning, Med. Educ. 56 (1) (2022) 29–36.
[6] E. Eshuis, J. Vrugte, A. Anjewierden, L. Bollen, J. Sikken, T. Jong, Improving the quality of vocational students' collaboration and knowledge acquisition through instruction and joint reflection, Int. J. Comput.-Support. Collab. Learn. 14 (2019) 53–76.
[7] C. Chan, K. Lee, Reflection literacy: a multilevel perspective on the challenges of using reflections in higher education through a comprehensive literature review, Educ. Res. Rev. 32 (2020) 100376.
[8] L. Guo, How should reflection be supported in higher education? A meta-analysis of reflection interventions, Reflective Pract. 23 (2021) 118–146.
[9] S. Popenici, S. Kerr, Exploring the impact of artificial intelligence on teaching and learning in higher education, Res. Pract. Technol. Enhanc. Learn. 12 (1) (2017) 22.
[10] H. Kiy, A study on writing experience with ChatGPT of college students, J. Korea Converg. Soc. 14 (9) (2024) 976.
[11] K. Hanifi, O. Cetin, C. Yilmaz, On ChatGPT: perspectives from software engineering students, in: Proc. 2023 IEEE 23rd Int. Conf. Softw. Qual. Reliab. Secur. (QRS), 2023, pp. 196–205.
[12] Zhiheng Xi, et al., The rise and potential of large language model based agents: a survey, Sci. China Inf. Sci. 68 (2) (2025) 121101.
[13] E. Katsarou, F. Wild, A. Sougari, P. Chatzipanagiotou, A systematic review of voice-based intelligent virtual agents in EFL education, Int. J. Emerg. Technol. Learn. (iJET) 18 (10) (2023) 65–85.
[14] P.R. Lewis, Ş. Sarkadi, Reflective artificial intelligence, Minds Mach. 34 (2) (2024) 14.
[15] Z. Xu, P. Zhang, M. Tu, M. Zhang, Y. Lai, Brain optimization with additional study time: potential brain differences between high- and low-performance college students, Front. Psychol. 14 (2023) 1209881.
[16] UK Government, Generative Artificial Intelligence (AI) in Education, GOV.UK, 2023. https://www.gov.uk/government/publications/generative-artificial-intelligence-in-education/generative-artificial-intelligence-ai-in-education.
[17] M. Dogan, T. Dogan, A. Bozkurt, The use of artificial intelligence (AI) in online learning and distance education processes: a systematic review of empirical studies, Appl. Sci. 13 (5) (2023) 3056.
[18] L. Shi, The integration of advanced AI-enabled emotion detection and adaptive learning systems for improved emotional regulation, J. Educ. Comput. Res. 63 (2024) 173–201.
[19] B. Tang, J. Liang, W. Hu, H. Luo, Enhancing programming performance, learning interest, and self-efficacy: the role of large language models in middle school education, Systems 30 (6) (2025) 8109–8138.
[20] L. Feng, Investigating the effects of artificial intelligence-assisted language learning strategies on cognitive load and learning outcomes: a comparative study, J. Educ. Comput. Res. 62 (8) (2025) 1741–1774.
[21] Q. Huang, W. Li, Y. Zhao, Enhancing deep learning and motivation in university English education through AI technology: a quasi-experimental study, Asian J. Educ. Soc. Stud. 51 (4) (2025) 452–463.
[22] Ó. Cuéllar, M. Contero, M. Hincapié, Personalized and timely feedback in online education: enhancing learning with deep learning and large language models, MTI 9 (5) (2025) 45.
[23] X. Zhou, D. Teng, H. Al-Samarraie, The mediating role of generative AI self-regulation on students' critical thinking and problem-solving, Educ. Sci. 14 (12) (2024) 1302.
[24] S. Steenbergen-Hu, H. Cooper, A meta-analysis of the effectiveness of intelligent tutoring systems on college students' academic learning, J. Educ. Psychol. 106 (2014) 331–347.
[25] C. Moridis, A. Economides, Affective learning: empathetic agents with emotional facial and tone of voice expressions, IEEE Trans. Affect. Comput. 3 (2012) 260–272.
[26] S. Nelekar, A. Abdulrahman, M. Gupta, D. Richards, Effectiveness of embodied conversational agents for managing academic stress at an Indian university (ARU) during COVID-19, Br. J. Educ. Technol. 53 (2021) 491–511.
[27] W. Sun, Q. Chen, The design, implementation, and evaluation of gamified immersive virtual reality (IVR) for learning: a review of empirical studies, Proc. Eur. Conf. Games-Based Learn. 17 (1) (2023) 789–797.
[28] M. Chen, L. Wu, Z. Liu, X. Ma, The impact of metacognitive strategy-supported intelligent agents on the quality of collaborative learning from the perspective of the community of inquiry, in: Proc. 2024 4th Int. Conf. Educ. Technol. (ICET), 2024, pp. 11–17.
[29] H. Hong, C. Viriyavejakul, P. Vate-U-Lan, Enhancing critical thinking skills: exploring generative AI-enabled cognitive offload instruction in English essay writing, J. Ecohumanism 4 (2024), Transnational Press, London.
[30] D.H. Schunk, B.J. Zimmerman, Motivation and Self-Regulated Learning: Theory, Research, and Applications, Routledge, 2012.
[31] P.H. Winne, A.F. Hadwin, N.E. Perry, Metacognition and computer-supported collaborative learning, in: The International Handbook of Collaborative Learning, Routledge, 2013, pp. 462–479.
[32] Y. Su, Y. Li, H. Hu, et al., Exploring college English language learners' self and social regulation of learning during wiki-supported collaborative reading activities, Int. J. Comput.-Support. Collab. Learn. 13 (2018) 35–60.
[33] F. Tu, L. Wu, Kinshuk, et al., Exploring the influence of regulated learning processes on learners' prestige in project-based learning, Educ. Inf. Technol. 30 (2) (2025) 2299–2329.
[34] S. Zhang, J. Chen, Y. Wen, H. Chen, Q. Gao, Q. Wang, Capturing regulatory patterns in online collaborative learning: a network analytic approach, Int. J. Comput.-Support. Collab. Learn. 16 (2021) 37–66.
[35] J. Zheng, W. Xing, G. Zhu, Examining sequential patterns of self- and socially shared regulation of STEM learning in a CSCL environment, Comput. Educ. 136 (2019) 34–48.
[36] E. Panadero, S. Järvelä, Socially shared regulation of learning: a review, Eur. Psychol. 20 (2015) 190–203.
[37] J. Isohätälä, H. Järvenoja, S. Järvelä, Socially shared regulation of learning and participation in social interaction in collaborative learning, Int. J. Educ. Res. 81 (2017) 11–24.
[38] J. Li, Y. Lin, M. Sun, R. Shadiev, Socially shared regulation of learning in game-based collaborative learning environments promotes algorithmic thinking, learning participation, and positive learning attitudes, Interact. Learn. Environ. 31 (2020) 1715–1726.
[39] J. Malmberg, S. Järvelä, H. Järvenoja, E. Panadero, Promoting socially shared regulation of learning in CSCL: progress of socially shared regulation among high- and low-performing groups, Comput. Hum. Behav. 52 (2015) 562–572.
[40] J. Yukawa, Co-reflection in online learning: collaborative critical thinking as narrative, Int. J. Comput.-Support. Collab. Learn. 1 (2006) 203–228.
[41] A. Głowala, M. Kołodziejski, T. Butvilas, Reflection as a basic category of a teacher's thinking and action, Multidiscip. J. Sch. Educ. 12 (1) (2023) 229–250.
[42] J. Buck, Reflecting on reflections: a case study of disappointment in student writing assignments, J. Acoust. Soc. Am. (2023) A273.
[43] N. Rahmi, C.M. Zubainur, Students' mathematical reflective thinking ability through scaffolding strategies, J. Phys.: Conf. Ser. 1460 (1) (2020) 012022.
[44] J. Dewey, Democracy in education, Elem. Sch. Teach. 4 (4) (1903) 193–204.
[45] B.J. Zimmerman, Self-regulated learning and academic achievement: an overview, Educ. Psychol. 25 (1) (1990) 3–17.
[46] D. Coulson, M. Harvey, Scaffolding student reflection for experience-based learning: a framework, Teach. High. Educ. 18 (2013) 401–413.
[47] S. Lajoie, Extending the scaffolding metaphor, Instr. Sci. 33 (2005) 541–557.
[48] E. Panadero, P.A. Kirschner, S. Järvelä, J. Malmberg, H. Järvenoja, How individual self-regulation affects group regulation and performance: a shared regulation intervention, Small Group Res. 46 (4) (2015) 431–454.
[49] E. Davis, Prompting middle school science students for productive reflection: generic and directed prompts, J. Learn. Sci. 12 (2003) 91–142.
[50] J. Hattie, H. Timperley, The power of feedback, Rev. Educ. Res. 77 (2007) 81–112.
[51] R. Ajjawi, F. Kent, J. Broadbent, J. Tai, M. Bearman, D. Boud, Feedback that works: a realist review of feedback interventions for written tasks, Stud. High. Educ. 47 (2021) 1343–1356.
[52] U. Krause, R. Stark, Reflection in example- and problem-based learning: effects of reflection prompts, feedback, and cooperative learning, Eval. Res. Educ. 23 (2010) 255–272.
[53] J. Contreras, S. Edwards-Maddox, A. Hall, M. Lee, Effects of reflective practice on baccalaureate nursing students' stress, anxiety, and competency: an integrative review, Worldviews Evid.-Based Nurs. 17 (3) (2020) 239–245.
[54] H. Gadsby, Fostering reflective practice in Post Graduate Certificate in Education students through reflective journals. Developing a typology for reflection, Reflective Pract. 23 (2022) 357–368.
[55] S. Rabu, N. Badlishah, Levels of students' reflective thinking skills in a collaborative learning environment using Google Docs, TechTrends 64 (2020) 533–541.
[56] J. Stoszkowski, A. Hodgkinson, D. Collins, Using Flipgrid to improve reflection: a collaborative online approach to coach development, Phys. Educ. Sport Pedagogy 26 (2020) 167–178.
[57] E. Liesa, P. Mayoral, M. Giralt-Romeu, S. Angulo, Video-based feedback for collaborative reflection among mentors, university tutors, and students, Educ. Sci. 13 (9) (2023) 879.
[58] M. Alghasab, J. Hardman, Z. Handley, Teacher-student interaction on wikis: fostering collaborative learning and writing, Learn. Cult. Soc. Interact. 21 (2019) 10–20.
[59] R. Gubareva, R. Lopes, Virtual assistants for learning: a systematic literature review, CSEDU (1) (2020) 97–103.
[60] L. González, H. Neyem, I. Contreras-McKay, D. Molina, Improving learning experiences in software engineering capstone courses using artificial intelligence virtual assistants, Comput. Appl. Eng. Educ. 30 (2022) 1370–1389.
[61] B. Renner, G. Wesiak, V. Pammer-Schindler, M. Prilla, L. Müller, D. Morosini, S. Mora, N. Faltin, U. Cress, Computer-supported reflective learning: how apps can foster reflection at work, Behav. Inf. Technol. 39 (2019) 167–187.
[62] A. Freiberg-Hoffmann, A. Romero-Medina, B. López-Fernández, M. Fernández-Liporace, Learning approaches: cross-cultural differences (Spain-Argentina) and academic achievement in college students, Span. J. Psychol. 26 (2023) e16.
[63] A. Kobylarek, K. Błaszczyński, L. Ślósarz, M. Madej, Critical Thinking Questionnaire (CThQ): construction and application of a critical thinking test tool, Andragogy Adult Educ. Soc. Mark. 2 (2) (2022) 1–11.
[64] J. Dewey, An analysis of reflective thought, J. Philos. (1922) 29–38.
[65] D.T. Campbell, J.C. Stanley, Experimental and Quasi-Experimental Designs for Research, Ravenio Books, 2015.
[66] M.M. Plack, M. Driscoll, S. Blissett, R. McKenna, T.P. Plack, A method for assessing reflective journal writing, J. Allied Health 34 (4) (2005) 199–208.
[67] L. Wang, G. Wu, J. Wu, A study on the reflective level of teachers' autobiography, Global Education Outlook (01) (2018) 93–105.
[68] H.T. Hou, Integrating cluster and sequential analysis to explore learners' flow and behavioral patterns in a simulation game with a situated-learning context for science courses: a video-based process exploration, Comput. Human Behav. 48 (2015) 424–435.
[69] G. Zang, M. Liu, B. Yu, The application of 5G and artificial intelligence technology in the innovation and reform of college English education, Comput. Intell. Neurosci. 2022 (1) (2022) 9008270.
[70] A. Maedche, C. Legner, A. Benlian, B. Berger, H. Gimpel, T. Hess, O. Hinz, S. Morana, M. Söllner, AI-based digital assistants, Bus. Inf. Syst. Eng. 61 (2019) 535–544.
[71] M. Sigman, D. Slezak, L. Drucaroff, S. Ribeiro, F. Carrillo, Artificial and human intelligence in mental health, AI Mag. 42 (2021) 39–46.
[72] M.A. Rusandi, I. Saripah, D.M. Khairun, No worries with ChatGPT: building bridges between artificial intelligence and education with critical thinking soft skills, J. Public Health 45 (3) (2023) e602–e603.
[73] X. Xia, X. Li, Artificial intelligence for higher education development and teaching skills, Wirel. Commun. Mob. Comput. 2022 (1) (2022) 7614337.
[74] Y. Mohamud, A. Marof, A. Mohamed, M. Uzir, A narrative review on the impact of applied artificial intelligence tools on higher secondary students, Int. J. Acad. Res. Bus. Soc. Sci. 13 (14) (2023) 34–42.
[75] J. Cronje, Exploring the role of ChatGPT as a peer coach for developing research proposals: feedback quality, prompts, and student reflection, Electron. J. e-Learn. 22 (2) (2024).
[76] I. Wolfbauer, V. Pammer-Schindler, K. Maitz, C. Rosé, A script for conversational reflection guidance: a field study on developing reflection competence with apprentices, IEEE Trans. Learn. Technol. 15 (2022) 554–566.
[77] F. Leigh, Platonic dialogue, maieutic method, and critical thinking, J. Philos. Educ. 41 (2008) 309–323.
[78] E. Deci, R. Ryan, Intrinsic Motivation and Self-Determination in Human Behavior, 1975, pp. 13–71.
[79] J. Uygur, E. Stuart, M. Paor, E. Wallace, S. Duffy, M. OShea, S. Smith, T. Pawlikowska, The Best Evidence in Medical Education systematic review to determine the most effective teaching methods that develop reflection in medical students: BEME Guide No. 51, Med. Teach. 41 (2019) 3–16.
[80] K. Arendt, L. Stark, A. Friedrich, R. Brünken, R. Stark, Quality of reflections on teaching: approaches to its measurement and low-threshold promotion, Educ. Sci. 15 (7) (2025) 884.
[81] J. Jung, Y. Lu, A. Ding, How do prompts shape preservice teachers' reflections? A case study in an online technology integration class, J. Teach. Educ. 73 (3) (2021) 301–313.
[82] A. Sturgill, P. Motley, Methods of reflection about service learning: guided vs. free, dialogic vs. expressive, and public vs. private, Teaching and Learning Inquiry: ISSOTL J. 2 (1) (2014) 81–93.
[83] J. Knight, D. Weaver, M. Peffer, Z. Hazlett, Relationships between prediction accuracy, metacognitive reflection, and performance in introductory genetics students, CBE Life Sci. Educ. 21 (3) (2022) ar45.
[84] D. Difrancesca, J. Nietfeld, L. Cao, A comparison of high and low achieving students on self-regulated learning variables, Learn. Individ. Differ. 45 (2016) 228–236.
[85] S.A. Gani, D. Fajrina, R. Hanifa, Students' learning strategies for developing speaking ability, Stud. Engl. Lang. Educ. 2 (1) (2015) 16–28.
[86] M. Yip, Differences between high and low academic achieving university students in learning and study strategies: a further investigation, Educ. Res. Eval. 15 (2009) 561–570.
[87] H.K. Etkin, K.J. Etkin, R.J. Carter, C.E. Rolle, Differential effects of GPT-based tools on comprehension of standardized passages, Front. Educ. 10 (2025) 1506752.
[88] S. Ruan, A. Nie, W. Steenbergen, J. He, J.Q. Zhang, M. Guo, et al., A reinforcement learning tutor better supported lower performers in a math task, Mach. Learn. 113 (2024) 3023–3048.
[89] D.R. Thomas, J. Lin, E. Gatz, A. Gurung, S. Gupta, K. Norberg, et al., Improving student learning with hybrid human-AI tutoring: a three-study quasi-experimental investigation, in: Proc. 14th Learn. Anal. Knowl. Conf. (LAK 24), Association for Computing Machinery, New York, NY, USA, 2024, pp. 404–415.
[90] Y. Xu, J. Zhu, M. Wang, et al., The impact of a digital game-based AI chatbot on students' academic performance, higher-order thinking, and behavioral patterns in an information technology curriculum, Appl. Sci. 14 (15) (2024) 6418.
[91] A. Maloney, D.A. Roberts, J. Sully, A solvable model of neural scaling laws, arXiv preprint arXiv:2210.16859, 2022.
[92] W. Fedus, B. Zoph, N. Shazeer, Switch transformers: scaling to trillion parameter models with simple and efficient sparsity, J. Mach. Learn. Res. 23 (120) (2022) 1–39.
[93] K. Seo, J. Tang, I. Roll, S. Fels, D. Yoon, The impact of artificial intelligence on learner-instructor interaction in online learning, Int. J. Educ. Technol. High. Educ. 18 (1) (2021) 54.
[94] B. Klimova, M. Pikhart, J. Kacetl, Ethical issues of the use of AI-driven mobile apps for education, Front. Public Health 10 (2023) 1118116.
[95] T. Adiguzel, M. Kaya, F. Cansu, Revolutionizing education with AI: exploring the transformative potential of ChatGPT, Contemp. Educ. Technol. 15 (3) (2023).
[96] M. Thottoli, B. Alruqaishi, A. Soosaimanickam, Robo academic advisor: can chatbots and artificial intelligence replace human interaction? Contemp. Educ. Technol. 16 (1) (2024) ep485.

@@ -0,0 +1,883 @@
Computer Standards & Interfaces 97 (2026) 104107

Contents lists available at ScienceDirect

Computer Standards & Interfaces

journal homepage: www.elsevier.com/locate/csi

MExpm: Fair computation offloading for batch modular exponentiation with improved privacy and checkability in IoV

Sipeng Shen 1, Qiang Wang *,1, Fucai Zhou, Jian Xu, Mingxing Jin

Software College, Northeastern University, China
ARTICLE INFO

Keywords:
Internet of Vehicles
Modular exponentiation
Computation offloading
Smart contract

ABSTRACT

Modular exponentiation is a fundamental cryptographic operation extensively applied in the Internet of Vehicles (IoV). However, its computational intensity imposes significant resource and time demands on intelligent vehicles. Offloading such computations to Mobile Edge Computing (MEC) servers has emerged as a promising approach. Nonetheless, existing schemes are generally impractical, as they either fail to ensure fairness between intelligent vehicles and MEC servers, lack privacy protection for the bases and exponents, or cannot guarantee the correctness of results with overwhelming probability due to potential misbehavior by MEC servers. To address these limitations, we propose MExpm, a fair and efficient computation offloading scheme for batch modular exponentiation under a single untrusted server model. Our scheme leverages blockchain technology to ensure fairness through publicly verifiable results. Furthermore, MExpm achieves high checkability, offering a near-perfect probability of checkability. To enhance privacy, we introduce secure obfuscation and logical split techniques, effectively protecting both the bases and the exponents. Extensive theoretical analysis and experimental results demonstrate that our scheme is not only efficient in terms of computation, communication, and storage overheads but also significantly improves privacy protection and checkability.
1. Introduction

1.1. Motivation

Batch modular exponentiation, a fundamental mathematical operation denoted as ∏_{i=1}^{n} u_i^{a_i} mod N, is widely used in the Internet of Vehicles (IoV) (e.g., key exchange, digital signatures, and identity authentication) and is regarded as one of the most resource-intensive operations. Considering the limited computation resources of intelligent vehicles, locally executing the above task is unviable, as it cannot meet both the computation-resource and time-latency requirements [1]. To tackle this challenge, computation offloading (CO) has been proposed to undertake resource-intensive computation tasks for intelligent vehicles [2]. However, the current cloud computation paradigm for modular exponentiation offloading in [3-8] fails to meet the requirements of low latency, location awareness, and mobility support [9]: since cloud servers are far from the vehicles, network transfer latency becomes a challenge. To overcome the limitations of cloud computation, offloading computational tasks from intelligent vehicles to MEC servers, which are closer to intelligent vehicles than cloud servers, can provide adequate computation resources for offloaded tasks while meeting the latency requirements of intelligent vehicles [10,11]. Despite these benefits, this approach still suffers from some security challenges. Once the computation tasks are offloaded, the vehicle loses control over them. As a result, the MEC server may forge the outcome of the computation. To address this issue, verifiable CO was first proposed by [12] to ensure the integrity of the results. A fundamental requirement for verifiable CO is that the total time invested in the verification process should be less than the time spent performing the computation locally. Otherwise, the intelligent vehicle would not prefer to offload its computation.

1.2. Limitations of prior art

In this paper, we mainly focus on verifiable computation offloading for batch modular exponentiation with MEC servers. However, to the best of our knowledge, none of the existing prior schemes is practical enough, as demonstrated in Fig. 1. They suffer from the following challenges.

Fairness. Most verifiable CO schemes for batch modular exponentiation make sure the results are correct for the client before paying but often disregard the cloud's interests. As a result, the client might
Corresponding author.
E-mail address: wangqiang1@mail.neu.edu.cn (Q. Wang).
1
Equal contribution.
https://doi.org/10.1016/j.csi.2025.104107
Received 3 June 2025; Received in revised form 12 August 2025; Accepted 28 November 2025
Available online 3 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
Fig. 1. Limitations of Prior Art and Our Defenses. Scenario 1: Previous works adopt a private verification algorithm. Under this assumption, the greedy intelligent vehicle may reject the correct computation results and refuse to pay for the MEC server's work. Scenario 2: Previous works with low checkability fail to detect the MEC server's misbehavior. Scenario 3: Previous works with a plaintext offloading strategy fail to protect the confidentiality of inputs and outputs.
refuse to pay by deliberately claiming that the MEC server returned an incorrect result even when it executed faithfully. Furthermore, the cloud may intentionally manipulate the computation outcome for economic incentives. When a dispute occurs between them, a fully trusted third party (TTP), such as a judge, has to be involved to deduce which party is wrong. As an ex-post measure, the dispute can eventually be handled, but it is unfriendly for time-sensitive IoV applications [13]. Therefore, it is essential to find an immediate resolution without a TTP to guarantee fairness between the MEC server and the intelligent vehicle. Due to its transparency, accountability, and immutability, blockchain can be used to establish trust among untrusted parties. A naive solution is to delegate the entire computation to the blockchain, but this is inefficient and imposes financial burdens on intelligent vehicles, such as significant gas fees for modular exponentiation in Ethereum. Besides, this approach seriously deviates from the original intent of computation offloading.

Checkability Rate. The existing schemes employ verification mechanisms to ensure computation correctness against malicious MEC servers [14]. However, the achieved checkability rate often falls short of expectations, failing to reach 100%. For example, in [3,4], the checkability rate is only 97.5% when the batch size n = 1000. In other words, intelligent vehicles may fail to detect misbehavior by malicious MEC servers with a 2.5% probability. Besides, the intelligent vehicle may make misjudgments even if the MEC servers return correct results. When disputes arise between the intelligent vehicle and the MEC servers, as previously mentioned, a complex procedure involving a TTP is imperative. This ex-post measure is valid, but it is unsuitable for time-sensitive IoV applications. Furthermore, there is no such fully trusted entity in the real world.

Privacy. Most of the existing schemes offload the modular exponentiation ∏_{i=1}^{n} u_i^{a_i} mod N in a plaintext way [6,7]. If we directly apply them to IoV, this inevitably raises privacy concerns. Modular exponentiation plays a critical role in secure cryptographic algorithms (i.e., key exchange, digital signatures, identity authentication). In this case, the MEC server knows the base u_i, the exponent a_i, and the output u^a (mod N), which increases the risks of privacy leakage and attacks. From this point, it is essential to protect the privacy of the bases, exponents, and results. To tackle this challenge, some researchers [3-5] utilize the logical split technique to protect privacy. Their security relies on a strong assumption that the auxiliary information cannot be known by the malicious adversary. Therefore, the verification algorithm can only be executed by the data owner. For a fair computation offloading task, the verification algorithm should be public. To tackle this challenge, a straightforward approach is to process the computation using fully homomorphic encryption (FHE). Specifically, the bases u_i are encrypted using the data owner's public key and outsourced to the MEC server. The intelligent vehicle encodes the queries a_i under the same public key. To recover the final result returned by the MEC server, the private key of the data owner should be shared with the intelligent vehicle. If the private key is leaked, it will cause serious privacy issues [15,16]. Furthermore, the intelligent vehicle cannot afford this heavy computation owing to its limited resources.

Table 1
Comparison of properties.

Scheme       | Batch size | Privacy | Checkability rate                  | Verification | Fairness
MExp [3]     | 1          | ×       | 119/120                            | Private      | ×
SMCExp [4]   | 1          | ×       | 119/120                            | Private      | ×
SoRSA [6]    | 1          | ×       | 119/120                            | Private      | ×
EPExp [7]    | 1          | ×       | 0                                  | Private      | ×
MExpm (ours) | 1          | ✓       | ≈ 1                                | Public       | ✓
MExp [3]     | n          | ×       | 1 − n²/(10(4n² + 6n + 2))          | Private      | ×
SMCExp [4]   | n          | ×       | 1 − n²/(10(4n² + 6n + 2))          | Private      | ×
GExp [5]     | n          | ×       | 1/(n + 1)                          | Private      | ×
MExpm (ours) | n          | ✓       | 1 − n²/((4n² + 6n + 2)(N − 2))     | Public       | ✓

Batch Size: the number of bases in one offloading; Privacy: whether the scheme protects the privacy of bases and exponents; Checkability Rate: the checkability of the offloading scheme; Verification: the verification method for offloading; Fairness: fairness for both the service provider and intelligent vehicles; ✓: the scheme achieves this property; ×: it does not.

Compared with the existing works in Table 1, MExpm supports privacy and fairness both for service providers and intelligent vehicles. Our contributions can be summarized as follows.

1. To the best of our knowledge, we are the first to attempt fair computation offloading of batch modular exponentiation under a single untrusted server model, which is more appropriate for practical applications.
2. We integrate smart contracts into the verification process to ensure fairness and correctness. Compared with existing schemes, our approach incurs lower gas consumption.
3. We employ a logical split method and secure obfuscation techniques to conceal the bases, exponents, and modulus before offloading the computation. Consequently, MExpm achieves a near-perfect checkability rate.
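The checkability rates collected in Table 1 can be compared numerically. The sketch below is our illustration, not the paper's code: the closed form in `rate_mexp` is an assumption inferred from the 97.5% figure quoted for n = 1000, and all function names are ours.

```python
from fractions import Fraction

def rate_mexp(n):
    # MExp/SMCExp batch checkability (assumed closed form; it reproduces
    # the 97.5% figure the text quotes for n = 1000)
    return 1 - Fraction(n * n, 10 * (4 * n * n + 6 * n + 2))

def rate_gexp(n):
    # GExp: a forged result is detected only with probability 1/(n + 1)
    return Fraction(1, n + 1)

def rate_mexpm(n, N):
    # MExpm, Theorem 1: 1 - n^2 / ((4n^2 + 6n + 2)(N - 2))
    return 1 - Fraction(n * n, (4 * n * n + 6 * n + 2) * (N - 2))

print(float(rate_mexp(1000)))            # about 0.975
print(rate_gexp(1000))                   # 1/1001
print(float(rate_mexpm(1000, 2**512)))   # indistinguishable from 1.0
```

With a 512-bit N, the deception probability of MExpm is on the order of 2^(-510), which is why its rate stays essentially at 1 for any batch size.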
2. Related work

2.1. Computation offloading

Intelligent vehicles, with limited computation resources and an increasing number of in-car applications, struggle to efficiently execute computation-intensive tasks. To address the challenges faced by intelligent vehicles, computation offloading has been proposed. It transfers communication, computation, and storage tasks to MEC servers situated around intelligent vehicles [17]. Existing computation offloading schemes mainly focus on computation efficiency [17-19], resource allocation [20,21], or decision-making optimization [11] tasks. While these schemes lay the foundation for offloading computationally intensive tasks to MEC servers, they often lack adequate security considerations, leading to a gap in verifiable computation offloading for batch modular exponentiation.

Fig. 2. The architecture of the system model.
2.2. Secure outsourcing algorithms for modular exponentiation

Secure outsourcing algorithms for modular exponentiation can be categorized into single-server and dual-server models. The dual-server model assumes that there is no collusion risk between the cloud servers [22-25]. This assumption is complex to realize in real-world applications and is vulnerable to collusion attacks between servers. Therefore, we mainly consider the single-server model, which was first proposed in 2006 by Dijk et al. [26]. Recently, numerous algorithms have been proposed to improve the checkability rate [5,27,28]. In 2016, Ding et al. [3] proposed a modular exponentiation outsourcing scheme with a checkability rate close to 119/120, especially when the batch size n = 1, which is rather higher than before. Thereafter, Su et al. [4], in 2020, expanded Ding's method, optimized the logical split, and changed the modulus of the algorithm to a composite number. Recent schemes including SoRSA [6] and EPExp [7] assume that the bases in computation tasks are ciphertext and lack consideration of the security of the bases. The checkability rate of these methods is still far from 1, which can result in certain security risks. Meanwhile, many of these schemes concurrently present outsourcing algorithms for u^a. Nevertheless, a single modular exponentiation outsourcing algorithm is merely a specific instance of batch modular exponentiation outsourcing with batch size n = 1.

2.3. Fair computation

Recently, blockchain and smart contracts have been proposed to address these fairness issues [29]. Smart contracts can provide a secure solution for participants to execute contracts on Ethereum, essentially being executable code with correctness, transparency, and immutability [30]. Although there are some studies utilizing smart contracts to fulfill fair computation, they either rely on the assumption that the client and the cloud are honest [31], or utilize smart contracts to conduct complex computation tasks [29,32,33]. However, in standard blockchain systems such as Ethereum, users are typically charged gas fees based on the complexity of the computational task running in the smart contract. The gas fees for smart contracts are recorded in the fee table EIP150 [34]. Generally, the cost of Ethereum is high, and considering that modular exponentiation is an expensive computation task, existing schemes may increase the financial burden on users.

3. Fair computation offloading for batch modular exponentiation scheme in IoV

3.1. System model

As illustrated in Fig. 2, the fair computation offloading for batch modular exponentiation in IoV mainly comprises four entities: Service Agency (SA), Intelligent Vehicle (IV), Roadside Unit (RSU), and MEC Server.

SA: It is an honest entity. It provides the intelligent vehicle with the initialized bases u_i and modulus N of the batch modular exponentiation task ∏_{i=1}^{n} u_i^{a_i} mod N, and its communication with intelligent vehicles is based on secure channels.

IV: It is a resource-limited entity. It does not trust the MEC server, but it wants to offload some requests a_i to the MEC server, where i ∈ {1, …, n}. Furthermore, it may try to get the result without paying by intentionally claiming that the cloud's computation result is wrong.

RSU: It is an untrusted entity, which serves as a full node of the blockchain. It provides verifiable services to guarantee the integrity of the result.

MEC Server: It is a powerful entity deployed at the network's edge with adequate computation resources, which is responsible for performing the computation offloading tasks for the intelligent vehicle. Similar to the intelligent vehicle, it is also a profit-driven entity: it would like to get the reward from the intelligent vehicle without performing the computation.

A fair computation offloading for batch modular exponentiation (MExpm) scheme in IoV consists of the following algorithms.

(Params, RK) ← Setup(1^λ, u_1, …, u_n, N). Given a security parameter λ, the bases u_1, …, u_n and N, SA invokes this algorithm to generate the public parameters Params and the recovery key RK, where u_i and N are the bases and modulus for the modular exponentiation tasks.

(TK, VK, Aux) ← KeyGen(a_1, …, a_n, Params). On inputting the exponents a_1, …, a_n and the public parameters Params, IV runs this algorithm to generate the evaluation key TK for performing the computation task, the witness generation key VK, and auxiliary information Aux. It is worth noting that this algorithm can be carried out entirely offline before the online phase, so it does not introduce additional latency during computation outsourcing. The input a_i is the exponent of u_i, where i ∈ {1, …, n}.

(σ_E, π_E) ← Compute(TK, VK). On inputting the evaluation key TK and witness generation key VK, the MEC server performs this algorithm to produce the encoding result σ_E and witness π_E.
{0/1, σ_E} ← Verify(σ_E, π_E, Aux). On inputting the encoding result σ_E, the witness π_E and the auxiliary information Aux, the RSU runs this algorithm to check, utilizing a smart contract, whether the MEC server returns a correct result. If not, it outputs 0; otherwise, it outputs 1 and σ_E.

Result ← Recovery(σ_E, RK). On inputting σ_E and the recovery key RK, the intelligent vehicle runs this algorithm to decode the true result Result.

Table 2
Notations.

Symbols | Descriptions
{u_1, u_2, …, u_n} | Computation bases
λ | Security parameter
p | 512-bit prime integer
N | 512-bit prime integer
L | A composite integer L = pN
k | A random integer
τ | A composite integer τ = kN
{y_1, y_2, …, y_n} | Bases after secure obfuscation
(k_i, g^{k_i}), i ∈ {1, 2, 3, 4} | Random pairs generated by the RandN algorithm
{a_1, a_2, …, a_n} | Computation exponents
φ(·) | Euler's function
{w_i, z_1, δ_1, m_i}, i ∈ {1, …, n} | Computation tasks after logical division
r ∈ {2, …, N} | Random integer
ξ ∈ {1, …, n} | Random index
d | Modular multiplicative inverse of a_ξ
{w′_i, z_2, δ_2, m′_i}, i ∈ {1, …, n} | Verification tasks after logical division
{σ_E, π_E} | Computation results returned by the MEC server

3.2. Overview of construction and notations

Similar to [3], the bases and exponents are protected using a logical split. A recovery algorithm is also involved to protect the confidentiality of the final result. At the setup phase, we utilize a secure obfuscation technique to hide the modulus N and the bases u_i. In Setup step (a), only the masked modulus L = pN is sent to the MEC server, so the MEC server cannot get any information about N without the mask factor p chosen and kept private by the user. To prevent the MEC server from learning the original bases u_i, we apply a modular obfuscation technique by embedding each base into a larger modular space (i.e., Eq. (1)) in Setup step (b). Since k and p are sampled uniformly, the adversarial MEC server cannot recover them. The original computation offloading task ∏_{i=1}^{n} u_i^{a_i} mod N is converted into ∏_{i=1}^{n} y_i^{a_i} mod L.

The privacy of the exponent a_i is ensured by the logical split, where a_i = δ_1 · z_1 + m_i mod φ(L). Since the standard integer factorization assumption holds, the adversary cannot derive the factors p and N from L. Without the factors p and N, it is infeasible to compute φ(L) = (p − 1)(N − 1). As a result, the reduction modulo φ(L) effectively hides the underlying value, which makes it infeasible to recover a_i from δ_1 · z_1 + m_i mod φ(L). Furthermore, a malicious adversary learns nothing about the final computation result without the recovery key. A detailed description of the notations used in MExpm can be found in Table 2.

3.3. Detailed construction

1. Setup(1^λ, u_1, …, u_n, N). This algorithm is run by SA. Given a security parameter λ and a 512-bit prime integer N, SA works as follows:

(a) SA generates a 512-bit prime integer p and computes L = pN.
(b) SA uniformly chooses k from Z_N and computes τ = kN. For any i ∈ {1, 2, …, n}, SA sets y_i as follows:

y_i = u_i + τ mod L    (1)

(c) SA sets Params = {L, y_i} and RK = {N}, where RK is transmitted via a secure channel between SA and IV.

2. KeyGen(a_1, …, a_n, Params). This algorithm is executed by the IV to construct the evaluation key TK for performing the computation task, the witness generation key VK, and the auxiliary information Aux. Notably, the IV can execute this procedure in an offline manner, thereby avoiding additional delays in the online verification phase. This algorithm works as follows:

(a) IV parses Params as {L, y_i} and the input exponents a_i ∈ Z_{φ(L)}.
(b) IV runs the RandN program [35] four times to generate four blinding pairs (k_1, g^{k_1}), (k_2, g^{k_2}), (k_3, g^{k_3}), (k_4, g^{k_4}) and sets:

v_1 = g^{k_1} mod L,  v_2 = g^{k_2} mod L,  v_3 = g^{k_3} mod L,  v_4 = g^{k_4} mod L,    (2)

where g ∈ Z_L and its order is φ(L).
(c) IV performs a logical split to compute w_i, z_1, δ_1, and m_i such that

w_i = y_i v_1^{−1} (mod L),
k_1(a_1 + a_2 + ⋯ + a_n) = k_3 + δ_1 z_1 (mod φ(L)),    (3)
a_i = δ_1 z_1 + m_i (mod φ(L)).

(d) IV chooses two random integers r ∈ {2, …, N} and ξ ∈ {1, …, n} and computes d such that a_ξ d ≡ 1 (mod φ(L)).
(e) IV computes w′_i, z_2, δ_2, and m′_i such that

w′_i = y_i v_2^{−1} (mod L),
k_2(a_1 + a_2 + ⋯ + a_n) = k_4 + δ_2 z_2 (mod φ(L)),    (4)
a_i = δ_2 z_2 + m′_i (mod φ(L)).

Especially when i = ξ, we have y′_ξ = y_ξ r^d (mod L) and w′_ξ = y′_ξ v_2^{−1} (mod L), where ξ ∈ [1, n] is a random integer.
(f) IV sets TK = {(g ∏_{i=1}^{n} w_i, z_1), (w_i, m_i)_{i∈[n]}}, VK = {(g ∏_{i=1}^{n} w′_i, z_2), (w′_i, m′_i)_{i∈[n]}} and Aux = {rv_3, v_4, δ_1, δ_2}, where δ_1, δ_2 ∈ Z_{φ(L)}. The pseudocode of the key generation procedure can be found in Algorithm 1.

Algorithm 1: KeyGen Algorithm
Input: Exponents a_1, …, a_n ∈ Z_{φ(L)}, public parameters Params = {L, y_i}
Output: Evaluation key TK, verification key VK, auxiliary info Aux
1  Parse Params as {L, y_i};  // Step (a)
2  Run the RandN algorithm four times to get (k_1, g^{k_1}), (k_2, g^{k_2}), (k_3, g^{k_3}), (k_4, g^{k_4});  // Step (b)
3  Compute v_1 = g^{k_1} mod L, v_2 = g^{k_2} mod L, v_3 = g^{k_3} mod L, v_4 = g^{k_4} mod L;  // Eq. (2)
4  Compute k_1(a_1 + ⋯ + a_n) = k_3 + δ_1 z_1 mod φ(L);  // Eq. (3)
5  for i ← 1 to n do
6      Compute w_i = y_i v_1^{−1} mod L;  // Eq. (3)
7      Compute a_i = δ_1 z_1 + m_i mod φ(L);  // Eq. (3)
8  Sample r ∈ {2, …, N} and ξ ∈ {1, …, n} randomly;  // Step (d)
9  Compute d = a_ξ^{−1} mod φ(L);  // Step (d)
10 Compute k_2(a_1 + ⋯ + a_n) = k_4 + δ_2 z_2 mod φ(L);  // Eq. (4)
11 for i ← 1 to n do
12     Compute w′_i = y_i v_2^{−1} mod L;  // Eq. (4)
13     Compute a_i = δ_2 z_2 + m′_i mod φ(L);  // Eq. (4)
14 Update y′_ξ = y_ξ · r^d mod L and w′_ξ = y′_ξ v_2^{−1} mod L;  // Step (e)
15 Set TK = {(g ∏_{i=1}^{n} w_i, z_1), (w_i, m_i)_{i∈[n]}};  // Step (f)
16 Set VK = {(g ∏_{i=1}^{n} w′_i, z_2), (w′_i, m′_i)_{i∈[n]}};  // Step (f)
17 Set Aux = {rv_3, v_4, δ_1, δ_2};  // Step (f)
18 return TK, VK, Aux;
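The logical split behind Eqs. (3) and (4) can be checked numerically. The sketch below is our reconstruction, not the paper's code: the text does not state how z_1 is obtained, so we assume δ_1 is drawn invertible modulo φ(L) and solve z_1 = δ_1^{−1}(k_1 Σa_i − k_3) mod φ(L); the constants k_1, k_3 are borrowed from the toy example of Section 3.4.

```python
import math
import random

random.seed(1)
p, N = 13, 11                      # toy stand-ins for the 512-bit primes
L, phi = p * N, (p - 1) * (N - 1)  # phi(L) = (p - 1)(N - 1)
a = [79, 23, 41]                   # toy exponents a_1..a_n
k1, k3 = 63, 52                    # RandN-style random exponents

# delta1 must be invertible mod phi(L) so that z1 can be solved for
while True:
    delta1 = random.randrange(2, phi)
    if math.gcd(delta1, phi) == 1:
        break

# First logical split of Eq. (3):
#   k1*(a1 + ... + an) = k3 + delta1*z1 (mod phi)
#   ai                 = delta1*z1 + mi (mod phi)
z1 = (k1 * sum(a) - k3) * pow(delta1, -1, phi) % phi
m = [(ai - delta1 * z1) % phi for ai in a]

assert k1 * sum(a) % phi == (k3 + delta1 * z1) % phi
assert all(ai % phi == (delta1 * z1 + mi) % phi for ai, mi in zip(a, m))
print("logical split of Eq. (3) holds")
```

The second split of Eq. (4) is solved the same way with k_2, k_4, δ_2; the modular inverse `pow(x, -1, phi)` requires Python 3.8+, the version the paper's simulation uses.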
3. Compute(TK, VK). This algorithm is run by the MEC server to generate the encoding result σ_E and the witness result π_E. The MEC server works as follows:

(a) The MEC server parses TK as {(g ∏_{i=1}^{n} w_i, z_1), (w_i, m_i)_{i∈[n]}} and VK as {(g ∏_{i=1}^{n} w′_i, z_2), (w′_i, m′_i)_{i∈[n]}}, and then sets γ_i = (w_i)^{m_i} and γ′_i = (w′_i)^{m′_i} for any i ∈ {1, …, n}, respectively.
(b) The MEC server sets Q_0 = (g ∏_{i=1}^{n} w_i)^{z_1} and Q_1 = (g ∏_{i=1}^{n} w′_i)^{z_2}.
(c) The MEC server sets σ_E = {Q_0, (γ_i)_{i∈[n]}} and π_E = {Q_1, (γ′_i)_{i∈[n]}}.

4. Verify(σ_E, π_E, Aux). This algorithm is run by the RSU to check the correctness of the result returned by the MEC server. The RSU works as follows:

(a) Upon receiving the encoding result σ_E and witness π_E, it first parses them as {Q_0, (γ_i)_{i∈[n]}} and {Q_1, (γ′_i)_{i∈[n]}}, respectively.
(b) It parses the auxiliary information Aux as {rv_3, v_4, δ_1, δ_2}.
(c) The RSU utilizes a smart contract to compute η = (Q_0)^{δ_1} and then checks whether the following equation holds:

rv_3 · η · ∏_{i=1}^{n} γ_i = v_4 · (Q_1)^{δ_2} · ∏_{i=1}^{n} γ′_i (mod L)    (5)

If not, the smart contract outputs 0 and aborts. Otherwise, it outputs 1 and sets σ_E = {rv_3 η ∏_{i=1}^{n} γ_i}. The verification logic of the smart contract can be found in Algorithm 2.

Algorithm 2: Verification Logic of Smart Contract
Input: Q_0, Q_1 ∈ Z_L; (γ_i)_{i∈[n]}, (γ′_i)_{i∈[n]} ∈ Z_L; scalars rv_3, v_4, δ_1, δ_2 ∈ Z_L
Output: Boolean flag indicating verification result
1  η ← Q_0^{δ_1} mod L;  // Compute η
2  prodGamma ← 1;
3  for i ← 1 to n do
4      prodGamma ← prodGamma · γ_i mod L;  // Accumulate product of γ_i
5  prodGammaPrime ← 1;
6  for i ← 1 to n do
7      prodGammaPrime ← prodGammaPrime · γ′_i mod L;  // Accumulate product of proofs γ′_i
8  lhs ← rv_3 · η · prodGamma mod L;  // Left-hand side of the equality
9  rhs ← v_4 · Q_1^{δ_2} · prodGammaPrime mod L;  // Right-hand side of the equality
10 return (lhs == rhs);  // Return true if verification passes

5. Recovery(σ_E, RK). This algorithm is run by the IV to recover the encoding result σ_E to the true result Result.

(a) IV parses RK as {N} and σ_E as {rv_3 η ∏_{i=1}^{n} γ_i}, where η = (g ∏_{i=1}^{n} w_i)^{z_1 δ_1}.
(b) IV recovers the final computation result Result as follows:

Result = rv_3 · η · ∏_{i=1}^{n} γ_i · r^{−1} (mod N)    (6)

3.4. An illustrative example

We now provide a toy example to further illustrate MExpm. The original (non-offloaded) modular exponentiation performs the following procedures:

1. Setup: Generate two distinct secure primes p = 13, N = 11. Compute L = pN = 143. Then generate u = 128, a = 79.
2. Compute: Locally compute the result Result = 128^79 mod 11 = 8.

The proposed MExpm consists of the following procedures:

1. Setup: SA generates p = 13, N = 11 and computes L = p · N = 143. Then SA generates the base u = 128 and utilizes the random integer k = 5 to compute y = u + kN mod L = 40.
2. KeyGen: The intelligent vehicle runs the RandN algorithm to obtain (63, 125), (42, 25), (52, 113), (82, 69) and g = 71, and computes v_1^{−1} = 125^{−1} mod L = 135, v_2^{−1} = 25^{−1} mod L = 103. Then it generates a computation task a = 79. Thereafter, the intelligent vehicle generates the random integers δ_1 = 11, δ_2 = 109, r = 7 and computes d = a^{−1} mod φ(L) = 79 and r^d mod L = 19. Finally, the intelligent vehicle utilizes Eqs. (3) and (4) to conduct the logical split, obtaining w = 109, gw = 17, w′ = 59, gw′ = 42, z_1 = 55, z_2 = 44, m = 74, m′ = 83, rv_3 = 76.
3. Compute: The MEC server receives the offloading task and computes (gw)^{z_1} = 17^55 mod 143 = 43, (gw′)^{z_2} = 42^44 mod 143 = 126, w^m = 12, and w′^{m′} = 119.
4. Verify: The smart contract is called to verify Left = 76 · 12 · 43^11 mod 143 = 111 and Right = 69 · 126^109 · 119 mod 143 = 111.
5. Recovery: The intelligent vehicle computes r^{−1} mod L = 41, then obtains Result = 111 · 41 mod 11 = 8.

4. Theoretical analysis

4.1. Correctness

To prove correctness, we need to argue that the results returned by the MEC server can pass the verification algorithm and that the intelligent vehicle can recover the final result if all the entities involved are honest.

For the first part, we mainly argue based on Eq. (5). That is, we prove that σ_E and π_E can pass the Verify algorithm when the MEC server is honest and follows all the algorithms mentioned above. Based on Eq. (4), the right-hand side (RHS) of Eq. (5) can be expressed as:

RHS = v_4 · (Q_1)^{δ_2} · ∏_{i=1}^{n} γ′_i (mod L)
    = g^{k_4} (g ∏_{i=1}^{n} w′_i)^{z_2 δ_2} ∏_{i=1}^{n} w′_i^{m′_i} (mod L)
    = g^{k_4 + z_2 δ_2} (∏_{i=1}^{n} w′_i)^{z_2 δ_2} ∏_{i=1}^{n} w′_i^{m′_i} (mod L)
    = g^{k_2(a_1 + a_2 + ⋯ + a_n)} ∏_{i=1}^{n} w′_i^{m′_i + δ_2 z_2} (mod L)    (7)
    = g^{k_2(a_1 + a_2 + ⋯ + a_n)} ∏_{i=1}^{n} w′_i^{a_i} (mod L)
    = ∏_{i=1}^{n} g^{k_2 a_i} w′_i^{a_i} (mod L)
    = ∏_{i=1}^{n} v_2^{a_i} w′_i^{a_i} (mod L)
    = ∏_{i=1}^{ξ−1} y_i^{a_i} · y′_ξ^{a_ξ} · ∏_{i=ξ+1}^{n} y_i^{a_i} (mod L)
Since a_ξ d ≡ 1 (mod φ(L)), we always have y′_ξ^{a_ξ} = y_ξ^{a_ξ} r^{d a_ξ} = r y_ξ^{a_ξ} mod L. Based on Eq. (3), we can get:

RHS = r ∏_{i=1}^{n} y_i^{a_i} (mod L)
    = r ∏_{i=1}^{n} g^{k_1 a_i} w_i^{a_i} (mod L)
    = r g^{k_1(a_1 + a_2 + ⋯ + a_n)} ∏_{i=1}^{n} w_i^{m_i + δ_1 z_1} (mod L)
    = r g^{k_3 + z_1 δ_1} (∏_{i=1}^{n} w_i)^{z_1 δ_1} ∏_{i=1}^{n} w_i^{m_i} (mod L)    (8)
    = r g^{k_3} (g ∏_{i=1}^{n} w_i)^{z_1 δ_1} ∏_{i=1}^{n} w_i^{m_i} (mod L)
    = r v_3 (Q_0)^{δ_1} ∏_{i=1}^{n} γ_i (mod L)
    = r v_3 · η · ∏_{i=1}^{n} γ_i (mod L)

Obviously, according to Eq. (8), if the MEC server and the intelligent vehicle IV are honest and follow all the procedures described above, the encoding result σ_E and witness result π_E can always pass the Verify algorithm.

Second, we argue that the encoding result σ_E can be decoded to the actual result Result. Here, we mainly rely on Eq. (1), Eq. (2), Eq. (3) and Eq. (6). The σ_E can be parsed and computed as follows:

Result = r v_3 η ∏_{i=1}^{n} γ_i · r^{−1} (mod N)
       = g^{k_3} (g ∏_{i=1}^{n} w_i)^{z_1 δ_1} ∏_{i=1}^{n} w_i^{m_i} (mod N)
       = g^{k_3 + z_1 δ_1} (∏_{i=1}^{n} w_i)^{z_1 δ_1} ∏_{i=1}^{n} w_i^{m_i} (mod N)
       = g^{k_1(a_1 + a_2 + ⋯ + a_n)} ∏_{i=1}^{n} w_i^{m_i + δ_1 z_1} (mod N)    (9)
       = ∏_{i=1}^{n} g^{k_1 a_i} w_i^{a_i} (mod N)
       = ∏_{i=1}^{n} y_i^{a_i} (mod N)
       = ∏_{i=1}^{n} (u_i + kN)^{a_i} (mod N)
       = ∏_{i=1}^{n} u_i^{a_i} (mod N)

Obviously, when Eq. (9) holds, the correctness of the Recovery algorithm is guaranteed and the proof is completed.

4.2. Security analysis

In this section, we demonstrate the privacy of the computation offloading results. In MExpm, we first convert ∏_{i=1}^{n} u_i^{a_i} (mod N) into ∏_{i=1}^{n} y_i^{a_i} (mod L); then the exponents a_i are transformed into δ_1 z_1 + m_i, i ∈ [n], and δ_2 z_2 + m′_i, i ∈ [n]. The public information in our scheme is {Params, TK, VK, Aux, σ_E, π_E}, and the adversaries cannot obtain any information about the secret information {u_i, a_i (i ∈ [n]), RK, Result}.

Theorem 1. When the MEC server cheats the client, the misbehavior can be detected with checkability rate 1 − n²/((4n² + 6n + 2)(N − 2)).

Proof. If the malicious MEC server deceives the intelligent vehicle IV successfully, the following equation will hold:

t · rv_3 · η · ∏_{i=1}^{n} γ_i = t · v_4 · (Q_1)^{δ_2} · ∏_{i=1}^{n} γ′_i    (10)

The corresponding encoding result will be decoded by IV as follows:

Result = t v_3 η ∏_{i=1}^{n} γ_i (mod N)    (11)

Since the MEC server cannot gain access to the values of δ_1 and δ_2, it cannot obtain the correct values of η and (Q_1)^{δ_2}. Therefore, the MEC server can only turn to the other n pairs to cheat the intelligent vehicle; it then needs to determine the correct meanings of the 2n + 2 sub-tasks to obtain the pairs (w_h, m_h) and (w′_j, m′_j). Since the sending order of these 2n + 2 pairs is random, the MEC server is unaware of the specific meaning of each pair. Therefore, it needs to find (w_h, m_h) and (w′_j, m′_j) among the 2n + 2 pairs; the probability of this operation is (n/(2n + 2)) · (n/(2n + 1)). Additionally, for a successful deception, the MEC server needs to determine the value of r, where r ∈ {2, …, N}; thus, the probability of finding the correct r is 1/(N − 2). Subsequently, the MEC server generates a random number t and returns t w_h^{m_h} and t w′_j^{m′_j}. We denote the malicious server successfully determining the correct meaning of (w_h, m_h) and (w′_j, m′_j) as event E_1, and determining the value of r as event E_2. We have:

Pr(E_1) = (n/(2n + 2)) · (n/(2n + 1)) and Pr(E_2) = 1/(N − 2).

Therefore, the probability of the intelligent vehicle being deceived is Pr(E_1 ∩ E_2) = Pr(E_1) Pr(E_2) = n²/((4n² + 6n + 2)(N − 2)), and the checkability rate of our proposed scheme MExpm is 1 − n²/((4n² + 6n + 2)(N − 2)).

Fig. 3. Comparison of checkability rate.

5. Simulation

In this section, we evaluate the performance of our proposed scheme MExpm by comparing it with the most advanced and representative modular exponentiation offloading schemes reported in recent literature. Specifically, we consider MExp [3] and SMCExp [4] for secure batch modular exponentiation, as well as SoRSA [6] and EPExp [7] for single modular exponentiation. These schemes reflect the latest advancements in both batch-oriented and single-operation settings and are widely recognized as benchmarks in the field. Notably, all of these algorithms are incorporated as baselines in our experimental evaluation, covering key performance indicators such as local computation time, end-to-end execution latency, communication overhead, and gas consumption. Since MExpm is designed for batch modular exponentiation, while SoRSA and EPExp are designed solely for single modular exponentiation, for fairness we also conduct the comparison for the case where the batch size n = 1.
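Before turning to the measurements, the five procedures can be exercised end to end in code. The sketch below replays the toy example of Section 3.4; it follows the paper's notation, but the wiring (solving z_1, z_2 from the splits and using w = y · v_1^{−1} mod L) is our reconstruction, and it reproduces the printed intermediate values (y = 40, w = 109, w′ = 59, z_1 = 55, z_2 = 44) and the final result 8.

```python
# MExpm pipeline on the toy values of Section 3.4 (batch size n = 1):
# p = 13, N = 11, u = 128, a = 79, k = 5, g = 71.
p, N, g = 13, 11, 71
L, phi = p * N, (p - 1) * (N - 1)
u, a, k = 128, 79, 5
r, delta1, delta2 = 7, 11, 109
k1, k2, k3, k4 = 63, 42, 52, 82        # RandN outputs from the toy example

y = (u + k * N) % L                     # Setup, Eq. (1): y = 40
v = [pow(g, ki, L) for ki in (k1, k2, k3, k4)]   # Eq. (2): v1..v4
d = pow(a, -1, phi)                     # a * d = 1 mod phi(L)
z1 = (k1 * a - k3) * pow(delta1, -1, phi) % phi  # split of Eq. (3)
m1 = (a - delta1 * z1) % phi
w = y * pow(v[0], -1, L) % L            # w = y * v1^{-1} = 109
z2 = (k2 * a - k4) * pow(delta2, -1, phi) % phi  # split of Eq. (4)
m2 = (a - delta2 * z2) % phi
y2 = y * pow(r, d, L) % L               # y' = y * r^d
w2 = y2 * pow(v[1], -1, L) % L          # w' = y' * v2^{-1} = 59

# Compute (MEC server)
Q0, Q1 = pow(g * w % L, z1, L), pow(g * w2 % L, z2, L)
gamma, gamma2 = pow(w, m1, L), pow(w2, m2, L)

# Verify (RSU smart contract, Eq. (5)): both sides equal 111 in the toy run
lhs = r * v[2] % L * pow(Q0, delta1, L) % L * gamma % L
rhs = v[3] * pow(Q1, delta2, L) % L * gamma2 % L
assert lhs == rhs == 111

# Recovery (IV, Eq. (6)): strip r, reduce mod N
result = lhs * pow(r, -1, L) % N
assert result == pow(u, a, N) == 8
print("toy example verified:", result)
```

The same script scales to batch size n > 1 by carrying lists of (w_i, m_i) pairs; only the products inside Q_0, Q_1 and the two γ-accumulations of Algorithm 2 grow.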
Fig. 4. Comparison of time cost in u^a mod N.

Fig. 5. Comparison of time cost in ∏_{i=1}^{n} u_i^{a_i} mod N.

Fig. 6. Comparison of communication cost.

Fig. 7. Comparison of storage cost.
5.1. Experimental setting

We implemented MExpm and MExp for batch modular exponentiation offloading, and MExp, SMCExp, EPExp, SoRSA and MExpm for single modular exponentiation offloading, using Python 3.8 along with the PyCryptodome and GNU Multiple Precision (gmpy2 version 2.1.5) libraries. All simulation experiments were conducted on the same Windows machine equipped with an Intel Core i9-13900HX processor (running at 2.20 GHz) and 16 GB of memory. We perform each algorithm 100 times and then compute the mean of its time cost. The prime numbers selected in MExpm are all 512 bits, meaning the number L is 1024 bits. For MExp and methods without offloading, we randomly generate a pair of 1024-bit prime numbers. In our simulation, "MExpm w/o obfuscation" denotes MExpm without the secure obfuscation operation, and "w/o offloading" indicates local execution of the modular exponentiation operation.

5.2. Evaluation metrics

To comprehensively evaluate MExpm, we assess its performance across five dimensions: privacy, computation cost, blockchain resource usage, communication overhead, and storage overhead. We quantify privacy using the checkability rate. Computation cost is measured by the execution time (in milliseconds), where longer runtimes correspond to higher resource consumption. We assess blockchain resource usage on the Ethereum simulation platform, Remix, using gas consumption as the metric. Communication overhead is defined as the total data transmitted during offloading. Storage overhead is quantified by the additional storage required on both the client and server sides.

5.3. Checkability

The details of the checkability rate comparison of these four schemes are shown in Fig. 3. As n increases, our proposed scheme MExpm always maintains a high checkability rate close to 1, while the other three schemes gradually decrease. When n = 1000, the checkability rate of GExp is only 1/1001, meaning that a forged result can pass the verification algorithm with probability 1000/1001. Both MExp and SMCExp have the same checkability rate; however, when n = 5000, the checkability rate for these two schemes is only 0.975. Since MExpm uses a 512-bit prime number N, its checkability rate is higher than 0.999.

5.4. Computation cost

5.4.1. Single modular exponentiation offloading

The comparison results of single modular exponentiation offloading can be found in Fig. 4. Compared with MExp and SMCExp, MExpm demonstrates better performance in both the KeyGen algorithm and the Setup algorithm. Particularly in the Verify algorithm, MExpm outperforms these competitors. As for SoRSA and EPExp, whose security assumptions are rather simple and cannot be applied in real-world scenarios, it seems unfair to compare them with schemes for batch modular exponentiation with a higher security standard.

5.4.2. Batch modular exponentiation offloading

Fig. 5 compares the computational cost of batch modular exponentiation offloading. MExpm consistently requires fewer resources than MExp across all phases. Although MExpm adds a recovery phase, it consists of a single modular inversion, incurring a fixed and negligible overhead.

5.5. Communication and storage cost

To simulate a low-bandwidth network environment, we set the transmission rate to 1 Kbps. Fig. 6 shows the communication cost of all the competitors and MExpm in terms of the time cost of transmission. For a fair comparison, all schemes employ a 1024-bit modulus. EPExp and SoRSA, whose authors assume that only ciphertext is offloaded to servers so that the security of the bases need not be taken into consideration, have lower communication cost compared with the other schemes. Compared with MExp and SMCExp, MExpm shares the same communication cost in the Compute and Verify algorithms. SMCExp shows the least communication cost in the KeyGen and Setup algorithms. The results in Fig. 6 demonstrate that MExpm can deploy a more secure offloading strategy with similar communication cost compared with the other competitors. Fig. 7 shows the storage cost among all schemes: SoRSA needs to store n, q, p, C, k, t_1, t_2 to conduct verification and recovery, leading to the most demanding storage cost. EPExp demonstrates the best storage performance, but it lacks consideration of a malicious MEC server. Compared with MExp and SMCExp, MExpm demonstrates the best storage performance.

Fig. 8. Comparison of gas consumption in the Verify algorithm when the size of r is larger than 32 bits.

Fig. 9. The relative saving ratio of MExp and MExpm.
5.6. Gas consumption

The results of the gas consumption comparison are demonstrated in Fig. 8. It can be observed that as r increases, the gas consumption of both MExpm and MExp grows steadily. However, the gap between them widens significantly. For instance, when n = 5, the gas fee difference rises from 7,504 gas at r = 32 bits to 58,981 gas at r = 256 bits. Furthermore, the gas cost of MExpm's Verify algorithm scales linearly with n and is largely unaffected by r, whereas MExp's verification cost increases with both r and n. This highlights MExpm's superior efficiency in reducing the computational and financial burdens on intelligent vehicles, especially at larger scales. To provide a normalized view of these savings, we evaluate the relative saving ratio (SR) defined as

SR = (G_MExp^{n,r} - G_MExpm^{n,r}) / G_MExp^{n,r}

where G_MExp^{n,r} and G_MExpm^{n,r} are the gas consumption of MExp and MExpm with the same n and r, respectively. As illustrated in Fig. 9, MExpm consistently achieves SR > 0 across all tested parameters, with observed savings ranging from approximately 30% to 70%. These results confirm that MExpm delivers substantial resource savings over MExp, reinforcing its scalability and economic advantages.

5.7. Economic analysis of gas savings

To further assess the practical impact of our scheme in real-world deployments, we provide an economic estimation of the gas savings achieved by the proposed MExpm scheme over the representative baseline MExp, particularly in the context of blockchain-based smart contract verification. As shown in Fig. 8, the gas cost of each offloaded batch modular exponentiation result increases with both the batch size n and the bit length of the randomness parameter r. When n = 1 and the bit length of r is 32 bits, MExp incurs 24,013 gas while MExpm requires only 16,509 gas, a difference of 7,504 gas per verification. The average gas price is approximately 30 Gwei (1 Gwei = 10^-9 ETH), the ETH/USD exchange rate is approximately $4,000, and 1 million gas therefore costs approximately $120 on Ethereum networks. The savings can thus be translated as

Gas Savings = 7,504 gas × 30 Gwei/gas × 10^-9 ETH/Gwei × $4,000/ETH ≈ 0.90 USD.

Consider a practical usage scenario where each intelligent vehicle offloads batch modular exponentiation tasks 10 times per day (e.g., for authentication, key negotiation, digital signatures, etc.). The annual number of invocations is then 10 tasks/day × 365 days/year = 3,650 tasks/year, so the total annual gas cost saving per vehicle is 0.90 USD/task × 3,650 tasks/year ≈ 3,285 USD/year. This estimation highlights the substantial economic benefits of MExpm when deployed at scale in large IoV systems. For a fleet of 10,000 vehicles, the projected gas savings could exceed 32.8 million USD annually.

5.8. Robustness evaluation against malicious MEC servers

To address potential security risks in practical deployments, we conducted robustness experiments simulating malicious Mobile Edge Computing (MEC) servers that deviate from the prescribed computation protocol. Such adversarial behaviors may include, but are not limited to:

• Forged Results: The MEC server deliberately returns computation results that deviate from the prescribed algorithm, thereby attempting to mislead the verifier regarding the correctness of the computation.
• Partial Omission or Manipulation: The MEC server selectively omits partial computation results or manipulates intermediate values with the intent of reducing its own computational workload.
• Witness Tampering: The MEC server alters or forges witnesses with the objective of deceiving the verifier and illegitimately passing the verification process.

In our simulation, the Intelligent Vehicle (IV) executes the KeyGen algorithm entirely offline to generate the evaluation key TK, witness generation key VK, and auxiliary information Aux. The TK and VK are then transmitted to the MEC server and the Roadside Unit (RSU), respectively. The RSU, acting as a lightweight verifier, executes the verification algorithm upon receiving computation results from the MEC server.

Experimental results demonstrate that our verification mechanism achieves a 100% detection rate for all injected malicious behaviors, with zero false positives under benign conditions. This confirms that the proposed scheme maintains strong security guarantees even in the presence of malicious MEC servers, thereby reinforcing its practicality for real-world V2X deployments.

5.9. Deployment feasibility in real-world V2X environments

The proposed scheme is designed for secure computation outsourcing in resource-constrained vehicular networks. In such scenarios, the primary evaluation metric is whether the total computational cost incurred locally after outsourcing is significantly lower than that of fully local execution. Therefore, as is common in the literature on computation outsourcing, we perform all experiments (the computation algorithm, the verification algorithm, and a non-outsourced baseline) on the same hardware platform. This ensures a fair and reproducible comparison under identical computational conditions, thereby directly demonstrating the benefits of outsourcing.

In the proposed system, the Service Authority (SA) is responsible for handling the bulk of the initialization tasks, such as modulus generation, base obfuscation, and parameter distribution. This design choice aligns with the practical division of labor in vehicular networks, reducing the computational burden on field devices. The Intelligent Vehicle (IV) executes the KeyGen algorithm to generate the evaluation key TK, witness generation key VK, and auxiliary information Aux. Crucially, the KeyGen algorithm can be performed entirely offline before the online phase, ensuring that no additional delay is introduced when initiating the outsourced computation. Once TK is generated, the IV transmits TK to the MEC server for performing the computation tasks.

In real deployments, RSUs are lightweight verification devices typically deployed at traffic intersections or along highways, where power supply and network connectivity can be unstable. To emulate these constraints, we configure the RSU role on a lightweight laptop and limit the network transmission rate to 1 Kbps, thereby simulating a realistic low-bandwidth vehicular environment. Meanwhile, the Service Authority (SA) undertakes the bulk of the initialization tasks, ensuring that RSUs and IVs remain computationally efficient during the online phase.

This deployment-oriented design, together with our simulation settings, ensures that the evaluation faithfully reflects real-world limitations while remaining reproducible. Consequently, the proposed scheme is shown to be both practically feasible and robust for secure computation outsourcing in V2X environments.

6. Conclusion and future work

In this paper, we propose MExpm, a secure and efficient computation offloading scheme for batch modular exponentiation in Vehicle-to-Everything (V2X) communications. Our proposed scheme addresses critical challenges in V2X systems, such as computational burden, latency, and privacy concerns, by leveraging Mobile Edge Computing (MEC) servers and blockchain technology. Our scheme achieves several significant improvements over existing methods. It ensures fairness in computation offloading by using smart contracts, provides high checkability to detect any misbehavior by MEC servers, and enhances privacy protection through secure obfuscation and logical split techniques. These features make MExpm particularly well-suited for various real-time applications in V2X systems. These include
• Real-time Cryptographic Operations: MExpm can offload resource-intensive cryptographic tasks, such as digital signatures and key exchanges, ensuring secure and efficient communication with reduced latency.
• Safety-Critical Message Signature: Offloading the computation of digital signatures (which rely on exponentiation) for emergency braking alerts and collision avoidance warnings, with on-chain smart contracts validating each signature to prevent tampering.
• Privacy-Preserving Authentication: The scheme guarantees the privacy of sensitive data, such as cryptographic bases and exponents, while allowing verification of computation results. This is essential for secure authentication in V2X communications, protecting both vehicles and infrastructure from malicious attacks.
• Traffic Management Systems: MExpm can be integrated into smart city infrastructures, supporting secure communication for traffic management, tolling systems, and other applications where privacy and computation efficiency are crucial.

Although MExpm significantly reduces computation resources compared to local execution, it introduces additional complexity due to its enhanced security features. Future work should aim to design a more generalized verifiable computation offloading framework that optimizes the balance between security and computational efficiency.

CRediT authorship contribution statement

Sipeng Shen: Writing – review & editing, Writing – original draft, Methodology, Conceptualization. Qiang Wang: Writing – review & editing, Writing – original draft, Methodology. Fucai Zhou: Writing – review & editing, Supervision. Jian Xu: Writing – review & editing, Supervision. Mingxing Jin: Writing – review & editing.

Declaration of competing interest

The authors declare no competing interests.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 62202090, 62173101, and 62372069, by the Natural Science Foundation of Liaoning Province under Grant 2025-MS-046, by the Fundamental Research Funds for the Central Universities, China under Grant N2417006, and by the Liaoning Collaboration Innovation Center for CSLE under Grant XTCX2024-015.

Data availability

Data will be made available on request.
Journal of Systems Architecture 160 (2025) 103362
Contents lists available at ScienceDirect
Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc
Quantum-safe identity-based designated verifier signature for BIoMT
Chaoyang Li a,b,∗, Yuling Chen a, Mianxiong Dong c, Jian Li d, Min Huang b, Xiangjun Xin b, Kaoru Ota c
a State Key Laboratory of Public Big Data, Guizhou University, Guizhou Guiyang, 550025, China
b College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, China
c Department of Sciences and Informatics, Muroran Institute of Technology, Muroran 050-8585, Japan
d School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
ARTICLE INFO

MSC:
00-01
99-00

Keywords:
Blockchain
Internet of medical things
Identity
DVS
Privacy-preserving

ABSTRACT

Blockchain technology changes the centralized management form in traditional healthcare systems and constructs a distributed and secure medical data-sharing mechanism to achieve data value maximization. However, the advanced capabilities of quantum algorithms bring a serious threat to current blockchain cryptographic algorithms, which are based on classical mathematical difficulties. This paper proposes the first quantum-safe identity-based designated verifier signature (ID-DVS) scheme for blockchain-based Internet of medical things (BIoMT) systems. The scheme is constructed based on the lattice assumption of the short integer solution (SIS) problem, which is believed to resist quantum attack. The identity mechanism helps to establish a transaction traceability mechanism when data is shared among different medical institutions. The designated verifier mechanism also prevents unauthorized users from accessing data, improving the security of medical data-sharing processes. This ID-DVS scheme is proved in the random oracle model to achieve the security properties of anonymity and unforgeability, and it also captures post-quantum security. The performance analysis of the key size and time consumption is presented, and the results show that this ID-DVS is more efficient than other similar schemes. Therefore, this work supports secure medical data-sharing and protects the privacy of users and medical data.
1. Introduction

Blockchain-enabled Internet of Medical Things (BIoMT) profoundly affects people's lives and health with the gradual increase of wearable health devices [1]. Firstly, blockchain technology helps to establish a distributed medical data-sharing framework among different medical institutions, which replaces the traditional centralized management form and achieves cross-institutional medical data utilization. The BIoMT thus addresses the problems of collecting, storing, sharing, and using massive medical data. However, the security issues with medical data and user privacy in the cross-institutional data-sharing process have gained much attention as more sensitive information is embedded in these medical data. Especially for sensitive information protection, users do not want to give non-specified users access to the data. Hence, one-to-one data sharing can effectively prevent the leakage of sensitive information.

Blockchain cryptography has received more attention as it is increasingly essential in most blockchain-based applications [2]. It relates to cryptographic algorithms such as symmetric cryptography, asymmetric cryptography, hash functions, public key infrastructure, Merkle trees, digital signatures, and zero-knowledge proofs, which are utilized to better adapt to transaction privacy protection in the blockchain network. These blockchain cryptographic technologies jointly protect transaction security and user privacy. For example, the digital signature is responsible for transaction verification in the consensus process and for establishing links to different blocks [3]. The signature also provides a transaction traceability mechanism when disputes occur. In particular, the DVS is well suited for one-to-one data-sharing among different BIoMT systems, as it can guarantee the non-delegatability of the signature. These technologies construct the trust foundation for the blockchain-based network, as these NP-hard problem-based cryptographic algorithms cannot be broken with the most advanced current classical computers. Most of these algorithms are based on RSA and ECC cryptographic theories, but the underlying problems of large integer factorization and discrete logarithm are weak against quantum attack [4].

∗ Corresponding author at: College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, China.
E-mail address: lichaoyang@zzuli.edu.cn (C. Li).
https://doi.org/10.1016/j.sysarc.2025.103362
Received 9 December 2024; Received in revised form 13 January 2025; Accepted 6 February 2025
Available online 15 February 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
C. Li et al. Journal of Systems Architecture 160 (2025) 103362
of target search, which brings threats to symmetric cryptographic algorithms, for example Elliptic Curve Cryptography (ECC), by decreasing the search complexity from O(N) to O(√N) [5]. The Shor quantum algorithm can achieve exponential acceleration for large integer factorization [6], which brings threats to asymmetric cryptography, for example RSA. In recent years, post-quantum cryptographic algorithms have gained much attention in the areas of scientific research, finance, and industry [7]. Currently, code-based cryptography, hash cryptography, lattice cryptography, and multivariate-quadratic-equations cryptography are some famous post-quantum cryptographic (PQC) algorithms. Code-based cryptography was first proposed by McEliece [8] and is constructed from error correction codes. Although this cryptosystem has a significant anti-quantum attack advantage, its key size disadvantage makes it unsuitable for IoT systems. Hash cryptography was initially introduced by Lamport [9] and is known as the one-way function approach to providing quantum-proof security. The Merkle tree is another well-known hash-based cryptosystem [10]. These hash-based algorithms are not based on solving hard mathematical problems, but they can obtain the properties of one-wayness, collision resistance, and preimage resistance. Lattice cryptography is one of the suggested PQC schemes in the NIST call, and was first proposed by Ajtai [11]. Multivariate-quadratic-equations cryptography is another kind of PQC that is based on the complexity of solving multivariate equations [12]. This kind of PQC algorithm suffers from efficiency hardship due to its large key size and ciphertext overhead.

This paper focuses on the needs of security and integrity, and proposes a lattice-based ID-DVS scheme to cover privacy-preserving issues such as the designated verifier, signer's anonymity, and signature non-delegatability in the BIoMT system. The contributions are summarized as follows.

• A lattice-based ID-DVS scheme has been proposed. This is the first ID-DVS scheme constructed with rejection sampling over the Gaussian distribution and the SIS lattice problem. The identity mechanism in this ID-DVS provides transaction traceability for medical data-sharing, and the designated verifier setting protects user privacy, as unauthorized users cannot access the transaction.
• The security proof of the proposed ID-DVS scheme is given. In the random oracle model, this ID-DVS scheme can be proved to satisfy the security properties of anonymity and unforgeability. Meanwhile, this ID-DVS scheme can resist quantum attack under the lattice assumption, which can prevent the quantum adversary in the future quantum computer age.
• The efficiency comparison and performance analysis are presented. The key size, time consumption, and energy consumption are calculated and compared with other similar schemes. The results show that this ID-DVS scheme is more efficient and can well support secure medical data-sharing among different BIoMT systems.

Next, the related work is given in Section 2, some preliminaries are shown in Section 3, the ID-DVS scheme is proposed in Section 4, the security of the ID-DVS scheme is analyzed and proved in Section 5, the performance analysis is in Section 6, and the conclusion is in Section 7.

2. Related work

This paper mainly focuses on the research and applications of blockchain cryptography in BIoMT. Some reviews of blockchain cryptography for BIoMT, PQC, and lattice-based signature theory related to this theme are given in the following subsections.

2.1. Blockchain cryptography for BIoMT

In the BIoMT system, identity authentication, data encryption/decryption, and transaction verification all need blockchain cryptography algorithms to protect privacy security in the medical data-sharing processes. For identity authentication, Jia et al. [13] constructed a privacy-aware authentication model with blockchain and proposed two authentication protocols, based on ECC and a physically unclonable function algorithm respectively, to enhance privacy security in the IoMT ecosystem. Lin et al. [14] proposed a mutual user authentication protocol with the ECC algorithm, which could achieve legal user authentication in blockchain-based IoMT networking. Chen et al. [15] designed a certificateless aggregate signcryption scheme based on ECC to protect data privacy in IoT applications, but it could not provide anti-quantum attack security. Han et al. [16] introduced a blockchain-based privacy-preserving framework and a public key searchable encryption scheme to strengthen data traceability. Zou et al. [17] introduced a credential-embedded authentication protocol to protect users' privacy and designed an authenticated key agreement protocol to support bilateral authentication for medical data-sharing through IoMT systems. For data encryption/decryption, Guo et al. [18] presented an attribute-based encryption protocol with a ciphertext policy and set an outsourced online/offline revocable mechanism to guarantee fine-grained access control. Li and Dong et al. [19] gave a keyword-searchable encryption scheme to achieve cross-institution medical data utilization and established an on-chain ledger and off-chain storage model to reduce ledger redundancy. Liu et al. [20] designed a certificateless public key encryption protocol based on high-consumption bilinear pairing, combining the keyword search function to protect medical data in IoMT. Qu et al. [21] introduced an interesting work of quantum blockchain to improve privacy security in IoMT, which utilized quantum signature and quantum identity authentication to achieve secure medical data-sharing with the quantum cloud. For transaction verification, Mao et al. [22] presented an identity-based aggregated signature scheme for IoMT, which could enable efficient local verification of medical data with a locally verifiable mechanism. Zhang et al. [23] proposed a certificateless signcryption protocol to guarantee privacy security in IoMT, which utilized bilinear pairings and zero-knowledge proof to resist super-level internal adversaries. Li et al. [24] proposed a designated verifier signature scheme and established a cross-chain medical data-sharing framework to support secure and efficient data-sharing among different BIoMT systems.

With the deepening application of blockchain in BIoMT, research on blockchain cryptographic algorithms applicable to medical data-sharing transactions is also more urgent. Most of these BIoMT systems are also based on RSA and ECC cryptographic algorithms, which are vulnerable to quantum attacks. So it is urgent to seek more secure anti-quantum cryptographic algorithms to equip current BIoMT systems.

2.2. Post-quantum cryptography

PQC utilizes classical computationally hard problems to construct quantum-safe cryptosystems for current information systems. Especially for the sensitive information protection of medical data in BIoMT systems, the practical application of PQC is important and necessary. For code-based cryptography, Thiers et al. [25] presented a decoding algorithm based on q-ary codes, which could achieve low complexity and anti-quantum security. Alahmadi et al. [26] introduced a signature scheme with error-correcting codes for blockchain-based networks and utilized bounded distance decoding for signature verification. For hash cryptography, Punithavathi et al. [27] established a double-layer encryption framework and proposed a crypto hash algorithm to resist malware attacks in medical data-sharing processes in the IoMT system. Kuznetsov et al. [28] gave a performance analysis of the hashing algorithm in blockchain-based systems and compared it with other related hashing algorithms to show its efficiency and practicality. For lattice cryptography, Ye et al. [29] designed a traceable ring signature scheme based on a lattice assumption for IoMT, which could obtain tag-linkability and exculpability in the random oracle model. Bagchi et al. [30] utilized the ring-LWE problem to construct an
Table 1
Lattice-based schemes comparison.

Ref.                      | Lattice problem        | Advantage                                             | Limitation
--------------------------|------------------------|-------------------------------------------------------|-----------------------------------------------------
Kim et al. [33]           | NTRU                   | Key encapsulation; Randomness-recovery; Encoding      | Centralized KGC; Key escrow; Chosen ciphertext attack weakness
Yu et al. [35]            | NTRU and SIS           | Certificateless; Ring signature                       | Private key management
Li and Jiang et al. [34]  | ring-LWE and SIS       | Non-delegatability; Bimodal Gaussians                 | Centralized KGC; Key escrow
Yao et al. [36]           | ring-LWE and ring-ISIS | Ring analog; Authenticated ciphertext                 | Centralized KGC; Key escrow
Zhang et al. [37]         | ring-LWE and SIS       | Non-delegatability; Chameleon hash                    | Centralized KGC; Key escrow
Zhang and Sun et al. [38] | ring-LWE               | Re-signature; Semi-trusted proxy; Signature evolution | Centralized KGC; Key escrow; Double time consumption
aggregate signature scheme and applied this scheme to the Internet of drones for privacy preservation. For multivariate-quadratic-equations cryptography, Shim et al. [31] proposed a post-quantum signature scheme with multivariate quadratic equations, which supported dramatic online signing for cryptographic systems. These four PQC proposals are not only generally used for creating encryption/decryption and digital signature algorithms, but also for key exchange and authentication cryptosystems in the not-too-distant future.

This paper plans to utilize lattice theory to construct a PQC signature algorithm, as the digital signature plays an essential role in transaction signing, blockchain system consistency, and data ownership confirmation in BIoMT systems.

2.3. Lattice-based signature theory

Lattice cryptography serves as one promising PQC theory that has gained much attention in recent years. Its security is also based on NP-hard problems, such as the shortest vector problem (SVP), shortest independent vectors problem (SIVP), closest vector problem (CVP), short integer solution (SIS), learning with errors (LWE), bounded distance decoding problem (BDD), and so on [32]. The Number Theory Research Unit (NTRU) algorithm is based on SVP or SIVP, and is designed over a polynomial ring. The scheme in Ref. [19] is based on this mechanism. Kim et al. [33] introduced a key encapsulation mechanism with the NTRU lattice, which could resist significant cryptanalytic attacks in current information systems. The LWE is a CVP whose hardness lies in solving linear equations with noise. The scheme in Ref. [29] is based on this mechanism. Li and Jiang et al. [34] proposed a group signature scheme with the SIS lattice problem, which had been applied to the IoMT system with blockchain technology for secure medical data-sharing. Yu et al. [35] designed an NTRU-based certificateless ring signature for electronic voting, which could obtain the properties of quantum immunity, unconditional anonymity, and unforgeability. The ring-LWE is a variant of LWE that has more strengthened security properties. The schemes in Ref. [30] are based on this mechanism. Yao et al. [36] designed a public-key authenticated encryption protocol with ring-LWE in the ideal lattice, which also could achieve keyword search ability in cloud computing. Zhang et al. [37] proposed a DVS scheme with the chameleon hash and without trapdoors, which could achieve non-delegatability. Zhang and Sun et al. [38] presented an ID-DVS scheme with a function of signature evolution, which also added the proxy and re-signature functions. The simple comparisons of these lattice-based schemes are shown in Table 1.

As in BIoMT, the protection of sensitive information in medical data is essential in the medical utilization processes among different medical institutions. Meanwhile, the threats to classical cryptographic algorithms from quantum computers should be taken more seriously. Therefore, this paper addresses security and privacy issues related to system users and medical data by proposing a quantum-safe ID-DVS scheme to strengthen the security of medical data-sharing in BIoMT systems.

3. Preliminaries

The lattice theories, ID-DVS scheme model, and security model are presented in this section.

3.1. Lattice theories

Definition 1 (Lattice [39]). Let v1, …, vn ∈ R^m be a set of linearly independent vectors. The lattice Λ generated by v1, …, vn is the set formed by integer linear combinations of the vectors v1, …, vn:

Λ = {a1·v1 + a2·v2 + ⋯ + an·vn | a1, a2, …, an ∈ Z}   (1)

Here, the matrix A = (a1, …, am) ⊂ R^{n×m} is the coefficient matrix of the lattice Λ, where the dimension n and rank m of this lattice satisfy m = O(n log q).

Definition 2 (q-ary Lattice [39]). Eq. (2) defines the q-ary lattices, which are constructed by a matrix A ∈ Z_q^{n×m}, a prime number q, and a vector μ ∈ Z_q^n:

Λ⊥(A) = {x ∈ Z^m | Ax = 0 mod q}
Λ⊥_μ(A) = {x ∈ Z^m | Ax = μ mod q}   (2)

Definition 3 (Gaussian Distribution [40]). The Gaussian function is ρ_{c,σ}(x) = exp(−(x − c)² / (2σ²)), where σ ∈ R is the standard deviation, c ∈ R is the center, and x ∈ R. More generally, it can be defined as ρ_{c,σ}(x) = exp(−‖x − c‖² / (2σ²)) with x, c ∈ R^n. When the center c = 0, it is written ρ_σ(x). Meanwhile, D_σ(x) = ρ_σ(x)/ρ_σ(Z) is the discrete Gaussian distribution over Z, and D_σ(x) = ρ_σ(x)/ρ_σ(Z^m) is the general situation over Z^m.

Definition 4 (SIS_q^{κ,n,m,β} Problem [40]). The SIS_q^{κ,n,m,β} problem is defined to find a non-zero v ∈ ℜ_q^m which satisfies Av = 0 and ‖v‖₂ ≤ β, where ℜ is a ring, κ is a distribution over ℜ_q^{n×m}, and A ← κ.

Definition 5 (SamplePre(A, T, σ, y) [40]). Given a matrix A ∈ Z_q^{n×m}, a trapdoor basis T of the lattice Λ⊥(A), σ = L · ω(√(log n)), and a random vector y, SamplePre(A, T, σ, y) can derive a non-zero vector e ∈ Z_q^m which satisfies Ae = y mod q. Here, ‖e‖ ≤ σ√m.

3.2. Model descriptions

The scheme model and security model are given in this subsection, and they provide the formal definition of an ID-DVS scheme.

(1) Scheme model

For an ID-DVS scheme, it is mainly composed of five polynomial-time algorithms.
C. Li et al. Journal of Systems Architecture 160 (2025) 103362
• Setup(1^n): Input the security parameter n; the key generation center (KGC) outputs the system parameters pp and the system master secret key msk.
• KeyGen.(ID_a, ID_b, pp, msk): Input the identities ID_a and ID_b of the signer and designated verifier, pp, and msk; KGC generates the key pairs (pk_a, sk_a) and (pk_b, sk_b), respectively.
• Sign(pp, sk_a, pk_a, pk_b, μ): Input the message μ, pp, (pk_a, sk_a), and the designated verifier's public key pk_b; the signer generates an ID-DVS signature (e, μ).
• Verify(sk_b, pk_b, pk_a, μ, e): Input (e, μ), pp, (pk_b, sk_b), and the signer's public key pk_a; the designated verifier checks the legality of the ID-DVS signature.
• Simulation(pp, sk_b, pk_b, pk_a, μ): Input the message μ, pp, (pk_b, sk_b), and the signer's public key pk_a; the designated verifier generates another ID-DVS signature (e′, μ).

(2) Security model

An ID-DVS scheme must satisfy correctness, anonymity, and unforgeability. The correctness can be verified according to the verification process. The anonymity and unforgeability should be proved in the random oracle model as shown in the following Definitions 6 and 7, respectively. Note that only by passing this certification can it be shown that the designed ID-DVS scheme is safe. Next, the security proof model is constructed as a query-respond game, where an adversary Eve (E) performs the queries and a challenger Charlie (C) performs the responses.

Definition 6 (Anonymity). If an adversary can make the right guess whether the signature is signed by the signer or the designated verifier with the adaptive selective identity attack in the random oracle model, he wins this round of the query-respond game. Detailed query-respond processes between E and C are shown as follows.

• Initialize: C performs the Setup(1^n) algorithm to obtain the system parameters pp and the master secret key msk. Then, he exposes pp and keeps msk secret.
• Query: E can perform polynomially many queries on the random oracle. Here, the hash function, secret key, and signature are all query targets. E can perform queries on a non-target user's identity ID or a non-target message μ. C responds with the answers to the queries if the answers already exist. Otherwise, C executes the KeyGen. or Sign algorithms to generate new answers to E's queries.
• Challenge: E selects two target system users' identities ID_{i0} and ID_{i1} and queries the signature about these two identities. Next, C randomly chooses the identity ID_{ib}, b ∈ {0, 1}, as the signer and the other one as the designated verifier, derives the ID-DVS (e, μ) according to the processes of the KeyGen. and Sign algorithms, and sends it back to E.
• Guess: E performs the guess of b′. If b′ = b, E wins this game. Here, the successful guess rate of E can be defined as shown in Eq. (3).

Adv_A^Anon = Pr[E succeeds]   (3)

This anonymity increases the probability that the adversary will fail to attack the signature because he cannot determine whether the signer or the designated verifier is the real signer. Meanwhile, the designated verifier cannot prove to third parties that this signature is valid. This mechanism can protect user privacy in medical data-sharing transactions and prevent the designated verifier from authorizing other users to access the signature.

Definition 7 (Unforgeability). If an adversary can forge a valid signature with the adaptive selective message attack in the random oracle model, a challenger can derive another valid signature and solve the lattice assumption with these two signatures. Here, the successful probability of this challenger is non-negligible. Detailed query-respond processes between E and C are shown below.

• Initialize: C performs the Setup(1^n) algorithm to obtain the system parameters pp and the master secret key msk. Then, he exposes pp and keeps msk secret.
• Query: E can perform polynomially many queries on the random oracle. Here, the hash function, secret key, and signature are all query targets. E can perform queries on a non-target user's identity ID or a non-target message μ. C responds with the answers to the queries if the answers already exist. Otherwise, C executes the KeyGen. or Sign algorithms to generate new answers to E's queries.
• Forge: E utilizes these queried answers to generate a valid signature (e*, μ*) for the target user's identity ID* and message μ*, and exposes this signature.
• Challenge: C also can execute the signature processes legally and derive another valid signature (e′, μ*) for the target user's identity ID* and message μ*. Then, C utilizes these two valid signatures about the same message μ* to solve the Z-SIS_q^{κ,n,m,β} instance.
• Analyze: This step analyses two points. One is the probability that C can find a solution for the Z-SIS_q^{κ,n,m,β} instance, and the other is the probability that E successfully generates a valid ID-DVS signature. Here, the success rate of E can be defined as shown in Eq. (4).

Adv_A^Forge = Pr[E succeeds]   (4)

This unforgeability ensures that no one other than the signer can generate a legitimate signature, thus improving the security of the medical data-sharing process among different BIoMT systems.

4. The ID-DVS scheme

This ID-DVS scheme is constructed with the lattice assumption of SIS_q^{κ,n,m,β}. To improve the computational efficiency, the lattice assumption is reduced from R to Z, and the new lattice assumption Z-SIS_q^{κ,n,m,β} does not decrease the hardness. The parameter definitions are shown in Table 2. This scheme mainly contains the five algorithms Setup, KeyGen., Sign, Verify, and Simulation. The simple framework of this ID-DVS scheme is shown in Fig. 1, and details of these algorithms are described as follows.

Table 2
System parameters.
Notation | Meaning
q | One large prime with q = q(n) ≥ 3
n, m | The dimensions of the key matrix, with m ≥ 5n log q
κ | The system security parameter
Z | The integer matrix/vector set for system keys
σ | A system parameter with σ = L · ω(√(log n))
mpk | The group public key
msk | The group master secret key
ID_i | The user identity
H1, H2 | The cryptographic hash functions
D_σ^m | The bimodal Gaussian distribution
σ | The standard deviation for D_σ^m
μ | The message to be signed
pk, sk | The public and private keys for system users

4.1. Setup

Some system parameters are preset according to the setting principle in Ref. [41], where n is the security parameter, q is a prime number which satisfies q = q(n) ≥ 3, m is a positive integer which satisfies m ≥ 5n log q, L = O(√(n log q)), and σ = L · ω(√(log n)).
Fig. 1. The simple framework of ID-DVS scheme.
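Before the algorithm details, the algebraic identity that Verify relies on (expanded later in Eq. (5)) can be checked on a toy instance: since a_IDi = A·s_IDi mod q, we have A(e + s_ID2) − a_ID1 = Ax + a_ID2 mod q. Everything below (modulus, dimensions, coefficient ranges) is illustrative and far too small to be secure; hashing and the rejection step are omitted.

```python
import random

def matvec_mod(A, v, q):
    # (A v) mod q for a matrix given as a list of rows
    return [sum(a * b for a, b in zip(row, v)) % q for row in A]

random.seed(1)
q, n, m = 97, 4, 8
A  = [[random.randrange(q) for _ in range(m)] for _ in range(n)]
s1 = [random.randrange(-2, 3) for _ in range(m)]   # short secret of the signer
s2 = [random.randrange(-2, 3) for _ in range(m)]   # short secret of the verifier
a1 = matvec_mod(A, s1, q)                          # a_ID1 = A s1 mod q
a2 = matvec_mod(A, s2, q)                          # a_ID2 = A s2 mod q

x = [random.randrange(-5, 6) for _ in range(m)]    # signing randomness
e = [xi + si for xi, si in zip(x, s1)]             # e = x + s_ID1

# Left side of the verification identity: A(e + s2) - a1 mod q
Ae_s2 = matvec_mod(A, [ei + si for ei, si in zip(e, s2)], q)
lhs = [(u - v) % q for u, v in zip(Ae_s2, a1)]
# Right side: A x + a2 mod q
rhs = [(u + v) % q for u, v in zip(matvec_mod(A, x, q), a2)]
assert lhs == rhs   # holds for every choice of x, s1, s2
```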
(1) KGC generates a matrix mpk = A ∈ Z_q^{n×m} with the former system parameters by the trapdoor generation (TrapGen.(1^n)) algorithm, which is an approximately random distribution matrix. Then, a basis T ∈ Z_q^{m×m} of Λ⊥(A) with ‖T̃‖ ≤ L is derived by TrapGen.(1^n);
(2) Chooses hash functions H1, H2: {0, 1}* → Z_q^n;
(3) Outputs pp = {A, H1, H2} as the public system parameters;
(4) Serves mpk = A as the master public key and msk = T as the master secret key.

4.2. KeyGen

Given the system parameter pp and a user's identity ID_i:
(1) KGC computes a_IDi = H1(ID_i) ∈ Z_q^n;
(2) Computes s_IDi ← SamplePre(A, T, a_IDi, σ) ∈ Z_q^m, where σ ≥ ‖T̃‖ · ω(√(log m)), a_IDi mod q = A·s_IDi, and ‖s_IDi‖ ≤ σ√m;
(3) Outputs pk = a_IDi as the public key and sk = s_IDi as the secret key for the system user with ID_i.

For the signer and designated verifier in this ID-DVS scheme, the signer's key pair is set as (pk1, sk1) = (a_ID1, s_ID1) and the designated verifier's key pair is set as (pk2, sk2) = (a_ID2, s_ID2). Then, they will work together to generate a legitimate ID-DVS with the following steps.

4.3. Sign

Given the system parameter pp and message μ:
(1) The signer ID1 randomly chooses x ∈ D_σ^m;
(2) Computes c = H2(Ax + a_ID2 mod q, μ);
(3) Utilizes his secret key sk1 to compute e = x + s_ID1;
(4) Outputs the signature ⟨e, c⟩ with probability min(D_σ^m(e) / (M · D_{s_ID1·c, σ}^m(e)), 1); otherwise, restarts.

This is a probabilistic algorithm, and M is some fixed positive real that is set large enough to ensure that the preceding probability is always at most 1. If there is no output, the signer will repeat these signing processes until a legal ID-DVS is generated.

4.4. Verify

When receiving the ID-DVS from the signer, the designated verifier utilizes pp, the signer's public key a_ID1, and his own private key sk2 = s_ID2 to verify the legality of (e, c) with message μ.
(1) The designated verifier checks whether ‖e‖ > L; if so, he rejects it;
(2) Checks whether ‖e‖∞ > q/4; if so, he rejects it;
(3) When the former conditions hold, he verifies whether c = H2(A(e + s_ID2) − a_ID1 mod q, μ) holds or not. Iff this condition holds, he accepts this signature; otherwise, he rejects it.

4.5. Simulation

This subsection presents the simulation of a new ID-DVS performed by the designated verifier. According to the former generation processes, he can derive a legal ID-DVS with the same message μ.
(1) Selects a random vector x′ ← D_σ^m;
(2) Computes c′ = H2(Ax′ + a_ID1 mod q, μ) with the system public key A and the same message μ;
(3) Computes e′ = x′ + s_ID2;
(4) Outputs the ID-DVS (e′, c′) with probability min(D_σ^m(e′) / (M · D_{s_ID2·c′, σ}^m(e′)), 1); otherwise, he restarts this algorithm.

Here, the simulated signature (e′, c′) is indistinguishable from the formerly generated signature (e, c) with the same message μ. This is the inherent quality of the DVS scheme which can prevent attacks from unauthorized verifiers. It can improve the security of cross-institution medical data-sharing through the BIoMT system.

5. Security analysis

The security analyses of the correctness, anonymity, and unforgeability of the proposed ID-DVS scheme are given in this section.

5.1. Correctness

According to the verification steps in the Verify algorithm, a valid ID-DVS shall satisfy three conditions. From the signature generation process, (e, c) satisfies ‖e‖ ≤ L and ‖e‖∞ ≤ q/4, which are easily verified. The third condition c = H2(A(e + s_ID2) − a_ID1 mod q, μ) = H2(Ax + a_ID2 mod q, μ) holds, which can be verified by the equation A(e + s_ID2) − a_ID1 = Ax + a_ID2 mod q. Eq. (5) shows the detailed verification process.

A(e + s_ID2) − a_ID1 = A(x + s_ID1 + s_ID2) − a_ID1
                     = Ax + A·s_ID1 + A·s_ID2 − a_ID1
                     = Ax + a_ID1 + a_ID2 − a_ID1
                     = Ax + a_ID2   (5)

Meanwhile, the signature (e′, c′) simulated by the designated verifier also can be verified by the signer, as the conditions ‖e′‖ ≤ L, ‖e′‖∞ ≤ q/4, and the equation c′ = H2(A(e′ + s_ID1) − a_ID2 mod q, μ) = H2(Ax′ + a_ID1 mod q, μ) hold, which is shown in Eq. (6).

A(e′ + s_ID1) − a_ID2 = A(x′ + s_ID2 + s_ID1) − a_ID2
                      = Ax′ + A·s_ID2 + A·s_ID1 − a_ID2
                      = Ax′ + a_ID2 + a_ID1 − a_ID2
                      = Ax′ + a_ID1   (6)

5.2. Anonymity

Theorem 1. The proposed ID-DVS can capture anonymity with the lattice assumption Z-SIS_q^{κ,n,m,β} if no adversary can correctly distinguish the real signer with non-negligible probability.

Proof. According to Definition 6, E attempts to distinguish the real signer by performing queries on the hash, secret key, and sign algorithms under the adaptively chosen identity attack. Here, E can execute enough queries on the three algorithms to obtain information about the non-target identities in polynomial time. Meanwhile, the probability that E wins one round of the query-respond game is defined as at least ζ. Then, C generates a signature with the target identity ID* and lets E guess the real signer. Detailed query-respond processes are shown as follows.

• Initialize: C executes the Setup algorithm to generate the system parameters (n, m, q, κ, σ) and sends them to E.
• Query: E adaptively chooses the non-target identity to query with C.
  – H1 query: E adaptively chooses the non-target identity ID_i to query on the H1 function. C owns a list List_H1 to store (ID_i, a_IDi). When he obtains the query, he first searches the list List_H1 whether the identity ID_i is queried or not. If it exists, the result (ID_i, a_IDi) is returned back to E. If not, C computes the corresponding a_IDi = H1(ID_i), returns the result (ID_i, a_IDi) back to E, and records this result into the list List_H1.
  – H2 query: E adaptively chooses a message μ_i to query on the H2 function. C owns a list List_H2 to store (μ_i, c_i). When he obtains the query, he first searches the list List_H2 whether the message μ_i is queried or not. If it exists, the result (μ_i, c_i) is returned back to E. If not, C randomly selects x ∈ D_σ^m, computes the corresponding c_i = H2(Ax mod q, μ_i), returns the result (μ_i, c_i) back to E, and records this result into the list List_H2.
  – Secret key query: E adaptively chooses the non-target identity ID_i to query on the secret key. C owns a list L_K to store (s_IDi, ID_i). When he obtains the query, he first searches the list L_K whether the identity ID_i is queried or not. If it exists, the result (s_IDi, ID_i) is returned back to E. If not, C obtains (ID_i, a_IDi) from the list List_H1 or regenerates it first. Next, C computes the corresponding s_IDi ← SamplePre(A, T, a_IDi, σ), returns the result (s_IDi, ID_i) back to E, and records this result into the list L_K.
  – Signature query: E adaptively chooses a message μ_i to query on the signature. C owns a list L_S to store (e, c_i). When he obtains the query, he first searches the list L_S whether the message μ_i is queried or not. If it exists, the result (e, c_i, μ_i) is returned back to E. If not, C obtains (μ_i, c_i) from the list List_H2 or regenerates it first. Next, C computes the corresponding e = x + s_ID1, where ID1 is set as the signer and ID2 is set as the designated verifier. Then, he returns the result (e, c_i) back to E, and records this result into the list L_S.
• Challenge: E randomly selects two system users' identities ID_{i0} and ID_{i1} which are not queried before. Next, he sends these two target identities to C. C randomly selects the identity ID_{ib}, b ∈ {0, 1}, as the signer and the other one as the designated verifier, derives the ID-DVS signatures (e, c_{i0}) and (e′, c_{i1}) according to the ID-DVS processes, and sends them back to E.
• Guess: E utilizes the formerly obtained messages and performs the guess of the signer b′. C confirms whether ID_{ib′} is the real signer or not. If correct, E wins this game.
• Analyze: Because the parameter x is randomly selected with the same Gaussian distribution D_σ^m, the statistical distance of c_{i0} and c_{i1} is indistinguishable. Therefore, the statistical distance of these two signatures (e, c_{i0}) and (e′, c_{i1}) generated by e = x + s_{ID_{i0}} and e′ = x + s_{ID_{i1}} is also indistinguishable. That is to say, E cannot distinguish the correct signer of these two signatures, and the proposed ID-DVS can guarantee the signer's anonymity.

5.3. Unforgeability

Theorem 2. The proposed ID-DVS can capture unforgeability with the lattice assumption Z-SIS_q^{κ,n,m,β} if no adversary can generate a valid signature with non-negligible probability.

Proof. According to Definition 7, E attempts to derive a valid signature by performing queries on the hash, secret key, and sign algorithms under the adaptively chosen message attack. Here, E can execute enough queries on the three algorithms to obtain information about the non-target messages in polynomial time. Meanwhile, the probability that E wins one round of the query-respond game is defined as at least ξ. Then, C attempts to utilize this forged signature to solve the lattice instance Z-SIS_q^{κ,n,m,β}. Detailed query-respond processes are shown as follows.
• Initialize: C executes the Setup algorithm to generate the system parameters (n, m, q, κ, σ) and sends them to E.
• Query: E adaptively chooses the non-target messages to query with C.
  – H1 query: E adaptively chooses the identity ID_i to query on the H1 function. C owns a list List_H1 to store (ID_i, a_IDi). When he obtains the query, he first searches the list List_H1 whether the identity ID_i is queried or not. If it exists, the result (ID_i, a_IDi) is returned back to E. If not, C computes the corresponding a_IDi = H1(ID_i), returns the result (ID_i, a_IDi) back to E, and records this result into the list List_H1.
  – H2 query: E adaptively chooses the non-target message μ_i to query on the H2 function. C owns a list List_H2 to store (μ_i, c_i). When he obtains the query, he first searches the list List_H2 whether the message μ_i is queried or not. If it exists, the result (μ_i, c_i) is returned back to E. If not, C randomly selects x ∈ D_σ^m, computes the corresponding c_i = H2(Ax mod q, μ_i), returns the result (μ_i, c_i) back to E, and records this result into the list List_H2.
  – Secret key query: E adaptively chooses the identity ID_i to query on the secret key. C owns a list L_K to store (s_IDi, ID_i). When he obtains the query, he first searches the list L_K whether the identity ID_i is queried or not. If it exists, the result (s_IDi, ID_i) is returned back to E. If not, C obtains (ID_i, a_IDi) from the list List_H1 or regenerates it first. Next, C computes the corresponding s_IDi ← SamplePre(A, T, a_IDi, σ), returns the result (s_IDi, ID_i) back to E, and records this result into the list L_K.
  – Signature query: E adaptively chooses the non-target message μ_i to query on the signature. C owns a list L_S to store (e, c_i). When he obtains the query, he first searches the list L_S whether the message μ_i is queried or not. If it exists, the result (e, c_i, μ_i) is returned back to E. If not, C obtains (μ_i, c_i) from the list List_H2 or regenerates it first. Next, C computes the corresponding e = x + s_ID1, where ID1 is set as the signer and ID2 is set as the designated verifier. Then, he returns the result (e, c_i) back to E, and records this result into the list L_S.
• Forge: E can respectively perform q_H1, q_H2, q_K, and q_S queries on the algorithms of the H1 hash, the H2 hash, the secret key, and sign until obtaining enough information. With these query results, E can forge a valid signature (e*, c_i) about the target message μ*. Then, E returns it to C.
• Challenge: C first confirms that the secret key of the signature identity ID_i is not queried, the signature about message μ* is not queried, and the public keys (a_ID1, a_ID2) are derived by C. Then, C utilizes this forged signature (e*, c_i) to solve the Z-SIS_q^{κ,n,m,β} instance Ae = 0 mod q. He checks the list List_H2 and quits this game if (μ_i*, c_i) does not exist. Otherwise, he utilizes the same random vector x ∈ D_σ^m and derives a new valid signature (e′, c_i) according to the sign algorithm with the following two equations.

c_i = H2(A(e* + s_ID2) − a_ID1 mod q, μ*) = H2(Ax + a_ID2 mod q, μ*)
c_i = H2(A(e′ + s_ID2) − a_ID1 mod q, μ*) = H2(Ax + a_ID2 mod q, μ*)   (7)

According to the verification algorithm, it has:

A(e* + s_ID2) − a_ID1 = Ax + a_ID2 mod q
A(e′ + s_ID2) − a_ID1 = Ax + a_ID2 mod q   (8)

Then, it has:

Ae* − a_ID1 = Ax mod q
Ae′ − a_ID1 = Ax mod q   (9)

It also has:

A(e* − e′) = A(x − x) mod q   (10)

Due to x − x = 0, it can derive

A(e* − e′) = 0 mod q   (11)

Here, C quits this game if e* − e′ = 0. Otherwise, e* − e′ is a solution of the SIS instance Ae = 0 mod q.
• Analyze: There are two situations in which C quits the query-respond game. Therefore, the success rate is ξ/(q_H1 + q_H2 + q_K + q_S). This probability is negligible with the increase in query times. In addition, the lattice assumption is a non-deterministic polynomial problem that cannot be broken with current classical or quantum computational conditions.

From the former theoretical security proofs, the proposed ID-DVS scheme can obtain correctness, anonymity, and unforgeability. Meanwhile, this ID-DVS scheme can also satisfy post-quantum security as it is constructed with a lattice assumption. Compared with other classical cryptography algorithm-based BIoMT systems, this scheme can well guarantee anti-quantum security for medical data-sharing among different medical institutions.

6. Performance analysis

The performance analyses of this ID-DVS scheme from the theory and simulation aspects are given in this section.

6.1. Theoretical analysis

In this phase, six items are selected for comparison, where the assumption is the lattice assumption, mpk is the system master public key, msk is the system master secret key, pk is the system user's public key, sk is the system user's private key, and signature is the size of the proposed signature. The comparison results are shown in Table 3. Firstly, the schemes in Refs. [24,34] and this proposed scheme are based on the Z-SIS problem, the schemes in Refs. [29,30] are based on ring-LWE, and the scheme in Ref. [35] is based on the NTRU lattice. Secondly, the sizes of mpk, msk, pk, and sk are in relation to the parameters m, n, and q. Then, the sizes of the signatures in these schemes also depend on the scalar factor σ and the ring number N. In Ref. [29] and Ref. [30], the signature size increases with the ring number, which will affect the efficiency of the signature algorithm. Here, there are no results about mpk and msk in Refs. [24,34], as the algorithms of Setup and KeyGen. in these two references are not divided. These theoretical comparisons and analyses show that the proposed ID-DVS has certain advantages over those in the other five related schemes.

Meanwhile, the theoretical analyses of the time costs of the Setup, KeyGen, Sign, and Verify algorithms are presented in Table 4, where T_Trap represents the time cost of the trapdoor algorithm, T_Sam represents the Gaussian SamplePre algorithm, T_Mul represents the scalar multiplication algorithm, and T_H represents the hash algorithm. Here, some high-time-consuming algorithms and steps have been selected for comparison, and some other addition or modular operations that are low-time-consuming are not considered. The Setup and KeyGen algorithms can be prepared in advance, which can save time costs, so the time consumption of the other algorithms affects the efficiency more. In the proposed ID-DVS scheme, the time costs of the KeyGen and Sign algorithms are lower than those of the other schemes. From these comparison results, it can be derived that the proposed ID-DVS has certain advantages over those in the other five related schemes.
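As a numeric sanity check of the extraction step in the unforgeability proof of Section 5.3 (Eqs. (10)-(11)): if two distinct short vectors share the same syndrome under A, their difference is a non-zero solution of the homogeneous SIS instance. The toy sketch below plants a short kernel vector to stand in for the trapdoor that guarantees two preimages exist; all parameters are illustrative and far too small to be secure.

```python
import random

def matvec_mod(A, v, q):
    # (A v) mod q for a matrix given as a list of rows
    return [sum(a * b for a, b in zip(row, v)) % q for row in A]

random.seed(3)
q, n, m = 97, 4, 8
# Plant a short kernel vector k (last entry 1) so that two distinct short
# preimages of the same syndrome exist; in the actual proof, the trapdoor
# basis T plays this role.
k = [random.randrange(-2, 3) for _ in range(m - 1)] + [1]
cols = [[random.randrange(q) for _ in range(n)] for _ in range(m - 1)]
last = [(-sum(cols[j][i] * k[j] for j in range(m - 1))) % q for i in range(n)]
A = [[cols[j][i] for j in range(m - 1)] + [last[i]] for i in range(n)]

e1 = [random.randrange(-5, 6) for _ in range(m)]     # "forged" signature vector
e2 = [a + b for a, b in zip(e1, k)]                  # challenger's re-signed vector
assert matvec_mod(A, e1, q) == matvec_mod(A, e2, q)  # same syndrome, as in Eq. (10)
v = [a - b for a, b in zip(e1, e2)]                  # e1 - e2 = -k, non-zero and short
assert any(v) and matvec_mod(A, v, q) == [0] * n     # SIS solution, as in Eq. (11)
```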
Table 3
Keys size comparison.
Ref. | Assumption | mpk | msk | pk | sk | signature
Li et al. [24] | Z-SIS | – | – | mn log 2q | mn log 2q | 2m log(12σ)
Ye et al. [29] | Ring-LWE | mn log q | n(m−n) log q | n log q | m log q | 2m log(12σ) + N log 3
Bagchi et al. [30] | Z-SIS | 2m log q | m log q | 2m log q | m log q | 2Nm log(12σ)
Li and Jiang et al. [34] | Ring-LWE | – | – | mn log 2q | mn log 2q | 2m log(12σ)
Yu et al. [35] | NTRU | m log q | 4n² log q | m log q | 2n log q | 2m log(2σ)
This scheme | Z-SIS | mn log q | mm log q | n log q | m log q | 2m log(12σ)

Table 4
Time costs comparison.
Items | Setup | KeyGen. | Sign | Verify
Li et al. [24] | 2T_Trap | – | 2T_Mul + T_H | 3T_Mul + T_H
Ye et al. [29] | T_Trap | T_Sam + T_Mul | T_Sam + 7T_Mul + 3T_H | 5T_Mul + 2T_H
Bagchi et al. [30] | 2T_Trap | 3N·T_Mul + N·T_H | 3N·T_Mul + N·T_H | 2T_Mul + T_H
Li and Jiang et al. [34] | 2N·T_Trap | – | 5T_Mul + 2T_H | 3T_Mul + T_H
Yu et al. [35] | T_Trap | N·T_Sam + 2N·T_Mul + 2N·T_H | 3T_Mul + T_H | 6T_Mul + 4T_H
This scheme | T_Trap | T_Sam + T_H | 2T_Mul + T_H | 4T_Mul + T_H
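Assuming purely illustrative unit timings for T_Trap, T_Sam, T_Mul, and T_H (the measured values in the paper follow the principle of Ref. [40], not these numbers), the Sign and Verify expressions of Table 4 can be totaled per scheme:

```python
# Hypothetical unit timings in milliseconds; N is the ring number preset to 3.
T = {"Trap": 50.0, "Sam": 5.0, "Mul": 1.0, "H": 0.2}
N = 3

# Sign and Verify cost expressions transcribed from Table 4
sign_verify = {
    "Li et al. [24]":           (2 * T["Mul"] + T["H"],                3 * T["Mul"] + T["H"]),
    "Ye et al. [29]":           (T["Sam"] + 7 * T["Mul"] + 3 * T["H"], 5 * T["Mul"] + 2 * T["H"]),
    "Bagchi et al. [30]":       (3 * N * T["Mul"] + N * T["H"],        2 * T["Mul"] + T["H"]),
    "Li and Jiang et al. [34]": (5 * T["Mul"] + 2 * T["H"],            3 * T["Mul"] + T["H"]),
    "Yu et al. [35]":           (3 * T["Mul"] + T["H"],                6 * T["Mul"] + 4 * T["H"]),
    "This scheme":              (2 * T["Mul"] + T["H"],                4 * T["Mul"] + T["H"]),
}
for scheme, (t_sign, t_verify) in sign_verify.items():
    print(f"{scheme:26s} sign = {t_sign:6.1f} ms   verify = {t_verify:5.1f} ms")
```

Under any positive unit timings, the proposed scheme's Sign cost (2T_Mul + T_H) is tied for the smallest in the table, which matches the discussion above.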
Fig. 2. Keys size comparison (80-bit security level with parameter setting of n = 512, m = 3549, q = 223, and σ = 230; 192-bit security level with parameter setting of n = 1024, m = 8323, q = 227, and σ = 230).
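Taking the two parameter sets from the caption of Fig. 2 and reading log as log2, the size formulas of Table 3 for this scheme evaluate as follows. This is a rough sketch for orientation only; the paper's own unit and rounding conventions may differ.

```python
import math

# Parameter sets from the caption of Fig. 2 (80-bit and 192-bit levels)
params = {
    "80-bit":  dict(n=512,  m=3549, q=223, sigma=230),
    "192-bit": dict(n=1024, m=8323, q=227, sigma=230),
}

def sizes_bits(n, m, q, sigma):
    # Size formulas for "This scheme" in Table 3, in bits
    lq = math.log2(q)
    return {
        "mpk":       m * n * lq,                    # mn log q
        "msk":       m * m * lq,                    # mm log q
        "pk":        n * lq,                        # n log q
        "sk":        m * lq,                        # m log q
        "signature": 2 * m * math.log2(12 * sigma), # 2m log(12 sigma)
    }

for level, p in params.items():
    s = sizes_bits(**p)
    print(level + ": " + ", ".join(f"{k} = {v / 8 / 1024:.1f} KiB" for k, v in s.items()))
```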
6.2. Simulation evaluation

To more clearly compare the advantages and disadvantages of different schemes, the ID-DVS scheme has been executed with Matlab 2016b on a Windows 11 desktop with an Intel(R) Core(TM) i5-1240P 1.90 GHz and 16 GB RAM. Here, the system parameters are selected according to those in Ref. [39], which are presented in the caption of Fig. 2. Meanwhile, the signature size in Ref. [29] and Ref. [30] is in relation to the ring number N, which is preset as N = 3. With the ring number increasing, the signature size in these two references will increase. From the comparison results, the key sizes of pk and sk in this ID-DVS have a certain advantage over other schemes. Although mpk and msk are equal to or bigger than those in other schemes, this ID-DVS is constructed with the lattice assumption Z-SIS, which can provide a strong security guarantee. As the signing process is the main part of a signature scheme, the signature size is the smallest compared with these similar schemes, which can improve the algorithm execution efficiency.

Then, the simulations of the time consumption and energy consumption are shown in Fig. 3 and Fig. 4, respectively. Here, the time costs of the T_Trap, T_Sam, T_Mul, and T_H algorithms are set according to the principle in Ref. [40]. Then, the time-consuming results in Table 4 are calculated, and the results show that this ID-DVS scheme has obvious advantages over other similar schemes. Meanwhile, the simulated devices run at 3.2 V and 7.6 mA. With the former calculated time-consuming data, the energy-consuming results are calculated and shown in Fig. 4.

7. Conclusion

This paper contributes to privacy protection in the cross-chain health data-sharing process in BIoMT systems and introduces an MCF model with a DVS scheme. The MCF model is constructed with blockchain and relay chain technologies, which can support cross-chain health data-sharing and guarantee that data is not tampered with. The DVS is designed with lattice cryptography, which can resist quantum attacks. Meanwhile, the combination of the MCF model and DVS scheme can effectively improve the privacy security of system transactions and users. Then, it has been proved that the DVS scheme can satisfy the security requirements of unforgeability, anonymity, and non-traceability. The key size comparison shows that the proposed DVS scheme is efficient and ledger space-saving, the consumption
Fig. 3. Time-consuming comparison.
Fig. 4. Energy-consuming comparison.
comparison of time and energy shows that this DVS is more practical for cross-chain transactions, and the performance evaluations of cross-chain transactions show that the proposed MCF model is efficient and practical for BIoMT systems. These works provide a new solution for the data island and privacy protection issues in current IoMT systems and promote the cross-chain technology application in BIoMT systems. Moreover, there are still some research directions worth exploring, such as cross-chain identity authentication, secure secret sharing, data access control, and efficient data retrieval in cross-chain health data-sharing processes, which will become the possible research orientations in future work.

CRediT authorship contribution statement

Chaoyang Li: Writing – review & editing, Writing – original draft, Formal analysis, Conceptualization. Yuling Chen: Writing – review & editing, Supervision. Mianxiong Dong: Project administration, Investigation. Jian Li: Validation, Supervision. Min Huang: Validation, Supervision. Xiangjun Xin: Supervision, Funding acquisition. Kaoru Ota: Supervision, Formal analysis.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant Numbers 62272090, 72293583, 72293580, the Foundation of State Key Laboratory of Public Big Data under Grant PBD2023-25, the Foundation and Cutting-Edge Technologies Research Program of Henan Province (CN) under Grant Number 242102211073, the Japan Society for the Promotion of Science (JSPS) KAKENHI Grant Numbers JP22K11989, JP24K14910, Leading Initiative for Excellent Young Researchers (LEADER), MEXT, Japan, Japan Science and Technology Agency (JST) PRESTO Grant Number JPMJPR21P3, JST ASPIRE Grant Number JPMJAP2344, the Soroptimist Japan Foundation, and the Doctor Scientific Research Fund of Zhengzhou University of Light Industry under Grant 2021BSJJ033. Mianxiong Dong is the corresponding author.
Data availability
No data was used for the research described in the article.

View File

@@ -0,0 +1,998 @@
Journal of Systems Architecture 160 (2025) 103339
Contents lists available at ScienceDirect
Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc
REC: Enhancing fine-grained cache coherence protocol in multi-GPU systems
Gun Ko, Jiwon Lee, Hongju Kal, Hyunwuk Lee, Won Woo Ro
Yonsei University, 50 Yonsei-ro Seodaemun-gu, Seoul, 03722, Republic of Korea
ARTICLE INFO
Keywords: Multi-GPU, Data sharing, Cache coherence, Cache architecture
ABSTRACT
With the increasing demands of modern workloads, multi-GPU systems have emerged as a scalable solution, extending performance beyond the capabilities of single GPUs. However, these systems face significant challenges in managing memory across multiple GPUs, particularly due to the Non-Uniform Memory Access (NUMA) effect, which introduces latency penalties when accessing remote memory. To mitigate NUMA overheads, GPUs typically cache remote memory accesses across multiple levels of the cache hierarchy, which are kept coherent using cache coherence protocols. The traditional GPU bulk-synchronous programming (BSP) model relies on coarse-grained invalidations and cache flushes at kernel boundaries, which are insufficient for the fine-grained communication patterns required by emerging applications. In multi-GPU systems, where NUMA is a major bottleneck, substantial data movement resulting from the bulk cache invalidations exacerbates performance overheads. Recent cache coherence protocols for multi-GPUs enable flexible data sharing through coherence directories that track shared data at a fine-grained level across GPUs. However, these directories are limited in capacity, leading to frequent evictions and unnecessary invalidations, which increase cache misses and degrade performance. To address these challenges, we propose REC, a low-cost architectural solution that enhances the effective tracking capacity of coherence directories by leveraging memory access locality. REC coalesces multiple tag addresses from remote read requests within common address ranges, reducing directory storage overhead while maintaining fine-grained coherence for writes. Our evaluation on a 4-GPU system shows that REC reduces L2 cache misses by 53.5% and improves overall system performance by 32.7% across a variety of GPU workloads.
1. Introduction
Multi-GPU systems have emerged to meet the growing demands of modern workloads, offering scalable performance beyond what a single GPU can deliver. However, as multi-GPU architectures scale in size and complexity [1,2], managing memory across multiple GPUs becomes increasingly challenging [3–7]. One of the primary challenges arises from the bandwidth discrepancy between local and remote memory, commonly known as the Non-Uniform Memory Access (NUMA) effect [3,4]. To mitigate the NUMA penalty, GPUs generally rely on caching remote memory accesses, allowing them to be served with local bandwidth [5,8–10]. This caching strategy is often extended across multiple levels of the cache hierarchy, including both private on-chip caches and shared caches [3,4,11,12], to better accommodate the diverse access patterns of emerging workloads.
While remote data caching offers significant performance benefits in multi-GPU systems, it also requires extending coherence throughout the cache hierarchy. Conventional GPUs rely on a simple software-inserted bulk-synchronous programming (BSP) model [11], which performs cache invalidation and flush operations at the start and end of each kernel. However, as recent GPU applications increasingly require more frequent and fine-grained communication both within and across kernels [11,13–15], these frequent synchronizations can lead to substantial cache operation and data movement overheads. Additionally, precisely managing the synchronizations places additional burdens on programmers, complicating the optimization of multi-GPU systems.
Ren et al. [11] proposed HMG, a hierarchical cache coherence protocol designed for L2 caches in large-scale multi-GPU systems. HMG employs coherence directories to record cache line addresses and their associated sharers upon receiving remote read requests. Any writes to these addresses trigger invalidations. Once capacity is reached, existing entries are evicted from the directory, triggering invalidation requests to the sharer GPUs. These invalidations are unnecessary, as the corresponding cache lines do not immediately require coherence to be maintained. When GPUs access data across a wide range of addresses, significant directory insertions lead to a number of unnecessary invalidations for cache lines that have not yet been fully utilized. Subsequent accesses to these cache lines result in cache misses, requiring data to be fetched again over bandwidth-limited inter-GPU links.
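The directory-overflow behavior described above can be illustrated with a toy model. This is only a sketch: the 2-entry capacity and FIFO replacement are arbitrary assumptions for illustration, whereas the real directories discussed in the paper are set-associative.

```python
# Toy coherence directory: once full, inserting a new remote-read entry
# evicts an older one and charges invalidations to its sharers, even
# though their cached copies of the data are still valid.

class ToyDirectory:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}           # cache-line address -> set of sharer GPU IDs
        self.evict_invalidations = 0

    def remote_read(self, addr, requester):
        if addr in self.entries:
            self.entries[addr].add(requester)   # already tracked: add sharer
            return
        if len(self.entries) == self.capacity:
            victim, sharers = next(iter(self.entries.items()))  # FIFO victim
            del self.entries[victim]
            self.evict_invalidations += len(sharers)  # premature invalidations
        self.entries[addr] = {requester}

d = ToyDirectory(capacity=2)
for addr in (0x1000, 0x1040, 0x1080):   # wide access range overflows directory
    d.remote_read(addr, requester=1)
print(d.evict_invalidations)  # 1: the entry for 0x1000 was evicted prematurely
```

Even though GPU 1 may still hold the line at 0x1000, the overflow forces an evict-initiated invalidation, which is exactly the traffic REC aims to avoid.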
Corresponding author.
E-mail address: wro@yonsei.ac.kr (W.W. Ro).
https://doi.org/10.1016/j.sysarc.2025.103339
Received 10 September 2024; Received in revised form 27 December 2024; Accepted 5 January 2025
Available online 9 January 2025
1383-7621/© 2025 Published by Elsevier B.V.
G. Ko et al. Journal of Systems Architecture 160 (2025) 103339
Fig. 1. Performance of each caching scheme normalized to a system that enables remote data caching in both L1 and L2 caches using software and hardware coherence protocols, respectively. No caching refers to a system that disables remote data caching, simplifying coherence.
Fig. 2. Baseline multi-GPU system. Each GPU has a coherence directory that records and tracks the status of shared data at given addresses along with the corresponding sharer IDs.
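The lookup-and-fill flow annotated in Fig. 2 can be sketched as a toy model. The function name and the set/dict modeling are illustrative assumptions, not the simulator's code:

```python
# Minimal sketch of the Fig. 2 flow: an L1 miss falls through to the
# shared L2; an L2 miss fetches the line from the home GPU and fills
# both levels, while the home directory records the requesting sharer.

def access(addr, l1, l2, home_directory, gpu_id):
    if addr in l1:                       # step 1: private L1 lookup
        return "L1 hit"
    if addr in l2:                       # step 2: shared L2 lookup
        l1.add(addr)
        return "L2 hit"
    home_directory.setdefault(addr, set()).add(gpu_id)  # step 4: track sharer
    l2.add(addr)                         # step 3: fill both cache levels
    l1.add(addr)
    return "remote fetch"

l1, l2, directory = set(), set(), {}
print(access(0xA, l1, l2, directory, gpu_id=0))  # remote fetch
print(access(0xA, l1, l2, directory, gpu_id=0))  # L1 hit
```

After the first miss, the home GPU's directory holds `{0xA: {0}}`, so a later write to 0xA knows exactly which GPU to invalidate.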
To evaluate the implications of the coherence protocol, we measure the performance impact of unnecessary invalidations on a 4-GPU system that caches remote data in both L1 and L2 caches. L1 caches are assumed to be software-managed, while L2 caches are managed under fine-grained invalidation through coherence directories. As Fig. 1 shows, there exists a significant performance opportunity in eliminating unnecessary invalidations caused by frequent directory evictions. Increasing the size of the coherence directory can delay evictions and the corresponding invalidation requests, but at the cost of increased hardware. Our observations indicate that to eliminate unnecessary invalidations, the size of the coherence directory would need to be substantially increased, accounting for 30.4% of the L2 cache size. As the size of GPU L2 caches continues to grow [16,17], the aggregate storage overhead of coherence directories becomes substantial, causing inefficiency in scaling for multi-GPU environments (discussed in Section 3.3).
In this paper, we propose Range-based Directory Entry Coalescing (REC), an architectural solution that mitigates unnecessary invalidation overhead by increasing the effective tracking capacity of the coherence directory without incurring significant hardware costs. Our key insight is that since directory updates are performed upon receiving remote read requests, leveraging memory access locality provides an opportunity to coalesce multiple tag addresses of shared data based on their common address range. To achieve this, we employ a coherence directory design which aggregates data from incoming remote reads that share a common base address within the same address range, storing only the offset and the sharer IDs. We reduce the storage requirements of directory entries by designing them in a base-and-offset format, recording the common high-order bits of addresses and using a bit-vector to indicate the index of each coalesced entry within the target range. For incoming writes, if they are found in the coherence directory, invalidations are propagated only to the corresponding address, maintaining fine-grained coherence in multi-GPU systems.
To summarize, this paper makes the following contributions:
• We identify a performance bottleneck of fine-grained shared data tracking mechanisms in multi-GPU systems. Our analysis demonstrates that such methods generate unnecessary invalidations at coherence directory evictions, which incurs a significant performance bottleneck due to increased cache miss rates.
• We show that simply employing larger coherence directories incurs significant storage overhead. Our analysis shows that the baseline multi-GPU system requires a 12× increase in the directories to eliminate redundant invalidations.
• We propose REC, which increases effective coverage of the coherence directory by enabling each entry to coalesce and track multiple memory addresses along with the associated sharers. By reducing the L2 cache misses by 53.5%, REC improves overall performance by 32.7% on average across our evaluated GPU workloads.
2. Background
2.1. Multi-GPU architecture
The slowdown of transistor scaling has made it increasingly difficult for single GPUs to meet the growing demands of modern workloads. Alternatively, multi-GPU systems have emerged as a viable path forward, offering enhanced performance and memory capacity by leveraging multiple GPUs connected using high-bandwidth interconnects such as PCIe and NVLink [18]. However, these inter-GPU links are likely to have bandwidth that falls far behind the local memory bandwidth [3,4,8]. The NUMA effect that arises from this large bandwidth gap can significantly impact multi-GPU performance, making it crucial to optimize remote access bottlenecks to maximize efficiency.
Fig. 2 illustrates the architectural details of our target multi-GPU system. Each GPU is divided into several SAs, with each comprising a number of CUs. Every CU has its own private L1 vector cache (L1V$), while the L1 scalar cache (L1S$) and L1 instruction cache (L1I$) are shared across all CUs within an SA. Additionally, each GPU contains a larger L2 cache that is shared across all SAs. When a data access misses in the local cache hierarchy, it is forwarded to either local or remote GPU memory, depending on the data location. For local memory accesses, the cache lines are stored in both the shared L2 cache and the L1 cache private to the requesting CU. In the case of remote-GPU memory accesses, the data can be cached either only in the L1 cache of the requesting CU [4,5,8] or in both the L2 and L1 caches [3,11,12]. Caching data in remote memory nodes helps mitigate the performance degradation caused by accessing remote memory nodes.
2.2. Remote data caching in multi-GPU
While caching remote data only in the L1 cache can save L2 cache capacity, it limits the sharing of remote data among CUs. As a result, such an approach provides lower performance gain when unnecessary invalidation overhead is eliminated in its counterpart, as shown in Fig. 1. For this reason, in this study, we assume the baseline multi-GPU architecture allows caching of remote data in both L1 and L2 caches.
A step-by-step process of remote data caching is shown in Fig. 2. Upon generating a memory request, an L1 cache lookup is performed by the requesting CU (1). When data is not present in the L1, an L2 cache lookup is generated to check if the remote data is cached in the L2 (2). If the data is found in the L2 cache, it is returned to the requesting CU and cached in its local L1 cache. If the data is not found in the L2 cache, the request is forwarded to the remote GPU memory at the given physical address. Subsequently, the requested data is returned at a cache line granularity and cached in both the L1 and L2 caches (3). At the same time, the coherence directory, which maintains information about data locations across multiple GPUs, is
Fig. 3. Coherence protocol flows in detail. The baseline hardware protocol has two stable states: valid and invalid, with no transient states or acknowledgments required for write permissions.
Fig. 4. L2 cache miss rates in baseline and idealized system where no invalidations are propagated by coherence directory evictions. Cold misses are excluded from the results.
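The two-state protocol summarized in the Fig. 3 caption can be approximated by a toy model. This is a simplification under stated assumptions: valid entries are present in a dict, invalid ones absent, and invalidation delivery is modeled as a returned list of sharer IDs rather than actual messages.

```python
# Toy model of the two stable states (valid/invalid) in the directory;
# only the invalidation traffic each action generates is modeled.

def remote_read(directory, addr, requester):
    directory.setdefault(addr, set()).add(requester)  # entry valid; add sharer
    return []                                         # reads never invalidate

def local_write(directory, addr):
    sharers = directory.pop(addr, set())   # entry becomes invalid (removed)
    return sorted(sharers)                 # write-initiated invalidations

def remote_write(directory, addr, requester):
    sharers = directory.setdefault(addr, set())
    victims = sorted(sharers - {requester})  # invalidate the other sharers
    sharers.add(requester)                   # entry stays valid; requester added
    return victims

d = {}
remote_read(d, 0xA, requester=2)
print(remote_write(d, 0xA, requester=3))  # [2]: other sharer invalidated
print(local_write(d, 0xA))                # [2, 3]; entry now invalid
```

The model mirrors the protocol's key asymmetry: local writes invalidate the entry itself, while remote writes keep it valid and redirect invalidations to the remaining sharers.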
updated with the corresponding entry and the sharer GPU (4). Writes to remote data in the home GPU are also performed in the local L2 cache, following the write-through policy, as the corresponding GPU may access the written data in the future. Remote writes arriving at the home GPU trigger invalidation messages to be sent out to the sharer GPU(s), and the requesting GPU is recorded as a sharer (4).
2.3. Cache coherence in multi-GPU
Existing hardware protocols, such as GPU-VI [19], employ coherence directories to track sharers (i.e., L1s) and propagate write-initiated cache invalidations within a single GPU. Bringing the notion into multi-GPU environments, Ren et al. proposed HMG [11], a hierarchical design that efficiently manages both intra- and inter-GPU coherence. HMG includes two layers for selecting home nodes to track sharers: (1) the inter-GPU module (GPM) level that selects a home GPM within a GPU and (2) the inter-GPU level that selects a home GPU across the entire system. A GPM is a chiplet in multi-chip module GPUs. With this, HMG reduces the complexity of tracking and maintaining coherence across a large number of sharers. HMG also optimizes performance by eliminating all transient states and most invalidation acknowledgments, leveraging weak memory models in modern GPUs [11].
Each GPU has a coherence directory attached to its L2 cache, managed by the cache controllers. The directory is organized in a set-associative structure, and each entry contains the following fields: tag, sharer IDs, and coherence state. The tag field stores the cache line address for the data copied and fetched by the sharer. The sharer ID field is a bit-vector representing the list of sharers, excluding the home GPU. Each entry is in one of two stable states: valid or invalid. Unlike HMG [11], the baseline coherence directory tracks one cache line per entry. In contrast, a directory entry in HMG is designed to track four cache lines using a single tag address and sharer ID field, which limits its ability to manage each cache line at a fine granularity. Consequently, a write to any address tracked by a directory entry may unnecessarily invalidate other cache lines within the same range, potentially causing inefficiencies in remote data caching. We discuss the importance of reducing unnecessary cache line invalidations in detail in Section 3.1. Like typical memory allocation in multi-GPU systems, the physical address space is partitioned among the GPUs in the system. Therefore, data at any given physical address is designated to one GPU (i.e., the home GPU), and every access by a remote GPU references the coherence directory of the home GPU. For example, in Fig. 2, GPU0 requests data at address 0xA from GPU1, which is the home GPU; the corresponding entry is then inserted into the directory of GPU1 with the relevant information.
Fig. 3 shows the detailed state transitions and actions initiated by the coherence directory. Note that local and remote refer to the sources of memory requests received: local refers to accesses from the local CUs, and remote refers to accesses from the remote GPUs.
Local reads: Local read requests arriving at the L2 cache are directed to either locally- or remotely-mapped data. On cache hits, the data is returned and guaranteed to be consistent because it is either the most up-to-date data (if mapped to local DRAM) or correctly managed by the protocol (if mapped to remote GPU). On cache misses, the requests are forwarded to either local DRAM or a remote GPU. In all cases, the directory of the requesting GPU remains unchanged.
Remote reads: For remote reads that arrive at the home GPU, the coherence directory records the ID of the requesting GPU at the given cache line address. If the line is already being tracked (i.e., the entry is found and valid), the directory simply adds the requester to the sharer field and keeps the entry in the valid state. If the line is not being tracked, the directory finds an empty spot to allocate a new entry and marks it as valid. When the directory is full and every entry is valid, it evicts an existing entry and replaces it with the new entry (discussed below).
Local writes: Local writes to data mapped to the home GPU memory look up the directory to find whether a matching entry at the line address exists. If found, invalidations are propagated to the recorded sharers in the background, and the directory entry becomes invalid.
Remote writes: By default, L2 caches use a write-back policy for local writes. As described in Section 2.2, remote writes update both the L2 cache of the requester and local memory, similar to a write-through policy. Consequently, the directory maintains the entry as valid by adding the requester to the sharer list and sends out invalidations to other sharers recorded in the original entry.
Directory entry eviction/replacement: Coherence directories are implemented in a set-associative structure. Thus, capacity and conflict misses occur as directory lookups are initiated by the read requests continuously received from remote GPUs. To notify that the information in the evicted entry is no longer traceable, invalidations are sent out as with writes.
Acquire and release: At the start of a kernel, invalidations are performed in L1 caches as coherence is maintained using software bulk synchronizations. However, the invalidations are not propagated beyond L1 caches, as L2 caches are kept coherent with the fine-grained directory protocol. Release operations flush dirty data in both L1 and L2 caches.
3. Motivation
In multi-GPU systems, coherence is managed explicitly through cache invalidations to ensure data consistency across multiple GPUs. When invalidation requests are received, sharer GPUs must look up and invalidate the corresponding cache lines. Subsequent accesses to these invalidated cache lines result in cache misses, which are then forwarded to the home GPU. This, in turn, can negate the performance benefits of local caching as it undermines the effectiveness of caching mechanisms intended to reduce remote access bottlenecks. In this section, we analyze the behavior of cache invalidation and its impact on the overall
Fig. 5. Fraction of evict-initiated and write-initiated invalidations in the baseline multi-GPU system. The results are based on invalidation requests that hit in the sharer-side L2 caches.
Fig. 6. Performance impact of increasing coherence directory sizes. To eliminate unnecessary invalidations, GPUs require a directory size up to 12× larger than the baseline.
performance of multi-GPU systems. We identify the sources of invalidation and explore a straightforward solution to mitigate the associated bottlenecks. Our experiments are conducted using MGPUSim [20], a multi-GPU simulation framework that we have extended to support the hardware cache coherence protocol. The detailed configuration is provided in Table 2.
3.1. Impact of cache invalidation
To ensure data consistency across multiple GPUs, invalidation requests are propagated by the home GPU in two cases: (1) when write requests are received and (2) when an entry is evicted from the coherence directory due to capacity and conflict misses. Invalidation requests triggered by writes are crucial for maintaining data consistency, as they ensure that no stale data is accessed in the sharer GPU caches. On the other hand, invalidations generated by directory eviction aim to notify the sharers that the coherence information is no longer traceable, even if the data is still valid. A detailed background on the protocol flows with invalidations is given in Section 2.3.
Broadcasting invalidations does not significantly impact cache efficiency if the cache lines are already evicted or no longer in use. However, when applications exhibit frequent remote memory accesses, the generation of new directory entries increases invalidation requests from eviction, invalidating the associated cache lines prematurely. These premature invalidations lead to higher cache miss rates, as subsequent accesses to the invalidated cache lines result in misses. As remote data misses exacerbate NUMA overheads, they need to be reduced to improve multi-GPU performance.
Fig. 4 shows the impact of cache miss rate when eliminating unnecessary invalidations across the benchmarks listed in Table 3 running on a 4-GPU system. The figure demonstrates that the baseline system experiences a cache miss rate more than double (average 2.4×) that of the idealized system without the unnecessary invalidation. This increase is mainly due to frequent invalidation requests, which prematurely invalidate cache lines before they can be fully utilized, leading to an increase in the number of remote memory accesses. The result strongly motivates us to further study the source of these frequent invalidations to improve the efficiency of remote data caching in multi-GPU systems.
To demonstrate the performance opportunity, Fig. 1 presents a study showing the performance of idealized caching without the invalidation overhead. With no invalidations to unmodified cache lines, remote data can be fully utilized as needed until they are naturally replaced by the typical cache replacement policy. The performance of the baseline and ideal system is represented in the first and fourth bars, respectively, in Fig. 1. The result shows that an ideal system with no unnecessary cache invalidation overheads outperforms the baseline by up to 2.79× (average 36.9%). As demonstrated by Figs. 1 and 4, reducing premature cache invalidations is crucial in improving the efficiency of remote data caching in multi-GPU systems.
3.2. Source of premature invalidation
As described in Section 2.3, when a coherence directory becomes full, the GPU needs to evict an old entry and replace it with a new one upon receiving a remote read request; an invalidation request must be sent out to the sharer(s) in the evicted entry. Fig. 5 shows the distribution of invalidations triggered by directory eviction and write requests, referred to as evict-initiated and write-initiated invalidations, respectively. The measurements are taken based on the invalidations that are hit in the sharer-side L2 caches after receiving the requests. We observe that a significant amount of invalidations (average 79.5%) are performed by the requests from directory evictions in the home GPUs. These invalidations, considered unnecessary as they do not require immediate action, should be delayed until remote GPUs have full use of the data.
We also show the percentage of write-initiated invalidations in Fig. 5. One can observe that applications such as FIR, LU, and MM2 experience a significant number of invalidations due to write requests. These workloads exhibit fine-grained communication within and across dependent kernels, necessitating the invalidation of corresponding cache lines in the remote L2 cache upon any modification to the shared data. Although the applications exhibit a high percentage of write-initiated invalidations, their impact on cache miss rates may be negligible if the GPUs do not subsequently require access to the invalidated cache lines. Nonetheless, the results from Fig. 4 clearly demonstrate the importance of minimizing unnecessary cache invalidations.
So far, we have discussed how prematurely invalidating remote data leads to increased cache miss rates, which negatively impacts multi-GPU performance. We also show that a large fraction of invalidation requests stems from directory evictions, which frequently occur due to the high volume of remote accesses. These accesses trigger numerous directory updates, overwhelming the baseline coherence directory's capacity to effectively manage coherence. A straightforward solution to mitigate premature invalidations is to increase the size of the coherence directory, providing more coverage to track sharers and reducing eviction rates. In the following section, we analyze the performance impact of larger coherence directory sizes. It is important to note that this paper primarily focuses on delaying invalidations caused by directory evictions, as write-initiated invalidations are necessary and must be performed immediately for correctness.
3.3. Increasing directory sizes
A simple approach to delay directory evictions, thereby minimizing premature invalidations, is to increase the size of coherence directories. Limited directory sizes lead to significant evict-initiated invalidations, which can undermine the performance benefits of local caching. To quantify the benefits of larger directories, we conduct a quantitative analysis of performance improvements with increasing directory sizes. In our simulated 4-GPU system, each GPU has an L2 cache size of 2 MB, with each cache line being 64 B. Each coherence directory tracks
Fig. 7. Average performance improvement per increased directory storage in the
baseline coherence directory design. The results are normalized to the system with
8K-entry coherence directory.
the identity of all sharers excluding the home GPU (i.e., three GPUs).
To cover the entire L2 cache space for three GPUs, an ideal coherence
directory would require approximately 96K entries, or about 12× the
baseline 8K entries.
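The ideal-directory sizing above follows directly from the stated cache geometry; the short script below is our own sanity check of that arithmetic, not code from the paper.

```python
# Sanity check of the ideal coherence directory size, assuming the 4-GPU
# configuration stated in the text: 2 MB L2 per GPU, 64B cache lines, and a
# directory that tracks all sharers except the home GPU.
L2_BYTES = 2 * 1024 * 1024
LINE_BYTES = 64
REMOTE_GPUS = 3                  # sharers excluding the home GPU
BASELINE_ENTRIES = 8 * 1024      # 8K-entry baseline directory

lines_per_l2 = L2_BYTES // LINE_BYTES        # 32768 cache lines per GPU
ideal_entries = REMOTE_GPUS * lines_per_l2   # 98304 entries, i.e., about 96K
scale_vs_baseline = ideal_entries / BASELINE_ENTRIES

print(ideal_entries, scale_vs_baseline)      # 98304 entries, 12.0x the baseline
```

The 12× factor reappears throughout the paper as the storage bound that motivates coalescing instead of naive scaling.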
Fig. 6 illustrates the normalized performance for increasing the directory sizes by 2×–12× the baseline. With an ideal directory size, unnecessary invalidations from directory evictions can be eliminated, leaving only write-initiated invalidations. The results show that applications exhibit significant performance gains as the directory size increases, with some benchmarks (e.g., ATAX, PR, and ST) requiring 8×–12× the baseline size to achieve the highest speed-up. Specifically, benchmarks such as PR and ST show irregular memory access patterns that span a wide address range, leading to higher chances of conflict misses when updating coherence directories. Most other tested benchmarks require up to six times the baseline directory size to achieve maximum attainable performance; the average speedup with six times the size is 1.35×.

Each entry in the coherence directory comprises a tag, sharer list, and coherence state. We assume 48 bits for tag addresses, a 3-bit vector for tracking sharers, and one bit for the directory entry state; thus, each entry requires a total of 52 bits of storage. Our baseline directory implementation has 8K entries and occupies approximately 2.5% of the L2 cache [11]. Therefore, the storage cost of the baseline directory in each GPU is 52 × 8192/8/1024 = 52 kB, assuming 8 bits per byte and 1024 bytes per kilobyte. From our observation in Fig. 6, applications require directory sizes from 6× up to 12× the baseline to achieve maximum performance. This corresponds to a total storage cost of 312–624 kB, which is an additional 15.2–30.4% of the L2 cache size. While increasing directory size can significantly improve performance, the associated hardware costs are substantial. To show the inefficiency of simply scaling directory sizes, we calculate the performance per storage using the results in Fig. 6 and the number of directory entries. Fig. 7 illustrates the results relative to the baseline with 8K entries, showing that performance improvements per increased storage do not scale proportionally with larger coherence directories. Additionally, since GPU applications require different directory sizes to achieve maximum performance, simply increasing the directory size is not an efficient solution. Moreover, as GPU L2 caches continue to grow [16,17], the cost of maintaining proportionally larger coherence directories will only amplify these overheads. Therefore, improving coherence directory coverage without significant storage overhead motivates the need for more efficient fine-grained hardware protocols in multi-GPU systems.

4. REC architecture

This work aims to enhance coherence directory coverage while avoiding significant hardware overhead, overall reducing unnecessary cache invalidations in multi-GPU systems. We introduce REC, an architecture that coalesces directory entries by leveraging the spatial locality in memory accesses observed in GPU workloads. In this section, we provide an overview of the REC design and discuss its integration with existing multi-GPU coherence protocols.

Fig. 8. A high-level overview of (a) baseline and (b) proposed REC architecture with simplified 2-entry coherence directories. The figure illustrates a scenario where GPU1 accesses memory of GPU0 in order of 0x1000, 0x1040, 0x1080, and 0x1000 by each CU. In the baseline directory, the entry that tracks the status of data at 0x1000 is evicted for recording the address 0x1080. The proposed directory coalesces three addresses with the same base address into one entry.

4.1. Hardware overview

As shown in Section 3.2, a significant fraction of cache invalidations are generated by the frequent directory evictions. These invalidations lead to increased cache misses, as data is prematurely invalidated from the cache, requiring subsequent accesses to fetch the data from remote memory. While simply increasing the directory size can address this bottleneck, the associated hardware cost can become substantial. To address this, we propose REC, an architectural solution that compresses remote GPU access information, retaining as much data as possible before eviction occurs. It aggregates data from incoming remote read requests so that (1) multiple reads to the same address range share a common base address, storing only the offset and source GPU information, and (2) the coalescing process does not result in any loss of information, maintaining the accuracy of the coherence protocol. We now discuss the design overview of REC and the details of the associated hardware components.

Fig. 8(a) shows how the baseline GPU handles a sequence of incoming read requests. The cache controller records the tag addresses and the corresponding sharer IDs in the order that the requests arrive. When the coherence directory reaches its capacity, the cache controller follows a typical FIFO policy to replace the oldest entry with a new one within the set. Once an entry is evicted, the information it held can no longer be tracked, triggering an invalidation request to be sent to the GPU listed in the entry. Upon receiving this request, the sharer GPU checks its L2 cache and invalidates the corresponding cache line, leading to a cache miss on any subsequent access to the cache line.

To delay invalidations caused by directory evictions without significant hardware overhead, we introduce the REC architecture, which enhances the baseline coherence directory by leveraging spatial locality to merge multiple addresses into a single entry. As illustrated in Fig. 8(b), REC stores tag addresses with common high-order bits as a single entry using a base-plus-offset format. When a new read request matches the base address in an existing entry, the offset and sharer information are appended to that entry, reducing the need for additional entries and delaying evictions. The base address represents the shared high-order bits, covering a range of addresses and reducing the storage required compared to storing full tag addresses individually. Additionally, REC uses position bits to efficiently track multiple addresses within the specified range, further minimizing storage overhead.
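The storage figures quoted in Section 3.3 follow from the 52-bit entry layout; the script below is our own reproduction of that arithmetic under the stated assumptions (8 bits per byte, 1024 bytes per kilobyte, 2 MB L2).

```python
# Reproduces the baseline-directory storage arithmetic from Section 3.3:
# a 48-bit tag + 3-bit sharer vector + 1 state bit = 52 bits per entry.
BITS_PER_ENTRY = 48 + 3 + 1
BASELINE_ENTRIES = 8 * 1024
L2_KB = 2 * 1024                     # 2 MB L2 cache per GPU

baseline_kb = BITS_PER_ENTRY * BASELINE_ENTRIES / 8 / 1024   # 52.0 kB
scaled_kb = [baseline_kb * s for s in (6, 12)]               # 312 and 624 kB
overhead_pct = [100 * kb / L2_KB for kb in scaled_kb]        # ~15% and ~30% of L2

print(baseline_kb, scaled_kb, overhead_pct)
```

The 6×–12× scaling needed for peak performance thus costs a substantial fraction of the L2 itself, which is the inefficiency REC is designed to avoid.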
Table 1
Trade-offs between addressable range and storage for each entry. Note that one valid bit, not shown in the table, is included in the overall calculation.

                        Addressable range
                        64B    128B   256B   1 kB    4 kB
Base address bits       48     41     40     38      36
Position/Sharer bits    -/3    2/6    4/12   16/48   64/192
Total bits per entry    52     50     57     103     293

Table 2
Baseline GPU configuration.

Parameter             Configuration
Number of SAs         16
Number of CUs         4 per SA
L1 vector cache       1 per CU, 16 kB 4-way
L1 inst cache         1 per SA, 32 kB 4-way
L1 scalar cache       1 per SA, 16 kB 4-way
L2 cache              2 MB 16-way, 16 banks, write-back
Cache line size       64B
Coherence directory   8K entries, 8-way
DRAM capacity         4 GB HBM, 16 banks
DRAM bandwidth        1 TB/s [11]
Inter-GPU bandwidth   300 GB/s, bi-directional

Table 3
Tested workloads.

Benchmark                                          Abbr.   Memory footprint
Matrix transpose and vector multiplication [21]    ATAX    128 MB
2-D convolution [21]                               C2D     512 MB
Finite impulse response [22]                       FIR     128 MB
Matrix-multiply [21]                               GEMM    128 MB
Vector multiplication and matrix addition [21]     GEMV    256 MB
2-D jacobi solver [21]                             J2D     128 MB
LU decomposition [21]                              LU      128 MB
2 matrix multiplications [21]                      MM2     128 MB
3 matrix multiplications [21]                      MM3     64 MB
PageRank [22]                                      PR      256 MB
Simple convolution [23]                            SC      512 MB
Stencil 2D [24]                                    ST      128 MB

Fig. 9. Coherence directory entry structure for 64B cache lines. In our design, each entry stores up to 16 coalesced entries based on 1 kB range.

Fig. 10. Overview of the REC protocol flows. In the example coherence directory, entry insertion and offset addition operations are highlighted in blue, while eviction and offset deletion operations are shown in red.

Determining the address range within which REC coalesces entries is one of the key design considerations, as it directly impacts the number of bits required for each entry. Table 1 shows a list of design choices for implementing REC with varying addressable ranges and their potential trade-offs. The number of required base address bits is calculated using 2^n = addressable_range, where n is the number of bits right-shifted from the original tag address. Also, the number of required position bits is determined by the maximum number of coalesceable cache line addresses within the target range, assuming a 64B line size. Then, the number of sharer bits required is (n-1) × num_position_bits, where n is the number of GPUs. For example, if REC is designed to coalesce with an addressable range of 256B, each entry would require 40, 4, and 12 bits for the base address, position, and sharer fields, respectively. Lastly, one valid bit is added to each entry. In Table 1, we show the total bits required per entry under the addressable ranges from 128B to 4 kB for comparing the storage costs. REC designs with larger addressable ranges can benefit from increased directory coverage but at the cost of storage. In the evaluation of this paper, we tested various addressable ranges for REC. Each design is configured to coalesce the maximum number of offsets within its specified range. Later in the results, we confirm that a 1 kB coalesceable range offers the best trade-off, balancing reasonable size overhead per entry with the ability to coalesce a significant number of entries before evictions occur (discussed in Section 5.2).

Based on these findings, the format of a directory entry is as illustrated in Fig. 9. Each entry comprises a base address, coalesced entries, and a valid bit. When the first remote read request arrives at the home GPU, the cache controller sets the base address by right-shifting the tag address by the number of bits needed to represent the offset within the specified range. For a 48-bit tag, the address is right-shifted by 10 bits (considering a 64B-aligned 1 kB range), and the resulting bits from positions 64 to 101 are used to store the base address. The coalesced entry is identified using the offset within the 1 kB range, represented by a position bit, followed by three bits for recording the sharers. The position bit is calculated as:

p = ((Tag mod m) / 64) × (n + 1)

where m denotes the coalescing range, and n is the number of sharers, which are set to 1 kB and 3, respectively. Once the position is determined, the corresponding position and the sharer bit are set to 1 using a bitwise OR operation. Given that the 1 kB range allows each entry to record up to 16 individual tag addresses, we use the lower 64 bits to store the coalesced entries. Furthermore, the position bit can also function as the valid bit for each coalesced entry, meaning only one valid bit is necessary to indicate whether the entire entry is valid or not.

4.2. REC protocol flows

The baseline coherence protocol operates with two stable states, valid and invalid, allowing it to remain lightweight and efficient. In our proposed coherence directory design, each entry represents the validity of an entire address range instead of tracking individual tag addresses and associated sharers. This enables the state transitions to be managed at a coarser granularity during directory evictions. Additionally, REC supports fine-grained control over write requests by tracking specific offsets within these address ranges, avoiding the need to evict entire entries. Fig. 10 highlights the architecture of REC and how it handles received requests differently from the baseline. REC does not require additional coherence states but instead modifies the transitions triggered under specific conditions.
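The base/position/sharer encoding from Section 4.1 can be sketched in a few lines of Python. This is our own illustration under the 1 kB-range design (the function names are ours, and the sharer-bit mapping assumes GPU0 as the home GPU, matching the worked examples in the text).

```python
# Illustrative encoding of a REC directory entry for the 1 kB-range design:
# 10 offset bits are shifted off the 48-bit tag to form the base address, and
# each 64B line in the range gets 1 position bit plus 3 sharer bits.
COALESCE_RANGE = 1024        # m = 1 kB
LINE_SIZE = 64
NUM_SHARERS = 3              # n = GPUs other than the home GPU

def base_address(tag):
    # Drop the 10 offset bits of a 64B-aligned 1 kB range.
    return tag >> 10

def position_bit(tag):
    # p = ((Tag mod m) / 64) * (n + 1), as in Section 4.1.
    return (tag % COALESCE_RANGE) // LINE_SIZE * (NUM_SHARERS + 1)

def record_read(entry_bits, tag, sharer_gpu):
    # Set the position bit and the sharer's bit with a bitwise OR.
    p = position_bit(tag)
    return entry_bits | (1 << p) | (1 << (p + sharer_gpu))

# Offset 0x340 within the range is the 14th cache line: p = 832 // 64 * 4 = 52,
# and a read from GPU1 additionally sets the sharer bit 53.
entry = record_read(0, 0x340, sharer_gpu=1)
```

Clearing an offset on a write is the dual operation: the controller masks off the same position and sharer bits, and the entry is invalidated only when no set bits remain.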
Remote reads: When the GPU receives a read request from the remote GPU, the cache controller extracts the base and offset from the tag address (A). The controller then looks up the coherence directory for an entry with the matching base address (B). If a valid entry is found, the position bit corresponding to the offset, calculated using the formula in Section 4.1, and the associated sharer bit are set (C). For example, the position bit is 0x340/64 × 4 = 52, representing the 14th cache line within the specified 1 kB range. The sharer bit is determined by the source GPU index (e.g., GPU1). Therefore, bits 52 and 53 are set to 1. It can happen that the position bit is already set; nevertheless, the controller still performs a bitwise OR on the bits at the corresponding positions. Since the entry already exists in the directory, it remains valid. Otherwise, if no valid entry is found, a new entry is created with the base address, and the position and sharer bits are set. With the insertion of a new entry, the state transitions from invalid to valid.

Local writes: When the write request is performed locally (D), the cache controller must determine whether it needs to send out invalidation requests to the sharers that hold a copy of the data. For this, the controller again looks up the directory with the calculated base address and offset (E). If an entry is found and the offset is valid (i.e., the position bit is set), the invalidation request is generated and propagated to the recorded sharers immediately (F). The state transition is handled differently based on two conditions. First, when another offset is tracked under the common address range, the directory entry should remain valid. Thus, the controller clears only the position and sharer bits for the specific offset of the target address. For example, in Fig. 10, the directory entry has another offset (at p = 56) recorded under the same base address. Once the invalidation request is sent out to GPU1, the controller only clears bits 0 and 1. If the cleared bits are the last ones, the entire directory entry transitions to an invalid state to make room for new entries.

Remote writes: For the remote write request, the cache controller begins the same directory lookup process by calculating the base and offset from the tag (G). In our target multi-GPU system, the source GPU also performs writes to the copy of data in its local L2 cache (discussed in Section 2.2). Therefore, the controller handles remote write requests differently from local writes. When an entry already exists in the directory (i.e., hits), there may be two circumstances: (1) the target offset is invalid but the entry has other valid offsets, and (2) the target offset is already valid and one or more sharers are being tracked. If the target offset is invalid, the controller simply adds the offset and the sharer to the entry in the same way it handles remote reads. If the offset is valid, the controller adds the source GPU to the sharer list by setting its corresponding bit and clearing the other sharer bits (H), then sends invalidation requests to all other sharers (I). In Fig. 10, the entry and the target offset (at p = 56) are both already recorded. The controller, thus, additionally sets bit 58 to add GPU2 as a sharer while clearing bit 59, and sends the invalidation request to GPU3. In either case, the directory entry remains valid. When the directory misses, the cache controller allocates a new entry to record the base, offset, and sharer from the write request. Then, the entry state transitions to valid.

Directory entry eviction/replacement: When the coherence directory becomes full, it needs to replace an entry with the newly inserted one. The baseline coherence directory uses a FIFO replacement policy. However, for workloads that exhibit irregular memory access patterns, capturing locality becomes a challenge. To address this, REC adopts a replacement policy, similar to LRU, to better retain entries that are more likely to be accessed again. When the cache controller receives a remote read request and does not find an entry with the matching base address (J), it determines an entry for replacement (K). The evicting entry is then replaced with the new entry from the incoming request (L). Meanwhile, the controller retrieves the base address and every merged offset from the evicting entry and reconstructs the original tag addresses. Invalidation requests are propagated to every recorded sharer associated with each tag address (M). Lastly, the entry transitions to an invalid state.

4.3. Discussion

Overheads: In our design, the coherence directory consists of 8K entries, with each entry covering a 1 kB range of addresses. Each entry comprises a 38-bit base address field, a 64-bit vector for offsets and sharers, and a valid bit (detailed in Table 1). Thus, the total directory size is 8192 × 103/8/1024 = 103 kB. We also estimate the area and power overhead of the coherence directory in REC, using CACTI 7.0 [25]. The results show that the directory incurs 3.94% area and 3.28% power consumption compared to the GPU L2 cache. REC requires no additional hardware extensions for managing the coherence directory. The existing cache controller handles operations such as base address calculation and bitwise manipulation efficiently.

Comparison to prior work: As discussed in Section 2.3, HMG [11] designs each coherence directory entry to track four cache lines at a coarse granularity. We empirically show, in Section 3.3, that GPUs require a directory size up to 12× the baseline to eliminate unnecessary cache line invalidations. Since REC coalesces up to 16 consecutive cache line addresses per entry, REC can track a significantly larger number of cache lines compared to the prior work. Moreover, REC precisely tracks each address by storing the offset and sharer information. Thus, REC fully supports fine-grained management of cache lines under write operations.

Scalability: REC requires modifications to its design in large-scale systems, specifically to the sharer bit field. For an 8-GPU system, REC requires (8 - 1) × 16 = 112 bits to record sharers in each entry. Then, the size of each entry becomes 112 + 38 + 16 + 1 = 167 bits, which is approximately three times the baseline size, where each entry costs 56 bits, including a 4-bit increase for sharers. Similarly, for a 16-GPU system, REC requires 295 bits per entry, roughly five times the baseline size. However, as observed in Section 3.3, an ideal GPU requires up to 12 times the baseline directory size even in a 4-GPU system, implying that simply increasing the baseline directory size is insufficient to meet scalability demands.

5. Evaluation

5.1. Methodology

We use MGPUSim [20], a cycle-accurate multi-GPU simulator, to model the baseline and REC architecture with four AMD GPUs connected using inter-GPU links of 300 GB/s bandwidth [26]. The configuration of the modeled GPU architecture is detailed in Table 2. Each GPU includes L1 scalar and instruction caches shared within each SA, while the L1 vector cache is private to each CU, and the L2 cache is shared across the GPU. We extend remote data caching to the L2 caches, allowing data from any GPU in the system to be cached in the L2 cache of any other GPU. Since MGPUSim does not include support for hardware cache coherence, we extend the simulator by implementing a coherence directory managed by the L2 cache controller. The coherence directory is implemented with a set-associative structure to reduce lookup latency. Since the baseline coherence directory is decoupled from the caches, its way associativity as well as its size can be scaled independently. In our evaluation, the coherence directory is designed with an 8-way set-associative structure to reduce conflict misses, containing 8K entries in both the baseline and REC architectures. Upon receiving remote read requests, the cache controller updates the coherence directory by recording the addresses and the associated sharers. Once the capacity of the directory is reached, the cache controller evicts an entry and sends out invalidation requests to the recorded sharers. For incoming write requests, the controller looks up the directory to find whether data with matching addresses are shared by remote GPUs. If matching entries are found, invalidation requests are propagated to the sharers except the source GPU. Additionally, since L2 caches are managed by coherence directories, acquire operations do not perform invalidations on L2 caches, but release operations flush the L2 caches. We use workloads from a diverse set of benchmark suites, including AMDAPPSDK [23], Heteromark [22], Polybench [21], and SHOC [24]. Table 3 lists the workloads with their memory footprints.
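The per-entry arithmetic in the Overheads and Scalability paragraphs of Section 4.3 can be reproduced with a short script. This is our own sketch; the field widths (38 base bits, 16 position bits, one valid bit) come from the 1 kB-range layout in Table 1.

```python
# Reproduces the entry-size arithmetic from Section 4.3 for the 1 kB-range REC
# layout versus a baseline entry (48-bit tag, one sharer bit per remote GPU,
# and one state bit).
BASE_BITS, POSITION_BITS, VALID_BIT, TAG_BITS = 38, 16, 1, 48
ENTRIES = 8 * 1024

def rec_entry_bits(num_gpus):
    sharer_bits = (num_gpus - 1) * POSITION_BITS   # one bit per sharer per line
    return BASE_BITS + POSITION_BITS + sharer_bits + VALID_BIT

def baseline_entry_bits(num_gpus):
    return TAG_BITS + (num_gpus - 1) + VALID_BIT

directory_kb = ENTRIES * rec_entry_bits(4) / 8 / 1024   # 103 kB for 4 GPUs
print(directory_kb)                                     # 103.0
print(rec_entry_bits(8), baseline_entry_bits(8))        # 167 vs 56 bits (~3x)
print(rec_entry_bits(16), baseline_entry_bits(16))      # 295 vs 64 bits (~5x)
```

The sharer field is the only term that grows linearly with GPU count, which is why the paper singles it out as the part needing redesign at larger scales.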
Fig. 11. Performance comparison of the baseline with double-sized coherence directory, HMG [11], REC, and an idealized system with zero unnecessary invalidations. Performance is normalized to the baseline with 8K-entry coherence directory.

Fig. 12. Number of coalesced cache line addresses at directory entry eviction under REC with varying addressable ranges. REC in this work coalesces with 1 kB addressable range.

Fig. 13. Total number of L2 cache misses in the baseline with double-sized coherence directory, HMG [11], and REC relative to the baseline.

5.2. Performance analysis

Fig. 11 shows the performance of the baseline with a coherence directory double in size, HMG [11], REC, and an ideal multi-GPU system with zero unnecessary invalidations relative to the baseline. First, we include the performance of the baseline with double the coherence directory size to compare REC at the same storage cost. The result shows that the baseline with double the directory size achieves an average speedup of 7.3%. The baseline coherence directory tracks each remote access individually, on a per-entry basis. As discussed in Section 3.3, doubling the size of the coherence directory does not mitigate the unnecessary cache line invalidations for applications with significant directory evictions. Also, the results show that HMG and REC achieve average speedups of 16.7% and 32.7% across the evaluated workloads. We observe that REC outperforms the prior scheme for two reasons. First, REC delays directory evictions by allowing each entry to record more cache line addresses over a wider range. Since HMG uses each directory entry to track four cache lines, an entire coherence directory can track cache lines up to 4× the baseline. On the other hand, the directory in REC can record up to 16× the number of entries. Second, REC manages write operations to shared cache lines at a fine granularity by searching the directory with exact addresses and sharers, propagating invalidations only when necessary. Since each directory entry of HMG stores only a single address and sharer ID field that covers four cache lines, writes to any of these cache lines trigger invalidation requests to every cache line and recorded sharer, which leads them to be false positives. In contrast, REC does not allow any false positives and performs invalidations only to the modified cache lines and the associated sharers. As a result, REC reduces unnecessary invalidations on cache lines that are actively being accessed by the requesting GPUs, minimizing redundant remote memory accesses.

To investigate the effectiveness of REC under the different addressable ranges listed in Table 1, we also measure the number of coalesced cache line addresses when an entry is evicted and plot them in Fig. 12. We observe that the directory entries capture an average of 1.8, 3.4, 12.9, and 54.7 addresses until eviction under REC with 128B, 256B, 1 kB, and 4 kB coalesceable ranges. Specifically, REC captures more than 14 addresses before directory eviction for applications with strong spatial locality.

Fig. 12 also illustrates the characteristics of limited locality for certain workloads where REC benefits less. In ATAX, PR, and ST, REC coalesces 3.9, 6.1, and 5.8 addresses, respectively. This is because the applications exhibit locality that is challenging to capture due to their irregular memory access patterns that span across a wide range of addresses. To delay the eviction of entries in irregular workloads, we design our proposed coherence directory with an LRU-like replacement policy (discussed in Section 4.2). Another interesting observation is that the performance improvement of GEMV with REC is higher than the improvement seen when eliminating unnecessary invalidations. Our approach delays invalidations, but still performs them when the directories become full. During cache line replacement, the controller prioritizes invalid cache lines before applying the LRU policy. As a result, this delays the replacement of useful cache lines, thereby improving cache efficiency.

L2 cache misses: The performance improvement of REC is largely attributed to the reduction in cache misses caused by unnecessary invalidations from frequent evictions in the coherence directory of home GPUs. Fig. 13 shows the total number of L2 cache misses in the baseline with double-sized directory, HMG, and REC relative to the baseline. Cold misses are excluded from the results. We observe that REC reduces L2 cache misses by 53.5%. In contrast, the baseline with double-sized directory and HMG experience 1.79× and 1.40× higher numbers of cache misses than REC, since neither approach is sufficient to delay evict-initiated cache line invalidations. The result is closely related to the reduction in remote access latency, as the corresponding misses are forwarded to the remote GPUs. Addressing the remote GPU access bottleneck is performance-critical in multi-GPU systems.

Unnecessary invalidations: In the baseline, invalidation requests propagated from frequent directory evictions in the home GPU lead to higher chances of finding the corresponding cache lines still valid in the sharer-side L2 caches. This results in premature invalidations of cache lines that are actively in use, exacerbating the cache miss rate. In REC, the invalidation requests generated by directory eviction reduce the chances of invalidating valid cache lines. Fig. 14 shows that the number of unnecessary invalidations performed in remote L2 caches (i.e., where they are hits) is reduced by 84.4%. Since REC significantly delays evict-initiated invalidation requests, many cache lines have already been evicted from the caches by the time these requests are issued.

Inter-GPU transactions: The reduction in unnecessary invalidations enhances the utilization of data within the sharer GPUs and minimizes redundant accesses over inter-GPU links. Fig. 14 shows the total number of inter-GPU transactions compared to the baseline. As illustrated, REC reduces inter-GPU transactions by an average of 34.9%. The reduced inter-GPU transactions directly contribute to the overall performance improvement in multi-GPU systems.

Bandwidth impact: Fig. 15 shows the total inter-GPU bandwidth costs of invalidation requests. As presented in Section 3.2, a large fraction of invalidation requests are propagated due to frequent directory evictions. Since REC delays invalidation requests from directory evictions by allowing each entry to coalesce multiple tag addresses, the
bandwidth in most of the workloads becomes only a few gigabytes per second.

Fig. 14. Total number of unnecessary invalidations (bars) and inter-GPU transactions (plots) relative to the baseline.

Fig. 15. Total bandwidth consumption of invalidation requests.

Fig. 16. L2 cache lookup latency.

Fig. 17. Performance of REC under varying (a) coalescing address ranges and (b) number of directory entries. Results are shown relative to the baseline with an 8K-entry coherence directory.

Fig. 18. Performance comparison of REC using FIFO and LRU replacement policies. Performance is normalized to the baseline coherence directory with FIFO policy.

Fig. 19. Performance impact of different L2 cache sizes in the baseline and REC. Performance is normalized to the baseline with 2 MB L2 cache.

Cache lookup latency: Fig. 16 illustrates the average L2 cache lookup latency of REC normalized to the baseline. The results show that the lookup latency is reduced by 14.8% compared to the baseline. REC affects the average lookup latency as evict-initiated invalidation requests are propagated in bursts. However, since REC significantly delays directory eviction by coalescing multiple tag addresses, the overall latency decreases for most of the evaluated workloads.

5.3. Sensitivity analysis

Coalescing range: One important design decision in optimizing REC is determining the range over which to coalesce when remote read requests are received. As discussed in Section 4.1, a trade-off exists between the range an entry coalesces and the number of bits required: the larger the range, the more bits are needed to store the remote GPU access information. Fig. 17(a) shows that the performance of REC improves as the coalescing range increases, with performance gains beginning to saturate at 1 kB. For our applications, a 1 kB range is sufficient to capture the majority of memory access locality within the workloads. Since coalescing beyond 4 kB incurs excessive overhead in terms of bits required per entry (with 4 kB already requiring nearly 6× the baseline size), the potential performance improvement may not be substantial enough to offset the additional cost. Therefore, we choose a 1 kB range for our implementation.

Entry size: In our evaluation, we use a directory size of 8K entries to match the baseline coherence directory. Fig. 17(b) shows the performance of REC with varying entry sizes, ranging from 2K to 32K. On average, REC outperforms the baseline, even with reduced entry sizes compared to the baseline system with 8K-entry coherence directory. This is because the coverage of each coherence directory in REC can increase by up to 16× when locality is fully utilized. Although applications with limited locality show performance improvements as the directory size increases, these gains are relatively modest when considered against the additional hardware costs.

FIFO replacement: Fig. 18 represents the performance of REC with a FIFO replacement policy. Our evaluation shows that the choice of replacement policy has a relatively small impact on the overall performance. For the workloads with regular and more predictable memory access patterns, using the FIFO replacement policy is already effective in coalescing a sufficient number of addresses under the target ranges (shown in Fig. 12). However, for some applications, such as ATAX, PR, and ST, performance is lower with FIFO compared to REC due to their limited locality patterns. These applications, therefore, benefit from using an LRU-like replacement policy.

L2 cache size: The performance impact of different L2 cache sizes is shown in Fig. 19. The results are normalized to the baseline with a 2 MB L2 cache. The benefits from increasing L2 cache capacity are limited by the baseline coherence directory. In contrast, the performance of REC improves as the L2 cache size increases, demonstrating its ability to leverage larger caches effectively. Another observation is that the performance improvement with smaller L2 capacity is less significant compared to larger L2 caches. This is because the coverage of the
Fig. 20. Performance impact of different inter-GPU bandwidth in the baseline and REC. Fig. 23. Performance of REC in different GPU architecture.
Performance is normalized to the baseline with 300 GB/s inter-GPU bandwidth.
Fig. 21. Performance of REC with different number of SAs normalized to the baseline
Fig. 24. Performance of REC with DNN applications.
with 16 SAs.
baseline coherence directory relatively increases as the L2 cache size decreases. To further explore the performance sensitivity to different L2 cache sizes, we evaluate REC in systems with L2 cache sizes of 0.5 MB and 8 MB. We find that REC achieves an average performance improvement of 6.3% and 26.7% compared to the baseline with 0.5 MB and 8 MB L2 caches, respectively. Additionally, the performance trend of REC decreases as the L2 cache size increases, since the effectiveness of REC also diminishes with larger caches. Nevertheless, the results emphasize the importance of the coherence protocol in improving cache efficiency.

Inter-GPU bandwidth: The bandwidth of inter-GPU links is a critical factor in scaling multi-GPU performance. Fig. 20 shows the performance of the baseline and REC under different inter-GPU bandwidths, relative to the 300 GB/s baseline. The results demonstrate that REC outperforms the baseline, even in applications where performance begins to saturate with increased bandwidth.

Number of SAs: We also evaluate REC with an increasing number of SAs, as shown in Fig. 21. The performance improvement of REC decreases compared to the system with 16 SAs, since the increased number of SAs improves the thread-level parallelism of GPUs. However, a system with a larger number of SAs also elevates the intensity of data sharing and thus increases the frequency of coherence directory evictions. As a result, REC outperforms the baseline with 16 SAs by 17.1%.

Fig. 22. Performance comparison of REC and the baseline with equal storage cost under different number of GPUs. Performance is normalized to the baseline with 8K entries.

Number of GPUs: We evaluate REC in 8-GPU and 16-GPU systems, as shown in Fig. 22. To ensure a fair comparison, we do not change the workload sizes. The results show that REC provides performance improvements of 24.7% and 14.7% over the baseline in 8-GPU and 16-GPU systems, respectively. We observe that the performance improvement decreases as the number of GPUs increases. This is because, with more GPUs, the application dataset is more distributed, and the amount of data allocated to each GPU's memory decreases, resulting in reduced pressure on each coherence directory for tracking shared copies. Additionally, we compare REC with the baseline configured with different directory sizes to match equal storage costs (discussed in Section 4.3). We observe that REC achieves performance improvements of 2.04× and 1.83× over the baseline with directory sizes increased by 3× and 5×, respectively. The results confirm that simply increasing directory sizes is not an efficient approach, even in large-scale multi-GPU systems.

5.4. REC with Different GPU Architecture

We extend the evaluation of REC to include a different GPU architecture by adapting the simulation environment to a more recent NVIDIA-styled GPU [27]. This involves increasing the number of computation and memory resources compared to the AMD GPU setup. Specifically, we change the GPU configuration to include 128 CUs, each with a 128 kB L1V cache. The L2 cache size is increased to 72 MB with the cache line size adjusted to 128 B. With the increased cache line size, we configure the addressable range of REC to 2 kB, allowing for coalescing up to the same number of tag addresses. We also scale the input sizes of the workloads to the extent that the simulations remain feasible. The performance results, in Fig. 23, show that REC achieves a 12.9% performance improvement over the baseline. This indicates that our proposed REC also benefits the NVIDIA-like GPU architecture.

5.5. Effectiveness of REC on DNN applications

We evaluate the performance improvement of REC in training two DNN models, VGG16 and ResNet18, using the Tiny-Imagenet-200 dataset [28]. As shown in Fig. 24, REC outperforms the baseline for training VGG16 and ResNet18 by 5.6% and 8.9%, respectively. The results imply that REC also has benefits in multi-GPU training of DNN workloads. Additionally, GPUs have recently gained significant attention for training large language models (LLMs). The computation of LLM training comprises multiple decoder blocks, each primarily consisting of a series of matrix and vector operations [29]. In our evaluation, we observe that REC improves multi-GPU performance by 20.2% and 20.4% on GEMM and GEMV workloads, respectively. Considering real-world LLM training, the memory requirements can become significant with large parameters, which can pressure memory systems and lead to under-utilization of computation resources [29]. Since REC improves cache efficiency in multi-GPU systems, we expect a higher performance potential from REC in real-world LLM training.

6. Related work

Several prior works have proposed GPU memory consistency and cache coherence mechanisms optimized for general-purpose domains [13–15,19,30–32]. GPU-VI [19] reduces stalls at the cache controller by employing write-through, write-no-allocate L1 caches and treating loads to pending writes as misses. To maintain write atomicity, GPU-VI adds transient states and state transitions and requires invalidation acknowledgments before write completion. REC is implemented based on the relaxed memory models commonly adopted in recent GPU architectures, which do not require acknowledgments to be sent or received over long-latency inter-GPU links. HMG [11] proposes a lightweight directory protocol by addressing up-to-date memory consistency and coherence requirements. HMG integrates separate layers for managing inter-GPM and inter-GPU level coherence, reducing network traffic and complexity in deeply hierarchical multi-GPU systems. REC primarily addresses the increased cache misses to remotely fetched data caused by frequent invalidations. Additionally, REC can be extended to support hierarchical multi-GPU systems proposed by HMG without significant hardware modifications.

Other efforts aim to design efficient cache coherence protocols for other processor domains. Wang et al. [33] suggested a method to efficiently support dynamic task parallelism on heterogeneous cache-coherent systems. Zuckerman et al. [34] proposed Cohmeleon, which orchestrates the coherence of accelerators in heterogeneous system-on-chip designs. HieraGen [35] and HeteroGen [36] are automated tools for generating hierarchical and heterogeneous cache coherence protocols, respectively, for generic processor designs. Li et al. [37] proposed methodologies to determine the minimum number of virtual networks for cache coherence protocols that can avoid deadlocks. However, these studies do not address the challenges of redundant invalidations in the cache coherence mechanisms of multi-GPU systems.

Significant research has addressed the NUMA effect in multi-GPU systems by proposing efficient page placement and migration strategies [5,6,38], data transfer and replication methods [4,7,8,10,39,40], and address translation schemes [41–43]. In particular, several works have focused on improving the management of shared data within the local memory hierarchy. NUMA-aware cache partitioning [3] dynamically allocates cache space to accommodate data from both local and remote memory by monitoring inter-GPU and local DRAM bandwidths. The authors also extend software coherence with bulk invalidations to L2 caches and evaluate the overhead associated with unnecessary invalidations. SAC [12] proposes reconfigurable last-level caches (LLCs) that can be utilized as either memory-side or SM-side, depending on predicted application behavior in terms of effective LLC bandwidth. SAC evaluates the performance of both software and hardware extensions for LLC coherence. In contrast, REC specifically targets the issue of unnecessary invalidations under hardware coherence, which can undermine the efficiency of remote data caching. It introduces a new directory structure, carefully examining the trade-off between performance and storage overhead.

Recent studies on multi-GPU and multi-node GPU systems also address challenges in various domains. Researchers proposed methods to accelerate deep learning applications [44], graph neural networks [45], and graphics rendering applications [46] in multi-GPU systems. Na et al. [47] addressed security challenges in inter-GPU communications under a unified virtual memory framework. Barre Chord [48] leverages page allocation schemes in multi-chip-module GPUs to reduce address translation overheads. Villa et al. [49] studied designing trustworthy system-level simulation methodologies for single- and multi-GPU systems. Lastly, NGS [50] enables multiple nodes in a data center network to share the compute resources of GPUs on top of a virtualization technique.

7. Conclusion

In this paper, we propose REC to improve the efficiency of cache coherence in multi-GPU systems. Our analysis shows that the limited capacity of coherence directories in fine-grained hardware protocols frequently leads to evictions and unnecessary invalidations of shared data. As a result, the increase in cache misses exacerbates NUMA overhead, leading to significant performance degradation in multi-GPU systems. To address this challenge, REC leverages memory access locality to coalesce multiple tag addresses within common address ranges, effectively increasing the coverage of coherence directories without incurring significant hardware overhead. Additionally, REC maintains write-initiated invalidations at a fine granularity to ensure precise and flexible coherence across GPUs. Experiments show that REC reduces L2 cache misses by 53.5% and improves overall system performance by 32.7%.

CRediT authorship contribution statement

Gun Ko: Writing – original draft, Visualization, Validation, Software, Resources, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Jiwon Lee: Formal analysis, Conceptualization. Hongju Kal: Validation, Conceptualization. Hyunwuk Lee: Visualization, Validation. Won Woo Ro: Supervision, Project administration, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2024-00402898, Simulation-based High-speed/High-Accuracy Data Center Workload/System Analysis Platform).

Data availability

The authors are unable or have chosen not to specify which data has been used.

References

[1] NVIDIA, NVIDIA DGX-2, 2018, https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/dgx-2/dgx-2-print-datasheet-738070-nvidia-a4-web-uk.pdf.
[2] NVIDIA, NVIDIA DGX A100 system architecture, 2020, https://download.boston.co.uk/downloads/3/8/6/386750a7-52cd-4872-95e4-7196ab92b51c/DGX%20A100%20System%20Architecture%20Whitepaper.pdf.
[3] U. Milic, O. Villa, E. Bolotin, A. Arunkumar, E. Ebrahimi, A. Jaleel, A. Ramirez, D. Nellans, Beyond the socket: NUMA-aware GPUs, in: Proceedings of IEEE/ACM International Symposium on Microarchitecture, 2017, pp. 123–135.
[4] V. Young, A. Jaleel, E. Bolotin, E. Ebrahimi, D. Nellans, O. Villa, Combining HW/SW mechanisms to improve NUMA performance of multi-GPU systems, in: Proceedings of IEEE/ACM International Symposium on Microarchitecture, 2018, pp. 339–351.
[5] T. Baruah, Y. Sun, A.T. Dinçer, S.A. Mojumder, J.L. Abellán, Y. Ukidave, A. Joshi, N. Rubin, J. Kim, D. Kaeli, Griffin: Hardware-software support for efficient page migration in multi-GPU systems, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2020, pp. 596–609.
[6] M. Khairy, V. Nikiforov, D. Nellans, T.G. Rogers, Locality-centric data and threadblock management for massive GPUs, in: Proceedings of IEEE/ACM International Symposium on Microarchitecture, 2020, pp. 1022–1036.
[7] H. Muthukrishnan, D. Lustig, D. Nellans, T. Wenisch, GPS: A global publish-subscribe model for multi-GPU memory management, in: Proceedings of IEEE/ACM International Symposium on Microarchitecture, 2021, pp. 46–58.
[8] L. Belayneh, H. Ye, K.-Y. Chen, D. Blaauw, T. Mudge, R. Dreslinski, N. Talati, Locality-aware optimizations for improving remote memory latency in multi-GPU systems, in: Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022, pp. 304–316.
[9] S.B. Dutta, H. Naghibijouybari, A. Gupta, N. Abu-Ghazaleh, A. Marquez, K. Barker, Spy in the GPU-box: Covert and side channel attacks on multi-GPU systems, in: Proceedings of ACM/IEEE International Symposium on Computer Architecture, 2023, pp. 633–645.
[10] H. Muthukrishnan, D. Lustig, O. Villa, T. Wenisch, D. Nellans, FinePack: Transparently improving the efficiency of fine-grained transfers in multi-GPU systems, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2023, pp. 516–529.
[11] X. Ren, D. Lustig, E. Bolotin, A. Jaleel, O. Villa, D. Nellans, HMG: Extending cache coherence protocols across modern hierarchical multi-GPU systems, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2020, pp. 582–595.
[12] S. Zhang, M. Naderan-Tahan, M. Jahre, L. Eeckhout, SAC: Sharing-aware caching in multi-chip GPUs, in: Proceedings of ACM/IEEE International Symposium on Computer Architecture, 2023, pp. 605–617.
[13] B.A. Hechtman, S. Che, D.R. Hower, Y. Tian, B.M. Beckmann, M.D. Hill, S.K. Reinhardt, D.A. Wood, QuickRelease: A throughput-oriented approach to release consistency on GPUs, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2014, pp. 189–200.
[14] M.D. Sinclair, J. Alsop, S.V. Adve, Efficient GPU synchronization without scopes: Saying no to complex consistency models, in: Proceedings of IEEE/ACM International Symposium on Microarchitecture, 2015, pp. 647–659.
[15] J. Alsop, M.S. Orr, B.M. Beckmann, D.A. Wood, Lazy release consistency for GPUs, in: Proceedings of IEEE/ACM International Symposium on Microarchitecture, 2016, pp. 1–13.
[16] NVIDIA, NVIDIA TESLA V100 GPU architecture, 2017, https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf.
[17] NVIDIA, NVIDIA A100 tensor core GPU architecture, 2020, https://images.nvidia.com/aem-dam/en-zz/Solutions/data-center/nvidia-ampere-architecture-whitepaper.pdf.
[18] NVIDIA, NVIDIA NVLink high-speed GPU interconnect, 2024, https://www.nvidia.com/en-us/design-visualization/nvlink-bridges/.
[19] I. Singh, A. Shriraman, W.W.L. Fung, M. O'Connor, T.M. Aamodt, Cache coherence for GPU architectures, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2013, pp. 578–590.
[20] Y. Sun, T. Baruah, S.A. Mojumder, S. Dong, X. Gong, S. Treadway, Y. Bao, S. Hance, C. McCardwell, V. Zhao, H. Barclay, A.K. Ziabari, Z. Chen, R. Ubal, J.L. Abellán, J. Kim, A. Joshi, D. Kaeli, MGPUSim: Enabling multi-GPU performance modeling and optimization, in: Proceedings of ACM/IEEE International Symposium on Computer Architecture, 2019, pp. 197–209.
[21] T. Yuki, L.-N. Pouchet, Polybench 4.0, 2015.
[22] Y. Sun, X. Gong, A.K. Ziabari, L. Yu, X. Li, S. Mukherjee, C. McCardwell, A. Villegas, D. Kaeli, Hetero-mark, a benchmark suite for CPU-GPU collaborative computing, in: Proceedings of IEEE International Symposium on Workload Characterization, 2016, pp. 1–10.
[23] AMD, AMD APP SDK OpenCL optimization guide, 2015.
[24] A. Danalis, G. Marin, C. McCurdy, J.S. Meredith, P.C. Roth, K. Spafford, V. Tipparaju, J.S. Vetter, The Scalable Heterogeneous Computing (SHOC) benchmark suite, in: Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, 2010, pp. 63–74.
[25] R. Balasubramonian, A.B. Kahng, N. Muralimanohar, A. Shafiee, V. Srinivas, CACTI 7: New tools for interconnect exploration in innovative off-chip memories, ACM Trans. Archit. Code Optim. 14 (2) (2017) 14:1–25.
[26] NVIDIA, NVIDIA DGX-1 with Tesla V100 system architecture, 2017, pp. 1–43.
[27] NVIDIA, NVIDIA ADA GPU architecture, 2023, https://images.nvidia.com/aem-dam/Solutions/Data-Center/l4/nvidia-ada-gpu-architecture-whitepaper-v2.1.pdf.
[28] Y. Le, X. Yang, Tiny ImageNet visual recognition challenge, 2015, http://vision.stanford.edu/teaching/cs231n/reports/2015/pdfs/yle_project.pdf.
[29] G. Heo, S. Lee, J. Cho, H. Choi, S. Lee, H. Ham, G. Kim, D. Mahajan, J. Park, NeuPIMs: NPU-PIM heterogeneous acceleration for batched LLM inferencing, in: Proceedings of ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024, pp. 722–737.
[30] K. Koukos, A. Ros, E. Hagersten, S. Kaxiras, Building heterogeneous Unified Virtual Memories (UVMs) without the overhead, ACM Trans. Archit. Code Optim. 13 (1) (2016).
[31] X. Ren, M. Lis, Efficient sequential consistency in GPUs via relativistic cache coherence, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2017, pp. 625–636.
[32] S. Puthoor, M.H. Lipasti, Turn-based spatiotemporal coherence for GPUs, ACM Trans. Archit. Code Optim. 20 (3) (2023).
[33] M. Wang, T. Ta, L. Cheng, C. Batten, Efficiently supporting dynamic task parallelism on heterogeneous cache-coherent systems, in: Proceedings of ACM/IEEE International Symposium on Computer Architecture, 2020, pp. 173–186.
[34] J. Zuckerman, D. Giri, J. Kwon, P. Mantovani, L.P. Carloni, Cohmeleon: Learning-based orchestration of accelerator coherence in heterogeneous SoCs, in: Proceedings of IEEE/ACM International Symposium on Microarchitecture, 2021, pp. 350–365.
[35] N. Oswald, V. Nagarajan, D.J. Sorin, HieraGen: Automated generation of concurrent, hierarchical cache coherence protocols, in: Proceedings of ACM/IEEE International Symposium on Computer Architecture, 2020, pp. 888–899.
[36] N. Oswald, V. Nagarajan, D.J. Sorin, V. Gavrielatos, T. Olausson, R. Carr, HeteroGen: Automatic synthesis of heterogeneous cache coherence protocols, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2022, pp. 756–771.
[37] W. Li, N. Oswald, V. Nagarajan, D.J. Sorin, Determining the minimum number of virtual networks for different coherence protocols, in: Proceedings of ACM/IEEE International Symposium on Computer Architecture, 2024, pp. 182–197.
[38] Y. Wang, B. Li, A. Jaleel, J. Yang, X. Tang, GRIT: Enhancing multi-GPU performance with fine-grained dynamic page placement, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2024, pp. 1080–1094.
[39] M.K. Tavana, Y. Sun, N.B. Agostini, D. Kaeli, Exploiting adaptive data compression to improve performance and energy-efficiency of compute workloads in multi-GPU systems, in: Proceedings of IEEE International Parallel and Distributed Processing Symposium, 2019, pp. 664–674.
[40] H. Muthukrishnan, D. Nellans, D. Lustig, J.A. Fessler, T.F. Wenisch, Efficient multi-GPU shared memory via automatic optimization of fine-grained transfers, in: Proceedings of the ACM/IEEE International Symposium on Computer Architecture, 2021, pp. 139–152.
[41] B. Li, J. Yin, Y. Zhang, X. Tang, Improving address translation in multi-GPUs via sharing and spilling aware TLB design, in: Proceedings of IEEE/ACM International Symposium on Microarchitecture, 2021, pp. 1154–1168.
[42] B. Li, J. Yin, A. Holey, Y. Zhang, J. Yang, X. Tang, Trans-FW: Short circuiting page table walk in multi-GPU systems via remote forwarding, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2023, pp. 456–470.
[43] B. Li, Y. Guo, Y. Wang, A. Jaleel, J. Yang, X. Tang, IDYLL: Enhancing page translation in multi-GPUs via light weight PTE invalidations, in: Proceedings of IEEE/ACM International Symposium on Microarchitecture, 2023, pp. 1163–1177.
[44] E. Choukse, M.B. Sullivan, M. O'Connor, M. Erez, J. Pool, D. Nellans, Buddy compression: Enabling larger memory for deep learning and HPC workloads on GPUs, in: Proceedings of ACM/IEEE International Symposium on Computer Architecture, 2020, pp. 926–939.
[45] Y. Tan, Z. Bai, D. Liu, Z. Zeng, Y. Gan, A. Ren, X. Chen, K. Zhong, BGS: Accelerate GNN training on multiple GPUs, J. Syst. Archit. 153 (2024) 103162.
[46] X. Ren, M. Lis, CHOPIN: Scalable graphics rendering in multi-GPU systems via parallel image composition, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2021, pp. 709–722.
[47] S. Na, J. Kim, S. Lee, J. Huh, Supporting secure multi-GPU computing with dynamic and batched metadata management, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2024, pp. 204–217.
[48] Y. Feng, S. Na, H. Kim, H. Jeon, Barre chord: Efficient virtual memory translation for multi-chip-module GPUs, in: Proceedings of ACM/IEEE International Symposium on Computer Architecture, 2024, pp. 834–847.
[49] O. Villa, D. Lustig, Z. Yan, E. Bolotin, Y. Fu, N. Chatterjee, Need for speed: Experiences building a trustworthy system-level GPU simulator, in: Proceedings of IEEE International Symposium on High Performance Computer Architecture, 2021, pp. 868–880.
[50] J. Prades, C. Reaño, F. Silla, NGS: A network GPGPU system for orchestrating remote and virtual accelerators, J. Syst. Archit. 151 (2024) 103138.
Gun Ko received the B.S. degree in electrical engineering from Pennsylvania State University in 2017. He is currently pursuing the Ph.D. degree with the Embedded Systems and Computer Architecture Laboratory, School of Electrical and Electronic Engineering, Yonsei University, Seoul, South Korea. His current research interests include GPU memory systems, multi-GPU systems, and virtual memory.

Jiwon Lee received the B.S. and Ph.D. degrees in electrical and electronic engineering from Yonsei University, Seoul, South Korea, in 2018 and 2024, respectively. He currently works in the memory division at Samsung Electronics. His research interests include virtual memory, GPU memory systems, and storage systems.

Hongju Kal received the B.S. degree from Seoul National University of Science and Technology and the Ph.D. degree from Yonsei University in the School of Electrical and Electronic Engineering, Seoul, South Korea, in 2018 and 2024, respectively. He currently works in the memory division at Samsung Electronics. His current research interests include memory architectures, memory hierarchies, near-memory processing, and neural network accelerators.

Hyunwuk Lee received his B.S. and Ph.D. degrees in electrical and electronic engineering from Yonsei University, Seoul, Korea, in 2018 and 2024, respectively. He currently works in the memory division at Samsung Electronics. His research interests include neural network accelerators and GPU systems.

Won Woo Ro received the B.S. degree in electrical engineering from Yonsei University, Seoul, South Korea, in 1996, and the M.S. and Ph.D. degrees in electrical engineering from the University of Southern California, in 1999 and 2004, respectively. He worked as a Research Scientist with the Electrical Engineering and Computer Science Department, University of California, Irvine. He currently works as a Professor with the School of Electrical and Electronic Engineering, Yonsei University. Prior to joining Yonsei University, he worked as an Assistant Professor with the Department of Electrical and Computer Engineering, California State University, Northridge. His industry experience includes a college internship with Apple Computer, Inc., and a contract software engineer with ARM, Inc. His current research interests include high-performance microprocessor design, GPU microarchitectures, neural network accelerators, and memory hierarchy design.
@@ -0,0 +1,901 @@
Journal of Systems Architecture 160 (2025) 103349
Contents lists available at ScienceDirect
Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc
Real-time scheduling for multi-object tracking tasks in regions with different
criticalities
Donghwa Kang a, Jinkyu Lee b,∗, Hyeongboo Baek c,∗

a Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
b Sungkyunkwan University (SKKU), Suwon, South Korea
c University of Seoul (UOS), Seoul, South Korea
ARTICLE INFO

Keywords:
Multi-object tracking
Real-time scheduling
Timing guarantee
Criticality-awareness
Autonomous driving

ABSTRACT

Autonomous vehicles (AVs) utilize sensors such as LiDAR and cameras to iteratively perform sensing, decision-making, and actions. Multi-object tracking (MOT) systems are employed in the sensing stage of AVs, using these sensors to detect and track objects like pedestrians and vehicles, thereby enhancing situational awareness. These systems must handle regions of varying criticality and dynamically shifting locations, all within limited computing resources. Previous DNN-based MOT approaches primarily focused on tracking accuracy, but timing guarantees are becoming increasingly vital for autonomous driving. Although recent studies have introduced MOT scheduling frameworks with timing guarantees, they are either restricted to single-camera systems or fail to prioritize safety-critical regions in the input images. We propose CA-MOT, a Criticality-Aware MOT execution and scheduling framework for multiple cameras. CA-MOT provides a control knob that balances tracking accuracy in safety-critical regions and timing guarantees. By effectively utilizing this control knob, CA-MOT achieves both high accuracy and timing guarantees. We evaluated CA-MOT's performance using a GPU-enabled embedded board commonly employed in AVs, with data from real-world autonomous driving scenarios.
1. Introduction

Autonomous vehicles (AVs) are systems that iteratively perform sensing, decision-making, and actions using various sensors such as LiDAR, radar, inertial measurement units (IMU), and cameras [1]. Multi-object tracking (MOT) systems, used in the perception stage of AVs, track objects like pedestrians and cars, enhancing situational awareness. Since MOT information is periodically transferred to control tasks, timely execution must be guaranteed to ensure safety and prevent severe accidents [2–4]. Low accuracy, despite timely execution, may result in missed objects, thus compromising AVs' safety [2,4,5]. Therefore, AV MOT systems should ensure timing guarantees with maximized accuracy.

Tracking-by-detection [6,7] is widely used due to its high accuracy and ability to leverage state-of-the-art DNN-based detection models (e.g., YOLO series [8], Faster R-CNN [9]). For each input image from each camera, tracking-by-detection performs two tasks: detection and association. Detection uses DNN-based models to sense the motion information of objects, such as location and velocity, while association matches objects between frames based on extracted feature information (also called feature vectors or feature maps obtained through pooling and convolutional layers) using CNN (convolutional neural network)-based models (e.g., OS-Net [10]). For unmatched objects, location-based methods like intersection over union (IoU) are applied.

MOT input images exhibit two key characteristics: (i) regions with varying levels of criticality and (ii) dynamically shifting locations. With limited computing resources in AVs, it is crucial to deliver different levels of service quality based on criticality. Safety-critical regions, where objects with a short time-to-collision (e.g., under 2 s) cluster, must be prioritized. If multiple clusters exist, the broader area encompassing them is considered the safety-critical region, as defined in DNN-SAM [5]. Established methods compute time-to-collision using LiDAR and IMU data; we follow the approach from DNN-SAM. This leads to two requirements for criticality-aware MOT systems: (R1) accuracy maximization for safety-critical regions and (R2) timing guarantees.

Most existing DNN-based MOT approaches focus on accuracy [7,11,12], but timing guarantees are increasingly critical in autonomous driving. Recent research has proposed MOT resource scheduling frameworks that guarantee timing for every MOT execution [2,4]. However, [2] overlooks safety-criticality, while [4] focuses on a single task. We address safety-criticality across multiple tasks, raising the following
∗ Corresponding authors.
E-mail addresses: anima0729@kaist.ac.kr (D. Kang), jinkyu.lee@skku.edu (J. Lee), hbbaek359@gmail.com (H. Baek).
https://doi.org/10.1016/j.sysarc.2025.103349
Received 22 September 2024; Received in revised form 13 December 2024; Accepted 20 January 2025
Available online 28 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
challenges:

C1. How to balance R1 and R2 to efficiently use limited computing resources.

C2. How to achieve both R1 and R2 by effectively using the control knob developed from C1.

In this paper, we propose CA-MOT, a Criticality-Aware MOT execution and scheduling framework for multiple MOT tasks. To address C1, CA-MOT offers three execution options (low, middle, and high workloads) to balance R1 and R2 for both detection and association. To address C2, CA-MOT introduces the notion of aging for detection and association sub-tasks, estimating the reliability of motion and feature information over time. Balancing the aging of these tasks is essential to achieve R1 and R2 with limited resources (to be discussed in Section 3.4). Based on this, CA-MOT develops two scheduling algorithms: EDF-BE and EDF-Slack. EDF-BE increases the workload of tasks waiting in the ready queue for execution (referred to as active tasks) without compromising the R2 bound when no other tasks are pending. In contrast, EDF-Slack is designed to handle scenarios with multiple active tasks.

Fig. 1. Tracking accuracy and execution time on different execution options of detection and association.
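The EDF-based dispatching with a workload control knob described above can be sketched as follows. This is a minimal illustration under our own assumptions, not CA-MOT's implementation: the WCET numbers, the option-selection rule, and all names (`WCET`, `pick_option`, `edf_schedule`) are hypothetical, and only the EDF ordering plus the "raise the execution option when no other task is pending" idea attributed to EDF-BE is modeled (EDF-Slack's slack reclamation for multiple active tasks is not).

```python
import heapq

# Hypothetical worst-case execution times for the three execution
# options (low, middle, high workloads) of one MOT job, in milliseconds.
WCET = {"low": 10, "mid": 16, "high": 24}

def pick_option(now, deadline, pending_others):
    """Choose the heaviest option that still finishes by the deadline.

    When other tasks are pending in the ready queue, stay conservative
    and run the lightest option (EDF-BE only boosts an isolated task).
    """
    if pending_others:
        return "low"
    for opt in ("high", "mid", "low"):
        if now + WCET[opt] <= deadline:
            return opt
    return "low"  # fallback: run the cheapest option even if late

def edf_schedule(jobs):
    """jobs: list of (release, deadline, name) tuples.

    Returns the executed sequence as (name, option, finish_time),
    dispatching by earliest absolute deadline (EDF).
    """
    jobs = sorted(jobs)          # by release time
    ready, done, now, i = [], [], 0, 0
    while i < len(jobs) or ready:
        # admit all jobs released by the current time
        while i < len(jobs) and jobs[i][0] <= now:
            release, deadline, name = jobs[i]
            heapq.heappush(ready, (deadline, name))  # EDF priority
            i += 1
        if not ready:            # idle until the next release
            now = jobs[i][0]
            continue
        deadline, name = heapq.heappop(ready)
        opt = pick_option(now, deadline, pending_others=bool(ready))
        now += WCET[opt]
        done.append((name, opt, now))
    return done
```

For example, with two camera jobs released together, the one with the earlier deadline runs first at the low option (another task is pending), and the remaining job is boosted as far as its deadline allows.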
To validate CA-MOT's performance in meeting R1 and R2, we conducted extensive experiments on an NVIDIA Jetson Xavier using the KITTI dataset [13]. Additionally, we applied three detectors in our experiments: YOLOv5 [14], YOLOX [8], and Faster-RCNN [9].

The contributions of this paper are as follows:

• We motivate the importance of balancing between the aging of detection and association to achieve R1 and R2 (Section 2).
• We propose a new system design, CA-MOT, that addresses R1 and R2 considering varying levels of criticality in different regions for multiple MOT tasks (Section 3).
• We develop new scheduling algorithms to effectively achieve R1 and R2 by balancing between the aging of detection and association for each MOT task (Section 4).
• We demonstrate the effectiveness of CA-MOT in achieving R1 and R2 using a real-world self-driving dataset (Section 5).

2. Motivation

This section presents target systems and motivates the system design of CA-MOT to address C1 and C2 based on measurement-based observations.

2.1. Target system

CA-MOT targets 2D MOT systems on AVs equipped with multiple camera sensors. Each MOT task performs MOT execution on consecutive input frames received from the corresponding camera sensor at a predetermined period. As this recurring task is required to complete a job within a specified deadline, each MOT task is considered a real-time task with a period and deadline. CA-MOT employs tracking-by-detection comprising two steps of MOT execution: detection and association. The front-end detector performs detection by exploiting an existing stand-alone DNN-based detector to identify the position and class of objects in the input image. Using the locations of detected objects, the feature extractor (e.g., the deployed CNN model such as

… a system that tracks specific objects moving between the fields of view of multiple cameras (called hand-over), this is beyond the scope of our work. This paper focuses on dividing the multi-object tracking task into two subtasks (i.e., detection and association) and using DNN-based MOT-specific properties (i.e., reuse of motion and feature information) to achieve R1 and R2 under limited resources.

2.2. Trade-off between accuracy and execution time

To address C1, we consider two factors: (i) the input image size and detection within the safety-critical region, and (ii) the number of objects used for feature extraction during association across all detected objects in each frame.

Fig. 1(a) compares the multi-object tracking accuracy (MOTA) [15] for the overall and safety-critical regions (referred to as overall and critical accuracy) and the execution time for a single MOT task using three input image sizes (256 × 256, 416 × 416, 672 × 672). Overall accuracy considers all objects, while critical accuracy focuses on the safety-critical region. YOLOv5 [14] is used for detection, and features are extracted for all detected objects. The KITTI dataset [13] is used.

For image sizes 256 × 256 and 416 × 416, detection is performed on a cropped region of interest (RoI) that includes the safety-critical region. If the RoI is smaller, it is resized to include the critical region; otherwise, the critical region is cropped accordingly. The safety-critical region will be defined in Section 3. For the 672 × 672 size (i.e., the original input size), detection occurs without cropping.

As shown in Fig. 1(a), reducing the image size leads to a notable decrease in overall accuracy, while critical accuracy decreases less significantly due to the prioritization of the safety-critical region in the RoI. Additionally, execution time decreases as the image size is reduced, demonstrating a trade-off between R1 and R2 when focusing on the critical region.
OSNet) extracts features (i.e., feature vectors or feature maps) for each Fig. 1(b) shows the impact of varying the number of objects used
object. These features capture the visual characteristics of each object. for feature extraction on accuracy and execution time, with the image
The back-end tracker compares the feature similarities between objects size fixed at 672 × 672. The number of objects ranges from zero, three,
in the current frame and the previous frame, matching objects with and more than three. OS-Net [10] is used for feature extraction. As
high similarity. For any remaining unmatched objects, a location-based shown in Fig. 1(b), as the number of objects with feature extraction
matching method such as IoU is applied. The tracker then stores the increases, both overall and critical accuracy improve, but this also leads
motion information (position and velocity) and the features of each to increased execution time. This highlights a trade-off between R1 and
object in preparation for the next frame. R2 based on the number of objects considered for feature extraction.
We assume a system in which each camera independently tracks Section 3.3 details the MOT execution pipeline of CA-MOT, which
objects moving within its field of view. While it is possible to consider leverages these observations to effectively address C1.
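The two knobs measured in this section can be sketched in code. The following is an illustrative sketch, not the authors' implementation: the function names, the NumPy-based cropping, and the ranking heuristic that favors critical-region objects are all assumptions; in the real system the crop is fed to a DNN detector and the selected objects to a CNN feature extractor.

```python
# Sketch of the two accuracy/latency knobs from Section 2.2:
# (i) detection input size via RoI cropping, and
# (ii) a budget on how many objects get feature extraction.
import numpy as np

def crop_to_roi(frame, critical_box, roi_size):
    """Cut an roi_size x roi_size window centered on the safety-critical
    box, clamped to the frame borders (resizing of oversized critical
    boxes is omitted in this sketch)."""
    h, w = frame.shape[:2]
    x1, y1, x2, y2 = critical_box
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    half = roi_size // 2
    # Clamp the window so it stays fully inside the frame.
    left = min(max(cx - half, 0), max(w - roi_size, 0))
    top = min(max(cy - half, 0), max(h - roi_size, 0))
    return frame[top:top + roi_size, left:left + roi_size], (left, top)

def select_for_features(detections, budget):
    """Grant feature extraction to at most `budget` detections,
    critical-region objects first; the rest fall back to
    location-based (IoU) matching only."""
    ranked = sorted(detections,
                    key=lambda d: (not d["critical"], -d["score"]))
    return ranked[:budget]
```

Shrinking `roi_size` or `budget` lowers latency at the cost of accuracy outside the critical region, which is exactly the trade-off Fig. 1 quantifies.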
D. Kang et al. Journal of Systems Architecture 160 (2025) 103349
Fig. 2. System design of CA-MOT: the key features are (a) an aging-aware scheduler that provides timing guarantees and a criticality-aware flexible MOT execution pipeline
including (b) a detection module that accommodates varying input sizes and (c) an association module that handles a varying number of objects for feature extraction.
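As a rough illustration of the structure in Fig. 2 (a scheduler thread handing one image at a time, together with its execution options, to an MOT pipeline thread over shared memory), a minimal sketch follows; the queue-based hand-off, the fixed option values, and the stubbed detector/extractor are assumptions for illustration, not the paper's implementation.

```python
# Two-thread sketch of Fig. 2: scheduler picks options, pipeline executes.
import queue
import threading

frames = queue.Queue()            # images arriving from camera sensors
handoff = queue.Queue(maxsize=1)  # scheduler -> pipeline hand-off
results = []                      # tracked output per frame

def scheduler():
    """Pick the next task and fix its execution options (constants here;
    the real scheduler chooses them per Algorithms 1-2)."""
    while True:
        frame = frames.get()
        if frame is None:          # shutdown signal
            handoff.put(None)
            return
        handoff.put({"frame": frame, "det_size": 256, "feat_budget": 3})

def pipeline():
    """Crop, detect, selectively extract features, associate (all stubbed)."""
    while True:
        job = handoff.get()
        if job is None:
            return
        detections = [f"{job['frame']}@{job['det_size']}"]  # detector stub
        features = detections[: job["feat_budget"]]         # extractor stub
        results.append((detections, features))              # association stub
```

The `maxsize=1` hand-off mimics the one-job-at-a-time, non-preemptive execution described later in Section 4.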
2.3. Different combination of detection and association

To address C2, Fig. 1(c) reveals an intriguing observation: different combinations of image sizes and numbers of feature extractions yield distinct effects on accuracy and execution time. The experiment was conducted over 100 consecutive frames.

In Fig. 1(c), the execution of detection or association is denoted by 𝑃 or 𝐹. 𝑃 represents partial computation, where detection is performed only on the region of interest (RoI) at a size of 256 × 256, including a safety-critical region, and association is limited to location-based association without feature extraction. 𝐹 represents full computation, where detection is performed on the entire image at a size of 672 × 672, and association includes feature extraction for all objects. The number in the upper right of the notation indicates how many times the combination of detection and association has been performed. For example, the notation 𝐹𝐹^50 𝑃𝑃^50 indicates that we use 𝐹 for both the detection and association steps in the first 50 frames and 𝑃 for both phases in the remaining 50 frames. To mitigate the issue of objects outside the critical region not being detected due to cropping, which can decrease the accuracy of the non-critical region, we utilize the position information of objects from the previous frame as predicted position information for the current frame using a prediction model such as the Kalman filter [16] during the execution of 𝑃 in the detection step. Except for 𝑃𝑃^100 and 𝐹𝐹^100, all combinations have the same proportion of 𝐹 and 𝑃 over the entire frames.

As shown in Fig. 1(c), although 𝐹𝐹^50 𝑃𝑃^50 and 𝑃𝐹^100 have execution times similar to (𝑃𝐹 + 𝐹𝑃)^100 and (𝐹𝐹 + 𝑃𝑃)^100, they show lower tracking accuracy. This indicates that different combinations of 𝐹 and 𝑃 can have a varying impact on accuracy. The observation in Fig. 1(c) necessitates a new scheduler that is capable of obtaining high tracking accuracy by capturing an MOT-specific property referred to as aging, which will be detailed in Section 3.4.

3. System design of CA-MOT

This section presents the goal and design of CA-MOT to address C1 and C2.

3.1. System overview

CA-MOT utilizes a tracking-by-detection approach, consisting of two steps: detection and association, where the front-end detector employs a pre-existing DNN-based detector to detect and classify objects in the input image, and the unmatched objects are matched using a location-based method like IoU. CA-MOT aims at providing prioritized tracking accuracy for the safety-critical region with a timing guarantee for every MOT execution on limited computing resources by addressing C1 and C2 discussed in Section 1, which leads to the following design goals:

• It provides different execution options not only for detection but also for association, considering the different criticality of regions in input images.
• It provides a timing guarantee while providing prioritized tracking accuracy for the safety-critical regions by exploiting an MOT-specific property.

To address the first goal, CA-MOT implements a criticality-aware flexible MOT execution pipeline in which detection and association are performed with different execution options by leveraging the observations discussed in Section 2.2. To address the second goal, CA-MOT develops an aging-aware task-level scheduler that maximizes accuracy while providing a timing guarantee by exploiting the observations discussed in Section 2.3, building upon the MOT execution pipeline. The MOT execution pipeline and the scheduler are implemented as separate threads that communicate through shared memory.

CA-MOT does not require modifications to existing DNN models (e.g., detectors and feature extractors), which allows for reusing most (if not all) stand-alone detectors and feature extractors. Notably, state-of-the-art detectors like YOLOv5 are inherently designed to handle varying input image sizes, and all CNN models can perform batch execution on multiple images (each corresponding to a different object). As shown in Fig. 2, the key features of CA-MOT are: (a) a scheduling policy that selects one input per camera, provides timing guarantees, and adjusts the workload for detection and association; (b) a module that processes detection with inputs of varying sizes; and (c) a module that extracts features from a pre-determined number of detected objects.

3.2. Workflow

Fig. 2 presents the workflow of CA-MOT. During system operation, the task scheduler maintains a queue to store images periodically received from each camera sensor (①). Then, the task scheduler determines the following for tasks in the queue: (a) the task to be scheduled, (b) the execution option for the detector, and (c) the execution option for the association (②). After an image moves to the MOT execution pipeline, the critical region identification module identifies a safety-critical region from the image and crops (or not) a RoI including the safety-critical region according to the execution option for the detector (③). Depending on the execution option, the cropped RoI or entire image is processed for detection (④). Furthermore, depending on the number of objects for which features are extracted, CA-MOT selectively extracts features for detected objects (⑤). All detected objects are then matched with the tracked objects from the previous frame. If both the detected and tracked objects have feature vectors, they are associated through feature-based matching. Otherwise, they are associated solely based on their locations (⑥).

3.3. Criticality-aware flexible MOT execution pipeline

The MOT execution pipeline conducts detection and association sequentially. CA-MOT can employ any existing stand-alone DNN-based detector as long as it can accommodate different sizes of input images (e.g., the YOLO series) and offer a clear trade-off between accuracy and execution time. For each input image with a size of 672 × 672, the
detector performs the detection to identify the location and class of multiple objects in the image. Once the scheduler determines the task (associated with an input image) to be scheduled and the execution option for the task, the detection is performed for the task according to the execution option. CA-MOT provides three execution options (i.e., low, middle, and high workloads, respectively) providing a trade-off between execution time and accuracy. For low- and middle-workload detections, CA-MOT first identifies the RoI with sizes of 256 × 256 and 416 × 416, respectively, and then detection is performed on the cropped RoI, which includes the safety-critical region. The area outside the RoI is not subject to detection, and the motion information (e.g., size, position, velocity, direction) of objects detected in the previous frame is used in prediction models such as the Kalman filter to obtain the estimated information of objects in the current frame. On the other hand, high-workload detection is performed on the original image with a size of 672 × 672.

We define the area that encompasses all safety-critical objects, which are objects with a time-to-collision of less than two seconds, as the safety-critical area. If the safety-critical area exceeds the input size for the detector, as determined by the detection process (e.g., 256 × 256 or 416 × 416), the safety-critical area is cropped and resized to the corresponding dimensions before being fed into the detector model. The locations of safety-critical objects are determined based on their most recently computed positions, without projecting future safety-critical regions from them. There are numerous existing approaches that calculate time-to-collision based on the relative positions of objects and the ego vehicle given LiDAR and IMU data, and we assume the use of one such method. It is also important to note that the KITTI dataset provides both LiDAR and IMU data. For example, areas where objects with a time-to-collision of less than 2 s congregate can be defined as safety-critical regions, and if multiple such areas exist, the encompassing area that includes all of them would be considered the safety-critical region. Please note that we adhere to the definition of the safety-critical region as defined in the existing paper DNN-SAM [5]. It is assumed that the critical region is pre-calculated by external sensors such as LiDAR and IMU and provided to CA-MOT. If an input image does not have a critical region, the entire frame is considered a critical region. As seen in Fig. 2, the GPU is used only for the inference of DNN models, such as the detector (e.g., YOLOv5) and feature extractor (e.g., OSNet), while all other execution is performed on the CPU.

For the association, the MOT system uses the two-step approach [7]. Initially, a CNN-based model (e.g., OS-Net [10]) is employed by the tracker to extract features from the detected objects. The tracker then compares these features between the current and previous frames to identify object pairs with the highest feature similarity. For the remaining objects that are not matched based on feature comparison, a location-based matching method such as IoU (intersection over union) is used. CA-MOT also provides three execution options (i.e., low, middle, and high workloads, respectively) for the association. When it comes to the middle- and high-workload associations, the tracker extracts features from some (e.g., three) of the detected objects or all of the detected objects, and then performs consecutive feature-based and location-based matchings. On the other hand, low-workload association performs location-based matching only. Depending on the execution option, CA-MOT may extract features from only a subset or all of the detected objects, which means that the feature information of objects may not be updated every time. Therefore, during feature-based matching, the algorithm compares the features extracted from the objects in the current frame with the closest previously extracted features of the tracked objects and matches the two objects with the highest feature similarity.

3.4. Aging-aware task scheduler

CA-MOT implements a thread-level task scheduler to determine the task to be scheduled and the execution options for detection and association for each task at every scheduling decision. The scheduler manages MOT tasks using a single queue and is triggered when a task completes its execution or a new task is released. As three execution options (e.g., low, middle, and high workloads) are provided for each of detection and association under CA-MOT, the scheduler decides the image size (e.g., 256 × 256, 416 × 416, and 672 × 672) for detection and the feature size (e.g., zero, three, and more than three) for association according to the scheduling algorithms (to be presented in Section 4).

As discussed in Section 2.3, various combinations of image sizes and the quantity of feature extractions result in different impacts on tracking accuracy. This is due to an important property of the MOT system, which involves supplementing non-updated motion or feature information during detection and association in the current frame by utilizing information from the previous frame. For example, in scenarios with low- and middle-workload detection, the detection process does not cover the area outside the RoI. Instead, the motion information of objects detected in the previous frame, such as their size, position, velocity, and direction, is leveraged to estimate the corresponding information for objects in the current frame. Moreover, during the association step, if the feature extracted from the immediately preceding frame is unavailable due to low- and middle-workload associations, the feature-based matching algorithm compares the features extracted from objects in the current frame with the features extracted from the nearest past frames. Therefore, the tracking accuracy is determined by the reliability of the reused motion and feature information of the objects. To capture this reliability, we propose a new notion of aging that specifies the number of middle- or high-workload executions of detection and association conducted from the beginning of the MOT task, respectively. In order to update the motion and feature information as frequently as possible using limited computing resources, it is necessary to balance the aging of detection and association for each task. Note that increasing the aging of detection and association for all tasks simultaneously in every MOT execution is generally not feasible due to limited computing resources. Therefore, a mechanism is required to balance the aging of detection and association for all tasks while providing timing guarantees under constrained resources. To this end, we propose new scheduling algorithms that will be detailed in the next section.

4. Scheduling algorithm

This section presents a task model and proposes new scheduling algorithms building upon CA-MOT.

4.1. Task model

Targeting MOT systems in AVs that involve 𝑛 camera sensors, we consider a set 𝜏 consisting of 𝑛 MOT tasks denoted as 𝜏𝑖 ∈ 𝜏. Each MOT task 𝜏𝑖 is responsible for conducting MOT execution using input images provided periodically by each camera sensor. As we employ the methodology of tracking-by-detection, an MOT task consists of detection and association sub-tasks. Thus, the specification of each MOT task 𝜏𝑖 is given as 𝜏𝑖 = (𝑇𝑖, 𝐶𝑖(𝑠𝑖, 𝑓𝑖), 𝐷𝑖), where 𝑇𝑖 represents the period (or the minimum inter-arrival time), 𝐶𝑖(𝑠𝑖, 𝑓𝑖) denotes the worst-case execution time (WCET) based on the execution options (i.e., low-, middle-, and high-workload execution) for the detection and association sub-tasks, respectively, and 𝐷𝑖 indicates the relative deadline. The execution time of the detection sub-task depends on the image size 𝑠𝑖 ∈ 𝑆𝑖 = {𝐿, 𝑀, 𝐻}, where 𝐿, 𝑀, 𝐻 are 256 × 256, 416 × 416, and 672 × 672, respectively. Note that CA-MOT supports arbitrary non-decreasing sizes for 𝑆𝑖 = {𝐿, 𝑀, 𝐻}. On the other hand, the execution time of the association sub-task depends on the feature size 𝑓𝑖 ∈ 𝐹𝑖 = {𝐿, 𝑀, 𝐻}, where 𝐿, 𝑀, 𝐻 are zero, from one to three, and more than three, respectively. Note that the tracking-by-detection methodology performs the association phase sequentially through feature-based matching followed by location-based matching using IoU. If 𝑓𝑖 is equal
Table 1
Notations used in the scheduling algorithms.

Symbol : Description
𝜏𝑖 : Task 𝑖 in the system
𝑇𝑖 : Period of task 𝜏𝑖 (minimum inter-arrival time)
𝐷𝑖 : Relative deadline of task 𝜏𝑖
𝐶𝑖(𝑋, 𝑌) : Worst-case execution time (WCET) of task 𝑖. 𝑋: image size for detection (𝐿, 𝑀, 𝐻); 𝑌: feature size for association (𝐿, 𝑀, 𝐻)
𝑠𝑖 : Image size for the detection sub-task of task 𝑖
𝑓𝑖 : Feature size for the association sub-task of task 𝑖
𝑆𝑖 : Set of image size options for task 𝑖 (𝑆𝑖 = {𝐿, 𝑀, 𝐻})
𝐹𝑖 : Set of feature size options for task 𝑖 (𝐹𝑖 = {𝐿, 𝑀, 𝐻})
𝐿 : Low workload execution
𝑀 : Middle workload execution
𝐻 : High workload execution
𝐶𝑖𝐷(𝑠𝑖) : WCET of the detection sub-task of task 𝑖, based on image size 𝑠𝑖
𝐶𝑖𝐴(𝑓𝑖) : WCET of the association sub-task of task 𝑖, based on feature size 𝑓𝑖
𝑅𝐶𝑖(𝐿, 𝐿) : Remaining execution time for the minimum execution of task 𝑖
𝑎𝑔𝑒𝑖𝐷 : Aging value of the detection sub-task of task 𝑖
𝑎𝑔𝑒𝑖𝐴 : Aging value of the association sub-task of task 𝑖
𝑠𝑙𝑎𝑐𝑘𝑖^𝑡𝑐𝑢𝑟 : Slack time available for task 𝑖 at the current time 𝑡𝑐𝑢𝑟
𝑞𝑖 : Minimum execution time of task 𝑖 in the interval [𝑡𝑐𝑢𝑟, 𝑑1(𝑡𝑐𝑢𝑟)]
𝑝 : Sum of the minimum execution times for all tasks
𝑑1(𝑡𝑐𝑢𝑟) : Earliest deadline or future release time at time instant 𝑡𝑐𝑢𝑟
𝑠𝑙𝑎𝑐𝑘𝑖𝐷 : Remaining slack after executing high-workload detection for task 𝑖
𝑠𝑙𝑎𝑐𝑘𝑖𝐴 : Remaining slack after executing high-workload association for task 𝑖

to 𝐿, this indicates that no feature extraction has been performed for the frame, and thus feature-based matching is skipped, proceeding directly to location-based matching. In the case of 𝐻, we employ the maximum number of objects as defined by the environment (for the dataset considered, this is based on values measured across all videos), for example, 10. Then, the worst-case execution time 𝐶𝑖(𝑠𝑖, 𝑓𝑖) of each MOT task 𝜏𝑖 is derived as follows.

𝐶𝑖(𝑠𝑖, 𝑓𝑖) = 𝐶𝑖𝐷(𝑠𝑖) + 𝐶𝑖𝐴(𝑓𝑖), (1)

where 𝐶𝑖𝐷(𝑠𝑖) and 𝐶𝑖𝐴(𝑓𝑖) are the worst-case execution times of the detection and association sub-tasks according to 𝑠𝑖 and 𝑓𝑖, respectively. As shown in Fig. 2, both the detection and the association sub-tasks involve GPU operations, with their respective WCETs including the communication costs between the CPU and GPU. Note that the detection sub-task, denoted as 𝜏𝑖𝐷, and the association sub-task, denoted as 𝜏𝑖𝐴, are executed consecutively without any preemption while sharing the same period and relative deadline. Similarly, when an active task is running, it executes without any interruptions, while other tasks wait in the queue. In addition, each task runs in an environment where non-preemption between the GPU and CPU is guaranteed. To ensure this, while the CPU is running, the GPU waits for input from the CPU. Once the GPU receives the input and is activated, the CPU waits until it receives the results from the GPU, as illustrated in Fig. 2. As seen in Fig. 2, the GPU is used only for the inference of DNN models, such as the detector (e.g., YOLOv5) and feature extractor (e.g., OSNet), while all other execution is performed on the CPU. Also, CA-MOT does not allow parallel execution of multiple MOT executions (see Table 1).

The worst-case execution time 𝐶𝑖𝐷(𝑠𝑖) of the detection sub-task is determined by the sum of various components, including preprocessing time (such as cropping and resizing the input image), image transfer time from CPU memory to GPU memory, model inference time to obtain candidate objects, and postprocessing time (e.g., applying non-maximum suppression) to extract the final objects from the candidates. On the other hand, the worst-case execution time 𝐶𝑖𝐴(𝑓𝑖) of the association sub-task depends on the feature size 𝑓𝑖. It is calculated by considering the time required for extracting features from detected objects and performing matching methods such as feature-based and IoU-based matching. An MOT task 𝜏𝑖 is considered schedulable if every job 𝐽𝑖 (invoked by 𝜏𝑖) completes its execution within the relative deadline 𝐷𝑖. The overall schedulability of the system is determined by ensuring that every task 𝜏𝑖 ∈ 𝜏 is schedulable.

4.2. EDF best-effort

Building upon the system design of CA-MOT presented in Section 3, we develop two scheduling algorithms that aim to provide not only high tracking accuracy for the safety-critical regions but also a timing guarantee for every MOT execution. To this end, the proposed scheduling algorithms have the following two features: (F1) an offline timing guarantee for the minimum execution (i.e., low-workload execution for both detection and association) of every MOT execution and (F2) an online policy to maximize tracking accuracy by systematically increasing the workload (i.e., middle- or high-workload execution) of an MOT execution using notions of slack and aging without compromising the timing guarantee.

The proposed scheduling algorithms are based on the non-preemptive earliest deadline first (EDF) scheduling algorithm, which assigns higher priority to jobs with earlier deadlines without allowing any preemption. To provide the first feature F1, CA-MOT employs the existing schedulability analysis developed for non-preemptive EDF as follows.

Lemma 1. For a set 𝜏 of MOT tasks scheduled by non-preemptive EDF, the minimum execution 𝐶𝑖(𝐿, 𝐿) of every task 𝜏𝑖 ∈ 𝜏 can be executed without deadline miss as long as the following holds for every task 𝜏𝑖 ∈ 𝜏.

max𝜏𝑖∈𝜏 𝐶𝑖(𝐿, 𝐿) / min𝜏𝑖∈𝜏 𝑇𝑖 + ∑𝜏𝑖∈𝜏 𝐶𝑖(𝐿, 𝐿)/𝑇𝑖 ≤ 1.0 (2)

Proof. The lemma presents a schedulability condition for non-preemptive EDF, and its proof is outlined as follows. Let us target 𝜏𝑘 ∈ 𝜏; also, consider a virtual task 𝜏𝑥 ∉ 𝜏, whose 𝑇𝑥 and 𝐶𝑥(𝐿, 𝐿) are set to min𝜏𝑖∈𝜏 𝑇𝑖 and max𝜏𝑖∈𝜏 𝐶𝑖(𝐿, 𝐿), respectively. Now, we compare the finishing time of a job of 𝜏𝑘 when (Case 1) 𝜏 is scheduled by non-preemptive EDF, and (Case 2) 𝜏 ∪ {𝜏𝑥} is scheduled by preemptive EDF. Since at most one lower-priority job can block a high-priority job under non-preemptive scheduling, 𝜏𝑘 can be blocked by at most one lower-priority job under Case 1; obviously, the WCET of the lower-priority job is upper-bounded by max𝜏𝑖∈𝜏 𝐶𝑖(𝐿, 𝐿). Also, to block all the following jobs of 𝜏𝑘, the blocking frequency should be no smaller than 𝑇𝑘, which is lower-bounded by min𝜏𝑖∈𝜏 𝑇𝑖. Therefore, the finishing time of a job of 𝜏𝑖 under Case 1 is no later than that under Case 2. Once we apply the well-known schedulability condition for preemptive EDF to Case 2, the condition is the same as Eq. (2), which proves the lemma. □

Note that the proof is self-contained, but a different proof for Lemma 1 can be found in [5,17].

To provide the second feature F2, the proposed scheduling algorithms (i) dynamically increase the workload of each MOT task (e.g., from low workload to middle or high workload) without compromising the timing guarantee while (ii) balancing the aging of detection and association of every task. We propose two scheduling algorithms that simultaneously provide (i) and (ii) in different ways: EDF-BE (EDF Best-Effort) and EDF-Slack (EDF with Slack reclamation), adapted from [5]. EDF-BE and EDF-Slack utilize slacks defined differently, but use the same mechanism (in Algorithm 2) to decide on the execution option that employs a notion of aging.

Let 𝑑1(𝑡𝑐𝑢𝑟) be the earliest deadline or future release time among all tasks at a time instant 𝑡𝑐𝑢𝑟. The slack 𝑠𝑙𝑎𝑐𝑘𝑖^𝑡𝑐𝑢𝑟 of task 𝜏𝑖 at 𝑡𝑐𝑢𝑟 under EDF-BE is defined as the expected remaining time up to
Algorithm 1 Slack calculation for 𝜏𝑘 at 𝑡𝑐𝑢𝑟 under EDF-Slack
Input: 𝜏, 𝑡𝑐𝑢𝑟
Output: 𝑠𝑙𝑎𝑐𝑘𝑘^𝑡𝑐𝑢𝑟
1: 𝑝 = 0, 𝑈 = the left-hand side of Eq. (2)
2: for 𝑖 = 𝑛 to 1, 𝜏𝑖 ∈ {𝜏1, ..., 𝜏𝑛 | 𝑑1(𝑡𝑐𝑢𝑟) ≤ ⋯ ≤ 𝑑𝑛(𝑡𝑐𝑢𝑟)} do
3:   𝑈 = 𝑈 − 𝐶𝑖(𝐿, 𝐿)/𝑇𝑖
4:   𝑞𝑖 = max(0, 𝑅𝐶𝑖(𝐿, 𝐿) − (1 − 𝑈) ⋅ (𝑑𝑖(𝑡𝑐𝑢𝑟) − 𝑑1(𝑡𝑐𝑢𝑟)))
5:   𝑈 = min(1.0, 𝑈 + (𝑅𝐶𝑖(𝐿, 𝐿) − 𝑞𝑖)/(𝑑𝑖(𝑡𝑐𝑢𝑟) − 𝑑1(𝑡𝑐𝑢𝑟)))
6:   𝑝 = 𝑝 + 𝑞𝑖
7: end for
8: return 𝑠𝑙𝑎𝑐𝑘𝑘^𝑡𝑐𝑢𝑟 = 𝑑1(𝑡𝑐𝑢𝑟) − 𝑡𝑐𝑢𝑟 − 𝑝

Fig. 3. Execution timeline of multiple MOT tasks under (a) baseline (non-preemptive EDF), (b) EDF-BE, and (c) EDF-Slack scheduling policies.

𝑑1(𝑡𝑐𝑢𝑟) after the execution of 𝐶𝑖(𝐿, 𝐿) is completed, which is calculated by 𝑑1(𝑡𝑐𝑢𝑟) − 𝑡𝑐𝑢𝑟 − 𝐶𝑖(𝐿, 𝐿). This slack value is only valid when there are no more than two tasks in the waiting queue at time 𝑡𝑐𝑢𝑟 and no future releases within the interval [𝑡𝑐𝑢𝑟, 𝑑1(𝑡𝑐𝑢𝑟)). Using the slack value conditionally provided at a scheduling decision, EDF-BE can perform middle- or high-workload execution for detection and/or association.

Example. Figs. 3(a) and (b) present a scheduling scenario of the baseline algorithm (i.e., non-preemptive EDF) and EDF-BE with an example task set. We consider an example task set 𝜏 = {𝜏1, 𝜏2} for which 𝐶𝑖 = 𝐶𝑖𝐷(𝐻) + 𝐶𝑖𝐴(𝐻) = 25, 𝑇𝑖 = 25, 𝐶𝑖𝐷(𝑠𝑖) = {5, 9, 12}, and 𝐶𝑖𝐴(𝑓𝑖) = {3, 8, 13} hold for 𝜏𝑖 ∈ 𝜏. As shown in Figs. 3(a) and (b), the first jobs of 𝜏1 and 𝜏2 are released at 𝑡 = 0 and 𝑡 = 13, respectively. In the baseline algorithm, the first job of 𝜏1 executes for 25 time units, and then the first job of 𝜏2 starts its execution at 𝑡 = 25, resulting in a deadline miss at 𝑡 = 38. Let 𝑎𝑔𝑒𝑖𝐷 and 𝑎𝑔𝑒𝑖𝐴 be the aging values of detection and association of 𝜏𝑖. The aging value is an integer satisfying 𝑎𝑔𝑒𝑖𝐷, 𝑎𝑔𝑒𝑖𝐴 ≥ 0, and 𝑎𝑔𝑒𝑖𝐷 and 𝑎𝑔𝑒𝑖𝐴 for all tasks 𝜏𝑖 ∈ 𝜏 are set to zero at the beginning of the system. Then, 𝑎𝑔𝑒𝑖𝐷 (and 𝑎𝑔𝑒𝑖𝐴) increases by one each time a detection (and association) is run with middle or high workload. In other words, the aging value refers to the number of executions excluding those with low workloads. By adjusting the aging value, a balance is maintained so that neither detection nor association becomes disproportionately large.

Compared to EDF-Slack, EDF-BE is a simpler algorithm that utilizes as many resources as possible, executing a job for more than 𝐶𝑖(𝐿, 𝐿) up to its closest future release only when there is exactly one job in the waiting queue. EDF-BE naturally guarantees no deadline misses in any job execution. This is because, as stated in Lemma 1, the execution of 𝐶𝑖(𝐿, 𝐿) without deadline misses for all jobs is guaranteed under EDF. Furthermore, when a job executes for more than 𝐶𝑖(𝐿, 𝐿) under EDF-BE, there is only one active job in the waiting queue at that time. In the case of EDF-BE, when the first job of task 𝜏1 starts its execution at 𝑡 = 0, it executes the minimum execution 𝐶1(𝐿, 𝐿) = 8 until the earliest deadline or future release at 𝑡 = 13, resulting in a slack of five. Utilizing this slack, the task 𝜏1 then executes 𝐶1(𝑀, 𝐿), and the aging factor 𝑎𝑔𝑒1𝐷 increases by one. For the first job of task 𝜏2, released at 𝑡 = 13, there is a slack of 4 until the earliest deadline or future release at 𝑡 = 25. Thus, 𝐶2(𝑀, 𝐿) is executed, and 𝑎𝑔𝑒2𝐷 increases by one. The second job of task 𝜏1, released at 𝑡 = 25, has a slack of 5 until the earliest deadline or future release at 𝑡 = 38. To balance 𝑎𝑔𝑒1𝐷 and 𝑎𝑔𝑒1𝐴, 𝐶1(𝐿, 𝑀) is executed, and 𝑎𝑔𝑒1𝐴 increases by one. The details of the online policy that effectively balances the aging of detection and association for each task will be provided at the end of this section. As can be observed from the figure, at each scheduling decision, an execution is performed that does not exceed the earliest deadline or future release, ensuring execution without deadline misses. The following theorem presents the timing guarantee of EDF-BE.

Theorem 1. A task set 𝜏 that satisfies the condition in Eq. (2) is schedulable by EDF-BE.

Proof. According to Lemma 1, for a task set 𝜏 that satisfies Eq. (2), the minimum execution time 𝐶𝑖(𝐿, 𝐿) of all tasks 𝜏𝑖 ∈ 𝜏 guarantees execution without deadline misses. At each scheduling decision at 𝑡 under the online policy of EDF-BE, the execution of a job exploiting any slack value does not impose additional interference on any other job. This guarantees that all tasks 𝜏𝑖 receive no more interference than what they would receive under non-preemptive EDF scheduling. Thus, this theorem holds. □

4.3. EDF with slack reclamation

In the case of EDF-BE, more workload than the minimum execution can only be processed when there is a single job in the waiting queue at a given time 𝑡𝑐𝑢𝑟 and no additional releases occur until 𝑑1(𝑡𝑐𝑢𝑟). This creates a limited opportunity for MOT tasks in CA-MOT to perform more workload than the minimum execution, thus restricting the potential to improve tracking accuracy. To address this limitation, we integrate the approach presented in [5] into EDF-Slack, allowing it to compute slack in a different way than EDF-BE.

Let 𝑑𝑖(𝑡𝑐𝑢𝑟) denote the 𝑖th earliest deadline or release time at 𝑡𝑐𝑢𝑟, and 𝑅𝐶𝑖(𝐿, 𝐿) represent the remaining execution time required to complete the minimum execution 𝐶𝑖(𝐿, 𝐿). Algorithm 1 outlines the calculation of the slack value 𝑠𝑙𝑎𝑐𝑘𝑘^𝑡𝑐𝑢𝑟 for task 𝜏𝑘 at 𝑡𝑐𝑢𝑟 within the EDF-Slack algorithm, triggered at each scheduling decision. Since EDF is a job-level fixed-priority scheduling policy, wherein the priority of a job remains constant throughout its execution, scheduling decisions under EDF occur either at the commencement of a job's execution or upon its completion. In the interval [𝑡𝑐𝑢𝑟, 𝑑1(𝑡𝑐𝑢𝑟)], EDF-Slack processes tasks in reverse EDF order, starting from the task with the latest deadline. Job 𝐽𝑘 of 𝜏𝑘 has the highest priority at 𝑡𝑐𝑢𝑟, with 𝑑1(𝑡𝑐𝑢𝑟) being its deadline, as EDF-Slack follows the EDF policy. The goal of the slack calculation in Algorithm 1 is to delay the execution of all other tasks 𝜏𝑖 ∈ 𝜏 ⧵ {𝜏𝑘} beyond 𝑑1(𝑡𝑐𝑢𝑟) while ensuring that future deadlines are met. This is repeated for all tasks in the waiting queue. To ensure 𝜏𝑖 completes 𝐶𝑖(𝐿, 𝐿) before 𝑑𝑖(𝑡𝑐𝑢𝑟), EDF-Slack calculates the maximum execution time in the interval [𝑑1(𝑡𝑐𝑢𝑟), 𝑑𝑖(𝑡𝑐𝑢𝑟)], which is (1 − 𝑈) ⋅ (𝑑𝑖(𝑡𝑐𝑢𝑟) − 𝑑1(𝑡𝑐𝑢𝑟)), where 𝑈 denotes the left-hand side of Eq. (2).

The key steps in the slack calculation are as follows:

• 𝑞𝑖 is computed as the minimum execution of 𝜏𝑖 in the interval [𝑡𝑐𝑢𝑟, 𝑑1(𝑡𝑐𝑢𝑟)] (Lines 3-4).
• 𝑅𝐶𝑖(𝐿, 𝐿) is either zero or 𝐶𝑖(𝐿, 𝐿), since scheduling decisions are only made upon job completion or release in non-preemptive scheduling (Line 4).
6
D. Kang et al. Journal of Systems Architecture 160 (2025) 103349
• The execution rate of τ_i in the interval [d_1(t_cur), d_i(t_cur)] is calculated and recorded (Line 5).
• p is set as the sum of the minimum execution times of all tasks τ_i ∈ τ (Line 6).
• The slack is then determined as the remaining time slots, excluding p (i.e., the sum of q_i), within the interval [t_cur, d_1(t_cur)] (Line 7).
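The steps above can be sketched roughly in Python. The task record, field names, and the treatment of U are illustrative assumptions for this sketch, not the authors' implementation (Eq. (2) itself is not reproduced in this excerpt):

```python
from collections import namedtuple

# Illustrative task record: absolute deadline and remaining minimum execution C_i(L, L).
Task = namedtuple("Task", ["deadline", "rem_low"])

def slack(queue, t_cur, U):
    """Sketch of the slack computation for the highest-priority job queue[0].

    queue is EDF-sorted; U stands in for the left-hand side of Eq. (2).
    """
    d1 = queue[0].deadline
    p = 0.0
    for tau in queue[1:]:                       # other tasks in the waiting queue
        # Execution that tau_i may defer into [d_1, d_i]: (1 - U) * (d_i - d_1).
        deferrable = (1.0 - U) * (tau.deadline - d1)
        # q_i: minimum execution that must still fall inside [t_cur, d_1].
        q_i = max(0.0, tau.rem_low - deferrable)
        p += q_i
    # Slack: slots in [t_cur, d_1] left over after queue[0]'s own minimum execution and p.
    return (d1 - t_cur) - queue[0].rem_low - p
```

For instance, with queue = [Task(10, 4), Task(20, 6)], t_cur = 0, and U = 0.5, the second task can defer five execution units past d_1, leaving q = 1 inside [0, 10] and a slack of 5 for the highest-priority job.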
Example. Fig. 3(c) illustrates a scheduling scenario of EDF-Slack using the same example tasks as shown in Figs. 3(a) and (b). The initial jobs of τ_1 and τ_2 are released at t = 0 and t = 13, respectively. Applying Algorithm 1, the calculated slack value for τ_1 at t_cur = 0 is 17, allowing the first job of τ_1 to execute for C_1(H, H) until t = 25. Furthermore, age_1^D and age_1^A increment by one. Subsequently, the first job of τ_2 begins its execution at t = 25, executing for C_2(M, L) while increasing age_2^D by one. Finally, the second job of τ_2 starts its execution at t = 37.

Comparing Fig. 3(b), which represents EDF-BE, with Fig. 3(c), depicting EDF-Slack, we observe that the aging of τ_1 and τ_2 increases by the same amount in both cases. However, the key difference lies in the execution of the first job of τ_1. Under EDF-Slack, this job is able to execute with a high-workload execution, while under EDF-BE, it can only execute with a middle-workload execution, which allows for higher expectations of tracking accuracy in EDF-Slack.

The following proves the timing guarantee of EDF-Slack.

Theorem 2. A task set τ that satisfies Eq. (2) is schedulable under EDF-Slack.

Proof. We prove this by contradiction. Assume, for the sake of contradiction, that the task set τ satisfies Eq. (2), but is not schedulable under EDF-Slack. This implies that at some time t, the total utilization exceeds 1.0, and hence a deadline miss occurs for some job J_i in τ. Let t_miss denote the earliest such time at which a deadline miss occurs, i.e., t_miss = d_i, where d_i is the deadline of J_i. By the definition of EDF-Slack, at each time t, the slack time for each task is computed based on the highest-priority job J_1(t_cur), where t_cur denotes the current time. Since no tasks are released in the interval [t_cur, d_1(t_cur)], the slack time ensures that lower-priority tasks cannot block the execution of J_1. As a result, the blocking term in Eq. (2) remains valid during this interval. Now, since EDF-Slack is based on EDF scheduling, the total utilization U(t) at any time t can be expressed as:

U(t) = Σ_{J_i ∈ τ(t)} C_i / (d_i − t_cur) + B(t),

where C_i is the remaining execution time of task J_i, and B(t) is the blocking term. According to Eq. (2), U(t) ≤ 1.0 for all t. Since t_miss is the earliest time a deadline miss occurs, we must have U(t_miss) > 1.0. However, by Eq. (2), we know that U(t) ≤ 1.0 for all t ≥ t_cur, including t_miss. This leads to a contradiction, as the assumption that U(t_miss) > 1.0 contradicts the fact that U(t) ≤ 1.0 holds at all times. Therefore, no deadline miss can occur, and the task set τ is schedulable under EDF-Slack. □

Note that the proof is self-contained, but a different proof can be found in [5].

Determination of execution options. EDF-BE and EDF-Slack use different slack concepts to ensure timely execution of tasks while improving tracking accuracy by executing beyond the minimum (i.e., C_i(L, L)). As shown in Figs. 3(b) and (c), both EDF-BE and EDF-Slack enhance the aging of detection and association through predefined mechanisms. The goal of these mechanisms is to balance the aging of detection and association, minimizing continuous omissions in updating motion and feature information, thereby maximizing tracking accuracy.

Algorithm 2 outlines the process for determining the execution options for the detection and association steps of task τ_k at time t_cur based on the slack slack_k^{t_cur} calculated in Algorithm 1.

Algorithm 2 Determination of execution options
Input: τ, t_cur, slack_k^{t_cur}
Output: (s_k, f_k)
1: if slack_k^{t_cur} ≤ 0 then
2:   return (L, L)
3: else
4:   if age_k^D < age_k^A then
5:     slack_k^D = slack_k^{t_cur} − (C_k^D(H) − C_k^D(L))
6:     if slack_k^D ≥ 0 then
7:       return (H, f_k(slack_k^D + C_k^A(L)))
8:     else
9:       return (s_k(slack_k^{t_cur} + C_k^D(L)), L)
10:    end if
11:  else
12:    slack_k^A = slack_k^{t_cur} − (C_k^A(H) − C_k^A(L))
13:    if slack_k^A ≥ 0 then
14:      return (s_k(slack_k^A + C_k^D(L)), H)
15:    else
16:      return (L, f_k(slack_k^{t_cur} + C_k^A(L)))
17:    end if
18:  end if
19: end if

• If the slack is less than or equal to zero, the algorithm returns L and L (Lines 1–2).
• Otherwise, the algorithm compares the ages of the detection step (age_k^D) and the association step (age_k^A) (Lines 3–4).
  – If age_k^D is smaller than age_k^A, indicating the detection step requires more resources, the algorithm calculates slack_k^D, representing the remaining slack after executing high-workload detection (Line 5).
  – If slack_k^D is greater than or equal to zero, the high-workload detection is followed by middle- or high-workload association depending on slack_k^D (Lines 6–7). In this case, f_k(x) is set as follows: L for x < C_k^A(M), M for C_k^A(M) ≤ x < C_k^A(H), and H for x ≥ C_k^A(H).
  – If slack_k^D is less than zero, the algorithm determines if middle- or high-workload detection can be performed based on slack_k^{t_cur}, followed by low-workload association (Lines 8–10). In this case, s_k(x) is set as follows: L for x < C_k^D(M), M for C_k^D(M) ≤ x < C_k^D(H), and H for x ≥ C_k^D(H).
• Lines 11–18 follow a similar procedure for determining the execution options, giving preference to the association step. Here, slack_k^A represents the remaining slack after executing the high-workload association.

According to the definition of aging, age_k^D (and age_k^A) increases by one when middle- or high-workload detection (and association) is performed.
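Rendered as executable Python, the branch structure of Algorithm 2 looks roughly as follows; the task container and field names are assumptions for illustration, not the authors' code:

```python
from dataclasses import dataclass

@dataclass
class TaskState:
    age_D: int   # aging of the detection step
    age_A: int   # aging of the association step
    CD: dict     # detection WCETs per option, e.g. {'L': 2, 'M': 4, 'H': 6}
    CA: dict     # association WCETs per option

def level(x, C):
    """Piecewise map shared by f_k (association) and s_k (detection)."""
    if x < C['M']:
        return 'L'
    if x < C['H']:
        return 'M'
    return 'H'

def execution_options(task, slack):
    """Return (detection_option, association_option) following Algorithm 2."""
    if slack <= 0:                                    # Lines 1-2
        return ('L', 'L')
    if task.age_D < task.age_A:                       # Lines 4-10: prefer detection
        slack_D = slack - (task.CD['H'] - task.CD['L'])
        if slack_D >= 0:
            return ('H', level(slack_D + task.CA['L'], task.CA))
        return (level(slack + task.CD['L'], task.CD), 'L')
    slack_A = slack - (task.CA['H'] - task.CA['L'])   # Lines 11-18: prefer association
    if slack_A >= 0:
        return (level(slack_A + task.CD['L'], task.CD), 'H')
    return ('L', level(slack + task.CA['L'], task.CA))
```

For example, a task whose detection is staler than its association (age_D < age_A) and whose slack covers the high-workload detection upgrade returns 'H' for detection and then grades the association option through level().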
DNN-SAM proposed in [5] introduces two scheduling algorithms: EDF-MandFirst and EDF-Slack. Unlike CA-MOT, both DNN-SAM algorithms target multi-object detection (MOD) tasks. The primary distinction between MOT and MOD lies in the presence or absence of dependencies between consecutive frames. In MOD, the detection operation for a given frame does not utilize any information from previous frames. Therefore, techniques that rely on previous frame information,
such as aging-aware methods, cannot be employed in the DNN-SAM algorithms. Another key difference is that DNN-SAM is responsible solely for detection execution and does not handle the association task. Both DNN-SAM and CA-MOT algorithms are based on EDF and prioritize executing jobs with the earliest deadlines among the released tasks. However, in contrast to CA-MOT, DNN-SAM splits each job at release into a mandatory job, responsible for execution in the safety-critical area, and an optional job, responsible for execution in non-critical areas. When any mandatory job is present in the waiting queue, it is always executed first using the EDF algorithm. The distinction between EDF-MandFirst and EDF-Slack arises from whether the execution of an optional job may interfere with the execution of a mandatory job. Specifically, the scheduling behavior of DNN-SAM and EDF-Slack operates as follows:

• EDF-MandFirst in [5]: Any mandatory job in the waiting queue has a higher priority than optional jobs and is scheduled using EDF. If no mandatory jobs are in the queue, optional jobs are executed using EDF, ensuring that they do not interfere with the execution of future release jobs of mandatory tasks.
• EDF-Slack in [5]: Any mandatory job in the waiting queue has a higher priority than optional jobs and is scheduled using EDF. If no mandatory jobs are in the queue, optional jobs are executed using EDF, potentially interfering with the execution of future-release mandatory jobs, within the slack calculated from the job's runtime.
• EDF-BE of CA-MOT: A job is not split and has three execution options for both detection and association. It is executed with the maximum workload option to avoid interfering with the execution of future release jobs, but only when exactly one job is present in the waiting queue. The aging of detection and association tasks is considered for accuracy maximization.
• EDF-Slack of CA-MOT: A job is not split and has three execution options for both detection and association. Regardless of the number of jobs in the waiting queue, the job is executed with the maximum workload option, potentially interfering with the execution of future release jobs based on the slack calculated from its runtime. The aging of detection and association tasks is considered for accuracy maximization.

Fig. 4. Comparison for two tasks with the periods (equal to the relative deadlines) of 180 ms and 270 ms.

Table 2
Execution time measurement (average and maximum) in terms of image size, feature size, and scheduling overhead.

Time (ms)   C_i^D                 C_i^A                 C_i^sce
            L      M      H       L      M      H
Average     28.0   30.6   36.7    8.3    63.4   74.4    0.3
Maximum     43.6   53.5   67.6    11.3   74.0   125.2   0.6
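The mandatory-first selection rule described above can be illustrated with a minimal sketch; the job record and field names are assumptions for illustration, not code from DNN-SAM or CA-MOT:

```python
from collections import namedtuple

# Illustrative job record for the DNN-SAM-style split into mandatory/optional jobs.
Job = namedtuple("Job", ["name", "deadline", "mandatory"])

def pick_next(queue):
    """Non-preemptive EDF with strict priority for mandatory jobs."""
    mandatory = [j for j in queue if j.mandatory]
    pool = mandatory if mandatory else queue   # optional jobs run only when no mandatory job waits
    return min(pool, key=lambda j: j.deadline) # earliest deadline first within the pool
```

Note that a mandatory job is chosen here even when an optional job has an earlier deadline, which is exactly the interference question that separates EDF-MandFirst from EDF-Slack.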
5. Evaluation

This section evaluates the effectiveness of CA-MOT in achieving R1 and R2 for multiple MOT tasks.

5.1. Experiment setting

• Software: CA-MOT employs the tracking-by-detection paradigm, in which the detector is one of the most popular detectors, the YOLOv5 [14] model, and the tracker is StrongSORT [7]. We confirmed that other detectors (i.e., YOLOX, Faster-RCNN) exhibit a similar trend to YOLOv5 in terms of MOTA and execution time, as shown in Fig. 7. For feature extraction conducted as a part of association, we used OS-Net [10]. The YOLOv5 model was pretrained on the COCO Dataset [18], while OS-Net was pretrained on the MSMT Dataset [19]. The experimental environment is with Ubuntu 18.04.6 LTS, CUDA 11.4, and PyTorch 1.12.
• Hardware: We consider the NVIDIA Jetson Xavier as a GPU-enabled embedded board [20]. The NVIDIA Jetson Xavier features a 64-bit 8-core CPU, 32 GB memory, and a 512-core Volta GPU. We utilized the MAXN mode provided by the NVIDIA Jetson Xavier.
• Dataset and performance metric: We used the KITTI Dataset [13], which contains data collected from autonomous vehicle driving. To evaluate the accuracy of each region, we measured the MOTA [15] as the most well-known performance metric for tracking accuracy for critical and entire regions. MOTA compares the ground truth of objects in all frames with the tracking results obtained from the given techniques to measure accuracy. The KITTI dataset consists solely of data captured from forward-facing cameras and does not utilize different cameras, meaning there is no overlap in the areas they cover. Additionally, as it does not assume simultaneous capture by each camera, there are no synchronization issues. CA-MOT aims to maximize the average accuracy for the MOT tasks corresponding to all given cameras without missing any deadlines. This assumes that CA-MOT operates independently of camera interdependencies, with all cameras receiving the same forward-facing camera feed.
• Execution time measurement: To obtain the WCET of different execution options for detection and association, we measured the execution time by iterating 1000 times for each sub-task with three different execution options of an MOT task and then took the largest value. We also measured the worst-case time required for slack calculation and scheduling decisions such as Algorithms 1 and 2. Table 2 shows the measurement results.

5.2. Experiment result

We consider task sets in which schedulability is not guaranteed with the high-workload execution for detection and association (i.e., C_i(H, H)) for all tasks but is guaranteed with the minimum execution (C_i(L, L)) according to Eq. (2). Note that the schedulability with C_i(x, y) for x, y ∈ {L, M, H} can be judged with Eq. (2) by substituting C_i(L, L) with C_i(x, y). To evaluate the effectiveness of CA-MOT, we consider the following approaches, including a baseline and our two proposed ones.

• Detection first (DF): non-preemptive EDF in which the execution option of all tasks τ_i ∈ τ is equally fixed to the rightmost one among {C_i(L, L), C_i(M, L), C_i(H, L), C_i(H, M), C_i(H, H)} that satisfies the schedulability condition in Eq. (2).
• EDF-BE: EDF-BE, of which the task set passes the schedulability condition in Eq. (2), which is proposed in Section 4.2.
• EDF-Slack: EDF-Slack, of which the task set passes the schedulability condition in Eq. (2), which is proposed in Section 4.3.
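The substitution-based schedulability judgment can be sketched as follows. Since Eq. (2) itself lies outside this excerpt, a utilization-style EDF bound with a blocking term is assumed here as a stand-in, and the helper names are illustrative:

```python
from types import SimpleNamespace

def make_task(period, wcets):
    # Hypothetical helper bundling a period with per-option WCETs C_i(x, y).
    return SimpleNamespace(period=period, C=wcets)

def schedulable(tasks, option, blocking=0.0):
    """Judge schedulability under a fixed execution option (x, y).

    tasks: objects with .period and .C, a dict mapping an option pair
    such as ('H', 'L') to its WCET. The bound below is an assumed
    stand-in for Eq. (2), not the paper's exact condition.
    """
    utilization = sum(t.C[option] / t.period for t in tasks) + blocking
    return utilization <= 1.0
```

Judging a richer option such as ('H', 'M') then amounts to swapping the WCET looked up per task, mirroring the substitution of C_i(L, L) by C_i(x, y) described above.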
Fig. 4 represents the tracking accuracy and the proportion of the three execution options (i.e., L, M, and H) selected during detection and association for two tasks with different periods: 180 and 270 ms (milliseconds). As shown in Fig. 4(a), for overall accuracy, EDF-BE and EDF-Slack achieve 20.2% and 26.6%, respectively, while DF achieves
Fig. 5. Comparison for four tasks with the same period (equal to the relative deadline) of 400 ms.

Fig. 6. Visualization on KITTI dataset for three tasks with the periods of 400 ms. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
13.4%, which demonstrates the effectiveness of slack utilization and balancing the aging of detection and association in increasing tracking accuracy. We observe that the slack reclamation performed by Algorithm 1 in EDF-Slack is significantly more effective in achieving high tracking accuracy than in EDF-BE, which has limitations in obtaining a substantial amount of slack. For critical accuracy, EDF-BE and EDF-Slack achieve much higher accuracies, which are 28.3% and 32.2%, respectively, compared to 15.4% of DF. Based on this observation, we can interpret that even though EDF-BE obtains a smaller amount of slack compared to EDF-Slack, it efficiently performs tracking for the critical region with limited computing resources. On the other hand, EDF-Slack provides high tracking accuracy not only for the entire region but also for the safety-critical region, thanks to its efficient slack reclamation. As seen in Fig. 4, EDF-Slack exhibits a significantly higher proportion of high-workload and middle-workload execution for detection and association compared to other execution options. On the other hand, EDF-BE shows a slight proportion of middle-workload execution, while the majority of cases involve low-workload execution.

Fig. 5 depicts the results of another experiment involving three different sets of tasks, with the number of tasks ranging from two to four, all having the same periods (i.e., 400 ms with a guaranteed minimum execution C_i(L, L), but no guaranteed maximum execution C_i(H, H) for τ_i ∈ τ). In Fig. 5(a), the tracking accuracy of the evaluated approaches is shown as the number of tasks increases. For the case of two tasks, EDF-Slack achieves an overall accuracy of 41.8% and a critical accuracy of 41.4%, while EDF-BE achieves an overall accuracy of 24.3% and a critical accuracy of 27.2%. In contrast, DF achieves lower accuracy, with an overall accuracy of 18.0% and a critical accuracy of 18.7%. As the number of tasks increases, both EDF-BE and EDF-Slack experience a decrease in accuracy, but they still outperform DF in terms of tracking accuracy. Even with only four tasks, EDF-BE yields lower overall accuracy than DF, as it can only detect part of the image when selecting the high workload option. Nevertheless, by prioritizing computations in critical regions at low and medium workloads, EDF-BE attains higher critical accuracy than DF. Fig. 5(b) presents the distribution of execution options for EDF-BE and EDF-Slack when there are three tasks. Similar to Fig. 4, it is evident that both EDF-BE and EDF-Slack allocate the workload between the detection and association steps in a balanced manner using the ages. Additionally, EDF-Slack can reclaim more slack compared to EDF-BE.

Fig. 6 presents the tracking outcomes of a single task within a set of three tasks, each with a period of 400 ms, comparing (a) the DF algorithm and (b) EDF-Slack. In the visualization, each tracked object is represented by a unique color and ID within a bounding box, with the symbol # indicating the frame number. In the DF scenario, each task executes C_i(H, L), leading to insufficient computational resources for proper association. This inadequacy results in DF's failure to track two objects within the safety-critical region in the 167th frame and causes an ID switch from 4 to 10 in the subsequent 168th frame, as illustrated in Fig. 6. Conversely, EDF-Slack leverages aging and slack techniques to allocate sufficient computational resources for both detection and association tasks, enabling accurate tracking of all objects in the safety-critical region.

Fig. 7. MOTA and execution time on other detectors.

Additional experiments are conducted to ascertain if CA-MOT exhibits comparable behavioral patterns across a range of detectors, including YOLOv5, which was evaluated previously. Fig. 7 displays the MOTA and execution time for various contemporary detectors, analyzed according to their workload. Modern detectors are generally classified into one-stage and two-stage categories based on their architecture, and further into anchor-free and anchor-based types, contingent on their use of predefined anchors for object detection. Our study incorporated YOLOv5, a standard one-stage anchor-based detector. We also investigate the performance of the two-stage anchor-based detector Faster-RCNN [9] and the one-stage anchor-free detector YOLOX [21] to verify the consistency of results. Faster-RCNN utilized ResNet-50 as its backbone network, while YOLOX was configured with a small version model. Both models were trained using the COCO dataset. Despite minor discrepancies in specific ratios, the results consistently demonstrate that both MOTA and execution time escalate in conjunction with increasing workload, as shown in Figs. 7(a) and (b). The runtime trend of YOLOX is particularly noteworthy, as it closely mirrors that of YOLOv5. This pattern indicates that similar outcomes may be expected from other detectors akin to YOLOv5.

6. Related work

The tracking-by-detection model is a commonly used method in the MOT field. It has shown significant progress and enhanced performance recently, largely thanks to the evolution of deep neural networks (DNNs). A well-recognized model in this field, SORT (simple online and real-time tracking) [22], does its matching based mainly on where objects are located, using detection tools to achieve this. To push this model further, DeepSORT [6] builds on the SORT model by adding a DNN-based re-identification model. This allows for the extraction of object features. By adding this layer, DeepSORT utilizes both the object's location and its visual information, leading to a stronger performance. Recent work in this area, such as Deep OC-SORT [23] and StrongSORT [7], is geared towards enhancing the accuracy of these models even more, focusing especially on refining and improving the matching algorithms used in these systems. However, it is critical to understand that these approaches are mainly designed for situations where there are plenty of computing resources. Therefore, they might struggle to meet the timing needs in systems that are restricted in resources, like
the embedded systems in self-driving vehicles where resources may be scarce.

Considering self-driving vehicles, which are fundamentally systems where safety is critical, even the smallest delays or slight drops in accuracy can lead to significant and potentially dangerous risks. Some research, such as DNN-SAM [5], has tried to tackle these problems by suggesting frameworks that concentrate specifically on safety-critical areas. These frameworks give priority to critical accuracy and use uncertainty handling to ensure the highest safety standards. However, these research studies and their related approaches are mostly designed for multi-object detection systems and may not directly apply to or be effective in multi-object tracking. Likewise, another study, RT-MOT [2], aims to maximize the overall accuracy of multi-object tracking and ensure on-time execution, but it overlooks the importance of individual objects in its approach. To address these limitations, our suggested framework, known as CA-MOT, aims to confront these challenges directly. By leveraging the unique traits of multi-object tracking in safety-critical systems, CA-MOT ensures on-time execution and boosts tracking accuracy for objects that could potentially be dangerous to the system. It builds on previous work while addressing their weaknesses to create a safer and more efficient tracking system.

7. Discussion

A limitation of CA-MOT is its exclusive reliance on a single CPU and GPU, which restricts scalability. A recent approach, Batch-MOT [3], addresses this limitation by processing input images from multiple cameras through a shared queue, distributing CPU operations across multiple CPUs, and employing batch processing on a single GPU. However, this approach may introduce additional communication overhead among CPUs, potentially limiting its overall efficiency. The primary contribution of Batch-MOT lies in its online schedulability analysis, which dynamically determines the maximum number of images that can be batch-processed without violating their deadlines. Nonetheless, unlike CA-MOT, Batch-MOT lacks support for multiple execution strategies during the association phase, resulting in suboptimal resource utilization for individual MOT tasks. Enhancing CA-MOT by incorporating batch processing capabilities to address these shortcomings presents a promising avenue for future research. Furthermore, as highlighted in previous studies, deploying CA-MOT on real-world platforms, such as the F1/10 autonomous driving platform [5], offers significant potential for further investigation and practical validation.

8. Conclusion

In this paper, we proposed CA-MOT, a new criticality-aware MOT execution and scheduling framework. Aiming at achieving critical-accuracy maximization and a timing guarantee, CA-MOT first proposes a new system design to offer a control knob between tracking accuracy and timing guarantee to efficiently utilize limited computing resources. Then, CA-MOT develops two scheduling algorithms to effectively utilize the system design while using the notions of slack and aging of detection and association. Using various task sets and real-world autonomous driving data, we demonstrated that CA-MOT can obtain high tracking accuracy of entire and safety-critical regions while ensuring the timely execution of all MOT tasks.

CRediT authorship contribution statement

Donghwa Kang: Writing – original draft, Software, Methodology, Formal analysis. Jinkyu Lee: Writing – review & editing, Validation, Formal analysis. Hyeongboo Baek: Writing – review & editing, Supervision, Funding acquisition, Formal analysis, Conceptualization.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Hyeongboo Baek reports financial support was provided by National Research Foundation of Korea. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00250742, 2022R1A4A3018824, RS-2024-00438248). This work was partly supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP)-ITRC (Information Technology Research Center) grant funded by the Korea government (MSIT) (IITP-2025-RS-2023-00259061).

Data availability

Data will be made available on request.

References

[1] M. Yang, S. Wang, J. Bakita, T. Vu, F.D. Smith, J.H. Anderson, J.-M. Frahm, Re-thinking CNN frameworks for time-sensitive autonomous-driving applications: Addressing an industrial challenge, in: Proceedings of IEEE Real-Time Technology and Applications Symposium, IEEE, 2019, pp. 305–317.
[2] D. Kang, S. Lee, H.S. Chwa, S.-H. Bae, C.M. Kang, J. Lee, H. Baek, RT-MOT: Confidence-aware real-time scheduling framework for multi-object tracking tasks, in: Proceedings of IEEE Real-Time Systems Symposium, IEEE, 2022, pp. 318–330.
[3] D. Kang, S. Lee, C.-H. Hong, J. Lee, H. Baek, Batch-MOT: Batch-enabled real-time scheduling for multi-object tracking tasks, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. (2024).
[4] S. Liu, X. Fu, M. Wigness, P. David, S. Yao, L. Sha, T. Abdelzaher, Self-cueing real-time attention scheduling in criticality-aware visual machine perception, in: Proceedings of IEEE Real-Time Technology and Applications Symposium, IEEE, 2022, pp. 173–186.
[5] W. Kang, S. Chung, J.Y. Kim, Y. Lee, K. Lee, J. Lee, K.G. Shin, H.S. Chwa, DNN-SAM: Split-and-merge DNN execution for real-time object detection, in: Proceedings of IEEE Real-Time Technology and Applications Symposium, 2022, URL https://rtcl.dgist.ac.kr/index.php/publication-2/.
[6] N. Wojke, A. Bewley, D. Paulus, Simple online and realtime tracking with a deep association metric, in: Proceedings of the IEEE International Conference on Image Processing, IEEE, 2017, pp. 3645–3649.
[7] Y. Du, Y. Song, B. Yang, Y. Zhao, StrongSORT: Make DeepSORT great again, 2022, arXiv preprint arXiv:2202.13514.
[8] J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: Unified, real-time object detection, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2016, pp. 779–788.
[9] R. Girshick, Fast R-CNN, in: Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1440–1448.
[10] K. Zhou, Y. Yang, A. Cavallaro, T. Xiang, Omni-scale feature learning for person re-identification, in: Proceedings of the IEEE International Conference on Computer Vision, 2019, pp. 3702–3712.
[11] Y. Zhang, P. Sun, Y. Jiang, D. Yu, F. Weng, Z. Yuan, P. Luo, W. Liu, X. Wang, ByteTrack: Multi-object tracking by associating every detection box, in: Proceedings of the European Conference on Computer Vision, Springer, 2022, pp. 1–21.
[12] Y. Zhang, C. Wang, X. Wang, W. Zeng, W. Liu, FairMOT: On the fairness of detection and re-identification in multiple object tracking, Int. J. Comput. Vis. 129 (11) (2021) 3069–3087.
[13] A. Geiger, P. Lenz, R. Urtasun, Are we ready for autonomous driving? The KITTI vision benchmark suite, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012.
[14] ultralytics, YOLOv5 [Online], 2022, Available: https://github.com/ultralytics/yolov5.
[15] K. Bernardin, R. Stiefelhagen, Evaluating multiple object tracking performance: the CLEAR MOT metrics, EURASIP J. Image Video Process. 2008 (2008) 1–10.
[16] G. Welch, G. Bishop, et al., An introduction to the Kalman filter, ACM SIGGRAPH (1995).
[17] T.P. Baker, A stack-based resource allocation policy for realtime processes, in: Proceedings of IEEE Real-Time Systems Symposium, IEEE, 1990, pp. 191–200.
[18] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, C.L. Zitnick, Microsoft COCO: Common objects in context, in: Proceedings of the European Conference on Computer Vision, Springer, 2014, pp. 740–755.
[19] L. Wei, S. Zhang, W. Gao, Q. Tian, Person transfer GAN to bridge domain gap for person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 79–88.
[20] NVIDIA, NVIDIA Xavier Developer Kit [Online], 2022, Available: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-agx-xavier.
[21] Z. Ge, S. Liu, F. Wang, Z. Li, J. Sun, YOLOX: Exceeding YOLO series in 2021, 2021, arXiv preprint arXiv:2107.08430.
[22] A. Bewley, Z. Ge, L. Ott, F. Ramos, B. Upcroft, Simple online and realtime tracking, in: Proceedings of the IEEE International Conference on Image Processing, IEEE, 2016, pp. 3464–3468.
[23] G. Maggiolino, A. Ahmad, J. Cao, K. Kitani, Deep OC-SORT: Multi-pedestrian tracking by adaptive re-identification, 2023, arXiv preprint arXiv:2302.11813.

Donghwa Kang is a Ph.D. course student in the School of Computing, Korea Advanced Institute of Science and Technology (KAIST), South Korea. He received BS and MS degrees in computer science from Incheon National University (INU) in 2022 and 2024, respectively. His research interests include artificial intelligence, autonomous systems, and real-time embedded systems.

Jinkyu Lee is an associate professor in the Department of Computer Science and Engineering at Sungkyunkwan University (SKKU), South Korea, where he joined in 2014. He received the BS, MS, and Ph.D. degrees in computer science from the Korea Advanced Institute of Science and Technology (KAIST), South Korea, in 2004, 2006, and 2011, respectively. He was a research fellow/visiting scholar in the Department of Electrical Engineering and Computer Science, University of Michigan, until 2014. His research interests include system design and analysis with timing guarantees, QoS support, and resource management in real-time embedded systems and cyber-physical systems. He won the best student paper award from the 17th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS) in 2011 and the Best Paper Award from the 33rd IEEE Real-Time Systems Symposium (RTSS) in 2012.

Hyeongboo Baek is an associate professor in the Department of Artificial Intelligence, University of Seoul (UOS), South Korea. He received the BS degree in Computer Science and Engineering from Konkuk University, South Korea, in 2010, and the MS and Ph.D. degrees in Computer Science from KAIST, South Korea, in 2012 and 2016, respectively. His research interests include cyber-physical systems, real-time embedded systems, and system security. He won the best paper award from the 33rd IEEE Real-Time Systems Symposium (RTSS) in 2012.
@@ -0,0 +1,788 @@
Computer Standards & Interfaces 97 (2026) 104111
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi

Refining decision boundaries via dynamic label adversarial training for robust traffic classification

Haoyu Tong a,c,d, Meixia Miao b,c,d, Yundong Liu a,c,d, Xiaoyu Zhang a,c,d,∗, Xiangyang Luo c,d, Willy Susilo e

a State Key Laboratory of Integrated Service Networks (ISN), Xidian University, 710121, Xi'an, China
b School of Cyberspace Security, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
c Key Laboratory of Cyberspace Security, Ministry of Education of China, 450001, Zhengzhou, China
d Henan Key Laboratory of Cyberspace Situation Awareness, 450001, Zhengzhou, China
e School of Computing and Information Technology, University of Wollongong, Wollongong, Australia
ARTICLE INFO

Keywords:
Traffic classification
Adversarial examples
Adversarial training
Label noise

ABSTRACT

Network traffic classification plays a critical role in securing modern communication systems, as it enables the identification of malicious or abnormal patterns within traffic data. With the growing complexity of network environments, deep learning models have emerged as a compelling solution due to their ability to automatically learn discriminative representations from raw traffic. However, these models are highly vulnerable to adversarial examples, which can significantly degrade their performance by introducing imperceptible perturbations. While adversarial training (AT) has emerged as a primary defense, it often suffers from label noise, particularly when hard labels are forcibly assigned to adversarial examples whose true class may be ambiguous. In this work, we first analyze the detrimental effect of label noise on adversarial training, revealing that forcing hard labels onto adversarial examples can cause excessive shifts of the decision boundary away from the adversarial examples, which in turn degrades the model's generalization. Motivated by the theoretical analysis, we propose Dynamic Label Adversarial Training (DLAT), a novel AT framework that mitigates label noise via dynamically mixed soft labels. DLAT interpolates the logits of clean and adversarial examples to estimate the labels of boundary-adjacent examples, which are then used as soft labels for adversarial examples. By adaptively aligning the decision boundary toward the vicinity of adversarial examples, the framework constrains unnecessary boundary shifts and alleviates generalization degradation caused by label noise. Extensive evaluations on network traffic classification benchmarks validate the effectiveness of DLAT in outperforming standard adversarial training and its variants in both robustness and generalization.
1. Introduction

Network traffic classification, which aims to determine the application or service associated with observed traffic packets, flows, or sessions, serves as a fundamental building block in a wide range of networking tasks, including intrusion detection, quality-of-service management, and traffic engineering [1,2]. In the early stages of network management, classification was carried out mainly through port-based identification [3,4] and deep packet inspection (DPI) [5,6]. However, these traditional approaches have become increasingly ineffective due to the widespread use of dynamic port allocation, encrypted communication protocols, and intentional obfuscation techniques [7,8]. As network environments become more complex and security-conscious, there is a growing demand for more intelligent and adaptive classification methods that do not rely on payload visibility or fixed port mappings.

In recent years, deep learning (DL) [9] has become a dominant paradigm for network traffic classification due to its ability to automatically extract underlying representations from raw or lightly processed traffic data [10-14]. Compared to traditional statistical or machine learning approaches that rely heavily on manual feature engineering, deep neural networks, including convolutional, recurrent, and Transformer-based architectures, can effectively capture spatial and temporal patterns in traffic data, enabling high accuracy even in challenging scenarios such as previously unseen traffic. However,
✩ This article is part of a Special issue entitled: Secure AI published in Computer Standards & Interfaces.
∗ Corresponding author at: State Key Laboratory of Integrated Service Networks (ISN), Xidian University, 710121, Xi'an, China.
E-mail addresses: haoyutong@stu.xidian.edu.cn (H. Tong), miaofeng415@163.com (M. Miao), yundongliu@stu.xidian.edu.cn (Y. Liu),
xiaoyuzhang@xidian.edu.cn (X. Zhang), xiangyangluo@126.com (X. Luo), wsusilo@uow.edu.au (W. Susilo).
https://doi.org/10.1016/j.csi.2025.104111
Received 26 October 2025; Received in revised form 29 November 2025; Accepted 8 December 2025
Available online 13 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
despite their impressive performance, deep learning-based classifiers remain highly susceptible to adversarial examples. These are deliberately crafted inputs with imperceptible perturbations that cause models to misclassify [15,16]. In the context of traffic classification, adversarial perturbations can manipulate flow-level features or packet sequences in ways that evade detection without disrupting the underlying communication protocols. To mitigate this vulnerability, adversarial training has been widely adopted as a defense mechanism by introducing adversarial examples during model training to enhance robustness [17].

While adversarial training is effective in many domains, applying it to traffic classification poses unique challenges. Unlike natural image domains, traffic data distributions typically exhibit higher intrinsic dimensionality and more complex manifold structures. Different application protocols often share significant common subsequences at the byte level, creating naturally entangled features that separate classes through subtle statistical patterns rather than distinct visual characteristics. Furthermore, unlike images, where semantic meaning is often locally correlated, traffic features exhibit long-range dependencies across packet sequences, making them particularly sensitive to small, strategically placed perturbations. These characteristics cause even minor perturbations to readily shift traffic samples across class boundaries, leading to significant label noise during training. This issue is further exacerbated by standard adversarial training practices [18], which introduce perturbed examples into the training set while still assigning them the same labels as their clean counterparts, thereby intensifying the semantic mismatch between the true and assigned labels.

Traditional adversarial training typically enforces the original hard label on adversarial examples. While effective to some extent, this rigid supervision introduces significant label noise, especially when adversarial examples cross or approach decision boundaries. Consequently, the decision boundary is pushed away from perturbed examples, often reinforcing the robustness of the class in which the adversarial example is located at the expense of others. This imbalance undermines the overall robustness of the model, particularly in tasks such as traffic classification, where class semantics are inherently ambiguous and sensitive to perturbations.

To address this issue, we propose Dynamic Label Adversarial Training (DLAT), a novel adversarial training framework designed to mitigate the adverse effects of excessive label noise in robust network traffic classification. Rather than rigidly assigning the original hard label to adversarial examples, DLAT constructs soft labels for examples near decision boundaries through a similarity-guided strategy that takes advantage of the model's output distributions. Such soft labels help guide the decision boundary toward the neighborhood of adversarial examples, rather than forcing it away due to overconfident and potentially incorrect supervision. Instead of explicitly approximating the decision boundary using computationally intensive techniques, such as multi-step adversarial attacks with decaying step sizes, DLAT leverages the similarity between the output logits of clean and perturbed inputs to estimate the soft labels of examples near the decision boundary. Since the similarity between the output distributions of clean and adversarial examples reflects how close the adversarial example lies to the current decision boundary, it serves as a reliable proxy for boundary proximity. Based on this similarity, DLAT interpolates between the model's predictions on the clean and adversarial inputs. When the adversarial and clean outputs are closely aligned, the soft label remains closer to the clean prediction; conversely, greater divergence triggers a softer supervisory signal that better reflects the model's uncertainty about the adversarial input. Concretely, when the adversarial example is far from the boundary, a larger weight is assigned to the clean prediction; when it is close to the boundary, more weight is allocated to the adversarial output. This similarity-guided interpolation enables precise estimation of soft labels for boundary-adjacent examples, which in turn facilitates more accurate adjustment of the decision boundary. By avoiding rigid hard-label supervision, this adaptive labeling mechanism mitigates semantic distortion, reduces the risk of reinforcing incorrect decision boundaries, and helps the model learn more robust decision surfaces under label noise. Our key contributions are outlined as follows:

• We extend the understanding of label noise in adversarial training to the domain of network traffic classification. The compact and entangled distribution of traffic data makes it vulnerable to small perturbations, increasing the likelihood of label inconsistency in adversarial examples. This inconsistency corresponds to a higher degree of label noise, which enforces incorrect alignment and impedes the learning of robust decision boundaries.

• We provide a theoretical characterization of how hard-label supervision on shifted adversarial examples induces excessive movement of the decision boundary. Specifically, enforcing high-confidence predictions for adversarial examples distorts the classifier, increasing the risk of misclassification for nearby examples from other classes.

• We introduce a novel adversarial training method called DLAT, which dynamically assigns soft labels to adversarial examples based on their estimated proximity to the decision boundary. Instead of assigning uniform soft labels or incurring high computational overhead through explicit boundary detection, DLAT estimates soft labels through interpolation between clean and adversarial examples, substantially reducing the cost of label generation.

2. Related work

2.1. Traffic classification

Traffic classification, the task of identifying and categorizing network traffic based on application types, has evolved significantly over the years. Traditional methods such as port-based classification and payload inspection (DPI) were initially dominant but became ineffective due to dynamic port allocation, encryption, and protocol obfuscation. Statistical and machine learning-based approaches later emerged, leveraging flow-level features (e.g., packet size, inter-arrival time) to classify encrypted and unencrypted traffic. However, these methods still relied on manual feature engineering, which is time-consuming and error-prone. The advent of DNNs revolutionized traffic classification by automating feature extraction and improving accuracy. Lotfollahi et al. [10] first applied deep learning to traffic classification: by leveraging stacked autoencoders (SAE) and CNN architectures, their method automatically extracts network traffic features and achieves efficient classification of encrypted traffic. Subsequent studies have advanced DL-based traffic classification in both accuracy and applicability. Wang et al. [19] proposed an end-to-end 1D-CNN model that processes raw packet bytes to capture spatial patterns, eliminating the need for manual feature design. Lan et al. [20] combined 1D-CNN, Bi-LSTM, and multi-head attention to classify darknet traffic, leveraging side-channel features to enhance robustness. LEXNet [21] further improved deployment efficiency by introducing a lightweight and interpretable CNN with residual connections and a prototype layer, enabling real-time inference on edge devices without sacrificing accuracy. Liu et al. [22] introduced TransECA-Net, an innovative hybrid architecture combining ECANet-enhanced CNN modules with Transformer encoders to simultaneously extract local channel-wise features and global temporal dependencies.
2.2. Adversarial example attacks and defense

While deep learning has significantly advanced traffic classification, it inherits the inherent vulnerabilities of DNNs and is susceptible to adversarial example attacks. Adversarial examples are inputs deliberately modified with subtle perturbations that cause the model to produce incorrect predictions while remaining imperceptible to human observers. This vulnerability poses serious challenges to the security and reliability of DL-based traffic classification systems, highlighting the need for robust defense methods. Szegedy et al. [23] first revealed this weakness by formulating an optimization problem to find minimal perturbations that cause misclassification, attributing the phenomenon to local linearity in deep networks. Goodfellow et al. [15] introduced the Fast Gradient Sign Method (FGSM), which efficiently generates adversarial examples by leveraging a linear approximation of the loss function. Kurakin et al. [24] extended FGSM to an iterative version (BIM) to improve attack success. Madry et al. [17] further enhanced this with Projected Gradient Descent (PGD), adding random initialization to avoid local optima and establishing a robust attack benchmark. Carlini and Wagner [25] proposed a strong optimization-based attack, C&W, that effectively bypasses gradient-masking defenses. Sadeghzadeh et al. [16] extended adversarial attacks to the traffic classification field, proposing the adversarial pad attack and the adversarial payload attack for packet and flow classification respectively, as well as the adversarial burst attack targeting the statistical characteristics of flow time series.

Adversarial training (AT) is a widely adopted defense strategy that enhances DNNs' robustness against such attacks by incorporating adversarial examples into the training process. Proposed by Goodfellow et al. [15], AT initially used FGSM adversarial examples combined with clean examples for optimization. Madry et al. [17] showed that stronger PGD-based adversarial examples provide better robustness through a min-max optimization. However, PGD training often leads to overfitting on adversarial examples and reduced accuracy on clean data, highlighting a trade-off between robustness and generalization. To address this, Zhang et al. [26] introduced TRADES to balance this trade-off with a regularized loss. Wang et al. [27] proposed MART, which treats misclassified examples differently to enhance robustness. Dong et al. [28] developed AWP, combining input and weight perturbations to flatten the loss landscape and further reduce robust error. However, the aforementioned methods were originally proposed for image classification tasks and are not specifically designed for robust traffic classification, so directly applying them may not yield optimal results. For example, adversarial training applied to traffic data frequently induces substantial label noise, and inadequate handling of such noise can considerably hinder the improvement of model robustness.

3. Preliminaries

3.1. Pre-processing

Consider a raw network traffic flow as a discrete byte-level sequence of arbitrary length. Formally, a raw traffic flow is defined as a variable-length sequence:

F = (b_1, b_2, …, b_L), (1)

where L ∈ ℕ⁺ denotes the sequence length and each byte b_i ∈ Z_256 = {0, 1, …, 255}. The flow F thus resides in the input space 𝔽 = ⋃_{k=1}^∞ Z_256^k, which encompasses all finite-length byte sequences. Following the methodology proposed by [19], each raw traffic flow F is standardized to a fixed length of 784 bytes to enable batch processing and compatibility with convolutional neural networks. Specifically, the transformation pipeline Ψ: 𝔽 → Z_256^{28×28} consists of:

Truncation. To standardize the size of the input dimensions of the model, we truncate the flow to the first 784 bytes:

τ_k(F) = (b_1, …, b_{min(L,k)}), k = 784. (2)

Zero-Padding. For flows with L < 784, zero-padding is applied to ensure uniform dimensionality:

π_784(F) = (b_1, …, b_L, 0, …, 0) if L < 784, and τ_784(F) otherwise. (3)

Image Mapping. The resulting 784-dimensional vector is reshaped into a 28 × 28 grayscale image in row-major order. We define the mapping Φ: Z_256^784 → Z_256^{28×28} as:

Φ(f) = [ b_1 b_2 ⋯ b_28 ; b_29 b_30 ⋯ b_56 ; ⋮ ⋮ ⋱ ⋮ ; b_757 b_758 ⋯ b_784 ], (4)

where f = π_784(F) is the padded byte vector. This bijection arranges bytes row-by-row into a square image.

Normalization. Finally, pixel values are normalized to the range [0, 1]:

N(Φ(f))_{i,j} = Φ(f)_{i,j} / 255. (5)

The resulting tensor x = N(Φ(π_784(F))) ∈ [0, 1]^{28×28} is used as the input to downstream neural models.

3.2. Notation

Let x ∈ [0, 1]^{28×28} denote the resulting input image. The neural network takes x as input and outputs either class predictions (e.g., traffic type or application label) or binary decisions (e.g., benign vs. malicious), depending on the task. Consider a K-class classification task on the dataset D = {(x_i, y_i)}_{i=1}^N, where the x_i are preprocessed network traffic and y_i ∈ Y = {1, …, K} are class labels. We consider a parameterized model f_θ: [0, 1]^{28×28} → Y that maps a normalized grayscale image x to a probability distribution over classes (i.e., p = f_θ(x)); the final predicted label is obtained by ŷ = arg max_k p_k. We denote the loss function used in the standard training process by:

L_st(θ, D) = (1/N) Σ_{i=1}^N ℓ(f_θ(x_i), y_i), (6)

where N is the number of training examples and ℓ(·) denotes a loss function that measures the discrepancy between the model prediction and the ground-truth label (e.g., cross-entropy).

3.3. Adversarial attack

Deep learning models are known to be vulnerable to adversarial examples, inputs perturbed by imperceptible noise that induce incorrect predictions. Network traffic classifiers based on deep learning inherit this vulnerability: small, carefully designed perturbations can cause significant degradation in classification performance. Formally, given a trained model f_θ: [0, 1]^{28×28} → Y and a clean input x, an adversary aims to craft a perturbed input x′ = x + δ such that:

minimize ‖δ‖_p, subject to f_θ(x + δ) = y_target and x + δ ∈ [0, 1]^{28×28}, (7)

where δ denotes the adversarial perturbation and ‖·‖_p (p ∈ {0, 1, 2, ∞}) quantifies the perturbation magnitude. For traffic image inputs, x′ = x + δ maintains the structural properties of legitimate traffic while causing misclassification. Under a white-box threat model, where adversaries possess full knowledge of both the preprocessing pipeline Ψ and the classifier parameters θ, attacks are executed directly in the image domain.
Crucially, the perturbation is constrained to the payload region of the traffic image rather than the padding area.

Payload-Constrained Perturbation. To ensure semantic fidelity when mapping perturbed inputs back to the traffic domain, the adversarial perturbation δ is restricted to the non-padding (i.e., payload) region:

M = {(i, j) | 28(i − 1) + j ≤ L}, (8)

where M denotes the set of pixels corresponding to the original L bytes of the flow F. During attack iterations, any updates falling outside M are explicitly zeroed out. While this constraint does not achieve the theoretically optimal adversarial perturbation, it aligns with realistic payload limitations in network traffic and therefore produces semantically faithful perturbations that are more suitable for practical deployment. In this work, we adopt PGD (Projected Gradient Descent) [17] as our primary adversarial method. Specifically, we perform iterative updates on the input image within the allowed perturbation budget ε and constrain the perturbation to the valid traffic region M:

x^{t+1} = Π_{B_ε(x) ∩ M}( x^t + α · sign(∇_x L(f_θ(x^t), y)) ), (9)

where L denotes the loss function, Π is the projection operator that restricts the updated input to the intersection of the valid region M and the ℓ_p-ball of radius ε centered at x, and α is the step size.

3.4. Adversarial training

One of the most effective defenses against adversarial attacks is adversarial training (AT), which enhances model robustness by incorporating adversarial examples into the training process. Specifically, it formulates the training objective as a min-max optimization:

min_θ (1/N) Σ_{i=1}^N max_{‖δ_i‖_p ≤ ε} ℓ(f_θ(x_i + δ_i), y_i). (10)

For network traffic classifiers, we extend this paradigm with payload-aware constraints:

min_θ (1/N) Σ_{i=1}^N max_{δ_i ∈ S_i} ℓ(f_θ(x_i + δ_i), y_i), (11)

where S_i = {δ : ‖δ‖_p ≤ ε and δ_{(i,j)} = 0 for all (i, j) ∉ M_i} is the constraint set for the i-th example.

4. Label noise

Label noise in adversarial training refers to the semantic mismatch between the assigned labels and the true labels of adversarial examples. As first observed by Dong et al. [18], this phenomenon arises from the practice of assigning adversarial examples the same labels as their clean inputs. Given a clean input-label pair (x, y), adversarial training constructs a perturbed input x′ = x + δ and assigns it the original label y during training. However, the true label of x′ may differ due to the semantic distortion introduced by the adversarial perturbation δ. This distributional shift is especially detrimental to learning robust representations, as it misguides the optimization process.

4.1. Amplified label noise in robust traffic classification

While label noise poses a general challenge in adversarial training, it becomes even more prominent in the context of robust network traffic classification. Unlike image data, where semantic changes are often human-perceivable, traffic data is inherently opaque and lacks intuitive visual features. Consequently, different classes of traffic data are compactly distributed and highly entangled; small perturbations in the byte-level input space can lead to disproportionately large semantic changes that are not easily detectable by human inspection. In such a scenario, the probability of label mismatch between clean and adversarial examples increases. Let x be the image representation of a network flow (or packet) and x′ = x + δ be its adversarial example. In standard adversarial training, each sample is annotated with a hard label y, while the underlying ground-truth semantics are better represented by a softer distribution P(Y | x′), especially for adversarial examples lying close to the decision boundary. This inherent discrepancy between the hard label and the true soft distribution can be regarded as label noise. Under adversarial perturbations, such mismatches are further amplified, leading to a higher effective label noise rate, which we define as

p_e(D′) = (1/N) Σ_{i=1}^N 𝟙[ y_i ≠ arg max P(Y | x′_i) ], (12)

where D′ = {(x′_i, y_i)} denotes the adversarial training set and P(Y | x′_i) reflects the (unknown) ground-truth label distribution of the perturbed input. Such excessive label noise disrupts supervised learning, preventing the model from accurately learning the underlying discriminative features of the data. As a result, the classifier may overfit to incorrect labels or adversarial patterns rather than the true class semantics. This issue is particularly critical in adversarial training for traffic classification, where decision boundaries between classes are inherently subtle and highly sensitive to small perturbations.

4.2. Impact of label noise on decision boundary robustness

Adversarial training assumes that the label of an adversarial example remains unchanged from its clean example. However, when an adversarial example crosses the decision boundary into a region semantically aligned with a different class, assigning it the original label introduces semantic inconsistency. We formalize this effect in a binary classification setting. Let the input space be X ⊂ ℝ^d and the label space be Y = {A, B}. Consider a classifier f_θ: X → [0, 1], where f_θ(x) denotes the predicted probability of class A and 1 − f_θ(x) the probability of class B. The decision boundary is defined by the hypersurface B_θ = {x ∈ X | f_θ(x) = 0.5}. We consider an adversarial example x′ generated from a clean input x of class A, such that x′ lies in the classification region of class B, i.e., f_θ(x′) < 0.5. During adversarial training, if x′ is labeled as A (i.e., the same as x), then minimizing the loss on x′ pushes the decision boundary toward class B, potentially degrading the robustness of that class.

Definition 1 (Margin Distance). Given an example x ∈ X and a classifier f: X → [0, 1], the margin distance from x to the decision boundary B = {x′ ∈ X | f(x′) = 0.5} is defined as:

dist(x, B) = min_{x′ ∈ B} ‖x′ − x‖_p. (13)

Theorem 1 (Excessive Boundary Shift Induced by Hard-Label Adversarial Training). Consider a binary classifier f: X → [0, 1], with the pre-training decision boundary defined as:

B_pre = {x ∈ X | f_pre(x) = 0.5}. (14)

Suppose x_A ∈ X_A is a clean example from class A and x′_A = x_A + δ is an adversarial example generated to cross B_pre, i.e., f_pre(x′_A) < 0.5. Let f_post be the classifier obtained via hard-label adversarial training using (x′_A, y_A) as supervision, where y_A = 1. Then, under hard-label supervision, the training objective enforces high-confidence predictions for x′_A, i.e.,

f_post(x′_A) ≫ 0.5, (15)

which necessarily implies that the new decision boundary B_post = {x | f_post(x) = 0.5} must satisfy

dist(x′_A, B_post) = (f_post(x′_A) − 0.5) / ‖∇_x f_post(x′_A)‖_p. (16)
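To make the payload-constrained PGD update of Eqs. (8)-(9) concrete, the sketch below runs the masked iteration against a toy logistic classifier whose input gradient is analytic (so no deep learning framework is required). The classifier, function name, and parameter values are our own illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def pgd_payload(x, y, w, b, mask, eps=0.05, alpha=0.01, steps=10):
    """Payload-constrained PGD (Eqs. (8)-(9)) against a toy logistic
    classifier f(x) = sigmoid(w . x + b). The gradient of the binary
    cross-entropy loss w.r.t. the input is (p - y) * w, so no autograd
    is needed. Pixels outside the payload mask M are never modified."""
    x_adv = x.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(np.dot(w, x_adv.ravel()) + b)))
        grad = ((p - y) * w).reshape(x.shape)
        # ascent step, zeroed outside the payload region (Eq. (8))
        x_adv = x_adv + alpha * np.sign(grad) * mask
        # project back into the eps-ball around x (Eq. (9))
        x_adv = np.clip(x_adv, x - eps, x + eps)
        # keep pixels in [0, 1]; restore padding pixels exactly
        x_adv = np.clip(x_adv, 0.0, 1.0) * mask + x * (1.0 - mask)
    return x_adv
```

With a mask covering only the first two image rows (i.e., a 56-byte payload), the returned example differs from the clean input only inside the payload region and stays within the ε-ball, mirroring the constraint set S_i of Eq. (11).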
In typical cases where f_post(x′_A) → 1, the post-training boundary moves far beyond x′_A in the direction of class B. As a result, many nearby class-B examples x_B ∈ X_B satisfying x_B ≈ x′_A may fall onto the wrong side of the decision boundary, resulting in increased misclassification. The detailed proof can be found in the Appendix.

Although Theorem 1 is formulated in a binary classification setting for analytical clarity, the underlying insights naturally extend to multi-class scenarios. In the multi-class case, a classifier defines multiple decision boundaries between classes. Hard-label adversarial training on an adversarial example x′ with true label y forces an increase in the logit margin:

z_y(x′) − z_k(x′), ∀k ≠ y, (17)

which effectively pushes the decision boundaries of all other classes away from x′. When x′ lies near the intersection of multiple class regions, this aggressive supervision disproportionately expands the region of class y at the expense of compressing neighboring class regions, analogous to the boundary distortion shown in the binary case.

Our dynamic label assignment mitigates this issue by relaxing the overconfident supervision for adversarial examples near decision boundaries. Rather than forcing x′ deep into the original decision region, the interpolated target y_mix guides a more appropriate adjustment of the decision boundaries. This calibrated supervision prevents the excessive boundary shift described in Theorem 1, enabling the model to maintain robustness in practical multi-class traffic classification tasks.

5. Dynamic label adversarial training

Motivated by the analysis in Section 4 of how label noise affects the robustness of adversarial training, we propose DLAT (Dynamic Label Adversarial Training), an adversarial training strategy that efficiently improves adversarial robustness using dynamically mixed soft labels.

5.1. Design inspiration

In traditional adversarial training, assigning hard labels to adversarial examples introduces significant label noise, since the true label of an adversarial example may differ from its clean counterpart. This label noise forces the decision boundary to move far away from these examples, as shown in Fig. 1, ultimately leading to degraded model robustness. To address this issue, the first step is to mitigate label noise. According to Theorem 1 and Section 4.1, using soft labels can effectively reduce such label noise, thereby preventing the decision boundary from over-shifting. In binary classification, this corresponds to adjusting the boundary toward the neighborhood of the adversarial examples, which can be achieved by assigning a soft label such as (0.5, 0.5) to guide adversarial training. In multi-class classification, however, it is difficult to determine the soft labels of examples near the decision boundary: the boundary may be the intersection of the decision regions of multiple classes, and uniform soft labels such as (1/|Y|, 1/|Y|, …, 1/|Y|) do not fit the shape of the decision boundary well. A natural solution would be to find examples near the current decision boundary that belong to the same class as the original class of the adversarial example, and use the model's outputs on them as soft labels. However, explicitly detecting the decision boundary via iterative adversarial attacks is computationally expensive. Instead, DLAT capitalizes on the fact that the decision boundary must lie within the space between clean and adversarial examples, using a lightweight interpolation mechanism to approximate the soft labels of boundary-adjacent examples.

Fig. 1. Decision boundary changes: Hard-Label AT vs. Soft-Label DLAT.

5.2. Method design

To accurately estimate the soft labels of examples near the decision boundary, we first need to determine the proximity of the adversarial example to the current decision boundary. When the adversarial example is farther from the decision boundary, the output logits of the clean example are given higher weight in the interpolation, so that the decision boundary is adjusted toward the vicinity of the adversarial example in a timely manner; conversely, when it is close to the boundary, the adversarial example is given higher weight, which prevents the adjusted decision boundary from shifting too far past the adversarial example.

Algorithm 1: Dynamic Label Adversarial Training
 1 Input: Network traffic dataset D; learning rate η; total training epochs T; model architecture f
 2 Initialize model f with parameters θ        // Model initialization
 3 for i ∈ [T] do
 4   foreach batch (X, Y) ∈ D do
 5     X′ ← PGD(f, X, Y)                       // Adversarial example generation
 6     O ← f(X)
 7     O′ ← f(X′)
 8     KL ← Div(O, O′)                         // KL-based distance computation
 9     α ← (tanh(KL) + 1) / 2
10     Y_mix ← (1 − α) · O′ + α · O            // Mixed label construction
11     L_adv ← Div(O′, Y_mix)
12     L_clean ← CE(O, Y)
13     L_total ← L_adv + L_clean
14     θ ← θ − η · ∇_θ L_total                 // Model update
15   end
16 end

Given a clean example x and its adversarial example x′ = x + δ, let f denote the classifier with outputs O = f(x) and O′ = f(x′). Since the mapping between clean examples and hard labels is established early in training, we can use the Kullback-Leibler (KL) divergence to quantify the distance between the adversarial example and the decision boundary:

Div(O, O′) = Σ_i softmax(O_i) log( softmax(O_i) / softmax(O′_i) ). (18)

A higher Div typically indicates larger distortion and label noise. To obtain a stable and responsive mixing factor α ∈ [0, 1], we normalize Div(O, O′) using the tanh function, which provides a smooth and symmetric mapping and naturally bounds the output. Accordingly, we define:

α = ( tanh(Div(O, O′)) + 1 ) / 2. (19)
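The label-construction step of Algorithm 1 (Eqs. (18)-(20)) can be sketched as below. To keep y_mix a valid probability distribution, this sketch mixes the softmax outputs rather than raw logits, which is our simplification; the function names are ours.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a logit vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def dlat_soft_label(logits_clean, logits_adv):
    """Dynamic soft-label construction (Eqs. (18)-(20)).
    Returns the mixing factor alpha and the mixed soft label y_mix."""
    p = softmax(logits_clean)   # O  = f(x)
    q = softmax(logits_adv)     # O' = f(x')
    kl = float(np.sum(p * np.log(p / q)))   # Eq. (18): Div(O, O')
    alpha = (np.tanh(kl) + 1.0) / 2.0       # Eq. (19): alpha in [0.5, 1)
    y_mix = (1.0 - alpha) * q + alpha * p   # Eq. (20): clean output weighted by alpha
    return alpha, y_mix
```

When the clean and adversarial outputs coincide, Div = 0 and α = 0.5, so y_mix averages the two (identical) distributions; as the adversarial output drifts away, α grows toward 1 and y_mix leans on the clean prediction, matching the weighting rule described in Section 5.2.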
This factor interpolates between O and O′ to form the mixed soft label:

y_mix = (1 − α) · O′ + α · O. (20)

The training objective of DLAT combines two components. The first is a KL divergence loss that aligns the model's prediction on x′ with y_mix to improve robustness:

L_adv = Div(O′, y_mix), (21)

and the second is a cross-entropy loss that allows the model to learn generalizable knowledge and improve clean example classification accuracy:

L_clean = − Σ_i y_i log softmax(O_i). (22)

The overall loss is formulated as:

min_θ max_{δ_i ∈ S_i} [ L_adv(f_θ(x + δ), y_mix) + L_clean(f_θ(x), y) ]. (23)

By dynamically adapting label softness based on Eqs. (18)-(20) and balancing the loss components of Eqs. (21)-(23), DLAT mitigates the excessive boundary shift caused by label noise, enabling models to learn robust decision boundaries for tasks like traffic classification. The pseudo-code for DLAT is presented in Algorithm 1.

6. Experiments

In this section, we perform a wide variety of comprehensive experiments to evaluate the performance of DLAT on both clean and adversarial traffic. These evaluations are carried out on two datasets and compared against four state-of-the-art adversarial training methods from the computer vision field.

6.1. Experiment setup

Datasets. Experiments are performed using the ISCX VPN-nonVPN dataset [29] and the CICIoT2022 dataset [30]. The former includes encrypted and unencrypted traffic, while the latter focuses on IoT-related scenarios with both benign and malicious behaviors. We con-

Table 1. The balanced ISCX-VPN dataset.
Type               | Total number (imbalanced) | Training set number | Test set number
VPN_Chat           | 7946                      | 1500                | 200
VPN_Email          | 596                       | 1500                | 59
VPN_File Transfer  | 1898                      | 1500                | 189
VPN_P2P            | 912                       | 1500                | 91
VPN_Streaming      | 1199                      | 1500                | 119
VPN_VoIP           | 20581                     | 1500                | 200

Table 2. The balanced CICIoT2022 dataset.
Type               | Total number (imbalanced) | Training set number | Test set number
VPN_Chat           | 7946                      | 1500                | 200
VPN_Email          | 596                       | 1500                | 59
VPN_File Transfer  | 1898                      | 1500                | 189
VPN_P2P            | 912                       | 1500                | 91
VPN_Streaming      | 1199                      | 1500                | 119
VPN_VoIP           | 20581                     | 1500                | 200

Table 3. The balanced ISCX-ALL dataset.
Type               | Total number (imbalanced) | Training set number | Test set number
Chat               | 7681                      | 5400                | 600
Email              | 6459                      | 5400                | 600
File Transfer      | 7405                      | 5400                | 600
P2P                | 1849                      | 1652                | 184
Streaming          | 3936                      | 3540                | 393
VoIP               | 19597                     | 5400                | 600
VPN_Chat           | 7946                      | 5400                | 600
VPN_Email          | 596                       | 538                 | 59
VPN_File Transfer  | 1898                      | 1754                | 189
VPN_P2P            | 912                       | 830                 | 91
VPN_Streaming      | 1199                      | 1108                | 119
VPN_VoIP           | 20581                     | 5400                | 600
struct three experimental settings from those datasets. The first, re-
ferred to as ISCX-VPN, includes six categories of encrypted VPN traffic:
Evaluation Metrics. In our experiments, we adopt two primary evalua-
VPN_Chat, VPN_Email, VPN_File Transfer, VPN_P2P, VPN_Streaming,
tion metrics to assess the effectiveness of DLAT: the Robust Classification
and VPN_VoIP. The second setting, named ISCX-ALL, expands the clas-
Accuracy (RCC) and the Clean Sample Accuracy (ACC). ASR measures
sification scope to twelve categories by incorporating six VPN and six
the proportion of adversarial traffic that successfully fools the model,
non-VPN traffic types. The third setting, derived from the CICIoT2022
indicating the robustness of the defense mechanism under adversarial
dataset, defines a six-class classification task encompassing typical
attacks. A lower RCC implies stronger robustness. In contrast, ACC
IoT device states and activities. The categories include: Power, Idle,
evaluates the classification accuracy on clean, unperturbed traffic, re-
Interactions, Scenarios, Active, and Attacks. Since the original datasets
flecting the models predictive performance under normal conditions.
exhibit significant class imbalance, we first split the data into training
A higher ACC indicates better generalization and utility in benign
and testing sets with a 9:1 ratio, and then apply class-wise balancing
settings. We report both metrics to provide a comprehensive assessment
separately within each subset to ensure a relatively balanced class
distribution. The statistics of the balanced datasets are summarized in of the trade-off between robustness and standard accuracy.
Table 1, 2 and 3. Baselines. We compare DLAT to the following representative ad-
Training. We adopt two representative neural network architectures as versarial training baselines, including PGD-AT [17], TRADES [26],
backbone models: PreActResNet [31], DenseNet [32], MobileNet [33], MART [27], and AWP [28]. All baseline methods are implemented
WideResNet [34], and FFNN (Feed-Forward Neural Network) [35]. following their original settings. For TRADES, the trade-off parameter
Both models are trained for 80 epochs using the momentum-based 𝜆 is set to 16, as suggested in the original paper. For AWP, the weight
stochastic gradient descent (MSGD) [36], with a momentum coefficient perturbation step size 𝛾 is set to 0.01. Unlike those training methods,
of 0.9 and a weight decay of 5 × 104 . The initial learning rate is set which still rely on hard labels and thus remain sensitive to mislabeled
to 0.1, and a multi-stage learning rate decay strategy is applied: the data, DLAT explicitly incorporates soft-label supervision, making it
learning rate is reduced by a factor of 10 at the 40th epoch. more robust under label noise.
Attack and defense settings. For adversarial evaluation, we adopt the
6.2. The effectiveness of DLAT
widely used PGD-20 under the 𝓁∞ norm constraint. The perturbation
radius 𝜖 is set to 24255, and the step size 𝛼 is 4255. For generating
adversarial examples used in adversarial training, we employ PGD-10 Clean accuracy assessment. As shown in Table 4, the normal model
under the same 𝓁∞ -bounded perturbation settings. trained without adversarial defenses achieves the highest ACC across
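The ℓ∞-bounded PGD used here (and as the inner maximization of Eq. (23)) can be sketched in a few lines. The sketch below is our illustration on a toy logistic model with an analytic input gradient, not the paper's networks; radius and step size match the stated 24/255 and 4/255.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(w, b, x, y):
    # binary cross-entropy of a logistic model f(x) = sigmoid(w.x + b)
    p = sigmoid(w @ x + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def bce_grad_x(w, b, x, y):
    # analytic gradient of the loss with respect to the input x
    return (sigmoid(w @ x + b) - y) * w

def pgd_linf(w, b, x, y, eps=24 / 255, step=4 / 255, iters=10):
    # l_inf-bounded PGD: ascend the loss, then project back into the eps-ball
    x_adv = x.copy()
    for _ in range(iters):
        x_adv = x_adv + step * np.sign(bce_grad_x(w, b, x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv
```

For a deep network, `bce_grad_x` would be replaced by a backward pass, but the sign-step-and-project loop is identical.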
6.2. The effectiveness of DLAT

Table 4
The clean sample accuracy (ACC) and robust classification accuracy (RCC) of different adversarial training methods across five network architectures (ResNet, DenseNet, MobileNet, WideResNet, FFNN) on the ISCX-VPN, ISCX-ALL and CICIoT2022 datasets (%). For each model, the two columns are ACC and RCC.

ISCX-VPN
Normal   99.02 ± 0.30  0.00 ± 0.00   99.92 ± 0.08  0.67 ± 0.09   99.17 ± 0.00  3.58 ± 0.14   99.75 ± 0.00  0.83 ± 0.07   98.25 ± 0.00  7.67 ± 0.58
PGD-AT   98.72 ± 0.18  96.32 ± 0.29  96.02 ± 0.23  91.00 ± 0.72  97.87 ± 0.25  90.00 ± 2.69  99.35 ± 0.08  96.01 ± 0.11  97.25 ± 0.24  87.00 ± 0.81
TRADES   96.75 ± 0.37  94.62 ± 0.30  92.98 ± 0.29  89.92 ± 0.15  93.18 ± 0.44  85.35 ± 3.38  97.92 ± 0.24  96.03 ± 0.18  92.02 ± 0.41  83.68 ± 0.87
MART     98.08 ± 0.43  94.20 ± 0.59  82.65 ± 0.72  78.90 ± 0.53  80.83 ± 1.76  70.85 ± 1.74  98.51 ± 0.19  92.72 ± 0.17  93.28 ± 0.20  84.58 ± 0.60
AWP      98.18 ± 0.17  96.22 ± 0.17  95.40 ± 0.33  92.92 ± 0.09  93.40 ± 0.42  90.10 ± 0.49  73.82 ± 0.46  72.18 ± 0.54  95.63 ± 0.24  88.32 ± 0.29
DLAT     98.83 ± 0.09  96.53 ± 0.08  98.77 ± 0.26  93.93 ± 0.42  98.20 ± 0.10  93.07 ± 0.47  99.08 ± 0.05  96.38 ± 0.36  96.88 ± 0.17  86.37 ± 0.30

ISCX-ALL
Normal   93.95 ± 4.36  2.04 ± 1.06   96.70 ± 2.11  0.23 ± 0.07   91.52 ± 4.99  3.74 ± 0.12   96.22 ± 1.48  7.23 ± 0.48   88.48 ± 0.27  1.61 ± 0.21
PGD-AT   88.56 ± 0.10  87.34 ± 0.20  82.96 ± 0.26  80.61 ± 0.30  82.19 ± 0.24  78.87 ± 0.73  88.63 ± 0.03  86.12 ± 2.89  83.00 ± 0.34  77.23 ± 0.29
TRADES   88.31 ± 0.13  86.19 ± 0.45  79.19 ± 1.12  73.98 ± 3.39  80.39 ± 0.80  75.26 ± 2.93  87.32 ± 1.41  84.90 ± 2.54  76.47 ± 1.90  71.01 ± 0.75
MART     88.19 ± 0.18  86.33 ± 0.51  77.22 ± 0.19  76.08 ± 0.22  80.78 ± 0.33  77.79 ± 0.31  87.67 ± 0.12  86.10 ± 0.45  75.99 ± 0.64  69.95 ± 1.79
AWP      86.31 ± 0.11  85.44 ± 0.10  78.00 ± 0.19  76.43 ± 0.48  78.83 ± 0.07  77.58 ± 0.16  85.85 ± 0.12  84.71 ± 0.05  81.30 ± 0.21  76.91 ± 0.21
DLAT     89.44 ± 0.32  86.68 ± 0.40  88.83 ± 0.80  82.18 ± 0.43  84.35 ± 0.36  75.84 ± 1.27  88.71 ± 0.02  87.14 ± 0.41  86.79 ± 0.26  74.32 ± 0.81

CICIoT2022
Normal   99.82 ± 0.32  0.04 ± 0.01   99.73 ± 0.01  0.63 ± 0.02   98.50 ± 2.59  0.00 ± 0.00   99.99 ± 0.00  0.56 ± 0.01   99.67 ± 0.06  0.12 ± 0.06
PGD-AT   99.27 ± 0.08  96.26 ± 3.18  98.20 ± 0.02  96.86 ± 0.44  98.20 ± 0.79  97.65 ± 0.47  99.46 ± 0.21  93.73 ± 0.46  83.32 ± 2.40  81.36 ± 2.58
TRADES   98.35 ± 0.82  98.90 ± 0.57  98.04 ± 0.00  97.81 ± 1.36  98.05 ± 0.31  91.38 ± 0.74  98.06 ± 0.02  97.62 ± 0.19  96.84 ± 0.11  89.20 ± 0.27
MART     98.19 ± 0.02  96.37 ± 2.27  98.05 ± 0.31  95.50 ± 0.50  98.06 ± 0.28  95.20 ± 0.40  99.00 ± 0.05  97.00 ± 0.10  98.20 ± 0.20  91.28 ± 1.50
AWP      98.25 ± 0.10  96.50 ± 0.20  98.10 ± 0.15  96.00 ± 0.25  98.15 ± 0.12  95.50 ± 0.30  99.10 ± 0.05  98.00 ± 0.10  98.00 ± 0.15  90.10 ± 0.50
DLAT     99.70 ± 0.02  99.20 ± 0.12  98.89 ± 0.17  97.12 ± 0.24  98.06 ± 0.28  97.88 ± 0.14  99.66 ± 0.02  98.99 ± 0.11  98.87 ± 0.09  91.93 ± 0.86

Fig. 2. The robust classification accuracy (RCC) of DLAT under ℓ1 and ℓ2 norm-bounded PGD-20 attacks on the ISCX-VPN, ISCX-ALL and CICIoT2022 datasets.

Clean accuracy assessment. As shown in Table 4, the normal model trained without adversarial defenses achieves the highest ACC across all architectures, ranging from 98.25% to 99.92% on ISCX-VPN, from 88.48% to 96.70% on ISCX-ALL, and from 98.50% to 99.99% on CICIoT2022. However, it fails completely under adversarial attacks, with robust classification accuracy (RCC) close to zero. In the table, boldface highlights the best performance for each metric, while underlining indicates the second-best. Compared to the normal model, adversarial training methods such as PGD-AT, TRADES, and MART significantly improve robustness, albeit at the cost of decreased clean accuracy. Specifically, PGD-AT maintains relatively higher ACC (e.g., 98.72% on ResNet for ISCX-VPN and 88.56% on ISCX-ALL), while TRADES and MART show larger reductions in ACC on clean examples. Our method, DLAT, consistently achieves competitive ACC, reaching up to 98.83% on ResNet and 89.44% on ISCX-ALL, surpassing all baselines on ISCX-ALL and maintaining top-tier accuracy on ISCX-VPN and CICIoT2022. These results demonstrate that DLAT effectively enhances robustness with minimal compromise to clean performance.

Robust accuracy assessment. We first evaluate the RCC of various adversarial training methods under adversarial attacks. As shown in Table 4, adversarial training markedly improves RCC compared with the normal model, which exhibits near-zero robustness. Among the compared methods, DLAT consistently surpasses most baselines in the majority of cases across both datasets and network architectures. Specifically, on ISCX-VPN, DLAT attains RCC scores above 86% across all architectures, notably outperforming PGD-AT, TRADES, MART, and AWP, with top results exceeding 96% on ResNet and WideResNet. Similarly, on ISCX-ALL and CICIoT2022, it maintains leading robustness, achieving up to 87.14% and 98.99% RCC on WideResNet and surpassing competing methods by a clear margin. These findings underscore the superior robustness of DLAT while retaining competitive clean accuracy.

Secondly, to further assess the robustness of DLAT against unseen adversarial threats, we evaluate its robustness under a diverse set of attack methods, including adversarial perturbations constrained by different norm bounds (i.e., ℓ1 and ℓ2 norms) as well as FGSM [15], PGD-100 [17], and AutoAttack [37]. We first report the performance of DLAT under ℓ1- and ℓ2-bounded PGD-20 attacks on the ISCX-VPN, ISCX-ALL, and CICIoT2022 datasets, as illustrated in Fig. 2. Each heatmap visualizes the RCC achieved by five different models under increasing perturbation radii. It can be observed that DLAT exhibits strong robustness under both ℓ1- and ℓ2-bounded PGD-20 attacks. Notably, the defense is more effective against ℓ1-norm perturbations, as indicated by the overall darker color tones in the corresponding heatmaps. This suggests that DLAT better preserves classification performance when facing sparse but high-magnitude perturbations. Among the evaluated models, ResNet and DenseNet generally exhibit higher RCC scores across both norm types and datasets, with RCC remaining above 0.8 under moderate ℓ1 perturbations (e.g., ε =
Fig. 3. The RCC of DLAT under FGSM, PGD-100, and AutoAttack on the ISCX-VPN, ISCX-ALL, and CICIoT2022 datasets.

Fig. 4. The robust classification accuracy (RCC) of various models across classes on ISCX-ALL under increasing adversarial perturbation radii.

1140/255). In contrast, MobileNet and DenseNet show relatively lower robustness, particularly under ℓ2-bounded attacks, where RCC values gradually decrease below 0.6 as the perturbation radius increases. Nonetheless, the performance degradation across all models is smooth rather than abrupt, suggesting that DLAT retains a degree of robustness and stability.

As shown in Fig. 3, we further assess the performance of DLAT under three previously unseen adversarial attacks: FGSM, PGD-100, and AutoAttack. Under FGSM, all evaluated models exhibit strong robustness, with RCC values typically exceeding 0.85 below ε = 24/255, and models such as ResNet and WideResNet experiencing only marginal performance degradation. As the perturbation strength increases under PGD-100, the RCC gradually decreases across all models. Nonetheless, most models achieve RCCs above 0.5 at ε = 32/255 on the ISCX-VPN dataset, indicating a moderate level of robustness. AutoAttack presents the most challenging scenario, leading to a more pronounced decline in performance, particularly when ε exceeds 24/255. Despite this, architectures such as ResNet and WideResNet continue to maintain RCC above 0.5 at ε = 32/255, suggesting that DLAT remains effective even under adaptive and high-strength adversarial attacks. These results collectively demonstrate the generalization capability of the framework across a broad range of attacks and perturbation intensities.

We thirdly evaluate the robustness of DLAT under varying attack intensities, where the attack intensity corresponds to the radius of the adversarial perturbation (denoted by Epsilon ε). As comprehensively illustrated in Fig. 4, we present the RCC performance for each individual class within the ISCX-ALL dataset (including Chat, Email, File Transfer, P2P, Streaming, VoIP, VPN_Chat, VPN_Email, VPN_File Transfer, VPN_P2P, VPN_Streaming, and VPN_VoIP) across multiple network architectures (ResNet, DenseNet, MobileNet, WideResNet, FFNN) under increasing perturbation radii (ε ranging from 0 to 56/255). The adversarial training of DLAT is performed using adversarial examples
generated with a perturbation radius of ε = 24/255. As shown in Fig. 4, across most classes and architectures, the trained models demonstrate strong robustness when the attack intensity remains within or below this radius (ε ≤ 24/255), and the models still maintain relatively strong resilience to perturbations in the range 24/255 < ε < 32/255. However, once ε exceeds 32/255, the attack becomes significantly stronger, leading to a noticeable drop in RCC, especially for non-VPN classes.

6.3. The efficiency of DLAT

To evaluate the training efficiency of DLAT, we compare its convergence with that of representative adversarial training baselines, including AT, TRADES, MART, and AWP. As illustrated in Fig. 5, DLAT demonstrates significantly faster convergence in both accuracy and loss. Specifically, in the accuracy curve (Fig. 5(a)), DLAT rapidly improves during the initial training epochs, reaching a stable accuracy above 0.85 within 30 epochs. In contrast, competing methods exhibit slower convergence and lower final performance, with TRADES and MART stabilizing below 0.80. Similarly, the loss curve (Fig. 5(b)) further highlights the advantage of DLAT in optimization stability. It consistently maintains a lower loss value throughout training and converges to a final loss below 0.3, which is noticeably lower than those of the other methods. These results collectively demonstrate that DLAT not only accelerates the convergence process but also facilitates optimization toward better minima, indicating its efficiency and practicality for robust model training.

Fig. 5. Comparison of accuracy and loss convergence results for DenseNet on the ISCX-ALL dataset: (a) accuracy curve; (b) loss curve.

In addition to its fast convergence, DLAT maintains comparable training time per epoch to other adversarial training methods, as reported in Table 5. Across different model architectures and datasets, the time cost of DLAT remains close to that of AT, TRADES, MART, and AWP. By achieving improved robustness and faster convergence without sacrificing efficiency, DLAT offers a practical solution for robust network traffic classification.

Table 5
Comparison of the time consumption for each epoch of the adversarial training methods (s).
Dataset     Model       AT      TRADES  MART    AWP     DLAT
ISCX-VPN    ResNet      16.99   17.98   19.38   19.19   19.07
            DenseNet    12.59   14.02   14.52   15.84   14.28
            MobileNet   26.14   28.55   28.14   30.83   27.98
            WideResNet  139.62  136.84  147.27  140.37  152.07
            FFNN        4.02    3.85    3.94    4.36    4.41
ISCX-ALL    ResNet      74.32   80.69   84.49   89.11   81.57
            DenseNet    57.64   60.83   63.62   66.78   62.95
            MobileNet   113.71  114.23  130.42  129.99  117.19
            WideResNet  673.35  621.27  688.85  688.37  762.18
            FFNN        16.43   15.03   17.86   17.62   16.31
CICIoT2022  ResNet      47.35   48.92   51.19   51.32   49.63
            DenseNet    61.02   63.11   66.68   68.92   64.90
            MobileNet   121.56  122.91  132.23  135.13  124.87
            WideResNet  680.37  690.82  703.16  710.55  695.09
            FFNN        18.06   19.42   18.98   19.56   20.43

7. Conclusion

In this paper, we investigated the vulnerability of deep traffic classifiers to adversarial examples and the label noise introduced by hard-label supervision in adversarial training. To address this issue, we proposed DLAT, a dynamic adversarial training framework that assigns soft labels to adversarial examples based on the similarity between clean and perturbed outputs. This similarity-guided interpolation helps mitigate label noise and align the decision boundary more effectively. Experimental results on traffic classification benchmarks demonstrate that DLAT consistently improves robustness and generalization over standard adversarial training.

CRediT authorship contribution statement

Haoyu Tong: Writing – original draft. Meixia Miao: Methodology, Formal analysis, Project administration. Yundong Liu: Data curation. Xiaoyu Zhang: Writing – original draft, Supervision. Xiangyang Luo: Resources, Funding acquisition. Willy Susilo: Visualization, Validation, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is funded by the Open Foundation of Key Laboratory of Cyberspace Security, Ministry of Education of China and Henan Key Laboratory of Cyberspace Situation Awareness (No. KLCS20240103), National Natural Science Foundation of China (No. 62472345), and Fundamental Research Funds for the Central Universities, China (No. QTZX25088).
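Theorem 1 (proved in the Appendix that follows) hinges on the linearized boundary distance |f(x) − 0.5| / ‖∇f(x)‖. This can be sanity-checked numerically: for a logistic model (our assumption for this sketch, not the paper's classifier) the decision boundary is exactly the hyperplane w·x + b = 0, so the first-order estimate can be compared against the true distance.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def linearized_boundary_dist(w, b, x):
    # Theorem 1's estimate: |f(x) - 0.5| / ||grad f(x)||_2,
    # where grad f(x) = f(x) (1 - f(x)) w for a logistic model
    f = sigmoid(w @ x + b)
    grad = f * (1.0 - f) * w
    return abs(f - 0.5) / np.linalg.norm(grad)

def exact_boundary_dist(w, b, x):
    # the logistic decision boundary is the hyperplane w.x + b = 0
    return abs(w @ x + b) / np.linalg.norm(w)
```

Near the boundary the two agree closely; as the model's confidence on the point grows, the estimate grows even faster than the exact distance, consistent with the theorem's message that forcing high confidence on adversarial examples pushes the boundary far past them.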
Data availability

Data will be made available on request.

Appendix. The proof of Theorem 1

Theorem 1 (Excessive Boundary Shift Induced by Hard-Label Adversarial Training). Consider a binary classifier f : X → [0, 1], with the pre-training decision boundary defined as:

    B_pre = {x ∈ X : f_pre(x) = 0.5}.

Suppose x_A ∈ X_A is a clean example from class A and x′_A = x_A + δ is an adversarial example generated to cross B_pre, i.e., f_pre(x′_A) < 0.5. Let f_post be the classifier obtained via hard-label adversarial training using (x′_A, y_A) as supervision, where y_A = 1. Then, under hard-label supervision, the training objective enforces high-confidence predictions for x′_A, i.e.,

    f_post(x′_A) ≫ 0.5,

which necessarily implies that the new decision boundary B_post = {x : f_post(x) = 0.5} must satisfy

    dist(x′_A, B_post) = ( f_post(x′_A) − 0.5 ) / ‖∇_x f_post(x′_A)‖_p.

Proof. Let x_A ∈ X_A be a clean example correctly classified as class A, and let x′_A = x_A + δ be its adversarial variant generated to cross the original decision boundary B_pre, i.e.,

    f_pre(x′_A) < 0.5.

Hard-label adversarial training uses the tuple (x′_A, y_A = 1) as supervised data, forcing the model f_post to assign high confidence to x′_A:

    f_post(x′_A) → 1.

Now, consider the new decision boundary:

    B_post = {x : f_post(x) = 0.5}.

We approximate f_post in a neighborhood of x′_A using a first-order Taylor expansion:

    f_post(x) ≈ f_post(x′_A) + ∇_x f_post(x′_A)⊤ (x − x′_A).

Let x* ∈ B_post denote the closest point on the new boundary to x′_A. By definition,

    f_post(x*) = 0.5.

Using the linear approximation, we have:

    0.5 ≈ f_post(x′_A) + ∇_x f_post(x′_A)⊤ (x* − x′_A).

Solving for the shift vector:

    ∇_x f_post(x′_A)⊤ (x* − x′_A) ≈ 0.5 − f_post(x′_A).

Let v = ∇_x f_post(x′_A) / ‖∇_x f_post(x′_A)‖_p be the normalized gradient (i.e., the local normal direction to the decision boundary). Then the minimal distance from x′_A to the boundary is:

    ‖x* − x′_A‖_p = | f_post(x′_A) − 0.5 | / ‖∇_x f_post(x′_A)‖_p.

As f_post(x′_A) → 1, this implies:

    dist(x′_A, B_post) → 0.5 / ‖∇_x f_post(x′_A)‖_p.

This lower bound quantifies how far the decision boundary must move beyond x′_A to satisfy f_post(x′_A) = 1. Provided that ‖∇_x f_post(x′_A)‖_p is not excessively large, this distance is significant. Finally, since x′_A was crafted to lie just beyond B_pre, i.e., in close proximity to the original boundary, the boundary movement beyond x′_A implies that the new decision boundary has crossed deep into the region previously occupied by class B. Therefore, class-B examples in the vicinity of x′_A are likely to be misclassified as class A under f_post. □

References

[1] A. Azab, M. Khasawneh, S. Alrabaee, K.-K.R. Choo, M. Sarsour, Network traffic classification: Techniques, datasets, and challenges, Digit. Commun. Netw. 10 (3) (2024) 676–692.
[2] H. Yuan, G. Li, A survey of traffic prediction: from spatio-temporal data to intelligent transportation, Data Sci. Eng. 6 (1) (2021) 63–85.
[3] A.W. Moore, K. Papagiannaki, Toward the accurate identification of network applications, in: International Workshop on Passive and Active Network Measurement, Springer, 2005, pp. 41–54.
[4] A. Madhukar, C. Williamson, A longitudinal study of P2P traffic classification, in: 14th IEEE International Symposium on Modeling, Analysis, and Simulation, IEEE, 2006, pp. 179–188.
[5] S. Fernandes, R. Antonello, T. Lacerda, A. Santos, D. Sadok, T. Westholm, Slimming down deep packet inspection systems, in: IEEE INFOCOM Workshops 2009, IEEE, 2009, pp. 1–6.
[6] N. Hubballi, M. Swarnkar, M. Conti, BitProb: Probabilistic bit signatures for accurate application identification, IEEE Trans. Netw. Serv. Manag. 17 (3) (2020) 1730–1741, http://dx.doi.org/10.1109/TNSM.2020.2999856.
[7] A. Azab, P. Watters, R. Layton, Characterising network traffic for skype forensics, in: 2012 Third Cybercrime and Trustworthy Computing Workshop, 2012, pp. 19–27, http://dx.doi.org/10.1109/CTC.2012.14.
[8] H. Mohajeri Moghaddam, Skypemorph: Protocol Obfuscation for Censorship Resistance, University of Waterloo, 2013.
[9] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (7553) (2015) 436–444.
[10] M. Lotfollahi, M.J. Siavoshani, R.S.H. Zade, M. Saberian, Deep packet: a novel approach for encrypted traffic classification using deep learning, Soft Comput. 24 (2017) 1999–2012, URL https://api.semanticscholar.org/CorpusID:35187639.
[11] L. Yang, A. Finamore, F. Jun, D. Rossi, Deep learning and traffic classification: Lessons learned from a commercial-grade dataset with hundreds of encrypted and zero-day applications, 2021, arXiv preprint arXiv:2104.03182.
[12] M.H. Pathmaperuma, Y. Rahulamathavan, S. Dogan, A.M. Kondoz, Deep learning for encrypted traffic classification and unknown data detection, Sensors 22 (19) (2022) 7643.
[13] X. Lin, G. Xiong, G. Gou, Z. Li, J. Shi, J. Yu, ET-BERT: A contextualized datagram representation with pre-training transformers for encrypted traffic classification, in: Proceedings of the ACM Web Conference 2022, 2022, pp. 633–642.
[14] X. Ma, W. Zhu, J. Wei, Y. Jin, D. Gu, R. Wang, EETC: An extended encrypted traffic classification algorithm based on variant resnet network, Comput. Secur. 128 (2023) 103175.
[15] I.J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples, in: International Conference on Learning Representations, ICLR, 2014.
[16] A.M. Sadeghzadeh, S. Shiravi, R. Jalili, Adversarial network traffic: Towards evaluating the robustness of deep-learning-based network traffic classification, IEEE Trans. Netw. Serv. Manag. 18 (2) (2021) 1962–1976.
[17] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning models resistant to adversarial attacks, in: International Conference on Learning Representations, ICLR, 2018.
[18] C. Dong, L. Liu, J. Shang, Label noise in adversarial training: A novel perspective to study robust overfitting, Adv. Neural Inf. Process. Syst. 35 (2022) 17556–17567.
[19] W. Wang, M. Zhu, J. Wang, X. Zeng, Z. Yang, End-to-end encrypted traffic classification with one-dimensional convolution neural networks, in: 2017 IEEE International Conference on Intelligence and Security Informatics, ISI, IEEE, 2017, pp. 43–48.
[20] J. Lan, X. Liu, B. Li, Y. Li, T. Geng, DarknetSec: A novel self-attentive deep learning method for darknet traffic classification and application identification, Comput. Secur. 116 (2022) 102663.
[21] K. Fauvel, F. Chen, D. Rossi, A lightweight, efficient and explainable-by-design convolutional neural network for internet traffic classification, in: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023, pp. 4013–4023.
[22] Z. Liu, Y. Xie, Y. Luo, Y. Wang, X. Ji, TransECA-Net: A transformer-based model for encrypted traffic classification, Appl. Sci. 15 (6) (2025) 2977.
[23] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus, Intriguing properties of neural networks, 2013, arXiv:1312.6199.
[24] A. Kurakin, I.J. Goodfellow, S. Bengio, Adversarial examples in the physical world, in: Artificial Intelligence Safety and Security, Chapman and Hall/CRC, 2018, pp. 99–112.
[25] N. Carlini, D. Wagner, Towards evaluating the robustness of neural networks, in: 2017 IEEE Symposium on Security and Privacy, S&P, IEEE, 2017, pp. 39–57.
[26] H. Zhang, Y. Yu, J. Jiao, E. Xing, L. El Ghaoui, M. Jordan, Theoretically principled trade-off between robustness and accuracy, in: International Conference on Machine Learning, PMLR, 2019, pp. 7472–7482.
[27] Y. Wang, D. Zou, J. Yi, J. Bailey, X. Ma, Q. Gu, Improving adversarial robustness requires revisiting misclassified examples, in: International Conference on Learning Representations, ICLR, 2019.
[28] D. Wu, S.-T. Xia, Y. Wang, Adversarial weight perturbation helps robust generalization, Adv. Neural Inf. Process. Syst. 33 (2020) 2958–2969.
[29] G.D. Gil, A.H. Lashkari, M. Mamun, A.A. Ghorbani, Characterization of encrypted and VPN traffic using time-related features, in: Proceedings of the 2nd International Conference on Information Systems Security and Privacy, ICISSP 2016, SciTePress, Setúbal, Portugal, 2016, pp. 407–414.
[30] S. Dadkhah, H. Mahdikhani, P.K. Danso, A. Zohourian, K.A. Truong, A.A. Ghorbani, Towards the development of a realistic multidimensional IoT profiling dataset, in: 2022 19th Annual International Conference on Privacy, Security & Trust, PST, IEEE, 2022, pp. 1–11.
[31] K. He, X. Zhang, S. Ren, J. Sun, Identity mappings in deep residual networks, in: Computer Vision – ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part IV, Springer, 2016, pp. 630–645.
[32] G. Huang, Z. Liu, L. Van Der Maaten, K.Q. Weinberger, Densely connected convolutional networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4700–4708.
[33] A.G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, H. Adam, MobileNets: Efficient convolutional neural networks for mobile vision applications, 2017, arXiv preprint arXiv:1704.04861.
[34] S. Zagoruyko, N. Komodakis, Wide residual networks, 2016, arXiv preprint arXiv:1605.07146.
[35] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning representations by back-propagating errors, Nature 323 (6088) (1986) 533–536.
[36] N. Qian, On the momentum term in gradient descent learning algorithms, Neural Netw. 12 (1) (1999) 145–151.
[37] F. Croce, M. Hein, Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks, in: ICML, 2020.


@@ -0,0 +1,946 @@
Computer Standards & Interfaces 97 (2026) 104121
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
Sharing as You Desire: A fuzzy certificateless proxy re-encryption scheme for
efficient and privacy-preserving cloud data sharing
Jiasheng Chen a, Zhenfu Cao a,∗, Liangliang Wang b,c, Jiachen Shen a, Xiaolei Dong a
a East China Normal University, Software Engineering Institute, Shanghai Collaborative Innovation Center of Trusted Industry Internet Software, Shanghai, 200062, China
b Shanghai University of Electric Power, Faculty of Artificial Intelligence, Shanghai, 201306, China
c Police Integration Computing Key Laboratory of Sichuan Province, Luzhou, 646000, China
ARTICLE INFO

Keywords:
Cloud security
Proxy re-encryption
Certificateless cryptography
Conditional privacy

ABSTRACT

A secure sharing mechanism in the cloud environment not only needs to realize efficient ciphertext storage for resource-constrained clients, but also needs to build a trusted data sharing system. Aiming at the limitations of existing schemes in terms of user identity privacy protection, insufficient access control granularity, and data sharing security, we propose a fuzzy certificateless proxy re-encryption (FCL-PRE) scheme. In order to achieve much better fine-grained delegation and effective conditional privacy, our scheme regards the conditions as an attribute set associated with pseudo-identities, and re-encryption can be performed if and only if the overlap distance of the sender's and receiver's attribute sets meets a specific threshold. Moreover, the FCL-PRE scheme ensures anonymity, preventing the exposure of users' real identities through ciphertexts containing identity information during transmission. In the random oracle model, FCL-PRE not only guarantees confidentiality, anonymity, and collusion resistance but also leverages the fuzziness of re-encryption to provide a certain level of error tolerance in the cloud-sharing architecture. Experimental results indicate that, compared to other existing schemes, FCL-PRE offers up to a 44.6% increase in decryption efficiency while maintaining the lowest overall computational overhead.
1. Introduction

As information technology and the Internet continue to evolve, users can now access networks anytime and anywhere through mobile devices, driving the widespread adoption of cloud services. By leveraging flexible resource scheduling and high network accessibility, cloud computing has attracted enterprises such as Amazon, Google, and Alibaba to introduce cloud-based data storage, access, and sharing services [1–3]. However, cloud service providers are not always completely trustworthy. Due to factors such as technical limitations or economic incentives, they may engage in practices that could compromise users' rights. In recent years, data breaches have occurred frequently: in 2018, Tesla's Kubernetes console on AWS was left unsecured, allowing attackers to exploit the cloud environment; in 2019, Capital One faced misconfigurations on AWS, enabling hackers to gain unauthorized access and disclose more than 100 million user records. Evidently, although outsourcing data to the cloud can reduce the burden of hardware maintenance, it also deprives users of direct control over their data, thereby increasing the risk of potential privacy breaches.

In response to the demand for secure cloud data sharing, the proxy re-encryption (PRE) [4] scheme was proposed. This technology not only allows data to be stored on the cloud server but also capitalizes on the cloud's computing capabilities to securely achieve decryption authorization, as illustrated in Fig. 1. In a typical PRE scheme, a key generation center (KGC) is responsible for generating the system's public parameters and issuing public–private key pairs for registered users based on the master secret key. Generally, the data sender encrypts information with their own ID (e.g., e-mail account or phone number) and produces the re-encryption key for authorized users, which is stored on the cloud server alongside the ciphertext. Only the authorized recipient can instruct the cloud server to perform ciphertext transformation using the re-encryption key, thereby achieving secure data sharing. However, despite simplifying certificate management, traditional identity-based proxy re-encryption (IB-PRE [5]) still suffers from several limitations: (1) it relies on the KGC for key escrow, meaning that if the KGC is compromised or acts maliciously, users' private keys are at serious risk of exposure; (2) it lacks flexible dynamic authorization, such that even

∗ Corresponding author.
E-mail addresses: jschen@stu.ecnu.edu.cn (J. Chen), zfcao@sei.ecnu.edu.cn (Z. Cao), llwang@shiep.edu.cn (L. Wang), jcshen@sei.ecnu.edu.cn (J. Shen), dongxiaolei@sei.ecnu.edu.cn (X. Dong).
https://doi.org/10.1016/j.csi.2025.104121
Received 30 June 2025; Received in revised form 23 November 2025; Accepted 21 December 2025
Available online 23 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
J. Chen et al. Computer Standards & Interfaces 97 (2026) 104121
Fig. 1. Data sharing based on proxy re-encryption.

minor changes in a user's identity information require the regeneration of private keys, thus increasing administrative overhead and system complexity; and (3) it struggles to satisfy the requirements of high-privacy scenarios. For instance, in mobile healthcare, patients' private information may be directly used as public keys for encryption [6-8]. Once an attacker traces such identifiers to a patient's real identity, a severe privacy breach can result, endangering the patient's information security.

To address the challenges of insufficient anonymity, key escrow, and difficulty in dynamic privilege adjustment, we propose an anonymous fuzzy certificateless proxy re-encryption scheme (FCL-PRE). Our scheme not only supports identity hiding and fuzzy matching, but also effectively prevents unauthorized access and significantly improves system error tolerance. The main contributions of FCL-PRE are as follows.

• Fuzzy certificateless PRE with conditional privacy. A new fuzzy certificateless proxy re-encryption scheme that is tolerant to noisy biometric measurements is proposed. Specifically, the trusted authority first derives a stable, unique biometric identity 𝑈𝐼𝐷 from noisy biometric samples, and then generates a pseudo-identity with a specific set of attributes 𝜔 = (𝜔1, …, 𝜔𝑛) for it. Re-encryption is allowed only when the overlap between the sender's and receiver's attribute sets satisfies a threshold condition, that is, |𝜔 ∩ 𝜔′| ≥ 𝑑. This policy enforces conditional privacy on top of pseudo-identities, simplifies key management in the certificateless setting, and enables flexible and efficient data sharing among users with similar attributes.
• Anonymous data sharing via pseudonyms. The proposed scheme enhances conditional privacy and reduces the cost of managing pseudonyms by tightly binding biometrics, pseudo-identities, and strong keys. The trusted authority internally maintains a mapping (𝑈𝐼𝐷, 𝑃𝑈𝐼𝐷, 𝜔), where 𝜔 is associated with 𝑃𝑈𝐼𝐷. Thus, the privacy-preserving pseudo-identity can only be recovered by the fully trusted authority. Meanwhile, a user can encrypt and share data on behalf of an attribute group using a single 𝑃𝑈𝐼𝐷, rather than maintaining many separate pseudonyms, thus significantly reducing the key management overhead on the user side.
• Security and practicality. We provide a detailed security proof of FCL-PRE in the random oracle model, demonstrating that it satisfies chosen-plaintext attack (IND-CPA) security. Theoretical analysis and experimental results show that FCL-PRE not only achieves anonymity, error tolerance, and resistance to collusion attacks, but also has minimal computational overhead in the decryption phase.

2. Related work

(1) Basic PRE schemes: In 1998, Blaze et al. [4] first introduced the notion of proxy re-encryption (PRE), which enables a semi-honest proxy to transform ciphertexts without accessing the underlying decryption keys. Subsequent early works primarily examined how to delegate decryption capabilities securely and efficiently so as to support data sharing and access control in cloud environments [9-11]. As research progressed, the limitations of the original PRE model gradually became evident. For example, a malicious user may collude with the proxy to recover the sender's private key. Ateniese et al. [12] later presented a unidirectional PRE scheme that offers a certain level of resistance against collusion attacks, although it still depends on a public key infrastructure (PKI) for certificate management. Gentry [13] addressed the burden imposed by PKI by introducing the paradigm of certificate-based cryptography, thereby eliminating the need for online third-party certificate queries. Sur et al. [14] further applied this paradigm by designing a certificate-based encryption scheme. They were the first to combine it with proxy re-encryption, and thus proposed a certificate-based proxy re-encryption (CB-PRE) scheme that achieves chosen-ciphertext (IND-CCA) security in the random oracle model. On the other hand, to further simplify the public key infrastructure, Green and Ateniese [5] extended PRE to identity-based scenarios, significantly reducing certificate management overhead by replacing traditional public keys with user identifiers and achieving adaptive CCA security. In this context, Ge et al. [15] designed an identity-based broadcast PRE (BPRE) scheme that supports revocation of a shared user set and can resist chosen-plaintext attacks, while Zhang et al. [16] employed bilinear pairings to construct an identity-based BPRE scheme for VANETs that achieves CPA security with constant decryption overhead.

(2) Conditional PRE schemes: Once the basic transformation capability of PRE had been established, researchers began to enrich PRE with more expressive access control and privacy guarantees. In traditional PRE systems, once the proxy obtains a re-encryption key, it can often convert all ciphertexts of the delegator for the designated delegatee, which is incompatible with fine-grained authorization requirements. To address this issue, Weng et al. [19] first proposed conditional proxy re-encryption (CPRE). In their construction, a condition expression is embedded into the re-encryption key, so that the proxy is only able to transform ciphertexts that satisfy the specified condition, which enforces strict control over the proxy's capability at the semantic level. At the same time, Ateniese et al. [22] presented a PRE scheme with key privacy. Even if an adversary obtains a re-encryption key, it cannot distinguish the delegatee's identity, which further protects the receiver's privacy. Shao et al. [18] achieved key privacy while preserving CCA security. Li et al. [17] incorporated the idea of conditional PRE into certificate-based cryptography. Their scheme allows only ciphertexts associated with specific subsets to be transformed and forwarded to designated delegatees, and also attains CCA security. In order to support more expressive access structures, Yao et al. [21] designed a CPRE scheme with ciphertext evolution, which ensures that the delegation process remains under the data owner's control. Li et al. [20] proposed a CPRE scheme that supports only a single receiver. Lin et al. [30] developed a CPRE scheme tailored for IoT scenarios, which supports revocation of misbehaving users without relying on a fully trusted third party. Zhang et al. [31] designed a key-sharing mechanism based on CPRE and combined it with a bilinear accumulator to verify the integrity of homomorphic encryption keys stored in the cloud. Chen et al. [25] constructed a conditional BPRE scheme based on bilinear pairings under conditional constraints.

(3) Certificateless-based PRE schemes: Due to the inherent key escrow problem in identity-based cryptography, Sur et al. [32] introduced PRE into the certificateless public key setting [33], and then proposed the concept of certificateless proxy re-encryption (CL-PRE). In CL-PRE, each user's private key is split into a partial private key generated by a key generation center (KGC) and a user-chosen secret value. This design avoids full key escrow by the KGC and does not require traditional certificate management, which makes CL-PRE particularly suitable for resource-constrained environments. Within this framework, Bhatia et al. [34] constructed a lightweight pairing-free CL-PRE scheme and applied it to mobile healthcare scenarios. Eltayieb et al. [35] further adopted blockchain as the proxy to execute the re-encryption
Table 1
Summary of functional comparison with other schemes.
Schemes Techniques Conditional privacy Fuzzy matching Anonymity Multiple receivers Collusion resistance
[13,14,17] CB-PRE × × × ×
[18] CPRE ✓ × ✓ ✓ ×
[15,16] IB-PRE × × × ✓ ✓
[19,20] CPRE ✓ × × × ×
[21] IB-CPRE ✓ × × ✓ ✓
[22] CPRE ✓ ×× ×
[23,24] CL-PRE × × × ✓ ✓
[25] IB-CPRE ✓ × ✓ ✓ ✓
[26,27] Fuzzy IB-CPRE ✓ ✓ ××
[28,29] CL-CPRE ✓ × × ✓ ✓
Ours Fuzzy CL-CPRE ✓ ✓ ✓ ✓ ✓
algorithm, which not only preserves data confidentiality but also provides a flexible revocation mechanism. Subsequent CL-PRE works [23,24,36] mainly focused on improving efficiency, supporting revocation, and enhancing traceability. Similarly, to prevent cloud platforms from abusing re-encryption permissions, Li et al. [28] proposed a novel pairing-free scheme based on certificateless conditional BPRE. Zhou et al. [29] combined certificateless public key cryptography and PRE, realizing multi-level data access control, dynamic key update, and ciphertext evolution.

(4) Fuzzy PRE schemes: In another line of research, advances in biometric technologies have introduced new design dimensions for PRE. Fuzzy identity-based encryption (FIBE) [37] leverages biometric characteristics such as fingerprints and irises, which are inherently unique and tamper-resistant, to derive descriptive attribute sets that serve as a natural attribute space for encryption and authorization. Following this idea, Fang et al. [26] proposed an FCPRE scheme in which descriptive keywords are used as conditions to realize fuzzy conditional PRE. In their scheme, the proxy can re-encrypt ciphertexts according to a 𝑡-out-of-𝑑 threshold strategy. Xiong et al. [38] later proposed an improved pairing-based fuzzy identity-based signature (FIBS) scheme that supports the error-tolerance property. Li et al. [27] presented the first lattice-based FIB-CPRE scheme. Their scheme provides finer-grained control over delegated decryption, but incurs high computational cost, which negatively affects overall encryption and decryption efficiency. It should be noted that the use of biometric traits can significantly improve usability, but the noise inevitably introduced during biometric acquisition and feature extraction makes key generation and matching more challenging. To cope with this issue, Wang et al. [39] proposed a novel fuzzy certificateless signature authentication scheme that achieves conditional privacy while effectively protecting the confidentiality of users' real biometric characteristics.

As summarized in Table 1, existing PRE schemes and their variants have achieved substantial progress in terms of functionality and applicability to diverse scenarios. However, several important limitations remain.

• The scalability on the receiver side is restricted. Many schemes, such as [14,17,20], do not efficiently support data sharing among multiple receivers, which limits their practicality in large-scale collaborative applications.
• The strong binding between real identities and biometric characteristics introduces significant privacy risks. Some biometric-based schemes, such as [23,24,26,28,29], do not adequately protect the identity privacy of senders and receivers, and therefore cannot satisfy stringent privacy requirements.

3. Preliminaries

This section briefly overviews the basic concepts and techniques used in our scheme. Table 2 provides a list of symbols and their descriptions.

3.1. Bilinear map

Suppose there exists a mapping 𝑒: G × G → G𝑇, where G and G𝑇 represent two cyclic groups with the same prime order 𝑞, and 𝑃 is a generator of G. A bilinear map 𝑒 should have the following properties [40]:

• Bilinearity: 𝑒(𝑎𝑃, 𝑏𝑃) = 𝑒(𝑃, 𝑃)^{𝑎𝑏} holds for all 𝑎, 𝑏 ∈ 𝑍𝑞*.
• Non-degeneracy: There exists 𝑃 such that 𝑒(𝑃, 𝑃) ≠ 1.
• Computability: 𝑒(𝑃1, 𝑃2) can be computed efficiently for all 𝑃1, 𝑃2 ∈ G.

3.2. Useful definitions

Definition 1 (Shamir Secret Sharing [41]). Shamir's secret sharing scheme, introduced in 1979, is based on polynomial interpolation. A secret 𝑠 is divided into 𝑛 shares 𝑠1, …, 𝑠𝑛 with a threshold 𝑡, such that any set of at least 𝑡 participants can recover 𝑠, whereas any subset of size less than 𝑡 gains no information about it. The scheme consists of the following phases:

• Secret distribution: Let 𝒫 = {𝒫1, …, 𝒫𝑛} denote the set of participants and randomly select the secret value 𝑠 ∈ 𝑍𝑞. Then, a polynomial 𝐹(𝑥) of degree 𝑡 − 1 satisfying 𝐹(0) = 𝑠 is selected:

𝐹(𝑥) = 𝑠 + Σ_{𝑗=1}^{𝑡−1} 𝑎𝑗 𝑥^𝑗 mod 𝑞.

The share set is 𝑆𝑆 = {(𝜔𝑖, 𝑠𝑖) | 1 ≤ 𝑖 ≤ 𝑛}, where 𝑠𝑖 = 𝐹(𝜔𝑖). The 𝑖th share (𝜔𝑖, 𝑠𝑖) is privately delivered to the corresponding participant 𝒫𝑖.
• Secret reconstruction: Let 𝑆 ⊆ {1, …, 𝑛} be a group with |𝑆| = 𝑡. The secret value is reconstructed from the shares {(𝜔𝑖, 𝑠𝑖)}_{𝑖∈𝑆} using Lagrange interpolation:

𝐹(𝑥) = Σ_{𝑖∈𝑆} 𝛥𝜔𝑖,𝑆(𝑥)𝐹(𝜔𝑖) = Σ_{𝑖∈𝑆} 𝛥𝜔𝑖,𝑆(𝑥)𝑠𝑖,

where 𝛥𝜔𝑖,𝑆(𝑥) = Π_{𝑘∈𝑆, 𝑘≠𝑖} (𝑥 − 𝜔𝑘)/(𝜔𝑖 − 𝜔𝑘) is the Lagrange coefficient, and the secret is recovered as 𝑠 = 𝐹(0).

Definition 2 (Decisional Bilinear Diffie-Hellman (DBDH) Assumption). Given a random instance (𝑃, 𝑎𝑃, 𝑏𝑃, 𝑐𝑃, 𝑇), where 𝑃 ∈ G, 𝑎, 𝑏, 𝑐 are randomly selected from 𝑍𝑞, and 𝑇 is an element of G𝑇, the DBDH problem requires determining whether 𝑇 equals 𝑒(𝑃, 𝑃)^{𝑎𝑏𝑐} or a random element of G𝑇. For any PPT algorithm 𝒜, the advantage of successfully distinguishing the two cases is defined as

Adv^{DBDH}_𝒜(𝜆) = |Pr[𝒜(𝑃, 𝑎𝑃, 𝑏𝑃, 𝑐𝑃, 𝑒(𝑃, 𝑃)^{𝑎𝑏𝑐}) = 1] − Pr[𝒜(𝑃, 𝑎𝑃, 𝑏𝑃, 𝑐𝑃, 𝑇) = 1]|.

If the advantage Adv^{DBDH}_𝒜(𝜆) is negligible for every PPT algorithm 𝒜, the DBDH assumption holds.
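The two phases of Definition 1 can be sketched in Python. This is a minimal illustration only: the prime modulus and share indices below are toy values, not parameters from the paper.

```python
import random

q = 2**31 - 1     # toy prime modulus (illustrative only)

def share(s, t, n, rng=random.Random(0)):
    """Split secret s into n shares with threshold t: F(0) = s, share_i = F(i)."""
    coeffs = [s] + [rng.randrange(q) for _ in range(t - 1)]
    F = lambda x: sum(c * pow(x, j, q) for j, c in enumerate(coeffs)) % q
    return [(i, F(i)) for i in range(1, n + 1)]

def reconstruct(shares):
    """Recover F(0) from t (or more) shares via Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, si) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if j != i:
                num = num * (-xj) % q           # numerator of Delta_{xi,S}(0)
                den = den * (xi - xj) % q       # denominator of Delta_{xi,S}(0)
        secret = (secret + si * num * pow(den, -1, q)) % q
    return secret

shares = share(s=987654321, t=3, n=5)
assert reconstruct(shares[:3]) == 987654321     # any t = 3 shares suffice
assert reconstruct(shares[2:5]) == 987654321
```

Any subset of exactly 𝑡 shares yields the same secret, mirroring the reconstruction formula 𝑠 = Σ 𝛥𝜔𝑖,𝑆(0)𝑠𝑖.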
Table 2
Summary of notations.
Symbol Description
𝜆 Security parameter
𝑚𝑠𝑘 Master secret key
𝑏𝑖𝑜 Biometric characteristic
𝐼𝑑𝐺𝑒𝑛(⋅) An identity extraction function
𝑈 𝐼𝐷 Realistic identity
𝑃 𝑈 𝐼𝐷 Pseudo-identity
𝑑 Error tolerance
𝜔 An attribute set
𝑥𝑃 𝑈 𝐼𝐷 Secret value
𝑆𝐾𝑃 𝑈 𝐼𝐷 Users full private key
𝑃 𝐾𝑃 𝑈 𝐼𝐷 Public key
𝑅𝐾,𝜔, Re-encryption key
𝐶𝑇 Original ciphertext
𝐶𝑇′ Re-encrypted ciphertext
Fig. 2. The operation flow of FCL-PRE.
Definition 3 (Syntax of FCL-PRE). The nine polynomial-time algorithms shown below constitute our FCL-PRE scheme.

• Setup. On input a security parameter 𝜆, TA and KGC generate the system parameters 𝑝𝑎𝑟𝑎𝑚𝑠 and a master secret key 𝑚𝑠𝑘 that is kept secret from users.
• PartialPrivateKey. After TA publishes the pseudo-identity 𝑃𝑈𝐼𝐷 for each registered user, KGC generates the corresponding partial private key 𝐷𝑃𝑈𝐼𝐷 and sends it to the user.
• SetSecretValue. The sender 𝒮 executes the algorithm and chooses a secret value 𝑥𝑃𝑈𝐼𝐷 randomly.
• SetPrivateKey. On input 𝑃𝑈𝐼𝐷, 𝑝𝑎𝑟𝑎𝑚𝑠, 𝑥𝑃𝑈𝐼𝐷 and 𝐷𝑃𝑈𝐼𝐷, 𝒮 generates the complete private key 𝑆𝐾𝑃𝑈𝐼𝐷.
• SetPublicKey. 𝒮 performs this algorithm on input 𝑥𝑃𝑈𝐼𝐷 and outputs the full public key 𝑃𝐾𝑃𝑈𝐼𝐷.
• Encryption. On input 𝑃𝑈𝐼𝐷, 𝑝𝑎𝑟𝑎𝑚𝑠, a message 𝑚, and 𝑃𝐾𝑃𝑈𝐼𝐷, 𝒮 computes the original ciphertext 𝐶𝑇.
• ReKey Generation. Given the private key 𝑆𝐾𝑃𝑈𝐼𝐷, ℛ's pseudo-identity 𝑃𝑈𝐼𝐷 and the corresponding 𝑃𝐾𝑃𝑈𝐼𝐷, 𝒮 generates a conditional re-encryption key 𝑅𝐾𝒮,𝜔,ℛ by running this algorithm.
• Re-encryption. Upon receiving 𝑅𝐾𝒮,𝜔,ℛ and the original ciphertext 𝐶𝑇, the cloud verifies whether |𝜔 ∩ 𝜔′| ≥ 𝑑 holds. Only if this condition is satisfied can the original ciphertext 𝐶𝑇 be re-encrypted into the second-layer ciphertext 𝐶𝑇′.
• Decryption. The user invokes it to decrypt the corresponding ciphertext, resulting in either the plaintext 𝑚 or ⊥.

4. Scheme model

In this section, we introduce the system model, outline the security guarantee model, and specify the security requirements, respectively.

4.1. System model

The operation flow of the fuzzy certificateless proxy re-encryption scheme is shown in Fig. 2. It includes five different parties, namely: Trusted Authority, Key Generation Center, Cloud Proxy Server, Sender, and Receiver.

• Trusted Authority (TA): TA is a fully trusted authority whose primary role is to generate privacy-preserving pseudo-identities 𝑃𝑈𝐼𝐷 for users and to cooperate with KGC in setting up and publishing the public parameters. At the same time, it maintains an internal mapping (𝑈𝐼𝐷, 𝑃𝑈𝐼𝐷, 𝜔), where 𝜔 denotes the attribute set associated with each 𝑃𝑈𝐼𝐷. Only the pseudo-identity and its associated attribute information are exposed to other entities, while the real identity 𝑈𝐼𝐷 remains exclusively known to TA.
• Key Generation Center (KGC): As an honest-but-curious entity, KGC is responsible for performing system initialization and generating a partial private key related to the user's identity; it is assumed that KGC and TA will not collude.
• Cloud Proxy Server (CPS): CPS is responsible for storing original ciphertexts and executing conditional re-encryption operations. When the receiver ℛ sends an access request, CPS first verifies whether the condition |𝜔 ∩ 𝜔′| ≥ 𝑑 holds. If so, the sender 𝒮 generates a corresponding re-encryption key for CPS to perform re-encryption; otherwise, CPS refuses to perform the re-encryption operation. Please note that, as a semi-trusted entity, CPS may still attempt to infer user privacy from the shared data.
• Sender (𝒮): 𝒮 can use the public key associated with 𝑃𝑈𝐼𝐷 to encrypt the data to be shared, generate the original ciphertext 𝐶𝑇 and upload it to CPS storage. In addition, 𝒮 produces the corresponding re-encryption key 𝑅𝐾𝒮,𝜔,ℛ according to the result of the verification equation, and sends it to CPS.
• Receiver (ℛ): The authorized receiver ℛ can decrypt and obtain the plaintext by downloading the re-encrypted ciphertext.

4.2. Security guarantee model

There are two types of adversaries in the certificateless cryptosystem [42]: 𝒜1 is the first type of adversary, which can replace a user's public key, and 𝒜2 is the second type, which can obtain the master secret key. Game-I and Game-II are the IND-CPA security games for FCL-PRE. Please note that each pseudo-identity 𝑃𝑈𝐼𝐷 is associated with an attribute set 𝜔.

Game-I. This game embodies the attack ability of 𝒜1; the challenger 𝒞 responds to 𝒜1's series of queries by controlling the following oracles.

• Initialization. When 𝜆 is received, 𝒞 first executes the Setup algorithm to obtain 𝑝𝑎𝑟𝑎𝑚𝑠 and generates the system master key 𝑚𝑠𝑘. Then, 𝒞 outputs 𝑝𝑎𝑟𝑎𝑚𝑠 and keeps 𝑚𝑠𝑘 secret.
• Phase 1. The adversary 𝒜1 initiates a series of queries, and 𝒞 responds accordingly.
– PPKQuery oracle 𝒪𝑝𝑝𝑘: 𝒞 executes the PartialPrivateKey algorithm to generate the partial private key 𝐷𝑃𝑈𝐼𝐷 for the queried 𝑃𝑈𝐼𝐷 and returns it to 𝒜1.
– SKQuery oracle 𝒪𝑠𝑘: 𝒞 first runs the PartialPrivateKey and SetSecretValue algorithms to obtain the corresponding 𝐷𝑃𝑈𝐼𝐷 and 𝑥𝑃𝑈𝐼𝐷. Next, 𝒞 runs the SetPrivateKey algorithm to generate the complete private key 𝑆𝐾𝑃𝑈𝐼𝐷 and returns it to 𝒜1.
– PKQuery oracle 𝒪𝑝𝑘: 𝒞 runs the SetSecretValue algorithm to obtain 𝑥𝑃𝑈𝐼𝐷, and extracts the user's public key 𝑃𝐾𝑃𝑈𝐼𝐷 by running the SetPublicKey algorithm. Finally, 𝒞 returns it to 𝒜1.
– PK replacement oracle 𝒪𝑝𝑘𝑟𝑝: 𝒜1 queries a two-tuple (𝑃𝑈𝐼𝐷, 𝑃̃𝐾𝑃𝑈𝐼𝐷), where 𝑃̃𝐾𝑃𝑈𝐼𝐷 is a newly selected public key to replace the public key 𝑃𝐾𝑃𝑈𝐼𝐷 currently associated with 𝑃𝑈𝐼𝐷. Thereby, 𝒜1 performs the public key replacement 𝑃𝐾𝑃𝑈𝐼𝐷 = 𝑃̃𝐾𝑃𝑈𝐼𝐷.
– ReKeyGen oracle 𝒪𝑟𝑘: 𝒞 runs the ReKey Generation algorithm and returns a re-encryption key 𝑅𝐾𝒮,𝜔,ℛ to 𝒜1. If the public key of 𝑃𝑈𝐼𝐷 has been replaced, 𝒜1 cannot perform this query.
– Re-encryption oracle 𝒪𝑟𝑒𝑒𝑛: 𝒞 performs the Re-encryption algorithm and returns a re-encrypted ciphertext 𝐶𝑇′ to 𝒜1. If the public key of 𝑃𝑈𝐼𝐷 has been replaced, 𝒜1 cannot perform this query.

• Challenge. After completing all the interactions with 𝒞, 𝒜1 outputs a challenge identity 𝑃𝑈𝐼𝐷𝜋 and two messages (𝑚0, 𝑚1) of equal length. 𝒞 randomly selects a message 𝑚𝑏, 𝑏 ∈ {0, 1}, calculates the corresponding ciphertext and returns it to 𝒜1.
• Phase 2. 𝒜1 and the challenger 𝒞 continue to conduct queries and answers similar to Phase 1, but must follow three constraints.
(1) 𝒜1 has never queried the partial private key or private key for a challenge identity 𝑃𝑈𝐼𝐷𝜋 that meets the |𝜔 ∩ 𝜔𝜋| ≥ 𝑑 condition.
(2) If 𝒜1 sends re-encryption key queries for a challenge identity 𝑃𝑈𝐼𝐷𝜋 that meets the |𝜔 ∩ 𝜔𝜋| ≥ 𝑑 condition, then the partial private key queries or private key queries can no longer be performed.
(3) If 𝒜1 has sent the partial private key or private key queries for a challenge identity 𝑃𝑈𝐼𝐷𝜋 that meets the |𝜔 ∩ 𝜔𝜋| ≥ 𝑑 condition, the re-encryption key queries can no longer be performed, and the information related to the re-encrypted ciphertext cannot be queried.
• Guess. Finally, 𝒜1 guesses the challenge bit 𝑏′ ∈ {0, 1}. If 𝑏′ = 𝑏, 𝒜1 wins this game.

Definition 4. According to the definition of Game-I, our FCL-PRE is IND-CPA secure if the advantage of 𝒜1 is negligible, defined as

Adv^{Game-I}_{𝒜1}(𝜆) = |Pr[𝑏′ = 𝑏] − 1/2|.

Game-II. This game embodies the attack ability of 𝒜2; the challenger 𝒞 responds to 𝒜2's series of queries by controlling the following oracles. Game-II is similar to Game-I; therefore, only their main differences are presented below.

• Initialization. When 𝜆 is received, 𝒞 first executes the Setup algorithm to obtain 𝑝𝑎𝑟𝑎𝑚𝑠 and generates a system master key 𝑚𝑠𝑘. Then, 𝒞 returns them to 𝒜2.
• Phase 1. 𝒜2 issues a series of queries similar to those in Game-I, and 𝒞 responds accordingly. At this time, 𝒜2 lacks the ability to replace the public key.
• Challenge. Similar to Game-I.
• Phase 2. 𝒜2 and the challenger 𝒞 continue to conduct similar queries and answers as in Phase 1, but must follow three constraints.
(1) 𝒜2 has never queried the private key for a challenge identity 𝑃𝑈𝐼𝐷𝜋 that meets the |𝜔 ∩ 𝜔𝜋| ≥ 𝑑 condition.
(2) If 𝒜2 sends re-encryption key queries for a challenge identity 𝑃𝑈𝐼𝐷𝜋 that meets the |𝜔 ∩ 𝜔𝜋| ≥ 𝑑 condition, then the private key queries can no longer be performed.
(3) If 𝒜2 has sent the private key queries for a challenge identity 𝑃𝑈𝐼𝐷𝜋 that meets the |𝜔 ∩ 𝜔𝜋| ≥ 𝑑 condition, the re-encryption key queries can no longer be performed, and the information related to the re-encrypted ciphertext cannot be queried.
• Guess. Finally, 𝒜2 guesses the challenge bit 𝑏′ ∈ {0, 1}. If 𝑏′ = 𝑏, 𝒜2 wins this game.

Definition 5. According to the definition of Game-II, our FCL-PRE is IND-CPA secure if the advantage of 𝒜2 is negligible, defined as

Adv^{Game-II}_{𝒜2}(𝜆) = |Pr[𝑏′ = 𝑏] − 1/2|.

4.3. Security requirements

The proposed FCL-PRE scheme should satisfy the following security objectives.

• Confidentiality. FCL-PRE must protect sensitive information before it is uploaded to the CPS and prevent any access by unauthorized recipients. Additionally, when generating the original ciphertext and the re-encryption key, conditional information is incorporated to ensure that re-encryption can only be performed if the original ciphertext meets specific conditions.
• Anonymity. To protect user privacy, FCL-PRE must conceal the user's real biometric identity. Unless it is the trusted third party, no adversary can establish a valid biometric identification association, thereby preventing the leakage of the user's identity information.
• Error tolerance. Considering that a biometric characteristic may contain some noise with each sampling, FCL-PRE must exhibit error tolerance. Specifically, when the overlap between the attribute set 𝜔 of the sender 𝒮 and another attribute set 𝜔′ reaches the predefined threshold 𝑑, the proxy can use the re-encryption key to generate the corresponding re-encrypted ciphertext for 𝜔′, enabling efficient data sharing.
• Collusion resistance. In our FCL-PRE, even in the presence of semi-trusted parties, such as collusion between CPS and the receiver, CPS cannot obtain the sender's complete private key and thus cannot perform any decryption operations, ensuring the system's security against internal collusion attacks.

5. The proposed FCL-PRE scheme

In this section, we thoroughly describe FCL-PRE, which supports efficient fuzzy data sharing through anonymized biometric identities. The procedure flow of FCL-PRE is presented in Fig. 3.

5.1. System initialization

(1) Upon inputting the security parameter 𝜆, KGC generates the bilinear pairing parameters (𝑒, G, G𝑇, 𝑞, 𝑃), where G and G𝑇 represent two cyclic groups with the same prime order 𝑞, 𝑒: G × G → G𝑇, and 𝑃 is a generator of G. Then, KGC selects 𝑠 ∈ 𝑍𝑞* randomly and calculates the system public key 𝑃𝑝𝑢𝑏 = 𝑠𝑃.
(2) TA adopts a symmetric key encryption scheme to hide the user's realistic identity 𝑈𝐼𝐷, denoted by 𝐸𝑛𝑐𝜙(⋅) and 𝐷𝑒𝑐𝜙(⋅). Here, 𝐸𝑛𝑐𝜙(⋅) represents the encryption algorithm, 𝐷𝑒𝑐𝜙(⋅) represents the decryption algorithm, and 𝜙 is the shared symmetric key.
(3) Finally, TA and KGC choose four collision-resistant hash functions 𝐻1: {0,1}* → G, 𝐻2: {0,1}* → G, 𝐻3: {0,1}* → G, and 𝐻4: {0,1}* → 𝑍𝑞*, and define the system parameters as 𝑝𝑎𝑟𝑎𝑚𝑠 = {G, G𝑇, 𝑒, 𝑞, 𝑑, 𝑃, 𝑃𝑝𝑢𝑏, 𝐻1, 𝐻2, 𝐻3, 𝐻4}.
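The four hash functions of the system initialization can be instantiated in a sketch via domain separation over a single base hash. This is an assumption-laden toy: in the paper 𝐻1-𝐻3 map into the pairing group G, whereas below their outputs are modelled as scalars, and the modulus is an illustrative stand-in, not a parameter of the scheme.

```python
import hashlib

q = 2**255 - 19          # stand-in prime group order (assumption, not from the paper)

def H(tag: str, data: bytes) -> int:
    """Domain-separated hash {0,1}* -> Z_q: prefix the input with a per-oracle tag."""
    digest = hashlib.sha256(tag.encode() + b"|" + data).digest()
    return int.from_bytes(digest, "big") % q

H1 = lambda m: H("H1", m)    # {0,1}* -> G (modelled here as a scalar)
H2 = lambda m: H("H2", m)
H3 = lambda m: H("H3", m)
H4 = lambda m: H("H4", m)    # {0,1}* -> Z_q*
```

The tag prefix guarantees that the four oracles behave as independent functions even though they share one underlying hash, which is the standard way to derive several random oracles from one.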
Fig. 3. The algorithm procedure of FCL-PRE.
5.2. User registration phase

Before sharing data, each user must register their identity information with TA. Let the sender be denoted as 𝒮𝑗. First, 𝒮𝑗 transmits the realistic biometric information 𝑏𝑖𝑜 (e.g., a fingerprint) to TA via a secure channel. Then, TA applies the identity extraction function 𝐼𝑑𝐺𝑒𝑛(⋅) to convert 𝑏𝑖𝑜 into a unique biometric identity 𝑈𝐼𝐷𝑗 = 𝐼𝑑𝐺𝑒𝑛(𝑏𝑖𝑜). The 𝐼𝑑𝐺𝑒𝑛(⋅) function is similar to a hash function and is irreversible. It transforms the biometrics into an identity that is indistinguishable from random information and cannot be used to infer the original biometrics [39,41].

Next, TA generates a pseudo-identity as 𝑃𝑈𝐼𝐷𝑗 = 𝐸𝑛𝑐𝜙(𝑈𝐼𝐷𝑗 ∥ 𝑛𝑃𝑈𝐼𝐷) ∥ 𝑇𝑗 to protect the real biometric identity, where 𝑛𝑃𝑈𝐼𝐷 represents the number of pseudo-identities requested and 𝑇𝑗 is the validity period of the pseudo-identity. Meanwhile, TA internally maintains a mapping (𝑈𝐼𝐷𝑗, 𝑃𝑈𝐼𝐷𝑗, 𝜔), where 𝜔 is the attribute set associated with 𝑃𝑈𝐼𝐷𝑗. Eventually, TA publishes 𝑃𝑈𝐼𝐷𝑗 and keeps 𝑈𝐼𝐷𝑗 secret.

(1) Upon receiving the attribute set 𝜔 associated with 𝒮𝑗's pseudo-identity 𝑃𝑈𝐼𝐷𝑗, KGC first randomly selects a polynomial 𝑝(𝑥) of degree 𝑑 − 1 such that 𝑝(0) = 𝑠 and assigns 𝑝(𝜔𝑖) = 𝑠𝑖, where 𝑖 ∈ {1, …, 𝑛}. Then it calculates the partial private key as 𝐷𝑖,𝑗 = 𝑠𝑖𝐻1(𝑃𝑈𝐼𝐷𝑗). The collection (𝐷𝑖,𝑗), 𝑖 = 1, …, 𝑛, of 𝒮𝑗's partial private keys is denoted by KGC as 𝐷𝑃𝑈𝐼𝐷𝑗.
(2) After receiving the partial private key 𝐷𝑃𝑈𝐼𝐷𝑗, 𝒮𝑗 can calculate the Lagrange coefficients and perform a local verification to ensure consistency: 𝑒(𝐷𝑃𝑈𝐼𝐷𝑗, 𝑃) = 𝑒(𝐻1(𝑃𝑈𝐼𝐷𝑗), 𝑃𝑝𝑢𝑏). Then, 𝒮𝑗 chooses a random secret value 𝑥𝑃𝑈𝐼𝐷𝑗 ∈ 𝑍𝑞* and a polynomial 𝑦(𝑥) of degree 𝑑 − 1 such that 𝑦(0) = 𝑥𝑃𝑈𝐼𝐷𝑗, and lets 𝑦(𝜔𝑖) = 𝑥𝑖,𝑃𝑈𝐼𝐷𝑗, where 𝑖 ∈ {1, …, 𝑛}. The share vector (𝑥𝑖,𝑃𝑈𝐼𝐷𝑗), 𝑖 = 1, …, 𝑛, is then treated as 𝒮𝑗's secret value 𝑥𝑃𝑈𝐼𝐷𝑗.
(3) Having obtained 𝐷𝑃𝑈𝐼𝐷𝑗, 𝒮𝑗 sets the full private key as 𝑆𝐾𝑃𝑈𝐼𝐷𝑗 = (𝐷𝑃𝑈𝐼𝐷𝑗, 𝑥𝑃𝑈𝐼𝐷𝑗).
(4) 𝒮𝑗 calculates 𝑃𝐾𝑃𝑈𝐼𝐷𝑗 = 𝑥𝑃𝑈𝐼𝐷𝑗𝑃 as the public key and publishes it.

5.3. Data encryption phase

Given 𝒮𝑗's identity 𝑃𝑈𝐼𝐷𝑗 associated with an attribute set 𝜔 = (𝜔1, …, 𝜔𝑛), the public key 𝑃𝐾𝑃𝑈𝐼𝐷𝑗, and a message 𝑚:

(1) 𝒮𝑗 picks a random number 𝑟𝑗 ∈ 𝑍𝑞* and a polynomial 𝑔(𝑥) of degree 𝑑 − 1 such that 𝑔(0) = 𝑟𝑗, and assigns 𝑔(𝜔𝑖) = 𝑟𝑖,𝑗, where 𝑖 ∈ {1, …, 𝑛}. Then, 𝒮𝑗 computes

𝑈1 = 𝑟𝑗𝑃, 𝐸𝑗 = 𝐻2(𝑃𝑈𝐼𝐷𝑗 ∥ 𝑃𝐾𝑃𝑈𝐼𝐷𝑗 ∥ 𝑃𝑝𝑢𝑏),
𝑉1 = 𝑚 Π_{𝜔𝑖∈𝑆} (𝑒(𝑃𝑝𝑢𝑏, 𝐻1(𝑃𝑈𝐼𝐷𝑗))^{𝑟𝑖,𝑗} × 𝑒(𝑃𝐾𝑃𝑈𝐼𝐷𝑗, 𝐸𝑗)^{𝑟𝑖,𝑗})^{𝛥𝜔𝑖,𝑆(0)}.

𝒮𝑗 uploads the original ciphertext 𝐶𝑇 = (𝑈1, 𝑉1) to the CPS.
(2) Finally, 𝒮𝑗 selects 𝑘 ∈ 𝑍𝑞* randomly and computes 𝑅 = 𝑘𝑃 and ℎ = 𝐻4(𝑈1 ∥ 𝑉1 ∥ 𝑅 ∥ 𝑃𝐾𝑃𝑈𝐼𝐷𝑗 ∥ 𝑃𝑈𝐼𝐷𝑗). Then, 𝒮𝑗 generates a signature 𝜎𝑗 = 𝑘 + ℎ𝑥𝑃𝑈𝐼𝐷𝑗 mod 𝑞 and transmits (𝑅, 𝜎𝑗) to the CPS.

5.4. Verification and sharing phase

When a new receiver ℛ𝑗′ initiates an access request, ℛ𝑗′ first needs to send its current pseudo-identity to CPS. After the identity authentication succeeds, CPS performs re-encryption operations based on this pseudo-identity.

(1) The CPS first computes ℎ = 𝐻4(𝑈1 ∥ 𝑉1 ∥ 𝑅 ∥ 𝑃𝐾𝑃𝑈𝐼𝐷𝑗 ∥ 𝑃𝑈𝐼𝐷𝑗) and checks whether 𝜎𝑗𝑃 = 𝑅 + ℎ𝑃𝐾𝑃𝑈𝐼𝐷𝑗. After the signature verification succeeds, CPS randomly selects a 𝑑-element subset 𝑆 ⊆ 𝜔 ∩ 𝜔′ and determines whether the input attribute set 𝜔′ satisfies |𝜔 ∩ 𝜔′| ≥ 𝑑; if yes, CPS returns the result to the sender.
(2) 𝒮𝑗 generates the corresponding re-encryption key for the pseudo-identity based on the result. 𝒮𝑗 computes 𝜑 = 𝑒(𝐷𝑃𝑈𝐼𝐷𝑗, 𝐻1(𝑃𝑈𝐼𝐷𝑗′)) and 𝑅𝐾𝒮,𝜔,ℛ = −𝐷𝑃𝑈𝐼𝐷𝑗 − 𝑥𝑃𝑈𝐼𝐷𝑗𝐸𝑗 + 𝐻3(𝜑 ∥ 𝑥𝑃𝑈𝐼𝐷𝑗𝑃𝐾𝑃𝑈𝐼𝐷𝑗′ ∥ 𝜔 ∥ 𝜔′), and then sends 𝑅𝐾𝒮,𝜔,ℛ to CPS.
(3) Finally, CPS can use the re-encryption key 𝑅𝐾𝒮,𝜔,ℛ to convert 𝐶𝑇 into a re-encrypted ciphertext 𝐶𝑇′. It computes 𝑈2 = 𝑈1 and 𝑉2 = 𝑉1 ⋅ 𝑒(𝑈1, 𝑅𝐾𝒮,𝜔,ℛ), and then outputs 𝐶𝑇′ = (𝑈2, 𝑉2) to the authorized recipient.

5.5. Data decryption phase

The procedure to decrypt the original ciphertext and the re-encrypted ciphertext is as follows:
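The upload signature 𝜎𝑗 = 𝑘 + ℎ𝑥𝑃𝑈𝐼𝐷𝑗 and the CPS check 𝜎𝑗𝑃 = 𝑅 + ℎ𝑃𝐾𝑃𝑈𝐼𝐷𝑗 follow a Schnorr-style pattern. A sketch over a toy multiplicative group, with 𝑔^𝑥 standing in for the scalar multiplication 𝑥𝑃 (all group parameters are illustrative, not the paper's pairing group):

```python
import hashlib, random

q = 1019                     # toy subgroup order (illustrative)
p = 2 * q + 1                # 2039 is prime, so Z_p* has a subgroup of order q
g = 4                        # generator of that order-q subgroup

def H4(*parts) -> int:       # stand-in for the scheme's H4: {0,1}* -> Z_q
    data = "|".join(str(x) for x in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

random.seed(7)
x = random.randrange(1, q)   # the signer's secret value x_PUIDj
PK = pow(g, x, p)            # public key: x_PUIDj P, here g^x

# Sign the upload (U1, V1) together with R and the pseudo-identity
U1, V1, PUID = 123, 456, "PUID_j"
k = random.randrange(1, q)
R = pow(g, k, p)             # R = kP
h = H4(U1, V1, R, PK, PUID)
sigma = (k + h * x) % q      # sigma_j = k + h * x_PUIDj mod q

# CPS verification: sigma P == R + h PK, i.e. multiplicatively g^sigma == R * PK^h
assert pow(g, sigma, p) == R * pow(PK, h, p) % p
```

Binding 𝑈1, 𝑉1, 𝑅, the public key and the pseudo-identity inside ℎ ties the signature to this particular ciphertext upload.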
Correctness

For the original ciphertext 𝐶𝑇 = (𝑈1, 𝑉1):

𝑉1 / Π_{𝜔𝑖∈𝑆} 𝑒(𝑈1, 𝐷𝑖,𝑗 + 𝑥𝑖,𝑃𝑈𝐼𝐷𝑗𝐸𝑗)^{𝛥𝜔𝑖,𝑆(0)}
= 𝑚 Π_{𝜔𝑖∈𝑆} (𝑒(𝑃𝑝𝑢𝑏, 𝐻1(𝑃𝑈𝐼𝐷𝑗))^{𝑟𝑖,𝑗} × 𝑒(𝑃𝐾𝑃𝑈𝐼𝐷𝑗, 𝐸𝑗)^{𝑟𝑖,𝑗})^{𝛥𝜔𝑖,𝑆(0)} / Π_{𝜔𝑖∈𝑆} 𝑒(𝑈1, 𝐷𝑖,𝑗 + 𝑥𝑖,𝑃𝑈𝐼𝐷𝑗𝐸𝑗)^{𝛥𝜔𝑖,𝑆(0)}
= 𝑚 𝑒(𝑠𝑃, Σ_{𝜔𝑖∈𝑆} 𝑔(𝜔𝑖)𝛥𝜔𝑖,𝑆(0) 𝐻1(𝑃𝑈𝐼𝐷𝑗)) 𝑒(𝑥𝑃𝑈𝐼𝐷𝑗𝑃, Σ_{𝜔𝑖∈𝑆} 𝑔(𝜔𝑖)𝛥𝜔𝑖,𝑆(0) 𝐸𝑗) / [𝑒(𝑟𝑗𝑃, Σ_{𝜔𝑖∈𝑆} 𝑝(𝜔𝑖)𝛥𝜔𝑖,𝑆(0) 𝐻1(𝑃𝑈𝐼𝐷𝑗)) 𝑒(𝑟𝑗𝑃, 𝑥𝑃𝑈𝐼𝐷𝑗𝐸𝑗)]
= 𝑚 𝑒(𝑠𝑃, 𝑟𝑗𝐻1(𝑃𝑈𝐼𝐷𝑗)) 𝑒(𝑥𝑃𝑈𝐼𝐷𝑗𝑃, 𝑟𝑗𝐸𝑗) / [𝑒(𝑟𝑗𝑃, 𝑠𝐻1(𝑃𝑈𝐼𝐷𝑗)) 𝑒(𝑟𝑗𝑃, 𝑥𝑃𝑈𝐼𝐷𝑗𝐸𝑗)] = 𝑚,

since Σ_{𝜔𝑖∈𝑆} 𝑝(𝜔𝑖)𝛥𝜔𝑖,𝑆(0) = 𝑠, Σ_{𝜔𝑖∈𝑆} 𝑔(𝜔𝑖)𝛥𝜔𝑖,𝑆(0) = 𝑟𝑗, and Σ_{𝜔𝑖∈𝑆} 𝑦(𝜔𝑖)𝛥𝜔𝑖,𝑆(0) = 𝑥𝑃𝑈𝐼𝐷𝑗 by Lagrange interpolation.

For the re-encrypted ciphertext 𝐶𝑇′ = (𝑈2, 𝑉2), let 𝐻3(⋅) abbreviate 𝐻3(𝜑 ∥ 𝑥𝑃𝑈𝐼𝐷𝑗𝑃𝐾𝑃𝑈𝐼𝐷𝑗′ ∥ 𝜔 ∥ 𝜔′); the receiver can evaluate the same value because 𝑥𝑃𝑈𝐼𝐷𝑗𝑃𝐾𝑃𝑈𝐼𝐷𝑗′ = 𝑥𝑃𝑈𝐼𝐷𝑗′𝑃𝐾𝑃𝑈𝐼𝐷𝑗, and let 𝐷𝑃𝑈𝐼𝐷𝑗 denote the aggregated value Σ_{𝜔𝑖∈𝑆} 𝐷𝑖,𝑗𝛥𝜔𝑖,𝑆(0) = 𝑠𝐻1(𝑃𝑈𝐼𝐷𝑗). Then

𝑉2 / Π_{𝜔𝑖∈𝑆} 𝑒(𝑈2, 𝐻3(⋅))^{𝛥𝜔𝑖,𝑆(0)}
= 𝑚 Π_{𝜔𝑖∈𝑆} (𝑒(𝑃𝑝𝑢𝑏, 𝐻1(𝑃𝑈𝐼𝐷𝑗))^{𝑟𝑖,𝑗} × 𝑒(𝑃𝐾𝑃𝑈𝐼𝐷𝑗, 𝐸𝑗)^{𝑟𝑖,𝑗})^{𝛥𝜔𝑖,𝑆(0)} 𝑒(𝑈1, 𝑅𝐾𝒮,𝜔,ℛ) / 𝑒(𝑟𝑗𝑃, 𝐻3(⋅))
= 𝑚 𝑒(𝑟𝑗𝑃, 𝐷𝑃𝑈𝐼𝐷𝑗 + 𝑥𝑃𝑈𝐼𝐷𝑗𝐸𝑗) 𝑒(𝑟𝑗𝑃, −𝐷𝑃𝑈𝐼𝐷𝑗 − 𝑥𝑃𝑈𝐼𝐷𝑗𝐸𝑗 + 𝐻3(⋅)) / 𝑒(𝑟𝑗𝑃, 𝐻3(⋅))
= 𝑚.
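The cancellation in the correctness argument can be checked numerically with a toy symmetric "pairing" that operates on exponents, 𝑒(𝑥𝑃, 𝑦𝑃) = 𝑔^{𝑥𝑦}. This is only a sketch under toy-sized, insecure parameters; all scalar "points" stand in for the paper's group elements.

```python
import random

q = 1019                 # order of G and G_T (toy prime; illustrative only)
p = 2 * q + 1            # p = 2039 is prime, so Z_p* has an order-q subgroup
g = 4                    # generator of that subgroup

def pair(x, y):          # e(xP, yP) = g^{xy}: a toy symmetric "pairing"
    return pow(g, (x * y) % q, p)

def poly_eval(coeffs, x):
    return sum(c * pow(x, j, q) for j, c in enumerate(coeffs)) % q

def lagrange0(S):        # Delta_{w,S}(0) mod q for each w in S
    lam = {}
    for wi in S:
        num = den = 1
        for wk in S:
            if wk != wi:
                num = num * (-wk) % q
                den = den * (wi - wk) % q
        lam[wi] = num * pow(den, -1, q) % q
    return lam

rng = random.Random(3)
d, omega = 3, [1, 2, 3, 4, 5]
s  = rng.randrange(1, q)                              # master key, Ppub = sP
h1 = rng.randrange(1, q)                              # H1(PUID_j) as a "point"
Ej = rng.randrange(1, q)                              # E_j = H2(...) as a "point"
x  = rng.randrange(1, q)                              # x_PUIDj, PK = xP
sh = [s] + [rng.randrange(q) for _ in range(d - 1)]   # p(x) with p(0) = s
yx = [x] + [rng.randrange(q) for _ in range(d - 1)]   # y(x) with y(0) = x
D  = {w: poly_eval(sh, w) * h1 % q for w in omega}    # D_i = s_i H1(PUID_j)
xi = {w: poly_eval(yx, w) for w in omega}             # x_{i,PUIDj}

# Encryption: CT = (U1, V1)
m  = 321
rj = rng.randrange(1, q)
gx = [rj] + [rng.randrange(q) for _ in range(d - 1)]  # g(x) with g(0) = r_j
r  = {w: poly_eval(gx, w) for w in omega}             # r_{i,j}
S, lam = omega[:d], lagrange0(omega[:d])
U1, V1 = rj, m
for w in S:
    blind = pow(pair(s, h1), r[w], p) * pow(pair(x, Ej), r[w], p) % p
    V1 = V1 * pow(blind, lam[w], p) % p

# Decryption: m = V1 / prod e(U1, D_i + x_i E_j)^{Delta}
denom = 1
for w in S:
    denom = denom * pow(pair(U1, (D[w] + xi[w] * Ej) % q), lam[w], p) % p
assert V1 * pow(denom, -1, p) % p == m
```

The Lagrange coefficients collapse the per-attribute blinding factors to 𝑔^{𝑟𝑗(𝑠ℎ1 + 𝑥𝐸𝑗)} on both sides, which is exactly the cancellation the derivation exhibits.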
(1) For the original ciphertext 𝐶𝑇, sender 𝑗 can get the plaintext by computing
$$m = \frac{V_1}{\prod_{\omega_i \in S} e(U_1,\ D_{PUID_j} + x_{PUID_j}E_j)^{\Delta_{\omega_i,S}(0)}}.$$
(2) For the re-encrypted ciphertext 𝐶𝑇′, only authorized receivers can successfully obtain the data:
$$m = \frac{V_2}{\prod_{\omega_i \in S} e\left(U_2,\ H_3(\varphi \parallel x_{PUID_j}PK_{PUID_{j'}} \parallel \omega \parallel \omega')\right)^{\Delta_{\omega_i,S}(0)}}.$$

6. Security analysis

6.1. Security proof for FCL-PRE

Theorem 1. If adversary 𝒜1 breaks FCL-PRE with a non-negligible advantage 𝜀, we can construct an algorithm ℬ that solves the DBDH assumption in polynomial time with an advantage 𝜀′.

Proof. Given a challenge instance (𝑃, 𝑎𝑃, 𝑏𝑃, 𝑐𝑃, 𝑇), ℬ acts as a subroutine of the adversary 𝒜1 and attempts to determine whether 𝑇 = 𝑒(𝑃, 𝑃)^𝑎𝑏𝑐. Therefore, ℬ needs to answer a series of inquiries from 𝒜1.

∙ Initialization. By executing the Setup algorithm, ℬ gets 𝑝𝑎𝑟𝑎𝑚𝑠 = {G, G𝑇, 𝑞, 𝑒, 𝑑, 𝑃, 𝑃𝑝𝑢𝑏, 𝐻1, 𝐻2, 𝐻3}. Then, ℬ sets 𝑃𝑝𝑢𝑏 = 𝑎𝑃, where 𝑎 plays the role of the master key and is unknown to ℬ.

𝐻1 Query: ℬ maintains an initially empty list of the form 𝐿1⟨𝑃𝑈𝐼𝐷, 𝐻1(𝑃𝑈𝐼𝐷), (𝑧𝑖)𝑛𝑖=1, 𝛼𝑢⟩. 𝒜1 publishes 𝑃𝑈𝐼𝐷 for query. ℬ first chooses 𝜋 ∈ {1, 2, …, 𝑞𝐻1} and defines 𝑃𝑈𝐼𝐷𝜋 as the challenge identity. If 𝑃𝑈𝐼𝐷 already exists in the 𝐿1, ℬ restores the corresponding record and returns 𝐻1(𝑃𝑈𝐼𝐷) to 𝒜1. Otherwise, for this tuple, ℬ considers the following two cases:
Case 1: If |𝜔 ∩ 𝜔𝜋| ≥ 𝑑, ℬ randomly selects a polynomial 𝑡(𝑥) of degree 𝑑 − 1, returns the corresponding hash value to 𝒜1, and then saves the tuple (𝑃𝑈𝐼𝐷, 𝐻1(𝑃𝑈𝐼𝐷), ⊥, ⊥) in the 𝐿1.
Case 2: If |𝜔 ∩ 𝜔𝜋| < 𝑑, ℬ needs to select 𝛼𝑢 ∈ {0, 1} at random, where the probability of 𝛼𝑢 = 1 is 𝛾.
(1) When 𝛼𝑢 = 0, ℬ chooses a random number 𝑧 ∈ 𝑍𝑞 and a polynomial 𝑦(𝑥) of degree 𝑑 − 1 with 𝑦(0) = 𝑧. Let 𝑧𝑖 = 𝑦(𝜔𝑖), where 𝑖 ∈ {1, …, 𝑛}. ℬ calculates 𝐻1(𝑃𝑈𝐼𝐷) = 𝑧𝑖𝑐𝑃, and saves the tuple (𝑃𝑈𝐼𝐷, 𝑧𝑖𝑐𝑃, (𝑧𝑖)𝑛𝑖=1, 0) in the 𝐿1.
(2) When 𝛼𝑢 = 1, ℬ selects 𝑧 ∈ 𝑍𝑞, outputs 𝐻1(𝑃𝑈𝐼𝐷) = 𝑧𝑃 and saves the tuple (𝑃𝑈𝐼𝐷, 𝑧𝑃, 𝑧, 1) in the 𝐿1.

𝐻2 Query: ℬ maintains an initially empty list of the form 𝐿2⟨𝑃𝑈𝐼𝐷, 𝑡𝑖, 𝑌𝑖⟩. When 𝒜1 makes a query, if 𝑃𝑈𝐼𝐷 already exists in the 𝐿2, ℬ answers with 𝑌𝑖; otherwise it randomly selects 𝑡𝑖 ∈ 𝑍𝑞, calculates 𝑌𝑖 = 𝑡𝑖𝑃 and adds the tuple (𝑃𝑈𝐼𝐷, 𝑡𝑖, 𝑌𝑖) to the 𝐿2.

𝐻3 Query: ℬ maintains an initially empty list of the form 𝐿3⟨𝑋, 𝐻⟩. If 𝑋 is in the list 𝐿3, ℬ returns 𝐻 to 𝒜1. Otherwise, ℬ uniformly selects an element 𝐻 ∈ G, returns it and records the pair (𝑋, 𝐻) in 𝐿3.

∙ Phase 1. For a series of inquiries raised by 𝒜1, ℬ answers as follows.
PPKQuery oracle 𝒪𝑝𝑝𝑘: 𝒜1 publishes an identity 𝑃𝑈𝐼𝐷 for query, and ℬ maintains a list of the form 𝐿𝑝𝑝𝑘⟨𝑃𝑈𝐼𝐷, 𝐷𝑃𝑈𝐼𝐷⟩ as the answer to 𝒜1. If 𝑃𝑈𝐼𝐷 already exists in the 𝐿𝑝𝑝𝑘, ℬ first performs the 𝐻1 Query in the above steps to obtain 𝐻1(𝑃𝑈𝐼𝐷). Otherwise, ℬ finds the tuple in the 𝐿1:
Case 1: If |𝜔 ∩ 𝜔𝜋| ≥ 𝑑, the challenger ℬ aborts and outputs fault.
Case 2: If |𝜔 ∩ 𝜔𝜋| < 𝑑, ℬ randomly selects a polynomial 𝑝(𝑥) of degree 𝑑 − 1 with 𝑝(0) = 𝑎, lets 𝑝(𝜔𝑖) = 𝑎𝑖, where 𝑖 ∈ {1, …, 𝑛}, returns 𝑧𝑖𝑎𝑃 to 𝒜1, and saves the tuple (𝑃𝑈𝐼𝐷, 𝐷𝑃𝑈𝐼𝐷) in the 𝐿𝑝𝑝𝑘.

PKQuery oracle 𝒪𝑝𝑘: 𝒜1 publishes an identity 𝑃𝑈𝐼𝐷 for query, and ℬ maintains a list of the form 𝐿𝑝𝑢𝑏⟨𝑃𝑈𝐼𝐷, 𝑃𝐾𝑃𝑈𝐼𝐷, (𝑥𝑖,𝑃𝑈𝐼𝐷)𝑛𝑖=1⟩ as the answer to 𝒜1. If 𝑃𝑈𝐼𝐷 already exists in the 𝐿𝑝𝑢𝑏, ℬ restores the corresponding record and returns 𝑃𝐾𝑃𝑈𝐼𝐷 to 𝒜1. Otherwise, ℬ randomly selects 𝑥𝑗 ∈ 𝑍𝑞 and a polynomial 𝑦(𝑥) of degree 𝑑 − 1 with 𝑦(0) = 𝑥𝑗, and lets 𝑦(𝜔𝑖) = 𝑥𝑖,𝑃𝑈𝐼𝐷, where 𝑖 ∈ {1, …, 𝑛}. In this case, we suppose that 𝑥𝑃𝑈𝐼𝐷 = (𝑥𝑖,𝑃𝑈𝐼𝐷)𝑛𝑖=1, while ℬ calculates 𝑃𝐾𝑃𝑈𝐼𝐷 = 𝑥𝑃𝑈𝐼𝐷𝑃 and returns it to 𝒜1. Finally, ℬ maintains (𝑃𝑈𝐼𝐷, 𝑃𝐾𝑃𝑈𝐼𝐷, (𝑥𝑖,𝑃𝑈𝐼𝐷)𝑛𝑖=1) in 𝐿𝑝𝑢𝑏.

PK replacement oracle 𝒪𝑝𝑘𝑟𝑝: When 𝒜1 queries the tuple (𝑃𝑈𝐼𝐷, 𝑃̃𝐾𝑃𝑈𝐼𝐷), if 𝑃𝑈𝐼𝐷 has not been queried for the public key, ℬ generates a public key query on 𝑃𝑈𝐼𝐷 to obtain 𝑃̃𝐾𝑃𝑈𝐼𝐷 and records (𝑃𝑈𝐼𝐷, 𝑃̃𝐾𝑃𝑈𝐼𝐷, ⊥) in 𝐿𝑝𝑢𝑏. Otherwise, ℬ maintains (𝑃𝑈𝐼𝐷, 𝑃̃𝐾𝑃𝑈𝐼𝐷, ⊥) in 𝐿𝑝𝑢𝑏.

SKQuery oracle 𝒪𝑠𝑘: 𝒜1 publishes an identity 𝑃𝑈𝐼𝐷 for query, and ℬ maintains a list of the form 𝐿𝑠𝑘⟨𝑃𝑈𝐼𝐷, 𝑆𝐾𝑃𝑈𝐼𝐷⟩ as the answer to 𝒜1. If 𝑃𝑈𝐼𝐷 has already been queried, ℬ restores the corresponding record and returns 𝑆𝐾𝑃𝑈𝐼𝐷 to 𝒜1; otherwise, ℬ considers the following two cases:
Case 1: If |𝜔 ∩ 𝜔𝜋| ≥ 𝑑, ℬ aborts and outputs fault.
Case 2: If |𝜔 ∩ 𝜔𝜋| < 𝑑, ℬ returns the 𝑆𝐾𝑃𝑈𝐼𝐷 to 𝒜1 and saves the tuple (𝑃𝑈𝐼𝐷, 𝐷𝑃𝑈𝐼𝐷, 𝑥𝑃𝑈𝐼𝐷) in the 𝐿𝑠𝑘.

ReKeyGen oracle 𝒪𝑟𝑘: ℬ first searches whether the tuple (𝑃𝑈𝐼𝐷, 𝑃𝑈𝐼𝐷′, 𝑅𝐾𝜔→𝜔′) exists in the 𝐿𝑟𝑘. If so, ℬ returns 𝑅𝐾𝜔→𝜔′ to 𝒜1. Otherwise, we suppose that 𝒜1 has conducted the above series of queries when querying the random oracle, so when |𝜔 ∩ 𝜔𝜋| ≥ 𝑑, ℬ follows the steps below:
Case 1: When 𝛼1 = 1, ℬ follows the above steps to obtain 𝑃𝑈𝐼𝐷's public–private key pair (𝑆𝐾𝑃𝑈𝐼𝐷, 𝑃𝐾𝑃𝑈𝐼𝐷) and the public key 𝑃𝐾𝑃𝑈𝐼𝐷′ of 𝑃𝑈𝐼𝐷′. Then, ℬ calculates 𝜑 = 𝑒(𝐷𝑃𝑈𝐼𝐷, 𝐻1(𝑃𝑈𝐼𝐷′)) and the re-encryption key 𝑅𝐾𝜔→𝜔′ = −𝐷𝑃𝑈𝐼𝐷𝑗 − 𝑥𝑃𝑈𝐼𝐷𝑗𝐸𝑗 + 𝐻3(𝜑 ∥ 𝑥𝑃𝑈𝐼𝐷𝑗𝑃𝐾𝑃𝑈𝐼𝐷𝑗′ ∥ 𝜔 ∥ 𝜔′).
Case 2: When 𝛼1 = 0 and 𝛼2 = 1, ℬ's response fails.
Case 3: When 𝛼1 = 0 and 𝛼2 = 0, ℬ randomly selects 𝑅𝐾𝜔→𝜔′ ∈ G and returns it to 𝒜1.

Re-encryption oracle 𝒪𝑟𝑒𝑒𝑛: Suppose that the public key of 𝑃𝑈𝐼𝐷 has not been replaced and the original ciphertext is 𝐶𝑇 = (𝑈1, 𝑉1) at this time.
Case 1: If |𝜔 ∩ 𝜔𝜋| < 𝑑, ℬ aborts and outputs fault.
Case 2: If |𝜔 ∩ 𝜔𝜋| ≥ 𝑑, ℬ considers the following two cases:
(1) If 𝛼𝑢 = 1, ℬ aborts and outputs fault.
(2) If 𝛼𝑢 = 0, ℬ re-encrypts the 𝐶𝑇 into 𝐶𝑇′ = (𝑈1, 𝑉1 ⋅ 𝑒(𝑈1, 𝑅𝐾𝜔→𝜔′)) and sends it to 𝒜1.

∙ Challenge. 𝒜1 outputs 𝑃𝑈𝐼𝐷𝜋 and two messages of equal length (𝑚0, 𝑚1). If the flag variable 𝛼𝑢 ≠ 0 for the challenge identity 𝑃𝑈𝐼𝐷𝜋, ℬ fails in this game. Otherwise, ℬ randomly selects a message 𝑚𝑏, where 𝑏 ∈ {0, 1}, calculates the ciphertext 𝐶𝑇𝑏 = (𝑈𝑏, 𝑉𝑏) = (𝑏𝑃, 𝑚𝑏 ∏𝜔𝑖∈𝑆 𝑒(𝑃𝐾𝑃𝑈𝐼𝐷𝜋, 𝑡𝑖𝑏𝑃)𝑇^𝛥𝜔𝑖,𝑆(0)) and sends 𝐶𝑇𝑏 to 𝒜1.

∙ Phase 2. Adversary 𝒜1 initiates a series of queries similar to Phase 1, and ℬ responds accordingly. Please note that the queries issued by 𝒜1 in this phase must comply with the constraints in the security model.

∙ Guess. Once the adversary 𝒜1 provides a guess 𝑏′ ∈ {0, 1} for the challenge bit, ℬ outputs 1 if 𝑏′ = 𝑏 and 0 otherwise. □

Theorem 2. If adversary 𝒜2 breaks FCL-PRE with a non-negligible advantage 𝜀, we can construct an algorithm ℬ that solves the DBDH assumption in polynomial time with an advantage 𝜀′.

Proof. The proof is similar to that of Theorem 1; therefore, only the main differences are presented below.

∙ Initialization. ℬ returns the 𝑝𝑎𝑟𝑎𝑚𝑠 and 𝑚𝑠𝑘 = 𝑠 to 𝒜2. It should be noted that 𝒜2 represents the KGC, which has access to the partial private key computed by challenger ℬ. Therefore, in this case, there is no need to simulate the PartialPrivateKey algorithm or the hash function 𝐻1. Next, ℬ randomly chooses an integer 𝑟 ∈ [1, 𝑞𝐻2], and to the queries raised by 𝒜2, ℬ answers as follows:

𝐻2 Query: When 𝒜2 queries an existing 𝑃𝑈𝐼𝐷 in 𝐿2, ℬ responds with 𝑌𝑖; otherwise it considers the following two situations:
Case 1: If 𝑗 = 𝑟, ℬ computes 𝐻2(𝑃𝑈𝐼𝐷𝑗 ∥ 𝑃𝐾𝑃𝑈𝐼𝐷𝑗 ∥ 𝑃𝑝𝑢𝑏) = 𝑐𝑃 and returns it to 𝒜2.
Case 2: If 𝑗 ≠ 𝑟, ℬ randomly selects 𝑡𝑖 ∈ 𝑍𝑞, calculates 𝑌𝑖 = 𝑡𝑖𝑃, and returns it to 𝒜2. Finally, ℬ adds the tuple (𝑃𝑈𝐼𝐷, 𝑡𝑖, 𝑌𝑖) to 𝐿2.

∙ Phase 1. For a series of inquiries raised by 𝒜2, ℬ answers as follows.

PKQuery oracle 𝒪𝑝𝑘: 𝒜2 publishes an identity 𝑃𝑈𝐼𝐷 for query. ℬ first selects 𝜋 ∈ [1, 𝑞𝑝𝑢𝑏] randomly and defines 𝑃𝑈𝐼𝐷𝜋 as the challenge identity.
Case 1: If 𝑃𝑈𝐼𝐷 has been queried, ℬ restores the corresponding record and returns 𝑃𝐾𝑃𝑈𝐼𝐷 = 𝑥𝑃𝑈𝐼𝐷𝑃 to 𝒜2.
Case 2: If 𝑃𝑈𝐼𝐷 has not been queried, then ℬ considers the following scenario:
(1) If |𝜔 ∩ 𝜔𝜋| < 𝑑 and 𝑗 ≠ 𝜋, ℬ selects a random number 𝑥𝑖,𝑃𝑈𝐼𝐷𝑗 ∈ 𝑍𝑞 and a polynomial 𝑦(𝑥) of degree 𝑑 − 1 with 𝑦(0) = 𝑥𝑖,𝑃𝑈𝐼𝐷𝑗, and lets 𝑦(𝜔𝑖) = 𝑥𝑖,𝑃𝑈𝐼𝐷𝑗, where 𝑖 ∈ {1, …, 𝑛}. Next, ℬ calculates 𝑃𝐾𝑃𝑈𝐼𝐷 = 𝑥𝑃𝑈𝐼𝐷𝑃 and returns it to 𝒜2. Finally, ℬ saves the tuple (𝑃𝑈𝐼𝐷, (𝑥𝑖,𝑃𝑈𝐼𝐷𝑗)𝑛𝑖=1, 𝑃𝐾𝑃𝑈𝐼𝐷) to 𝐿𝑝𝑢𝑏.
(2) If |𝜔 ∩ 𝜔𝜋| ≥ 𝑑 and 𝑗 = 𝜋, ℬ calculates 𝑃𝐾𝑃𝑈𝐼𝐷 = 𝑎𝑃 and returns it to the adversary 𝒜2. Finally, ℬ maintains the tuple (𝑃𝑈𝐼𝐷𝜋, (𝑥𝑖,𝑃𝑈𝐼𝐷𝑗)𝑛𝑖=1, 𝑃𝐾𝑃𝑈𝐼𝐷) in the 𝐿𝑝𝑢𝑏.

SKQuery oracle 𝒪𝑠𝑘: ℬ considers the following two cases:
Case 1: If 𝑃𝑈𝐼𝐷 has been queried, ℬ restores the corresponding record and returns 𝑆𝐾𝑃𝑈𝐼𝐷 to 𝒜2.
Case 2: If 𝑃𝑈𝐼𝐷 has not been queried, ℬ considers the following scenario:
(1) If |𝜔 ∩ 𝜔𝜋| < 𝑑 and 𝑗 ≠ 𝑟, ℬ makes sure that 𝒜2 has performed PKQuery and all hash queries. Then, ℬ calculates 𝐷𝑃𝑈𝐼𝐷 and returns the 𝑆𝐾𝑃𝑈𝐼𝐷 = (𝐷𝑃𝑈𝐼𝐷, 𝑥𝑃𝑈𝐼𝐷) to 𝒜2, while saving the tuple (𝑃𝑈𝐼𝐷, 𝐷𝑃𝑈𝐼𝐷, 𝑥𝑃𝑈𝐼𝐷) in the 𝐿𝑠𝑘.
(2) If |𝜔 ∩ 𝜔𝜋| ≥ 𝑑 and 𝑗 = 𝑟, ℬ aborts and outputs fault.

ReKeyGen oracle 𝒪𝑟𝑘: For the re-encryption key queries of 𝑃𝑈𝐼𝐷 and 𝑃𝑈𝐼𝐷′, when |𝜔 ∩ 𝜔𝜋| ≥ 𝑑, ℬ makes the following answer:
(1) If 𝑗 ≠ 𝑟, the challenger ℬ outputs the re-encryption key 𝑅𝐾𝜔→𝜔′ = −𝐷𝑃𝑈𝐼𝐷𝑗 − 𝑥𝑃𝑈𝐼𝐷𝑗𝐸𝑗 + 𝐻3(𝜑 ∥ 𝑥𝑃𝑈𝐼𝐷𝑗𝑃𝐾𝑃𝑈𝐼𝐷𝑗′ ∥ 𝜔 ∥ 𝜔′).
(2) If 𝑗 = 𝑟 and the private key of 𝑃𝑈𝐼𝐷 has been queried, ℬ responds with failure.
(3) If 𝑗 = 𝑟 and the private key of 𝑃𝑈𝐼𝐷 has not been queried, ℬ randomly selects 𝑅𝐾𝜔→𝜔′ ∈ G as the answer and returns it to 𝒜2.

∙ Challenge. 𝒜2 outputs 𝑃𝑈𝐼𝐷𝜋 and two messages of equal length (𝑚0, 𝑚1). If the challenge identity 𝑃𝑈𝐼𝐷𝜋 ≠ 𝑃𝑈𝐼𝐷𝑟, ℬ fails in this game. Otherwise, ℬ randomly selects a message 𝑚𝑏, where 𝑏 ∈ {0, 1}, calculates the ciphertext 𝐶𝑇𝑏 = (𝑈𝑏, 𝑉𝑏) = (𝑏𝑃, 𝑚𝑏 ∏𝜔𝑖∈𝑆 𝑒(𝑏𝑃, 𝑠𝐻1(𝑃𝑈𝐼𝐷𝜋))𝑇^𝛥𝜔𝑖,𝑆(0)) and sends 𝐶𝑇𝑏 to 𝒜2. □

6.2. Security properties of FCL-PRE

• Confidentiality. According to the above security proof, the proposed FCL-PRE scheme is IND-CPA secure in the random oracle model under the DBDH assumption. In addition, before re-encryption, the proxy CPS needs to authenticate registered users, and re-encryption is only allowed when the original ciphertext meets a certain condition, which further enhances the confidentiality of the scheme.
• Anonymity. FCL-PRE converts each user's real biometric identity 𝑈𝐼𝐷𝑗 into a pseudo-identity 𝑃𝑈𝐼𝐷𝑗 = 𝐸𝑛𝑐𝜙(𝑈𝐼𝐷𝑗 ∥ 𝑛𝑃𝑈𝐼𝐷𝑗) ∥ 𝑇𝑗 through a symmetric encryption algorithm for hiding. Therefore, if an adversary wishes to obtain 𝑈𝐼𝐷𝑗, he/she must first acquire the symmetric key 𝜙. However, in our scheme, only a trusted TA can extract 𝜙, thereby ensuring the anonymity of the user's real identity.
• Error tolerance. We employ secret sharing technology to divide the system master key 𝑠 and the secret value 𝑥𝑃𝑈𝐼𝐷𝑗 into 𝑛 independent components. Based on these components, the sender 𝑗 generates the final complete private key and the corresponding ciphertext. In the verification phase, the ciphertext can be re-encrypted if the attribute set contains at least 𝑑 valid attributes. Here, 𝑑 is defined as an error tolerance parameter, so as to achieve the system's error tolerance and enhance its robustness.
• Collusion Resistance. Given the commercial nature of cloud service providers, a potential risk arises that they may collude with the receiver 𝑗 to acquire 𝑗's private key 𝑆𝐾𝑃𝑈𝐼𝐷𝑗 = (𝐷𝑃𝑈𝐼𝐷𝑗, 𝑥𝑃𝑈𝐼𝐷𝑗). However, under the threshold secret sharing, collusion between 𝑗 and CPS is infeasible. First, 𝑗's full private key consists of a partial private key 𝐷𝑃𝑈𝐼𝐷𝑗 and a secret value 𝑥𝑃𝑈𝐼𝐷𝑗, both of which are divided into 𝑛 components. This means that at least 𝑡 attribute shards must be obtained to recover one of the keys. Second, even if the colluder obtains 𝑥𝑃𝑈𝐼𝐷𝑗, they cannot deduce the sender's partial private key 𝐷𝑃𝑈𝐼𝐷𝑗, because 𝐷𝑃𝑈𝐼𝐷𝑗 = 𝑠𝐻1(𝑃𝑈𝐼𝐷𝑗), where 𝑠 is the master key. Since the master key 𝑠 is unknown to the colluder, they cannot calculate 𝐷𝑃𝑈𝐼𝐷𝑗.

7. Performance evaluation

This section provides a systematic performance evaluation of FCL-PRE and other related schemes from both theoretical and experimental perspectives. First, we built an experimental system on Ubuntu 20.10, using Python 3.10 and SageMath 9.8, setting the security parameter to 𝜆 = 256. The chosen elliptic curve 𝐸∕𝐹𝑝 is defined by the simplified Weierstrass equation 𝑦² = 𝑥³ + 𝑎𝑥 + 𝑏.

7.1. Theoretical analysis

Table 3 compares the number of modular exponentiations, scalar multiplications, and bilinear pairings for FCL-PRE, YDKR21 [43], FLWL24 [24], and ZZYL20 [44], to assess the computational overhead at different stages. All three references adopt CL-PRE in data-sharing scenarios. In the following, we focus on the major computational overhead on the sender side 𝑗.

Encryption: The efficiency ranking is YDKR21 [43] < FLWL24 [24] < Ours < ZZYL20 [44]. Since the biometric characteristic 𝑏𝑖𝑜 inevitably contains noise during collection, FCL-PRE binds each registered user's pseudo-identity to an attribute set {𝜔𝑖}𝑛𝑖=1. Consequently, during encryption, 𝑗 must bind attribute fragments to the message, ensuring both data confidentiality and system error tolerance.

ReKey Generation: The efficiency ranking is YDKR21 [43] < ZZYL20 [44] < Ours < FLWL24 [24]. In FCL-PRE, users are allowed to omit or update some attributes during key generation, eliminating the extra computational overhead associated with regenerating public–private key pairs. Moreover, even if the proxy CPS colludes with the receiver, it cannot deduce the user's real identity from the re-encryption key.

Decrypt1: The efficiency ranking is ZZYL20 [44] < YDKR21 [43] < FLWL24 [24] = Ours. Compared to ZZYL20 [44] and YDKR21 [43], FCL-PRE improves the decryption efficiency on the sender side 𝑗 by 40.57% and 44.6%, respectively, significantly reducing the computational burden.

In summary, by integrating certificateless encryption with secret sharing technology, FCL-PRE enhances user privacy and system error tolerance while effectively addressing the stringent privacy requirements in cloud-based data-sharing scenarios.

7.2. Experimental analysis

Computational overhead. To ensure the objectivity and accuracy of our results, we excluded the Setup algorithm from the experiment, as it is executed only once and has a negligible impact on the user encryption experience. For the remaining algorithms, each was executed 100 times, and the average execution time was recorded. Fig. 4 reports the execution time of all main stages in our scheme as a function of the number of receivers/messages. Specifically, Fig. 4(a)–(c) show the sender-side costs, including Encryption time, ReKey Generation time, and Decrypt1 time, respectively. Fig. 4(d) presents the Re-encryption time at the cloud proxy server, while Fig. 4(e) depicts the Decrypt2 time at the authorized receiver. Fig. 4(f) summarizes the total computational overhead across all parties. As the number of receivers/messages increases, all stages exhibit approximately linear growth. Our FCL-PRE scheme consistently incurs lower decryption time, re-encryption time, and overall computational cost than the compared schemes, as illustrated in Fig. 4(c), (d), and (f). These results demonstrate that FCL-PRE achieves better efficiency and scalability, particularly in multi-receiver settings.

Communication overhead. Table 3 compares the communication overhead of YDKR21 [43], FLWL24 [24], ZZYL20 [44], and our proposed scheme. The storage and transmission overheads of the data sender and cloud proxy server, including the original ciphertext, re-encryption key, and re-encrypted ciphertext, are discussed in detail.
Table 3
Comparison of cryptographic operations of related schemes (Encryption/ReKeyGen/Re-encryption/Decrypt1/Decrypt2: computational cost; CT1/CT2/ReKey: communication cost).
Scheme        Encryption           ReKeyGen            Re-encryption   Decrypt1            Decrypt2           CT1                   CT2             ReKey
YDKR21 [43]   𝑇𝑝 + 8𝑇𝑒             6𝑇𝑒                 2𝑇𝑝 + 2𝑇𝑒       𝑇𝑝 + 𝑇𝑒             𝑇𝑝 + 2𝑇𝑒           3|G| + 2|G𝑇|          4|G| + 2|G𝑇|    6|G| + 4|𝑍𝑞|
FLWL24 [24]   𝑇𝑝 + 3𝑇𝑒             2𝑇𝑒                 2𝑇𝑝             𝑇𝑝                  2𝑇𝑒                2|G| + |G𝑇|           3|G𝑇|           |G|
ZZYL20 [44]   2𝑇𝑒 + 𝑇𝑠𝑚            𝑇𝑝 + 3𝑇𝑒 + 𝑇𝑠𝑚      𝑇𝑝              𝑇𝑝 + 𝑇𝑒 + 𝑇𝑠𝑚       𝑇𝑝 + 𝑇𝑒 + 𝑇𝑠𝑚      2|G| + |𝑍𝑞|           2|G| + |𝑍𝑞|     |𝑍𝑞|
Ours          2𝑇𝑝 + 𝑇𝑒 + 2𝑇𝑠𝑚      𝑇𝑝 + 𝑇𝑒             𝑇𝑝              𝑇𝑝                  2𝑇𝑝                |G| + |G𝑇| + |𝑍𝑞|     |G| + |G𝑇|      |G| + 2|𝑍𝑞|
Fig. 4. The execution time of each phase: (a) Encryption; (b) ReKey Generation; (c) Decrypt1; (d) Re-encryption; (e) Decrypt2; (f) total execution time.
Fig. 5. Communication overhead comparison: (a) original ciphertext; (b) re-encrypted ciphertext; (c) re-encryption key.
Sender side: Regarding the transmission of the original ciphertext, our proposed scheme and ZZYL20 [44] achieve the lowest communication cost, as shown in Fig. 5(a). Although our scheme incurs slightly higher communication overhead for the transmission of the re-encryption key compared to ZZYL20 [44], it is worth noting that ZZYL20 pre-generates and stores the re-encryption key in the cloud, which may lead to a potential risk of key misuse. As we can see in Fig. 5(c), FCL-PRE requires only KB-level storage, making it well-suited for resource-constrained mobile devices without imposing a significant burden on the sender side.

Cloud proxy server (CPS) side: For the storage of re-encrypted ciphertext, our scheme also demonstrates the lowest communication cost, as
shown in Fig. 5(b). Even when the number of designated recipients is relatively large, i.e., 50 receivers, FCL-PRE requires only 12.5 KB of communication overhead at the CPS side. It indicates that FCL-PRE not only effectively minimizes the cloud's communication burden but also ensures a flexible and reliable sharing mechanism without compromising data security.

8. Conclusion

In this paper, we propose FCL-PRE, a fuzzy certificateless proxy re-encryption scheme that facilitates flexible key management while ensuring efficient and secure data sharing. By integrating anonymous biometric recognition, our approach conceals users' real identities, achieving effective conditional privacy and bolstering system error tolerance. Notably, we prevent malicious re-encryption requests by verifying the signature, while secret sharing technology enhances collusion resistance. Moreover, a formal security analysis under the random oracle model demonstrates that FCL-PRE resists chosen-plaintext attacks. Compared to existing schemes, FCL-PRE significantly reduces computational and communication overhead, achieving the lowest total computational cost and ciphertext storage overhead. In future work, we aim to optimize dynamic user revocation and enhance adaptability to real-world cloud environments with more complex access policies.

CRediT authorship contribution statement

Jiasheng Chen: Writing – original draft, Software, Methodology, Investigation, Formal analysis, Conceptualization. Zhenfu Cao: Writing – review & editing, Supervision, Resources, Funding acquisition. Liangliang Wang: Writing – review & editing, Validation, Methodology, Formal analysis, Data curation. Jiachen Shen: Validation, Supervision, Formal analysis. Xiaolei Dong: Validation, Funding acquisition, Formal analysis.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Grant No. 62132005, 62172162), in part by the Shanghai Trusted Industry Internet Software Collaborative Innovation Center, in part by the Fundamental Research Funds for the Central Universities, and in part by the Police Integration Computing Key Laboratory of Sichuan Province (Grant No. JWRH202401001).

Data availability

Data will be made available on request.

References

[1] Shuzhou Sun, Hui Ma, Zishuai Song, Rui Zhang, WebCloud: Web-based cloud storage for secure data sharing across platforms, IEEE Trans. Dependable Secur. Comput. 19 (3) (2020) 1871–1884.
[2] Maithilee Joshi, Karuna P. Joshi, Tim Finin, Delegated authorization framework for EHR services using attribute-based encryption, IEEE Trans. Serv. Comput. 14 (6) (2019) 1612–1623.
[3] Yinbin Miao, Robert H. Deng, Ximeng Liu, Kim-Kwang Raymond Choo, Hongjun Wu, Hongwei Li, Multi-authority attribute-based keyword search over encrypted cloud data, IEEE Trans. Dependable Secur. Comput. 18 (4) (2019) 1667–1680.
[4] Matt Blaze, Gerrit Bleumer, Martin Strauss, Divertible protocols and atomic proxy cryptography, in: International Conference on the Theory and Applications of Cryptographic Techniques, Springer, 1998, pp. 127–144.
[5] Matthew Green, Giuseppe Ateniese, Identity-based proxy re-encryption, in: Applied Cryptography and Network Security: 5th International Conference, ACNS 2007, Zhuhai, China, June 5–8, 2007, Springer, 2007, pp. 288–306.
[6] Chunpeng Ge, Willy Susilo, Jiandong Wang, Liming Fang, Identity-based conditional proxy re-encryption with fine-grained policy, Comput. Stand. Interfaces 52 (2017) 1–9.
[7] Hongmei Pei, Peng Yang, Weihao Li, Miao Du, Zhongjian Hu, Proxy re-encryption for secure data sharing with blockchain in internet of medical things, Comput. Netw. 245 (2024) 110373.
[8] Guijiang Liu, Haibo Xie, Wenming Wang, Haiping Huang, A secure and efficient electronic medical record data sharing scheme based on blockchain and proxy re-encryption, J. Cloud Comput. 13 (1) (2024) 44.
[9] Anca-Andreea Ivan, Yevgeniy Dodis, Proxy cryptography revisited, in: NDSS, 2003.
[10] Yang Lu, Efficient certificate-based proxy re-encryption scheme for data sharing in public clouds, KSII Trans. Internet Inf. Syst. (TIIS) 9 (7) (2015) 2703–2718.
[11] Zhiguang Qin, Hu Xiong, Shikun Wu, Jennifer Batamuliza, A survey of proxy re-encryption for secure data sharing in cloud computing, IEEE Trans. Serv. Comput. (2016) 1–18.
[12] Giuseppe Ateniese, Kevin Fu, Matthew Green, Susan Hohenberger, Improved proxy re-encryption schemes with applications to secure distributed storage, ACM Trans. Inf. Syst. Secur. (TISSEC) 9 (1) (2006) 1–30.
[13] Craig Gentry, Certificate-based encryption and the certificate revocation problem, in: International Conference on the Theory and Applications of Cryptographic Techniques, Springer, 2003, pp. 272–293.
[14] Chul Sur, Youngho Park, Sang Uk Shin, Kyung Hyune Rhee, Changho Seo, Certificate-based proxy re-encryption for public cloud storage, in: 2013 Seventh International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, IEEE, 2013, pp. 159–166.
[15] Chunpeng Ge, Zhe Liu, Jinyue Xia, Liming Fang, Revocable identity-based broadcast proxy re-encryption for data sharing in clouds, IEEE Trans. Dependable Secur. Comput. 18 (3) (2019) 1214–1226.
[16] Jing Zhang, Shuangshuang Su, Hong Zhong, Jie Cui, Debiao He, Identity-based broadcast proxy re-encryption for flexible data sharing in VANETs, IEEE Trans. Inf. Forensics Secur. 18 (2023) 4830–4842.
[17] Jiguo Li, Xuexia Zhao, Yichen Zhang, Certificate-based conditional proxy re-encryption, in: International Conference on Network and System Security, Springer, 2015, pp. 299–310.
[18] Jun Shao, Peng Liu, Yuan Zhou, Achieving key privacy without losing CCA security in proxy re-encryption, J. Syst. Softw. 85 (3) (2012) 655–665.
[19] Jian Weng, Robert H. Deng, Xuhua Ding, Cheng-Kang Chu, Junzuo Lai, Conditional proxy re-encryption secure against chosen-ciphertext attack, in: Proceedings of the 4th International Symposium on Information, Computer, and Communications Security, 2009, pp. 322–332.
[20] Cui Li, Rongmao Chen, Yi Wang, Qianqian Xing, Baosheng Wang, REEDS: An efficient revocable end-to-end encrypted message distribution system for IoT, IEEE Trans. Dependable Secur. Comput. 21 (5) (2024) 4526–4542.
[21] Shimao Yao, Ralph Voltaire J. Dayot, In-Ho Ra, Liya Xu, Zhuolin Mei, Jiaoli Shi, An identity-based proxy re-encryption scheme with single-hop conditional delegation and multi-hop ciphertext evolution for secure cloud data sharing, IEEE Trans. Inf. Forensics Secur. 18 (2023) 3833–3848.
[22] Giuseppe Ateniese, Karyn Benson, Susan Hohenberger, Key-private proxy re-encryption, in: Cryptographers' Track at the RSA Conference, Springer, 2009, pp. 279–294.
[23] Chengdong Ren, Xiaolei Dong, Jiachen Shen, Zhenfu Cao, Yuanjian Zhou, CLAP-PRE: Certificateless autonomous path proxy re-encryption for data sharing in the cloud, Appl. Sci. 12 (9) (2022) 4353.
[24] Jingyu Feng, Yue Li, Teng Wang, Shuanggen Liu, A certificateless threshold proxy re-encrypted data sharing scheme with cloud-chain collaboration in industrial internet environments, IEEE Internet Things J. 11 (20) (2024) 33247–33268.
[25] Liqing Chen, Meng Zhang, Jiguo Li, Conditional identity-based broadcast proxy re-encryption with anonymity and revocation, IEEE Trans. Reliab. 74 (3) (2025) 3573–3584.
[26] Liming Fang, Jiandong Wang, Chunpeng Ge, Yongjun Ren, Fuzzy conditional proxy re-encryption, Sci. China Inf. Sci. 56 (5) (2013) 1–13.
[27] BaoHong Li, JieFei Xu, YanZhi Liu, Lattice-based fuzzy conditional proxy re-encryption, J. Internet Technol. 20 (5) (2019) 1379–1385.
[28] Binhan Li, Lunzhi Deng, Yiming Mou, Na Wang, Yanli Chen, Siwei Li, A pairing-free data sharing scheme based on certificateless conditional broadcast proxy re-encryption suitable for cloud-assisted IoT, IEEE Internet Things J. 12 (20) (2025) 42754–42768.
[29] Yousheng Zhou, Yurong Li, Yuanni Liu, A certificateless and dynamic conditional proxy re-encryption-based data sharing scheme for IoT cloud, J. Internet Technol. 26 (2) (2025) 165–172.
[30] Shi Lin, Li Cui, Niu Ke, End-to-end encrypted message distribution system for the Internet of Things based on conditional proxy re-encryption, Sensors 24 (2) (2024) 1–16.
[31] Yongjing Zhang, Zhouyang Zhang, Shan Ji, Shenqing Wang, Shitao Huang, Conditional proxy re-encryption-based key sharing mechanism for clustered federated learning, Electronics 13 (5) (2024) 848.
[32] Chul Sur, Chae Duk Jung, Youngho Park, Kyung Hyune Rhee, Chosen-ciphertext secure certificateless proxy re-encryption, in: IFIP International Conference on Communications and Multimedia Security, Springer, 2010, pp. 214–232.
[33] Sattam S. Al-Riyami, Kenneth G. Paterson, Certificateless public key cryptography, in: International Conference on the Theory and Application of Cryptology and Information Security, Springer, 2003, pp. 452–473.
[34] Tarunpreet Bhatia, Anil K. Verma, Gaurav Sharma, Secure sharing of mobile personal healthcare records using certificateless proxy re-encryption in cloud, Trans. Emerg. Telecommun. Technol. 29 (6) (2018) e3309.
[35] Nabeil Eltayieb, Liang Sun, Ke Wang, Fagen Li, A certificateless proxy re-encryption scheme for cloud-based blockchain, in: Frontiers in Cyber Security: Second International Conference, FCS 2019, Xi'an, China, November 15–17, 2019, Proceedings 2, Springer, 2019, pp. 293–307.
[36] Emmanuel Ahene, Junfeng Dai, Hao Feng, Fagen Li, A certificateless signcryption with proxy re-encryption for practical access control in cloud-based reliable smart grid, Telecommun. Syst. 70 (2019) 491–510.
[37] Amit Sahai, Brent Waters, Fuzzy identity-based encryption, in: Annual International Conference on the Theory and Applications of Cryptographic Techniques, Springer, 2005, pp. 457–473.
[38] Hu Xiong, YaNan Chen, GuoBin Zhu, ZhiGuang Qin, Analysis and improvement of a provable secure fuzzy identity-based signature scheme, Sci. China Inf. Sci. 57 (2014) 1–5.
[39] Liangliang Wang, Jiangwei Xu, Baodong Qin, Mi Wen, Kefei Chen, An efficient fuzzy certificateless signature-based authentication scheme using anonymous biometric identities for VANETs, IEEE Trans. Dependable Secur. Comput. 22 (1) (2024) 292–307.
[40] Dan Boneh, Matt Franklin, Identity-based encryption from the Weil pairing, in: Annual International Cryptology Conference, Springer, 2001, pp. 213–229.
[41] Adi Shamir, How to share a secret, Commun. ACM 22 (11) (1979) 612–613.
[42] Sattam S. Al-Riyami, Kenneth G. Paterson, Certificateless public key cryptography, in: Chi-Sung Laih (Ed.), Advances in Cryptology – ASIACRYPT 2003, Springer Berlin Heidelberg, Berlin, Heidelberg, 2003, pp. 452–473.
[43] Shimao Yao, Ralph Voltaire J. Dayot, Hyung-Jin Kim, In-Ho Ra, A novel revocable and identity-based conditional proxy re-encryption scheme with ciphertext evolution for secure cloud data sharing, IEEE Access 9 (2021) 42801–42816.
[44] Xiaoyu Zheng, Yuyang Zhou, Yalan Ye, Fagen Li, A cloud data deduplication scheme based on certificateless proxy re-encryption, J. Syst. Archit. 102 (2020) 101666.

Jiasheng Chen is currently pursuing the Ph.D. degree with the Department of Cryptography and Cyber Security, School of Software Engineering, East China Normal University, Shanghai, China. Her research interests include applied cryptography and information security.

Zhenfu Cao is currently a Distinguished Professor with East China Normal University, China. Since 1981, he has published over 400 academic papers in journals or conferences. His research interests include cryptography, number theory, and information security. He has received a number of awards, including the Ying-Tung Fok Young Teacher Award in 1989, the National Outstanding Youth Fund of China in 2002, and the Special Allowance by the State Council in 2005. He was a co-recipient of the 2007 IEEE International Conference on Communications Computer Award.

Liangliang Wang received the Ph.D. degree from Shanghai Jiao Tong University in 2016. He has published academic papers in prestigious venues including IEEE Transactions on Dependable and Secure Computing, IEEE Transactions on Vehicular Technology, IEEE Internet of Things Journal, Knowledge-Based Systems and SCIENCE CHINA Information Sciences. He is currently an Associate Professor with the College of Computer Science and Technology, Shanghai University of Electric Power. His research interests include applied cryptography, information security and privacy preserving.

Jiachen Shen received the bachelor's degree from Shanghai Jiao Tong University, Shanghai, China, in 2001, and the master's and Ph.D. degrees from the University of Louisiana at Lafayette, Lafayette, LA, USA, in 2003 and 2008, respectively. He joined East China Normal University, Shanghai, China, in 2015. His research interests include applied cryptography, cloud security, searchable encryption, and blockchains.

Xiaolei Dong is currently a Distinguished Professor with East China Normal University. She hosts a number of research projects supported by the National Basic Research Program of China (973 Program), the National Natural Science Foundation of China, and the Special Funds on Information Security of the National Development and Reform Commission. Her research interests include cryptography, number theory, and trusted computing.
Journal of Systems Architecture 160 (2025) 103348
Contents lists available at ScienceDirect
Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc
StorStack: A full-stack design for in-storage file systems
Juncheng Hu, Shuo Chen, Haoyang Wei, Guoyu Wang, Chenju Pei, Xilong Che
College of Computer Science and Technology, Jilin University, Changchun, 130022, China
ARTICLE INFO

Keywords:
File system
In-storage Computing
Storage-class Memory

ABSTRACT

Due to the increasingly significant cost of data movement, In-storage Computing has attracted considerable attention in academia. While most In-storage Computing works allow direct data processing, these methods do not completely eliminate the participation of the CPU during file access, and data still needs to be moved from the file system into memory for processing. Even though there are attempts to put file systems into storage devices to solve this problem, the performance of the system is not ideal when facing high-latency storage devices due to bypassing the kernel and lacking a page cache.

To address the above issues, we propose StorStack, a full-stack, highly configurable in-storage file system framework and simulator that facilitates architecture- and system-level research. By offloading the file system into the storage device, the file system can be closer to the data, reducing the overhead of data movement. Meanwhile, it also avoids kernel traps and reduces communication overhead. More importantly, this design enables In-storage Computing applications to completely eliminate CPU participation. StorStack also designs a user-level cache to maintain performance when storage device access latency is high. To study performance, we implement a StorStack prototype and evaluate it under various benchmarks on QEMU and Linux. The results show that StorStack achieves up to 7x performance improvement with direct access and 5.2x with cache.
1. Introduction

In traditional computing architectures, data must be transferred from storage devices to memory for processing, which not only consumes the computing resources of the host, but also results in high energy consumption and I/O latency. As data scales continue to expand, In-storage Computing has been proposed to alleviate the pressure of data movement [1,2]. The core idea is to perform computations directly where the data is stored, without the need to move the data. The emergence of high-speed storage devices like SSDs [3] and SCMs [4,5] has significantly advanced research in In-storage Computing and transformed computer storage systems. To fully leverage the potential of storage systems and exploit the characteristics of this new computing paradigm, a redesign of storage stack software is required.

As the most essential part of the storage stack software, file systems have been residing in the operating system kernel for a very long time because they need to perform integrity assurance and access control to ensure data security. The kernel is considered a trusted area compared to the user space. However, this seemingly good design has been challenged by new technologies. With the emergence of faster storage devices such as SSDs and SCMs, access latency decreases significantly compared to HDDs [6], leading to the software overhead of file systems [7,8] becoming a major performance bottleneck. Meanwhile, the design and operation of file systems determine their reliance on the CPU when accessing the file system. For In-storage Computing, although researchers are gradually reducing CPU involvement, current file systems still rely on the CPU to handle complex file management tasks and ensure system security and integrity.

On the one hand, to reduce the software overhead of file systems, many works aim at the kernel trap. For example, there are some efforts to move the file system into user space [8-13]. But running in user space may compromise the reliability of the file system, hence bugs or malicious software may cause crashes and data loss. Some of these works try to move the critical parts of the file system back to the kernel. But in most cases, data-plane operations are interleaved with control-plane operations, which may diminish the performance improvement brought by kernel bypassing. In recent years, firmware file systems have been proposed, which move file systems onto the storage device controller [14-16] to completely get rid of the kernel trap. However, those file systems are designed to be strongly coupled with the storage device, making the device lack the replaceability of the file system and compatibility with conventional operating systems. In addition, these firmware file systems do not provide comprehensive security guarantees.
* Corresponding author.
E-mail addresses: jchu@jlu.edu.cn (J. Hu), chenshuo22@mails.jlu.edu.cn (S. Chen), hywei23@mails.jlu.edu.cn (H. Wei), wgy21@mails.jlu.edu.cn
(G. Wang), peicj2121@mails.jlu.edu.cn (C. Pei), chexilong@jlu.edu.cn (X. Che).
https://doi.org/10.1016/j.sysarc.2025.103348
Received 29 August 2024; Received in revised form 24 November 2024; Accepted 18 January 2025
Available online 27 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
On the other hand, to fully leverage the advantages of In-storage Computing, it is necessary to eliminate the participation of the host-side OS from the storage access path. In-storage Computing advocates for a data-centric approach, where computation units are embedded within the storage devices to enable direct data processing. However, in the process of accessing files, traditional file systems still require CPU involvement. To know which data should be transferred next, file access should first be handled by the host-side file system in the operating system kernel. This CPU intervention limits the computational capacity improvements that In-storage Computing can offer.

Another point worth noting is that numerous studies propose improving system performance by allowing user applications to bypass the kernel and communicate directly with storage devices. This method demonstrates significant performance improvements when dealing with high-speed storage devices. However, due to the diversity of storage devices and their varying latencies, system performance may suffer when bypassing the high-speed cache, especially when using high-latency, low-speed storage devices. Therefore, the impact of cache configuration on performance is also a subject of our further research. In summary, despite various attempts to optimize file system performance and reduce CPU involvement, current solutions still have several issues.

To further optimize the performance and security of file systems and fully unleash the potential of in-storage computing, we propose StorStack, a full-stack, highly configurable in-storage file system framework and simulator on high-speed storage devices such as SSDs and SCMs. Since file systems always have a fixed primary functionality of managing the data mapping, which is similar in function to the flash translation layer (FTL) on the storage controller, we consider it natural and reasonable to run the file system on the storage controller.

StorStack has three main components: a device firmware runtime for file systems, enabling file systems to run directly on the storage device; a user library to expose POSIX interfaces to user applications; and a kernel driver to guarantee access control. By moving the file system into the storage, StorStack aims to gain performance improvement from the concept of In-storage Computing that brings the file system closer to the data. Moreover, the file system code is removed from the kernel, which avoids the latency and context switches caused by kernel traps during file access. More importantly, StorStack can remove the CPU from the storage access path of In-storage Computing applications, maximizing the potential of In-storage Computing. To ensure the security and reliability of the file system, StorStack designs an efficient security mechanism, introducing the device-side controller as the runtime and retaining control-plane operations within the host kernel. By reducing the ratio of control-plane to data-plane operations, kernel traps are minimized, enhancing performance. StorStack also includes a user-level cache to explore the impact of caching on the performance of in-storage file systems.

We implemented StorStack as a prototype and evaluated it on QEMU and Linux 5.15. Experimental results demonstrate that StorStack performs up to 5.2x faster than Ext4 with cache and 7x faster with direct access. Regarding the cache, we find that as access latency increases, file systems with cache always maintain high speeds, whereas the speed of file systems without cache decreases significantly.

2. Background and related work

The storage or memory system has changed a lot in the past decades. With the development of speed, capacity, and size, and the emergence of new types of storage, a rethink of both hardware and software is required to exploit the potential of the system in the next era. In this section, we first discuss the trends of two novel high-speed non-volatile storage technologies, and then explore the significance of applying In-storage Computing on these storage devices. Finally, we briefly introduce three file systems in different locations.

2.1. Hardware trends

Compared to the large, slow HDD, the solid-state drive (SSD) is a kind of flash-based non-volatile storage with small form factor, high speed, and low energy costs [17,18]. SSDs on the market today can provide up to 30 TB of capacity and 7 GB/s throughput on sequential read/write. To fully exploit the high performance, modern SSDs have switched from SATA to PCIe and NVMe. PCIe 5.0 [19] supports up to 16 lanes and a 32 GT/s data rate, which leads to more than 60 GB/s bandwidth. NVMe [3] is a communication protocol for non-volatile memories attached via PCIe, supporting up to 65,535 I/O queues each with 65,535 depth. It also supports SSD-friendly operations like ZNS and KV, which can further enhance SSD throughput capabilities.

Storage class memory (SCM), also referred to as persistent memory (PMEM) or non-volatile memory (NVM), is a different type of storage device that is fast and byte-addressable like DRAM, but can also retain data without power like SSDs. Various technologies such as PRAM [20,21], MRAM [22], and ReRAM [23,24] have been explored to implement SCM, each exhibiting different performance characteristics. SCM provides higher bandwidth than SSD; it offers latency close to DRAM, and its capacity falls between SSD and DRAM [25]. As new blood in the storage hierarchy, SCM can provide more possibilities to multiple workloads [26-29].

Consequently, while the increased bandwidth and reduced latency of storage devices have substantially boosted the performance of computer systems and enabled novel application scenarios, these advancements also introduce several challenges. These challenges include heightened complexity in data management, the need to balance cost and efficiency, and issues related to technical compatibility and migration.

2.2. In-storage computing

While these new storage devices have significantly altered the memory hierarchy of computer systems, the memory wall between the CPU and off-chip memory is still the bottleneck of the whole system, especially with the rise of data-intensive workloads and the slowdown of Moore's law and Dennard scaling. To reduce the overhead of data movement, In-storage Computing (ISC) [30-32] is proposed, gaining increasing attention with advancements in integration technologies. However, most current research predominantly focuses on offloading user-defined tasks to storage devices, and this approach still faces limitations in practice.

First, existing ISC methods exhibit significant shortcomings in terms of compatibility and portability. On the host side, developers must design custom APIs for ISC, which are incompatible with existing system interfaces such as POSIX, demanding substantial modifications to the host code [32]. On the drive side, the drive program either collaborates with the host file system to access the correct file data [33] or manages the drive as a bare block device without a file system. However, most systems still rely on file-system-based external storage access, with the file system typically running on the CPU. Consequently, ISC tasks often require CPU involvement when accessing external storage data.

Secondly, current approaches lack adequate protection and isolation for ISC applications. To fully leverage the high speed of modern storage devices, multiple ISC applications may need to execute concurrently. Without proper data protection mechanisms, malicious or erroneous ISC tasks could access unauthorized data. Without isolation, the execution of one ISC task could compromise the performance and security of others. However, most existing research [1,34,35] assumes that ISC tasks operate in an exclusive execution environment, failing to address these concerns effectively. Additionally, when specific code is offloaded to storage devices, attackers can exploit vulnerabilities in in-storage software and hardware firmware, such as buffer overflows [36,37] or bus snooping attacks, to escalate privileges and harm the system.
2.3. File system

The evolution of storage hardware poses higher demands on software systems. As a crucial part of the software stack of the storage system, file systems should be redesigned to minimize software overheads, especially the involvement of the OS kernel on the data path. Many efforts have explored the possibility of different file system locations.

Kernel file systems. Numerous typical file systems are implemented inside the kernel as kernel file systems, including Ext4, XFS, etc. Due to the isolation of kernel space, kernel file systems can easily manage data and metadata with reliability guarantees [38]. Recent works on kernel file systems have sought to exploit the capabilities of modern storage devices. For example, F2FS [39] is built on append-only logging to adapt to the characteristics of flash memory. PMFS [38] introduces a new hardware primitive to avoid the consistency issues caused by the CPU cache while accessing SCM. DAX [40] bypasses the buffer cache of the system to support direct access to the storage hardware so that the redundant data movement between DRAM and SCM is removed. NOVA [41] explores the hybrid of DRAM and SCM as a specially designed log-structured file system. However, kernel file systems have several limitations. Firstly, the development and debugging process within kernel space is inherently complex and difficult. Furthermore, every file system access necessitates a kernel trap, which inevitably introduces latency. Additionally, the frequent context switching between user processes and the kernel increases CPU overhead.

User-space file systems. User-space file systems are implemented mostly in user space to bypass the kernel and reduce the overhead associated with kernel traps. However, since most user-space file systems are implemented in untrusted environments, ensuring data security and reliability becomes challenging. User-space file systems need sophisticated design, usually the collaboration between kernel space and user space, to keep them reliable. For example, Strata [11] separates the file system into a per-process user-space update log for concurrent writing and a read-only kernel-space shared area for data persistence. Moneta-D [9] provides hardware virtual channel support with a kernel-space file system protection policy and a user-space driver to access the hardware. There are also efforts to implement the control plane of the file system as a trusted user-space process [8,12].

Firmware file systems. Works that offload part or the whole of the file system into the storage device firmware are categorized as firmware file systems. There are three representative works on firmware file systems: DevFS [14], CrossFS [15], and FusionFS [16]. DevFS and CrossFS explore the possibility of moving the file system to the storage side to benefit from kernel bypass. FusionFS goes further than the previous two works and attempts to gain performance by combining multiple storage access operations. However, we have identified several problems with these file systems. First, these firmware file systems are tightly coupled with specific storage devices, which makes it hard for users to select alternative file systems or upgrade the software version of the current file system. Second, none of these file systems are designed to operate effectively in scenarios with significant communication latency. Third, the lack of security mechanisms limits their applicability in real-world environments.

2.4. Motivation

Although kernel file systems are well-designed and time-tested, their design principles, which assume high device access latency, are no longer suitable for modern high-speed devices. User-space file systems and firmware file systems have explored new approaches to file system implementation in the era of high-speed storage; however, they may lead to inferior performance with traditional devices, compromised security controls, or inflexible, non-replaceable file systems. To address these issues, we introduce StorStack, a fast, flexible, and secure in-storage file system framework. The detailed comparison between StorStack and previous file systems is shown in Table 1.

3. Design

In this section, we first discuss the design principles of StorStack, followed by an overview of its architecture, the connection between host and device, scheduling mechanisms, and reliability designs.

3.1. Principles

1. Provide a full-stack framework to enable in-storage file systems without compromising performance. To support in-storage FS, StorStack's design includes a user library, a kernel driver, and a firmware FS runtime. By bringing FS code out of the kernel and closer to the data, StorStack avoids the kernel trap and reduces the communication overhead. StorStack also incorporates a user-level cache to maintain the performance when the access latency of the device is high.

2. Make full use of the heterogeneity of the host CPU and storage device controller. The in-storage FS yields the host CPU time to user application code and cuts the energy cost, while conflicts due to concurrent access are resolved on the host CPU to maintain the performance. If necessary, the cache is also retained on the host side and is managed by the user space. Such a heterogeneous system can maximize the overall performance and minimize the power consumption of the system.

3. Guarantee the reliability of the file system with minimal overhead. To provide essential guarantees such as permission checking, StorStack keeps its control plane within the trusted area. Additionally, to enhance performance, a token mechanism is introduced to prevent StorStack from accessing the kernel during data-plane operations.

4. Keep compatible with conventional operating systems. The design of StorStack does not require changes to current operating systems. Instead, the user lib and kernel driver of StorStack are add-ons. Even without them, the StorStack storage device can be accessed with typical block- or byte-based interfaces, just like traditional SSDs or SCMs. StorStack also supports per-partition replaceable file systems, which is a regular function in current operating systems but is not supported by firmware file systems.

5. Support heterogeneous computing. By providing a device-level file interface, StorStack may enable multiple advanced heterogeneous access patterns, including In-storage Computing (ISC) [31,32,42,43] and direct I/O access from GPUs [44,45] or NICs [42,46]. In this work, we provide basic support for these patterns and plan to further explore them in future research.

6. Run with a reasonable hardware setup on the storage device. Previous research on firmware file systems has assumed that device controller hardware capabilities are severely limited. However, today's high-end storage devices feature up to 4 cores and DRAM capacity that can reach 1% of their storage capacity [47]. As in-storage processing evolves, hardware configurations will continue to improve [30,43,48-50]. In StorStack, we assume that the device possesses sufficient capabilities to run file systems alongside a runtime environment. Future research can investigate the benefits of integrating in-storage file systems with additional device-side capabilities, such as power loss protection capacitors or the flash translation layer.

3.2. Architecture

To support in-storage file systems with compatibility, flexibility, and reliability, StorStack has three major parts distributed over user space, kernel space, and the device side.
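The identity-token mechanism mentioned in Principle 3 and detailed later in Section 3.5.2 (the kernel MACs a process's uid under a secret key shared once with the device, so the device can verify requests without further kernel traps) can be sketched as follows. This is a minimal illustration; the function names and the choice of HMAC-SHA256 are our assumptions, not the paper's implementation.

```python
import hmac, hashlib, os

# Secret key generated once by K-lib and copied to the device via the
# kernel NVMe driver (simulated here as a shared module-level value).
SECRET_KEY = os.urandom(16)

def issue_token(uid):
    """Kernel side: one kernel trap per process to obtain a token."""
    return hmac.new(SECRET_KEY, str(uid).encode(), hashlib.sha256).digest()

def verify_request(uid, token):
    """Device side: recompute the MAC for the claimed uid and compare."""
    expected = hmac.new(SECRET_KEY, str(uid).encode(), hashlib.sha256).digest()
    return hmac.compare_digest(expected, token)

tok = issue_token(1000)
assert verify_request(1000, tok)       # genuine request passes
assert not verify_request(0, tok)      # a forged uid fails verification
```

Because the key never leaves the kernel and the device, a user process can use only the token it was assigned, which is what makes subsequent data-plane requests verifiable without re-entering the kernel.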
Table 1
The detailed comparison between StorStack and previous file systems.

                 Software access   Expected hardware   FS position   Host-side   Replaceable   Isolated access
                 latency           latency                           cache       FS            control
  Kernel FS      High              High                Host          ✓           ✓             ✓
  User-space FS  Low               Low                 Host          ◦           ✓             ◦
  Prev. Firm FS  Low               Low                 Device        ×           ×             ×
  StorStack      Low               Either              Device        ✓           ✓             ✓
Fig. 1. StorStack Architecture. StorStack consists of three major modules: the U-lib, the K-lib, and the Firm-RT; and there are two workflows: a data-plane workflow and a control-plane workflow. The interconnection between them is shown in the figure.
3.2.1. High-level design

As shown in Fig. 1, StorStack consists of three major parts: a user lib (U-lib), a kernel driver (K-lib), and an FS runtime in device firmware (Firm-RT).

U-lib. The U-lib is the interface for user applications to access the in-storage FS, offered as a dynamic link library. The main job of the U-lib is to expose POSIX file operations to users, provide the user-level cache, and manage the connection with the device. It also cooperates with the K-lib and the Firm-RT to ensure the reliability of the system.

K-lib. The K-lib is a kernel module that provides control-plane operations with reliability. Its work includes resource allocation and permission checking. Although it resides in the kernel, the functions of K-lib are designed to be rarely called to avoid the performance penalty associated with kernel traps.

Firm-RT. The Firm-RT is a runtime on the storage firmware that offers essential hardware and software support for the in-storage FS to run on the device controller. To serve the FS, Firm-RT communicates with both the U-lib for data-plane operations and the K-lib for control-plane operations.

3.2.2. StorStack workflow

For clarity, the workflow of StorStack is divided into a data plane and a control plane. The data-plane workflow handles data accesses from user space, and the control plane is responsible for maintaining the system's functionality, safety, and reliability.

For the data plane (red lines in Fig. 1), when a user application calls a file operation in StorStack, the host-side U-lib will check the cache if the cache is used. If the cache is bypassed or penetrated, the U-lib packs the operation into an extended NVMe protocol command and subsequently transmits it to the device-side Firm-RT. The Firm-RT receives the NVMe command, checks its validity, and then forwards the command to the FS. The FS handles the file operation and then works with the FTL or other hardware instruments to arrange the data blocks on the storage media. The primary distinction between this routine and a typical kernel-based file system lies in the fact that the file system logic is inside the storage device, thereby eliminating the need for kernel traps during data access.

The control plane (blue dashed lines in Fig. 1) provides necessary support for the data plane to work properly. Control-plane operations on the host side, including memory resource allocation and identity token assignment, are delegated to the kernel to ensure security and reliability. The host-side control-plane operations are designed to be rarely called to reduce kernel trap overhead. On the device, the control plane assists in checking the authentication of requests, managing the FS, and dealing with other management operations. More detailed security and reliability policies will be described in Section 3.5.

3.2.3. Organization on the storage

In StorStack, file systems are stored in the storage media with pointers originating from partitions, so that the framework can choose the right FS to access a partition. We dedicate a partition to store all the FS binaries that are used by user-created partitions, and each FS in this partition can be indexed by a number. Here we assume that a GUID partition table (GPT) is used to organize the partitions. Each user-created partition is associated with an FS when it is formatted, and the FS will be added to the FS partition we just mentioned if it was not there yet. To indicate the relation between the user-created partition and its FS, the index number of the FS is added to the attribute flag bits of the partition's GPT entry. The organization is illustrated in Fig. 2.
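Section 3.2.2 describes the U-lib packing a POSIX file operation into an extended NVMe protocol command. As a rough illustration of what such an encoding could look like: NVMe submission-queue entries are 64 bytes, and we borrow that size here; every field layout, name, and opcode value below is our assumption, not the paper's actual command format.

```python
import struct

# Hypothetical vendor-specific opcodes for the extended command set.
OP_OPEN, OP_READ, OP_WRITE, OP_CLOSE = 0x81, 0x82, 0x83, 0x84

def pack_fs_command(opcode, fd, offset, length, uid, token):
    """Pack a POSIX-style file operation into a 64-byte NVMe-like entry.

    Illustrative layout: opcode (1B), pad (1B), file descriptor (2B),
    transfer length (4B), byte offset (8B), caller uid (4B), and the
    kernel-issued identity token (16B), padded to the 64-byte entry size.
    """
    cmd = struct.pack("<BxHIQI16s", opcode, fd, length, offset, uid, token)
    return cmd.ljust(64, b"\x00")

cmd = pack_fs_command(OP_READ, fd=3, offset=4096, length=512,
                      uid=1000, token=b"\x00" * 16)
assert len(cmd) == 64 and cmd[0] == OP_READ
```

The Firm-RT side would perform the inverse `struct.unpack` before validating the token and dispatching the operation to the in-storage FS.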
Fig. 2. Partition organization. Figure shows how the FS is stored on the storage and associated with the partition.
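Section 3.2.3 records an FS index in the attribute flag bits of a partition's GPT entry. A minimal sketch of writing and reading such an entry is below; the 128-byte entry layout follows the standard GPT format, but placing the index in bits 48-63 (the range the GPT format reserves for type-specific use) is our assumption about where StorStack would put it.

```python
import struct, uuid

FS_INDEX_SHIFT = 48  # assumed: FS index stored in the type-specific bits

def make_gpt_entry(first_lba, last_lba, fs_index, name):
    """Build a 128-byte GPT partition entry carrying an FS index in the
    attribute flags (type GUID, unique GUID, LBAs, attributes, name)."""
    attrs = fs_index << FS_INDEX_SHIFT
    return struct.pack("<16s16sQQQ72s",
                       uuid.uuid4().bytes_le,   # partition type GUID
                       uuid.uuid4().bytes_le,   # unique partition GUID
                       first_lba, last_lba, attrs,
                       name.encode("utf-16-le"))

def fs_index_of(entry):
    """Framework side: recover which FS binary serves this partition."""
    attrs = struct.unpack_from("<Q", entry, 48)[0]  # attrs at byte 48
    return attrs >> FS_INDEX_SHIFT

entry = make_gpt_entry(2048, 1 << 20, fs_index=3, name="data")
assert len(entry) == 128 and fs_index_of(entry) == 3
```

Because only an otherwise-unused attribute field is touched, the entry stays valid for a conventional OS that ignores the index, which matches the compatibility goal in Principle 4.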
This design allows StorStack to provide different file systems to different partitions. Meanwhile, the GPT and the partitions are still available for the typical kernel file system routine.

3.3. File access pattern

The U-lib provides POSIX IO and AIO interfaces to user applications, and the complicated reliability and performance designs are transparent to users. For regular IO interfaces, the write operations (write, pwrite) act differently with and without cache. When the cache is used, writes will return as soon as an operation passes some simple checks and is put into the queue. The interface will not promise that the data is written to the disk before it returns, just like a traditional kernel file system, unless fsync is called. Without cache, the writes will block the process until the data is written to the storage. The read interfaces (read, pread) will not return until the data is available, regardless of whether there is a cache. The AIO interfaces return immediately when an operation is put into the queue, and the real return value can be fetched by non-blocking check, blocking suspend, or signal.

To make sure that StorStack performs well on high-latency storage devices, an optional user-level per-process cache is provided. Because the reliability of StorStack can only be ensured by the device-side file system but not the U-lib, we choose a per-process cache to prevent malicious processes from polluting data by writing to a global cache without check. The user-level cache has two ways to deal with write operations: the write-back method returns immediately after the data is put into the cache; the write-around method drops the dirty data in the cache and returns after the operation is put into the queue. The write-back cache has higher performance than the write-around cache, while the write-around cache can provide higher data consistency. In fact, our evaluation shows that the write-back cache in StorStack can outperform the page cache inside the kernel.

3.4. Connectivity

Here we discuss how the host-side U-lib and K-lib communicate with the device-side Firm-RT. StorStack's communication is based on NVMe to take full advantage of high-speed storage devices. We also propose a multi-queue design to improve the performance of the device-side FS.

3.4.1. Communication protocol

The communication protocol between the host CPU and the StorStack device is a queued protocol extended from NVMe [3]. NVMe is a protocol for accessing non-volatile memories connected via PCIe that supports multiple queues to maximize the throughput, which is suitable for novel high-speed storage devices such as SSDs and SCMs.

To enable the transfer of file operations, we extend the NVMe command list to incorporate the POSIX I/O interface. Meanwhile, the regular data access pattern of NVMe is retained to enable normal disk access when the system does not support StorStack. It is noteworthy that the protocol can be further extended under StorStack to support more paradigms like transactional access [51], log-structured access [52,53], operation fusing [16], or In-storage Computing. We will leave these further explorations to our future work.

With StorStack, heterogeneous hardware like GPUs can implement this extended protocol to access files directly without involving the CPU. For different types of hardware, there are two ways to transmit data. For devices that have their own memory (memory-mapped), like GPUs, StorStack can directly place the data into their memory via the PCIe bus. For hardware without memory (I/O mapped), StorStack should put the data into the main memory. The manipulation of the data destination is directed by the target device driver.

3.4.2. Multi-queue arrangement

NVMe uses multiple queues to improve performance, supporting up to 65,536 I/O queues, with 65,536 commands per queue. Normally, NVMe offers at least a pair of queues (one submission queue and one completion queue) for each core to fully utilize the bandwidth without introducing locks. In StorStack, file operations are processed on the device side, particularly when the storage device features a multi-core controller. To fully utilize the parallelism of the controller cores while minimizing the potential conflicts of concurrent file access, StorStack introduces a special queue organization.

As Fig. 3 shows, every user process in StorStack is assigned a bunch of queue pairs, the number of which is equal to the storage device controller core count. Each queue pair of the queue pair bunch is bound to a controller core of the storage device, so that a process can distribute any file operation to a specific controller core. Meanwhile, each user thread has its exclusive queue pair bunch to avoid queue contention on the host side.

The purpose of this arrangement is to enable the host-side applications to control which operation should be dispatched to which controller core. For example, read-intensive applications can issue read operations to all cores with a round-robin strategy. For write-intensive applications, different threads can send the write operations on the same file to the same controller core to reduce lock contention between controller cores. We will leave the exploration of the scheduling policy for different workloads to future work.

3.5. Security and reliability

From a hardware perspective, the privileged mode (ring 0) that the kernel runs in and the user mode that user applications run in are isolated, which means access to resources is restricted by hardware. The privileged mode can thus be treated as a trusted area, whereas the user mode is an untrusted area. StorStack introduces the device-side controller as a runtime, which is also isolated from user code and thus viewed as a trusted area.

For safety, everything critical to the correctness of the system should be placed in the trusted area. Typical kernel file systems are placed inside the kernel as they need to manage the data on block devices.
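The two write policies of the user-level cache described in Section 3.3 (write-back versus write-around) can be modeled in a few lines. This is an illustrative simulation; the class and method names are ours, not the authors'.

```python
class UserCache:
    """Toy model of StorStack's per-process user-level cache (Section 3.3)."""

    def __init__(self, policy):
        assert policy in ("write-back", "write-around")
        self.policy = policy
        self.pages = {}   # cached offset -> data
        self.queue = []   # operations queued for the device

    def write(self, off, data):
        if self.policy == "write-back":
            self.pages[off] = data          # return right after caching;
            self.queue.append((off, data))  # data reaches the device later
        else:  # write-around
            self.pages.pop(off, None)       # drop the now-stale cached copy
            self.queue.append((off, data))  # queue the write for the device

    def read(self, off):
        return self.pages.get(off)  # a miss would go to the device

wb = UserCache("write-back");   wb.write(0, b"a")
wa = UserCache("write-around"); wa.write(0, b"a")
assert wb.read(0) == b"a"   # write-back serves the read from cache
assert wa.read(0) is None   # write-around forces a device read
```

The model shows why write-back is faster (reads hit the cache immediately) while write-around is more consistent (no dirty copy can diverge from the device).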
Fig. 3. Queue arrangement and scheduling policies. This figure shows how the queue pairs are mapped between host CPU threads and device controller cores.
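The queue-pair-bunch arrangement of Section 3.4.2 and its two example dispatch policies (round-robin reads across all cores, same-file writes pinned to one controller core) can be sketched as follows; this is an illustrative model only, not the authors' code.

```python
class QueuePairBunch:
    """Per-thread bunch of queue pairs, one pair per controller core."""

    def __init__(self, n_cores):
        self.queues = [[] for _ in range(n_cores)]  # submission queues
        self._rr = 0

    def submit_read(self, op):
        """Spread reads over all controller cores round-robin."""
        core = self._rr % len(self.queues)
        self._rr += 1
        self.queues[core].append(op)
        return core

    def submit_write(self, path, op):
        """Pin writes to the same file onto one core, reducing
        lock contention between controller cores."""
        core = hash(path) % len(self.queues)
        self.queues[core].append(op)
        return core

bunch = QueuePairBunch(n_cores=4)
cores = {bunch.submit_read(f"read-{i}") for i in range(8)}
assert cores == {0, 1, 2, 3}                     # reads hit every core
assert bunch.submit_write("/a", "w1") == bunch.submit_write("/a", "w2")
```

Each host thread would own its own `QueuePairBunch`, matching the paper's design of exclusive per-thread bunches that avoid host-side queue contention.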
StorStack shifts FS to the device side, which is also a trusted area.
Meanwhile, as described in Section 3.2.2, StorStack separates the host-
side workflow into a control plane and a data plane. The control plane
is designed to reside in the host-side trusted area, i.e. the kernel, to
cooperate with the device-side FS to ensure security and reliability.
An important design principle of the control plane is to reduce the
overhead of the kernel trap. In StorStack, this is done by reducing
the proportion of control-plane operations relative to data-plane
operations. There are two types of control-plane workflow on the host
side: resource allocation and access control. Both of them are designed
to be called rarely.

Fig. 4. Permission checking. Figure shows how the user space, the kernel space, and
the device work together to check the validity of a request without frequent kernel
traps.

3.5.1. Resource allocation
The U-lib of StorStack is a user-space driver that communicates
with the NVMe storage device. It needs to set up VFIO and manage
DMA memory mapping to enable direct access from user space. It also
needs to allocate areas for caches. These operations involve the kernel
but only need to be run once when the device is initialized, so there
will not be any performance loss in regular file access.

3.5.2. Permission checking
To provide access control, file systems must check the user's permission
to make sure that a file operation is legal. In kernel file systems,
the file system can use the process structure in the kernel to validate the
process's identity, and then compare it with the permission information
stored in the file's inode. In StorStack, however, the file system resides
on the device rather than in the kernel, so the kernel needs to share the
process's information with the device to support permission checking.

To avoid entering the kernel frequently, DevFS [14] maintains in the
device a table that maps CPU IDs to process credentials. All
requests are tagged with the ID of the CPU that the process runs on before
they are sent to the device. The kernel is modified to update the table
whenever a process is scheduled on a host CPU. There are two problems
with this mechanism. Firstly, it assumes that the CPU ID is unforgeable,
but a malicious process can potentially exploit the ID of another
CPU to escalate its privilege. Secondly, it requires a modification to
the process scheduler, which is a core module of the kernel, making
it incompatible with standard OS kernels and potentially slowing down
the system.

In StorStack, we propose a new method to share the credential of
the process, with less communication, a stronger security guarantee, and no change
to the Linux kernel. The process is shown in Fig. 4. When the U-lib
is initialized on a process, it calls the K-lib (a kernel driver)
via ioctl() (system call) to get a credential token. The K-lib
generates a secret key if one has not been set yet, then saves and copies
it to the device via the kernel NVMe driver. Once the key is set, the K-lib
uses it to encrypt the process's credential information (i.e. the uid) into a
MAC (Message Authentication Code). The resulting token, which is the
output of the encryption, is then returned to the process. Since the
secret key is stored in the kernel, the process cannot forge a token
but can only use the one assigned by the kernel, which proves the
authenticity of the uid claimed by the process. Before being sent to the
device, every request from the process is tagged with the process's uid
and the token, so that the device can use the secret key and the token
to verify the uid and check the identity of the request. This mechanism
requires only one communication between the kernel and the device to
share the secret key, and one kernel trap to initialize the token for each
process. Also, the K-lib is implemented as a kernel driver, without
any modification to the core functions of the kernel, which keeps it
compatible with conventional operating systems.

3.5.3. Device lock
StorStack is designed to support direct I/O not only from CPUs,
but also from different types of heterogeneous computing devices.
To prevent concurrent access to the same file from multiple devices,
a concurrency control method is required. A common practice is to
implement a distributed lock across all devices, but this can be too
costly for low-level hardware. In StorStack, we provide in-storage file-level
locking mechanisms to protect the files from unexpected access
by multiple devices.

StorStack supports two types of lock: (1) spinning lock, where an error
code is returned to the caller if the file it accesses is already locked
by another device, allowing the caller to keep attempting to acquire
the lock until the file is unlocked; (2) sleeping lock, where if the file
is locked, any requests from other devices to that file wait in the
submission queue until the file is unlocked. From the perspective of
concurrency, StorStack supports both shared and exclusive locks,
which act exactly the same as those on other systems.
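The credential-token scheme of Section 3.5.2 can be sketched as follows. This is an illustrative Python reconstruction, not the authors' kernel code: the function names (`issue_token`, `verify_request`) are ours, and HMAC-SHA256 is assumed as the concrete MAC since the paper only names "the HMAC algorithm".

```python
import hashlib
import hmac
import os

# Shared secret: created once by the K-lib when StorStack is initialized,
# then copied to the device; never exposed to user processes.
SECRET_KEY = os.urandom(32)

def issue_token(uid: int) -> bytes:
    """K-lib side: derive the MAC token from the process uid."""
    return hmac.new(SECRET_KEY, str(uid).encode(), hashlib.sha256).digest()

def verify_request(uid: int, token: bytes) -> bool:
    """Device (Firm-RT) side: recompute the MAC over the claimed uid
    and compare it with the request's token in constant time."""
    expected = hmac.new(SECRET_KEY, str(uid).encode(), hashlib.sha256).digest()
    return hmac.compare_digest(expected, token)

# A process obtains its token once (one kernel trap), then tags every
# request with (uid, token); the device rejects any forged uid.
token = issue_token(1000)
assert verify_request(1000, token)       # legitimate request passes
assert not verify_request(0, token)      # claiming another uid fails
```

Because a process never sees `SECRET_KEY`, it cannot mint a token for a uid other than the one the kernel attested, which is the property the paper relies on.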
J. Hu et al. Journal of Systems Architecture 160 (2025) 103348
Fig. 5. Random and sequential r/w. Figure shows the basic performance of StorStack compared with Ext-4, under different cache, block size, and in-storage file system settings.
3.6. Implementation

We have implemented a prototype of StorStack, which consists of
three parts: the U-lib, the K-lib and the Firm-RT. The source code
of this prototype is available at https://anonymous.4open.science/r/
StorStack-524F/.

The U-lib is implemented under Linux 5.15, utilizing SPDK [54]
to access storage devices from user space. The SPDK library is modified
in StorStack to transfer POSIX I/O operations over NVMe. The U-lib
comprises two major components: a dynamic link library that provides
interfaces and a user-level cache for accessing the device, and a daemon
program responsible for managing the connection to the device.

The K-lib is implemented as a simple kernel module in the Linux 5.15
kernel. It only takes charge of two things: creating the secret key when
StorStack is initialized, so that the K-lib and the Firm-RT can
use it to encrypt and decrypt the MAC token for processes' credentials;
and generating the MAC token from the uid of the current process with
the HMAC algorithm when the process initializes, then returning it to
the U-lib. The interface of the K-lib is exposed to user space
through ioctl.

The Firm-RT is the only component located on the device
side. In this work, the Firm-RT is not implemented on actual storage
hardware but is instead simulated using QEMU and the system
running on its host machine. There are two reasons for the simulation:
first, although there are several works regarding programmable
storage controllers [49,55–57], these solutions are either expensive or
lack high-level programmability, as most of them are based on FPGA;
second, by simulating with various latency settings, we can evaluate the
performance of StorStack on different types of storage devices, which
would be costly to do with real hardware. In our prototype, QEMU has
been modified to handle extended NVMe POSIX I/O operations and
check the token of each operation.

4. Evaluation

In this section, we evaluate the performance of StorStack and compare
it with popular file systems to answer the following questions:

• Is StorStack efficient enough compared to widely used kernel file systems?
• How much performance is gained from the kernel trap avoidance?
• How does StorStack perform on different types of devices?
• How is the concurrency performance of StorStack?

Fig. 6. Time cost for a single operation.

4.1. Experimental setup

Our experiment platform is a 20-core 2.4 GHz Intel Xeon server
equipped with 64 GB DDR4 memory and a 512 GB SSD. Among them, 8
cores with 16 GB memory are assigned to the QEMU VM to simulate the
StorStack host; the other cores with 16 GB memory are reserved to emulate
the StorStack device. Both the StorStack host and the StorStack device
run on Linux 5.15.

StorStack's expected settings on the device require only a minimal
embedded system with abstractions of hardware functions and necessary
libraries, but due to our simulation requirements, we choose Linux
as the device-side environment to support the execution of QEMU.

We evaluate the performance of StorStack using
Filebench [58], a widely used benchmarking suite for testing file system
performance. We access StorStack under various configurations, including
different cache options, device access latencies, thread numbers and
read/write ratios, to address the four questions raised above.
Fig. 7. Performance with simulated latency. This figure shows the change in throughput as a function of simulated device access latency.
Fig. 8. Multi-thread Performance.
4.2. Random and sequential r/w

First, we evaluate StorStack's performance with single-thread random
and sequential read/write tests. The random tests run on a 1 GB
file with 1K, 4K, and 16K bytes I/O size. The sequential tests run on
an 8 GB file with 8K, 32K, and 128K bytes I/O size. Both of the files
are stored in DRAM, which is simulated as a PMEM by
memmap. The tests are performed on StorStack (referred to as SS) with
two different in-storage FS settings: SS+Ext-4 and SS+Ext-4_DAX.
Then we compare them with Ext-4. We also evaluate the performance
of SS without cache (SS NC) and Ext-4 with direct IO (Ext-4_DIO)
to study the performance improvement of direct access.

Fig. 5 shows the results of the random and sequential tests. In
both tests, SS outperforms traditional kernel-level Ext-4, due to our
kernel-bypass and near-data file system design. SS+Ext-4_DAX with
the user-level write-back cache achieves on average 1.98x, 4.25x, 3.59x, and
4.08x performance gains on random read, random write, sequential
read, and sequential write respectively, compared with Ext-4 with
page cache. For direct access, the speedup is 6.41x, 6.21x,
4.72x, and 1.90x respectively. Another interesting phenomenon is that
in cached StorStack, the performance of SS+Ext-4 and SS+Ext-4_DAX
is similar, indicating that the choice of the in-storage file
system does not matter because most operations are handled by
the user-level cache. However, in uncached tests, SS+Ext-4_DAX
shows better results, which means that the in-storage file system may
influence the overall performance in direct access.

4.3. Profit of kernel bypassing

We measure the time cost of a single operation to study the profit
of kernel bypassing. The cached test demonstrates the impact of the kernel
trap on access to the in-memory page cache. The uncached test shows
the impact of both the kernel trap and write amplification on direct access
to the storage device. Both tests use a 4KB block size, and the files
are stored on the simulated PMEM. The results in Fig. 6 indicate that,
compared to Ext-4, SS+Ext-4_DAX reduces latency by 91.91%,
50.46%, 69.83%, and 81.83% on cached read, cached write, uncached
read, and uncached write respectively.

When the cache hits, the data resides in fast DRAM, resulting in
low data-fetch latency. In this scenario, traditional Ext-4 exhibits higher
access latency, as the kernel trap accounts for most of the latency.
In contrast, StorStack shows lower latency because its cache is implemented
in user space, eliminating the need for kernel traps. When a
cache miss occurs, the primary overhead shifts to the multiple rounds
of storage device access, which further widens the performance gap
between traditional Ext-4 and StorStack.

4.4. Impact of access latency

Storage devices with different access latencies may influence the
performance of file systems. In this experiment, we use multiple latency
settings to simulate devices with different access speeds. The latency is
simulated on the device side by QEMU.

We compare the performance of SS with Ext-4 under cached and
uncached settings using several latency settings. The latency ranges
from 0 μs to 25 μs to simulate connection methods from DDR to PCIe
to RDMA. Tests run with a 4KB block size.

Fig. 7 shows the result of this test. With a cache, neither SS nor
Ext-4 is susceptible to the rise in latency. Without a
cache, however, the performance of SS degrades by 78.20%, from 526 MB/s
at 0 simulated latency to 115 MB/s at 25 μs latency. The performance
of Ext-4 also drops by 20.98%, from 54 MB/s to 43 MB/s. Note that the
experiment introduces extra latency due to QEMU, so the simulated 0
latency is actually larger than 0, meaning that the curve could go even
higher on the left side of the graph. The result illustrates that direct
access in SS should only be enabled on ultra-low-latency devices. For
other hardware, it is better to enable the cache.

4.5. Multi-thread performance

To study the performance of StorStack under multiple threads, we
evaluate SS and Ext-4 under a multi-thread micro-benchmark. The
benchmark performs parallel 4KB file operations on one file with 4
threads; each thread is a reader or a writer, and the ratio of readers and
writers is set to 4:0, 3:1, 1:3, and 0:4. Fig. 8 shows the result. StorStack
is faster than Ext-4 in all concurrent read and write scenarios of our
test. For the cached scenario, SS is on average 2.88x faster than Ext-4
across all read-write ratios. For the uncached scenario, the speedup is 17.34x.

5. Conclusion

In this paper, we present StorStack, a full-stack design for an in-storage
file system framework and simulator. The StorStack components across
user space, kernel space, and device space collaborate to enable file
systems to run inside the storage device efficiently and reliably. We
implement a prototype of StorStack and evaluate it with various settings.
Experimental results show that StorStack outperforms current
kernel file systems in both cached and uncached scenarios. Some further
performance optimizations, such as the combination of file system and
storage hardware capabilities, the exploration of multi-queue scheduling
strategies for different workloads, and the performance of direct
access from heterogeneous devices, are left to future work.

CRediT authorship contribution statement

Juncheng Hu: Writing – review & editing, Writing – original draft.
Shuo Chen: Formal analysis, Data curation. Haoyang Wei: Formal
analysis, Data curation. Guoyu Wang: Writing – review & editing,
Writing – original draft. Chenju Pei: Formal analysis, Data curation.
Xilong Che: Methodology, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to
influence the work reported in this paper.

Acknowledgments

This work was funded by the National Key Research and Development
Programme No. 2024YFB3310200, by the Key Scientific and
Technological R&D Plan of Jilin Province of China under Grant
No. 20230201066GX, and by the Central University Basic Scientific
Research Fund under Grant No. 2023-JCXK-04.

References

[1] G. Koo, K.K. Matam, T. I, H.K.G. Narra, J. Li, H.-W. Tseng, S. Swanson, M. Annavaram, Summarizer: trading communication with computing near storage, in: Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017, pp. 219–231.
[2] S. Seshadri, M. Gahagan, S. Bhaskaran, T. Bunker, A. De, Y. Jin, Y. Liu, S. Swanson, Willow: A user-programmable SSD, in: OSDI, 2014.
[3] NVMe specifications, https://nvmexpress.org/specifications/.
[4] Intel, Intel® Optane™ Persistent Memory, https://www.intel.com/content/www/us/en/products/docs/memory-storage/optane-persistent-memory/overview.html.
[5] S. Mittal, J.S. Vetter, A survey of software techniques for using non-volatile memories for storage and main memory systems, IEEE Trans. Parallel Distrib. Syst. 27 (5) (2016) 1537–1550, http://dx.doi.org/10.1109/TPDS.2015.2442980.
[6] M. Wei, M. Bjørling, P. Bonnet, S. Swanson, I/O speculation for the microsecond era, in: 2014 USENIX Annual Technical Conference, USENIX ATC '14, 2014, pp. 475–481.
[7] S. Peter, J. Li, I. Zhang, D.R.K. Ports, D. Woos, A. Krishnamurthy, T. Anderson, T. Roscoe, Arrakis: the operating system is the control plane, in: 11th USENIX Symposium on Operating Systems Design and Implementation, OSDI '14, 2014, pp. 1–16.
[8] H. Volos, S. Nalli, S. Panneerselvam, V. Varadarajan, P. Saxena, M.M. Swift, Aerie: Flexible file-system interfaces to storage-class memory, in: Proceedings of the Ninth European Conference on Computer Systems, EuroSys '14, Association for Computing Machinery, New York, NY, USA, 2014, pp. 1–14, http://dx.doi.org/10.1145/2592798.2592810.
[9] A.M. Caulfield, T.I. Mollov, L.A. Eisner, A. De, J. Coburn, S. Swanson, Providing safe, user space access to fast, solid state disks, ACM SIGPLAN Not. 47 (4) (2012) 387–400, http://dx.doi.org/10.1145/2248487.2151017.
[10] M. Dong, H. Bu, J. Yi, B. Dong, H. Chen, Performance and protection in the ZoFS user-space NVM file system, in: Proceedings of the 27th ACM Symposium on Operating Systems Principles, ACM, Huntsville, Ontario, Canada, 2019, pp. 478–493, http://dx.doi.org/10.1145/3341301.3359637.
[11] Y. Kwon, H. Fingler, T. Hunt, S. Peter, E. Witchel, T. Anderson, Strata: A cross media file system, in: Proceedings of the 26th Symposium on Operating Systems Principles, SOSP '17, Association for Computing Machinery, New York, NY, USA, 2017, pp. 460–477, http://dx.doi.org/10.1145/3132747.3132770.
[12] J. Liu, A.C. Arpaci-Dusseau, R.H. Arpaci-Dusseau, S. Kannan, File systems as processes, in: 11th USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage '19, USENIX Association, Renton, WA, 2019.
[13] S. Zhong, C. Ye, G. Hu, S. Qu, A. Arpaci-Dusseau, R. Arpaci-Dusseau, M. Swift, MadFS: per-file virtualization for userspace persistent memory filesystems, in: 21st USENIX Conference on File and Storage Technologies, FAST '23, 2023, pp. 265–280.
[14] S. Kannan, A.C. Arpaci-Dusseau, R.H. Arpaci-Dusseau, Y. Wang, J. Xu, G. Palani, Designing a true direct-access file system with DevFS, in: 16th USENIX Conference on File and Storage Technologies, FAST '18, USENIX Association, Oakland, CA, 2018, pp. 241–256.
[15] Y. Ren, C. Min, S. Kannan, CrossFS: A cross-layered direct-access file system, in: 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI '20, USENIX Association, 2020, pp. 137–154.
[16] J. Zhang, Y. Ren, S. Kannan, FusionFS: fusing I/O operations using CISCOps in firmware file systems, in: 20th USENIX Conference on File and Storage Technologies, FAST '22, USENIX Association, Santa Clara, CA, 2022, pp. 297–312.
[17] N. Agrawal, V. Prabhakaran, T. Wobber, J.D. Davis, M. Manasse, R. Panigrahy, Design tradeoffs for SSD performance, in: USENIX 2008 Annual Technical Conference, ATC '08, USENIX Association, USA, 2008, pp. 57–70.
[18] F. Chen, D.A. Koufaty, X. Zhang, Understanding intrinsic characteristics and system implications of flash memory based solid state drives, in: Proceedings of the Eleventh International Joint Conference on Measurement and Modeling of Computer Systems, SIGMETRICS '09, Association for Computing Machinery, New York, NY, USA, 2009, pp. 181–192, http://dx.doi.org/10.1145/1555349.1555371.
[19] Welcome to PCI-SIG | PCI-SIG, https://pcisig.com/.
[20] Y. Choi, I. Song, M.-H. Park, H. Chung, S. Chang, B. Cho, J. Kim, Y. Oh, D. Kwon, J. Sunwoo, J. Shin, Y. Rho, C. Lee, M.G. Kang, J. Lee, Y. Kwon, S. Kim, J. Kim, Y.-J. Lee, Q. Wang, S. Cha, S. Ahn, H. Horii, J. Lee, K. Kim, H. Joo, K. Lee, Y.-T. Lee, J. Yoo, G. Jeong, A 20nm 1.8V 8Gb PRAM with 40MB/s program bandwidth, in: 2012 IEEE International Solid-State Circuits Conference, 2012, pp. 46–48, http://dx.doi.org/10.1109/ISSCC.2012.6176872.
[21] H. Volos, A.J. Tack, M.M. Swift, Mnemosyne: Lightweight persistent memory, ACM SIGARCH Comput. Archit. News 39 (1) (2011) 91–104, http://dx.doi.org/10.1145/1961295.1950379.
[22] S.-W. Chung, T. Kishi, J.W. Park, M. Yoshikawa, K.S. Park, T. Nagase, K. Sunouchi, H. Kanaya, G.C. Kim, K. Noma, M.S. Lee, A. Yamamoto, K.M. Rho, K. Tsuchida, S.J. Chung, J.Y. Yi, H.S. Kim, Y. Chun, H. Oyamatsu, S.J. Hong, 4Gbit density STT-MRAM using perpendicular MTJ realized with compact cell structure, in: 2016 IEEE International Electron Devices Meeting, IEDM, 2016, pp. 27.1.1–27.1.4, http://dx.doi.org/10.1109/IEDM.2016.7838490.
[23] H. Akinaga, H. Shima, Resistive random access memory (ReRAM) based on metal oxides, Proc. IEEE 98 (12) (2010) 2237–2251, http://dx.doi.org/10.1109/JPROC.2010.2070830.
[24] K. Kawai, A. Kawahara, R. Yasuhara, S. Muraoka, Z. Wei, R. Azuma, K. Tanabe, K. Shimakawa, Highly-reliable TaOx ReRAM technology using automatic forming circuit, in: 2014 IEEE International Conference on IC Design & Technology, 2014, pp. 1–4, http://dx.doi.org/10.1109/ICICDT.2014.6838600.
[25] K. Suzuki, S. Swanson, The Non-Volatile Memory Technology Database (NVMDB), Tech. Rep. CS2015-1011, Department of Computer Science & Engineering, University of California, San Diego, 2015.
[26] S. Matsuura, Designing a persistent-memory-native storage engine for SQL database systems, in: 2021 IEEE 10th Non-Volatile Memory Systems and Applications Symposium, NVMSA, IEEE, Beijing, China, 2021, pp. 1–6, http://dx.doi.org/10.1109/NVMSA53655.2021.9628842.
[27] R. Tadakamadla, M. Patocka, T. Kani, S.J. Norton, Accelerating database workloads with DM-WriteCache and persistent memory, in: Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, ICPE '19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 255–263, http://dx.doi.org/10.1145/3297663.3309669.
[28] W. Wang, C. Yang, R. Zhang, S. Nie, X. Chen, D. Liu, Themis: malicious wear detection and defense for persistent memory file systems, in: 2020 IEEE 26th International Conference on Parallel and Distributed Systems, ICPADS, 2020, pp. 140–147, http://dx.doi.org/10.1109/ICPADS51040.2020.00028.
[29] B. Zhu, Y. Chen, Q. Wang, Y. Lu, J. Shu, Octopus+: An RDMA-enabled distributed persistent memory file system, ACM Trans. Storage 17 (3) (2021) 1–25, http://dx.doi.org/10.1145/3448418.
[30] J. Do, V.C. Ferreira, H. Bobarshad, M. Torabzadehkashi, S. Rezaei, A. Heydarigorji, D. Souza, B.F. Goldstein, L. Santiago, M.S. Kim, P.M.V. Lima, F.M.G. França, V. Alves, Cost-effective, energy-efficient, and scalable storage computing for large-scale AI applications, ACM Trans. Storage 16 (4) (2020) 21:1–21:37, http://dx.doi.org/10.1145/3415580.
[31] L. Kang, Y. Xue, W. Jia, X. Wang, J. Kim, C. Youn, M.J. Kang, H.J. Lim, B. Jacob, J. Huang, IceClave: A trusted execution environment for in-storage computing, in: MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO '21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 199–211, http://dx.doi.org/10.1145/3466752.3480109.
[32] Z. Ruan, T. He, J. Cong, INSIDER: designing in-storage computing system for emerging high-performance drive, in: 2019 USENIX Annual Technical Conference, USENIX ATC '19, USENIX Association, Renton, WA, 2019, pp. 379–394.
[33] A.M. Caulfield, T.I. Mollov, L.A. Eisner, A. De, J. Coburn, S. Swanson, Providing safe, user space access to fast, solid state disks, ACM SIGPLAN Not. 47 (4) (2012) 387–400.
[34] S. Cho, C. Park, H. Oh, S. Kim, Y. Yi, G.R. Ganger, Active disk meets flash: A case for intelligent SSDs, in: Proceedings of the 27th International ACM Conference on International Conference on Supercomputing, 2013, pp. 91–102.
[35] J. Do, Y.-S. Kee, J.M. Patel, C. Park, K. Park, D.J. DeWitt, Query processing on smart SSDs: Opportunities and challenges, in: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, 2013, pp. 1221–1230.
[36] C. Cowan, S. Beattie, J. Johansen, P. Wagle, PointGuard: Protecting pointers from buffer overflow vulnerabilities, in: 12th USENIX Security Symposium, USENIX Security '03, 2003.
[37] L. Szekeres, M. Payer, T. Wei, D. Song, SoK: Eternal war in memory, in: 2013 IEEE Symposium on Security and Privacy, IEEE, 2013, pp. 48–62.
[38] S.R. Dulloor, S. Kumar, A. Keshavamurthy, P. Lantz, D. Reddy, R. Sankaran, J. Jackson, System software for persistent memory, in: Proceedings of the Ninth European Conference on Computer Systems, EuroSys '14, ACM Press, Amsterdam, The Netherlands, 2014, pp. 1–15, http://dx.doi.org/10.1145/2592798.2592814.
[39] C. Lee, D. Sim, J. Hwang, S. Cho, F2FS: A new file system for flash storage, in: 13th USENIX Conference on File and Storage Technologies, FAST '15, USENIX Association, Santa Clara, CA, 2015, pp. 273–286.
[40] DAX, https://www.kernel.org/doc/Documentation/filesystems/dax.txt.
[41] J. Xu, S. Swanson, NOVA: A log-structured file system for hybrid volatile/non-volatile main memories, in: Proceedings of the 14th USENIX Conference on File and Storage Technologies, FAST '16, USENIX Association, USA, 2016, pp. 323–338.
[42] M. Torabzadehkashi, S. Rezaei, A. HeydariGorji, H. Bobarshad, V. Alves, N. Bagherzadeh, Computational storage: An efficient and scalable platform for big data and HPC applications, J. Big Data 6 (1) (2019) 100, http://dx.doi.org/10.1186/s40537-019-0265-5.
[43] W. Cao, Y. Liu, Z. Cheng, N. Zheng, W. Li, W. Wu, L. Ouyang, P. Wang, Y. Wang, R. Kuan, Z. Liu, F. Zhu, T. Zhang, POLARDB meets computational storage: efficiently support analytical workloads in cloud-native relational database, in: Proceedings of the 18th USENIX Conference on File and Storage Technologies, FAST '20, USENIX Association, USA, 2020, pp. 29–42.
[44] Nvidia, NVIDIA RTX IO: GPU accelerated storage technology, https://www.nvidia.com/en-us/geforce/news/rtx-io-gpu-accelerated-storage-technology/.
[45] AMD, Radeon™ Pro SSG graphics, https://www.amd.com/en/products/professional-graphics/radeon-pro-ssg.
[46] Z. An, Z. Zhang, Q. Li, J. Xing, H. Du, Z. Wang, Z. Huo, J. Ma, Optimizing the datapath for key-value middleware with NVMe SSDs over RDMA interconnects, in: 2017 IEEE International Conference on Cluster Computing, CLUSTER, 2017, pp. 582–586, http://dx.doi.org/10.1109/CLUSTER.2017.69.
[47] Samsung, Samsung 990 PRO with heatsink, https://semiconductor.samsung.com/content/semiconductor/global/consumer-storage/internal-ssd/990-pro-with-heatsink.html.
[48] Arm Ltd, ARM computational storage solution, https://www.arm.com/solutions/storage/computational-storage.
[49] Samsung, Samsung SmartSSD, https://www.xilinx.com/applications/data-center/computational-storage/smartssd.html.
[50] ScaleFlux, https://scaleflux.com/.
[51] E. Gal, S. Toledo, A transactional flash file system for microcontrollers, in: 2005 USENIX Annual Technical Conference, USENIX ATC '05, 2005.
[52] J. Koo, J. Im, J. Song, J. Park, E. Lee, B.S. Kim, S. Lee, Modernizing file system through in-storage indexing, in: Proceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation, OSDI '21, USENIX Association, Berkeley, 2021, pp. 75–92, http://dx.doi.org/10.5281/zenodo.4659803.
[53] LevelDB, https://github.com/google/leveldb.
[54] Storage performance development kit, https://spdk.io/.
[55] DFC open source, https://github.com/DFC-OpenSource.
[56] M. Jung, OpenExpress: fully hardware automated open research framework for future fast NVMe devices, in: 2020 USENIX Annual Technical Conference, USENIX ATC '20, 2020, pp. 649–656.
[57] J. Kwak, S. Lee, K. Park, J. Jeong, Y.H. Song, Cosmos+ OpenSSD: rapid prototype for flash storage systems, ACM Trans. Storage 16 (3) (2020) 15:1–15:35, http://dx.doi.org/10.1145/3385073.
[58] Filebench, https://github.com/filebench/filebench.

Juncheng Hu received the bachelor's degree and Doctor of Engineering degree from Jilin University in 2017 and 2022, where he is currently a lecturer. His research interests include data mining, machine learning, computer networks and parallel computing. jchu@jlu.edu.cn

Shuo Chen has been working toward the master's degree with the College of Computer Science and Technology, Jilin University, since 2022. His research field is computer architecture, mainly focusing on optimization for caching systems. chenshuo22@mails.jlu.edu.cn

Haoyang Wei is a master's student in Computer Science and Technology at Jilin University (class of 2023), focusing on computer architecture research, with a primary interest in the application of new storage devices. hywei23@mails.jlu.edu.cn

Guoyu Wang is currently working toward the doctoral degree with the College of Computer Science and Technology, Jilin University. wgy21@mails.jlu.edu.cn

Chenju Pei is an undergraduate student at the School of Computer Science and Technology at Jilin University. His field of research is computer system architecture, and he is currently investigating new L7 load balancing solutions. peicj2121@mails.jlu.edu.cn

Xilong Che received the M.S. and Ph.D. degrees in Computer Science from Jilin University in 2006 and 2009, respectively. He is currently a full professor and doctoral supervisor at the College of Computer Science and Technology, Jilin University, China. His current research areas are parallel and distributed computing, high-performance computing architectures, and related optimizations. He is a member of the China Computer Federation, and the corresponding author of this paper. chexilong@jlu.edu.cn
Computer Standards & Interfaces 97 (2026) 104123
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
V-Bridge: A dynamic cross-shard blockchain protocol based on off-chain
payment channel
Xueting Huang a, Xiangwei Meng a,b,c, Kai Zhang a,b,c, Ce Yang a,b,c, Wei Liang a,b,c,∗, Kuan-Ching Li a,b,c,∗
a College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China
b Hunan University of Science and Technology Sanya Research Institute, Sanya 572000, China
c Hunan Key Laboratory for Service Computing and Novel Software Technology, Xiangtan 411201, China
ARTICLE INFO

Keywords: Blockchain, Sharding, Cross-shard, Off-chain payment channel

ABSTRACT

Sharding technology effectively improves system throughput by distributing the blockchain transaction
load to multiple shards for parallel processing, and it is the core solution to the scalability problem of
blockchain. However, as the number of shards increases, the frequency of cross-shard transactions increases
significantly, leading to increased communication and computational overhead, transaction delays, uneven
resource allocation, and load imbalance, which becomes a key bottleneck for performance expansion. To this
end, this article proposes the cross-shard transaction protocol V-Bridge, which draws on the concept of off-chain
payment channels to establish distributed virtual fund channels between Trustors in different shards, converting
cross-shard transactions into off-chain transactions and realizing the logical flow of funds. To further enhance
cross-shard transaction performance, V-Bridge integrates an intelligent sharding adjustment mechanism
and a cross-shard optimized critical path protection algorithm (CSOCPPA) to dynamically balance shard loads,
alleviate resource allocation issues, and minimize performance bottlenecks. Experimental results show that,
compared with existing state-of-the-art protocols, V-Bridge's average throughput is increased by
26% to 46%, and transaction delays are reduced by 15% to 24%.
1. Introduction

Blockchain [1,2], as a decentralized, transparent, and tamper-proof
technology, has great potential in various fields such as privacy protection,
medical applications, and the Internet of Things [3–5]. In cross-border
payments, for example, it offers an alternative to traditional
banking systems, enabling fast and cost-effective transactions [6]. However,
blockchain systems face significant scalability challenges during
high transaction volumes [7–9]. For instance, Ethereum [10] often experiences
network congestion during peak usage, leading to transaction
delays and increased gas fees. Addressing scalability has become a
critical priority. Sharding [11,12] technology offers a promising solution
by partitioning the blockchain network into independent shards,
enabling parallel transaction processing and faster confirmations [13].
However, the benefits of sharding are hindered by challenges posed by
cross-shard transactions. These transactions require data synchronization
and state validation across shards, which increases communication
overhead, introduces complex synchronization issues, and creates potential
performance bottlenecks, diminishing the efficiency gains of
sharding [14].

Various cross-shard transaction protocols have been proposed to
address these challenges, aiming to enhance processing efficiency and reduce
synchronization overhead. For instance, Monoxide [15] employs an
asynchronous consensus mechanism to minimize inter-shard waiting
times. However, while transactions are processed in parallel, the system
struggles to ensure state consistency, and communication overhead
remains significant in large-scale sharding environments. BrokerChain [16]
introduces brokers to coordinate cross-shard transaction
processing, but its approach appears impractical in real-world applications.
The system relies heavily on securing sufficient intermediary
nodes to stabilize cross-shard transactions. However, such nodes are
challenging to acquire in decentralized networks, especially under
heavy transaction loads. This reliance amplifies dependence on individual
nodes, increasing the risk of centralization and limiting scalability.
Consequently, BrokerChain [16] struggles to address complex scenarios
Correspondence to: College of Computer Science and Engineering, Hunan University of Science and Technology, No. 2 Taoyuan Road, Yuhu
District, Xiangtan 411201, Hunan Province, China.
E-mail addresses: wliang@hnust.edu.cn (W. Liang), aliric@hnust.edu.cn (K.-C. Li).
https://doi.org/10.1016/j.csi.2025.104123
Received 17 December 2024; Received in revised form 27 October 2025; Accepted 26 December 2025
Available online 31 December 2025
0920-5489/© 2026 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
X. Huang et al. Computer Standards & Interfaces 97 (2026) 104123
involving dynamic node allocation and shard optimization, exposing key weaknesses in its design's practicality and adaptability. To address shard distribution, BrokerChain [16] employs epochs, where the Metis algorithm [17] optimizes user account partitioning to prevent related transactions from being spread across shards in subsequent epochs. This strategy is similarly adopted in protocols like X-Shard [18]. However, in practice, the Metis algorithm [17] suffers from partition imbalance and fragmentation of critical nodes. This fragmentation can distribute high-frequency transaction paths across multiple shards, increasing cross-shard communication overhead and processing latency. Ultimately, this undermines system performance and scalability. In conclusion, while both BrokerChain [16] and X-Shard [18] offer innovative approaches to cross-shard transaction optimization, significant challenges remain. Improvements are needed in node allocation, security management, and performance optimization to realize the full potential of cross-shard systems.

To address the shortcomings of existing cross-shard transaction protocols, we propose V-Bridge, a novel solution leveraging the off-chain transaction model of bidirectional payment channels. V-Bridge facilitates logical fund interactions across shards by constructing virtual channels. To reduce reliance on single nodes in decentralized environments, V-Bridge introduces trustee groups as relay nodes within each shard. These groups, supported by a flexible and robust management framework, ensure seamless cross-shard transactions while distributing the load effectively. Virtual fund channels between trustees are settled on-chain only upon closure, which significantly reduces on-chain synchronization overhead and enhances transaction efficiency. Additionally, V-Bridge incorporates an intelligent shard adjustment mechanism based on a Consistent Hashing Ring [19,20] and GINI coefficients [21], integrated with the Cross-Shard Optimized Critical Path Protection Algorithm (CSOCPPA). This combination improves the coordination of dynamic shard adjustments and cross-shard transactions, further enhancing system performance.

The main contributions of this article are summarized as follows:

• Blockchain Sharding Protocol Based on Virtual Fund Payment Channels (V-Bridge): We propose a virtual channel solution inspired by off-chain payment channels to facilitate cross-shard transactions. This approach minimizes direct interactions between shards, significantly reducing communication overhead and transaction delays. Consequently, V-Bridge enhances system throughput and overall performance.
• Cross-shard Optimization and Dynamic Shard Adjustment Mechanism: The V-Bridge protocol integrates the Cross-Shard Optimized Critical Path Protection Algorithm (CSOCPPA) and a dynamic shard adjustment mechanism to improve system performance. CSOCPPA enhances the Metis algorithm [17] for better account allocation, reducing cross-shard assignments. The dynamic adjustment mechanism uses Consistent Hashing and GINI coefficient analysis to optimize shard splitting and merging for balanced load distribution.
• System Implementation: We implemented the V-Bridge protocol and evaluated its performance on Ubuntu 20.04.1. Experimental results demonstrate that, compared to BrokerChain and X-Shard, V-Bridge outperforms in terms of throughput, transaction confirmation latency, workload balancing, and consensus success rate under identical conditions.

The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 presents an overview of the V-Bridge system. Section 4 details the protocol design. Section 5 introduces cross-shard optimization and dynamic shard adjustment. Section 6 provides a security analysis. Section 7 reports experimental results, and Section 8 concludes the paper.

2. Related work

2.1. Cross-shard transaction

To address the challenges of cross-shard transactions, a variety of protocols have been developed [22-24], with a focus on enhancing performance, scalability, and consistency. Monoxide [15] uses an asynchronous consensus mechanism and temporary payment channels to decompose cross-shard transactions into intra-shard operations, boosting throughput. However, the complexity of state synchronization and consistency verification across shards increases communication overhead, impacting large-scale system performance. OmniLedger [25] employs the Atomix protocol to ensure the atomicity of cross-shard transactions. Through a client-driven two-phase commit process, it freezes the UTXO of the input shard and then releases the UTXO of the output shard, maintaining consistent transaction state updates. RapidChain [26] improves system throughput by reducing communication rounds through parallel verification. However, it struggles with performance bottlenecks when handling cross-shard transactions with long dependency chains due to prolonged waiting times. BrokerChain [16] simplifies cross-shard communication by introducing third-party intermediary nodes that convert cross-shard transactions into intra-shard transactions. However, this approach introduces risks related to decentralization and trust management. Pyramid [27] implements a hierarchical sharding protocol where BridgeShard processes transactions across multiple shards in a single consistency round, significantly reducing confirmation latency. However, it increases system management complexity. CHERUBIM [28] leverages pipeline processing based on the 2PC protocol to enhance cross-shard transaction throughput but falls short in mitigating long transaction latencies. Building on these advancements, we propose the V-Bridge protocol, offering an efficient and flexible solution to the challenges of cross-shard transactions. V-Bridge supports seamless operation in complex transaction scenarios, ensuring the efficient functioning of blockchain systems.

2.2. Payment channel

Payment channels are off-chain mechanisms that improve blockchain performance by optimizing transaction processing, reducing network load, delays, and fees. Users lock funds via smart contracts, enabling multiple off-chain transactions without recording each one on the blockchain. The mechanism defines rules for initial fund allocation and status updates. Users can update the status off-chain, with each update verified by signed certificates from both parties. When the channel closes, the final status is submitted to the blockchain to settle the fund distribution. By minimizing on-chain interactions, payment channels reduce blockchain resource use. A key feature is the dynamic adjustment of participant balances while maintaining total fund immutability. This improves transaction efficiency and ensures security. Payment channels leverage cryptographic tools like digital signatures to ensure transaction integrity and reduce the costs and delays of on-chain operations, especially during congestion.

In recent years, payment channels have been widely used in many fields, such as privacy protection [29], network scalability [30], cross-chain interoperability [31], Internet of Things expansion [32], and other directions. In the sharding architecture of this article, the off-chain payment channel also provides an efficient solution for processing cross-shard transactions. By moving transactions between shards off-chain, payment channels effectively reduce the complexity of cross-shard communications, significantly improve the throughput and performance of the sharding system, and provide important support for the further development of sharding technology.

3. Overview

This section introduces the V-Bridge system model and its workflow, followed by the deployment process of the Trustor.
3.1. System architecture and workflow

Similar to BrokerChain [16], V-Bridge is also executed in epochs, and V-Bridge includes two types of shards, settlement and account shards, of which there are S settlement shards and 1 status shard. The specific definitions are as follows:

• Settlement shard (S-shard): Generates transaction blocks by packaging transactions and achieves intra-shard consistency at the beginning of each epoch.
• Account shard (A-shard): A-shard optimizes account allocation based on the user's transaction history data to alleviate the problem of load imbalance in the shard system. During the system startup phase, A-shard collects user transaction data from S-shard, uses this data to build a user transaction network, and optimizes user status distribution through the CSOCPPA. After the optimization, A-shard generates a status block and sends it to S-shard to update the user account status distribution.

Other settings: We adopt a Byzantine fault-tolerant adversarial model [33], assuming the presence of malicious actors capable of compromising specific shard nodes and performing arbitrary dishonest actions, such as data tampering, delayed messaging, or refusing to submit required information. These adversaries are slow-adaptive, meaning they cannot frequently rotate the compromised nodes within a single epoch. The system assumes a partially synchronous communication model, where messages may experience delays but are guaranteed eventual delivery.

Building upon this adversarial model, the V-Bridge mechanism is designed to ensure resilience against such malicious actors. The system architecture is structured into three key layers: the Network Initialization Layer, the Account State Reconstruction Layer, and the Transaction Processing and Consensus Layer.

• Network Initialization Layer (NIL): This layer is the core foundation of V-Bridge and is responsible for the reasonable sharding of nodes and transaction loads in the system to form multiple independent sharding structures.
• Account State Reconstruction Layer (ASRL): According to the distribution of accounts and transactions, this layer optimizes and reconstructs the account status within the shard through the CSOCPPA, generates a dynamic state reconfiguration plan, and improves system performance.
• Transaction Processing and Consensus Layer (TPCL): This layer is responsible for transaction verification and consensus, ensuring the security and consistency of all transactions.

Based on the above architecture, each layer works together to realize each epoch's process from node identity authentication to transaction processing. As shown in Fig. 1, the workflow can be divided into the following five steps:

Step 1 (PoW verification): Nodes verify their identity using PoW [34] (via public key and IP) to prevent Sybil attacks. Verified nodes are evenly distributed across shards in a round-robin fashion.

Step 2 (Select Trustor): After shard assignment, nodes apply to serve as Trustors in this epoch. Each shard forms a trust group to support subsequent transactions.

Step 3 (Transaction consensus): The settlement shard validates transactions and adds them to the pool. Consensus is reached via PBFT [33]. For cross-shard cases, virtual channels (Section 4.2) manage interactions; failed verifications trigger rollback (Section 4.3).

Step 4 (State optimization): The CSOCPPA algorithm (Section 5.1) adjusts account placement based on access patterns. A consensus-derived state block is distributed to shards for the next epoch.

Step 5 (Epoch sharding): In the new epoch, PBFT [33] guides account migration and transaction routing based on the latest state diagram, ensuring consistency and performance.

3.2. Trustor deployment

In our V-Bridge, we introduce the concept of the Trustor. Before delving into the specifics of the Trustor, it is essential to understand the concept of a trust group. A trust group consists of nodes selected by the system to facilitate the transfer of funds across shards. These nodes provide liquidity through staking or contributing resources like funds or computing power, creating a bridge for cross-shard transactions. In return, they earn commissions as incentives. By ensuring sufficient liquidity, these nodes establish virtual transaction channels between shards, enabling successful cross-shard payments for users both inside and outside the shards. The node performing this critical role is the Trustor. The process for participating in system transactions involves the following key steps:

Step 1 (Trustor application): At the start of each epoch, nodes from various shards may apply to serve as Trustors by submitting collateral, undergoing a credit assessment (excluding low-reputation nodes with poor historical performance), and providing proof of sufficient liquidity to meet the minimum funding threshold.

Step 2 (Qualification verification): The system assesses applicant nodes based on their collateral, computational capacity, credit score, and liquidity. Each node is assigned an initial credit score that reflects its resources and historical performance. The system prioritizes selecting Trustors from nodes with higher credit scores. However, to mitigate the risk of excessive centralization, a probabilistic selection mechanism is employed. Specifically, high-reputation nodes (top 30%) are assigned a 60% probability of selection, while medium-reputation nodes (top 30%-60%) are allocated a 40% probability. Even if there are enough high-reputation candidates, there remains a 40% chance of selecting a medium-reputation node, thus distributing power within the system and reducing centralization risks. In the event that there are insufficient qualified candidates, the system will randomly select additional well-performing nodes to supplement the trust set. The trusted set is dynamically maintained: nodes with significantly reduced liquidity or credit scores are automatically removed.

Step 3 (Leader election): For each shard's trust set, the system designates the Trustor node with the highest credit score as the leader, with the leader's term being tied to the current epoch. The leader is not allowed to serve in two consecutive epochs. Each Trustor generates a unique identifier 𝑇id, defined as:

𝑇id = hash(ShardID ∥ Rep ∥ Deposit ∥ Value ∥ Relate)

where Rep is the credit score, Deposit is the collateral, Value is the remaining balance, and Relate indicates the cross-shard relationships. The Relate variable is defined as:

Relate = 0 if no channel with other shards; ShardID if a channel with another shard exists.

Each 𝑇id is stored within the shard. The leader can query these 𝑇id values to execute cross-shard transactions and select the most suitable executor from the trusted set.

4. V-Bridge protocol design

In this section, we introduce the core modules of the V-Bridge protocol, including the new Merkle tree, the solution for cross-shard transactions based on virtual payment channels, and the final transaction settlement.

4.1. New Merkle Patricia Tree

To support efficient user state queries and cross-shard transaction routing, we design a shard management framework that combines a Consistent Hashing Ring with a New Merkle Patricia Tree (NMPT) (see Fig. 2).
Fig. 1. Workflow diagram of V-Bridge for an Epoch.
Fig. 2. The data structure and mapping of NMPT.
When a user initiates a query or transaction request that requires locating their associated shard, the system first computes the user's hash value 𝐻𝑢 = Hash(ID𝑢) using the SHA-256 algorithm [35]. It then performs a clockwise search on the consistent hash ring [19] to identify the first shard whose assigned range contains 𝐻𝑢, thereby determining the user's query shard. Each shard maintains a uniquely assigned hash interval and dynamically records the user-shard mappings within that range to support efficient user lookup and cross-shard routing.
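The lookup just described (hash the user ID with SHA-256, then walk the ring clockwise) can be sketched as follows. This is our minimal illustration, not the paper's implementation: it places a single point per shard, whereas production rings usually assign several virtual nodes to each shard.

```python
import bisect
import hashlib

def h(value: str) -> int:
    """SHA-256 of a string, interpreted as an integer position on the ring."""
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)

class HashRing:
    """Clockwise lookup: a user maps to the first shard point at or after H(ID_u)."""
    def __init__(self, shard_ids):
        self.ring = sorted((h(f"shard:{s}"), s) for s in shard_ids)
        self.points = [p for p, _ in self.ring]

    def locate(self, user_id: str) -> str:
        hu = h(user_id)
        # bisect finds the first shard point past hu; wrap around at the top.
        i = bisect.bisect_right(self.points, hu) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["Shard1", "Shard2", "Shard3"])
shard = ring.locate("Alice")
assert shard in {"Shard1", "Shard2", "Shard3"}
# The same ID always resolves to the same shard.
assert ring.locate("Alice") == shard
```

Because only the hash interval boundaries move when shards split or merge, most user-to-shard mappings survive a ring adjustment unchanged, which is what makes this structure attractive for the dynamic mechanism of Section 5.2.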
Fig. 3. Example of cross-shard Tx processing.
The NMPT serves as the core of state storage and improves on the traditional Merkle Patricia Tree by:

• Separating ordinary and Trustor user states into independent branches;
• Embedding a Shard Range module to explicitly encode shard positions on the hash ring;
• Enabling parallel updates, fast redirection, and reduced state conflicts.

These enhancements ensure consistency and rapid verification in dynamic, cross-shard environments. The usage of this framework in transaction execution is elaborated in the next section.

4.2. The specific implementation process of the V-Bridge protocol

We establish a channel contract (ChannelContract) between two shards via a Trustor, converting most transactions into intra-shard operations. Fund transfers between channel participants occur off-chain, with on-chain interactions limited to channel creation and final settlement. This approach significantly reduces the on-chain overhead of cross-shard transactions and enhances system performance. The protocol involves four main steps: Initial Fund Locking, Trustor Coordination and Locking, 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 Payment Execution, and HTLC (Hashed Timelock Contract) Unlocking [36,37]. To illustrate the protocol's operation, we present a cross-shard transaction case that demonstrates how it ensures the seamless completion of cross-shard transactions (see Fig. 3). Alice, a user in Shard1, wants to initiate a cross-shard transaction to transfer an amount 𝑣 to Bob, a user in Shard2. The process unfolds as follows:

Step 1 (Initial fund locking): First, Alice must locate Bob's shard position through the Hash Ring. Therefore, Alice will create a message containing the transaction information, structured as follows:

𝑇𝑋request = {Property𝐴, ToUser𝐵, Time𝐴, 𝑣, 𝑇lock, Sig𝐴}

where 𝑇lock is the preset lock time for the funds, Time𝐴 is the timestamp for creating the transaction, and 𝑣 represents the transaction amount. This message is signed by Alice with Sig𝐴 and sent to the leader of the trust group in her shard, denoted as 𝑀leader.

Upon receiving the message, the leader verifies whether Alice's balance is sufficient to cover the transaction fee and ensures no duplicate transactions with the same timestamp, thereby preventing double-spending. Once verification is complete, the leader selects a 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 with sufficient financial reserves based on the transaction requirements and then sends the verification results and the 𝑇id information of the 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 responsible for this transaction to other Trustor nodes for confirmation. Suppose 2/3 of the Trustors agree to the transaction. In that case, the leader will record the transaction, send 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀's information to Alice for confirmation, and then 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 will begin processing the transaction.

Step 2 (Trustor coordination and locking): When the 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 receives the transaction request 𝑇𝑋request, it extracts the recipient Bob's user ID and uses a Hash Ring to locate his shard (e.g., Shard3). It then generates and sends a query message:

𝑄𝑀 = {{𝑇𝑋request, Inform}, Sig𝑀}

Upon receiving 𝑄𝑀, Shard3 verifies the transaction and retrieves Alice's account by checking her signature. It consults the Dynamic Mapping Table to determine the shard locations of both Alice and Bob (e.g., Shard1 and Shard2), then forwards the query to Shard2 and broadcasts the involved shard locations.

Afterward, 𝑇𝑋request is submitted to Shard1 for verification. Simultaneously, the leader node 𝑁leader in Shard2 analyzes the transaction and trust-related metrics to select a 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 to handle execution. This proactive selection validates the transaction without waiting for Shard1's response, reducing delay.

The selected 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 in Shard2 extracts 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀's address from Sig𝑀 and deploys a contract 𝐶𝐵, which locks the specified funds. The contract enforces mutual agreement between the sender's and receiver's Trustors to release funds. If consensus is not reached, the funds remain locked. 𝐶𝐵 also includes timeout and fallback mechanisms to ensure progress under adverse conditions. Once completed, the verifier in Shard2 broadcasts the result to confirm transaction integrity.

In parallel, upon confirmation of 𝑇𝑋request in Shard1, 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 creates another contract 𝐶𝐴 and locks funds under rules similar to 𝐶𝐵.
Table 1
Symbol explanations for algorithms.
Symbol Explanation
G Graph structure, containing nodes (accounts) and edges (transactions).
u, v Account nodes in the graph, representing any two accounts.
a, b Account nodes involved in matching or merging operations.
degree(a) The degree of account 𝑎, representing the number of neighbors (transactions) it has.
maxDepth Maximum depth of a node, used to prioritize high-frequency trading nodes.
neighbor(a) The neighboring accounts of account 𝑎, connected in the graph.
edge(u, v) Edge between accounts 𝑢 and 𝑣, representing the transaction connection.
edge_weight(u, v) Weight of the edge between accounts 𝑢 and 𝑣.
target_size Target size for the coarsened network, representing the desired scale.
threshold Threshold value for merging or retaining nodes based on transaction volume.
region Region formed by the depth-priority growing algorithm, containing multiple accounts.
sorted_accounts List of accounts sorted by degree (transaction volume).
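Read together, the symbols in Table 1 suggest a simple adjacency-map model of the account graph. The sketch below is our illustration, not the paper's implementation; it shows how degree(a), neighbor(a), and edge_weight(u, v) relate to one underlying structure.

```python
from collections import defaultdict

class TxGraph:
    """Adjacency-map model of the account graph G from Table 1."""
    def __init__(self):
        self.adj = defaultdict(dict)  # u -> {v: edge_weight(u, v)}

    def add_edge(self, u, v, w):
        # Store both directions so neighbor/degree queries see all partners.
        self.adj[u][v] = w
        self.adj[v][u] = w

    def neighbor(self, a):
        """neighbor(a): accounts connected to a in the graph."""
        return set(self.adj[a])

    def degree(self, a):
        """degree(a): number of neighbors (transaction partners) of a."""
        return len(self.adj[a])

    def edge_weight(self, u, v):
        """edge_weight(u, v): weight of the transaction edge between u and v."""
        return self.adj[u][v]

g = TxGraph()
g.add_edge("u", "v", 3)
g.add_edge("u", "w", 1)
assert g.degree("u") == 2
assert g.neighbor("v") == {"u"}
assert g.edge_weight("u", "v") == 3
```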
When 𝐶𝐴 is finalized, its details are shared with Shard2 for synchronization.

Alice then sends the funds 𝑣 and a message 𝜁 to 𝐶𝐴, where 𝜁 = {𝑣, Timenow, ToUser𝐵, 𝑇lock}. The contract's unlocking condition requires that the corresponding transfer occurs within the time window 𝑇lock, avoiding conflicts or double-spending. This ensures that the locked funds are correctly routed to 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀.

Step 3 (𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 payment execution): In Shard2, after the contract is established, the 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 creates an intra-shard transaction message as follows:

𝑇𝑋second = {Property𝑁, ToUser𝐵, Time𝑁, 𝑣, Sig𝑁}

This transaction is sent to the verifier nodes within the shard for validation. Meanwhile, Bob generates a random number 𝑅 and creates the following message:

𝜁 = {H(𝑅), Sig𝐵, {𝑇𝑋second}}

This message is sent to 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀. Upon receiving the message, the 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 forwards H(𝑅) to contract 𝐶𝐴, which automatically initiates the HTLC [36]. According to the rules of 𝐶𝐴, only when 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 provides the value of 𝑅 corresponding to H(𝑅) within the specified time 𝑇1 (𝑇1 < 𝑇lock) can 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 start the next transaction smoothly. Otherwise, when the time 𝑇lock for Alice to lock the funds ends, the funds will return to Alice's account, and the system will identify the initiator of the transaction failure and punish them (Section 4.3).

Once 𝑇𝑋second is included in the block, 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 will transfer the funds 𝑠 = 𝑣 to contract 𝐶𝐵, locking them for the time period 𝑇2 (𝑇2 < 𝑇1). Bob is then notified that the funds are ready to be claimed. Once Bob provides the correct 𝑅 within the specified time window, the locked funds will be released to Bob.

Step 4 (HTLC unlocking): The 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 completes the payment, yet the funds remain locked unless the process is correctly finalized within the designated time window 𝑇2. To initiate the release, Bob must submit the correct value of 𝑅 along with his digital signature, which enables public verification. Upon receiving and verifying 𝑅, 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 generates a channel message:

𝜃1 = {Sig𝑁, Tablenow, 𝑅}

Here, Tablenow represents the latest balance allocation table, which is then sent to the 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀. The 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 provides 𝑅 to the contract 𝐶𝐴 for verification. If H(𝑅) matches, the contract notifies 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 that the match is successful. At this time, 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 forwards the message to other idle Trustors in Shard1 to jointly verify the allocation table and version number. If 2/3 of the Trustors verify that it is correct, 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 will generate the following channel message:

𝜃2 = {Sig𝑀, {𝜃1}}

and send it back to 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁. The idle Trustor group in Shard2 verifies the updated balance table and ensures that the latest channel states are properly recorded. Through this coordinated process, the updated balances are securely synchronized, and the final signed states preserve the integrity and consistency of the transaction.

4.3. Transaction settlement and failure handling

The transaction settlement process follows a standard payment channel model, addressing both cooperative and exceptional scenarios. In the cooperative case, 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 and 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 mutually agree to close the channel. 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 generates and signs a closure message containing the final balance and fund allocation, then forwards it to 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 for verification and co-signature. The jointly signed message is broadcast to the blockchain, where Shard1 and Shard2 verify the signatures and release the allocated funds. If residual funds remain locked, the inter-shard smart contract redistributes them to the appropriate addresses, finalizing the settlement.

In the uncooperative case, 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀 may unilaterally broadcast the most recent transaction state, initiating a challenge period during which 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 can dispute outdated information. If no challenge is made, the transaction is settled according to the submitted state.

In the event of abnormal failures, recovery mechanisms safeguard both fund security and system liveness. If 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 maliciously withholds the required secret 𝑅, and 𝐶𝐴 fails to receive it within the timeout, a rollback is triggered: funds are refunded to 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀, while 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 forfeits their deposit and suffers a reputation penalty. Repeated offenses may result in disqualification.

If failure occurs due to force majeure, 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁 submits a failure proof

𝜇 = {Sig𝑁, 𝑇𝑋request, 𝑇𝑋second, Tablenow}

to Shard1 and Shard2. Once it is validated that the transfer could not be completed within the lock time 𝑇1, the locked funds in 𝐶𝐵 are refunded to 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑁, and Shard1 returns funds to 𝑇𝑟𝑢𝑠𝑡𝑜𝑟𝑀, ensuring proper recovery and termination (see Table 1).

5. Cross-shard optimization and dynamic shard adjustment mechanism

5.1. CSOCPPA

Traditional Metis algorithms [17] coarsen transaction graphs using random or edge-weighted matching but often overlook critical paths and high-transaction nodes in cross-shard transactions. These elements are essential for throughput and latency, and mishandling them can increase cross-shard communication and cause load imbalance [38]. To address this, we propose CSOCPPA, a coarsening-phase optimization algorithm that identifies and preserves critical paths and high-transaction nodes during graph compression (the key symbols and notations used in our algorithms are summarized in Table 1). CSOCPPA effectively alleviates load imbalance, improves system throughput, and optimizes overall resource utilization.

(1) Adjustment during the coarsening stage: Consider a directed weighted graph 𝐺 = (𝑇, 𝐸), where 𝑇 represents the transaction nodes and 𝐸 represents the transaction edges. Each node 𝑡 ∈ 𝑇 represents a user account or contract, and each directed edge 𝑒 = (𝑡𝑖, 𝑡𝑗) ∈ 𝐸 denotes a transaction from 𝑡𝑖 to 𝑡𝑗 with associated cost or weight 𝑤𝑒(𝑒).
To evaluate transaction chain structure during coarsening, we define the longest path 𝑃 as the path with the highest cumulative transaction cost or weight from the starting node to the terminal node. To capture the depth and height of such a chain, we introduce two metrics: Depth(𝑡), which represents the maximum cumulative weight of any path ending at node 𝑡, and Height(𝑡), which represents the maximum cumulative weight of any path starting from node 𝑡.

We define the cumulative transaction cost along a path 𝑃 from 𝑡𝑖 to 𝑡𝑗 as:

𝛾(𝑡𝑖, 𝑡𝑗) = Σ_{𝑘=1}^{𝑛} 𝑤𝑒(𝑒𝑘)   (1)

where 𝑒𝑘 are the edges along the path 𝑃.

The weight 𝜔(𝑡) of a node 𝑡 denotes the sum of all weights of incoming and outgoing edges connected to 𝑡, representing the total transaction volume related to that account.

Based on this, Depth and Height can be recursively computed as:

Depth(𝑡) = max_{𝑡𝑖 ∈ Predecessors(𝑡)} (Depth(𝑡𝑖) + 𝜔(𝑡𝑖) + 𝛾(𝑡𝑖, 𝑡))   (2)

Height(𝑡) = max_{𝑡𝑗 ∈ Successors(𝑡)} (Height(𝑡𝑗) + 𝜔(𝑡𝑗) + 𝛾(𝑡, 𝑡𝑗))   (3)

These metrics allow us to identify high-impact paths in the transaction graph for partitioning and optimization purposes.

The formula for the longest path maxPath(𝑒) can be adjusted as follows:

maxPath(𝑒) = Depth(source(𝑒)) + 𝑤𝑒(𝑒) + Height(target(𝑒))   (4)

Algorithm 1: Network Coarsening with Isolation Prevention
Input: 𝐺, target_size, threshold
Output: 𝐺coarsened
1  foreach 𝑎 ∈ 𝐺.accounts do
2      degree(𝑎) ← count(neighbor(𝑎))  // Init node degree
3  sorted_accounts ← Sort(𝐺.accounts, by degree)  // Sort by degree
4  while |𝐺| > target_size do
5      foreach 𝑎 ∈ 𝐺.accounts do
6          𝑏 ← Select(neighbor(𝑎), min_degree)  // Select low-degree neighbor
7          Merge(𝑎, 𝑏)  // Merge nodes
8          Update(𝐺, 𝑎, 𝑏)  // Update graph
9          if degree(𝑎) < threshold then
10             Merge low-volume accounts into super accounts  // Group small nodes
11             Connect super accounts to neighbors  // Reconnect
12     foreach (𝑢, 𝑣) ∈ 𝐺.edges do
13         edge_weight(𝑢, 𝑣) ← calculate_weight(𝑢, 𝑣)  // Reweight edges
14         if edge_weight(𝑢, 𝑣) > threshold then
15             Merge strongly connected accounts  // Preserve strong links
16 return 𝐺coarsened
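Eqs. (1)-(4) can be exercised on a toy transaction graph as follows; the three-node example and its edge weights are ours, chosen only to make the recursions concrete. For a direct edge, 𝛾 reduces to the single edge weight.

```python
from functools import lru_cache

# Directed weighted transaction graph: (source, target) -> edge weight w_e(e).
edges = {("a", "b"): 5, ("b", "c"): 3, ("a", "c"): 1}

def omega(t):
    """ω(t): total weight of edges incident to account t."""
    return sum(w for (u, v), w in edges.items() if t in (u, v))

@lru_cache(maxsize=None)
def depth(t):
    """Depth(t): max cumulative weight of any path ending at t (Eq. 2)."""
    preds = [(u, w) for (u, v), w in edges.items() if v == t]
    return max((depth(u) + omega(u) + w for u, w in preds), default=0)

@lru_cache(maxsize=None)
def height(t):
    """Height(t): max cumulative weight of any path starting at t (Eq. 3)."""
    succs = [(v, w) for (u, v), w in edges.items() if u == t]
    return max((height(v) + omega(v) + w for v, w in succs), default=0)

def max_path(e):
    """maxPath(e) = Depth(source(e)) + w_e(e) + Height(target(e)) (Eq. 4)."""
    u, v = e
    return depth(u) + edges[e] + height(v)

# Rank edges by maxPath: high-ranked edges lie on high-impact chains and
# should stay inside one shard during partitioning.
ranked = sorted(edges, key=max_path, reverse=True)
assert max_path(("b", "c")) == 14
assert max_path(ranked[0]) >= max_path(ranked[-1])
```

The memoized recursion terminates because the transaction graph is treated as acyclic here; handling cycles would require condensing strongly connected components first, which the paper does not detail in this section.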
This formula calculates the longest path through transaction edge 𝑒 within the system. Here, source(𝑒) represents the starting node of transaction edge 𝑒, and target(𝑒) represents the terminal node of transaction edge 𝑒. If the maxPath(𝑒) value of a path is high, it indicates that the accounts and transactions on the path have a significant influence or frequent transaction records. In the state partitioning process, such high-weight paths should not be fragmented; instead, the transactions on these paths should be assigned to the same partition to reduce communication and synchronization costs associated with cross-partition transactions. After completing the calculation of maxPath(𝑒), we also need to address issues such as reducing the likelihood of transaction account isolation. Therefore, to further improve the partitioning process, the following steps are executed, as shown in Algorithm 1.

Step 1: Degree calculation and initial matching. Calculate the degrees of all accounts and sort them in ascending order. Select the account with the lowest degree from the set of unmatched accounts and match it with its adjacent accounts. Prioritize the reduction of isolated accounts and prevent the spread of low-frequency trading accounts across shards to minimize cross-shard transactions.

Step 2: Multi-edge matching phase. Match adjacent accounts in descending order of edge weight. If multiple accounts share the same weight, prioritize accounts with fewer merged edges to preserve high-frequency trading paths and minimize cutting critical paths.

Step 3: Network update and simplification. After each match, update the network's connections. If the network reaches the target size, proceed to the next step; otherwise, return to Step 2 and continue simplifying the network.

Step 4: Load balancing and super account creation. After coarsening, merge low-transaction accounts into super accounts. Ensure these super accounts remain connected to their original neighbors, accumulate all edge weights, and retain full transaction data.

(2) Initialization phase optimization: In the context of cross-shard transactions, the initial partitioning is critical for subsequent optimizations. The traditional Metis algorithm [17] grows regions by randomly selecting starting points, which may lead to high-frequency transaction accounts being distributed across different shards, thereby increasing the cost of cross-shard transactions and communication. To address this issue, we propose the Depth-Priority Growing Algorithm (DPGA). This algorithm prioritizes accounts with high transaction frequency and network importance (i.e., nodes with larger maxDepth) as starting points. Through a region-growing strategy, eligible neighboring nodes are merged into the same region until no more neighbors can be added. This approach reduces the dispersion of high-frequency transaction accounts and lowers the complexity of cross-shard transactions. The pseudocode is presented in Algorithm 2.

5.2. Dynamic sharding adjustment mechanism

By adjusting account allocation, the overall load of the sharding system can be significantly improved. To further ensure the system's performance, we incorporate a dynamic sharding mechanism combined with the CSOCPPA to maintain load stability across the system. The dynamic sharding adjustment mechanism monitors the GINI coefficient to assess load balance and evaluates transaction patterns to dynamically split or merge shards. This approach optimizes both load balancing and cross-shard transaction performance. The following sections provide a detailed explanation of shard load definitions, the GINI coefficient measurement criteria, and the dynamic adjustment mechanism.

(1) Definition and algorithm of shard load: In a distributed system, shard load is a key indicator for measuring the workload of shards. The calculation of shard load should comprehensively consider various factors, including the number of users stored within a shard, the transaction volume processed, and the frequency of cross-shard interactions. To ensure the comprehensiveness of the load indicator, the following load calculation methods are defined:

The user count load of shard 𝑆𝑗 is defined as the number of users managed by the shard, expressed as:

𝑙𝑗𝑢 = |𝑈𝑗|   (5)

The transaction volume load of shard 𝑆𝑗 is defined as the total transaction volume processed by the shard within a unit of time, expressed
X. Huang et al. Computer Standards & Interfaces 97 (2026) 104123
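The coarsening steps above can be sketched in a few lines. The following is an illustrative toy implementation, not the authors' code: the graph representation (`edges` as a weight dictionary) and the tie-breaking rules are simplifying assumptions, and `target_size` plays the role of the target network size in Step 3.

```python
# Toy coarsening sketch: repeatedly match the lowest-degree account with its
# heaviest-edge neighbor and merge the pair into a "super account",
# accumulating edge weights (Steps 1, 2, and 4 of the partitioning process).
def coarsen(edges, target_size):
    # edges: dict {(u, v): weight} with u < v; accounts are inferred from edges
    accounts = {a for e in edges for a in e}
    merged = {a: {a} for a in accounts}          # super account -> member accounts
    while len(merged) > target_size:
        degree = {a: 0 for a in merged}
        for (u, v), w in edges.items():
            degree[u] += 1
            degree[v] += 1
        # Step 1: pick the lowest-degree account (reduces isolated accounts)
        a = min(merged, key=lambda x: (degree[x], x))
        # Step 2: among its neighbors, prefer the heaviest edge
        nbrs = [(w, v if u == a else u)
                for (u, v), w in edges.items() if a in (u, v)]
        if not nbrs:                              # isolated: merge into any peer
            b = next(x for x in merged if x != a)
        else:
            b = max(nbrs)[1]
        # Step 4: merge a into b, accumulating edge weights
        merged[b] |= merged.pop(a)
        new_edges = {}
        for (u, v), w in edges.items():
            u, v = (b if u == a else u), (b if v == a else v)
            if u == v:
                continue
            key = (min(u, v), max(u, v))
            new_edges[key] = new_edges.get(key, 0) + w
        edges = new_edges
    return merged, edges
```

On a toy four-account graph this halves the network while keeping the heaviest edges inside super accounts; the real algorithm additionally tracks priorities and maxDepth as in Algorithm 2.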
as:

l_j^t = Σ_{k∈T_j} t_k    (6)

where T_j represents the set of all transactions related to shard S_j, and t_k represents the transaction volume of transaction k.

By combining these factors, the comprehensive load is defined as:

l_j = α · l_j^u + β · l_j^t    (7)

where α and β are weighting coefficients used to balance the contribution of different load sources to the shard's pressure. This comprehensive load indicator can better reflect the actual pressure of the shard and provide a reference for continuous adjustment.

(2) Load balancing measurement based on GINI coefficient: This study introduces the GINI coefficient as a measure to evaluate the load distribution state among shards. The GINI coefficient is a classical indicator of distribution inequality, with a range of [0, 1]. A value closer to 0 indicates a more balanced distribution, while a value closer to 1 indicates a more imbalanced distribution.

In the shard load scenario, the calculation formula for the GINI coefficient is as follows:

G = ( Σ_{j=1}^m Σ_{k=1}^m |l_j − l_k| ) / ( 2m Σ_{j=1}^m l_j )    (8)

where m represents the number of shards, and l_j represents the load of shard S_j. Based on the calculated GINI coefficient, the current balance of system shard loads can be directly judged. The specific judgment rules are as follows:

• G < 0.3: The system load is completely balanced, and all shard loads are identical.
• G ∈ (0.3, 0.5]: The system load is basically balanced, and load differences are within an acceptable range.
• G > 0.5: The system load is imbalanced, requiring redistribution of high-load shards or merging of low-load shards.

Algorithm 2: Depth-Priority Growing Algorithm for Cross-Shard Transaction Optimization
Input: G, target_size, threshold, maxDepth
Output: G_coarsened
1  foreach a ∈ G.accounts do
2      degree(a) ← count(neighbor(a))            // Initialize degree
3  sorted_accounts ← Sort G.accounts by Priority, then by maxDepth
4  while |G| > target_size do
5      foreach a ∈ G.accounts do
6          if maxDepth(a) > threshold then        // Prioritize high-frequency nodes
7              b ← Select neighbor(a) with max_degree
8              Merge a, b
9              Update G, a, b
10         if degree(a) < threshold then          // Identify low-volume nodes
11             Merge low-volume accounts into super accounts
12             Update and connect super accounts to neighbors
13     foreach (u, v) ∈ G.edges do
14         edge_weight(u, v) ← calculate_weight(u, v)
15         if edge_weight(u, v) > threshold then
16             Merge strongly connected accounts  // Merge strong connections
17 return G_coarsened

(3) Dynamic sharding adjustment: In a distributed shard system, load balancing operations, including shard splitting and shard merging, are triggered based on specific conditions to maintain system efficiency and fairness. When the load of a shard significantly exceeds that of others or the GINI coefficient surpasses a predefined threshold, the system triggers shard splitting. The shard with the highest load, S_max, is identified based on the condition:

l_max > μ + γσ    (9)

where μ = (1/m) Σ_{j=1}^m l_j is the average load of all shards, σ = sqrt( (1/m) Σ_{j=1}^m (l_j − μ)² ) is the standard deviation, and γ is an adjustment coefficient. For users in the split shard, the system employs consistent hashing to find the corresponding storage shards and updates the mapping to ensure query consistency. Conversely, when certain shards experience prolonged low load below a predefined threshold, the system triggers shard merging. The set of candidate shards for merging, S_low, is identified based on the condition:

l_j < θ    (10)

where θ is the minimum load threshold. During the merging process, the system updates the user-to-shard mapping and ensures synchronization of query positions for affected users. These mechanisms collectively enhance system load distribution and ensure balanced utilization of resources.

6. Security analysis

6.1. Atomicity of transactions

V-Bridge uses a hypergeometric distribution to calculate the probability of failure in each epoch. In V-Bridge, atomicity depends on whether multiple shards can complete state verification and fund confirmation within a specified time window while maintaining a secure state. If any shard's consensus committee includes more than 1/3 malicious nodes, the shard is considered failed, potentially causing transaction abortion or triggering exceptional rollbacks.

Assume that the current epoch includes N valid registered nodes, with t = ⌊fN⌋ being malicious. Each shard randomly selects n nodes to form its consensus group. The probability that a shard contains at least ⌊n/3⌋ malicious nodes is:

P_shard-fail = Σ_{x=⌊n/3⌋}^{n} [ C(t, x) · C(N − t, n − x) / C(N, n) ]    (11)

Here, N is the total number of registered nodes, t = ⌊fN⌋ is the number of malicious nodes, n is the number of consensus nodes per shard, and x denotes the number of malicious nodes in a single shard. If a transaction spans S_txn shards, the upper bound on the atomicity failure probability is:

Pr[Atomicity Failure] < S_txn · P_shard-fail    (12)

As long as the proportion of malicious nodes in each shard does not exceed 1/3 (i.e., f < 1/3), V-Bridge can guarantee atomicity. In the subsequent analysis, we demonstrate how V-Bridge ensures atomic cross-shard transactions.

Theorem 1. If a cross-shard transaction completes within the HTLC window T_1 and submits a valid R, V-Bridge guarantees atomicity without requiring rollback.
Proof. Assume the transaction involves shards S_1, S_2, ..., S_k, where the funds in each shard are locked using HTLC contracts. The unlocking condition requires the receipt of a correct R that satisfies a predetermined hash value H(R) within the window T_1, fulfilling the release condition.

Once the receiver Bob submits the correct R within T_1, all contracts are deemed releasable. Trustor nodes in each shard immediately execute the fund release and update the balance state as Table_now. This updated state is signed by both Trustors and broadcast across all involved shards for chain-level finalization. Since a valid release requires the correct R and consistent co-signatures, the final state is valid only under mutual agreement. Therefore, if the transaction succeeds, all contracts release simultaneously; otherwise, if any contract fails to trigger, it indicates that R was not submitted, and the transaction is entirely aborted without partial execution. In conclusion, as long as R is submitted within T_1, the HTLC conditions ensure atomicity across all shards without the need for rollback logic.

Theorem 2. If a cross-shard transaction misses the HTLC deadline T_1, V-Bridge ensures a consistent rollback across all shards, preventing fund loss or double spending.

Proof. The HTLC contract in V-Bridge is equipped with a unified timeout parameter T_1. If the correct random secret R is not submitted within this time frame, the contract automatically triggers a rollback mechanism, returning the locked funds to the original accounts. Since the funds remain unreleased throughout the process, the state in each shard remains unchanged, and the system proceeds to restore consistency. There is no partial commitment or state divergence, and the design inherently prevents double spending. Therefore, even in the event of transaction failure, atomicity and consistency of the system are preserved. This completes the proof.

6.2. Trustor security

To address potential malicious or faulty behavior among Trustors, the system employs a dynamic reputation mechanism that continuously monitors node performance. Actions such as refusing to sign, forwarding delays, or failing to submit HTLC secrets are penalized. Nodes whose reputation scores fall below a defined threshold are demoted, stripped of execution privileges, and forfeit their staked collateral. In the event of partial trust failure, the system ensures continuity through leader re-election or transaction rollback. All critical operations require co-signatures from at least 2/3 of the shard's Trustors, providing resilience against Byzantine behavior. We assume a majority of Trustors are honest in each round and that HTLC secrets are submitted within a bounded timeframe. Otherwise, contracts automatically roll back to preserve state consistency.

These assumptions are realistic in practice: Trustors must stake collateral and earn reputation over time through consistent, verifiable actions, making large-scale compromises both economically prohibitive and statistically improbable. The HTLC mechanism incorporates explicit timeouts and fallback logic, including automated rollback, which enables the system to function correctly without relying on perfect synchrony.

Execution rights are assigned dynamically based on reputation: high-reputation nodes are prioritized, while low-reputation nodes are sidelined to prevent abuse of authority. Even in cases of collusion, the co-signature requirement significantly raises the threshold for successful misconduct. Collectively, these mechanisms mitigate the risks associated with partial Trustor failures and HTLC disruptions. Rather than relying on idealized assumptions, the system maintains robustness through built-in safeguards such as incentive alignment, role rotation, and protocol-level fallback strategies.

7. Experimental evaluation

7.1. Setting

We developed a prototype of V-Bridge in Golang and evaluated its performance on Ubuntu 20.04.1. The testbed was configured with an 8-core AMD Ryzen 6000 processor, 16 GB LPDDR5-6400 memory, and a 1TB PCIe 4.0 SSD. To emulate real-world conditions, we introduced random network latency between 50-100 ms and limited the bandwidth to 500 Mbps. Each block accommodates up to 3000 transactions, with each transaction fixed at 512 bytes. The number of C-Shards was set as S ∈ {2, 4, 8, 16, 32, 64, 100} to evaluate system scalability.

The dataset was extracted from the Ethereum blockchain using Python scripts from the XBlock-ETH project, including sender/receiver addresses, amounts, and timestamps. The preprocessed data was used as input for V-Bridge.

In dynamic load regulation, we assigned weight factors α = 0.4 and β = 0.6 to user count and transaction volume. The GINI coefficient was used to evaluate shard imbalance. A split is triggered when GINI > 0.5, and merging is triggered when the condition l_j < θ = 0.3μ is met, where μ is the average load. To reduce over-sensitivity, the adjustment factor was set to γ = 1.5. In the CSOCPPA module, all transaction edges and node weights were set to 1 to simplify computation, representing a uniform transaction cost.

For comparison, we implemented two additional load-balancing schemes: BrokerChain and X-Shard. BrokerChain utilizes a partitioning algorithm based on user account relationships, grouping frequently interacting accounts within the same shard to reduce cross-shard transactions and enhance throughput. X-Shard employs an optimistic cross-shard transaction strategy, processing transactions in parallel on input shards and verifying them via gate accounts to minimize delays. All solutions were tested under identical conditions using the PBFT [33] intra-shard consensus protocol.

7.2. System throughput

Fig. 4 illustrates how throughput varies with the number of shards (S) using a combination of boxplots, kernel density estimation curves, and scatter plots. The boxplot shows the quartile throughput range, with white dots marking the median. The smooth kernel density curve highlights data concentration, while the scatter plot visually represents individual data points. As S increases from 4 to 100, V-Bridge consistently demonstrates superior throughput, with values stabilizing around 3k to 3.5k TPS across different shard counts. This is evident in the tight interquartile ranges and smooth density curves. BrokerChain's throughput remains relatively stable but lower, hovering around 2k to 2.5k TPS, while X-Shard's performance lags behind, stabilizing just above 2k TPS. Moreover, X-Shard exhibits more variability, with wider boxplots and more scattered data points, highlighting its lower reliability in comparison to V-Bridge. These trends highlight V-Bridge's scalability and stability as the number of shards increases.

Fig. 5 examines throughput under varying transaction arrival rates (40-180 TX/s). Similar to Fig. 4, the boxplot depicts throughput distribution, while the kernel density curve and scatter plot highlight data concentration and individual variations. V-Bridge consistently achieves nearly 3k TPS across all arrival rates with minimal fluctuations, confirming its robust performance. BrokerChain maintains steady throughput around 2k TPS, though fluctuations slightly increase at higher arrival rates. X-Shard again underperforms, with throughput consistently below 2k TPS and significant variability at high arrival rates. These results reinforce the performance and stability advantages of V-Bridge under heavy transaction loads, while BrokerChain and X-Shard struggle to adapt. The detailed visualizations further validate the reliability of the data, providing a strong foundation for comparing system performance.
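The committee-failure bound of Section 6.1 (Eqs. (11)-(12)) is easy to evaluate numerically. The sketch below is an illustrative reimplementation of those two formulas, with the hypergeometric tail starting at ⌊n/3⌋ as in Eq. (11); the parameter values in the usage note are hypothetical.

```python
from math import comb, floor

def p_shard_fail(N, f, n):
    # Eq. (11): probability that an n-node committee drawn from N nodes,
    # t = floor(f*N) of them malicious, contains at least floor(n/3)
    # malicious members (a hypergeometric tail; comb(t, x) is 0 when x > t).
    t = floor(f * N)
    return sum(comb(t, x) * comb(N - t, n - x)
               for x in range(n // 3, n + 1)) / comb(N, n)

def atomicity_failure_bound(N, f, n, s_txn):
    # Eq. (12): union bound over the s_txn shards a transaction touches
    return s_txn * p_shard_fail(N, f, n)
```

For example, with N = 100, f = 0.3, and n = 9, `p_shard_fail` gives the chance that 3 or more of the 9 sampled committee members are malicious, and `atomicity_failure_bound` scales it by the number of shards the transaction spans.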
Fig. 4. The impact of the number of shards on throughput.
Fig. 5. Impact of transaction arrival rate on throughput.
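The load metric, GINI measure, and split/merge triggers of Section 5.2 (Eqs. (5)-(10)), with the experimental settings α = 0.4, β = 0.6, γ = 1.5, and θ = 0.3μ from Section 7.1, can be sketched as follows. This is an illustrative reimplementation, not the authors' code.

```python
import math

def shard_load(users, tx_volume, alpha=0.4, beta=0.6):
    # Eqs. (5)-(7): comprehensive load l_j = alpha * |U_j| + beta * sum of tx volumes
    return alpha * users + beta * tx_volume

def gini(loads):
    # Eq. (8): G = sum_{j,k} |l_j - l_k| / (2 m * sum_j l_j)
    m, total = len(loads), sum(loads)
    if total == 0:
        return 0.0
    diff = sum(abs(a - b) for a in loads for b in loads)
    return diff / (2 * m * total)

def adjust(loads, gamma=1.5, theta_ratio=0.3):
    # Eq. (9): split shards whose load exceeds mu + gamma * sigma
    m = len(loads)
    mu = sum(loads) / m
    sigma = math.sqrt(sum((l - mu) ** 2 for l in loads) / m)
    split = [j for j, l in enumerate(loads) if l > mu + gamma * sigma]
    # Eq. (10): merge shards whose load stays below theta (= 0.3 * mu here)
    merge = [j for j, l in enumerate(loads) if l < theta_ratio * mu]
    return split, merge
```

Identical loads give G = 0, and a single overloaded shard (e.g., loads [1, 1, 1, 10]) is flagged for splitting by Eq. (9) while no shard falls below the merge threshold.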
7.3. Transaction processing delays

Fig. 6 depicts the transaction delay under different protocols as a result of factors such as the number of shards, transaction arrival rate, and cross-shard ratio. From the observed trends, these experimental results reveal key factors affecting delay and highlight the performance differences of various protocols in distributed environments.

(1) Impact of the number of shards (Fig. 6(a))
As the number of shards increases, the average transaction delay of all systems shows a gradual downward trend. This indicates that more shards can effectively distribute transaction processing workloads and improve system performance. V-Bridge demonstrates significant optimization at higher shard counts (e.g., 32 shards) with delays reduced to around 1400 ms, outperforming BrokerChain and X-Shard. This reflects V-Bridge's superior cross-shard processing efficiency. By contrast, BrokerChain and X-Shard show only slight delay reductions
Fig. 6. Comparison of transaction delays.
Fig. 7. Consensus and load comparison.
with more shards, and the overall delay remains higher than V-Bridge's, with the gap further widening at 32 shards.

(2) Impact of transaction arrival rate (Fig. 6(b))
Increasing the transaction arrival rate leads to an apparent rise in transaction delay. All protocols exhibit relatively low delays at lower arrival rates (40 TX/s). However, as the arrival rate increases, delays grow significantly. V-Bridge maintains stable performance even at higher arrival rates (e.g., 180 TX/s), showing good scalability. In contrast, BrokerChain and X-Shard experience rapidly increasing delays at higher arrival rates (e.g., above 120 TX/s), indicating their limited capacity to handle higher loads and reflecting their performance bottlenecks in high-load environments.

(3) Impact of cross-shard ratio (Fig. 6(c))
At a lower cross-shard ratio (20%), delays across all protocols are relatively close. However, as the cross-shard ratio increases, inter-protocol differences become apparent. V-Bridge maintains stable performance even at an 80% cross-shard ratio, with an average delay of approximately 1400 ms, demonstrating excellent cross-shard processing capabilities. By contrast, BrokerChain and X-Shard experience significant delay increases, particularly when the cross-shard ratio exceeds 80%. Their delays surpass those of V-Bridge, highlighting the inefficiency of their cross-shard communication and the more significant impact of high cross-shard traffic.

7.4. Consensus and load comparison

Consensus success rate is a critical indicator in distributed systems that measures the proportion of successfully completed consensus processes within a given time. It reflects the system's ability to synchronize and process transactions efficiently while maintaining data consistency. A higher success rate directly correlates with better system performance and reliability. Conversely, a lower success rate can lead to transaction failures, increased delays, or even system reconfiguration. Optimizing consensus protocols enhances the system's stability, performance, and cross-shard transaction processing capabilities.
Fig. 7 illustrates the consensus success rate and transaction efficiency of three protocols (V-Bridge, BrokerChain, and X-Shard) under varying conditions, such as the number of shards, transaction arrival rates, and cross-shard ratios. V-Bridge consistently outperforms the other protocols across all conditions. In Fig. 7(a), as the number of shards increases, the consensus success rate and transaction efficiency for all protocols decline due to higher cross-shard communication overhead. However, V-Bridge maintains a high success rate (close to 95%) even with 32 shards, demonstrating excellent scalability. In contrast, BrokerChain and X-Shard experience significant performance degradation as the number of shards increases. Fig. 7(b) examines the impact of transaction arrival rates. V-Bridge maintains stable consensus success rates and efficiency even at high arrival rates (e.g., 180 TX/s). In contrast, BrokerChain and X-Shard exhibit significant declines in performance under heavy transaction loads, reflecting their processing limitations. Fig. 7(c) highlights the impact of cross-shard ratios. V-Bridge demonstrates strong performance, maintaining a stable success rate even at an 80% cross-shard ratio. Meanwhile, the success rates of BrokerChain and X-Shard drop sharply under high cross-shard ratios, revealing their weaknesses in handling intensive cross-shard scenarios.

Finally, the comparison of load balancing across different shard numbers shows that V-Bridge achieves the most balanced load distribution. Its variance remains minimal as the number of shards increases, approaching the Optimal Load Balance standard. In contrast, the load variance for BrokerChain and X-Shard is significantly higher, indicating poorer load balancing capabilities.

8. Conclusion and future work

We propose a novel virtual off-chain cross-shard transaction mechanism that employs logical fund interactions instead of actual currency transfers. This approach eliminates delays caused by continuous uploading and significantly enhances throughput. By integrating an intelligent sharding adjustment mechanism with the CSOCPPA, we address the limitations of traditional account optimization and mitigate load imbalance. Experimental results show that, compared to BrokerChain and X-Shard, V-Bridge achieves up to 50% higher average throughput and reduces transaction latency by at least 15%. Additionally, its consensus success rate consistently exceeds 90%. Across varying shard counts, V-Bridge demonstrates a progressively decreasing load, which remains consistently lower than those of the other two protocols. These results underscore V-Bridge's superior performance, scalability, and reliability as a solution for cross-shard transactions.

In future work, we plan to establish virtual fund channels among multiple Trustors to achieve interoperability across the entire shard network. Additionally, we will explore advanced optimization strategies for dynamic shard management to further reduce communication overhead. Meanwhile, we aim to incorporate zero-knowledge proofs and other cryptographic techniques into security management to enhance system security.

CRediT authorship contribution statement

Xueting Huang: Writing - original draft, Software, Methodology, Conceptualization. Xiangwei Meng: Writing - review & editing, Validation, Supervision. Kai Zhang: Formal analysis, Conceptualization. Ce Yang: Methodology, Formal analysis. Wei Liang: Writing - review & editing, Funding acquisition, Formal analysis, Conceptualization. Kuan-Ching Li: Writing - review & editing, Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was partially supported by the Key Program of the Joint Funds of the National Natural Science Foundation of China under Grant U2468205, the National Natural Science Foundation of China under Grants 62472168, 62072170 and 61976087, the Hunan Provincial Natural Science Foundation of China under Grants 2021JJ30141 and 2024JJ6066, the Key Research and Development Program of Hunan Province under Grant 2022GK2015, the Science and Technology Project of the Department of Communications of Hunan Province under Grant 202101, and the Research Projects of the Hunan Provincial Department of Education under Grants 23B0449 and 23B0288.

Data availability

Data will be made available on request.

References

[1] J. Xu, C. Wang, X. Jia, A survey of blockchain consensus protocols, ACM Comput. Surv. 55 (13s) (2023) 1-35.
[2] S. Zhang, Z. Yan, W. Liang, K.-C. Li, B. Di Martino, BCAE: A blockchain-based cross domain authentication scheme for edge computing, IEEE Internet Things J. 11 (13) (2024) 24035-24048.
[3] W. Liang, Y. Yang, C. Yang, Y. Hu, S. Xie, K.-C. Li, J. Cao, PDPChain: A consortium blockchain-based privacy protection scheme for personal data, IEEE Trans. Reliab. 72 (2) (2022) 586-598.
[4] W. Liang, S. Xie, K.-C. Li, X. Li, X. Kui, A.Y. Zomaya, MC-DSC: A dynamic secure resource configuration scheme based on medical consortium blockchain, IEEE Trans. Inf. Forensics Secur. 19 (2024) 3525-3538.
[5] J. Cai, W. Liang, X. Li, K. Li, Z. Gui, M.K. Khan, GTxChain: A secure IoT smart blockchain architecture based on graph neural network, IEEE Internet Things J. 10 (24) (2023) 21502-21514.
[6] M.M. Islam, M.K. Islam, M. Shahjalal, M.Z. Chowdhury, Y.M. Jang, A low-cost cross-border payment system based on auditable cryptocurrency with consortium blockchain: Joint digital currency, IEEE Trans. Serv. Comput. 16 (3) (2022) 1616-1629.
[7] Y. Lu, The blockchain: State-of-the-art and research challenges, J. Ind. Inf. Integr. 15 (2019) 80-90.
[8] H. Jin, J. Xiao, Towards trustworthy blockchain systems in the era of internet of value: development, challenges, and future trends, Sci. China Inf. Sci. 65 (153101) (2022) 1-11.
[9] X. Meng, W. Liang, Z. Xu, K. Li, M.K. Khan, X. Kui, An anonymous authenticated group key agreement scheme for transfer learning edge services systems, ACM Trans. Sen. Netw. 20 (2024).
[10] T. Chen, Z. Li, Y. Zhu, J. Chen, X. Luo, J.C.-S. Lui, X. Lin, X. Zhang, Understanding ethereum via graph analysis, ACM Trans. Internet Technol. (TOIT) 20 (2) (2020) 1-32.
[11] L. Luu, V. Narayanan, C. Zheng, K. Baweja, S. Gilbert, P. Saxena, A secure sharding protocol for open blockchains, in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, ACM SIGSAC, 2016, pp. 17-30.
[12] Z. Hong, S. Guo, P. Li, Scaling blockchain via layered sharding, IEEE J. Sel. Areas Commun. 40 (12) (2022) 3575-3588.
[13] F. Cheng, J. Xiao, C. Liu, S. Zhang, Y. Zhou, B. Li, B. Li, H. Jin, Shardag: Scaling dag-based blockchains via adaptive sharding, in: 2024 IEEE 40th International Conference on Data Engineering, ICDE, IEEE, 2024, pp. 2068-2081.
[14] P. Zheng, Q. Xu, Z. Zheng, Z. Zhou, Y. Yan, H. Zhang, Meepo: Multiple execution environments per organization in sharded consortium blockchain, IEEE J. Sel. Areas Commun. 40 (12) (2022) 3562-3574.
[15] J. Wang, H. Wang, Monoxide: Scale out blockchains with asynchronous consensus zones, in: 16th USENIX Symposium on Networked Systems Design and Implementation, NSDI '19, USENIX Association, 2019, pp. 95-112.
[16] H. Huang, X. Peng, J. Zhan, S. Zhang, Y. Lin, Z. Zheng, S. Guo, Brokerchain: A cross-shard blockchain protocol for account/balance-based state sharding, in: IEEE INFOCOM 2022-IEEE Conference on Computer Communications, IEEE, 2022, pp. 1968-1977.
[17] G. Karypis, V. Kumar, A fast and high quality multilevel scheme for partitioning irregular graphs, SIAM J. Sci. Comput. 20 (1) (1998) 359-392.
[18] J. Xu, Y. Ming, Z. Wu, C. Wang, X. Jia, X-Shard: Optimistic cross-shard transaction processing for sharding-based blockchains, IEEE Trans. Parallel Distrib. Syst. 35 (4) (2024) 548-559.
[19] Y. Levi, I. Keslassy, Beyond the ring: Quantized heterogeneous consistent hashing, in: 2023 IEEE 31st International Conference on Network Protocols, ICNP, IEEE, 2023, pp. 1-12.
[20] G. Mendelson, S. Vargaftik, K. Barabash, D.H. Lorenz, I. Keslassy, A. Orda, Anchorhash: A scalable consistent hash, IEEE/ACM Trans. Netw. 29 (2) (2020) 517-528.
[21] B. Hou, D. Wang, T. Xia, L. Xi, Z. Peng, K.-L. Tsui, Generalized Gini indices: Complementary sparsity measures to Box-Cox sparsity measures for machine condition monitoring, Mech. Syst. Signal Process. 169 (2022) 108751.
[22] X. Qi, Y. Li, LightCross: Sharding with lightweight cross-shard execution for smart contracts, in: IEEE INFOCOM 2024-IEEE Conference on Computer Communications, IEEE, 2024, pp. 1681-1690.
[23] S. Jiang, J. Cao, C.L. Tung, Y. Wang, S. Wang, SHARON: Secure and efficient cross-shard transaction processing via shard rotation, in: Proceedings of the IEEE International Conference on Computer Communications, INFOCOM, IEEE, 2024, pp. 20-23.
[24] Y. Zhang, S. Pan, J. Yu, Txallo: Dynamic transaction allocation in sharded blockchain systems, in: 2023 IEEE 39th International Conference on Data Engineering, ICDE, IEEE, 2023, pp. 721-733.
[25] E. Kokoris-Kogias, P. Jovanovic, L. Gasser, N. Gailly, E. Syta, B. Ford, Omniledger: A secure, scale-out, decentralized ledger via sharding, in: 2018 IEEE Symposium on Security and Privacy, SP, IEEE, 2018, pp. 583-598.
[26] M. Zamani, M. Movahedi, M. Raykova, Rapidchain: Scaling blockchain via full sharding, in: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, ACM SIGSAC, 2018, pp. 931-948.
[27] Z. Hong, S. Guo, P. Li, W. Chen, Pyramid: A layered sharding blockchain system, in: IEEE INFOCOM 2021-IEEE Conference on Computer Communications, IEEE, 2021, pp. 1-10.
[28] A. Liu, Y. Liu, Q. Wu, B. Zhao, D. Li, Y. Lu, R. Lu, W. Susilo, CHERUBIM: A secure and highly parallel cross-shard consensus using quadruple pipelined two-phase commit for sharding blockchains, IEEE Trans. Inf. Forensics Secur. 19 (2024) 3178-3193.
[29] X. Wang, C. Lin, X. Huang, D. He, Anonymity-enhancing multi-hop locks for monero-enabled payment channel networks, IEEE Trans. Inf. Forensics Secur. 19 (2023) 2438-2453.
[30] T. Cai, W. Chen, K.E. Psannis, S.K. Goudos, Y. Yu, Z. Zheng, S. Wan, Scalable on-chain and off-chain blockchain for sharing economy in large-scale wireless networks, IEEE Wirel. Commun. 29 (3) (2022) 32-38.
[31] X. Jia, Z. Yu, J. Shao, R. Lu, G. Wei, Z. Liu, Cross-chain virtual payment channels, IEEE Trans. Inf. Forensics Secur. 18 (2023) 3401-3413.
[32] Z. Li, W. Su, M. Xu, R. Yu, D. Niyato, S. Xie, Compact learning model for dynamic off-chain routing in blockchain-based IoT, IEEE J. Sel. Areas Commun. 40 (12) (2022) 3615-3630.
[33] W. Li, C. Feng, L. Zhang, H. Xu, B. Cao, M.A. Imran, A scalable multi-layer PBFT consensus for blockchain, IEEE Trans. Parallel Distrib. Syst. 32 (5) (2020) 1146-1160.
[34] H. Azimy, A.A. Ghorbani, E. Bagheri, Preventing proof-of-work mining attacks, Inform. Sci. 608 (2022) 1503-1523.
[35] A. Hosoyamada, Y. Sasaki, Quantum collision attacks on reduced SHA-256 and SHA-512, in: Annual International Cryptology Conference, Springer, 2021, pp. 616-646.
[36] C. Boyd, K. Gjøsteen, S. Wu, A blockchain model in tamarin and formal analysis of hash time lock contract, in: 2nd Workshop on Formal Methods for Blockchains, FMBC 2020, Schloss-Dagstuhl-Leibniz Zentrum für Informatik, 2020, p. 13.
[37] Y. Liu, W. Liang, K. Xie, S. Xie, K. Li, W. Meng, LightPay: A lightweight and secure off-chain multi-path payment scheme based on adapter signatures, IEEE Trans. Serv. Comput. 17 (4) (2023) 1503-1523.
[38] J. Herrmann, J. Kho, B. Uçar, K. Kaya, Ü.V. Çatalyürek, Acyclic partitioning of large directed acyclic graphs, in: 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID, IEEE, 2017, pp. 371-380.
papers_txt/lwe-problem.txt Normal file
@@ -0,0 +1,420 @@
CS 294. The Learning with Errors Problem:
Introduction and Basic Cryptography
The learning with errors (LWE) problem was introduced in its current form in a seminal work of Oded Regev, for which he won the Gödel prize in 2018. In its typical form, the LWE problem asks to solve a system of noisy linear equations. That is, it asks to find s ∈ Z_q^n given

    { (a_i, ⟨a_i, s⟩ + e_i) }_{i=1}^{m}  where  s ← Z_q^n, a_i ← Z_q^n, e_i ← χ        (1)

where:
• Z_q = Z/qZ denotes the finite ring of integers modulo q, and Z_q^n denotes the vector space of dimension n over Z_q;
• χ is a probability distribution over Z which typically outputs "small" numbers, an example being the uniform distribution over an interval [−B, . . . , B] where B ≪ q/2; and
• a ← D denotes that a is chosen according to the finite probability distribution D, and a ← S denotes that a is chosen uniformly at random from the (finite) set S.
In this first lecture, we will present various perspectives on the LWE (and the closely related “short
integer solutions” or SIS) problem, basic theorems regarding the different variants of these problems
and their basic cryptographic applications.
We will shortly derive LWE in a different way, “from first principles”, starting from a different
view, that of finding special solutions to systems of linear equations.
1 Solving Systems of Linear Equations
Consider the problem of solving a system of linear equations
Ae = b mod q        (2)
given A ∈ Z_q^{n×m} and b ∈ Z_q^n. This can be accomplished in polynomial time with Gaussian
elimination. However, slight variations of this problem become hard for Gaussian elimination and
indeed, we believe, for all polynomial-time algorithms. This course is concerned with two such
problems, very related to each other, called the SIS problem and the LWE problem.
1.1 The "Total" Regime and SIS
Assume that we now ask for solutions to equation (2) where e lies in some subset S ⊆ Z_q^m.
Typically we will think of subsets S that are defined geometrically, for example:
• S = {0, 1}^m, which is the classical subset sum problem modulo q. More generally, S =
[-B, ..., B]^m is the set of all solutions where each coordinate can only take a bounded value
(absolute value bounded by some number B ≪ q/2). This will be the primary setting of
interest.
• S = Ball_2(R), the Euclidean ball of (small) radius R.
In all cases, we are asking for short solutions to systems of linear equations and hence this is called
the SIS (short integer solutions) problem.
The SIS problem SIS(n, m, q, B) as we will study is parameterized by the number of variables
m, the number of equations n, the ambient finite field Zq , and the bound on the absolute value of
the solutions B. Namely, we require that each coordinate e_i ∈ [-B, -B+1, ..., B-1, B].
To define an average-case problem, we need to specify the probability distributions for A and
b. We will, for the most part of this course, take A to be uniformly random in Z_q^{n×m}. There are
two distinct ways to define b. The first is in the "total" regime, where we simply choose b from the
uniform distribution over Z_q^n.
What does “total” mean? Total problems in NP are ones for which each problem instance has
a solution that can be verified given a witness, but the solution may be hard to find. An example
is the factoring problem where you are given a positive integer N and you are asked for its prime
factorization. A non-example is the 3-coloring problem where you are given a graph G and you
are asked for a 3-coloring; although this problem is in NP, it is not total as not every graph is
3-colorable.
Totality of SIS on the Average. Here, using a simple probabilistic argument, one can show
that (B-bounded) solutions are very likely to exist if (2B + 1)^m ≫ q^n, or m = Ω(n log q / log B). We call
this regime of parameters the total regime or the SIS regime. Thus, roughly speaking, in the SIS
regime, m is large enough that we are guaranteed solutions (even exponentially many of them)
when A and b are chosen to be uniformly random. The problem then is to actually find a solution.
A Variant: Homogeneous SIS. The homogeneous version of SIS asks for a non-zero solution to
equation (2) with the right-hand side being 0, that is, Ae = 0 (mod q). This variant is worst-case
total as long as (B + 1)^m > q^n; that is, every instance A is guaranteed to have a solution. We
leave the proof to the reader (Hint: pigeonhole). SIS and hSIS are equivalent on the average; we
again leave the simple proof to the reader.
1.2 The Planted Regime and LWE
When m ≪ n log q / log B, one can show again that there are likely to be no B-bounded solutions for a
uniformly random b, and thus we have to find a different, sensible way to state this problem. To
do this, we first pick a B-bounded vector e and compute b as Ae mod q. In a sense, we plant the
solution e inside b. The goal now is to recover e (which is very likely to be unique) given A and
b. We call this the planted regime or the LWE regime.
But why is this LWE when it looks so different from Equation 1?
This is because the SIS problem in the planted regime is simply LWE in disguise. For, given
an LWE instance (A, y^T = s^T A + e^T), let A^⊥ ∈ Z_q^{(m-n)×m} be a full-rank set of vectors in the
right-kernel of A. That is,
    A^⊥ · A^t = 0 mod q
Then,
    b := A^⊥ · y = A^⊥ · (A^t s + e) = A^⊥ · e mod q
so (A^⊥, b) is an SIS instance SIS(m - n, m, q, B) whose solution is the LWE error vector. Further-
more, this is in the planted regime, since one can show with an easy probabilistic argument that
the LWE error vector e is unique given (A, y).
The reader should also notice that we can run the reduction in reverse, creating an LWE
instance from a SIS instance. If the SIS instance is in the planted regime, this (reverse) reduction
will produce an LWE instance.
In summary, the only difference between the SIS and the LWE problems is whether they live
in the total world or the planted world, respectively. But the world you live in may make a
big difference. Algorithmically, so far, we don't see a difference. In cryptography, SIS gives us
applications in “minicrypt” (such as one-way functions) whereas we need LWE for applications in
“cryptomania” and beyond (such as public-key encryption and fully homomorphic encryption).
Decision vs. Search for LWE. In the decisional version of LWE, the problem is to distinguish
between (A, yT := sT A + eT mod q) and a uniformly random distribution. One can show, through
a reduction that runs in poly(q) time, that the two problems are equivalent. The interesting
direction is to show that if there is a poly-time algorithm that solves the decision-LWE problem
for a uniformly random matrix A, then there is a poly-time algorithm that solves the search-LWE
problem for a (possibly different and possibly larger) uniformly random matrix A′. We will see a
search-to-decision reduction later in class.
1.3 Reductions Between SIS and LWE
SIS is at least as hard as LWE. We wish to show that if you have a solution for SIS w.r.t.
A, then it is immediate to solve decision-LWE w.r.t. A. Indeed, given a (homogeneous) SIS solution
e such that Ae = 0 (mod q), and a vector b^T, compute b^T e (mod q). If b is an LWE instance, then
    b^T e = (s^T A + x^T) e = x^T e (mod q)
which is a "small" number (as long as x^T is small enough). On the other hand, if b is random,
then this quantity is uniformly random mod q (in particular, with non-negligible probability, not
small). This gives us a distinguisher.
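This distinguishing statistic is easy to simulate. The sketch below plants the short kernel vector (finding one is exactly the hard SIS problem) and uses hypothetical toy parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, q, B = 8, 40, 10007, 2

# Plant a short kernel vector e (a homogeneous SIS solution): A e = 0 mod q.
e = rng.integers(-B, B + 1, size=m)
e[-1] = 1
A = rng.integers(0, q, size=(n, m))
A[:, -1] = (-A[:, :-1] @ e[:-1]) % q        # force the last column so A e = 0

def centered(v, q):
    """Representative of v mod q in (-q/2, q/2]."""
    return (v + q // 2) % q - q // 2

# LWE vector: b^T = s^T A + x^T with small x, so b^T e = x^T e is small.
s = rng.integers(0, q, size=n)
x = rng.integers(-B, B + 1, size=m)
b_lwe = (s @ A + x) % q
b_uni = rng.integers(0, q, size=m)          # uniformly random vector

small = abs(centered(int(b_lwe @ e) % q, q))   # bounded by m * B^2 = 160
large = abs(centered(int(b_uni @ e) % q, q))   # typically on the order of q
```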
LWE is (quantumly) at least as hard as SIS. This turns out to be true, as we will see later
in the course.
1.4 SIS, LWE and Lattice Problems
SIS and LWE are closely related to lattices and lattice problems. We will have much to say about
this connection, in later lectures.
2 Basic Theorems
We start with some basic structural theorems on LWE and SIS.
2.1 Normal Form SIS and Short-Secret LWE
The normal form for SIS is where the matrix A is systematic, that is, of the form A = [A_0 || I] where
A_0 ∈ Z_q^{n×(m-n)}.
Lemma 1. Normal-form SIS is as hard as SIS.
Proof. To reduce from normal-form SIS to SIS, simply multiply the input to normal-form SIS
(nfSIS), denoted [A_0 || I], on the left by a random matrix B ← Z_q^{n×n}. We leave it to the reader
to verify that the resulting matrix A := B[A_0 || I] is uniformly random. Furthermore, a
solution to SIS on input (A, B b_0) gives us a solution to nfSIS on input (A_0, b_0).
In the other direction, to reduce from SIS to normal-form SIS, write A as [A_0 || B] and gener-
ate [B^{-1} A_0 || I] as the normal-form SIS instance. Again, a solution to the normal-form instance
(B^{-1} A_0, B^{-1} b) gives us a solution to SIS on input (A, b).
The corresponding version of LWE is called short-secret LWE where both the entries of s and
that of e are chosen from the error distribution χ. The proof of the following lemma follows along
the lines of that for normal form SIS and is left as an exercise. (Indeed, a careful reader will observe
that short-secret LWE is nothing but normal-form SIS in disguise.)
Lemma 2. There is a polynomial-time reduction from ssLWE(n, m, q, χ) to LWE(n, m, q, χ) and
one from LWE(n, m, q, χ) to ssLWE(n, m + n, q, χ).
We will continue to see more structural theorems about LWE through the course, but this
suffices for now.
3 Basic Cryptographic Applications
3.1 Collision-Resistant Hashing
A collision resistant hashing scheme H consists of an ensemble of hash functions {Hn }n∈N where
each Hn consists of a collection of functions that map n bits to m < n bits. So, each hash function
compresses its input and, by the pigeonhole principle, it has collisions: that is, inputs x ≠ y such that
h(x) = h(y). Collision-resistance requires that every p.p.t. adversary who gets a hash function
h ← H_n chosen at random fails to find a collision, except with negligible probability.
Collision-Resistant Hashing from SIS. Here is a hash family H_n that is secure under SIS(n, m, q, B),
and which compresses whenever m log(B + 1) > n log q. Each hash function h_A is parameterized by a matrix A ∈ Z_q^{n×m},
takes as input e ∈ [0, ..., B]^m, and outputs
    h_A(e) = Ae mod q
A collision gives us e, e′ ∈ [0, ..., B]^m with Ae = Ae′ mod q, which in turn says that A(e - e′) =
0 mod q. Since each entry of e - e′ is in [-B, ..., B], this gives us a solution to SIS(n, m, q, B).
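At toy parameters a collision can even be found by brute force, which makes the collision-implies-SIS argument concrete. A minimal sketch with hypothetical, intentionally insecure sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, q, B = 4, 64, 17, 1    # compression: m*log2(B+1) = 64 > n*log2(q) ~ 16.3

A = rng.integers(0, q, size=(n, m))

def h(A, e):
    """h_A(e) = A e mod q, for e in [0..B]^m."""
    return tuple(int(v) for v in A @ e % q)

# Birthday search: hash random inputs until two collide (feasible at toy size).
seen = {}
while True:
    e = rng.integers(0, B + 1, size=m)
    key = h(A, e)
    if key in seen and not np.array_equal(seen[key], e):
        e2 = seen[key]
        break
    seen[key] = e

d = e - e2    # SIS solution: A d = 0 mod q, entries in [-B, B], d nonzero
```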
3.2 Private-Key Encryption
A private-key encryption scheme has three algorithms: a probabilistic key generation Gen which,
on input a security parameter λ, generates a private key sk; a probabilistic encryption algorithm
Enc which, on input sk and a message m chosen from a message space M, generates a ciphertext
c; and a deterministic decryption algorithm Dec which, on input sk and the ciphertext c, outputs
a message m′.
Correctness requires that for every sk generated by Gen and every m ∈ M,
Dec(sk, Enc(sk, m)) = m
The notion of security for private-key encryption is semantic security or equivalently, CPA-security,
as defined in the Pass-Shelat lecture notes (see References at the end of the notes.) In a nutshell,
this says that no probabilistic polynomial time (p.p.t.) adversary which gets oracle access to either
the Left oracle or the Right oracle can distinguish between the two. Here, the Left (resp. the Right)
oracle take as input a pair of messages (mL , mR ) ∈ M2 and outputs an encryption of mL (resp.
mR ).
Private-Key Encryption from LWE.
• Gen(1^λ): Compute n = n(λ), q = q(λ) and χ = χ(λ) in a way we will describe later in this
lecture. Let the private key sk be a uniformly random vector
    sk := s ← Z_q^n.
• Enc(sk, m): We will work with the message space M := {0, 1}. Larger message spaces can
be handled by encrypting each bit of the message independently. The ciphertext is
    c := (a, b) := (a, s^T a + e + m⌊q/2⌉ mod q)
where a ← Z_q^n and e ← χ is chosen from the LWE error distribution.
• Dec(sk, c = (a, b)): Output 0 if
    |b - s^T a mod q| < q/4
and 1 otherwise.
Lemma 3. The scheme above is correct if the support of the error distribution satisfies Supp(χ) ⊆ (-q/4, q/4),
and CPA-secure under the LWE assumption LWE(n, m = poly(n), q, χ).
Correctness and security are immediate and left as an exercise to the reader.
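The three algorithms can be sketched directly. This is a toy Python instance (hypothetical parameters chosen only to satisfy the correctness constraint of Lemma 3, with a bounded-uniform χ; not a secure instantiation):

```python
import numpy as np

rng = np.random.default_rng(3)
n, q, B = 16, 2**13, 4       # Supp(chi) = [-B, B] lies inside (-q/4, q/4)

def gen():
    """sk = s, uniform in Z_q^n."""
    return rng.integers(0, q, size=n)

def enc(s, m):
    """Encrypt a bit m: (a, s^T a + e + m*floor(q/2) mod q)."""
    a = rng.integers(0, q, size=n)
    e = int(rng.integers(-B, B + 1))
    b = (int(s @ a) + e + m * (q // 2)) % q
    return a, b

def dec(s, ct):
    """Output 0 iff b - s^T a mod q is within q/4 of zero."""
    a, b = ct
    d = (b - int(s @ a)) % q
    return 0 if min(d, q - d) < q // 4 else 1

s = gen()
ok = all(dec(s, enc(s, m)) == m for m in (0, 1) for _ in range(100))
```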
We left the issue of how to pick n, q and χ open, and indeed, they need to be chosen appropriately
for the scheme to be secure. Correctness and security give us constraints on these parameters
(see Lemma 3 above), but do not tell us how to completely specify them. To fully specify the
parameters, we need to ensure security against attackers “running in 2λ time” (this is the meaning
of the security parameter λ that we will use throughout this course) and to do that, we need to
evaluate the efficacy of various attacks on LWE which we will do (at least, asymptotically) in the
next lecture.
Open Problem 1.1. Construct a nice private-key encryption scheme from the hardness of SIS.
Note that SIS directly implies a one-way function. Together with the generic transformations
in cryptography from one-way functions to pseudorandom generators (Håstad-Impagliazzo-Levin-
Luby), from pseudorandom generators to pseudorandom functions (Goldreich-Goldwasser-Micali),
and from pseudorandom functions to private-key encryption (easy/folklore), this is possible. The
problem is to avoid the ugliness that results from using these general transformations.
3.3 Public-Key Encryption
A public-key encryption scheme is the same as private-key encryption except for two changes: first,
the key generation algorithm Gen outputs a public key pk as well as a private key sk; and second,
the encryption algorithm requires only the public key pk to encrypt. Security requires that a p.p.t.
adversary which is given pk (and thus can encrypt as many messages as it wants on its own) cannot
distinguish between an encryption of any two messages m0 , m1 ∈ M of its choice.
Public-Key Encryption from LWE (the LPR Scheme). There are many ways of doing this;
we will present the cleanest one, due to Lyubashevsky-Peikert-Regev.
• Gen(1^λ): Compute n = n(λ), q = q(λ) and χ = χ(λ) in a way we will describe later in this lec-
ture. The private key sk := s ← χ^n is a random vector chosen from the error distribution,
and the public key is
    pk := (A, y^T := s^T A + e^T) ∈ Z_q^{n×n} × Z_q^n
where A is a uniformly random n-by-n matrix and e ← χ^n is chosen from the error distribu-
tion.
• Enc(pk, m): We will work with the message space M := {0, 1} as above. The ciphertext is
    c := (a, b) := (Ar + x, y^T r + x′ + m⌊q/2⌉ mod q)
where r, x ← χ^n and x′ ← χ are chosen from the LWE error distribution.
• Dec(sk, c = (a, b)): Output 0 if
    |b - s^T a mod q| < q/4
and 1 otherwise.
Lemma 4. The scheme above is correct if Supp(χ) ⊆ (-√(q/(4(2n+1))), √(q/(4(2n+1)))), and CPA-
secure under the LWE assumption LWE(n, m = 2(n + 1), q, χ).
Proof. For correctness, note that the decryption algorithm computes
    b - s^T a mod q = m⌊q/2⌉ + (e^T r + x′ - s^T x)
whose error term, as long as Supp(χ) ⊆ (-√(q/(4(2n+1))), √(q/(4(2n+1)))), has absolute value at most
    (q/(4(2n+1))) · (2n + 1) = q/4.
For security, we proceed by the following sequence of hybrid experiments.
Hybrid 0.m. The adversary gets pk and Enc(pk, m), where m ∈ {0, 1}.
Hybrid 1.m. Feed the adversary a "fake" public key p̃k computed as
    p̃k := (A, y) ← Z_q^{n×n} × Z_q^n
and Enc(p̃k, m). This is indistinguishable from Hybrid 0 by the hardness of ssLWE(n, n, q, χ) and
therefore, by Lemma 2, LWE(n, 2n, q, χ).
Hybrid 2.m. Feed the adversary p̃k and a fake ciphertext Ẽnc(p̃k, m) computed as
    Ẽnc(p̃k, m) = (a, b′ + m⌊q/2⌉ mod q)
where a ← Z_q^n is uniformly random. This is indistinguishable from Hybrid 1 by ssLWE(n, n+1, q, χ),
or by Lemma 2, LWE(n, 2n+1, q, χ), since the entire ciphertext can easily be rewritten as
    [ A ; y^T ] r + [ x ; x′ ] + [ 0 ; m⌊q/2⌉ ]  mod q
which, since y is now uniformly random, is n + 1 ssLWE samples and can therefore be indistin-
guishably replaced by
    [ a ; b′ ] + [ 0 ; m⌊q/2⌉ ]  mod q
where a ← Z_q^n and b′ ← Z_q.
Hybrid 3.m. Feed the adversary uniformly random values from the appropriate domains. This
follows from the previous expression for the fake ciphertext (random + anything = random).
For every m ∈ M, Hybrid 0.m is computationally indistinguishable from Hybrid 3.m. Furthermore,
Hybrid 3 is completely independent of m. Therefore, Hybrids 0.0 and 0.1 are computationally
indistinguishable from each other, establishing semantic security, i.e., CPA-security.
There are many ways to improve the rate of this encryption scheme, that is, lower the ratio of
(#bits in ciphertext)/(#bits in plaintext) and indeed, even achieve a rate close to 1. We can also
use these techniques as building blocks to construct several other cryptographic systems such as
oblivious transfer protocols. This public-key encryption scheme has its origins in earlier works of
Ajtai and Dwork (1997) and Regev (2004).
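The LPR scheme can be sketched the same way. Again, toy hypothetical parameters, chosen so that (2n+1)·B² < q/4 as required by Lemma 4:

```python
import numpy as np

rng = np.random.default_rng(4)
n, q, B = 16, 2**15, 2       # (2n+1)*B^2 = 132 < q/4 = 8192

def gen():
    s = rng.integers(-B, B + 1, size=n)        # short secret, s <- chi^n
    A = rng.integers(0, q, size=(n, n))
    e = rng.integers(-B, B + 1, size=n)
    y = (s @ A + e) % q                        # y^T = s^T A + e^T
    return (A, y), s

def enc(pk, m):
    A, y = pk
    r = rng.integers(-B, B + 1, size=n)
    x = rng.integers(-B, B + 1, size=n)
    x1 = int(rng.integers(-B, B + 1))
    a = (A @ r + x) % q
    b = (int(y @ r) + x1 + m * (q // 2)) % q
    return a, b

def dec(s, ct):
    a, b = ct
    d = (b - int(s @ a)) % q                   # = m*floor(q/2) + small error
    return 0 if min(d, q - d) < q // 4 else 1

pk, s = gen()
ok = all(dec(s, enc(pk, m)) == m for m in (0, 1) for _ in range(100))
```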
Public-Key Encryption from LWE (the Regev Scheme) We present a second public-key
encryption scheme due to Regev. We will only provide a sketch of the correctness and security
analysis and leave it as an exercise to the reader. We remark that the security proof relies on a
beautiful lemma called the “leftover hash lemma” (Impagliazzo, Levin and Luby 1990).
• Gen(1^λ): Compute n = n(λ), q = q(λ) and χ = χ(λ) in a way we will describe later in this
lecture. The private key sk := s ← Z_q^n is a random vector chosen uniformly from Z_q^n,
and the public key is
    pk := (A, y^T := s^T A + e^T) ∈ Z_q^{n×m} × Z_q^m
where A is a uniformly random n-by-m matrix and e ← χ^m is chosen from the error distri-
bution. Here m = Ω(n log q).
Note the difference from LPR, where the secret key had small entries. Note also that the
matrix A is somewhat larger than in LPR.
• Enc(pk, m): We will work with the message space M := {0, 1} as above. The ciphertext is
    c := (a, b) := (Ar, y^T r + x′ + m⌊q/2⌉ mod q)
where r ← {0, 1}^m and x′ ← χ is chosen from the LWE error distribution.
Note the difference from LPR, where the vector r was chosen from the error distribution and
the first component of the ciphertext had an additive error as well. Roughly speaking, in Regev's scheme
we will argue that the first component is statistically close to random, whereas in LPR we
argued that it is computationally close to random under the decisional LWE assumption.
• Dec(sk, c = (a, b)): Output 0 if
    |b - s^T a mod q| < q/4
and 1 otherwise.
Decryption recovers m⌊q/2⌉ plus an error e^T r + x′ whose absolute value should be smaller than q/4
for correctness. This holds as long as the support of the error distribution satisfies
Supp(χ) ⊆ (-q/(4(m+1)), q/(4(m+1))).
In the security proof, we first replace the public key with a uniformly random vector, relying on
the LWE assumption. Once this is done, we use the leftover hash lemma to argue that the ciphertext
is statistically close to random.
Public-Key Encryption from LWE (the dual Regev Scheme) We present yet another
public-key encryption scheme due to Gentry, Peikert and Vaikuntanathan called the “dual Regev”
scheme. The nice feature of this scheme, which will turn out to be important when we get to
identity-based encryption is that the distribution of the public key is really random. In other words,
any string could be a possible public key in the scheme.
• Gen(1^λ): Compute n = n(λ), q = q(λ) and χ = χ(λ) in a way we will describe later in this
lecture. The private key sk := r ← {0, 1}^m is a random vector chosen uniformly at
random with 0/1 entries, and the public key is
    pk := (A, a := Ar) ∈ Z_q^{n×m} × Z_q^n
where A is a uniformly random n-by-m matrix. Here m = Ω(n log q).
Note the difference from Regev: the private key here resembles the first component of a
Regev ciphertext. No wonder this is called "dual Regev".
• Enc(pk, m): We will work with the message space M := {0, 1} as above. The ciphertext is
    c := (y^T, b) := (s^T A + e^T, s^T a + x′ + m⌊q/2⌉ mod q)
where s ← Z_q^n, e ← χ^m, and x′ ← χ are chosen from the LWE error distribution.
• Dec(sk, c = (y^T, b)): Output 0 if
    |b - y^T r mod q| < q/4
and 1 otherwise.
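A corresponding toy sketch of dual Regev, with m = 8n standing in (as an assumption of this demo) for m = Ω(n log q):

```python
import numpy as np

rng = np.random.default_rng(5)
n, q, B = 16, 2**15, 2
m = 8 * n                     # toy stand-in for m = Omega(n log q)

def gen():
    r = rng.integers(0, 2, size=m)             # sk = r <- {0,1}^m
    A = rng.integers(0, q, size=(n, m))
    a = A @ r % q                              # pk second component: a = A r
    return (A, a), r

def enc(pk, msg):
    A, a = pk
    s = rng.integers(0, q, size=n)
    e = rng.integers(-B, B + 1, size=m)
    x1 = int(rng.integers(-B, B + 1))
    y = (s @ A + e) % q                        # y^T = s^T A + e^T
    b = (int(s @ a) + x1 + msg * (q // 2)) % q
    return y, b

def dec(r, ct):
    y, b = ct
    d = (b - int(y @ r)) % q                   # error x1 - e^T r, at most (m+1)*B
    return 0 if min(d, q - d) < q // 4 else 1

pk, r = gen()
ok = all(dec(r, enc(pk, msg)) == msg for msg in (0, 1) for _ in range(100))
```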
Open Problem 1.2. Construct a public-key encryption scheme from the hardness of LWE where
the support of the error distribution χ is large, namely [-cq, cq] for some constant c.
LWE with such large errors does imply a one-way function, and therefore, a private-key encryp-
tion scheme. The question therefore asks if there is a gap between the LWE parameters that gives
us public-key vs private-key encryption.
References
The primary reference for the cryptographic definitions in this lecture is lecture notes by Pass and
Shelat, available at this url.

papers_txt/opaque-2018.txt
papers_txt/owl-apake.txt
papers_txt/regev-lattice.txt
papers_txt/rfc9807.txt
(Vector) Oblivious Linear Evaluation: Basic Constructions and Applications
Peter Scholl
24 January 2022, Bar-Ilan Winter School
This talk:
• What is it? OLE, VOLE and variants
• What's it good for? Correlated randomness, oblivious PRF
• How do you build it? Homomorphic encryption, oblivious transfer; active security
• Conclusion
Oblivious linear evaluation (OLE)
Alice inputs x; Bob inputs (a, b). The OLE functionality gives Alice the output y = ax + b,
and Bob learns nothing.
OLE is secret-shared multiplication
If Alice inputs x and Bob inputs (a, b), then Alice's output y and Bob's value -b form additive
shares of the product: y - b = ax.
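The share relation y - b = ax can be seen by simulating the ideal functionality. A toy sketch over a prime field (in a real protocol, of course, neither party sees the other's inputs; here one function plays the trusted functionality):

```python
import random

q = 2**61 - 1          # a Mersenne prime, used as a toy field modulus

def ole(x, a, b):
    """Ideal OLE functionality: Alice inputs x, Bob inputs (a, b);
    Alice learns y = a*x + b mod q and Bob learns nothing new."""
    return (a * x + b) % q

x = random.randrange(q)          # Alice's input
a = random.randrange(q)          # Bob's multiplier
b = random.randrange(q)          # Bob's random mask
y = ole(x, a, b)                 # Alice's share of a*x

# Alice's share y and Bob's share -b add up to the product a*x:
shares_sum = (y - b) % q
```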
Variants: random-OLE, vector-OLE
• OLE: Alice inputs x, Bob inputs (a, b); Alice gets y = ax + b.
• Random OLE ($-OLE): the functionality samples the parties' inputs at random.
• Vector OLE (VOLE): Alice inputs x, Bob inputs vectors (a⃗, b⃗); Alice gets y⃗ = a⃗x + b⃗.
A few basic observations
• n × OLE ⇒ 1 × VOLE (unconditional, passive security): VOLE is easier to build than n × OLE.
• $-OLE ⇒ OLE (unconditional, sending 3 field elements): random (V)OLE is enough.
• OLE ⇒ Oblivious Transfer (unconditional): public-key crypto is necessary [IR 89].
Motivation: secure computation with preprocessing [Beaver 91]
A preprocessing phase produces correlated randomness, independent of the inputs x and y.
The online phase computing f(x, y) is then information-theoretic and computationally cheap.
Example: multiplication triples from OLE
Alice holds (x, x′), Bob holds (a, a′). Two random OLEs give y - b = ax′ and y′ - b′ = a′x,
i.e., additive shares of the cross terms. Since
    (x + a)(x′ + a′) = xx′ + aa′ + ax′ + a′x,
adding the local products xx′ and aa′ yields additive shares of w = uv for the shared values
u = x + a and v = x′ + a′.
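This can be simulated end to end. A toy sketch calling an ideal OLE twice; the share labels u_A, v_A, etc. are my own names for the two parties' shares of u and v:

```python
import random

q = 2**61 - 1    # prime field modulus (toy choice)

def ole(x, a, b):
    """Ideal OLE: evaluator inputs x, other party inputs (a, b); returns a*x + b."""
    return (a * x + b) % q

# Beaver multiplication triple (u, v, w = u*v), additively shared, from 2 OLEs.
uA, vA = random.randrange(q), random.randrange(q)   # Alice's shares of u, v
uB, vB = random.randrange(q), random.randrange(q)   # Bob's shares of u, v
b1, b2 = random.randrange(q), random.randrange(q)   # Bob's OLE masks

y1 = ole(vA, uB, b1)     # shares of uB*vA:  Alice holds y1, Bob holds -b1
y2 = ole(uA, vB, b2)     # shares of uA*vB:  Alice holds y2, Bob holds -b2

wA = (uA * vA + y1 + y2) % q       # Alice's share of w
wB = (uB * vB - b1 - b2) % q       # Bob's share of w

u, v = (uA + uB) % q, (vA + vB) % q
```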
(V)OLE for correlated randomness
• Scalar/vector triples, matrix triples: build from VOLE.
• Multi-party correlations: from pairwise instances of (V)OLE; other approaches use depth-1
homomorphic encryption [DPSZ 12].
• Authenticated secret shares: use VOLE to generate information-theoretic MACs, a key part
of the SPDZ protocols [DPSZ 12, KOS 16, KPR 18, ...].
Application: oblivious pseudorandom functions
A PRF F with key K ← {0,1}^λ is indistinguishable from a random function. In an oblivious
PRF, the server holds K and the client holds x; the client learns F(K, x), the server learns
nothing, and F(K, y) remains pseudorandom for any y ≠ x.
Vector-OLE ⇒ batch OPRF evaluation [BCGIKS 19]
The server inputs a scalar s ∈ F_p; the client inputs a_i ∈ F_p with random masks b_i ← F_p;
the VOLE gives the server t_i = a_i·s + b_i. The server's keys are K_i := (s, t_i) with PRF
F(K_i, x) := H(t_i - x·s); the client outputs H(b_i), which equals F(K_i, a_i).
• This is a relaxed OPRF: related keys, some leakage.
• Secure if H is a random oracle, or under a variant of correlation-robustness.
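A sketch of this construction, simulating the VOLE as an ideal functionality; the concrete hash H (truncated SHA-256) and field size are illustrative assumptions, not part of the protocol:

```python
import hashlib
import random

q = 2**61 - 1    # toy prime field

def H(val):
    """Toy random-oracle stand-in: truncated SHA-256 of the field element."""
    return hashlib.sha256(str(val % q).encode()).hexdigest()[:16]

# Ideal VOLE: server inputs scalar s; client inputs a_i and picks masks b_i.
s = random.randrange(q)                        # server's OPRF key scalar
a = [random.randrange(q) for _ in range(4)]    # client's OPRF inputs
b = [random.randrange(q) for _ in range(4)]    # client's VOLE masks
t = [(ai * s + bi) % q for ai, bi in zip(a, b)]  # server learns t_i = a_i*s + b_i

# Server's key for slot i is K_i = (s, t_i); PRF value F(K_i, x) = H(t_i - x*s).
# The client outputs H(b_i), which matches the PRF at its own input a_i,
# since t_i - a_i*s = b_i.  For x != a_i, t_i - x*s = b_i + (a_i - x)*s
# stays pseudorandom from the client's view.
client_out = [H(bi) for bi in b]
server_eval = [H((ti - ai * s) % q) for ti, ai in zip(t, a)]
```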
Random vector-OLE ⇒ batch OPRF evaluation
Run a random VOLE: the server gets s and t′_i = r_i·s + b_i, for random r_i, b_i held by the
client. The client then sends d_i = a_i - r_i, and the server updates t_i = t′_i + d_i·s. Keys are
K_i := (s, t_i) and the client outputs H(b_i) as before.
• Optimal communication: one F_p element per evaluation (given random VOLE).
Applications of OPRF
• Random 1-out-of-q OT: correlated randomness, e.g. masked truth tables [DKSSZZ 17].
• Password-authenticated key exchange, e.g. OPAQUE [JKX 18]; batch OPRF seems less useful here.
• Private set intersection: reducing the use of public-key crypto [KKRT 16, KMPRT 17, ...];
with polynomial-based encoding [GPRTY 21, Sec 7.1], a simple protocol with communication
proportional to |input|.
Constructing VOLE, "non-silently"
Taxonomy of VOLE protocols:
• "Non-silent": from oblivious transfer (a receiver with bit b obtains s_b from sender messages
s_0, s_1) or from homomorphic encryption (Enc → Eval → Dec).
• "Silent": mostly based on LPN; requires "seed" VOLEs to bootstrap.
(V)OLE from oblivious transfer [Gilboa 99]
Alice bit-decomposes x = Σ_i 2^i x_i. Bob samples b_i such that b = Σ_i 2^i b_i mod q. For each
bit i, they run an OT with sender messages (b_i, b_i + a) and Alice's choice bit x_i, so Alice
learns y_i = b_i + a·x_i. She outputs
    y = Σ_i 2^i y_i = b + ax.
Repeat for VOLE [KOS 16].
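Gilboa's protocol can be checked against an ideal OT. A toy sketch with a power-of-two modulus q = 2^16, so that m = 16 bit-OTs suffice:

```python
import random

q = 2**16            # modulus; m = log2(q) = 16 bit-OTs per OLE

def ot(choice, msg0, msg1):
    """Ideal 1-out-of-2 OT: receiver with bit `choice` learns one message."""
    return msg1 if choice else msg0

# Alice (receiver) holds x; Bob (sender) holds a and picks the mask b.
x = random.randrange(q)
a = random.randrange(q)

b_shares = [random.randrange(q) for _ in range(16)]
b = sum(bi << i for i, bi in enumerate(b_shares)) % q    # b = sum_i 2^i b_i

# One OT per bit of x, with messages (b_i, b_i + a) and choice bit x_i.
y_shares = [ot((x >> i) & 1, b_shares[i], (b_shares[i] + a) % q)
            for i in range(16)]
y = sum(yi << i for i, yi in enumerate(y_shares)) % q    # y = sum_i 2^i y_i
```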
(V)OLE from oblivious transfer [Gilboa 99], cost summary
• Perfectly secure.
• Each output costs m = log q calls to OT on m-bit strings.
  ○ Computational cost: cheap via OT extension [IKNP 03].
  ○ Communication: ≥ m² bits.
• Active security?
(V)OLE from oblivious transfer: active security?
A corrupt Bob can use a′ ≠ a in the OT for some bit j, sending messages (b_j, b_j + a′).
Alice's output then becomes y + (a′ - a)·x_j, i.e., it depends on her secret bit x_j.
VOLE: lightweight correctness check
Alice holds x and y_i; Bob holds a_i, b_i. Goal: check that y_i = a_i·x + b_i for all i.
Alice sends random challenges χ_1, ..., χ_n ∈ F_p. Bob replies with
    a* = Σ_i χ_i a_i + a_{n+1},   b* = Σ_i χ_i b_i + b_{n+1},
where instance n+1 serves as a mask. Alice computes y* = Σ_i χ_i y_i + y_{n+1} and checks
y* = a*·x + b*.
Intuition: to pass the check when some y_i is incorrect, Bob must guess χ_i; he succeeds with
probability 1/p.
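The check is easy to simulate, including the cheating case. A toy sketch (the extra instance n+1 masks the aggregated values a*, b*):

```python
import random

p = 2**61 - 1    # prime field: a cheater passes with probability only 1/p
n = 8

x = random.randrange(p)
a = [random.randrange(p) for _ in range(n + 1)]   # index n is the mask instance
b = [random.randrange(p) for _ in range(n + 1)]
y = [(a[i] * x + b[i]) % p for i in range(n + 1)]  # honest VOLE outputs

def check(y, a, b):
    """Alice's check: random challenges, then one linear relation."""
    chi = [random.randrange(p) for _ in range(n)]
    a_star = (sum(c * ai for c, ai in zip(chi, a)) + a[n]) % p
    b_star = (sum(c * bi for c, bi in zip(chi, b)) + b[n]) % p
    y_star = (sum(c * yi for c, yi in zip(chi, y)) + y[n]) % p
    return y_star == (a_star * x + b_star) % p

honest_passes = check(y, a, b)

# A corrupt output y_0: the check fails unless chi_0 is guessed (prob. 1/p).
y_bad = list(y)
y_bad[0] = (y[0] + 1) % p
cheater_passes = check(y_bad, a, b)
```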
Problems with selective failure
• Recall: a corrupt Bob can induce the error y′ = y + (a′ - a)·x_0.
  ○ The error depends on the secret bit x_0, so even if the VOLE output is correct, passing
    the check leaks that x_0 = 0.
• Solutions:
  ○ 1) Relaxed VOLE: allow small leakage on x [KOS 16], [WYKW 21].
  ○ 2) Privacy amplification via the leftover hash lemma [KOS 16].
(V)OLE from OT: summary
• Simple protocol with lightweight computation, leveraging fast OT-extension techniques.
• Expensive communication: at least m² bits, where m = log q.
• Active security almost for free, if leakage on x is acceptable.
VOLE from homomorphic encryption
Linearly homomorphic encryption: a PKE scheme (KeyGen, Enc, Dec) that encrypts vectors
over F_p. For a⃗ ∈ F_p^n, write [a⃗] := Enc_pk(a⃗). Linear homomorphism: one can compute
[a⃗] + [b⃗] and c⃗ · [a⃗] for c⃗ ∈ F_p^n, such that
    Dec([a⃗] + [b⃗]) = a⃗ + b⃗
    Dec(c⃗ · [a⃗]) = c⃗ ⊙ a⃗   (component-wise product)
Examples of linearly homomorphic encryption (more on Wednesday!)
• Paillier encryption: each ciphertext encrypts a Z_N element (N = pq).
• DDH-based: ElGamal in the exponent, with poly-size plaintexts; class groups give Z_p for a
large prime p [CL 15].
• Ring Learning With Errors (RLWE) [LPR 10]: natively encrypts a vector in Z_p^n.
Naïve VOLE from linearly homomorphic encryption
Alice generates (pk, sk) ← Gen(1^λ) and sends pk along with [x]. Bob, holding a⃗ and b⃗,
homomorphically computes [y⃗] = a⃗ · [x] + [b⃗] and returns it; Alice decrypts y⃗ = Dec_sk([y⃗]).
Security: CPA security protects Alice; circuit privacy protects Bob.
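The naïve protocol can be sketched with toy Paillier as the linearly homomorphic scheme. The primes below are tiny and insecure, chosen only so the demo runs instantly; real Paillier uses primes of 1536+ bits.

```python
import math
import random

# Toy Paillier keypair (insecure demo primes).
p_, q_ = 999983, 1000003
N = p_ * q_
N2 = N * N
lam = math.lcm(p_ - 1, q_ - 1)
# With g = 1 + N, (1+N)^lam = 1 + lam*N (mod N^2), so L(g^lam) = lam mod N.
mu = pow((pow(1 + N, lam, N2) - 1) // N, -1, N)

def enc(m):
    """Paillier encryption: c = g^m * r^N mod N^2, with g = 1 + N."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return pow(1 + N, m, N2) * pow(r, N, N2) % N2

def dec(c):
    """Paillier decryption: L(c^lam mod N^2) * mu mod N."""
    return (pow(c, lam, N2) - 1) // N * mu % N

# Naive VOLE: Alice sends Enc(x); Bob returns [y_i] = a_i*[x] + [b_i].
x = random.randrange(N)
cx = enc(x)
a = [random.randrange(N) for _ in range(3)]
b = [random.randrange(N) for _ in range(3)]
cy = [pow(cx, ai, N2) * enc(bi) % N2 for ai, bi in zip(a, b)]

y = [dec(c) for c in cy]          # Alice decrypts y_i = a_i*x + b_i mod N
```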
Circuit privacy in homomorphic encryption
In RLWE, the message is hidden by noise. After computing a⃗ · [x] + [b⃗], the ciphertext noise
depends on a⃗ and b⃗ (it is removed during decryption), so it can leak Bob's inputs.
• Classic solution: "noise flooding", adding extra noise much larger than the computation
noise; this requires much larger ciphertexts.
• Optimization: "gentle noise flooding" [dCHIV 21]: encrypt a t-out-of-n sharing of the
message, so a few leaked coordinates don't matter.
What about active security?
• What can go wrong? Alice or Bob could send garbage ciphertexts.
• What about a correctness check as in the OT-based protocol? Selective failure is more
subtle here: the error may depend on the ciphertext noise or the secret key.
• Solution: zero-knowledge proofs. Alice gives a proof of plaintext knowledge; Bob gives a
proof of correct multiplication.
ZK proofs for homomorphic encryption
• RLWE is more challenging than number-theoretic assumptions.
• Proof of plaintext knowledge: the naïve sigma protocol has soundness 1/2; various
optimizations [BCS 19] and amortization [BBG 19] exist, but it is still computationally
expensive and often needs larger parameters.
• Proof of correct multiplication: even worse, and tricky to amortize; it can be avoided
assuming linear-only encryption [BISW 18, KPR 18].
Conclusion: basic constructions and applications
• OLE and VOLE are core building blocks of secure computation: correlated randomness, and
special-purpose applications like OPRF and private set intersection. Next talk: zero knowledge.
• Non-silent protocols from OT and AHE remain important, even if silent protocols win.
• Open question: improving RLWE parameters and efficiency, especially for active security.
Thank you!

papers_txt/vole-ring-lwe.txt