initial
2
.gitignore
vendored
Normal file
@@ -0,0 +1,2 @@
/target
Cargo.lock
57
Cargo.toml
Normal file
@@ -0,0 +1,57 @@
[package]
name = "opaque-lattice"
version = "0.1.0"
edition = "2024"
description = "Post-quantum OPAQUE implementation using lattice-based cryptography"
license = "MIT OR Apache-2.0"

[dependencies]
pqcrypto-kyber = { version = "0.8", features = ["serialization"] }
pqcrypto-dilithium = { version = "0.5", features = ["serialization"] }
pqcrypto-traits = "0.3"

sha2 = "0.10"
sha3 = "0.10"
hkdf = "0.12"
hmac = "0.12"
argon2 = "0.5"

rand = "0.8"
getrandom = "0.2"

serde = { version = "1.0", features = ["derive"] }
hex = "0.4"

thiserror = "2"

zeroize = { version = "1", features = ["derive"] }

subtle = "2.5"

[dev-dependencies]
tokio = { version = "1", features = ["full", "test-util"] }
rand_chacha = "0.3"
criterion = "0.5"

[[bench]]
name = "oprf_benchmark"
harness = false

[features]
default = []
server = ["dep:axum", "dep:tokio", "dep:tower-http"]
debug-trace = []

[dependencies.axum]
version = "0.8"
optional = true

[dependencies.tokio]
version = "1"
features = ["full"]
optional = true

[dependencies.tower-http]
version = "0.6"
features = ["cors", "fs"]
optional = true
408
PLAN.md
Normal file
@@ -0,0 +1,408 @@
# Lattice-Based OPAQUE Implementation Plan

## Executive Summary

This document outlines the strategy for implementing a **true post-quantum OPAQUE** protocol using lattice-based cryptography. The implementation uses:
- **Ring-LPR OPRF** (Learning Parity with Rounding over Rings) for oblivious password evaluation
- **ML-KEM (Kyber768)** for authenticated key exchange
- **ML-DSA (Dilithium3)** for server authentication signatures

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────┐
│                         POST-QUANTUM OPAQUE                         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐           │
│  │  Ring-LPR    │    │   Kyber768   │    │  Dilithium3  │           │
│  │    OPRF      │    │     KEM      │    │  Signatures  │           │
│  │              │    │              │    │              │           │
│  │  F_k(x) =    │    │ Encap/Decap  │    │ Sign/Verify  │           │
│  │  ⌊k·H(x)⌋₁   │    │              │    │              │           │
│  └──────────────┘    └──────────────┘    └──────────────┘           │
│         │                   │                   │                   │
│         └───────────────────┼───────────────────┘                   │
│                             │                                       │
│                     ┌───────▼───────┐                               │
│                     │    OPAQUE     │                               │
│                     │   Protocol    │                               │
│                     └───────────────┘                               │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```

## Ring-LPR OPRF Construction (Shan et al. 2025)

### Mathematical Foundation

**Ring Definition**: R = Z[x]/(x^n + 1) where n is a power of 2 (we use n=256)

**Ring-LPR Problem** (Definition 12 from paper):
For a, s ∈ R₂ and uniformly random u ∈ R₄, the following distributions are computationally indistinguishable:
```
(a, ⌊a·s mod 4⌋₁) ≈_C (a, ⌊u⌋₁)
```

**Security Reduction Chain**:
```
Ring-LPR → LPR → LWR → G-EDCP → DCP (Dihedral Coset Problem)
```
DCP is believed to remain hard for quantum computers; the plan assumes an attack cost of O(e^n).

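The ring arithmetic and rounding defined above can be sketched in a few lines. This is a minimal illustration, assuming a schoolbook negacyclic product instead of the NTT the plan mentions; the function names `ring_mul`, `round1`, and `lpr_eval` are hypothetical and are not taken from `oprf/ring.rs`.

```rust
/// Multiply two polynomials in Z_q[x]/(x^n + 1) (schoolbook, negacyclic).
/// Illustrative only: the plan targets an NTT-based O(n log n) product.
fn ring_mul(a: &[u32], b: &[u32], q: u32) -> Vec<u32> {
    let n = a.len();
    assert_eq!(n, b.len());
    let mut out = vec![0i64; n];
    for i in 0..n {
        for j in 0..n {
            let prod = a[i] as i64 * b[j] as i64;
            let k = i + j;
            if k < n {
                out[k] += prod;
            } else {
                // x^n = -1, so terms that wrap around change sign.
                out[k - n] -= prod;
            }
        }
    }
    out.iter().map(|&c| c.rem_euclid(q as i64) as u32).collect()
}

/// Rounding ⌊x⌋₁ = ⌊x/2⌋ mod 2, applied to each coefficient in Z_4.
fn round1(v: &[u32]) -> Vec<u8> {
    v.iter().map(|&c| ((c / 2) % 2) as u8).collect()
}

/// Core of the PRF without blinding: ⌊k·h mod 4⌋₁ for binary k and h.
fn lpr_eval(k: &[u32], h: &[u32]) -> Vec<u8> {
    round1(&ring_mul(k, h, 4))
}
```

With n = 4, k = 1 + x + x³ and h = 1 + x² + x³, the product reduces to 3x³ mod 4, so the rounded output has a single 1 in the last position.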
### OPRF Protocol Flow

```
Client                                    Server
  │                                         │
  │  input: password                        │  secret: k ∈ R₂
  │                                         │
  │  1. h = H₁(password) ∈ R₂               │
  │  2. Generate blind: b ∈ R₂              │
  │  3. blinded = h + b                     │
  │                                         │
  │  ──────── blinded, OT_setup ──────────► │
  │                                         │  4. Compute v = ⌊k·blinded mod 4⌋₁
  │                                         │  5. Prepare OT responses
  │  ◄─────── OT_response, aux ──────────── │
  │                                         │
  │  6. Unblind using OT results            │
  │  7. output = H₂(unblinded)              │
  │                                         │
```

### Key Properties

| Property | How Achieved |
|----------|--------------|
| **Obliviousness** | Oblivious Transfer hides client's selection bits |
| **Determinism** | Rounding ⌊·⌋₁ is deterministic (same input → same output) |
| **Post-Quantum** | Ring-LPR reduces to DCP (quantum-hard) |
| **Efficiency** | O(n log n) via NTT, ~8-16ms per evaluation |

## Component Implementation Status

| Component | Implementation | Status |
|-----------|---------------|--------|
| Ring Arithmetic | `oprf/ring.rs` | ✅ Implemented |
| Hash-to-Ring H₁ | `oprf/ring.rs` | ✅ Implemented |
| Rounding ⌊·⌋₁ | `oprf/ring.rs` | ✅ Implemented |
| Oblivious Transfer | `oprf/ot.rs` | ✅ Implemented |
| Ring-LPR OPRF | `oprf/ring_lpr.rs` | ✅ Implemented |
| **Fast OPRF (OT-free)** | `oprf/fast_oprf.rs` | ✅ Implemented (experimental) |
| **Verifiable OPRF** | `oprf/voprf.rs` | ✅ Implemented |
| Password Hardening | Argon2id | ✅ Implemented |
| Kyber768 KEM | `ake/kyber.rs` | ✅ Implemented |
| Dilithium3 Signatures | `ake/dilithium.rs` | ✅ Implemented |
| Envelope Store/Recover | `envelope/mod.rs` | ✅ Implemented |
| Registration Flow | `registration.rs` | ✅ Implemented |
| Login Flow | `login.rs` | ✅ Implemented |

## OPAQUE Protocol Flows

### Registration

```
Client                                    Server
  │                                         │
  │  1. pw_hash = Argon2id(password, salt)  │
  │  2. (state, blind) = OPRF.Blind(pw_hash)│
  │                                         │
  │  ─────────── blind ──────────────────►  │
  │                                         │  3. eval = OPRF.Evaluate(seed, blind)
  │  ◄────────── eval ───────────────────   │
  │                                         │
  │  4. rw = OPRF.Finalize(state, eval)     │
  │  5. envelope = Encrypt(rw, client_keys) │
  │                                         │
  │  ─────────── record ─────────────────►  │  6. Store(user_id, record)
```

### Login (KE1 → KE2 → KE3)

```
Client                                   Server
  │                                        │
  │  KE1: OPRF blind + ephemeral Kyber pk  │
  │  ─────────────────────────────────────►│
  │                                        │  KE2: OPRF eval + Kyber ct + MAC
  │  ◄─────────────────────────────────────│       + Dilithium signature
  │                                        │
  │  Verify signature, recover envelope    │
  │  Derive session key                    │
  │                                        │
  │  KE3: Client MAC                       │
  │  ─────────────────────────────────────►│  Verify MAC, derive session key
```

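Both sides of the login flow end with a MAC check, and such checks are normally done without an early exit on the first differing byte. The sketch below is a std-only illustration of that idea; `ct_eq` is a hypothetical helper, and production code would more likely rely on the `subtle` crate already listed in `Cargo.toml`, since a plain loop gives no guarantee against compiler optimizations.

```rust
/// Compare two byte strings without returning at the first mismatch.
/// Hypothetical helper for illustration; not taken from the crate.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        // Tag lengths are public for fixed-size MACs, so this early
        // return does not depend on secret data.
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate every differing bit instead of branching per byte.
        diff |= x ^ y;
    }
    diff == 0
}
```

A verifier would call `ct_eq(&received_mac, &expected_mac)` and abort the session when it returns `false`.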
## Security Analysis

### Threat Model

| Adversary | Protection |
|-----------|------------|
| Passive network | Kyber KEM encryption |
| Active network | Dilithium signatures + MACs |
| Malicious server | Ring-LPR OPRF (server never sees the password; no pre-computation attacks) |
| Quantum computer | All primitives are post-quantum |

### Security Properties

1. **Password Obliviousness**: Server learns nothing about password during OPRF evaluation
2. **Forward Secrecy**: Ephemeral Kyber keys provide FS
3. **Server Compromise Resistance**: Without the server's OPRF key, the OPRF output cannot be computed offline, so password guesses cannot be precomputed
4. **Quantum Resistance**: Ring-LPR, Kyber, Dilithium all resist quantum attacks

### Known Limitations

1. **Communication Overhead**: ~2-4KB messages (vs ~200 bytes for EC-based OPAQUE)
2. **Computational Cost**: ~10-20ms OPRF (vs ~1ms for DH-based)

## Verifiable OPRF (VOPRF) Extension

The implementation includes a **Verifiable OPRF** that allows clients to verify the server used a consistent, previously committed key.

### VOPRF Construction

```
Server Setup:
  1. Generate key k ∈ R₂
  2. Sample nonce r ←$ {0,1}^256
  3. Commit: c = H₃(k || r)
  4. Publish commitment c

Verifiable Evaluation:
  1. Compute y = F_k(x)
  2. Generate ZK proof π:
     - Sample mask m with small coefficients
     - Compute t = H(m || m·H₁(x))
     - Challenge e = H(c || t || x || y)
     - Response z = m + e·k (with rejection sampling)
  3. Return (y, π)

Client Verification:
  1. Check ||z||_∞ < B (bounded response)
  2. Recompute challenge e' = H(c || t || x || y)
  3. Verify e' = e
```

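The first client check, `||z||_∞ < B`, can be sketched as follows. This is only an illustration: the plan does not state the modulus or the bound used by `oprf/voprf.rs`, so `q` and `bound` are parameters supplied by the caller, and the function names are hypothetical.

```rust
/// Centered representative of c mod q, in the range (-q/2, q/2].
fn centered(c: u32, q: u32) -> i64 {
    let c = (c % q) as i64;
    let q = q as i64;
    if c > q / 2 { c - q } else { c }
}

/// Infinity norm of a ring element given by its coefficients mod q.
fn inf_norm(z: &[u32], q: u32) -> i64 {
    z.iter().map(|&c| centered(c, q).abs()).max().unwrap_or(0)
}

/// Verifier step 1: accept the response only if ||z||_∞ < bound.
/// `q` and `bound` are assumptions, not values taken from the crate.
fn response_in_bounds(z: &[u32], q: u32, bound: i64) -> bool {
    inf_norm(z, q) < bound
}
```

The centering step matters: a coefficient stored as `q - 1` represents -1 and must count as small, not as a value close to `q`.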
### Sigma Protocol Security

| Property | Guarantee |
|----------|-----------|
| **Completeness** | Honest prover always convinces verifier |
| **Soundness** | Cheating prover detected with prob ≥ 1 - 2^(-128) |
| **Zero-Knowledge** | Proof reveals nothing about k |
| **Non-Interactive** | Fiat-Shamir transform in ROM |

Based on Lyubashevsky's "Fiat-Shamir with Aborts" (2009, 2012).

## UC Security Proof

The full UC security proof is documented in `SECURITY_PROOF.md`. Key results:

### Ideal Functionalities

- **F_VOPRF**: Verifiable OPRF with key commitment
- **F_AKE**: Authenticated Key Exchange
- **F_aPAKE**: Asymmetric Password-Authenticated Key Exchange

### Main Theorem

The opaque-lattice protocol UC-realizes F_aPAKE assuming:
1. Ring-LPR is pseudorandom
2. ML-KEM is IND-CCA2 secure
3. ML-DSA is EUF-CMA secure
4. AEAD is IND-CPA + INT-CTXT secure

### Security Bounds

```
Adv(A) ≤ q_pwd · Adv_LPR + q_KEM · Adv_IND-CCA + q_SIG · Adv_EUF-CMA + q_AEAD · Adv_AEAD + negl(λ)
```

### Proof Technique

Game-hopping sequence:
1. Game 0: Real protocol
2. Game 1: Random oracle instrumentation
3. Game 2: OPRF simulation (Ring-LPR → random)
4. Game 3: KEM simulation (IND-CCA)
5. Game 4: Signature simulation (EUF-CMA)
6. Game 5: Envelope simulation (AEAD)
7. Game 6: Password test restriction
8. Game 7: Ideal execution with F_aPAKE

## Module Structure

```
opaque-lattice/
├── Cargo.toml
├── PLAN.md
├── papers/                 # Research references (65 PDFs)
└── src/
    ├── lib.rs
    ├── error.rs
    ├── kdf.rs              # HKDF-SHA512
    ├── mac.rs              # HMAC-SHA512
    ├── types.rs            # Protocol message types
    ├── registration.rs     # Registration protocol
    ├── login.rs            # Login protocol (KE1/KE2/KE3)
    ├── oprf/
    │   ├── mod.rs
    │   ├── ring.rs         # Ring arithmetic R = Z[x]/(x^n+1)
    │   ├── ot.rs           # Oblivious transfer
    │   ├── ring_lpr.rs     # Ring-LPR OPRF (OT-based, Shan et al.)
    │   ├── fast_oprf.rs    # Fast OPRF (OT-free, experimental)
    │   ├── voprf.rs        # Verifiable OPRF with ZK proofs
    │   └── hybrid.rs       # [DEPRECATED] Old hybrid OPRF
    ├── ake/
    │   ├── mod.rs
    │   ├── kyber.rs        # Kyber768 KEM
    │   └── dilithium.rs    # Dilithium3 signatures
    └── envelope/
        └── mod.rs          # Envelope store/recover
```

## Dependencies

```toml
[dependencies]
# Post-quantum crypto
pqcrypto-kyber = { version = "0.8", features = ["serialization"] }
pqcrypto-dilithium = { version = "0.5", features = ["serialization"] }
pqcrypto-traits = "0.3"

# Symmetric crypto & hashing
sha2 = "0.10"
sha3 = "0.10"
hkdf = "0.12"
hmac = "0.12"
argon2 = "0.5"        # Password hardening

# Utilities
rand = "0.8"
serde = { version = "1.0", features = ["derive"] }
zeroize = { version = "1", features = ["derive"] }
thiserror = "2"
subtle = "2.5"        # Constant-time operations
```

## References

1. RFC 9807 - The OPAQUE Augmented PAKE Protocol
2. Jarecki, Krawczyk, Xu - OPAQUE: An Asymmetric PAKE (Eurocrypt 2018)
3. **Shan et al. - Fast post-quantum PSI from Ring-LPR OPRF (2025)** ← Primary OPRF reference
4. Basso - A Post-Quantum Oblivious PRF from Isogenies (SAC 2023)
5. Faller - Composable OPRFs via Garbled Circuits (2022)
6. NIST FIPS 203 - ML-KEM (Kyber)
7. NIST FIPS 204 - ML-DSA (Dilithium)

## Fast OPRF Construction (Experimental)

### Overview

The `oprf/fast_oprf.rs` module implements an **experimental OT-free lattice OPRF** based on Ring-LWE. This is a novel construction that eliminates the 256 OT instances required by Ring-LPR.

### Construction ("Structured Error OPRF")

```
Public Parameters: A ∈ R_q (random ring element, CRS-style)
Server: k (small secret), e_k (small error), B = A*k + e_k (published)

Client Blind(password):
  s = H_small(password)              // Small ring element
  e = H_small(password || "error")   // Small error term
  C = A*s + e                        // Ring-LWE sample
  Send C to server

Server Evaluate(k, C):
  V = k * C = k*A*s + k*e
  h = ReconciliationHelper(V)
  Return (V, h)

Client Finalize(s, B, V, h):
  W = s * B = s*A*k + s*e_k
  // V - W = k*e - s*e_k (small!)
  bits = Reconcile(W, h)
  Return H(bits)
```

### Security Analysis

| Property | Analysis |
|----------|----------|
| **Obliviousness** | Under Ring-LWE, `C = A*s + e` is indistinguishable from uniform for fresh small s and e. Because s and e are derived deterministically from the password, this hides only high-entropy inputs: a server can still test guesses by recomputing C. |
| **Pseudorandomness** | Output depends on k*A*s. Without k, output is pseudorandom under Ring-LPR. |
| **Determinism** | Both s and e derived deterministically from password → same password = same output. |
| **No OT Required** | Algebraic structure replaces OT: reconciliation error `V - W = k*e - s*e_k` is small enough to correct. |

### Comparison with Ring-LPR OPRF

| Aspect | Ring-LPR (ring_lpr.rs) | Fast OPRF (fast_oprf.rs) |
|--------|------------------------|--------------------------|
| **OT Instances** | 256 Kyber KEM operations | **0** |
| **Estimated Time** | ~8-16ms | **<1ms** (sub-millisecond) |
| **Message Size** | ~50-100KB (OT setup) | **~2KB** (2 ring elements + helper) |
| **Security Basis** | Ring-LPR + OT | Ring-LWE |
| **Obliviousness** | Provably oblivious (OT) | Computationally hiding (LWE) |
| **Paper Reference** | Shan et al. 2025 | Novel construction |

### Relationship to Literature

This construction is inspired by:
1. **VOLE from Ring-LWE** (de Castro et al. 2021): Uses circuit privacy in homomorphic encryption for obliviousness
2. **LPR Rounding**: Similar to Learning Parity with Rounding but applied differently
3. **Key Exchange Reconciliation**: Error correction technique from Peikert's key exchange

The key insight is that:
- Client's `C = A*s + e` is an LWE sample (hiding s under Ring-LWE)
- Server's `V = k*C` computes `k*A*s + k*e`
- Client's `W = s*B = s*A*k + s*e_k`
- The difference `V - W = k*e - s*e_k` is small (product of small elements)
- Reconciliation helper allows recovery of consistent bits from this near-equality

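The near-equality between V and W can be checked numerically. The sketch below is a toy verification of the algebra only, assuming a schoolbook negacyclic product and a tiny ring dimension; it does not implement hashing, reconciliation, or anything from `oprf/fast_oprf.rs`, and all function names are hypothetical. The modulus q = 12289 is the one quoted in this plan's parameter list.

```rust
const Q: i64 = 12289; // modulus quoted in the plan's parameter list

/// Negacyclic product in Z_Q[x]/(x^n + 1), schoolbook for clarity.
fn mul(a: &[i64], b: &[i64]) -> Vec<i64> {
    let n = a.len();
    let mut out = vec![0i64; n];
    for i in 0..n {
        for j in 0..n {
            let p = a[i] * b[j] % Q;
            if i + j < n {
                out[i + j] = (out[i + j] + p).rem_euclid(Q);
            } else {
                // x^n = -1: wrapped terms are subtracted.
                out[i + j - n] = (out[i + j - n] - p).rem_euclid(Q);
            }
        }
    }
    out
}

fn add(a: &[i64], b: &[i64]) -> Vec<i64> {
    a.iter().zip(b).map(|(x, y)| (x + y).rem_euclid(Q)).collect()
}

fn sub(a: &[i64], b: &[i64]) -> Vec<i64> {
    a.iter().zip(b).map(|(x, y)| (x - y).rem_euclid(Q)).collect()
}

/// Centered representative in (-Q/2, Q/2].
fn centered(c: i64) -> i64 {
    let c = c.rem_euclid(Q);
    if c > Q / 2 { c - Q } else { c }
}

/// Returns (V - W, k*e - s*e_k) for the given toy inputs.
fn reconciliation_gap(
    a: &[i64], k: &[i64], e_k: &[i64], s: &[i64], e: &[i64],
) -> (Vec<i64>, Vec<i64>) {
    let b = add(&mul(a, k), e_k); // server public value B = A*k + e_k
    let c = add(&mul(a, s), e);   // client message      C = A*s + e
    let v = mul(k, &c);           // server side         V = k*C
    let w = mul(s, &b);           // client side         W = s*B
    (sub(&v, &w), sub(&mul(k, e), &mul(s, e_k)))
}
```

For coefficients bounded by 3 in absolute value, each coefficient of `V - W` is at most `2 * n * 9` in absolute value, which is far below q/2 for n = 256.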
### Security Assumptions

1. **Ring-LWE**: `C = A*s + e` computationally indistinguishable from uniform
2. **Reconciliation Security**: Helper data doesn't leak significant information about V
3. **Parameters**: n=256, q=12289, ||s||∞, ||e||∞ ≤ 3

### Limitations & Open Questions

1. **Not in Literature**: This construction may be novel and requires peer review
2. **Reconciliation Accuracy**: Currently ~95-99% bit agreement; the final hash needs every reconciled bit to match, so this must reach exact agreement before the output is reliably deterministic
3. **Verifiability**: No ZK proof mechanism (unlike VOPRF)
4. **Security Proof**: Formal UC security proof needed

### Benchmarks (TODO)

```
Ring-LPR OPRF (OT-based):
  - Client blind: TBD ms
  - Server evaluate: TBD ms
  - Client finalize: TBD ms
  - Total: ~10-20ms

Fast OPRF (OT-free):
  - Client blind: TBD μs
  - Server evaluate: TBD μs
  - Client finalize: TBD μs
  - Total: <1ms

Speedup: ~10-50x (estimated)
```

## Changelog

- **v0.4.0**: Added Fast OPRF (OT-free experimental construction)
  - Novel Ring-LWE based OPRF without Oblivious Transfer
  - ~10-50x faster than Ring-LPR OPRF
  - Needs security peer review
- **v0.3.0**: Added Verifiable OPRF (VOPRF) and UC Security Proof
  - Implemented lattice-based sigma protocol (Lyubashevsky-style)
  - Key commitment scheme with hash-based binding
  - Full UC security proof in SECURITY_PROOF.md
  - 10 new VOPRF tests
- **v0.2.0**: Replaced hybrid OPRF with true Ring-LPR OPRF
- **v0.1.0**: Initial implementation with hybrid Kyber+HMAC OPRF
590
SECURITY_PROOF.md
Normal file
@@ -0,0 +1,590 @@
# UC Security Proof for Lattice-Based OPAQUE

This document provides a formal security proof for the opaque-lattice implementation in the Universal Composability (UC) framework.

## Table of Contents

1. [Overview](#1-overview)
2. [Preliminaries](#2-preliminaries)
3. [Ideal Functionalities](#3-ideal-functionalities)
4. [Protocol Description](#4-protocol-description)
5. [Simulator Construction](#5-simulator-construction)
6. [Security Proof](#6-security-proof)
7. [Concrete Security Bounds](#7-concrete-security-bounds)

---

## 1. Overview

### 1.1 Protocol Summary

opaque-lattice implements a post-quantum secure OPAQUE protocol using:
- **Ring-LPR OPRF**: Oblivious PRF based on Ring Learning Parity with Rounding
- **ML-KEM (Kyber768)**: Key encapsulation for authenticated key exchange
- **ML-DSA (Dilithium3)**: Digital signatures for server authentication
- **VOPRF Extension**: Verifiable OPRF with Lyubashevsky-style sigma protocol

### 1.2 Security Goals

We prove the protocol realizes the ideal functionality F_aPAKE (asymmetric Password-Authenticated Key Exchange) with the following properties:

| Property | Description |
|----------|-------------|
| **Password Obliviousness** | Server learns nothing about password during OPRF |
| **Forward Secrecy** | Compromise of long-term keys doesn't reveal past session keys |
| **Server Compromise Resistance** | No pre-computation attacks: a dictionary attack can only start after the server is compromised |
| **Quantum Resistance** | Security holds against quantum adversaries |
| **Verifiability** | Client can verify server used consistent OPRF key |

### 1.3 Security Model

We work in the UC framework of Canetti [Can01] with:
- **Global Random Oracle Model (GROM)**: Hash functions H₁, H₂, H₃ modeled as random oracles
- **Adaptive Corruptions**: Adversary can corrupt parties at any point
- **Static Compromise**: Adversary learns all internal state upon corruption

---

## 2. Preliminaries

### 2.1 Notation

| Symbol | Meaning |
|--------|---------|
| λ | Security parameter (128 bits) |
| R | Ring Z[x]/(x^n + 1) where n = 256 |
| R_q | Ring R modulo q (q = 4 in our construction) |
| ⌊·⌋₁ | Deterministic rounding: ⌊x⌋₁ = ⌊x/2⌋ mod 2 |
| k ←$ S | Sample k uniformly from set S |
| negl(λ) | Negligible function in λ |
| poly(λ) | Polynomial function in λ |

### 2.2 Computational Assumptions

**Definition 2.1 (Ring-LPR Problem)**
For a ∈ R₂, s ∈ R₂, the Ring Learning Parity with Rounding assumption states that:
```
(a, ⌊a·s mod 4⌋₁) ≈_c (a, ⌊u⌋₁)
```
where u ←$ R₄ is uniformly random.

**Definition 2.2 (Dihedral Coset Problem)**
Given quantum states encoding cosets of a hidden subgroup in the dihedral group D_n, find the hidden subgroup generator. Assumed time complexity: O(e^n), even for quantum computers.

**Theorem 2.1 (Security Reduction Chain)**
```
Ring-LPR → LPR → LWR → G-EDCP → DCP
```
Each reduction is polynomial-time. DCP is believed to be quantum-hard, with assumed time complexity O(e^n).

**Definition 2.3 (ML-KEM Security)**
ML-KEM (Kyber768) is IND-CCA2 secure under the Module-LWE assumption.

**Definition 2.4 (ML-DSA Security)**
ML-DSA (Dilithium3) is EUF-CMA secure under the Module-LWE and Module-SIS assumptions.

### 2.3 Building Blocks

**PRF Construction (Ring-LPR)**
```
F_k(x) = H₂(⌊k · H₁(x) mod 4⌋₁)
```
where:
- H₁: {0,1}* → R₂ (hash-to-ring)
- H₂: R₁ → {0,1}^512 (ring-to-output hash)
- k ∈ R₂ (secret key)

**Key Commitment**
```
Commit(k; r) = H₃(k || r)
```
where r ←$ {0,1}^256 is randomness.

---

## 3. Ideal Functionalities

### 3.1 Ideal VOPRF Functionality F_VOPRF

```
Functionality F_VOPRF

Parameters: Output length l, security parameter λ

Initialization:
  - On (Init, sid) from server S:
    - If first Init for sid: set T_sid(·) = ⊥, tx(S) = 0
    - Send (Init, sid, S) to adversary A

  - On (Param, S, π) from A:
    - If params[S] undefined: set params[S] = π

Key Commitment:
  - On (Commit, sid) from S:
    - Sample random table key k_sid
    - Store commitment c = H(k_sid)
    - Send (Committed, sid, c) to A

Offline Evaluation:
  - On (OfflineEval, sid, c, p) from party P:
    - If params[i] = c for some server i:
      - If T_sid(i, p) undefined: T_sid(i, p) ←$ {0,1}^l
      - Send (OfflineEval, sid, T_sid(i, p)) to P
    - Else: ignore

Online Evaluation:
  - On (Eval, sid, ssid, S, p) from user U:
    - Record ⟨ssid, S, U, p⟩
    - Send (Eval, sid, ssid, U, S) to A

  - On (SndrComplete, sid, ssid) from S:
    - Increment tx(S)
    - Send (SndrComplete, sid, ssid, S) to A

  - On (RcvCmplt, sid, ssid, U, π) from A:
    - Retrieve ⟨ssid, S, U, p⟩
    - Ignore if:
      - No such record exists
      - Honest server S has params[S] = π but tx(S) = 0
      - S honest but π ≠ params[S]
    - Let i be the index with params[i] = π
    - If T_sid(i, p) undefined: T_sid(i, p) ←$ {0,1}^l
    - Send (Eval, sid, ssid, T_sid(i, p)) to U
    - Decrement tx(S) if params[S] = π

Verification:
  - On (Verify, sid, c, p, y, proof) from U:
    - Check if y = T_sid(i, p) for params[i] = c
    - Send (Verified, sid, valid/invalid) to U
```

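The bookkeeping in F_VOPRF (a lazily sampled table T and a ticket counter tx) can be modelled in a few lines. This is a toy sketch under loud assumptions: `IdealVoprf` and its methods are hypothetical names, outputs are `u64` instead of l-bit strings, only the honest-server path is covered, and the caller-supplied `sample` closure merely stands in for uniform sampling.

```rust
use std::collections::HashMap;

/// Toy model of the F_VOPRF state: table T and per-server ticket counter tx.
/// `sample` is a stand-in for "←$ {0,1}^l"; a real model would sample uniformly.
struct IdealVoprf<F: FnMut() -> u64> {
    table: HashMap<(String, Vec<u8>), u64>, // (server, input p) -> output
    tx: HashMap<String, u32>,               // remaining evaluation tickets
    sample: F,
}

impl<F: FnMut() -> u64> IdealVoprf<F> {
    fn new(sample: F) -> Self {
        IdealVoprf { table: HashMap::new(), tx: HashMap::new(), sample }
    }

    /// SndrComplete: the server grants one more evaluation.
    fn sndr_complete(&mut self, server: &str) {
        *self.tx.entry(server.to_string()).or_insert(0) += 1;
    }

    /// RcvCmplt for an honest server: spend one ticket and return T(server, p),
    /// sampling the entry on first use. Returns None when tx(server) = 0.
    fn rcv_complete(&mut self, server: &str, p: &[u8]) -> Option<u64> {
        let tickets = self.tx.get_mut(server)?;
        if *tickets == 0 {
            return None;
        }
        *tickets -= 1;
        let key = (server.to_string(), p.to_vec());
        if !self.table.contains_key(&key) {
            let fresh = (self.sample)();
            self.table.insert(key.clone(), fresh);
        }
        Some(self.table[&key])
    }
}
```

The ticket counter is what limits an adversary to one evaluation per completed server interaction, while the table guarantees that repeated queries on the same input return the same value.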
### 3.2 Ideal AKE Functionality F_AKE

```
Functionality F_AKE

Initialization:
  - Maintain session table sessions[·]
  - Maintain corruption set Corrupt

Key Exchange:
  - On (NewSession, sid, P, P', role) from party P:
    - Record (sid, P, P', role, ⊥)
    - Send (NewSession, sid, P, P', role) to A

  - On (TestPwd, sid, P, pw') from A:
    - If P ∈ Corrupt: return ⊥
    - Retrieve (sid, P, P', role, pw)
    - If pw' = pw: mark session compromised
    - Return (compromised/not-compromised)

  - On (NewKey, sid, P, sk) from A:
    - Retrieve (sid, P, P', role, pw)
    - If P' ∈ Corrupt or session compromised:
      - Output (sid, sk) to P
    - Else if (sid, P', P, role', k') exists with k' ≠ ⊥:
      - Output (sid, k') to P
    - Else:
      - Sample k ←$ {0,1}^λ
      - Record k, output (sid, k) to P
```

### 3.3 Ideal aPAKE Functionality F_aPAKE

```
Functionality F_aPAKE

Parameters: Security parameter λ

Registration:
  - On (Register, sid, U, S, pw) from user U:
    - Compute file ← F_VOPRF.Eval(pw)
    - Store (U, file) for server S
    - Send (Registered, sid, U) to S

Login:
  - On (Login, sid, U, S, pw) from U:
    - Initiate OPRF evaluation with S
    - If pw matches stored file:
      - Derive session key sk
      - Output (LoginComplete, sid, sk) to U, S
    - Else:
      - Output (LoginFailed, sid) to U

Server Compromise:
  - On (Corrupt, S) from A:
    - Add S to Corrupt set
    - Send all stored files to A
    - Note: Offline guessing becomes possible only from this point on (no pre-computation)
```

---

## 4. Protocol Description

### 4.1 Registration Protocol

```
Client(pw)                                    Server(oprf_key)
═══════════════════════════════════════════════════════════════
1. salt ←$ {0,1}^256
2. pw_hash = Argon2id(pw, salt)
3. (state, blind) = OPRF.Blind(pw_hash)
                    ─────── blind ───────────►
                                              4. eval = OPRF.Eval(oprf_key, blind)
                    ◄────── eval ─────────
5. rw = OPRF.Finalize(state, eval)
6. (pk_U, sk_U) = KEM.KeyGen()
7. (pk_auth, sk_auth) = SIG.KeyGen()
8. envelope = AEAD.Enc(rw, sk_U || sk_auth)
9. record = (pk_U, pk_auth, envelope, salt)
                    ─────── record ──────────►
                                              10. Store(U, record)
```

### 4.2 Login Protocol (KE1 → KE2 → KE3)

```
Client(pw)                                    Server(oprf_key, record)
═══════════════════════════════════════════════════════════════
KE1: Client → Server
1. pw_hash = Argon2id(pw, record.salt)
2. (state, blind) = OPRF.Blind(pw_hash)
3. (ek_C, dk_C) = KEM.KeyGen()
                    ───── KE1: (blind, ek_C) ─────►

KE2: Server → Client
                                              4. eval = OPRF.Eval(oprf_key, blind)
                                              5. (ct, ss_S) = KEM.Encap(ek_C)
                                              6. K_session = KDF(ss_S, transcript)
                                              7. mac_S = MAC(K_session, transcript)
                                              8. sig = SIG.Sign(sk_S, transcript)
                    ◄─── KE2: (eval, ct, mac_S, sig, envelope) ───

KE3: Client → Server
9. Verify sig with pk_S
10. rw = OPRF.Finalize(state, eval)
11. (sk_U, sk_auth) = AEAD.Dec(rw, envelope)
12. ss_C = KEM.Decap(dk_C, ct)
13. K_session = KDF(ss_C, transcript)
14. Verify mac_S
15. mac_C = MAC(K_session, transcript)
                    ───── KE3: mac_C ─────►
                                              16. Verify mac_C
17. Output K_session                          Output K_session
```

### 4.3 VOPRF Extension

For verifiable evaluation, the server additionally does the following:

```
Server with committed key (k, c, r):
  1. Compute eval = F_k(blind)
  2. Generate ZK proof π proving:
     - Knowledge of k such that c = Commit(k; r)
     - eval = F_k(blind)
  3. Return (eval, π)

Client:
  1. Verify π against commitment c
  2. If valid: proceed with finalization
  3. If invalid: abort
```

---

## 5. Simulator Construction

### 5.1 Simulator for F_VOPRF

```
Simulator SIM_VOPRF

On (Init, sid, S) from F_VOPRF:
  - Sample random key k_sim
  - Compute c = Commit(k_sim; r_sim)
  - Send (Param, S, c) to F_VOPRF
  - Record (S, k_sim, r_sim, c)

On (Eval, sid, ssid, U, S) from F_VOPRF:
  If S honest:
    - Wait for adversary to deliver
    - On delivery: send (SndrComplete, sid, ssid) to F_VOPRF
    - Send (RcvCmplt, sid, ssid, U, c_S) to F_VOPRF

  If S corrupted:
    - Extract adversary's evaluation blind_A
    - Query F_VOPRF for T_sid(c_A, blind_A)
    - Program H₂ to output T_sid value
    - Simulate protocol messages

On H₂ query (x, y, c) from A:
  If ∃ honest S with params[S] = c:
    - Send (OfflineEval, sid, c, H₁⁻¹(x)) to F_VOPRF
    - Receive ρ from F_VOPRF
    - Return ρ
  Else:
    - Return random value or use table
```

### 5.2 Simulator for F_aPAKE

```
Simulator SIM_aPAKE

Registration Simulation:
  On (Register, sid, U, S, pw) from F_aPAKE:
    - Simulate OPRF blind message
    - Receive adversarial evaluation
    - Extract any password guesses
    - Complete registration

Login Simulation (Honest Client, Honest Server):
  - Simulate KE1 with random blind
  - Simulate KE2 with random evaluation
  - On (TestPwd, sid, U, pw') from A:
    - Forward to F_aPAKE
    - If compromised: program keys accordingly
  - Generate session key from F_aPAKE
  - Program MAC/KDF oracles consistently

Login Simulation (Corrupted Server):
  - Extract server's OPRF key from state
  - Use real OPRF evaluation
  - Monitor password test queries
  - Enforce "one online guess per session"

Login Simulation (Corrupted Client):
  - Extract client's password from state
  - Use real protocol execution
  - Provide adversary with session key
```

---

## 6. Security Proof

### 6.1 Main Theorem

**Theorem 6.1 (UC Security)**
The opaque-lattice protocol UC-realizes F_aPAKE in the (F_VOPRF, F_RO)-hybrid model, assuming:
1. Ring-LPR is pseudorandom (Definition 2.1)
2. ML-KEM is IND-CCA2 secure
3. ML-DSA is EUF-CMA secure
4. AEAD is IND-CPA and INT-CTXT secure
5. HKDF is a secure PRF

The advantage of any PPT adversary A in distinguishing real from ideal execution is:
```
Adv(A) ≤ q_pwd · Adv_LPR + q_KEM · Adv_IND-CCA + q_SIG · Adv_EUF-CMA
       + q_AEAD · Adv_AEAD + q_sessions · negl(λ)
```
where q_* denotes the number of respective queries.

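To get a feel for the bound in Theorem 6.1, it can be evaluated numerically. The helper below is a minimal sketch: `total_advantage` is a hypothetical name, the query counts are illustrative assumptions, and the per-query advantage of 2^(-128) is taken from the λ = 128 setting used elsewhere in this document.

```rust
/// Sum a bound of the form Σ q_i · Adv_i + negl, in floating point.
/// Each pair is (number of queries, per-query advantage).
fn total_advantage(terms: &[(f64, f64)], negl: f64) -> f64 {
    terms.iter().map(|&(q, adv)| q * adv).sum::<f64>() + negl
}
```

For example, with four terms of 2^30 queries each at advantage 2^(-128) and a zero negligible term, the total is 4 · 2^30 · 2^(-128) = 2^(-96).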
|
||||
### 6.2 Proof by Game Sequence
|
||||
|
||||
**Game 0 (Real Protocol)**
|
||||
The real execution of opaque-lattice with adversary A.
|
||||
|
||||
**Game 1 (Random Oracle Instrumentation)**
|
||||
Replace hash functions H₁, H₂, H₃ with random oracles maintained by simulator.
|
||||
- Indistinguishable by random oracle assumption
|
||||
|
||||
**Game 2 (OPRF Simulation)**
|
||||
Replace real OPRF evaluations with queries to F_VOPRF.
|
||||
- For honest server: outputs are random (Ring-LPR pseudorandomness)
|
||||
- For corrupted server: extract key, compute real evaluation
|
||||
|
||||
*Lemma 6.1:* |Pr[Game 2] - Pr[Game 1]| ≤ q_oprf · Adv_LPR
|
||||
|
||||
**Game 3 (KEM Simulation)**
|
||||
Replace KEM encapsulation with F_KEM ideal functionality.
|
||||
- Honest parties: shared secret is random
|
||||
- Corrupted parties: extract/inject values
|
||||
|
||||
*Lemma 6.2:* |Pr[Game 3] - Pr[Game 2]| ≤ q_kem · Adv_IND-CCA
|
||||
|
||||
**Game 4 (Signature Simulation)**
|
||||
Replace signatures with F_SIG ideal functionality.
|
||||
- Verify signatures using committed public key
|
||||
- Reject any forgeries
|
||||
|
||||
*Lemma 6.3:* |Pr[Game 4] - Pr[Game 3]| ≤ q_sig · Adv_EUF-CMA
|
||||
|
||||
**Game 5 (Envelope Simulation)**
|
||||
Replace AEAD with ideal encryption.
|
||||
- Envelope contents are hidden until rw is known
|
||||
- Tampering detected by INT-CTXT
|
||||
|
||||
*Lemma 6.4:* |Pr[Game 5] - Pr[Game 4]| ≤ q_aead · Adv_AEAD
|
||||
|
||||
**Game 6 (Password Test Restriction)**
|
||||
Enforce that adversary must make explicit TestPwd query to F_aPAKE.
|
||||
- Each online session allows at most one password test
|
||||
- Offline dictionary attack requires OPRF evaluation
|
||||
|
||||
*Lemma 6.5:* |Pr[Game 6] - Pr[Game 5]| ≤ negl(λ)
|
||||
|
||||
**Game 7 (Ideal Execution)**
|
||||
Execute with F_aPAKE and simulator SIM.
|
||||
- Session keys are random unless compromised
|
||||
- Password never revealed to honest parties
|
||||
|
||||
*Lemma 6.6:* Game 6 ≡ Game 7
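
Chaining Lemmas 6.1 to 6.6 with the triangle inequality gives the shape of the bound in Section 6.1. The derivation below is a sketch under two assumptions: the Game 0 to Game 1 hop contributes nothing beyond the random-oracle idealization, and the Game 5 to Game 6 term is written as a single negl(λ); the per-session factor q_sessions from the theorem statement is not derived here.

```latex
\begin{align*}
\bigl|\Pr[\text{Game 0}] - \Pr[\text{Game 7}]\bigr|
  &\le \sum_{i=0}^{6} \bigl|\Pr[\text{Game } i] - \Pr[\text{Game } i{+}1]\bigr| \\
  &\le q_{\mathrm{oprf}} \cdot \mathrm{Adv}_{\mathrm{LPR}}
     + q_{\mathrm{kem}} \cdot \mathrm{Adv}_{\mathrm{IND\text{-}CCA}}
     + q_{\mathrm{sig}} \cdot \mathrm{Adv}_{\mathrm{EUF\text{-}CMA}} \\
  &\quad + q_{\mathrm{aead}} \cdot \mathrm{Adv}_{\mathrm{AEAD}}
     + \mathrm{negl}(\lambda)
\end{align*}
```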

### 6.3 Verifiability Proof

**Theorem 6.2 (VOPRF Soundness)**
For any PPT adversary A, the probability that A produces a valid proof π for an evaluation y = F_k(x) where k differs from the committed key is negligible.

*Proof Sketch:*
1. By binding property of commitment: A cannot open to different k
2. By soundness of sigma protocol: A cannot forge proofs
3. By Fiat-Shamir security: Non-interactive proofs are sound in ROM

**Theorem 6.3 (VOPRF Zero-Knowledge)**
The sigma protocol proof reveals nothing about k beyond the validity of the statement.

*Proof Sketch:*
1. Construct simulator S that generates accepting proofs without k
2. S samples response z uniformly, computes mask m = z - e·k_dummy
3. By rejection sampling analysis: real and simulated distributions are statistically close
4. Distinguishing advantage bounded by 2^(-λ)

---

## 7. Concrete Security Bounds

### 7.1 Parameter Selection

| Parameter | Value | Security Level |
|-----------|-------|----------------|
| Ring dimension n | 256 | 128-bit post-quantum |
| Ring modulus q | 4 | Minimal for rounding |
| KEM security | Kyber768 | NIST Level 3 |
| Signature security | Dilithium3 | NIST Level 3 |
| Hash output | 512 bits | Collision resistance |
| Commitment nonce | 256 bits | Binding security |

### 7.2 Concrete Advantages

Assuming security parameter λ = 128:

| Component | Advantage Bound |
|-----------|-----------------|
| Ring-LPR PRF | 2^(-128) (DCP hardness) |
| ML-KEM IND-CCA | 2^(-128) (MLWE hardness) |
| ML-DSA EUF-CMA | 2^(-128) (MLWE+SIS hardness) |
| AEAD (AES-GCM) | 2^(-128) |
| HKDF-SHA512 | 2^(-256) |
| Commitment binding | 2^(-128) (collision resistance) |
| ZK soundness | 2^(-128) (sigma protocol) |

### 7.3 Attack Complexity

| Attack | Complexity | Mitigation |
|--------|------------|------------|
| Offline dictionary | Requires OPRF oracle | One guess per session |
| Online brute force | O(2^128) sessions | Rate limiting |
| Quantum OPRF attack | O(e^256) | DCP hardness |
| Server compromise | No offline attack | OPRF obliviousness |
| Forward secrecy break | O(2^128) per session | Ephemeral KEM keys |

---

## References

[Can01] R. Canetti. "Universally Composable Security: A New Paradigm for Cryptographic Protocols." FOCS 2001.

[JKX18] S. Jarecki, H. Krawczyk, J. Xu. "OPAQUE: An Asymmetric PAKE Protocol Secure Against Pre-Computation Attacks." EUROCRYPT 2018.

[Lyu09] V. Lyubashevsky. "Fiat-Shamir with Aborts: Applications to Lattice and Factoring-Based Signatures." ASIACRYPT 2009.

[Lyu12] V. Lyubashevsky. "Lattice Signatures without Trapdoors." EUROCRYPT 2012.

[Sha25] Z. Shan et al. "Fast Post-Quantum Private Set Intersection from Oblivious Pseudorandom Function for Mobile Social Networks." J. Syst. Arch. 160, 2025.

[Alb21] M. Albrecht et al. "Round-optimal Verifiable Oblivious Pseudorandom Functions from Ideal Lattices." PKC 2021.

[Fal22] J. Faller. "Composable OPRFs via Garbled Circuits." Master's Thesis, 2022.

[RFC9807] RFC 9807. "The OPAQUE Augmented Password-Authenticated Key Exchange (aPAKE) Protocol." 2025.

---

## Appendix A: Proof Details

### A.1 Ring-LPR Pseudorandomness

**Lemma A.1** For uniformly random k ∈ R₂ and arbitrary x ∈ R₂:
```
{(x, F_k(x))} ≈_c {(x, U)}
```
where U is uniform random output.

*Proof:*
1. F_k(x) = H₂(⌊k·x mod 4⌋₁)
2. By Ring-LPR assumption: ⌊k·x mod 4⌋₁ ≈_c ⌊u⌋₁ for random u
3. H₂ is a random oracle: output is uniformly distributed
4. Combining: F_k(x) is computationally indistinguishable from random
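
The inner map ⌊k·x mod 4⌋₁ from step 1 can be sketched as follows. This is a toy illustration, not the crate's implementation: it assumes the ring is Z_4[x]/(x^n + 1), reads ⌊·⌋₁ as "keep the high bit of each coefficient", uses a tiny n instead of 256, and leaves out the hash H₂. All function names are hypothetical.

```rust
/// Schoolbook product of two polynomials in Z_q[x]/(x^n + 1).
/// Terms that wrap past degree n-1 pick up a minus sign because x^n = -1.
fn negacyclic_mul(a: &[u8], b: &[u8], q: u8) -> Vec<u8> {
    let n = a.len();
    assert_eq!(n, b.len());
    let q = q as i32;
    let mut out = vec![0i32; n];
    for i in 0..n {
        for j in 0..n {
            let prod = a[i] as i32 * b[j] as i32;
            let idx = (i + j) % n;
            if i + j >= n {
                out[idx] -= prod;
            } else {
                out[idx] += prod;
            }
        }
    }
    out.iter().map(|c| c.rem_euclid(q) as u8).collect()
}

/// Assumed reading of the rounding: map each coefficient in {0,1,2,3} to its high bit.
fn round_to_bit(coeffs: &[u8]) -> Vec<u8> {
    coeffs.iter().map(|c| c >> 1).collect()
}

/// Core of F_k(x) before hashing: round(k * x mod 4), for binary k and x.
fn lpr_core(k: &[u8], x: &[u8]) -> Vec<u8> {
    round_to_bit(&negacyclic_mul(k, x, 4))
}

fn main() {
    let k = [1u8, 1, 1, 0]; // 1 + X + X^2
    let x = [1u8, 1, 0, 0]; // 1 + X
    println!("{:?}", lpr_core(&k, &x));
}
```

The schoolbook loop is quadratic in n; it is used here only because it makes the wrap-around sign visible.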

### A.2 Sigma Protocol Analysis

**Commitment:**
```
t = H(m || m·a)
```
where m ←$ R_q with small coefficients.

**Challenge:**
```
e = H(c || t || x || y)[0:16]
```
(128-bit challenge via Fiat-Shamir)

**Response:**
```
z = m + e·k
```
with rejection if ||z||_∞ > B.
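
The response step with its abort condition can be sketched as below. This is a simplified illustration: the challenge e is treated as a small integer scalar rather than a ring element derived from a hash, coefficients are plain `i64` values, and the function name and bound are hypothetical.

```rust
/// Computes z = m + e*k coefficient-wise and returns `None` (abort)
/// when the infinity norm of z exceeds `bound`.
fn respond(mask: &[i64], key: &[i64], e: i64, bound: i64) -> Option<Vec<i64>> {
    assert_eq!(mask.len(), key.len());
    let z: Vec<i64> = mask.iter().zip(key).map(|(m, k)| m + e * k).collect();
    let inf_norm = z.iter().map(|c| c.abs()).max().unwrap_or(0);
    if inf_norm > bound {
        None
    } else {
        Some(z)
    }
}

fn main() {
    // Accepted: every coefficient of z stays within the bound.
    println!("{:?}", respond(&[3, -2, 5], &[1, 0, 1], 2, 10));
    // Rejected: the first coefficient is 9 + 2*1 = 11 > 10.
    println!("{:?}", respond(&[9, 0, 0], &[1, 0, 0], 2, 10));
}
```

On `None` a prover would sample a fresh mask m and try again, so that the released z does not depend on where e·k pushed the coefficients.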
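
**Rejection Probability:**
By the rejection sampling theorem of [Lyu12], if m is sampled from a discrete Gaussian with σ = 12·||e·k|| and each response is accepted with probability 1/M for M = e^(1 + 1/288) ≈ 2.72, the accepted z is within statistical distance 2^(-100)/M of a Gaussian that is independent of k, and each attempt aborts with probability about:
```
Pr[rejection] ≈ 1 - 1/M ≈ 0.63
```
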
**Soundness:**
If the adversary produces two accepting proofs for (c, x, y₁) and (c, x, y₂) with y₁ ≠ y₂ that share the same mask m:
```
z₁ - z₂ = e₁·k - e₂·k = (e₁ - e₂)·k
```
Since e₁ ≠ e₂ with overwhelming probability, we can extract k whenever e₁ - e₂ is invertible in the ring.

**Zero-Knowledge:**
Simulator chooses z uniformly, computes t = H(z - e·k_dummy || ...), programs RO.
Statistical distance from real: 2^(-λ) by rejection sampling lemma.

---

## Appendix B: Implementation Notes

### B.1 Constant-Time Implementation

All operations on secret data must be constant-time:
- Ring multiplication: coefficient-by-coefficient, no early termination
- Rounding: table lookup with constant access pattern
- Comparison: bitwise operations only
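
As an illustration of the "bitwise operations only" rule for comparison, here is a minimal standard-library sketch. The function name is hypothetical, and a compiler is free to transform this loop, so a real build would more likely lean on the `subtle` crate that `Cargo.toml` already lists (its `ConstantTimeEq` trait) rather than on hand-written code like this.

```rust
/// Compares two byte slices without an early exit on the first mismatch.
/// The length check is a branch, but lengths are assumed to be public.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // Accumulate the XOR of every byte pair; the result is 0 only if all pairs match.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y;
    }
    diff == 0
}

fn main() {
    println!("{}", ct_eq(b"session-key", b"session-key"));
    println!("{}", ct_eq(b"session-key", b"session-kez"));
}
```

The loop always touches every byte, so its running time depends on the slice length and not on the position of the first differing byte.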

### B.2 Side-Channel Mitigations

- **Timing attacks**: All branches on secret data eliminated
- **Cache attacks**: No secret-dependent memory access patterns
- **Power analysis**: Balanced operations where possible

### B.3 Zeroization

All secret values are zeroized after use:
- OPRF keys: `RingLprKey` implements `ZeroizeOnDrop`
- Session keys: explicit zeroize before deallocation
- Intermediate values: scoped to minimize lifetime
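
The wipe-on-drop pattern behind these bullets can be sketched with the standard library alone. `SecretBytes` is a hypothetical stand-in (the real `RingLprKey` is not shown in this document), and the crate itself relies on the `zeroize` derive rather than code like this; the sketch only shows what such a derive has to achieve.

```rust
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical holder for 32 bytes of secret material.
struct SecretBytes {
    bytes: [u8; 32],
}

impl SecretBytes {
    /// Overwrites the secret with zeros using volatile writes, so the stores
    /// are not removed as dead code when the value is about to be freed.
    fn wipe(&mut self) {
        for b in self.bytes.iter_mut() {
            // SAFETY: `b` is a valid, aligned, exclusive reference to a u8.
            unsafe { std::ptr::write_volatile(b as *mut u8, 0) };
        }
        compiler_fence(Ordering::SeqCst);
    }
}

impl Drop for SecretBytes {
    fn drop(&mut self) {
        self.wipe();
    }
}

fn main() {
    let mut s = SecretBytes { bytes: [0xAA; 32] };
    s.wipe();
    println!("all zero: {}", s.bytes.iter().all(|&b| b == 0));
}
```

Scoping a value tightly, as the last bullet suggests, means `drop` (and with it the wipe) runs as early as possible.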

188
benches/oprf_benchmark.rs
Normal file
@@ -0,0 +1,188 @@
//! Benchmarks comparing Ring-LPR OPRF (OT-based) vs Fast OPRF (OT-free)
//!
//! Run with: cargo bench

use criterion::{BenchmarkId, Criterion, criterion_group, criterion_main};
use rand::SeedableRng;
use rand_chacha::ChaCha20Rng;

use opaque_lattice::oprf::fast_oprf::{
    PublicParams, ServerKey, client_blind as fast_client_blind, client_finalize as fast_finalize,
    evaluate as fast_evaluate, server_evaluate as fast_server_evaluate,
};
use opaque_lattice::oprf::ring_lpr::{
    RingLprKey, client_blind as lpr_client_blind, client_finalize as lpr_finalize,
    server_evaluate as lpr_server_evaluate,
};

/// Benchmark Fast OPRF (OT-free) - full protocol
fn bench_fast_oprf(c: &mut Criterion) {
    let mut group = c.benchmark_group("fast_oprf");

    let pp = PublicParams::generate(b"benchmark-params");
    let key = ServerKey::generate(&pp, b"benchmark-key");
    let password = b"benchmark-password-12345";

    // Benchmark client blind
    group.bench_function("client_blind", |b| {
        b.iter(|| fast_client_blind(&pp, password))
    });

    // Benchmark server evaluate
    let (state, blinded) = fast_client_blind(&pp, password);
    group.bench_function("server_evaluate", |b| {
        b.iter(|| fast_server_evaluate(&key, &blinded))
    });

    // Benchmark client finalize
    let response = fast_server_evaluate(&key, &blinded);
    group.bench_function("client_finalize", |b| {
        let state = state.clone();
        b.iter(|| fast_finalize(&state, key.public_key(), &response))
    });

    // Benchmark full protocol
    group.bench_function("full_protocol", |b| {
        b.iter(|| fast_evaluate(&pp, &key, password))
    });

    group.finish();
}

/// Benchmark Ring-LPR OPRF (OT-based) - full protocol
fn bench_ring_lpr_oprf(c: &mut Criterion) {
    let mut group = c.benchmark_group("ring_lpr_oprf");

    let mut rng = ChaCha20Rng::seed_from_u64(12345);
    let key = RingLprKey::generate(&mut rng);
    let password = b"benchmark-password-12345";

    // Benchmark client blind
    group.bench_function("client_blind", |b| {
        let mut rng = ChaCha20Rng::seed_from_u64(99999);
        b.iter(|| lpr_client_blind(&mut rng, password))
    });

    // Benchmark server evaluate
    let mut rng2 = ChaCha20Rng::seed_from_u64(88888);
    let (_, blinded) = lpr_client_blind(&mut rng2, password).unwrap();
    group.bench_function("server_evaluate", |b| {
        b.iter(|| lpr_server_evaluate(&key, &blinded))
    });

    // Benchmark client finalize
    let evaluated = lpr_server_evaluate(&key, &blinded).unwrap();
    group.bench_function("client_finalize", |b| {
        // `lpr_finalize` consumes the client state, so a fresh state is built in
        // the untimed setup closure and only the finalize call is measured.
        b.iter_batched(
            || {
                // Re-seeding with the same value as `rng2` reproduces the state that
                // pairs with `blinded`, assuming `lpr_client_blind` draws its
                // randomness only from `rng`.
                let mut rng = ChaCha20Rng::seed_from_u64(88888);
                let (state, _) = lpr_client_blind(&mut rng, password).unwrap();
                state
            },
            |state| lpr_finalize(state, &evaluated),
            criterion::BatchSize::SmallInput,
        )
    });

    // Benchmark full protocol
    group.bench_function("full_protocol", |b| {
        let mut rng = ChaCha20Rng::seed_from_u64(66666);
        b.iter(|| {
            let (state, blinded) = lpr_client_blind(&mut rng, password).unwrap();
            let evaluated = lpr_server_evaluate(&key, &blinded).unwrap();
            lpr_finalize(state, &evaluated)
        })
    });

    group.finish();
}

/// Compare both protocols side-by-side
fn bench_comparison(c: &mut Criterion) {
    let mut group = c.benchmark_group("oprf_comparison");

    // Fast OPRF setup
    let pp = PublicParams::generate(b"benchmark-params");
    let fast_key = ServerKey::generate(&pp, b"benchmark-key");

    // Ring-LPR setup
    let mut rng = ChaCha20Rng::seed_from_u64(12345);
    let lpr_key = RingLprKey::generate(&mut rng);

    let passwords = [
        b"short".as_slice(),
        b"medium-password-123".as_slice(),
        b"this-is-a-very-long-password-that-tests-longer-inputs".as_slice(),
    ];

    for password in &passwords {
        let len = password.len();

        group.bench_with_input(BenchmarkId::new("fast_oprf", len), password, |b, pwd| {
            b.iter(|| fast_evaluate(&pp, &fast_key, pwd))
        });

        group.bench_with_input(
            BenchmarkId::new("ring_lpr_oprf", len),
            password,
            |b, pwd| {
                let mut rng = ChaCha20Rng::seed_from_u64(55555);
                b.iter(|| {
                    let (state, blinded) = lpr_client_blind(&mut rng, pwd).unwrap();
                    let evaluated = lpr_server_evaluate(&lpr_key, &blinded).unwrap();
                    lpr_finalize(state, &evaluated)
                })
            },
        );
    }

    group.finish();
}

/// Print a message-size comparison (not a timed benchmark)
fn bench_message_sizes(_c: &mut Criterion) {
    println!("\n=== Message Size Comparison ===\n");

    // Fast OPRF messages
    let pp = PublicParams::generate(b"benchmark-params");
    let fast_key = ServerKey::generate(&pp, b"benchmark-key");
    let (_, blinded) = fast_client_blind(&pp, b"password");
    let _response = fast_server_evaluate(&fast_key, &blinded);

    // The Fast OPRF sizes below are hard-coded estimates, not measured values.
    println!("Fast OPRF:");
    println!(" Client -> Server (BlindedInput): ~{} bytes", 256 * 4); // RingElement
    println!(" Server -> Client (Response): ~{} bytes", 256 * 4 + 256); // RingElement + helper

    // Ring-LPR OPRF messages
    let mut rng = ChaCha20Rng::seed_from_u64(12345);
    let lpr_key = RingLprKey::generate(&mut rng);
    let (_, lpr_blinded) = lpr_client_blind(&mut rng, b"password").unwrap();
    let lpr_evaluated = lpr_server_evaluate(&lpr_key, &lpr_blinded).unwrap();

    let lpr_blind_size = lpr_blinded.to_bytes().len();
    let lpr_eval_size = lpr_evaluated.to_bytes().len();

    println!("\nRing-LPR OPRF:");
    println!(
        " Client -> Server (BlindedInput): {} bytes",
        lpr_blind_size
    );
    println!(
        " Server -> Client (EvaluatedOutput): {} bytes",
        lpr_eval_size
    );

    // Ratio of blinded-input sizes; this compares bytes on the wire, not speed.
    println!(
        "\nSize ratio, Ring-LPR / Fast OPRF (blinded input): {:.1}x",
        lpr_blind_size as f64 / (256.0 * 4.0)
    );
    println!();
}

criterion_group!(
    benches,
    bench_fast_oprf,
    bench_ring_lpr_oprf,
    bench_comparison,
    // Registered so the size report is actually printed during `cargo bench`.
    bench_message_sizes,
);

criterion_main!(benches);
8719
papers/1-s2.0-S0920548925001473-main.pdf
Normal file
File diff suppressed because one or more lines are too long
14094
papers/1-s2.0-S1383762125000189-main.pdf
Normal file
File diff suppressed because one or more lines are too long
BIN
papers/Editorial-Board_2025_Journal-of-Systems-Architecture.pdf
Normal file
Binary file not shown.
BIN
papers/basso-isogeny-oprf-sac23.pdf
Normal file
Binary file not shown.
BIN
papers/basso-isogeny-oprf.pdf
Normal file
Binary file not shown.
BIN
papers/composable-oprf-thesis.pdf
Normal file
Binary file not shown.
BIN
papers/cryptrec-guidelines-2024.pdf
Normal file
Binary file not shown.
BIN
papers/cryptrec-pqc-2024.pdf
Normal file
Binary file not shown.
BIN
papers/faller-garbled-oprf-thesis.pdf
Normal file
Binary file not shown.
BIN
papers/isogeny-oprf-lattice-ot.pdf
Normal file
Binary file not shown.
BIN
papers/kyber-verification.pdf
Normal file
Binary file not shown.
BIN
papers/lwe-problem.pdf
Normal file
Binary file not shown.
BIN
papers/nukib-pqc-guide.pdf
Normal file
Binary file not shown.
BIN
papers/opaque-2018.pdf
Normal file
Binary file not shown.
BIN
papers/opaque-draft-01.pdf
Normal file
Binary file not shown.
BIN
papers/opaque-eprint-2018-163.pdf
Normal file
Binary file not shown.
5897
papers/opaque-rfc9807.html
Normal file
File diff suppressed because it is too large
BIN
papers/owl-apake.pdf
Normal file
Binary file not shown.
BIN
papers/pake-quantum-annoying.pdf
Normal file
Binary file not shown.
BIN
papers/regev-lattice-crypto.pdf
Normal file
Binary file not shown.
BIN
papers/regev-lattice.pdf
Normal file
Binary file not shown.
BIN
papers/rfc9807.pdf
Normal file
Binary file not shown.
BIN
papers/vole-constructions.pdf
Normal file
Binary file not shown.
17384
papers/vole-ring-lwe.pdf
Normal file
File diff suppressed because it is too large
1062
papers_txt/1-s2.0-S0920548925001473-main.txt
Normal file
File diff suppressed because it is too large
834
papers_txt/1-s2.0-S1383762125000189-main.txt
Normal file
@@ -0,0 +1,834 @@
Journal of Systems Architecture 160 (2025) 103346

Contents lists available at ScienceDirect

Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc

Fast post-quantum private set intersection from oblivious pseudorandom function for mobile social networks✩

Zhuang Shan a, Leyou Zhang a,∗, Qing Wu b, Qiqi Lai c, Fuchun Guo d

a School of Mathematics and Statistics, Xidian University, Xi’an 710126, China
b School of Automation, Xi’an University of Posts and Telecommunications, Xi’an 710121, China
c School of Computer Science, Shaanxi Normal University, Xi’an 710121, China
d Centre for Computer and Information Security Research, University of Wollongong, Wollongong, NSW 2522, Australia

ARTICLE INFO

Keywords:
Mobile social networks
Private set intersection
Oblivious pseudorandom function
Private information retrieval

ABSTRACT

Mobile social networks have become integral to our daily lives, transforming communication methods and facilitating social interactions. With technological advancements, users generate vast amounts of valuable and sensitive personal data, which is stored on servers to enable instant information sharing. To protect the shared data, each platform has implemented many techniques such as end-to-end encryption mechanisms, fully homomorphic encryption, etc. However, these approaches face several security and privacy challenges, including potential leaks of user data, vulnerabilities in encryption that expose privacy ciphertexts to probabilistic attacks, and threats posed by future quantum computers.

To address these challenges, we introduce a private set intersection (PSI) protocol based on oblivious pseudorandom functions (OPRF) under the ring LPR problem from lattices. The proposed perturbed pseudorandom generator not only enhances the PSI’s resistance to probabilistic attacks, but also yields a more efficient OPRF and PSI. It has a time complexity of $O(n \log n)$ and is superior to the existing well-known fast post-quantum PSI protocol operating at $O(mn \log(mn))$, where $m$ is the bit length of the cryptographic modulus and $n$ represents the dimension of the security parameter. Simulation experiments and security analyses demonstrate that our proposal effectively preserves user privacy, ensures collusion resilience, verifies computation results, and maintains low computational costs. Finally, as an extension of our OPRF, we also give a fast private information retrieval (PIR) protocol.

1. Introduction

Mobile social networks have greatly enriched the ways people communicate and enhanced the convenience of social interactions. With the development of technology, users generate a large amount of useful and sensitive personal data within mobile social networks. This data often needs to be stored and processed to provide more personalized services and experiences [1,2]. However, due to the limited storage capacity of mobile social network devices, it is impossible to store all the data generated at any given moment, which presents challenges for data storage and privacy protection.

To address this issue while ensuring data confidentiality and security, many mobile social network platforms have started adopting advanced privacy-preserving technologies, such as private set intersection (PSI). The technology allows two or more parties to securely compute the intersection of their datasets without disclosing their respective data sets. This way, even if data is stored in distributed systems, it can effectively prevent data breaches and violations of user privacy, such as those caused by data leaks or unauthorized access.

The application of PSI in mobile social networks not only enhances data security but also strengthens user trust in the platform, which is crucial for protecting user privacy and improving the platform’s competitiveness. In this way, mobile social networks can continue to provide a rich and vibrant social experience and efficient information services while safeguarding personal privacy. Furthermore, as an important application in the field of privacy computing, PSI has recently garnered widespread attention due to its efficiency and practicality, jointly promoting the rapid implementation of privacy computing technology and ensuring the secure flow and value extraction of data elements.

✩ This document is the results of the research project funded by the National Science Foundation.
∗ Corresponding author.
E-mail addresses: arcsec30@stu.xidian.edu.cn (Z. Shan), lyzhang@mail.xidian.edu.cn (L. Zhang), xiyouwuq@126.com (Q. Wu), laiqq@snnu.edu.cn (Q. Lai), fuchun@uow.edu.au (F. Guo).

https://doi.org/10.1016/j.sysarc.2025.103346
Received 3 November 2024; Received in revised form 24 December 2024; Accepted 16 January 2025
Available online 25 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

Fig. 1. Mobile social networks.

Fig. 2. Private set intersection.

There are many common construction tools for PSI [3], and oblivious transfer (OT) is one of them. An OT [4] is a crucial tool used for secure multiparty computation. In this tool, the sender transmits data from a set of messages to the receiver but remains oblivious to which specific message was sent, while the receiver is unaware of the other messages they did not receive. This protocol is also known as the oblivious transfer protocol. The essence of an oblivious pseudorandom function is a pseudorandom function (PRF) enhanced with oblivious transfer capabilities.

In 1986, Goldreich, Goldwasser, and Micali introduced a new cryptographic primitive known as the pseudorandom function, whose output appears to be randomly chosen [5]. Two decades later, Naor and Reingold [6] noticed that their number-theoretic PRF allows for an interactive and oblivious evaluation, where a ‘‘client’’ with input $x$ obtains $F_k(x)$ for a function $F_k(x)$ that is contributed by a ‘‘server’’. Neither does the client learn the function (i.e., its key $k$), nor does the server learn $x$ or $F_k(x)$. Freedman et al. later called such a two-party protocol an OPRF and gave the first formal definitions and two OPRFs based on the Naor-Reingold PRF [7]. In 2009, Jarecki and Liu presented an efficient OPRF for securing intersection data [8].

Oblivious pseudorandom functions have been utilized in PSI [9]. The additional functionalities of oblivious pseudorandom functions also exhibit diversity, such as verifiable oblivious pseudorandom functions (VOPRF, [10]) and partially oblivious pseudorandom functions (POPRF, [11]).

Currently, OPRFs still face challenges, as summarized by Casacuberta, Hesse, and Lehmann [12]. Efficient OPRF constructions often rely on discrete-log or factoring-type hardness assumptions, which are vulnerable to quantum computers. This paper aims to address this by constructing OPRFs based on lattice-hardness assumptions and improving their efficiency (see Figs. 1 and 2).

1.1. Contributions

Regarding the open problem proposed by Casacuberta, there are currently quantum-resistant OPRFs, namely Albrecht et al.’s lattice-based VOPRF [10] and Boneh et al.’s isogeny-based OPRF [13]. Both constructions represent significant feasibility results but require further research to improve their efficiency [12]. So, a fast post-quantum private set intersection from oblivious pseudorandom function is proposed in this paper, and it has the following advantages:

• Symmetric encryption is adopted, which is efficient and reduces the risk of privacy leakage. The PSI in this paper is constructed based on OPRF, which belongs to symmetric encryption, thus reducing the number of interactions between users and lowering the risk of user privacy leakage. Compared to asymmetric encryption, the operational cost of symmetric encryption is lower, reducing reliance on authoritative institutions.
• The structure of OPRF is simple, and it is relatively efficient in post-quantum OPRF. The OPRF used to construct PSI in this paper is based on a new lattice problem, namely the learning parity with rounding over ring problem (Ring-LPR). The Ring-LPR problem not only has a simple structure but also possesses the capability to resist quantum attacks.
• A perturbed pseudorandom generator (PPRG) can withstand probabilistic attacks. In addition to OPRF, the PSI in this paper also includes a structure with a perturbed pseudorandom generator, which can overcome the weakness of weak encryption in symmetric encryption, thereby preventing adversaries from guessing the corresponding plaintext using statistical methods on the ciphertext ratios.

1.2. Technical overview

We adopted the oblivious transfer technique and Hamming correlation robustness, both of which are used in the OPRF construction presented in this paper. For the underlying pseudorandom function, we initially aimed to use learning parity with noise (LPN) over rings. However, this approach results in varying encryption outcomes for the same private data, preventing the recipient from matching the private data. Thus, we sought to make LPN over rings behave consistently like learning with rounding (LWR), leading to the introduction of the concept of learning parity with rounding over rings (LPR over rings) in this paper.

To prove that LPR over rings is quantum-resistant, we established a reduction bridge between LPR over rings and LWR. Note that LPR over rings is reduced to LWR, not to LPN over rings. For $(q = 2^n, p)$-LWR instances, we demonstrated the hardness of $(q = 2, p = 1)$-LWR instances and $(q = 2, p = 1)$-LWR over rings, where $(q = 2, p = 1)$-LWR over rings corresponds to LPR over rings. To verify that the post-quantum OPRF in this paper is computationally efficient, we compared the OPRF with the LWE-instantiated OPRF from [14]. The results showed that, as theoretical analysis suggested, the computation efficiency improves with the increase of security parameters.

Based on OPRF, we constructed a private set intersection (PSI) protocol. Since the paper [15] analyzed that PSI based on symmetric encryption does not resist probabilistic attacks and proposed the concept of perturbed pseudorandom generator, we used LPN over rings to construct a pseudorandom generator and proved that it satisfies the definition of PPRG as given in [15].

1.3. Organizations

The structure of this paper is as follows. Section 3 provides the necessary definitions and lemmas as a foundation for the readers’ knowledge. Section 4 presents the construction and efficiency analysis of OPRF, along with the definition and reduction of Ring-LPR. Section 5 details the construction of the PSI in this paper, security proofs, and LWE-based efficiency analysis, as well as the construction of the PPRG and the proof of its pseudorandomness. Finally, Section 6 summarizes the advantages and limitations of the PSI presented in this paper, as well as the extension of OPRF to PIR.

|
||||
|
||||
|
||||
2. Preliminary ⎛ 0 0 0 ⋯ 0 −1 ⎞
|
||||
⎜ 1 0 0 ⋯ 0 0 ⎟
|
||||
Each element of a lattice in R𝑛 can be expressed linearly by 𝑛 ⎜ ⎟
|
||||
0 1 0 ⋯ 0 0 ⎟
|
||||
𝑋=⎜ .
|
||||
linearly independent vector integer coefficients. This set of linearly ⎜ 0 0 1 ⋯ 0 0 ⎟
|
||||
independent vectors is called a lattice basis, and we know that the ⎜ ⋮ ⋮ ⋮ ⋱ ⋮ ⋮ ⎟⎟
|
||||
⎜
|
||||
lattice basis is not unique. Given a set of lattice bases (𝑣1 , … , 𝑣𝑛 ) in ⎝ 0 0 0 ⋯ 1 0 ⎠
|
||||
the lattice , then the fundamental parallelelepiped is
|
||||
{ 𝑛 } So there is
|
||||
∑ |
|
||||
(𝑣1 , … , 𝑣𝑛 ) = 𝑘𝑖 𝑣𝑖 ||𝑘𝑖 ∈ [0, 1) . ⎛ 𝑎0 −𝑎𝑛−1 ⋯ −𝑎1 ⎞
|
||||
| ⎜ ⎟
|
||||
𝑖=1 𝑎1 𝑎0 ⋯ −𝑎2 ⎟
|
||||
𝑅𝑜𝑡(𝑓 ) = ⎜ ,
|
||||
If the lattice base (𝑣1 , … , 𝑣𝑛 ) is determined, use the symbol () to ⎜ ⋮ ⋮ ⋱ ⋮ ⎟
|
||||
replace (𝑣1 , … , 𝑣𝑛 ). ∀𝑥 ∈ R𝑛 , project it onto (). According to the ⎜ 𝑎 𝑎𝑛−2 ⋯ ⎟
|
||||
𝑎0 ⎠
|
||||
⎝ 𝑛−1
|
||||
properties of projection, there is a unique 𝑦 ∈ () makes 𝑦 − 𝑥 ∈ .
|
||||
it is easy to prove that this mapping relationship is isomorphic.
|
||||
Use the symbol $\det(\mathcal{L})$ to represent the volume of the fundamental parallelepiped of the lattice $\mathcal{L}$. In other words, $\det(\mathcal{L})$ is the determinant of the matrix composed of a lattice basis $(v_1, \dots, v_n)$. For a given $n$-dimensional lattice, the value of $\det(\mathcal{L})$ is the same, up to sign, for every lattice basis.

Let $(v_1, \dots, v_n)$ and $(u_1, \dots, u_n)$ be two arbitrary bases of an $n$-dimensional lattice $\mathcal{L}$. Then $v_i = \sum_{j=1}^{n} m_{ij} u_j$ and $u_i = \sum_{j=1}^{n} m'_{ij} v_j$ for $i \in \{1, \dots, n\}$, so there are two integer matrices $M$ and $M'$ such that

$$\begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} = M \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} = M' \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}.$$

It is easy to prove that $M$ and $M'$ are inverse to each other. Since $M$ and $M'$ are both integer matrices, $\det(M) \cdot \det(M') = 1$ and $\det(M) = \det(M') = \pm 1$, so

$$\det(v_1, \dots, v_n) = \pm \det(u_1, \dots, u_n).$$

Definition 3 (Learning with Rounding, [16,17]). Let $\lambda$ be the security parameter, and let $n = n(\lambda)$, $m = m(\lambda)$, $q = q(\lambda)$, $p = p(\lambda)$ be integers. The LWR problem states that for $A \in \mathbb{Z}_q^{m \times n}$, $s \in \mathbb{Z}_q^{n}$, $u \in \mathbb{Z}_q^{m}$ the following distributions are computationally indistinguishable: $(A, \lfloor As \rfloor_p) \approx_C (A, \lfloor u \rfloor_p)$. Here $\lfloor x \rfloor_p = \lfloor \frac{p}{q} x \rfloor$, and $\lfloor x \rfloor$ represents the floor function, which rounds down to the nearest integer. For example, $\lfloor 3.14 \rfloor = 3$ and $\lfloor 3 \rfloor = 3$.

Definition 4 (Learning Parity with Noise, [18,19]). Let $\lambda$ be the security parameter, and let $n = n(\lambda)$, $m = m(\lambda)$ be integers. The LPN problem states that for $A \in \mathbb{Z}_2^{m \times n}$, $s \in \mathbb{Z}_2^{n}$, $u, e \in \mathbb{Z}_2^{m}$ the following distributions are computationally indistinguishable: $(A, As + e) \approx_C (A, u)$.

Definition 5 (Hamming Correlation Robustness, [14]). For a hash function $\mathcal{H}(\cdot)$ and a pseudorandom function $F_k(\cdot)$ with key $k$, $\mathcal{H}(\cdot)$ is Hamming correlation robust if $\mathcal{H}(x) \approx_C F_k(x)$.
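The rounding map of Definition 3 can be illustrated with a small sketch. The helper names `round_p` and `lwr_sample`, and the toy parameters used with them, are assumptions made for illustration only; they are not taken from the paper and are far too small to be secure.

```python
import random


def round_p(x, q, p):
    """Rounding from Definition 3: floor((p / q) * x) for x in Z_q.

    Integer arithmetic is used so that no floating-point error can occur.
    """
    return (p * (x % q)) // q


def lwr_sample(n, m, q, p, rng):
    """Return a toy LWR instance (A, s, b) with b = round_p(A s mod q)."""
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    s = [rng.randrange(q) for _ in range(n)]
    b = []
    for row in A:
        inner = sum(a * x for a, x in zip(row, s)) % q  # one entry of A s mod q
        b.append(round_p(inner, q, p))
    return A, s, b
```

Every entry of `b` lies in the range 0 to p - 1, because the input to `round_p` is always reduced modulo q first.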
|
||||
|
||||
|
||||
Definition 1. An ideal is a subset of a ring (or domain) that satisfies the following two properties:

1. Additive closure: If any two elements in the ideal are added, the result is still in the ideal. In other words, for any elements $a$ and $b$ in the ideal, $a + b$ also belongs to that ideal.
2. Multiplicative absorptivity: If an element in the ideal is multiplied by any element in the ring (or field), the result is still in the ideal. In other words, for any element $a$ in the ideal and any element $r$ in the ring (or field), $ar$ and $ra$ belong to that ideal.

For a commutative ring, further require that the ideal be closed for both addition and multiplication. Such an ideal is called a true ideal.

Definition 2. Referring to the definition of ideal, the ideal lattice is a subset of the lattice $\mathcal{L}$ that satisfies the following two properties:

1. Additive closure: If any two elements in an ideal lattice are added, the result is still in the ideal lattice. In other words, for any elements $a$ and $b$ in an ideal lattice, $a + b$ also belongs to that ideal lattice.
2. Multiplicative absorptivity: If an element in an ideal lattice is multiplied by an element in any other ideal lattice, the result remains in the ideal lattice. In other words, for any element $a$ in the ideal and any element $r$ in another ideal lattice, both $ar$ and $ra$ belong to that ideal lattice.

Corollary 1. The ideal lattice is a true ideal of the lattice $\mathcal{L}$.

The polynomial $f(x) = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1}$ is mapped to

$$Rot(f) = a_0 I + a_1 X + \cdots + a_{n-1} X^{n-1} \in \tilde{\mathcal{L}}.$$

Here $\tilde{\mathcal{L}}$ is the image of all of $\mathbb{Z}[x]/\langle x^n + 1 \rangle$ in the ideal lattice collection, and $X$ is the $n \times n$ matrix displayed above, at the start of Section 2.

Definition 6 (OT¹). The message sender sends data to the receiver from a set of pending messages but remains oblivious to which specific message was sent. Meanwhile, the receiver learns nothing about the other messages in the set. This protocol is also known as oblivious transfer.

Definition 7 (OPRF, [20]). Let the PRF key $k$ consist of two bit-strings $q, s \in \{0, 1\}^{\lambda}$. Let $F(\cdot)$ be a pseudorandom code that produces a pseudorandom string and let $\mathcal{H}$ be a hash function. The pseudorandom function is computed as

$$\mathrm{OPRF}_k(x) = \mathcal{H}(q \oplus [F(x) \cdot s]),$$

where $\cdot$ denotes bitwise-AND and $\oplus$ denotes bitwise-XOR. For a randomly generated $s$, if $F(x)$ has enough Hamming weight then the function $\mathrm{OPRF}_k(x)$ is pseudorandom assuming the hash function is correlation robust.

Definition 8 (PSI, [14]). PSI enables two parties, each holding a private set of elements, to compute the intersection of the two sets while revealing nothing more than the intersection itself.

Definition 9 (Dihedral Coset Problem). Given a security parameter $\kappa$, consider an instance of the $\mathrm{DCP}_q^{\ell}$ problem, where $q$ denotes the modulus and $\ell$ represents the number of states. Each state is expressed as

$$|0\rangle|x_i\rangle + |1\rangle|(x_i + s) \bmod q\rangle, \quad i \le \ell,$$

and it stores $1 + \lceil \log_2 q \rceil$ bits, where $x \in_R \mathbb{Z}_q^n$ and $s \in \mathbb{Z}_q^n$. If $s$ can be computed with probability $\mathrm{poly}(1/\log q)$ in time $\mathrm{poly}(\log q)$, then the $\mathrm{DCP}_q^{\ell}$ problem is considered to be broken.

¹ https://blog.csdn.net/m0_61869253/article/details/139362753
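The map from a polynomial $f$ to the matrix $Rot(f)$ can be checked on a small example. The sketch below builds the negacyclic matrix from the coefficient list and compares a matrix-vector product with polynomial multiplication modulo $x^n + 1$. The function names are illustrative assumptions, and plain integer coefficients are used rather than any particular modulus.

```python
def rot(a):
    """Negacyclic matrix Rot(f) for f = a[0] + a[1] x + ... + a[n-1] x^(n-1).

    Entry (i, j) is a[i - j] when i >= j and -a[n + i - j] otherwise,
    which matches the matrix whose first column is (a_0, ..., a_{n-1}).
    """
    n = len(a)
    return [[a[i - j] if i >= j else -a[n + i - j] for j in range(n)]
            for i in range(n)]


def negacyclic_mul(a, s):
    """Multiply two polynomials modulo x^n + 1 (so x^n wraps to -1)."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, sj in enumerate(s):
            k = i + j
            if k < n:
                out[k] += ai * sj
            else:
                out[k - n] -= ai * sj  # x^n = -1 flips the sign
    return out


def mat_vec(M, v):
    """Plain matrix-vector product over the integers."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]
```

For any coefficient lists `a` and `s` of equal length, `mat_vec(rot(a), s)` and `negacyclic_mul(a, s)` should agree, which is the sense in which the matrix form and the ring form describe the same product.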
|
||||
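The evaluation formula in Definition 7 (hash of $q$ XOR the bitwise-AND of $F(x)$ and $s$) can be sketched as follows. SHA-256 as the hash and SHAKE-256 as the pseudorandom code are stand-ins chosen only for illustration; the paper does not specify these instantiations, and the names `toy_code` and `oprf_eval` are hypothetical.

```python
import hashlib


def toy_code(x: bytes, length: int) -> bytes:
    """Stand-in for the pseudorandom code F(x): SHAKE-256 output of a given length."""
    return hashlib.shake_256(x).digest(length)


def oprf_eval(q: bytes, s: bytes, x: bytes) -> bytes:
    """Compute H(q XOR (F(x) AND s)) with SHA-256 standing in for H."""
    if len(q) != len(s):
        raise ValueError("q and s must have the same length")
    code = toy_code(x, len(q))
    masked = bytes(qi ^ (ci & si) for qi, ci, si in zip(q, code, s))
    return hashlib.sha256(masked).digest()
```

When every bit of `s` is zero the output no longer depends on `x`, which is one way to see why the definition asks for a randomly generated `s` and for `F(x)` to have enough Hamming weight.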
|
||||
|
||||
|
||||
|
||||
|
||||
Note 1. The Dihedral Coset Problem is a difficult problem in quantum computing, and solving it has a time complexity of $O(e^n)$ or $O(n!)$.

Lemma 1. If an efficient algorithm $\mathcal{A}$ can solve $\mathrm{DCP}_2^{\ell}$ in polynomial time, then there exists an efficient algorithm $\mathcal{A}'$ that can solve $\mathrm{DCP}_q^{\ell}$ in polynomial time.

Proof. We use a proof by contradiction. Suppose $q = 2^n$ and there exists an efficient algorithm $\mathcal{A}$ that can solve $\mathrm{DCP}_2^{\ell}$ in polynomial time. For instances of $\mathrm{DCP}_4^{\ell}$, we have

$$|0\rangle|x_i\rangle + |1\rangle|(x_i + s) \bmod 4\rangle = |0\rangle|x'_i\rangle + |1\rangle|(x'_i + s') \bmod 2\rangle + 2\left(|0\rangle|x''_i\rangle + |1\rangle|(x''_i + s'') \bmod 2\rangle\right), \quad i \le \ell,$$

so running the algorithm $\mathcal{A}$ twice will solve $\mathrm{DCP}_{4 = 2^2}^{\ell}$. Similarly, running $\mathcal{A}$ four times will solve $\mathrm{DCP}_{16 = 2^4}^{\ell}$, and continuing in this manner, running the algorithm $\mathcal{A}$ $n$ times will solve $\mathrm{DCP}_q^{\ell}$. Let $O(\mathcal{A})$ represent the time complexity of the algorithm $\mathcal{A}$. Thus, we have $O(\mathcal{A}') \le n\,O(\mathcal{A})$ and algorithm $\mathcal{A}'$ is an efficient algorithm. □

Definition 10 (Extrapolated Dihedral Coset Problem with modulus 2, [21]). Given a security parameter $\kappa$, an instance of $\mathrm{EDCP}_{n,2,\rho}^{\ell}$ is provided, where 2 denotes the modulus, $\rho$ represents the probability density function, and $\ell$ denotes the number of states. Each state is expressed as

$$\sum_{j \in \mathrm{supp}(\rho)} \rho(j)\,|j\rangle|(x_i + j s) \bmod 2\rangle, \quad i \le \ell,$$

and stores 2 bits, where $x_i \in_R \mathbb{Z}_2^n$ and $s \in \mathbb{Z}_2^n$. If $s$ can be determined with probability $\mathrm{poly}(1/(n \log 2))$ in time $\mathrm{poly}(n \log 2)$, then the $\mathrm{EDCP}_{n,2,\rho}^{\ell}$ problem is considered to be broken.

Lemma 2. If there exists an algorithm for solving $\mathrm{EDCP}_{n,4,\rho}^{\ell}$, then this algorithm can also solve $\mathrm{DCP}_4^{\ell}$.

Proof. Let

$$|b\rangle = \frac{1}{\sqrt{2}}|0\rangle|x_i\rangle + \frac{1}{\sqrt{2}}|1\rangle|(x_i + s) \bmod 4\rangle.$$

Thus, $\rho(0)|0\rangle = \frac{1}{\sqrt{2}}|0\rangle$ and $\rho(1)|1\rangle = \frac{1}{\sqrt{2}}|1\rangle$. Hence, $\mathrm{DCP}_2^{\ell}$ is a special case of $\mathrm{EDCP}_{n,2,\rho}^{\ell}$. Therefore, if there exists an algorithm for solving $\mathrm{EDCP}_{n,2,\rho}^{\ell}$, this algorithm can also solve $\mathrm{DCP}_2^{\ell}$. □

Lemma 3 ([21]). Let $(n, q, r = \Omega(\sqrt{\kappa}))$ be an instance of G-EDCP and $(n, q, \alpha)$ be an instance of LWE. If there exists an algorithm for solving $\mathrm{LWE}_{n,q,\alpha}$, then there exists an algorithm for solving $\text{G-EDCP}_{n,q,\rho_r}^{\ell}$.

Corollary 2. Let $(n, 2, r = \Omega(\sqrt{\kappa}))$ be an instance of G-EDCP and $(n, 2, \alpha)$ be an instance of LPN. If there exists an algorithm for solving $\mathrm{LPN}_{n,\alpha}$, then there exists an algorithm for solving $\text{G-EDCP}_{n,2,\rho_r}^{\ell}$.

3. Ring-LPR based OPRF

3.1. Constructing OPRF

Fig. 3 presents the ring LPR-based oblivious pseudorandom function. In the next section, we will prove the security of the oblivious pseudorandom function.

3.2. Security proof of OPRF

In this subsection, we will provide the definition of the underlying lattice problem for OPRF, learning parity with rounding, and its reduction proof.

Definition 11 (Learning Parity with Rounding). Let $\lambda$ be the security parameter, and let $n = n(\lambda)$, $m = m(\lambda)$ be integers. The LPR problem states that for $A \in \mathbb{Z}_2^{m \times n}$, $s \in \mathbb{Z}_2^{n}$, $u \in \mathbb{Z}_2^{m}$ the following distributions are computationally indistinguishable: $(A, \lfloor As \bmod 4 \rfloor_1) \approx_C (A, \lfloor u \rfloor_1)$.

Definition 12 (Learning Parity with Rounding Over Ring). The Ring LPR problem states that for $a, s, u \in \mathcal{R}_2$ the following distributions are computationally indistinguishable: $(a, \lfloor as \bmod 4 \rfloor_1) \approx_C (a, \lfloor u \rfloor_1)$.

Lemma 4. For an LWR problem instance $\lfloor As \rfloor_p$, if there exists an algorithm $\mathcal{A}$ for solving $s$ from $\lfloor As \rfloor_1$, then there also exists an algorithm $\mathcal{A}'$ for solving the LWR problem.

Proof. Given that there exists an algorithm $\mathcal{A}$ that can solve $\lfloor As \rfloor_1 = \lfloor \frac{1}{q} As \rfloor$, for an LWR problem instance $\lfloor As \rfloor_p$, we have:

$$\begin{aligned} \frac{1}{p} \lfloor As \rfloor_p &= \frac{1}{p} \left\lfloor \frac{pAs}{q} \right\rfloor \\ &= \frac{1}{p} \left( \frac{pAs}{q} + e \right) \quad \left(e \in (-1, 0]^m\right) \\ &= \frac{1}{q} As + e' \quad \left(e' \in \left(-\frac{1}{p}, 0\right]^m\right) \\ &\approx \lfloor As \rfloor_1. \end{aligned}$$

Thus, the algorithm $\mathcal{A}$ can be used to solve the LWR problem. □

We get the next corollary from Lemma 3.

Corollary 3. Let $(n, 2, r = \Omega(\sqrt{\kappa}))$ be an instance of G-EDCP and $(n, 2, \alpha)$ be an instance of 2-LWR. If there exists an algorithm for solving 2-LWR, then there exists an algorithm for solving $\text{G-EDCP}_{n,2,\rho_r}^{\ell}$.

Corollary 4. Let $(n, 2, r = \Omega(\sqrt{\kappa}))$ be an instance of G-EDCP and $(n, 2, \alpha)$ be an instance of LPR. If there exists an algorithm for solving LPR, then there exists an algorithm for solving $\text{G-EDCP}_{n,2,\rho_r}^{\ell}$.

Lemma 5. If there exists an algorithm $\mathcal{A}$ for solving the Ring-LPR problem, then there also exists an algorithm $\mathcal{A}'$ for solving the LPR problem.

Proof. For an instance of the inner product Ring-LPR

$$b = \lfloor a \cdot s \rfloor_1$$

where $a = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1}$, we can represent $a$ as a circulant matrix, specifically

$$A_1 := \begin{pmatrix} a_0 & -a_{n-1} & \cdots & -a_1 \\ a_1 & a_0 & \cdots & -a_2 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n-1} & a_{n-2} & \cdots & a_0 \end{pmatrix}.$$

Thus,

$$b = \lfloor a \cdot s \rfloor_1 \Rightarrow b = A_1 s,$$

where $a = (a_0, a_1, \dots, a_{n-1}) \leftarrow a = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1}$. We use a proof by contradiction. Suppose there exists an efficient algorithm $\mathcal{A}$ that can solve Ring-LPR in polynomial time. We take the first row from $A_1$, denote it as $\alpha_1$, and have $\lfloor \alpha_1 s \rfloor_1 = b_1$, where $b_1$ is the first component of $b$. For the LWR problem instance, $\vec{\beta} = \lfloor \Lambda \vec{s} \rfloor_1$, assume
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 3. Oblivious Pseudorandom Function (OPRF).
|
||||
|
||||
|
||||
|
||||
$$\Lambda^T = (\alpha_1, \alpha_2, \dots, \alpha_m).$$

Thus, we use the algorithm $\mathcal{A}$ $m$ times to find $\beta_i$ such that $\lfloor \gamma_i \rfloor_1 = \beta_i = \lfloor \alpha_1 s_1 \rfloor_1$, and thus we can solve the equation

$$\gamma = \Lambda \vec{s}, \quad \gamma^T = (\gamma_1, \dots, \gamma_m).$$

Assuming that the time complexity of solving $s$ from an LWR problem instance is $O(\Lambda, \beta)$, according to Corollary 3, and letting $O(\gamma = \Lambda \vec{s})$ be the computational complexity of solving the equation $\gamma = \Lambda \vec{s}$, we have

$$m\,O(\mathcal{A}) + O(\gamma = \Lambda \vec{s}) \ge O(\Lambda, \beta) \ge O(n!) \text{ or } O(e^n).$$

Let $m = n$, then

$$O(\mathcal{A}) \ge \frac{O(\Lambda, \beta) - O(\gamma = \Lambda \vec{s})}{n} \ge \frac{O(n!) - O(\gamma = \Lambda \vec{s})}{n} \text{ or } \frac{O(e^n) - O(\gamma = \Lambda \vec{s})}{n}.$$

This contradicts the assumption that there is an efficient algorithm $\mathcal{A}$ that can solve the inner product Ring-LPR in polynomial time, thus the theorem holds. □
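The rounding-error step in the proof of Lemma 4, where dividing $\lfloor pAs/q \rfloor$ by $p$ gives $As/q$ plus an error in $(-1/p, 0]$, can be checked exhaustively for a single coordinate. The sketch below uses exact rational arithmetic; the function names and the sample moduli are assumptions for illustration.

```python
from fractions import Fraction


def scaled_rounding_error(x, q, p):
    """Return (1/p) * floor(p * x / q) - x / q as an exact fraction."""
    rounded = (p * x) // q  # floor(p x / q) for non-negative integers
    return Fraction(rounded, p) - Fraction(x, q)


def check_bound(q, p):
    """Check that the error lies in (-1/p, 0] for every x in Z_q."""
    lower = Fraction(-1, p)
    return all(lower < scaled_rounding_error(x, q, p) <= 0 for x in range(q))
```

Calling `check_bound(q, p)` for a few pairs of moduli gives a quick sanity check of the interval stated in the proof; it does not say anything about the hardness claims themselves.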
|
||||
|
||||
|
||||
3.3. Efficiency analysis
|
||||
|
||||
This section compares, by simulation, the computational efficiency of the OPRF in this paper with that of the OPRF in [14] on a Mac, a tablet, and a phone. The PRF of [14] is instantiated based on LWE.
|
||||
|
||||
3.3.1. Efficiency analysis on Mac

The experiments in this subsection were run with Python 3.12 on a MacBook Air (Apple M1, 8.00 GB RAM) (see Fig. 4).
|
||||
|
||||
3.3.2. Efficiency analysis on mobile pad

The experiments in this subsection were run with Pydroid 3 on a Xiaomi Pad 6 Pro (Qualcomm Snapdragon 8+ mobile platform @ 3.2 GHz, 8.00+3.00 GB RAM) (see Fig. 5).
|
||||
Fig. 4. Parallel comparison of OPRF on MAC, where 𝑛 represents the security
|
||||
parameter, unit is microseconds.
|
||||
3.3.3. Summary of data comparison

From the simulation results, it can be seen that for 𝑛 ≤ 250, the LWE-based OPRF in [14] is slightly faster, while for 𝑛 > 250, the ring LPR-based OPRF in this paper is faster. Furthermore, as 𝑛 increases, the advantages of ring LPR become more pronounced. Based on the simulation results for Pad, the OPRF in this paper is more stable; although there are fluctuations, they are less significant compared to the LWE-based OPRF in [14].

4. PSI based on OPRF

In this paper, apart from OPRF, another tool used in the construction of PSI is a perturbed pseudorandom generator [15]. The perturbed pseudorandom generator in this paper is constructed from Ring-LPN.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 6. Pseudorandom generator with perturbation 𝐺𝛾 (⋅).
|
||||
|
||||
|
||||
|
||||
$$\|a\| = \sqrt{\sum_{i=0}^{n-1} |a_i|^2}.$$
|
||||
|
||||
|
||||
|
||||
|
||||
Definition 15 ([15]). A pseudorandom generator with perturbation,
|
||||
denoted as 𝐺𝛾 (⋅), is defined such that for 𝑥1 , 𝑥2 ∈ , there exists 𝛾
|
||||
satisfying the following conditions:
|
||||
|
||||
1. When 𝑥1 = 𝑥2 , Pr (𝐺𝛾 (𝑥1 ) = 𝐺𝛾 (𝑥2 )) ≤ 𝑂(exp(−𝑛)),
|
||||
2. When 𝑥1 = 𝑥2 , such that ‖𝐺𝛾 (𝑥1 ) − 𝐺𝛾 (𝑥2 )‖ < 𝛾, there exists 𝑁
|
||||
such that ‖𝐺𝛾 (𝑥1 ) − 𝐺𝛾 (𝑥2 )‖ ≥ 𝛾 ⋅ 𝑁, where clearly 𝑁 = 1 is
|
||||
optimal.
|
||||
|
||||
|
||||
|
||||
Theorem 1. The Ring-LPN problem itself can be viewed as a pseudorandom
|
||||
function with perturbations.
|
||||
|
||||
|
||||
Proof. We prove each statement separately. First, when $x_1 = x_2$, we have

$$\Pr\left(G_\gamma(x_1) = G_\gamma(x_2)\right) = \Pr(e_1 = e_2) = \frac{1}{2^n}.$$

Additionally, set $\gamma = \sqrt{n + 1}$, so

$$\|(Ax_1 + e_1) - (Ax_2 + e_2)\| = \|e_1 - e_2\| < \gamma.$$

Fig. 5. Parallel comparison of OPRF on mobile pads, where 𝑛 represents the security parameter, unit is microseconds.

Next, we will present the reduction process for Ring-LPN.
|
||||
4.1. Reduction of ring-LPN When 𝑥1 ≠ 𝑥2 , set 𝑣1 = 𝐺𝛾 (𝑥1 ), 𝑣2 = 𝐺𝛾 (𝑥2 ), and know that
|
||||
√ ∑𝑛 ( )𝑘 ( )𝑛−𝑘
|
||||
1 1
|
||||
Definition 13 (Learning Parity with Noise Over Ring). The learning parity Pr (‖𝑣1 − 𝑣2 ‖ ≤ 𝑛) = 𝐶𝑛𝑘
|
||||
𝑘=0
|
||||
3 2
|
||||
with noise over ring problem states that for 𝑎, 𝑠, 𝑒, 𝑢 ∈ {0,1} the
|
||||
following distributions are computationally indistinguishable: (𝑎, 𝑎𝑠 + ∑
|
||||
𝑛∕2 ( )𝑘 ( )𝑘 ( )𝑛−2𝑘
|
||||
1 1 1
|
||||
+ 𝐶𝑛𝑘 .
|
||||
𝑒) ≈𝐶 (𝑎, 𝑢). 3 6 2
|
||||
𝑘=0
|
||||
|
||||
Because
|
||||
( )𝑘 ( )𝑛−𝑘 ( ( )2 ( )𝑛 )
|
||||
Corollary 5. If there exists an efficient algorithm that can solve the ∑𝑛
|
||||
1 1 1 2 2 2
|
||||
Ring-LPN problem in polynomial time, then there also exists an algorithm 𝐶𝑛𝑘 = 𝑛 + +⋯+
|
||||
𝑘=0
|
||||
3 2 2 3 3 3
|
||||
′ that can solve the LPN problem. ( ( )𝑛 )
|
||||
3 2
|
||||
= 𝑛 1− ,
|
||||
2 3
|
||||
Proof. The proof method is similar to that of Lemma 5, but this way
|
||||
and
|
||||
the computational complexity of will decrease. If we want the Ring- ( )
|
||||
∑
|
||||
𝑛∕2 ( )𝑘 ( )𝑘 ( )𝑛−2𝑘 ( ) 2𝑛
|
||||
LPN problem to be ‘approximately’ as hard as the LPN problem, then 1 1 1 3⋅6 1 1
|
||||
𝐶𝑛𝑘 ≤ 1− .
|
||||
for the security parameters 𝜅1 of the Ring-LPN problem and 𝜅2 of the 𝑘=0
|
||||
3 6 2 17 2𝑛− 2𝑛 3⋅6
|
||||
LPN problem, we have
|
||||
Therefore
|
||||
𝑒𝜅1 (𝜅 )! ( √ √ )
|
||||
≥ 𝑒𝜅2 , or 1 ≥ (𝜅2 )!. 1
|
||||
Pr ‖𝑣1 − 𝑣2 ‖ ≤ 𝑛 < 𝑛 + 1 ≤ 𝑛 .
|
||||
𝜅12 𝜅12 2
|
||||
√
|
||||
Thus, we can roughly obtain 𝜅1 ≥ 1.5𝜅2 and 𝜅2 ≥ 12. Note that 𝑂(𝑛) Thus, there is a very high probability that ‖𝑣1 −𝑣2 ‖ ≥ 𝑛 + 1, and 𝑁 = 1
|
||||
is an asymptotically large quantity with respect to 𝑛. We use the most (see Fig. 6). □
|
||||
extreme case to determine the relationship between 𝜅1 and 𝜅2 . □
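The first part of the proof of Theorem 1 (the collision probability $1/2^n$ for equal inputs, and the bound $\|e_1 - e_2\| < \sqrt{n + 1}$) can be verified by brute force for small $n$. The sketch assumes the perturbations $e_1, e_2$ are independent and uniform over $\{0, 1\}^n$, which the value $1/2^n$ suggests but which is stated here as an assumption; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def squared_norm(v):
    """Squared Euclidean norm, following the norm in Definition 14."""
    return sum(x * x for x in v)


def collision_probability(n):
    """Exact Pr(e1 = e2) for independent uniform e1, e2 in {0,1}^n."""
    vectors = list(product((0, 1), repeat=n))
    equal = sum(1 for e1 in vectors for e2 in vectors if e1 == e2)
    return Fraction(equal, len(vectors) ** 2)


def max_squared_distance(n):
    """Largest value of ||e1 - e2||^2 over all pairs of binary vectors."""
    vectors = list(product((0, 1), repeat=n))
    return max(
        squared_norm([a - b for a, b in zip(e1, e2)])
        for e1 in vectors
        for e2 in vectors
    )
```

Because the entries of $e_1 - e_2$ lie in $\{-1, 0, 1\}$, the squared distance never exceeds $n$, which is strictly below $\gamma^2 = n + 1$.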
|
||||
|
||||
|
||||
4.2. Perturbed pseudorandom generator 4.3. PSI based on OPRF
|
||||
|
||||
Definition 14. Let 𝑎 = 𝑎0 + 𝑎1 𝑥 + ⋯ + 𝑎𝑛−1 𝑥𝑛−1 ∈ {0,1} . Define the Lemma 6. Assuming 𝑓 (𝑦) ≈𝐶 𝑢1 and 𝑔(𝑢1 ) ≈𝐶 𝑢2 , then (𝑔◦𝑓 )(𝑦) ≈𝐶 𝑢2 .
|
||||
norm of 𝑎 as ‖𝑎‖, and
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 7. PSI based on OPRF.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 9. Parallel comparison of PSI on mobile pads, where 𝑛 represents the security
|
||||
parameter, unit is microseconds.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 8. Parallel comparison of PSI on MAC, where 𝑛 represents the security parameter, unit is microseconds.

Fig. 10. Comparison of PSI on mobile phones, where 𝑛 represents the security parameter, unit is microseconds.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 11. PIR based on OPRF.
|
||||
|
||||
|
||||
Proof. On one hand, because the pseudorandom 𝐹̃𝑘 ∶ {0,1} × {0, 1}∗ →
|
||||
{0,1} , for any 𝑘 ∈ {0,1} , 𝑦 ∈ ⊂ {0, 1}∗ , we have 𝐹̃𝑘 (𝑦) ≈𝐶 𝑢𝜔 ∈
|
||||
{0,1} .
|
||||
On the other hand, due to the pseudorandom function 𝐹𝑘 ∶ {0,1} ×
|
||||
{0,1} → {0,1} , for 𝑢𝓁1 ∈ {0,1} , we have 𝐹𝑘 (𝑢𝓁1 ) ≈𝐶 𝑢𝜔 . According
|
||||
to the property of the hash function, have 1 (𝑦) ≈𝐶 𝑢𝓁1 . Combining
|
||||
with Lemma 6, one can obtain that 𝐹𝑘 (1 (𝑦)) ≈𝐶 𝑢𝜔 . Consequently,
|
||||
𝐹̃𝑘 (𝑦) ≈𝐶 𝐹𝑘 (1 (𝑦)). □
|
||||
|
||||
|
||||
Theorem 2. If $\mathcal{H}_1$ is a collision-resistant hash function, and $\mathcal{H}_2$ and $\mathcal{H}_3$ are Hamming correlation robust, then the protocol in Fig. 7 securely realizes PSI in the semi-honest model when the parameters 𝑚, 𝑤 are chosen as described in [14].
|
||||
|
||||
|
||||
Proof. Perspective from 𝑃1 .
|
||||
Hyb0 𝑃1 ’s view and 𝑃2 ’s output in the real protocol.
|
||||
Hyb1 Same as Hyb0 except that on 𝑃2 ’s side, for each 𝑖 ∈ [𝜔], if 𝑠[𝑖] = 0,
|
||||
then sample 𝐴𝑖 ← {0, 1}𝑚 and compute 𝐵𝑖 = 𝐴𝑖 ⊕ 𝐷𝑖 ; otherwise
|
||||
sample 𝐵𝑖 ← {0, 1}𝑚 and compute 𝐴𝑖 = 𝐵𝑖 ⊕ 𝐷𝑖 . This hybrid is
|
||||
identical to Hyb0 .
|
||||
Hyb2 Initialize an 𝑚 × 𝑤 binary matrix 𝐷 to all 1’s. Denote its column
|
||||
vectors by 𝐷1 , … , 𝐷𝜔 . Then 𝐷1 = ⋯ = 𝐷𝜔 = 1𝑚 . For 𝑦 ∈ ,
|
||||
randomly select 𝑣 ← [𝑚]𝜔 , and set 𝐷𝑖 [𝑣[𝑖]] = 0 for all 𝑖 ∈ [𝜔].
|
||||
Hyb3 Find a suitable pseudorandom function 𝐹̃𝑘 ∶ {0,1} × {0, 1}∗ →
|
||||
{0,1} . For 𝑦 ∈ , compute 𝑣̃ = 𝐹̃𝑘 (𝑦), randomly select 𝑣 ← [𝑚]𝜔 ,
|
||||
and set 𝐷𝑖 [𝑣[𝑖]] = 0 for all 𝑖 ∈ [𝜔].
|
||||
Hyb4 Let there be a pseudorandom function 𝐹 ∶ {0,1} ×{0,1} → {0,1}
|
||||
and a hash function 1 ∶ {0, 1}∗ → {0,1} . For 𝑦 ∈ , compute
|
||||
𝑣′ = 𝐹𝑘 (1 (𝑦)), randomly select 𝑣 ← [𝑚]𝜔 , and set 𝐷𝑖 [𝑣[𝑖]] = 0 for
|
||||
all 𝑖 ∈ [𝜔].
|
||||
Hyb5 Let there be a pseudorandom function 𝐹 ∶ {0,1} × {0,1} →
|
||||
{0,1} , Hamming Correlation Robustness 2 ∶ Z𝑚×𝜔 {0,1}
|
||||
→ {0,1}
|
||||
and a hash function 1 ∶ {0, 1}∗ → {0,1} . For 𝑦 ∈ , compute
|
||||
𝑣′ = 𝐹𝑘 (1 (𝑦)), 𝑣 = 2 (𝑣′ ), and set 𝐷𝑖 [𝑣[𝑖]] = 0 for all 𝑖 ∈ [𝜔].
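The matrix step that the hybrids keep repeating (start from an all-ones $m \times \omega$ matrix and, for each element, zero one position in every column) can be sketched as below. The function names are hypothetical, positions are 0-indexed here whereas the text writes $[m]$, and the positions are passed in directly rather than derived from the PRF and hash functions of the protocol.

```python
import random


def sample_positions(m, w, rng):
    """Pick one row index per column, standing in for v in [m]^w."""
    return [rng.randrange(m) for _ in range(w)]


def build_matrix(m, w, positions_per_element):
    """Return the columns D_1..D_w, all ones except at the chosen positions.

    positions_per_element holds one position vector v per set element;
    for each v, entry v[i] of column i is set to 0.
    """
    columns = [[1] * m for _ in range(w)]
    for v in positions_per_element:
        for i in range(w):
            columns[i][v[i]] = 0
    return columns
```

In Hyb2 the vector `v` is sampled at random, while in Hyb5 it is derived from the element itself; the matrix update is the same in both cases, so only the source of `v` changes between hybrids.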
|
||||
Given that Hyb0 ≈𝐶 Hyb1 ≈𝐶 Hyb2 ≈𝐶 Hyb3 and Hyb4 ≈𝐶 Hyb5, and that Hyb3 ≈𝐶 Hyb4 according to Lemma 7, we have Hyb0 ≈𝐶 Hyb5.

Fig. 12. Parallel comparison of PIR on MAC, where 𝑛 represents the security parameter, unit is microseconds.
|
||||
Perspective from 𝑃2 .
|
||||
Lemma 7. Find a suitable pseudorandom function 𝐹̃𝑘 ∶ {0,1} × {0, 1}∗ → Hyb0 𝑃2 ’s view in the real protocol.
|
||||
{0,1} . Assuming that the pseudo-random function 𝐹𝑘 ∶ {0,1} × {0,1} →
|
||||
Hyb1 𝜓 ← {0,1} , all other aspects are consistent with the real
|
||||
{0,1} and the hash function 1 ∶ {0, 1}∗ → {0,1} are indistinguishable,
|
||||
protocol.
|
||||
we have
|
||||
Hyb2 Introduce 𝐺𝛾 ∶ {0,1} → {0,1} and Hamming Correlation
|
||||
𝐹̃𝑘 (𝑦) ≈𝐶 𝐹𝑘 (1 (𝑦)).
|
||||
Robustness 3 ∶ Z𝑚×𝜔 {0,1}
|
||||
→ {0,1} , let the initial matrices be
|
||||
𝐶1 = ⋯ = 𝐶𝜔 = 1𝑚 , randomly select 𝑣 ∈ [𝑚]𝜔 , set 𝐶𝑖 [𝑣[𝑖]] = 0
|
||||
for all 𝑖 ∈ [𝜔]. Compute 𝐺𝛾 (𝐶1 [𝑣[1]]‖ ⋯ ‖𝐶𝜔 [𝑣[𝜔]]).
|
||||
|
||||
|
||||
|
||||
|
||||
Hyb3 Let the initial matrices be 𝐶1 = ⋯ = 𝐶𝜔 = 1𝑚 , find an appropriate • Setup The simulator generates some necessary parameters for the
|
||||
pseudorandom function 𝐹̃𝑘 ∶ {0,1} × {0, 1}∗ → {0,1} . For 𝑦 ∈ , algorithms and selects an appropriate hash functions 1 ∶ {0, 1}∗ →
|
||||
compute 𝑣̃ = 𝐹̃𝑘 (𝑦), randomly select 𝑣 ← [𝑚]𝜔 , set 𝐶𝑖 [𝑣[𝑖]] = 0 for {0,1} , Hamming Correlation Robustness 2 ∶ {0,1} → [𝑚]𝜔 , Ham-
|
||||
all 𝑖 ∈ [𝜔]. Compute 𝐺𝛾 (𝐶1 [𝑣[1]]‖ ⋯ ‖𝐶𝜔 [𝑣[𝜔]]). ming Correlation Robustness 3 ∶ Z𝑚×𝜔 → {0,1} and a 𝐺𝛾 ∶ {0,1} →
|
||||
{0,1}
|
||||
Hyb4 Let the initial matrices be 𝐶1 = ⋯ = 𝐶𝜔 = 1𝑚 , set a pseudo- {0,1} , a pseudorandom function 𝐹 ∶ {0,1} × {0,1} → {0,1} with
|
||||
random function 𝐹 ∶ {0,1} × {0,1} → {0,1} , a hash function key 𝑘 ∈ {0,1} . The adversary 𝑃1 selects 𝑠 and transmits 𝑠 to the
|
||||
1 ∶ {0, 1}∗ → {0,1} and Hamming Correlation Robustness simulator using OT.
|
||||
𝑚×𝜔
|
||||
3 ∶ Z{0,1} → {0,1} . For 𝑦 ∈ , compute 𝑣′ = 𝐹𝑘 (1 (𝑦)), • H-Query, PRF-Query and PRG-Query The adversary 𝑃1 makes
|
||||
randomly select 𝑣 ← [𝑚]𝜔 . Set 𝐶𝑖 [𝑣[𝑖]] = 0 for all 𝑖 ∈ [𝜔]. Compute queries about the hash function, pseudorandom function, oblivious
|
||||
𝐺𝛾 (3 (𝐶1 [𝑣[1]]‖ ⋯ ‖𝐶𝜔 [𝑣[𝜔]])). transfer values, and pseudorandom generator. The simulator pre-
|
||||
Hyb5 Let the initial matrices be 𝐶1 = ⋯ = 𝐶𝜔 = 1𝑚 , set a pseu- establishes lists for handling H-Query, PRF-Query, and PRG-Query
|
||||
dorandom function 𝐹 ∶ {0,1} × {0,1} → {0,1} and a hash respectively.
|
||||
function 1 ∶ {0, 1}∗ → {0,1} , Hamming Correlation Robustness
|
||||
𝑚×𝜔
|
||||
2 ∶ Z{0,1} → {0,1} and 3 ∶ Z𝑚×𝜔 → {0,1} . For 𝑦 ∈ , – 1 -Query For the 𝑖th query 𝑥𝑖 ∈ {0, 1}∗ corresponding to the
|
||||
{0,1}
|
||||
compute 𝑣′ = 𝐹𝑘 (1 (𝑦)), compute 𝑣′ = 𝐹𝑘 (1 (𝑦)). Set 𝐶𝑖 [𝑣[𝑖]] = 0 value of 1 , the simulator selects from the hash value list
|
||||
for all 𝑖 ∈ [𝜔]. Compute 𝐺𝛾 (3 (𝐶1 [𝑣[1]]‖ ⋯ ‖𝐶𝜔 [𝑣[𝜔]])). if available, otherwise selects a random 𝑋𝑖 ∈ {0,1} . Set 𝑋𝑖 =
|
||||
Similarly, it can be proven that Hyb0 ≈𝐶 Hyb5 . □ 1 (𝑥𝑖 ) and update the list accordingly.
|
||||
– 2 -Query For the 𝑖th query 𝑦𝑖 ∈ {0,1} corresponding to the
|
||||
value of 2 , the simulator selects from the hash value list if
|
||||
Definition 16 (CPA Security Model of the Protocol in Fig. 7). Assume available, otherwise selects a random 𝑌𝑖 ∈ [𝑚]𝜔 . Set 𝑌𝑖 = 2 (𝑦𝑖 )
|
||||
there exists a perturbed pseudorandom oracle machine 𝑃 𝑟𝑀𝛾 (where
|
||||
and update the list accordingly.
|
||||
𝛾 is the upper bound on the norm of the perturbation in 𝑃 𝑟𝑀𝛾 ), such
|
||||
– 3 -Query For the 𝑖th query 𝑧𝑖 ∈ Z𝑚×𝜔 corresponding to the
|
||||
that for an input 𝑥, it outputs two values: one is a random value 𝑦0 , {0,1}
|
||||
value of 3 , the simulator selects from the hash value list
|
||||
and the other is a pseudorandom value 𝑦1 with 𝑥 as its input.
|
||||
if available, otherwise selects a random 𝑍𝑖 ∈ {0,1} . Set 𝑍𝑖 =
|
||||
• Setup The simulator generates the necessary parameters for 3 (𝑧𝑖 ) and update the list accordingly.
|
||||
the algorithms. The adversary chooses 𝑠 and sends it to the – 𝐹 -Query For the 𝑖th query 𝑢𝑖 ∈ {0,1} corresponding to the value
|
||||
simulator using OT. of 𝐹 , the simulator selects from the pseudorandom function
|
||||
• Hash Queries, PRF Queries and PRG Queries The adversary value list if available, otherwise selects a random 𝑈𝑖 ∈ {0,1} .
|
||||
sequentially performs hash function queries, pseudorandom Set 𝑈𝑖 = 𝐹 (𝑢𝑖 , 𝑘) and update the list accordingly.
|
||||
function queries, and pseudorandom synthesizer queries. Here,
|
||||
– 𝐺𝛾 -Query For the 𝑖th query 𝑤𝑖 ∈ {0,1} corresponding to the
|
||||
the adversary cannot know the key in pseudorandom function
|
||||
value of 𝐺𝛾′ , the simulator selects from the pseudorandom
|
||||
queries.
|
||||
generator value list if available, otherwise selects a random
|
||||
• Challenge The adversary selects a private message 𝑚 and sends
|
||||
𝑊𝑖 ∈ {0,1} . Set 𝑊𝑖 = 𝐺𝛾′ (𝑤𝑖 ) and update the list accordingly.
|
||||
it to the simulator . The simulator queries the hash function,
|
||||
pseudorandom function, and oblivious transfer values of the real Note that 𝐺𝛾′ is not 𝐺𝛾black-box .
|
||||
scheme, inputs these results into the pseudorandom oracle ma-
|
||||
chine 𝑃 𝑟𝑀𝛾 , obtains two ciphertexts 𝑐0 and 𝑐1 , and sends them • Challenge 𝑃1 selects 𝑚 ∈ ∕ and sends it to . using the corre-
|
||||
to the adversary . sponding hash function queries and pseudorandom function queries,
|
||||
• Guessing After receiving the two ciphertexts 𝑐0 and 𝑐1 , guesses inputs the queried values into the black-box 𝐺𝛾′ , obtaining 𝜓0 and 𝜓1 ,
|
||||
which ciphertext corresponds to the encryption of 𝑚 and sends the and then sends 𝜓0 , 𝜓1 to 𝑃1 .
|
||||
guess back to the simulator . • Guess Based on the received 𝜓0 and 𝜓1 , 𝑃1 guesses whether 𝜓0 or
|
||||
The advantage of the adversary is defined as the advantage of the 𝜓1 is the ciphertext of the encrypted message 𝑚.
|
||||
simulator in distinguishing the outputs of 𝑃 𝑟𝑀𝛾 . According to the assumption, if the adversary 𝑃1 can break the
|
||||
scheme with a non-negligible advantage, then the simulator can
|
||||
Note 2. The 𝑃 𝑟𝑀 mentioned in this paper differs from [22]. In [22], also break the black-box 𝐺𝛾′ with a non-negligible advantage. This
|
||||
𝑃 𝑟𝑀 refers to a pseudorandom oracle machine that outputs random contradicts the assumption that 𝐺𝛾′ is secure. □
|
||||
values when the adversary does not know the pseudorandom function key,
|
||||
and outputs pseudorandom function values based on the key known to the
|
||||
adversary when the key is known. This is a single-value output. However, the 4.4. Efficiency analysis PSI
|
||||
𝑃 𝑟𝑀 required in this paper outputs both of these values simultaneously,
|
||||
making it a multi-value output. This section simulates the PSI computation efficiency of this pa-
|
||||
per and PSI in [14] on MAC, Pad, and Phone. The PRF of [14] is
|
||||
Theorem 3. If 1 is a collision resistant hash function, 2 and 3 are instantiated based on LWE.
|
||||
hamming correlation robustness, then the protocol in Fig. 7 securely realizes
|
||||
𝑃 𝑆 𝐼 in Definition 16.
|
||||
4.4.1. Efficiency analysis on MAC
|
||||
The tools used in the subsection are Python 3.12, the programs are
|
||||
Proof. Suppose the adversary 𝑃1 can break the scheme with non- performed on MacBook Air MAC Desktop Apple M1, RAM 8.00 GB (see
|
||||
negligible advantage. Now, the simulator simulates the scheme. Fig. 8).
|
||||
Suppose there exists a black-box 𝐺𝛾𝑏𝑙𝑎𝑐 𝑘−𝑏𝑜𝑥 such that
|
||||
𝑦0 = 𝐺𝛾 (𝑥) ∈ {0,1} ,
|
||||
4.4.2. Efficiency analysis on mobile pad
|
||||
↗ The tools used in the subsection are Pydriod 3, the programs are
|
||||
𝐺𝛾𝑏𝑙𝑎𝑐 𝑘−𝑏𝑜𝑥 (𝑥) → (𝑦0 , 𝑦1 )
|
||||
↘ performed on Xiaomi Pad 6 Pro File Explorer 1th Qualcomm(R)AI En-
|
||||
𝑦1 ∈𝑅 {0,1} . gine(TM) Xiaolong 8+ mobile platform@3.2 GHz, RAM 8.00+3.00 GB
|
||||
(see Fig. 9).
|
||||
|
||||
|
||||
|
||||
|
||||
4.5. Analysis of efficiency on mobile phones

The experiments in this subsection were run with Pydroid 3 on a Redmi K30 (Qualcomm Snapdragon 730G mobile platform @ 2.2 GHz, 6.00 GB RAM) (see Fig. 10).

4.5.1. Summary of data comparison

From the simulation results, it can be seen that for 𝑛 ≤ 400, the LWE-based OPRF in [14] is slightly faster, while for 𝑛 > 400, the ring LPR-based OPRF in this paper is faster. Furthermore, as 𝑛 increases, the advantages of ring LPR become more pronounced. Based on the simulation results for Pad, the OPRF in this paper is more stable; although there are fluctuations, they are less significant compared to the LWE-based OPRF in [14].

5. Expansion of this work

Private Information Retrieval (PIR) [23–29] is a technique that enables a client to securely download a specific element, such as a movie or a friend's record, from a database managed by an untrusted server, such as a streaming service or a social network, without disclosing to the server which particular element has been retrieved. Given the functional similarities between PIR and PSI, this paper extends its exploration into the construction of PIR using OPRF (see Fig. 11).

5.1. Efficiency analysis PIR

This section simulates the PIR computation efficiency of this paper and of the machine learning-based PIR in [30] (DLMI for short) on a Mac. The experiments were run with Python 3.12 on a MacBook Air (Apple M1, 8.00 GB RAM).

The OPRF-based PIR proposed in this paper has a runtime that differs from the machine learning-based PIR by no more than approximately 5 × 10⁻³ seconds. Additionally, the security of our PIR scheme is theoretically supported in comparison to [30] (see Fig. 12).

6. Conclusion

This paper presents a PSI based on efficient post-quantum OPRF and proves its security under the semi-honest model, demonstrating security even in the CPA model in Definition 16. The addition of PPRG enables the PSI to effectively resist probabilistic attacks. In the simulation experiments, the proposed PSI shows greater efficiency compared to post-quantum PSIs represented by LWE.

Although the PIR in this study is not as efficient as the machine learning-based PIR, the gap between the two is already quite small. However, there are also notable shortcomings; the efficiency of the proposed PSI still lags behind that of non-post-quantum PSIs, which will be addressed in future work.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61872087 and Grant 51875457; in part by the Key Foundation of National Natural Science Foundation of China under Grant U19B2021; and in part by the Key Research and Development Program of Shaanxi under Program 2022GY-028 and Program 2022GY-050.

Data availability

No data was used for the research described in the article.

References

[1] R. Lei, X. Chen, D. Liu, C. Song, Y. Tan, A. Ren, CEIU: Consistent and efficient incremental update mechanism for mobile systems on flash storage, J. Syst. Archit. 152 (2024) 103151, http://dx.doi.org/10.1016/j.sysarc.2024.103151, URL: https://www.sciencedirect.com/science/article/pii/S1383762124000882.
[2] J. Sun, L. Yin, M. Zou, Y. Zhang, T. Zhang, J. Zhou, Makespan-minimization workflow scheduling for complex networks with social groups in edge computing, J. Syst. Archit. 108 (2020) 101799, http://dx.doi.org/10.1016/j.sysarc.2020.101799, URL: https://www.sciencedirect.com/science/article/pii/S1383762120300928.
[3] Y. Gao, Y. Luo, L. Wang, X. Liu, L. Qi, W. Wang, M. Zhou, Efficient scalable multi-party private set intersection(-variants) from bicentric zero-sharing, in: Proceedings of the Conference on Computer and Communications Security, CCS, Association for Computing Machinery (ACM), New York, NY, USA, 2024.
[4] M.O. Rabin, How to exchange secrets with oblivious transfer, 2005, URL: https://eprint.iacr.org/2005/187.
[5] O. Goldreich, S. Goldwasser, S. Micali, How to construct random functions, J. ACM 33 (4) (1986) 792–807, http://dx.doi.org/10.1145/6490.6503.
[6] M. Naor, O. Reingold, Number-theoretic constructions of efficient pseudo-random functions, J. ACM 51 (2) (2004) 231–262, http://dx.doi.org/10.1145/972639.972643.
[7] M.J. Freedman, Y. Ishai, B. Pinkas, O. Reingold, Keyword search and oblivious pseudorandom functions, in: J. Kilian (Ed.), Theory of Cryptography, Springer Berlin Heidelberg, Berlin, Heidelberg, 2005, pp. 303–324.
[8] S. Jarecki, X. Liu, Efficient oblivious pseudorandom function with applications to adaptive OT and secure computation of set intersection, in: O. Reingold (Ed.), Theory of Cryptography, Springer Berlin Heidelberg, Berlin, Heidelberg, 2009, pp. 577–594.
[9] V.K. Yadav, N. Andola, S. Verma, S. Venkatesan, A survey of oblivious transfer protocol, ACM Comput. Surv. 54 (10s) (2022) http://dx.doi.org/10.1145/3503045.
[10] M.R. Albrecht, A. Davidson, A. Deo, N.P. Smart, Round-optimal verifiable oblivious pseudorandom functions from ideal lattices, in: J.A. Garay (Ed.), Public-Key Cryptography – PKC 2021, Springer International Publishing, Cham, 2021, pp. 261–289.
[11] N. Tyagi, S. Celi, T. Ristenpart, N. Sullivan, S. Tessaro, C.A. Wood, A fast and simple partially oblivious PRF, with applications, in: O. Dunkelman, S. Dziembowski (Eds.), Advances in Cryptology – EUROCRYPT 2022, Springer International Publishing, Cham, 2022, pp. 674–705.
[12] S. Casacuberta, J. Hesse, A. Lehmann, SoK: Oblivious pseudorandom functions, in: 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P), 2022, pp. 625–646, http://dx.doi.org/10.1109/EuroSP53844.2022.00045.
[13] D. Boneh, D. Kogan, K. Woo, Oblivious pseudorandom functions from isogenies, in: S. Moriai, H. Wang (Eds.), Advances in Cryptology – ASIACRYPT 2020, Springer International Publishing, Cham, 2020, pp. 520–550.
|
||||
[14] M. Chase, P. Miao, Private set intersection in the internet setting from lightweight
|
||||
CRediT authorship contribution statement oblivious PRF, in: D. Micciancio, T. Ristenpart (Eds.), Advances in Cryptology –
|
||||
CRYPTO 2020, Springer International Publishing, Cham, 2020, pp. 34–63.
|
||||
Zhuang Shan: Writing – original draft, Conceptualization. Leyou [15] Z. Shan, L. Zhang, Q. Wu, Q. Lai, Analysis, modify and apply in IIOT form
|
||||
Zhang: Writing – review & editing, Writing – original draft. Qing Wu: light-weight PSI in CM20, 2024, URL: https://eprint.iacr.org/2024/969.
|
||||
[16] J. Alwen, S. Krenn, K. Pietrzak, D. Wichs, Learning with rounding, revisited, in:
|
||||
Conceptualization. Qiqi Lai: Writing – review & editing. Fuchun Guo:
|
||||
R. Canetti, J.A. Garay (Eds.), Advances in Cryptology – CRYPTO 2013, Springer
|
||||
Writing – review & editing. Berlin Heidelberg, Berlin, Heidelberg, 2013, pp. 57–74.
|
||||
[17] A. Banerjee, C. Peikert, A. Rosen, Pseudorandom functions and lattices, in: D.
|
||||
Declaration of competing interest Pointcheval, T. Johansson (Eds.), Advances in Cryptology – EUROCRYPT 2012,
|
||||
Springer Berlin Heidelberg, Berlin, Heidelberg, 2012, pp. 719–737.
|
||||
[18] D. Bellizia, C. Hoffmann, D. Kamel, H. Liu, P. Méaux, F.-X. Standaert, Y.
|
||||
The authors declare that they have no known competing finan- Yu, Learning parity with physical noise: Imperfections, reductions and FPGA
|
||||
cial interests or personal relationships that could have appeared to prototype, IACR Trans. Cryptogr. Hardw. Embed. Syst. 2021 (2021) 390–417,
|
||||
influence the work reported in this paper. URL: https://api.semanticscholar.org/CorpusID:235814670.
|
||||
|
||||
|
||||
|
||||
Z. Shan et al. Journal of Systems Architecture 160 (2025) 103346
|
||||
|
||||
|
||||
[19] Y. Yu, J. Zhang, Smoothing out binary linear codes and worst-case sub-exponential hardness for LPN, in: T. Malkin, C. Peikert (Eds.), Advances in Cryptology – CRYPTO 2021, Springer International Publishing, Cham, 2021, pp. 473–501.
[20] V. Kolesnikov, R. Kumaresan, M. Rosulek, N. Trieu, Efficient batched oblivious PRF with applications to private set intersection, in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS ’16, Association for Computing Machinery, New York, NY, USA, 2016, pp. 818–829, http://dx.doi.org/10.1145/2976749.2978381.
[21] Z. Brakerski, E. Kirshanova, D. Stehlé, W. Wen, Learning with errors and extrapolated dihedral cosets, in: Public-Key Cryptography – PKC 2018, Springer International Publishing, 2018, pp. 702–727.
[22] A. Jain, H. Lin, J. Luo, D. Wichs, The pseudorandom oracle model and ideal obfuscation, in: H. Handschuh, A. Lysyanskaya (Eds.), Advances in Cryptology – CRYPTO 2023, Springer Nature Switzerland, Cham, 2023, pp. 233–262.
[23] S. Angel, H. Chen, K. Laine, S. Setty, PIR with compressed queries and amortized query processing, in: 2018 IEEE Symposium on Security and Privacy, SP, 2018, pp. 962–979, http://dx.doi.org/10.1109/SP.2018.00062.
[24] A. Burton, S.J. Menon, D.J. Wu, Respire: High-rate PIR for databases with small records, in: Proceedings of the Conference on Computer and Communications Security, CCS, Association for Computing Machinery (ACM), New York, NY, USA, 2024.
[25] J. Dujmovic, M. Hajiabadi, Lower-bounds on public-key operations in PIR, in: M. Joye, G. Leander (Eds.), Advances in Cryptology – EUROCRYPT 2024, Springer Nature Switzerland, Cham, 2024, pp. 65–87.
[26] B. Fisch, A. Lazzaretti, Z. Liu, C. Papamanthou, Thorpir: Single server PIR via homomorphic thorp shuffles, in: Proceedings of the Conference on Computer and Communications Security, CCS, Association for Computing Machinery (ACM), New York, NY, USA, 2024.
[27] A. Gascon, Y. Ishai, M. Kelkar, B. Li, Y. Ma, M. Raykova, Computationally secure private information retrieval and aggregation in the shuffle model, in: Proceedings of the Conference on Computer and Communications Security, CCS, Association for Computing Machinery (ACM), New York, NY, USA, 2024.
[28] A. Ghoshal, M. Zhou, E. Shi, Efficient pre-processing PIR without public-key cryptography, in: M. Joye, G. Leander (Eds.), Advances in Cryptology – EUROCRYPT 2024, Springer Nature Switzerland, Cham, 2024, pp. 210–240.
[29] M. Luo, F.-H. Liu, H. Wang, Faster FHE-based single-server private information retrieval, in: Proceedings of the Conference on Computer and Communications Security, CCS, Association for Computing Machinery (ACM), New York, NY, USA, 2024.
[30] M. Lam, J. Johnson, W. Xiong, K. Maeng, U. Gupta, Y. Li, L. Lai, I. Leontiadis, M. Rhu, H.-H.S. Lee, V.J. Reddi, G.-Y. Wei, D. Brooks, E. Suh, GPU-based private information retrieval for on-device machine learning inference, in: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, ASPLOS ’24, Association for Computing Machinery, New York, NY, USA, 2024, pp. 197–214, http://dx.doi.org/10.1145/3617232.3624855.

Zhuang Shan received the B.S. from Liaoning Institute of Science and Technology, Benxi, China, in 2019, and the M.S. from North Minzu University, Yinchuan, China, in 2022. He is currently pursuing the Ph.D. degree in mathematics with Xidian University, Xi’an, China. His current interests include cryptography, reduction of hard problems in lattices, and network security.

Leyou Zhang received the M.S. and Ph.D. degrees from Xidian University, Xi’an, China, in 2002 and 2009, respectively. From 2013 to 2014, he served as a visiting scholar at the University of Wollongong, Australia. He currently works at Xidian University as a professor. His current research interests include public key cryptography, network security and computer security. He has over 120 scientific publications in many highly ranked cybersecurity journals and conferences.

Qing Wu received the M.S. and Ph.D. degrees from Xidian University, Xi’an, China, in 2006 and 2009, respectively. She currently works with Xi’an University of Posts and Communications, Xi’an, as a Professor. Her current research interests include artificial intelligence security and cloud security.

Qiqi Lai received the B.S. from PLA University of Information Engineering, Henan, China, in 2008, and the M.S. and Ph.D. degrees from Xidian University, Xi’an, China, in 2011 and 2015. He currently works with Shaanxi Normal University, Xi’an, as a Professor. His current research interests include the theory of lattice-based public key cryptography and its provable security, as well as the construction and analysis of homomorphic encryption schemes.

Fuchun Guo received the B.S. and M.S. degrees from Fujian Normal University, China, in 2005 and 2008, respectively, and the Ph.D. degree from the University of Wollongong, Australia, in 2013. He is currently an Associate Research Fellow with the School of Computing and Information Technology, University of Wollongong. His primary research interests include public key cryptography, in particular protocols, encryption and signature schemes, and security proofs.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,846 @@
|
||||
Journal of Systems Architecture 160 (2025) 103331
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Journal of Systems Architecture
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
A CP-ABE-based access control scheme with cryptographic reverse firewall for IoV
|
||||
Xiaodong Yang a, Xilai Luo a,∗, Zefan Liao a, Wenjia Wang a, Xiaoni Du b, Shudong Li c
a College of Computer Science and Engineering, Northwest Normal University, China
b College of Mathematics and Statistics, Northwest Normal University, China
c Cyberspace Institute of Advanced Technology, Guangzhou University, China
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO

Keywords: Attribute-based encryption; Multi-authority; Internet of Vehicles; Cryptographic reverse firewall; Outsource decryption

ABSTRACT

The convergence of AI and internet technologies has sparked significant interest in the Internet of Vehicles (IoV) and intelligent transportation systems (ITS). However, the vast data generated within these systems poses challenges for onboard terminals and secure data sharing. To address these issues, we propose a novel solution combining ciphertext policy attribute-based encryption (CP-ABE) and a cryptographic reverse firewall (CRF) mechanism for IoV. This approach offers several advantages, including offline encryption and outsourced decryption to improve efficiency. The CRF mechanism adds an extra layer of security by re-randomizing vehicle data, protecting sensitive information. While single-attribute authority schemes simplify access control, they are not ideal for IoV environments. Therefore, we introduce a multi-authority scheme to enhance security. Performance analysis demonstrates our scheme’s ability to optimize encryption and decryption while safeguarding vehicle data confidentiality. In summary, our solution improves data management, access control, and security in the IoV, contributing to its safe and efficient development.
|
||||
|
||||
|
||||
|
||||
∗ Corresponding author.
E-mail addresses: yangxd200888@163.com (X. Yang), 2023222208@nwnu.edu.cn (X. Luo), lzf0097@163.com (Z. Liao), neuer1130@163.com (W. Wang), duxiaonwnu@163.com (X. Du), lishudong@gzhu.edu.cn (S. Li).
https://doi.org/10.1016/j.sysarc.2025.103331
Received 11 August 2024; Received in revised form 4 December 2024; Accepted 2 January 2025
Available online 17 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

1. Introduction

Advances in 5G technology, coupled with the growing volume of vehicular traffic, have intensified concerns regarding traffic safety, travel efficiency, and environmental impact. In response, Intelligent Transport Systems (ITS) and the IoV have emerged as critical components of modern transportation infrastructure. The functionality of the IoV relies on three key elements: the internal vehicle network, the vehicle-to-vehicle communication network, and the in-vehicle mobile internet. These elements integrate technologies such as sensors, RFID (Radio Frequency Identification), and automated control systems, operating under established communication protocols to enable seamless, dynamic data exchange between vehicles and the broader network.

While drivers benefit from applications like navigation and traffic information sharing, the limited computing power of onboard terminals is insufficient for computationally intensive tasks such as autonomous driving and AI-based obstacle avoidance [1]. A potential solution is offloading data processing to cloud servers, but the large volume of vehicle-generated data introduces high latency in communication between the onboard terminal and the cloud, compromising real-time decision-making [2–4]. This latency, coupled with the risks associated with data leakage and theft in semi-trusted cloud environments, raises significant concerns about data security [5]. Therefore, cloud-based solutions alone are insufficient to meet the demands of the IoV. To mitigate these issues, edge computing [6], fog computing [7], and Roadside Units (RSUs) [8] have been proposed. RSUs, with their higher computational capabilities, can process data more efficiently and upload it to cloud servers in real time, addressing the challenges of latency and limited onboard processing power.

However, data security remains a critical issue. One potential solution is encrypting data before transmission, which introduces challenges in ciphertext sharing. Traditional symmetric encryption, requiring a one-to-one correspondence between keys and users, proves inefficient for securing large volumes of data in IoV environments. Conventional asymmetric encryption algorithms also struggle with ciphertext sharing and are ill-suited for the frequent updates characteristic of IoV applications. A more appropriate approach is Attribute-Based Encryption (ABE), which enables fine-grained access control, supports encryption for multiple recipients, and facilitates the creation of complex access policies [9–11]. ABE allows data owners to control who can access their data, but the decryption process is computationally intensive, requiring numerous pairing and exponential operations. This places a significant burden on resource-constrained onboard terminals,
|
||||
X. Yang et al. Journal of Systems Architecture 160 (2025) 103331
|
||||
|
||||
|
||||
hindering timely data retrieval and impeding efficient communication. As the number of attributes increases, the decryption complexity grows, leading to slower decryption times and higher resource consumption. To address these challenges, several outsourced ABE schemes have been proposed [12–15], which offload expensive operations to cloud servers, alleviating the computational load on onboard terminals. However, even theoretically secure implementations of ABE are vulnerable to practical attacks. Sophisticated adversaries may exploit backdoors [16], manipulate pseudo-random number generators [17,18], or intercept hardware interactions to gain unauthorized access to sensitive data. To counter these threats, the concept of a Cryptographic Reverse Firewall (CRF) was introduced [19]. The CRF, positioned between the user and the server, intercepts and alters messages to ensure data security, even if the user is compromised.

Moreover, traditional ABE schemes rely on a single attribute authority, which poses a risk of key leakage if the authority colludes with an adversary. To mitigate this, we propose a multi-authority ABE scheme, integrated with a CRF, to enhance security and prevent collusion attacks. The key contributions of this paper are as follows:

1. We propose a CP-ABE-based scheme that enables more granular access control policies, enhancing the system’s flexibility. This proves particularly beneficial in IoV communication scenarios, where data access can be dynamically adjusted in accordance with the context.
2. The scheme integrates multiple attribute authorities to prevent collusion attacks and guarantee secure key management. Each authority is responsible for managing vehicle attribute keys, enhancing the security and efficiency of key generation, which is ideal for environments like smart cities or autonomous vehicle fleets.
3. We enhance the CRF module by incorporating key parameter re-randomization within the multi-authority ABE framework, strengthening security in IoV communications, even if certain parts of the system are compromised.
4. The scheme optimizes decryption efficiency through the use of online-offline encryption techniques and offloading decryption operations. Decryption time does not increase linearly with the number of attributes, making it suitable for real-time applications like hazard detection and traffic optimization.
5. The scheme also supports message integrity verification, which can be easily carried out by onboard terminals using simple hash functions, ensuring the authenticity of IoV messages and preventing malicious tampering in safety-critical communications.

The paper is organized as follows: Section 2 reviews existing attribute-based encryption schemes and the application of CRFs. Section 3 provides an overview of the system and security models. Section 4 discusses the basic scheme and the extended CRF module. Section 5 presents security proofs for the basic scheme and the CRF-enhanced scheme. Section 6 reports on experiments and results. Finally, Section 7 concludes the paper.

2. Related work

Sahai and Waters [10] introduced fuzzy identity-based encryption, which paved the way for Attribute-Based Encryption (ABE). ABE later branched into two forms: Key-Policy ABE (KP-ABE) [9] and Ciphertext-Policy ABE (CP-ABE) [11]. Initially, both schemes used access trees to define policies. However, the first CP-ABE scheme only provided security under the random oracle model. Waters [20] introduced an LSSS-based CP-ABE scheme that encodes policies using matrices. This foundational model has influenced many subsequent ABE schemes, which have expanded into diverse domains, particularly cloud computing. For example, Yu et al. [21] proposed a KP-ABE scheme enabling data delegation to semi-trusted cloud servers while ensuring confidentiality. Yang et al. [22] introduced a CP-ABE scheme for dynamic big data updates, and Feng et al. [23] developed a CP-ABE scheme for industrial IoT. Other schemes [24,25] have improved security and efficiency, broadening ABE’s application to the Internet of Medical Things (IoMT).

CP-ABE enables fine-grained access control, making it highly applicable in sectors such as smart healthcare and intelligent transportation. However, single-attribute authority ABE schemes are vulnerable to collusion attacks. To address this, it is desirable to delegate the attributes to different attribute authorities. Chase [26] was the first to introduce the concept of multiple attribute authorities within the ABE framework, where various authorities oversee different attributes. Lewko and Waters [27] later introduced the first decentralized ABE framework with multiple authorities. Following this, Chaudhary et al. [28] proposed a multi-authority CP-ABE scheme tailored for the Internet of Vehicles (IoV) context.

Considering the constrained computing capabilities of user terminals, Green et al. [12] introduced an ABE scheme that delegates decryption computations to the cloud. Lai et al. [13] improved upon this by achieving verifiability of outsourced decryption. Zhong et al. [29] further enhanced the efficiency of outsourced decryption ABE schemes and applied them to smart healthcare scenarios.

Mironov and Stephens-Davidowitz [19] were the first to introduce the concept of a reverse firewall. They proposed a generic architecture to prevent user tampering, which could lead to data leakage. However, the previous approach was found unsuitable for ABE schemes, prompting Ma et al. [30] to introduce a cryptographic reverse firewall utilizing the CP-ABE scheme. Additionally, Hong et al. [31] proposed a KP-ABE scheme with multiple authorities. Due to the limitations of KP-ABE in achieving fine-grained access control, Zhao et al. [32] proposed a CP-ABE scheme incorporating a CRF and leveraged outsourced decryption to alleviate computational burdens. However, these approaches suffer from drawbacks, such as reliance on a single attribute authority or excessive computational overhead. Moreover, there is a risk of system compromise, which could lead to data leakage, especially in the context of IoV, characterized by constrained computational resources and stringent data privacy requirements. At the same time, the development of IoV places higher demands on the security and flexibility of access control. Therefore, the proposed scheme combines CP-ABE, CRF, and multi-authority models to meet the requirements for security, flexibility, and low computational overhead.

3. System model and definitions

3.1. Preliminaries

1. Bilinear Maps: Let $G$ and $G_T$ be two multiplicative cyclic groups of prime order $p$, and let $g$ be a generator of $G$. A bilinear map $e : G \times G \to G_T$ must satisfy the following three properties:

   (a) Non-degeneracy: $e(g, g) \neq 1$.
   (b) Computability: $e(M, N)$ can be computed by a polynomial-time algorithm for any elements $M, N \in G$.
   (c) Bilinearity: For any $a, b \in Z_p$ and any elements $M, N \in G$, $e(M^a, N^b) = e(M, N)^{ab}$.

2. Access Structure: Consider a set $P = \{P_1, P_2, \dots, P_n\}$ representing $n$ users. A collection $Q$ of subsets of $P$ is monotone if, for any subsets $K, L$: $K \in Q$ and $K \subseteq L$ imply $L \in Q$. If $Q$ is a nonempty monotone collection of nonempty subsets of $P$, i.e. $Q \subseteq 2^{\{P_1, P_2, \dots, P_n\}} \setminus \{\emptyset\}$, then $Q$ is called a monotone access structure. In the context of access control, the sets in $Q$ are called authorized sets, and the sets not in $Q$ are called unauthorized sets.
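To make the monotonicity condition of an access structure concrete, the following Python sketch builds a collection $Q$ as the upward closure of a few minimal authorized sets and checks the condition by brute force. The party names, the helper functions, and the example policy are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations


def powerset(universe):
    """Yield every subset of `universe` as a frozenset."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_monotone(universe, collection):
    """Check: K in Q and K subset of L (within the universe) implies L in Q."""
    q = {frozenset(k) for k in collection}
    return all(l in q for k in q for l in powerset(universe) if k <= l)


def upward_closure(universe, minimal_sets):
    """Build Q as all subsets that contain at least one minimal authorized set."""
    mins = [frozenset(m) for m in minimal_sets]
    return {s for s in powerset(universe) if any(m <= s for m in mins)}


# Hypothetical example: authorized if (P1 and P2) are present, or P3 is present.
parties = {"P1", "P2", "P3"}
Q = upward_closure(parties, [{"P1", "P2"}, {"P3"}])
```

Brute-force enumeration is exponential in the number of parties, so this only serves to illustrate the definition on small examples.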
|
||||
|
||||
|
||||
|
||||
|
||||
3. Linear Secret Sharing Scheme (LSSS): Let $\tilde{A} = \{\tilde{A}_1, \tilde{A}_2, \dots, \tilde{A}_N\}$ be the set of all possible attribute names. Each attribute name $\tilde{A}_i \in \tilde{A}$ has an associated set of attribute values, denoted $\tilde{A}_i = \{A_{i,1}, A_{i,2}, \dots, A_{i,b_i}\}$, where $b_i$ is the number of values of $\tilde{A}_i$. An access policy is denoted $T = (M, \rho, V)$. Here $M$ is a matrix with $l$ rows and $n$ columns, $\rho$ is a function that associates each row of $M$ with an attribute name in $\tilde{A}$, and $V = \{v_{\rho(i)}\}_{i \in [1,l]}$ is the set of attribute values associated with $(M, \rho)$. An LSSS consists of the following pair of algorithms:

   (a) Distribute: To share a secret value $s \in Z_p$, choose a random vector $f = (s, f_2, \dots, f_n)$, where $f_2, \dots, f_n \in Z_p$. Calculate $\lambda_i = M_i \cdot f$, where $M_i$ is the $i$th row of matrix $M$. $\lambda_i$ is the share of $s$ that corresponds to $\rho(i)$.
   (b) Reconstruct: Let $S$ be any authorized attribute set and $I = \{i : \rho(i) \in S\} \subseteq \{1, 2, \dots, l\}$. Then there is a collection of constants $\{\omega_i \in Z_p\}_{i \in I}$ satisfying $\sum_{i \in I} \omega_i M_i = (1, 0, \dots, 0)$. The secret $s$ can be reconstructed by calculating $\sum_{i \in I} \omega_i \lambda_i = s$, since $\sum_{i \in I} \omega_i \lambda_i = \big(\sum_{i \in I} \omega_i M_i\big) \cdot f = (1, 0, \dots, 0) \cdot f = s$.

   Let $\mathbb{S} = (I_u, S)$ represent the collection of attributes of a user, where $I_u \subseteq \tilde{A}$ is the set of the user’s attribute names and $S = \{s_i\}_{i \in I_u}$ is the set of all attribute values of the user. If $I_u$ satisfies $(M, \rho)$ and $s_{\rho(i)} = v_{\rho(i)}$ for all $i \in I$, where $I = \{i : \rho(i) \in I_u\} \subseteq \{1, 2, \dots, l\}$, then $\mathbb{S}$ is said to match $T$.

4. q-BDHE problem: Let $G$ and $G_T$ be two multiplicative cyclic groups of prime order $p$, let $g$ be a generator of $G$, and let $e : G \times G \to G_T$ be a bilinear map. Choose $t, f \in Z_p$ at random, and calculate $J = (g, g^{t}, g^{f}, g^{f^2}, \dots, g^{f^q}, g^{f^{q+2}}, \dots, g^{f^{2q}})$. The $q$-BDHE assumption states that, given $J$, no polynomial-time algorithm can distinguish $e(g, g)^{f^{q+1} t} \in G_T$ from a random element $K \in G_T$ with non-negligible advantage.

5. Cryptographic Scheme: The cryptographic scheme defines the interaction between stateful parties $(P_1, P_2, \dots, P_l)$. The process of scheme establishment is denoted by $setup(1^{\lambda})$, where $\lambda$ is the security parameter. Each party takes the public parameters $P_g$ and related messages as input and then runs the system initialization algorithm to obtain its state; the resulting states are $(\upsilon_{P_i})_{i=1}^{l}$. According to the order in which the scheme proceeds, the parties process messages from other parties in the scheme. Each party has the corresponding algorithms $next_{P_i}(\upsilon_{P_i})$ and $receive_{P_i}(\upsilon_{P_i})$: $next_{P_i}(\upsilon_{P_i})$ outputs the updated message, and $receive_{P_i}(\upsilon_{P_i})$ outputs the state of the party after the message update. After the scheme is completed, each party runs $output_{P_i}(\upsilon_{P_i})$ to return the results of the scheme. We assume that the scheme meets a functionality requirement and a security requirement.

6. Cryptographic Reverse Firewall: A cryptographic reverse firewall is a stateful algorithm $\mathcal{W}$. Given a current state and an input message, the algorithm processes them and outputs an updated state and message. For ease of presentation, the state of $\mathcal{W}$ is not explicitly written out in the definition. Given a party $P$ and a firewall $\mathcal{W}$, the expression $\mathcal{W} \circ P$ denotes the party that emerges from their composition:

   $$
   \mathcal{W} \circ P := \begin{cases}
   receive_{\mathcal{W} \circ P}(\upsilon, m) = receive_{P}(\upsilon, \mathcal{W}(m)) \\
   next_{\mathcal{W} \circ P}(\upsilon) = \mathcal{W}(next_{P}(\upsilon)) \\
   output_{\mathcal{W} \circ P}(\upsilon) = output_{P}(\upsilon)
   \end{cases} \tag{1}
   $$

   When the composite party participates in the scheme, the initial state of the firewall is set as the public parameter $P_g$. If $\mathcal{W}$ and a party $P$ form a composed party, then we call $\mathcal{W}$ a cryptographic reverse firewall for $P$. Next we give definitions of three properties of CRFs:

   (a) Function Maintaining: For any reverse firewall $\mathcal{W}$ and any party $P$, let $\mathcal{W}^1 \circ P = \mathcal{W} \circ P$. For $k \geq 2$, let $\mathcal{W}^k \circ P = \mathcal{W} \circ (\mathcal{W}^{k-1} \circ P)$. For a scheme that satisfies its functionality requirement, we say that the reverse firewall maintains functionality if the composed party $\mathcal{W} \circ P$ guarantees the functionality of the party $P$ under the scheme in polynomial time.
   (b) Weakly Security-preserving: Assume the scheme fulfills its functionality requirement and its security requirement. We say that the reverse firewall is weakly security-preserving if, against any polynomial-time adversary $B$, $\mathcal{W} \circ P$ still satisfies the security requirement.
   (c) Weakly Exfiltration-resistant: The Leak game, depicted in Fig. 1, is due to Mironov and Stephens-Davidowitz [19]. It is a security game between a reverse firewall $\mathcal{W}$ of party $P$ and a scheme containing a tampered party. The adversary may control a party by tampering with the party’s algorithms $receive$, $next$, $output$. The purpose of the game is to let the adversary discern whether the party’s actions are honest or tampered with. Thus, a reverse firewall with leak resistance makes it impossible for an adversary to tell whether party $P$ has been tampered with or, when the party is known to have been tampered with, whether a given operation is honest, hence protecting the privacy of the party. If adversary $B$ in the Leak game cannot succeed in polynomial time with a noticeable advantage while the party’s functionality is maintained, then we say that the reverse firewall is weakly exfiltration-resistant.

Fig. 1. Leak game.

3.2. System model

Fig. 2 depicts the four components that constitute our scheme: attribute authorities (AA), cloud server (CS), data user (DU), and data owner (DO). In addition, the system contains three reverse firewalls. To implement data re-randomization within the RSU, the three firewalls are positioned as follows: $\mathcal{W}_{AA}$, the reverse firewall for AA; $\mathcal{W}_{DO}$, the reverse firewall for DO; and $\mathcal{W}_{DU}$, the reverse firewall for DU.

CS is mainly deployed to store ciphertexts and conversion keys.
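The composition of a firewall with a party in Eq. (1) can be pictured as a thin wrapper around the party's three algorithms. The Python sketch below is only an illustration under simplifying assumptions: the firewall is modeled as a stateless function on messages (the definition allows a stateful algorithm), and `EchoParty`, `ComposedParty`, and `compose` are hypothetical names that do not appear in the paper.

```python
class EchoParty:
    """Hypothetical party P: records received messages and reports how many it has seen."""

    def __init__(self):
        self.state = []

    def receive(self, message):
        self.state.append(message)

    def next(self):
        return len(self.state)

    def output(self):
        return list(self.state)


class ComposedParty:
    """Models W∘P: the firewall rewrites incoming and outgoing messages of the party."""

    def __init__(self, firewall, party):
        self.firewall = firewall
        self.party = party

    def receive(self, message):
        # receive_{W∘P}(state, m) = receive_P(state, W(m))
        self.party.receive(self.firewall(message))

    def next(self):
        # next_{W∘P}(state) = W(next_P(state))
        return self.firewall(self.party.next())

    def output(self):
        # output_{W∘P}(state) = output_P(state)
        return self.party.output()


def compose(firewall, party, k=1):
    """Build W^k∘P by wrapping the party k times with the same firewall."""
    composed = party
    for _ in range(k):
        composed = ComposedParty(firewall, composed)
    return composed


# Toy firewall (assumption): sanitize messages by reducing them modulo 10.
sanitize = lambda m: m % 10
wrapped = compose(sanitize, EchoParty())
wrapped.receive(123)
```

Because the wrapper exposes the same three methods as the party, stacking it repeatedly gives the iterated composition used in the function-maintaining property.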
|
||||
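The Distribute and Reconstruct algorithms of the LSSS from the preliminaries can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the modulus `P` is a stand-in prime rather than the order of a pairing group, the example policy matrix encodes the hypothetical policy "A AND (B OR C)", attribute values $v_{\rho(i)}$ are ignored, and all function names are illustrative.

```python
import secrets

P = 2**61 - 1  # stand-in prime modulus (assumption), playing the role of the group order p


def distribute(matrix, secret):
    """Compute shares lambda_i = M_i . f with f = (s, f_2, ..., f_n) chosen at random."""
    n = len(matrix[0])
    f = [secret % P] + [secrets.randbelow(P) for _ in range(n - 1)]
    return [sum(m * x for m, x in zip(row, f)) % P for row in matrix]


def solve_omega(rows):
    """Find omega with sum_i omega_i * rows[i] = (1, 0, ..., 0) mod P, or None."""
    k, n = len(rows), len(rows[0])
    # One equation per matrix column; the last entry of each row is the right-hand side.
    aug = [[rows[i][j] % P for i in range(k)] + [1 if j == 0 else 0] for j in range(n)]
    pivots, r = [], 0
    for c in range(k):
        pivot = next((i for i in range(r, n) if aug[i][c]), None)
        if pivot is None:
            continue
        aug[r], aug[pivot] = aug[pivot], aug[r]
        inv = pow(aug[r][c], -1, P)
        aug[r] = [v * inv % P for v in aug[r]]
        for i in range(n):
            if i != r and aug[i][c]:
                factor = aug[i][c]
                aug[i] = [(a - factor * b) % P for a, b in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
        if r == n:
            break
    # A zero row with a nonzero right-hand side means the set is unauthorized.
    if any(all(v == 0 for v in row[:k]) and row[k] for row in aug):
        return None
    omega = [0] * k
    for row_index, c in enumerate(pivots):
        omega[c] = aug[row_index][k]
    return omega


def reconstruct(matrix, rho, shares, attributes):
    """Recover s = sum_i omega_i * lambda_i over the rows whose attribute is held."""
    index = [i for i, name in enumerate(rho) if name in attributes]
    if not index:
        return None
    omega = solve_omega([matrix[i] for i in index])
    if omega is None:
        return None
    return sum(w * shares[i] for w, i in zip(omega, index)) % P


# Hypothetical policy "A AND (B OR C)" as an LSSS matrix with row labels rho.
M = [[1, 1], [0, -1], [0, -1]]
rho = ["A", "B", "C"]
secret = 123456789
shares = distribute(M, secret)
```

With these shares, `reconstruct(M, rho, shares, {"A", "C"})` returns the secret, while an unauthorized set such as `{"B", "C"}` yields `None` because no suitable constants $\omega_i$ exist.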
|
||||
|
||||
|
||||
|
||||
Fig. 2. System model.

algorithm $KeyGen$ and obtains the corresponding secret key $SK_i$. Then $F$ executes algorithm $\mathcal{W}_{AA}.KG$ and gets the re-randomized private key $SK_i'$. Subsequently, $F$ executes $KeyGen.ran$ to get the conversion key $TK_i$. Then $F$ executes $\mathcal{W}_{DU}.TKUpdate$ to obtain the re-randomized conversion key $TK_i'$. Eventually, $F$ sends $(SK_i', TK_i')$ to $B$.
4. Challenge Phase: $B$ submits two equal-length plaintexts $m_0, m_1$. $F$ randomly chooses $b \in \{0, 1\}$ and executes Enc.Offline*, Enc.Online* to obtain the challenge ciphertext $CT_b$. Then $F$ calls $\mathcal{W}_{DO}.Enc.Offline$, $\mathcal{W}_{DO}.Enc.Online$ to get the updated ciphertext $CT_b'$. $F$ sends $CT_b'$ to $B$.
5. Query Phase 2: Same as Query Phase 1.
6. Guess Phase: $B$ outputs the guess $b' \in \{0, 1\}$ for $b$.

Definition 1. The basic scheme is selectively CPA-secure if the advantage $|\Pr[b' = b] - 1/2|$ of any polynomial-time adversary $B$ in the above game is negligible.
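The challenge phase, in which a firewall re-randomizes the challenge ciphertext before it reaches the adversary, can be illustrated with a toy example. The sketch below deliberately swaps in textbook ElGamal over a tiny group for the paper's CP-ABE scheme, because ElGamal ciphertexts are easy to re-randomize. The parameters, function names, and the game harness are all assumptions made for illustration and offer no real security.

```python
import secrets

# Toy parameters (assumption): P_MOD = 2*Q + 1 is a safe prime, G generates the order-Q subgroup.
P_MOD, Q, G = 2039, 1019, 4


def keygen():
    """Return (secret key x, public key g^x)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P_MOD)


def encrypt(pk, message):
    """ElGamal encryption: (g^r, m * pk^r)."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P_MOD), message * pow(pk, r, P_MOD) % P_MOD


def firewall_rerandomize(pk, ciphertext):
    """Firewall step: multiply in fresh randomness without knowing the plaintext."""
    r2 = secrets.randbelow(Q - 1) + 1
    c1, c2 = ciphertext
    return c1 * pow(G, r2, P_MOD) % P_MOD, c2 * pow(pk, r2, P_MOD) % P_MOD


def decrypt(sk, ciphertext):
    """Recover m = c2 * c1^(-x)."""
    c1, c2 = ciphertext
    return c2 * pow(c1, -sk, P_MOD) % P_MOD


def play_game(adversary):
    """One run of an IND-CPA style game; returns True if the adversary guesses b."""
    sk, pk = keygen()
    m0, m1 = pow(G, 5, P_MOD), pow(G, 7, P_MOD)  # two fixed challenge plaintexts
    b = secrets.randbelow(2)
    challenge = encrypt(pk, (m0, m1)[b])
    updated = firewall_rerandomize(pk, challenge)  # plays the role of CT_b -> CT_b'
    return adversary(pk, m0, m1, updated) == b


sk, pk = keygen()
message = pow(G, 11, P_MOD)
original = encrypt(pk, message)
rerandomized = firewall_rerandomize(pk, original)
```

Decrypting `rerandomized` still yields `message`, which mirrors the function-maintaining property, while the ciphertext itself changes, so randomness chosen by a tampered encryptor no longer reaches the adversary unchanged.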
|
||||
|
||||
4. System construction

4.1. Basic scheme

The scheme contains $N$ attribute authorities, each attribute authority managing one class of attributes $\tilde{A}_i = \{A_{i,1}, A_{i,2}, \dots, A_{i,b_i}\}$, $A_{i,j} \in Z_p$, $i = 1, 2, \dots, N$, $j = 1, 2, \dots, b_i$.

1. Global Setup: Attribute authority $AA_1$ sets the public parameters $Params = \{g, u, v, w, h, G, G_T, H_0()\}$ and publishes them. $H_0$ is the collision-resistant hash function used to generate verification credentials within the system, $H_0() : \{0, 1\}^{*} \to \{0, 1\}^{\ell_{H_0}}$.
2. AASetup:

   (a) Each attribute authority randomly chooses $\alpha_i \in Z_p$, computes $Y_i = e(g, g)^{\alpha_i}$, and then distributes $Y_i$ to the other attribute authorities. Finally, each attribute authority calculates $Y = \prod_{i=1}^{N} Y_i = e(g, g)^{\sum_{i=1}^{N} \alpha_i} = e(g, g)^{\alpha}$, where $\alpha = \sum_{i=1}^{N} \alpha_i$.
   (b) Each attribute authority $\hat{A}_i$ operates as follows:
      • Randomly select $N - 1$ elements $s_{ik} \in Z_p$ ($k \in \{1, 2, \dots, N\} \setminus \{i\}$), calculate $g^{s_{ik}}$ and send it to other attribute authorities.

AA is charged with the responsibility of establishing the public parameters and generating the master secret keys.

DO sets the access policy that guides the encryption process and produces a verification credential. After these steps are accomplished, the DO uploads both the encrypted data and the verification credential to the cloud server.

DU initiates the process by generating a conversion key, which is then uploaded to the cloud server. Following this, the DU retrieves the ciphertext and the verification credential from the cloud server to carry out the concluding stages of decryption and integrity verification.

$\mathcal{W}_{AA}$ re-randomizes the public parameters and the secret keys that belong to users.

$\mathcal{W}_{DO}$ is responsible for re-randomizing ciphertexts.

$\mathcal{W}_{DU}$ is responsible for re-randomizing conversion keys and conversion ciphertexts.

3.3. Security model

The DO and the DU in our system are considered completely trustworthy. However, the reverse firewalls and cloud server are deemed ‘‘honest and curious’’, meaning they will comply with the algorithm’s steps but will also endeavor to discover any private information within the data. Furthermore, there is a risk of the Attribute Authority colluding with an adversary. In response to this challenge, we have put in
|
||||
place a selective CPA security game, and the sequence of events within • After receiving 𝑁 − 1 components 𝑔 𝑠𝑘𝑖 from other
|
||||
this game is as follows: ascribe powers 𝐴̂ 𝑘 (𝑘 ∈ {1, 2, … , 𝑁}∖{𝑖}), the master
|
||||
key 𝑀 𝐾 𝑖 is calculated by the following formula:
|
||||
1. Init Phase: The rival 𝐵 declares a set of malicious attribute ∏
|
||||
authorities 𝑅 = (𝐴̂ 𝑖 )𝑖∈𝐼 and access policies (𝑀𝑖 ∗ , 𝜌𝑖 ∗ )𝑖∈𝐼 ∗ to be 𝑀𝐾𝑖 = (𝑔 𝑠𝑖𝑘 ∕𝑔 𝑠𝑘𝑖 )
|
||||
challenged, where 𝐼 ⊆ {1, 2, … , 𝑁}, 𝐼 ∗ ⊆ {1, 2, … , 𝑁}. Then 𝑘∈{1,2,…,𝑁}∖{𝑖}
|
||||
∑ ∑
|
||||
𝐵 sends algorithms 𝐺𝑙𝑜𝑏𝑎𝑙𝑠𝑒𝑡𝑢𝑝∗ , 𝐴𝐴𝑆 𝑒𝑡𝑢𝑝∗ , 𝐾 𝑒𝑦𝐺𝑒𝑛∗ , 𝐾 𝑒𝑦.𝑟𝑎𝑛∗ , ( 𝑠𝑖𝑘 − 𝑠𝑘𝑖 )
|
||||
𝑒𝑛𝑐 .𝑜𝑓 𝑓 𝑙𝑖𝑛𝑒∗ , 𝑒𝑛𝑐 .𝑜𝑛𝑙𝑖𝑛𝑒∗ to challenger 𝐹 . = 𝑔 𝑘∈{1,2,…,𝑁}∖{𝑖} 𝑘∈{1,2,…,𝑁}∖{𝑖}
|
||||
, (2)
|
||||
2. Setup Phase: 𝐹 executes algorithms 𝐺𝑙𝑜𝑏𝑎𝑙𝑠𝑒𝑡𝑢𝑝∗ and 𝐴𝐴𝑆 𝑒𝑡𝑢𝑝∗ to ∏𝑁
|
||||
obtain the public parameter 𝑃 𝑎𝑟𝑎𝑚𝑠, attribute authorities public where 𝑖=1 𝑀 𝐾𝑖 = 1.
|
||||
key 𝑃 𝐾 and private key pairs (𝑃 𝐾𝑖 , 𝐴𝑆 𝐾 𝑖 )𝑖∈𝐼 . Subsequently, the • For each attribute 𝐴𝑖,𝑗 ∈ 𝐴̃𝑖 , calculate 𝑢𝐴𝑖,𝑗 ℎ.
|
||||
reverse firewall puts the 𝑊𝐴𝐴 .𝑆 𝑒𝑡𝑈 𝑝 algorithm into action to
|
||||
Attribution authority publishes public key 𝑃 𝐾 = (𝑔 , 𝑢, ℎ,
|
||||
generate and announce the new public key 𝑃 𝐾 ′ , and in doing
|
||||
𝑤, 𝑣, 𝑒(𝑔 , 𝑔)𝛼 , 𝐺, 𝐺𝑇 ) and keeps its own private key 𝐴𝑆 𝐾 𝑖 =
|
||||
so, also retains the corresponding random number 𝑓 . 𝐵 can
|
||||
{𝛼𝑖 , (𝑢𝐴𝑗 ℎ)𝐴 ∈𝐴̂ , 𝑀 𝐾𝑖 }.
|
||||
receive 𝑃 𝐾𝑖 ′ from all non-malicious attribute authorities and 𝑗 𝑖
|
||||
|
||||
(𝑃 𝐾𝑖 , 𝐴𝑆 𝐾 𝑖 )𝑖∈𝐼 from all malicious attribute authorities.
|
||||
3. KeyGen: Each attribute authority 𝐴̂ 𝑖 execute algorithm as fol-
|
||||
3. Query Phase 1: Adaptive requests for secret keys regarding at-
|
||||
lows:
|
||||
tribute sets 𝑆1 , 𝑆2 , … , 𝑆𝑞 can be made by 𝐵. Each time 𝐵 per-
|
||||
forms a key query, when submitting a set of attributes, it is (a) Select 𝜃𝑖 ∈ 𝑍𝑝 at random, thereafter derive the elements
|
||||
imperative that they do not comply with the access structure of the secret key, denoted as 𝑀 𝐾𝑖 ⋅ 𝑔 𝜃𝑖 , 𝑀 𝐾𝑖 ⋅ 𝑣−𝜃𝑖 , 𝑀 𝐾𝑖 ⋅
|
||||
rules outlined by (𝑀𝑖 ∗ , 𝜌𝑖 ∗ )𝑖∈𝐼 ∗ , nor come from a malicious at- 𝑔 𝛼𝑖 ⋅ 𝑤𝜃𝑖 and subsequently convey these elements to the
|
||||
tribute authority 𝑅 = (𝐴̂ 𝑖 )𝑖∈𝐼 . For every query 𝑆𝑖 , 𝐹 executes pertinent attribute authorities.
|
||||
|
||||
|
||||
|
||||
|
||||
(b) Upon obtaining the components from the other attribute authorities, compute the secret key as follows, where $r = \sum_{i=1}^{N} \theta_i$ and the factors $MK_i$ cancel because $\prod_{i=1}^{N} MK_i = 1$:

$$K_0 = \prod_{i=1}^{N} MK_i \cdot g^{\alpha_i} \cdot w^{\theta_i} = g^{\sum_{i=1}^{N} \alpha_i} w^{r} \tag{3}$$

$$K_1 = \prod_{i=1}^{N} MK_i \cdot g^{\theta_i} = g^{\sum_{i=1}^{N} \theta_i} = g^{r} \tag{4}$$

$$K_v = \prod_{i=1}^{N} MK_i \cdot v^{-\theta_i} = v^{-r} \tag{5}$$

(c) For each attribute in $S_{ID} \cap \hat{A}_i$, indexed by $i \in [1, \sigma]$ with $\sigma \le N$, randomly choose $r_i \in Z_p$, where $S_{ID}$ denotes the user's attribute set. Calculate $K_{i,2} = g^{r_i}$, $K_{i,3} = (u^{A_i} h)^{r_i} \cdot K_v = (u^{A_i} h)^{r_i} v^{-r}$.

Then the user gets the secret key $SK = \{K_0, K_1, \{K_{i,2}, K_{i,3}\}_{i \in [1,\sigma]}, S_{ID}\}$.

4. KeyGen.ran: Given $SK$, the data user selects a random $\tau \in Z_p$ and calculates $K_0' = K_0^{1/\tau} = g^{\alpha/\tau} w^{r/\tau}$, $K_1' = K_1^{1/\tau} = g^{r/\tau}$. For $i = 1, 2, \ldots, \sigma$, the data user calculates $K_{i,2}' = K_{i,2}^{1/\tau} = g^{r_i/\tau}$, $K_{i,3}' = K_{i,3}^{1/\tau} = (u^{A_i} h)^{r_i/\tau} v^{-r/\tau}$. The transformation key is $TK = (S_{ID}, K_0', K_1', \{K_{i,2}', K_{i,3}'\}_{i \in [1,\sigma]})$ and the recovery key is $RK = \tau$.

5. Enc.Offline: Input $PK$, and let $N'$ denote the upper limit on the number of rows of the secret sharing matrix. The data owner randomly chooses $s \in Z_p$ and calculates $\hat{C} = e(g, g)^{\alpha s}$, $\hat{C}_0 = g^{s}$. For $j = 1, 2, \ldots, N'$, the data owner randomly chooses $d_j \in Z_p$ and calculates $\hat{C}_{j,1} = v^{d_j}$, $\hat{C}_{j,2} = h^{-d_j}$, $\hat{C}_{j,3} = g^{d_j}$. The intermediate ciphertext is $MT = (s, \hat{C}, \hat{C}_0, \{d_j, \hat{C}_{j,1}, \hat{C}_{j,2}, \hat{C}_{j,3}\}_{j \in [1,N']})$.

6. Enc.Online: Input $MT$, plaintext $m$ and access structure $(M, \rho)$, where $M$ is a matrix of $l$ rows and $n$ columns ($l \le N'$). The data owner randomly chooses a vector $\vec{y} = (s, y_2, \ldots, y_n)^T \in Z_p^{n \times 1}$. The secret shares are $\vec{\lambda} = (\lambda_1, \lambda_2, \ldots, \lambda_l)^T = M \vec{y}$. Then the data owner calculates $Token = H_0(m)$, $C = m \cdot \hat{C} = m \cdot e(g, g)^{\alpha s}$, $C_0 = \hat{C}_0 = g^{s}$. For $j = 1, 2, \ldots, l$, the data owner computes $C_{j,1} = \hat{C}_{j,1} \cdot w^{\lambda_j} = w^{\lambda_j} v^{d_j}$, $C_{j,2} = \hat{C}_{j,2} \cdot u^{-\rho(j) d_j} = (u^{\rho(j)} h)^{-d_j}$, $C_{j,3} = \hat{C}_{j,3} = g^{d_j}$. The ciphertext is $CT = ((M, \rho), C, C_0, \{C_{j,1}, C_{j,2}, C_{j,3}\}_{j \in [1,l]})$ and the verification credential is $Token$.

7. Dec.Out: If the user's attribute set $S_{ID}$ does not satisfy the access structure, the cloud server returns the null value $\perp$ and terminates the algorithm. Otherwise, the cloud server collects $I = \{i : \rho(i) \in S_{ID}\}$ and calculates $\{\omega_i \in Z_p\}_{i \in I}$ such that $\sum_{i \in I} \omega_i \cdot M_i = (1, 0, \ldots, 0)$, where $M_i$ is the $i$th row of matrix $M$. Then the cloud server calculates

$$A = \frac{e(C_0, K_0')}{\prod_{i \in I} \left( e(C_{i,1}, K_1') \cdot e(C_{i,2}, K_{j,2}') \cdot e(C_{i,3}, K_{j,3}') \right)^{\omega_i}} = e(g, g)^{\alpha s/\tau}, \tag{6}$$

where $j$ is the index of the attribute $\rho(i)$ in $S_{ID}$.

8. Dec.User: The data user uses the recovery key $RK$ to decrypt as follows:

$$\frac{C}{A^{\tau}} = \frac{m \cdot e(g, g)^{\alpha s}}{\left( e(g, g)^{\alpha s/\tau} \right)^{\tau}} = m, \tag{7}$$

then the data user uses the verification credential $Token$ to complete the ciphertext verification: if $H_0(m) = Token$ holds, the ciphertext is correct. Otherwise, the ciphertext may have been tampered with.

4.2. CRF scheme

1. Initialization: The attribute authorities run GlobalSetup and AASetup, and each attribute authority sends $\alpha_i$ to $W_{AA}$. Then $W_{AA}$ executes the following algorithms.

$W_{AA}.SetUp$: Upon receiving the parameters from AA, the CRF $W_{AA}$ calculates $\alpha = \sum_{i=1}^{N} \alpha_i$, then randomly chooses $a, b, c, d, e, f \in Z_p$ and calculates $g' = g^{a}$, $u' = u^{b}$, $h' = h^{c}$, $w' = w^{d}$, $v' = v^{e}$, $\alpha' = \alpha + f$, $e(g', g')^{\alpha'} = e(g, g)^{a^2 (\alpha + f)}$. $W_{AA}$ stores $f$ and publishes the updated $PK' = (g', u', h', w', v', e(g', g')^{\alpha'}, G, G_T)$.

After receiving $PK'$, AA executes KeyGen to generate the secret key $SK = \{K_0, K_1, \{K_{i,2}, K_{i,3}\}_{i \in [1,\sigma]}, S_{ID}\}$ and sends $SK$ to the CRF $W_{AA}$. $W_{AA}$ runs the following algorithm for re-randomization.

$W_{AA}.KG$: Input $PK'$, $f$ and $N$, where $N$ represents the total number of attributes. $W_{AA}$ randomly selects $r', r_1', r_2', \ldots, r_N' \in Z_p$ and calculates $\tilde{K}_0' = g'^{f} w'^{r'}$, $\tilde{K}_1' = g'^{r'}$. For $i = 1, 2, \ldots, N$, $W_{AA}$ computes $\tilde{K}_{i,2}' = g'^{r_i'}$, $\tilde{K}_v' = v'^{-r'}$, $\tilde{K}_{i,3}' = (u'^{A_i} h')^{r_i'} \cdot \tilde{K}_v' = (u'^{A_i} h')^{r_i'} v'^{-r'}$. The intermediate key is $ZSK = (\tilde{K}_0', \tilde{K}_1', \{r_i', \tilde{K}_{i,2}', \tilde{K}_{i,3}'\}_{i \in [1,N]})$.

Finally, $W_{AA}$ computes $K_0' = K_0 \cdot \tilde{K}_0' = g'^{\alpha + f} w'^{r + r'} = g'^{\alpha'} w'^{r + r'}$ and $K_1' = K_1 \cdot \tilde{K}_1' = g'^{r + r'}$. For $i = 1, 2, \ldots, \sigma$, where $\sigma \le N$, $W_{AA}$ calculates $K_{i,2}' = K_{i,2} \cdot \tilde{K}_{i,2}' = g'^{r_i + r_i'}$ and $K_{i,3}' = K_{i,3} \cdot \tilde{K}_{i,3}' = (u'^{A_i} h')^{r_i + r_i'} v'^{-r - r'}$. $W_{AA}$ sends the updated $SK' = (K_0', K_1', \{K_{i,2}', K_{i,3}'\}_{i \in [1,\sigma]}, S_{ID})$ to the data user.

2. Data Upload: The data owner invokes Enc.Offline and Enc.Online to obtain the ciphertext $CT = ((M, \rho), C, C_0, \{C_{j,1}, C_{j,2}, C_{j,3}\}_{j \in [1,l]})$ and the verification credential $Token$, then sends $CT$ and $Token$ to the CRF $W_{DO}$. $W_{DO}$ executes the following algorithms.

$W_{DO}.Enc.Offline$: Input $PK'$ and $N'$, where $N'$ is the maximum number of rows allowed in the access structure. $W_{DO}$ randomly chooses $s' \in Z_p$ as the secret value and calculates $\hat{C}' = e(g', g')^{\alpha' s'}$, $\hat{C}_0' = g'^{s'}$. For $j = 1, 2, \ldots, N'$, $W_{DO}$ randomly chooses $d_j' \in Z_p$ and calculates $\hat{C}_{j,1}' = v'^{d_j'}$, $\hat{C}_{j,2}' = h'^{-d_j'}$, $\hat{C}_{j,3}' = g'^{d_j'}$. The intermediate ciphertext is $MT' = (s', \hat{C}', \hat{C}_0', \{\hat{C}_{j,1}', \hat{C}_{j,2}', \hat{C}_{j,3}'\}_{j \in [1,N']})$.

$W_{DO}.Enc.Online$: Input $PK'$, $MT'$ and $CT$. The CRF $W_{DO}$ randomly selects a vector $\vec{y}' = (s', y_2', \ldots, y_n')^T \in Z_p^{n \times 1}$ and computes the secret shares $\vec{\lambda}' = (\lambda_1', \ldots, \lambda_l')^T = M \vec{y}'$. Then $W_{DO}$ computes $C' = C \cdot \hat{C}' = m \cdot e(g', g')^{\alpha' (s + s')}$, $C_0' = C_0 \cdot \hat{C}_0' = g'^{s + s'}$. For $j = 1, 2, \ldots, l$, where $l \le N'$, $W_{DO}$ calculates

$$C_{j,1}' = C_{j,1} \cdot \hat{C}_{j,1}' \cdot w'^{\lambda_j'} = w'^{\lambda_j + \lambda_j'} v'^{d_j + d_j'}, \tag{8}$$

$$C_{j,2}' = C_{j,2} \cdot \hat{C}_{j,2}' \cdot u'^{-\rho(j) d_j'} = (u'^{\rho(j)} h')^{-(d_j + d_j')}, \tag{9}$$

$$C_{j,3}' = C_{j,3} \cdot \hat{C}_{j,3}' = g'^{d_j + d_j'}. \tag{10}$$

$W_{DO}$ transmits the re-randomized ciphertext $CT' = (C', C_0', \{C_{j,1}', C_{j,2}', C_{j,3}'\}_{j \in [1,l]}, (M, \rho))$, along with the $Token$, to the cloud server.

3. Data Download: The data user runs KeyGen.ran($SK'$) and sends $TK = (S_{ID}, K_0'', K_1'', \{K_{i,2}'', K_{i,3}''\}_{i \in [1,\sigma]})$ to the CRF $W_{DU}$. Then $W_{DU}$ executes the following algorithm.

$W_{DU}.TKUpdate$: $W_{DU}$ randomly chooses $\varphi \in Z_p$ and calculates

$$K_0''' = K_0''^{\,1/\varphi} = g'^{\alpha'/(\tau\varphi)} w'^{(r + r')/(\tau\varphi)}, \tag{11}$$

$$K_1''' = K_1''^{\,1/\varphi} = g'^{(r + r')/(\tau\varphi)}, \tag{12}$$

$$K_{i,2}''' = K_{i,2}''^{\,1/\varphi} = g'^{(r_i + r_i')/(\tau\varphi)}, \tag{13}$$

$$K_{i,3}''' = K_{i,3}''^{\,1/\varphi} = (u'^{A_i} h')^{(r_i + r_i')/(\tau\varphi)} v'^{-(r + r')/(\tau\varphi)}. \tag{14}$$
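As a rough numeric check of the blinding used in Dec.User (Eq. (7)) and in $W_{DU}.TKUpdate$ (Eqs. (11)–(14)), the sketch below models $G_T$ as the order-1019 subgroup of $Z_{2039}^{*}$ and works directly with the pairing values instead of computing pairings. The toy parameters, the function name `blinded_decrypt`, and the choice to pass $\alpha'$ and $s + s'$ as plain integers are assumptions made purely for illustration; nothing here is secure or taken from the authors' implementation.

```python
import secrets

P = 2039   # toy safe prime (P = 2p + 1); far too small for real use
p = 1019   # prime order of the subgroup standing in for G_T
gt = 4     # generator of that subgroup; stands in for e(g', g')


def rand_exp():
    """Random nonzero exponent in Z_p."""
    return secrets.randbelow(p - 1) + 1


def inv_exp(x):
    """Inverse of an exponent modulo the group order p."""
    return pow(x, -1, p)


def blinded_decrypt(m, alpha, s_total):
    """Run the tau/phi blinding on a toy message m (0 < m < P)."""
    tau = rand_exp()   # recovery key RK chosen by the data user
    phi = rand_exp()   # blinding value chosen by the firewall W_DU

    # Re-randomized ciphertext component: C' = m * e(g',g')^{alpha'(s+s')}
    c = (m * pow(gt, alpha * s_total % p, P)) % P

    # Cloud output after Dec.Out with the doubly blinded key:
    # A = e(g',g')^{alpha'(s+s')/(tau*phi)}
    a = pow(gt, alpha * s_total * inv_exp(tau) * inv_exp(phi) % p, P)

    # W_DU.Dec removes the firewall's blinding: A' = A^phi
    a_prime = pow(a, phi, P)

    # Dec.User removes the user's blinding: m = C' / A'^tau
    return (c * pow(pow(a_prime, tau, P), -1, P)) % P
```

For any message with 0 < m < 2039 the function returns m regardless of the random $\tau$ and $\varphi$, which mirrors the cancellation the scheme relies on: the cloud only ever sees the value blinded by both exponents.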
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
$W_{DU}$ stores $\varphi \in Z_p$ and sends the re-randomized conversion key $TK' = (S_{ID}, K_0''', K_1''', \{K_{i,2}''', K_{i,3}'''\}_{i \in [1,\sigma]})$ to the cloud server.

When receiving a decryption request from a data user, the cloud server performs Dec.Out($TK'$, $CT'$) to acquire a partially decrypted ciphertext $TCT$. The cloud server sends $TCT = (C', A = e(g', g')^{\alpha' (s + s')/(\tau\varphi)})$ and $Token$ to $W_{DU}$, and $W_{DU}$ runs the following algorithm.

$W_{DU}.Dec$: The CRF $W_{DU}$ computes $A' = A^{\varphi} = e(g', g')^{\alpha' (s + s')/\tau}$ and sends $TCT' = (C', A')$ and $Token$ to the data user.

After receiving the re-randomized partially decrypted ciphertext, the data user runs Dec.User to recover the plaintext $m$. Then the data user uses the verification credential $Token$ to finish the ciphertext verification: if $H_0(m) = Token$ holds, the ciphertext is correct.

5. Security analysis

5.1. Security proof

Theorem 1. Given that the $q$-BDHE assumption holds, the proposed scheme is secure against selective CPA.

Proof. If a polynomial-time adversary $B$ can break the proposed scheme with a non-negligible advantage, then we can construct a challenger $F$ that solves the $q$-BDHE problem with a non-negligible advantage. The process is as follows:

Init Phase: The adversary $B$ submits access policies $(M_i^*, \rho_i^*)_{i \in I^*}$ and a set of malicious attribute authorities $R = (\hat{A}_i)_{i \in I}$, where $M_i^*$ is an $l \times n$ matrix. Furthermore, the attributes within the access structure must originate from trusted attribute authorities and cannot be maliciously manipulated.

Setup Phase: The challenger $F$ executes GlobalSetup and AASetup to generate the public parameter $Params = \{g, u, v, w, h, G, G_T, H_0()\}$ and the key pairs $(PK_i, ASK_i)_{i \in I}$. The reverse firewall $W_{AA}$ executes $W_{AA}.SetUp$ to re-randomize the public key, then $W_{AA}$ publishes the updated public key $PK'$.

Query Phase 1: During this phase, $B$ can adaptively request secret keys for attribute sets $S_1, S_2, \ldots, S_q$. For every query $S_i$, $F$ executes KeyGen to obtain the corresponding secret key $SK_i$. Then $F$ executes $W_{AA}.KG$ to get the re-randomized secret key $SK_i'$. Subsequently, $F$ executes KeyGen.ran to get the conversion key $TK_i$. Then $F$ runs $W_{DU}.TKUpdate$ to get the re-randomized conversion key $TK_i'$. $F$ returns $(SK_i', TK_i')$ to $B$.

Challenge Phase: $B$ provides two messages, $m_0$ and $m_1$, of equal length. $F$ randomly selects $b \in \{0, 1\}$ and runs Enc.Offline* and Enc.Online* to get the challenge ciphertext $CT_b = ((M, \rho), C, C_0, \{C_{j,1}, C_{j,2}, C_{j,3}\}_{j \in [1,l]})$. Then $F$ executes $W_{DO}.Enc.Offline$ and $W_{DO}.Enc.Online$ to obtain a re-randomized ciphertext $CT_b'$. $F$ sends $CT_b'$ to $B$.

Query Phase 2: The challenger $F$ proceeds as in Query Phase 1.

Guess Phase: $B$ outputs a bit $b' \in \{0, 1\}$. If $b' = b$, then $F$ outputs 0 (meaning that $B$ obtained the normally generated ciphertext). If $b' \ne b$, then $F$ outputs 1 (meaning that $B$ obtained the randomly selected element). Hence, if the adversary $B$ has advantage $\epsilon$ in the security game, then $F$ solves the $q$-BDHE problem with the same advantage.

5.2. Security analysis

The features of the proposed scheme include:

1. Function Maintaining

If the collection of attributes associated with the secret key constitutes an authorized set, then the equation $\sum_{i \in I} \omega_i \cdot (\lambda_i + \lambda_i') = s + s'$ holds. Thus,

$$
\begin{aligned}
A &= \frac{e(C_0', K_0''')}{\prod_{i \in I} \left( e(C_{i,1}', K_1''') \cdot e(C_{i,2}', K_{j,2}''') \cdot e(C_{i,3}', K_{j,3}''') \right)^{\omega_i}} \\
&= \frac{e(g', g')^{\alpha' (s + s')/(\tau\varphi)} \, e(g', w')^{(r + r')(s + s')/(\tau\varphi)}}{\prod_{i \in I} e(g', w')^{(r + r')(\lambda_i + \lambda_i') \omega_i/(\tau\varphi)}} \cdot \frac{1}{\prod_{i \in I} e(g', v')^{(r + r')(d_i + d_i') \omega_i/(\tau\varphi)}} \\
&\quad \cdot \frac{1}{\prod_{i \in I} e(g', u')^{-\rho(i)(d_i + d_i')(r_i + r_i') \omega_i/(\tau\varphi)}} \cdot \frac{1}{\prod_{i \in I} e(g', h')^{-(d_i + d_i')(r_i + r_i') \omega_i/(\tau\varphi)}} \\
&\quad \cdot \frac{1}{\prod_{i \in I} e(g', u')^{A_i (d_i + d_i')(r_i + r_i') \omega_i/(\tau\varphi)}} \cdot \frac{1}{\prod_{i \in I} e(g', h')^{(d_i + d_i')(r_i + r_i') \omega_i/(\tau\varphi)}} \cdot \frac{1}{\prod_{i \in I} e(g', v')^{-(r + r')(d_i + d_i') \omega_i/(\tau\varphi)}}
\end{aligned} \tag{15}
$$

$$= \frac{e(g', g')^{\alpha' (s + s')/(\tau\varphi)} \, e(g', w')^{(r + r')(s + s')/(\tau\varphi)}}{e(g', w')^{(r + r') \sum_{i \in I} (\lambda_i + \lambda_i') \omega_i/(\tau\varphi)}} = e(g', g')^{\alpha' (s + s')/(\tau\varphi)}. \tag{16}$$

$$\frac{C'}{A'^{\tau}} = \frac{C'}{A^{\varphi\tau}} = \frac{m \cdot e(g', g')^{\alpha' (s + s')}}{\left( e(g', g')^{\alpha' (s + s')/\tau} \right)^{\tau}} = m \tag{17}$$

It is evident from the above equations that the message $m$ remains decryptable after the cryptographic reverse firewalls are deployed. Consequently, the functionality of the scheme is preserved by the cryptographic reverse firewalls.

2. Weakly Security-preserving and Weakly Exfiltration-resistant

We consider the following sequence of security games.

Game 0: Same as the security game in Section 3.3.

Game 1: In the init phase, the attribute authorities' $PK$ and $ASK_i$ are generated by the algorithms GlobalSetup and AASetup of the basic scheme, not by GlobalSetup*, AASetup* and $W_{AA}.SetUp$. The subsequent algorithms are carried over unchanged from Game 0.

Game 2: During both phase 1 and phase 2, the secret key $SK$ is derived from the KeyGen algorithm of the basic scheme, rather than being produced by KeyGen* and $W_{AA}.KG$. The $TK$ is produced using the KeyGen.ran algorithm of the basic scheme, and not through KeyGen.ran* and $W_{DU}.TKUpdate$. The subsequent algorithms are the same as in Game 1.

Game 3: During the challenge phase, the ciphertext $CT_b$ is produced by Enc.offline and Enc.online, not by Enc.offline*, Enc.online*, $W_{DO}.Enc.offline$ and $W_{DO}.Enc.online$. Game 3 is in fact the security game of the basic scheme.

We then demonstrate the indistinguishability between Game 0 and Game 1, between Game 1 and Game 2, and finally between Game 2 and Game 3, each in isolation.

Between Game 0 and Game 1, no matter what modifications are introduced by the tampered GlobalSetup* and AASetup* algorithms, after re-randomization by the $W_{AA}$ reverse firewall the public parameter $PK'$ always has the same structure as the $PK$ generated by the standard algorithm. This uniformity is due to the malleability of the key. Consequently, Game 0 and Game 1 are indistinguishable.

Given that the secret key $SK$ and the conversion key $TK$, which are produced for the user by the attribute authority, are also malleable, it follows that Game 1 and Game 2 are indistinguishable. As for Game 2 and Game 3, the $CT$ is re-randomized by the reverse firewall into a new ciphertext $CT'$, which is possible because the ciphertext is malleable. Thus, regardless of how the Enc.offline* and Enc.online* algorithms operate, the final ciphertext has the same structure as the ciphertext of the basic scheme. Consequently, Game 2 and Game 3 are indistinguishable. In summary,
|
||||
|
||||
|
||||
|
||||
Table 1
Function comparison.

Scheme                 With CRFs  Outsource  Offline encryption  Multi-authority  Ciphertext verification  Access structure
Guo et al. [25]        ✕          ✓          ✓                   ✕                ✕                        Tree
Chaudhary et al. [28]  ✕          ✓          ✕                   ✓                ✕                        LSSS
Hong et al. [31]       ✓          ✕          ✕                   ✓                ✕                        LSSS
Zhong et al. [29]      ✕          ✓          ✕                   ✕                ✕                        Tree
Zhao et al. [32]       ✓          ✓          ✓                   ✕                ✕                        Tree
Jin et al. [33]        ✓          ✕          ✕                   ✕                ✕                        LSSS
Elhabob et al. [34]    ✓          ✕          ✕                   ✕                ✓                        Tree
Ours                   ✓          ✓          ✓                   ✓                ✓                        Tree
|
||||
|
||||
|
||||
we deduce that Game 0 and Game 3 are indistinguishable. Given that the basic scheme is secure, it follows that the proposed scheme is also secure.

3. Message Verification

The data user (vehicle/RSU) uses the parameters $Token$, $m$ and the hash function $H_0()$ to check whether the equation $H_0(m) = Token$ holds. With this verification procedure, the data user can detect any tampering with the message and gains assurance regarding the completeness and dependability of the received message. If the message changes, the equation will not hold. Therefore, the proposed scheme supports message verification.

4. Collusion Resistance

Theorem 2. If the discrete logarithm problem is hard, the proposed scheme can defend against collusion attacks initiated by up to $N - 1$ attribute authorities.

According to AASetup, each attribute authority randomly chooses $s_{ik} \in Z_p$ and sends the value $g^{s_{ik}}$ to all the other attribute authorities. Given the hardness of the discrete logarithm problem, it is infeasible for an adversary $B$ to deduce $s_{ik}$ from $g^{s_{ik}}$ alone. Hence, even with $N - 2$ attribute authorities working together with the adversary, guessing a valid $MK_i$ remains infeasible for the adversary. Consequently, the adversary cannot construct a valid secret key $SK$. This renders the proposed scheme resistant to collusion attacks carried out by $N - 1$ attribute authorities.

5.3. Informal security analysis

1. Side channel attack defenses

The proposed scheme utilizes CRF technology, which significantly reduces the computational overhead while enhancing security. By leveraging CRF, it reduces the risk of messages being attacked and complicates potential threats. In addition, multi-authority technology maximizes the security of the entire system, effectively preventing single-point leakage, while balancing power consumption and execution time. These two methods not only improve efficiency, but also provide strong protection against side channel attacks.

In short, the scheme effectively combines efficiency and enhanced security, making it suitable for secure communication in vehicular networks that are susceptible to side channels.

2. Man-in-the-Middle attack defense

The proposed scheme uses CP-ABE technology. This technique uses a ciphertext policy, which embeds the access policy into the ciphertext. This improves the security and flexibility of access control and reduces the risk of man-in-the-middle (MITM) attacks due to identity forgery.

In addition, we enhance the CRF module by integrating key parameter re-randomization within the multi-authority ABE framework. The proposed scheme also supports message integrity verification, easily executable by onboard terminals using simple hash functions. By combining the above technologies, this method not only protects the communication channel, but also improves the security of information.

6. Performance evaluation

6.1. Experimental setup

The following outlines the hardware and software environment used for the experiment:

• The experimental apparatus consists of a desktop computer equipped with a 3.2 GHz AMD Ryzen 5 5600x CPU and 16 GB of RAM, running the Windows 11 Professional (x64) OS.
• The experimental schemes are implemented using Java 8 and the JPBC 2.0.0 library [32]. The prime-order bilinear pairings are constructed upon a 160-bit elliptic curve group based on the equation $y^2 = x^3 + x$.

6.2. Theoretical analysis

Table 1 provides a side-by-side comparison of the functionality of our proposed scheme with that of other schemes. Scheme [25] supports outsourced decryption and offline encryption, but the rest of the functionality is not realized. Scheme [28] introduced multiple authorities to protect against collusion attacks. Scheme [29] only provides outsourced decryption, thus the efficiency of its encryption phase is not good enough. Schemes [31–34] add CRF modules between entities based on the above schemes. However, these schemes either do not have outsourced decryption or do not have multiple attribute authorities, which has some disadvantages. Our scheme provides both of these features, taking into account both efficiency and security. The comparison also shows that the proposed scheme adds cryptographic reverse firewalls between entities. By employing these firewalls, the system is fortified with a layer of defense that maintains its functional integrity against potential subversion attacks and any attempts to tamper with its algorithms.

The introduction of multiple attribute authorities ensures that the system is resistant to collusion attacks. The proposed scheme also provides outsourced decryption as well as offline encryption, which requires low computation for the users to obtain the ciphertext. Additionally, verification credentials empower users to check and ensure the ciphertext's integrity.

The notations used in Tables 2 and 3 are as follows: $E$ signifies an exponential operation, and $P$ denotes a bilinear pairing operation. $M$ signifies the number of rows in a matrix as well as the number of leaf nodes in an access tree. The symbol $l$ denotes the total number of attributes possessed by users, while $k$ signifies the minimum number of attributes from the access structure required to fulfill the decryption criteria.

As shown in Table 2, our scheme is in the middle in the KeyGen phase. However, our scheme attains the lowest computational overhead in the Enc.Online phase ($3E$, on par with [25] and [32]). In the Dec.Out phase, our scheme does not achieve significant advantages. But in the Dec.User phase, our scheme requires only a single exponential operation, reaching a constant level of computational overhead.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 3. Time consumption of basic scheme.
|
||||
|
||||
Table 2
Computation comparison.

Scheme                 KeyGen          Encryption (Offline)  Encryption (Online)  Outsource decryption    User decryption
Guo et al. [25]        (l + 4)E        (3M + 1)E             3E                   2lE + 2lP               E
Chaudhary et al. [28]  (2l + 2)E       ✕                     (3M + 1)E            (4l + 2)E               E
Zhong et al. [29]      (3l + 6)E       ✕                     (2M + 2)E            ✕                       2lE + (l + 1)P
Hong et al. [31]       (4l + 2)E + P   ✕                     (5M + 2)E            ✕                       E + (3k + 1)P
Zhao et al. [32]       (2l + 4)E       3ME + P               3E                   (3l + 1)E + (2l + 1)P   2E
Jin et al. [33]        lE + P          ✕                     6ME + 3P             ✕                       lE + 2P
Elhabob et al. [34]    (2l + 2)E       ✕                     4E                   ✕                       3E
Ours                   (2l + 3)E       (2M + 2)E             3E                   lE + 3lP                E
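To make the entries of Table 2 easier to compare for concrete sizes, the following sketch evaluates the operation counts of the "Ours" and "Zhao et al. [32]" rows for a given number of user attributes `l` and matrix rows `M`. The function names and the per-operation timings passed to `estimate_ms` are hypothetical; the paper does not report per-operation timings, so any values supplied are the caller's own assumptions.

```python
def ours_cost(l: int, M: int) -> dict:
    """Counts for the 'Ours' row of Table 2 as (exponentiations, pairings)."""
    return {
        "KeyGen": (2 * l + 3, 0),
        "Enc.Offline": (2 * M + 2, 0),
        "Enc.Online": (3, 0),
        "Dec.Out": (l, 3 * l),
        "Dec.User": (1, 0),
    }


def zhao_cost(l: int, M: int) -> dict:
    """Counts for the 'Zhao et al. [32]' row of Table 2."""
    return {
        "KeyGen": (2 * l + 4, 0),
        "Enc.Offline": (3 * M, 1),
        "Enc.Online": (3, 0),
        "Dec.Out": (3 * l + 1, 2 * l + 1),
        "Dec.User": (2, 0),
    }


def estimate_ms(cost: tuple, t_exp_ms: float, t_pair_ms: float) -> float:
    """Rough time estimate from a (exponentiations, pairings) pair.

    t_exp_ms and t_pair_ms are caller-supplied assumptions.
    """
    exps, pairings = cost
    return exps * t_exp_ms + pairings * t_pair_ms
```

For example, `ours_cost(10, 8)["Dec.Out"]` evaluates to `(10, 30)`, while the user-side `Dec.User` entry stays at `(1, 0)` for every `l`, which is the constant user cost the table reports.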
|
||||
|
||||
|
||||
Table 3
Time consumption of CRFs.

Scheme               W_AA.SetUp   W_AA.KG     W_DO.Enc.Online
Hong et al. [31]     2lE + 2lP    (5l + 2)E   2lE + P
Zhao et al. [32]     2E           (2l + 3)E   4E
Jin et al. [33]      (l + 2)E     (2l + 2)E   P
Elhabob et al. [34]  2E           (2l + 3)E   4E
Ours                 5E           (2l + 3)E   2E

In terms of the CRFs' time consumption, our scheme achieves constant-level time consumption in the $W_{AA}.SetUp$ phase, as illustrated in Table 3: the time overhead does not fluctuate with the number of attributes in the system. Moreover, our scheme achieves the highest efficiency in the $W_{DO}.Enc.Online$ phase, requiring only two exponential operations.

6.3. Practical analysis

Based on the hardware and software environment described in the Experimental setup section, Fig. 3 presents a performance comparison of the multiple phases of our scheme.

Fig. 3(a) shows that our scheme has a low computational overhead. As shown in Fig. 3(b), when comparing the computational overhead of the Enc.Online phase, our scheme, which benefits from the preprocessing performed in the Enc.Offline phase, has the lowest computational overhead of all the schemes evaluated. In Fig. 3(c), the efficiency of our scheme is in the middle for the Dec.Out phase, while in the Dec.User phase our scheme maintains the lowest computational overhead. It is also worth noting that this overhead does not fluctuate with the number of attributes in the system.

Fig. 4 presents a performance comparison for the re-randomization of secret keys by the CRF $W_{AA}$. Our scheme's computational overhead is similar to that of scheme [32], which is at the lower level. Moreover, as shown in Fig. 5, our scheme is the most efficient in the $W_{DO}.Enc.Online$ phase, and its overhead does not grow linearly with the number of vehicle attributes, which is a distinct advantage over scheme [31]. Compared with [33, 34], the proposed scheme still has an advantage in the computational overhead of the $W_{AA}.SetUp$ phase.

In summary, our scheme reduces resource consumption on the user side and improves the efficiency of data flow in vehicles with limited computing power.
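The constant cost reported for $W_{AA}.SetUp$ can be illustrated with a small sketch. It re-randomizes the five base elements $g, u, h, w, v$ with one modular exponentiation each, which is consistent with the $5E$ entry in Table 3 and independent of the number of attributes; whether the authors count the update of $e(g', g')^{\alpha'}$ separately is not stated, so that step is left out. The toy group (squares modulo 2039), the sample parameter values and the function name `rerandomize_setup` are assumptions for illustration and not the authors' JPBC implementation.

```python
import secrets

P = 2039   # toy safe prime, far too small for real use
p = 1019   # order of the subgroup of squares modulo P


def rerandomize_setup(params: dict):
    """Raise each of g, u, h, w, v to its own fresh random exponent.

    Returns the re-randomized parameters and the number of modular
    exponentiations performed.
    """
    new_params = {}
    exponentiations = 0
    for name in ("g", "u", "h", "w", "v"):
        r = secrets.randbelow(p - 1) + 1          # random exponent in [1, p-1]
        new_params[name] = pow(params[name], r, P)
        exponentiations += 1
    return new_params, exponentiations


# Sample base elements: small squares, so each lies in the order-p subgroup.
toy_params = {"g": 4, "u": 9, "h": 16, "w": 25, "v": 36}
```

Calling `rerandomize_setup(toy_params)` always performs five exponentiations, and every output element still has order `p`, so the re-randomized parameters keep the same algebraic structure as the originals.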
|
||||
|
||||
|
||||
|
||||
|
||||
Acknowledgments
|
||||
|
||||
This work was supported in part by Key project of Gansu Science
|
||||
and Technology Plan (23YFGA0081), Gansu Province College Industry
|
||||
Support Plan (2023CYZC-09), National Natural Science Foundation of
|
||||
China (No. 62362059).
|
||||
|
||||
Data availability
|
||||
|
||||
The authors do not have permission to share data.
|
||||
|
||||
|
||||
References
|
||||
Fig. 4. Time consumption of 𝐴𝐴 .𝑆 𝑒𝑡𝑈 𝑝.
|
||||
[1] Siyi Liao, Jun Wu, Jianhua Li, Ali Kashif Bashir, Shahid Mumtaz, Alireza Jolfaei, Nida Kvedaraite, Cognitive popularity based AI service sharing for software-defined information-centric networks, IEEE Trans. Netw. Sci. Eng. 7 (4) (2020) 2126–2136.
[2] Rich Miller, Rolling zettabytes: Quantifying the data impact of connected cars, Data Cent. Front. (2020).
[3] Kayhan Zrar Ghafoor, Linghe Kong, Sherali Zeadally, Ali Safaa Sadiq, Gregory Epiphaniou, Mohammad Hammoudeh, Ali Kashif Bashir, Shahid Mumtaz, Millimeter-wave communication for internet of vehicles: status, challenges, and perspectives, IEEE Internet Things J. 7 (9) (2020) 8525–8546.
[4] Soheila Ghane, Alireza Jolfaei, Lars Kulik, Kotagiri Ramamohanarao, Deepak Puthal, Preserving privacy in the internet of connected vehicles, IEEE Trans. Intell. Transp. Syst. 22 (8) (2020) 5018–5027.
[5] Liang Zhao, Hongmei Chai, Yuan Han, Keping Yu, Shahid Mumtaz, A collaborative V2X data correction method for road safety, IEEE Trans. Reliab. 71 (2) (2022) 951–962.
[6] Weisong Shi, Jie Cao, Quan Zhang, Youhuizi Li, Lanyu Xu, Edge computing: Vision and challenges, IEEE Internet Things J. 3 (5) (2016) 637–646.
[7] Zhenyu Zhou, Haijun Liao, Bo Gu, Shahid Mumtaz, Jonathan Rodriguez, Resource sharing and task offloading in IoT fog computing: A contract-learning approach, IEEE Trans. Emerg. Top. Comput. Intell. 4 (3) (2019) 227–240.
[8] Xingwang Li, Zhen Xie, Zheng Chu, Varun G Menon, Shahid Mumtaz, Jianhua Zhang, Exploiting benefits of IRS in wireless powered NOMA networks, IEEE Trans. Green Commun. Netw. 6 (1) (2022) 175–186.
[9] Vipul Goyal, Omkant Pandey, Amit Sahai, Brent Waters, Attribute-based encryption for fine-grained access control of encrypted data, in: Proceedings of the 13th ACM Conference on Computer and Communications Security, 2006, pp. 89–98.
[10] Amit Sahai, Brent Waters, Fuzzy identity-based encryption, in: Advances in Cryptology–EUROCRYPT 2005: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Aarhus, Denmark, May 22-26, 2005. Proceedings 24, Springer, 2005, pp. 457–473.
[11] John Bethencourt, Amit Sahai, Brent Waters, Ciphertext-policy attribute-based encryption, in: 2007 IEEE Symposium on Security and Privacy, SP'07, IEEE, 2007, pp. 321–334.
[12] Matthew Green, Susan Hohenberger, Brent Waters, Outsourcing the decryption of ABE ciphertexts, in: 20th USENIX Security Symposium, USENIX Security 11, 2011.
[13] Junzuo Lai, Robert H. Deng, Chaowen Guan, Jian Weng, Attribute-based encryption with verifiable outsourced decryption, IEEE Trans. Inf. Forensics Secur. 8 (8) (2013) 1343–1354.
[14] Suqing Lin, Rui Zhang, Hui Ma, Mingsheng Wang, Revisiting attribute-based encryption with verifiable outsourced decryption, IEEE Trans. Inf. Forensics Secur. 10 (10) (2015) 2119–2130.
[15] Cong Zuo, Jun Shao, Guiyi Wei, Mande Xie, Min Ji, CCA-secure ABE with outsourced decryption for fog computing, Future Gener. Comput. Syst. 78 (2018) 730–738.
[16] James Ball, Julian Borger, Glenn Greenwald, et al., Revealed: how US and UK spy agencies defeat internet privacy and security, Know Your Neighb. (2013).
[17] Stephen Checkoway, Ruben Niederhagen, Adam Everspaugh, Matthew Green, Tanja Lange, Thomas Ristenpart, Daniel J Bernstein, Jake Maskiewicz, Hovav Shacham, Matthew Fredrikson, On the practical exploitability of Dual EC in TLS implementations, in: 23rd USENIX Security Symposium, USENIX Security 14, 2014, pp. 319–335.
[18] Yevgeniy Dodis, Chaya Ganesh, Alexander Golovnev, Ari Juels, Thomas Ristenpart, A formal treatment of backdoored pseudorandom generators, in: Advances in Cryptology–EUROCRYPT 2015: 34th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Sofia, Bulgaria, April 26-30, 2015, Proceedings, Part I 34, Springer, 2015, pp. 101–126.
[19] Ilya Mironov, Noah Stephens-Davidowitz, Cryptographic reverse firewalls, in: Advances in Cryptology–EUROCRYPT 2015: 34th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Sofia, Bulgaria, April 26-30, 2015, Proceedings, Part II 34, Springer, 2015, pp. 657–686.

Fig. 5. Time consumption of DO.Enc.Online.

7. Conclusion

In the IoV environment, securing the encryption and sharing of the vast amounts of data generated by vehicles, while preventing data leakage due to device tampering, presents significant challenges. To address these challenges, we propose an advanced attribute-based encryption scheme, enhanced with a cryptographic reverse firewall, specifically designed for the IoV ecosystem. This scheme is supported by multiple attribute authorities, which not only defend against collusion attacks but also enable offline encryption and outsourced decryption. These integrated features greatly improve the computational efficiency of vehicular onboard units. Additionally, we deploy RSUs with CRFs between the entities, ensuring that data remains secure even in the event of device tampering. The proposed attribute-based encryption scheme, combined with the reverse firewall mechanism, shows great promise in securing data transmission and storage within the IoV, while protecting against unauthorized access and data leakage.

CRediT authorship contribution statement

Xiaodong Yang: Writing – review & editing, Writing – original draft. Xilai Luo: Writing – review & editing, Writing – original draft. Zefan Liao: Writing – review & editing, Writing – original draft. Wenjia Wang: Writing – review & editing, Writing – original draft. Xiaoni Du: Writing – review & editing, Writing – original draft. Shudong Li: Writing – review & editing, Writing – original draft.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
|
||||
|
||||
|
||||
|
||||
X. Yang et al. Journal of Systems Architecture 160 (2025) 103331
|
||||
|
||||
|
||||
[20] Brent Waters, Ciphertext-policy attribute-based encryption: An expressive, efficient, and provably secure realization, in: International Workshop on Public Key Cryptography, Springer, 2011, pp. 53–70.
[21] Shucheng Yu, Cong Wang, Kui Ren, Wenjing Lou, Achieving secure, scalable, and fine-grained data access control in cloud computing, in: 2010 Proceedings IEEE INFOCOM, IEEE, 2010, pp. 1–9.
[22] Kan Yang, Xiaohua Jia, Kui Ren, Ruitao Xie, Liusheng Huang, Enabling efficient access control with dynamic policy updating for big data in the cloud, in: IEEE INFOCOM 2014-IEEE Conference on Computer Communications, IEEE, 2014, pp. 2013–2021.
[23] Jun Feng, Hu Xiong, Jinhao Chen, Yang Xiang, Kuo-Hui Yeh, Scalable and revocable attribute-based data sharing with short revocation list for IIoT, IEEE Internet Things J. 10 (6) (2022) 4815–4829.
[24] Qian Mei, Hu Xiong, Yeh-Cheng Chen, Chien-Ming Chen, Blockchain-enabled privacy-preserving authentication mechanism for transportation CPS with cloud-edge computing, IEEE Trans. Eng. Manage. (2022).
[25] Rui Guo, Geng Yang, Huixian Shi, Yinghui Zhang, Dong Zheng, O3-R-CP-ABE: An efficient and revocable attribute-based encryption scheme in the cloud-assisted IoMT system, IEEE Internet Things J. 8 (11) (2021) 8949–8963.
[26] Melissa Chase, Multi-authority attribute based encryption, in: Theory of Cryptography: 4th Theory of Cryptography Conference, TCC 2007, Amsterdam, the Netherlands, February 21-24, 2007. Proceedings 4, Springer, 2007, pp. 515–534.
[27] Allison Lewko, Brent Waters, Decentralizing attribute-based encryption, in: Annual International Conference on the Theory and Applications of Cryptographic Techniques, Springer, 2011, pp. 568–588.
[28] Chandan Kumar Chaudhary, Richa Sarma, Ferdous Ahmed Barbhuiya, RMA-CPABE: A multi-authority CPABE scheme with reduced ciphertext size for IoT devices, Future Gener. Comput. Syst. 138 (2023) 226–242.
[29] Hong Zhong, Yiyuan Zhou, Qingyang Zhang, Yan Xu, Jie Cui, An efficient and outsourcing-supported attribute-based access control scheme for edge-enabled smart healthcare, Future Gener. Comput. Syst. 115 (2021) 486–496.
[30] Hui Ma, Rui Zhang, Guomin Yang, Zishuai Song, Shuzhou Sun, Yuting Xiao, Concessive online/offline attribute based encryption with cryptographic reverse firewalls - Secure and efficient fine-grained access control on corrupted machines, in: Computer Security: 23rd European Symposium on Research in Computer Security, ESORICS 2018, Barcelona, Spain, September 3-7, 2018, Proceedings, Part II 23, Springer, 2018, pp. 507–526.
[31] Bo Hong, Jie Chen, Kai Zhang, Haifeng Qian, Multi-authority non-monotonic KP-ABE with cryptographic reverse firewall, IEEE Access 7 (2019) 159002–159012.
[32] Yang Zhao, Yuwei Pang, Xingyu Ke, Bintao Wang, Guobin Zhu, Mingsheng Cao, A metaverse-oriented CP-ABE scheme with cryptographic reverse firewall, Future Gener. Comput. Syst. 147 (2023) 195–206.
[33] Jin C., Chen Z., Qin W., et al., Blockchain-based proxy re-encryption scheme with cryptographic reverse firewall for IoV, Int. J. Netw. Manage. (2024) e2305.
[34] Elhabob R., Eltayieb N., Xiong H., et al., Equality test public key encryption with cryptographic reverse firewalls for cloud-based E-commerce, IEEE Trans. Consum. Electron. (2024).

Xiaodong Yang (Member, IEEE) received the M.S. degree in cryptography from Tongji University, Shanghai, China, in 2005, and the Ph.D. degree in cryptography from Northwest Normal University, Lanzhou, China, in 2010. In his role as a Postdoctoral Researcher at China's State Key Laboratory of Cryptology in Beijing during 2016, he played a significant part in advancing the field. Today, he holds the position of Professor at the College of Computer Science and Engineering, Northwest Normal University. The core of his research is anchored in public-key cryptography, information security protocols, and the application of wireless sensor networks.

Xilai Luo is presently a master's degree candidate at the College of Computer Science and Engineering, Northwest Normal University, located in China. His academic pursuits are focused on the areas of artificial intelligence, information security, and cryptography.

Zefan Liao is actively working towards his master's degree in the College of Computer Science and Engineering at Northwest Normal University, China. His areas of research interest include the fields of edge computing, information security, and cryptography.

Wenjia Wang is pursuing her master's degree within the College of Computer Science and Engineering at Northwest Normal University, China. Her research interests are centered on the topics of data security and network security.

Xiaoni Du received the Ph.D. degree in cryptography from Xidian University, Xi'an, China, in 2008. She worked as a Visiting Scholar with the University of Kentucky, Lexington, KY, USA, and Hong Kong University of Science and Technology, Hong Kong, in 2011 and 2014, respectively. She is currently a Professor with the College of Mathematics and Statistics, Northwest Normal University, Lanzhou, China. Her main research interests include information security, cryptography, and coding.

Shudong Li received the M.S. degree in applied mathematics from Tongji University, Shanghai, China, in 2005, and the Ph.D. degree in Posts and Telecommunications from Beijing University, Beijing, China, in 2012. From 2013 to 2018, he held the position of a post-doctoral researcher at the National University of Defense Technology in Changsha, China. He now serves as a Professor at the Cyberspace Institute of Advanced Technology at Guangzhou University. His primary research interests are in the realms of Big Data and its security, malware identification, and cloud computing.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,965 @@
|
||||
Journal of Systems Architecture 160 (2025) 103345
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
A hash-based post-quantum ring signature scheme for the Internet of Vehicles
|
||||
Shuanggen Liu a,∗, Xiayi Zhou a, Xu An Wang b, Zixuan Yan a, He Yan a, Yurui Cao a
a School of Cyberspace Security, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi, China
b Key Laboratory of Network and Information Security, Engineering University of People's Armed Police, Shaanxi, China
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO

Keywords: Ring signature, Internet of Vehicles, Merkle tree, Post-quantum digital signature, Hash-based signature scheme

ABSTRACT

With the rapid development of the Internet of Vehicles, securing data transmission has become crucial, especially given the threat posed by quantum computing to traditional digital signatures. This paper presents a hash-based post-quantum ring signature scheme built upon the XMSS hash-based signature framework, leveraging Merkle trees for efficient data organization and verification. In addition, the scheme is applied to the Internet of Vehicles, ensuring both anonymity and traceability while providing robust quantum-resistant security. Evaluation results indicate that, compared to other schemes, the proposed method achieves superior verification speed while ensuring data security and privacy.
|
||||
|
||||
|
||||
|
||||
∗ Corresponding author.
E-mail address: liushuanggen201@xupt.edu.cn (S. Liu).
https://doi.org/10.1016/j.sysarc.2025.103345
Received 11 November 2024; Received in revised form 23 December 2024; Accepted 16 January 2025
Available online 23 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

1. Introduction

Vehicles are a fundamental necessity in modern life, and the number produced worldwide continues to grow. According to relevant statistics, global vehicle production reached 94 million units in 2023 [1]. Additionally, data from the International Organization of Motor Vehicle Manufacturers indicates that there are now 1.3 billion vehicles in use [2]. However, this growth brings various challenges, including network attacks, unauthorized access, and concerns around road safety and privacy. To address these issues, new research fields, such as intelligent transportation systems (ITS) and the Internet of Vehicles (IoV), have emerged. These fields aim to provide safer, more efficient, and more harmonious vehicular environments. Vehicle-to-Everything (V2X) technology enables the effective use of dynamic information from all networked vehicles via on-board devices, facilitating secure, efficient, intelligent, and comfortable services, thereby contributing to the intelligence of social traffic systems [3]. The typical VANET structure is shown in Fig. 1.

With the increasing number of vehicles and the development of the IoV, ensuring the security of IoV systems is essential. Currently, the security of vehicular networks, whether internal or external, primarily relies on digital signatures or public-key encryption. However, as quantum computing advances, traditional digital signature algorithms are increasingly vulnerable to quantum attacks, making it essential to incorporate post-quantum digital signature algorithms into IoV research. Unlike traditional computers, quantum computers can accelerate the cracking of probabilistic algorithms through parallel computation capabilities [4]. In light of these challenges, post-quantum cryptography has become a critical area of study, with the aim of establishing a resilient foundation for the industry. The National Institute of Standards and Technology (NIST) has been conducting a multi-stage standardization process for post-quantum cryptography. The third round of candidate evaluations has been completed, and algorithms such as SPHINCS+, CRYSTALS-DILITHIUM, and CRYSTALS-KYBER have been standardized. These algorithms achieve varying levels of bit-level security depending on key size and parameter settings, which align with NIST security levels from 1 to 5, representing 128/160/192/224/256-bit security strengths, respectively [5]. A post-quantum digital signature scheme is a digital signature scheme capable of resisting quantum attacks. Among post-quantum digital signature schemes, hash-based schemes are particularly effective and provably secure. Hash-based post-quantum digital signature schemes offer significant advantages over other types of post-quantum schemes due to their high computational efficiency, scalability, maturity, and reliance solely on the preimage resistance of the underlying hash function [6].

In IoV networks, where both privacy and traffic safety are essential, ring signatures are especially suitable. Ring signature schemes offer anonymity by concealing the identity of the signer among a group of participants. Using hash-based post-quantum ring signatures, vehicles can sign messages anonymously within a group, ensuring their identities cannot be traced. These signatures also provide unforgeability, collision resistance, resilience against quantum attacks, and low communication overhead. In densely populated cities, managing keys for secure vehicular communications can be challenging, especially given the limited IoV coverage [7]. The Merkle tree structure effectively compresses keys, reducing key management costs [8]. In this study, we propose a
|
||||
|
||||
|
||||
|
||||
hash-based post-quantum ring signature scheme for IoV applications. The ring signature algorithm of our scheme is based on the XMSS algorithm, aiming to enhance data sharing security and efficiency. Merkle trees are used to organize and verify data efficiently, while ring signatures ensure the authenticity and integrity of data within the IoV network without compromising user anonymity.

Fig. 1. VANET structure.

1.1. Related works

In recent years, hash-based post-quantum digital signature schemes have garnered significant attention within the cryptography community. Following the fourth round of the NIST post-quantum digital signature standardization process, the SPHINCS+ algorithm was introduced as a supplementary standard, featuring a flexible, tunable hash function structure [9]. As the standardization process progresses, researchers have proposed various adaptations, including SPHINCS-α and SPHINCS+-c, which further compress signature sizes and enhance execution speeds [10,11]. Additionally, Sun, Liu, and colleagues developed a domestic signature algorithm based on the post-quantum hash function SM3 [12]. Hülsing and Kudinov provided a rigorous security proof for the SPHINCS+ algorithm, confirming its robustness in a post-quantum environment [13]. The XMSS algorithm forms the foundation of SPHINCS+, with its architectural design and security proof presented by Hülsing, Butin, and others [14]. Research on hardware implementations of the XMSS algorithm has also advanced, with significant contributions from Thoma and Güneysu [15]. Meanwhile, Sun and Liu investigated the feasibility of replacing the hash function in XMSS with the domestic SM3 hash function [16]. An essential component of XMSS is WOTS+, a one-time signature algorithm; Hülsing provided its security proof [17], while Zhang, Cui, and colleagues evaluated the efficiency of WOTS+ in tree-based one-time signature algorithms [18]. Currently, research on post-quantum digital signatures primarily concentrates on enhancing signature efficiency and replacing the underlying hash functions. However, there is a scarcity of studies that integrate post-quantum digital signatures with specific application scenarios or explore their variants.

The exploration of post-quantum ring signatures is also accelerating in post-quantum digital signature research. Xie, Wang, and colleagues highlighted that traditional signature algorithms are highly susceptible to quantum computing attacks, and noted that ring signatures offer considerable advantages in blockchain applications, including medical data sharing and vehicular networking, due to their unique properties [19]. Chatterjee, Chung, et al. conducted an in-depth analysis of the security of post-quantum ring signatures, re-examined the security of classical signatures and ring signatures in the quantum setting, and proposed two short signature schemes, one in the quantum random oracle model and one in the standard model [20]. Recent literature has introduced novel architectures, such as linkable ring signatures, threshold ring signatures, and identity-based post-quantum ring signatures, discussing their post-quantum security features [21–23]. Similarly, Ref. [24] systematically reviews the theory and application of linkable ring signatures, providing an in-depth comparison of anonymization and linkability schemes. However, these studies lack analysis of specific application scenarios (such as the IoV) and do not fully consider resource-constrained environments or the potential of quantum-resistant computing.

In response to the research of NIST on post-quantum algorithms and verifiable ring signatures, a blockchain-based, post-quantum anonymous, traceable, and verifiable authentication scheme was proposed to mitigate quantum attacks while addressing security and privacy concerns, with an evaluation of its feasibility in IoV environments [25]. The IoV faces significant security and privacy challenges, and blockchain technology offers an effective platform to ensure both user privacy and security [26–28]. Ref. [29] proposes an identity authentication and signature scheme for UAV-assisted Vehicular Ad Hoc Networks (VANET), focusing on enhancing network anonymity and user privacy through an efficient authentication mechanism. Ref. [30] introduces a distributed message authentication scheme combined with a reputation mechanism to improve the security and trust of the IoV. The scheme uses node credit values to authenticate message validity, effectively preventing malicious attacks and forgery. Ref. [31] presents an authenticated key agreement protocol for intelligent transportation systems in vehicle networks, strengthening identity authentication and key exchange mechanisms to prevent security threats such as eavesdropping, tampering, and man-in-the-middle attacks. While these studies address key security challenges in vehicular networks, they often focus on specific aspects, lacking comprehensive and scalable frameworks for real-world scenarios. Furthermore, the integration of post-quantum cryptography and scalability in dynamic, large-scale networks remains underexplored, highlighting opportunities for future research into robust and future-proof solutions. Given the inherent advantages of ring signatures, they are particularly well-suited for applications such as the Internet of Vehicles, making further investigation essential.

To ensure the post-quantum security of data transmission in the IoV environment, researchers have proposed various solutions. Ref. [32] recommends the use of lattice-based post-quantum digital signatures, but the signature algorithm has not been combined with specific scenarios. Another study [33] proposed a ring signature scheme based on hard lattice problems and combined it with the vehicular network environment, but the quantum resistance of the scheme was not explained in detail. In addition, reducing energy consumption in blockchain has also become a research focus [34]. An energy-saving method is adopted to calculate the root of the Merkle tree, and a Merkle tree design scheme conforming to the specification is proposed. The effectiveness of this method is verified through experiments. At the same time, the Merkle tree accumulator algorithm proposed by Derler and Ramacher in [35] builds an accumulator that can resist quantum attacks by using only hash functions and symmetric-key primitives, and gives specific operations and definitions. However, the specific algorithm implementation and its combination with practical application scenarios need to be further studied.

1.2. Contributions

Firstly, building on the Merkle tree accumulator algorithm described in Ref. [35], we propose a hash-based ring signature algorithm specifically designed for the IoV, extending the Merkle tree accumulator algorithm to an XMSS accumulator algorithm. This algorithm integrates the principles of ring signatures with Merkle tree structures. Unlike
|
||||
|
||||
|
||||
|
||||
traditional ring signature algorithms, this proposed scheme can resist quantum attacks, thus offering post-quantum security.

Secondly, we construct a new hash-based post-quantum ring signature scheme for vehicular network applications. This scheme enhances the security of data transmission within the vehicular network, providing robust post-quantum security to effectively protect shared data.

1.3. Structure

The remainder of this paper is organized as follows: Section 2 provides the necessary foundational knowledge, along with a review of the background and related work relevant to this study. In Section 3, we present a post-quantum ring signature algorithm based on Merkle trees and discuss its application within the IoV environment. Section 4 offers a security analysis and proof of the robustness of the proposed scheme. In Section 5, we evaluate the performance of the scheme and compare it with existing alternatives. Finally, Section 6 concludes the paper and outlines directions for future research.

2. Preliminaries

2.1. Ring signature

Ring signature is a digital signature scheme introduced by Rivest, Shamir, and Tauman in 2001. A ring is composed of a group of members, allowing any member within the group to sign on behalf of the entire group without revealing the identity of the signing member [36]. The main parameters of the ring signature scheme are given in Table 1.

Table 1
Notation for ring signature scheme.
Symbol        Description
$\lambda$     Security parameter
$N$           The size of the ring
$(pk, sk)$    Key pair
$R$           A ring consisting of $(pk_1, pk_2, \ldots, pk_l)$
$m$           The message digest
$\sigma$      The signature of the message

Definition 1 (Ring Signature). A ring signature scheme consists of three core algorithms: key generation, signature generation, and signature verification. These algorithms are defined as follows:

Step 1: Key generation
$(pk, sk) \leftarrow Gen(\lambda, N)$: Given the security parameter $\lambda$ and the maximum number of ring members $N$ as input, output a public/private key pair.

Step 2: Signature generation
$\sigma \leftarrow Sig(sk, R, m)$: Given the private key $sk$, the set of all public keys $R = (pk_1, pk_2, \ldots, pk_L)$, and a message $m \in M_\lambda$, output the signature $\sigma$.

Step 3: Signature verification
$True/False \leftarrow Ver(R, m, \sigma)$: Given the set of all public keys $R$, a message $m \in M_\lambda$, and a signature $\sigma$, output $True$ or $False$.

A ring signature must satisfy two critical security properties: anonymity and unforgeability. Anonymity ensures that while the signature indicates it was generated by a member of the ring, it does not reveal the specific identity of the signer. Unforgeability guarantees that only members of the ring can generate valid signatures; outsiders cannot create valid signatures for the ring.

Definition 2 (Unforgeability). Unforgeability ensures that only members of the ring can generate a valid signature. In the unforgeability model, we assume that the attacker has access to a public key and aims to produce a valid ring signature without authorization.

Let $\lambda$ be the security parameter, $RS = (Gen, Sig, Ver)$ a ring signature scheme, and $A$ any probabilistic polynomial-time (PPT) adversary. For any integer $s$, define the following experiment:

Step 1: The challenger generates $s$ key pairs $(pk_i, sk_i)$, $i \in [1, s]$, and sends the set of all public keys $PK = (pk_1, pk_2, \ldots, pk_s)$ to $A$.

Step 2: To answer a signing query, the challenger chooses one $pk_i$ and checks whether $pk_i$ belongs to $R$; if so, it computes $\sigma \leftarrow Sig(sk_i, R, m)$ and sends $\sigma$ to $A$.

Step 3: The attacker outputs a tuple $(R^*, m^*, \sigma^*)$, and the challenger checks it. If $R^* \subseteq PK$, $A$ never made a signing query on $(R^*, m^*)$, and $Ver(R^*, m^*, \sigma^*)$ accepts, the experiment returns 1; otherwise it returns 0.

\[ Adv^{UNF}_{\lambda,s}(A) = \Pr[Exp^{UNF}_{\lambda,s}(A) = 1] \le negl(\lambda) \]

Definition 3 (Anonymity). Anonymity in a ring signature scheme ensures that the identity of the signer remains concealed among a group of potential signers, making it impossible to determine who specifically generated the signature. This anonymity is achieved through a ring signature generation process that relies on the public keys of all group members, without revealing the identity of the actual signer.

In the anonymity experiment, the adversary selects a ring containing two public keys of its choice and is given a ring signature generated with one of the two corresponding private keys. The goal of the adversary is to distinguish which private key was used to generate the ring signature; the scheme is anonymous if the adversary can do so with probability only negligibly better than guessing.

Let $\lambda$ be the security parameter, $RS = (Gen, Sig, Ver)$ a ring signature scheme, and $A$ a polynomial-time algorithm. For any integer $s$ and any bit $b$, define the experiment as follows:

Step 1: The challenger generates $s$ key pairs $(pk_i, sk_i)$, $i \in [1, s]$, and sends all the public keys $pk_i$ to $A$.

Step 2: $A$ sends $(R, m, i_0, i_1)$ to the challenger. The challenger checks that $pk_{i_0} \in R$ and $pk_{i_1} \in R$, then computes $\sigma \leftarrow Sig(sk_{i_b}, R, m)$ and sends $\sigma$ to $A$.

Step 3: $A$ returns a guess bit $b^*$. The experiment outputs 1 if $b^* = b$ and 0 otherwise. $RS$ is considered anonymous if, for all $s$ and all polynomial-time algorithms $A$, the probability of $A$ returning 1 in the $(s, 0)$-anonymity experiment is negligibly close (in $\lambda$) to the probability of $A$ returning 1 in the $(s, 1)$-anonymity experiment.

\[ Adv^{ANON}_{\lambda,s}(A) = \left| \Pr[Exp^{ANON}_{\lambda,s}(A) = 1] - \frac{1}{2} \right| \le negl(\lambda) \]

2.2. WOTS+

Ralph Merkle pioneered hash-based signature algorithms, as noted in Ref. [37]. Currently, hash-based signature schemes are categorized into three main types: one-time signature schemes (OTS), few-time signature schemes (FTS), and many-time signature schemes (MTS). Table 2 summarizes some of the most widely used hash-based signature schemes. Research on OTS schemes began with the Lamport-Diffie algorithm. This paper adopts the WOTS+ (Winternitz One-Time Signature Plus) scheme, which comprises three main components: key generation (GEN), signature generation (SIG), and signature verification (VER).

The first step is parameter selection, where the parameter $\omega$, an integer $\omega \in N$ with $\omega \ge 2$, is chosen to set the number of hash iterations required to construct the public key. Additionally, the hash output length $m$ and the security parameter $n$, where $n \in N$, need to be defined. Next, the parameters $l_1$ and $l_2$ are computed and then summed to obtain $l$. The calculation method is as follows:

\[ l_1 = \left\lceil \frac{m}{\log_2 \omega} \right\rceil, \quad l_2 = \left\lfloor \frac{\log_2(l_1(\omega - 1)) + \log_2 \omega}{\log_2 \omega} \right\rfloor, \quad l = l_1 + l_2 \]
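As a quick numeric check of the length formula for $l_1$, $l_2$, and $l$ given above, the following minimal sketch evaluates it in Python. The function name `wots_lengths` and the example parameter choices ($m = 256$ with $\omega = 16$ or $\omega = 4$) are illustrative assumptions, not values taken from the paper.

```python
import math


def wots_lengths(m: int, w: int) -> tuple[int, int, int]:
    """Return (l1, l2, l) for digest length m (bits) and Winternitz parameter w.

    Hypothetical helper: it evaluates the formula from the text directly,
    using floating-point logarithms, which is adequate for the small
    power-of-two values of w used here.
    """
    if w < 2:
        raise ValueError("the Winternitz parameter must satisfy w >= 2")
    log_w = math.log2(w)
    l1 = math.ceil(m / log_w)
    l2 = math.floor((math.log2(l1 * (w - 1)) + log_w) / log_w)
    return l1, l2, l1 + l2


if __name__ == "__main__":
    for w in (4, 16):
        print(w, wots_lengths(256, w))
```

Larger $\omega$ yields fewer blocks (a shorter signature) at the cost of longer hash chains per block, which is the usual Winternitz trade-off.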
|
||||
|
||||
|
||||
|
||||
|
||||
Table 2
Classification table for hash-based signature schemes.
Scheme type    Scheme name
OTS            Lamport-Diffie, WOTS, WOTS+
FTS            HORS, HORST-T, PORS, PORS-T
MTS            XMSS, SPHINCS, SPHINCS+
|
||||
|
||||
|
||||
Table 3
Parameter descriptions for the WOTS+ algorithm.

Parameter              Description
$n \in \mathbb{N}$     Security parameter
$w \in \mathbb{N}$     Winternitz parameter ($w \ge 2$)
$m \in \mathbb{N}$     Bit length of the message digest
$F_n$                  A set of functions, $F_n = \{ f_k : \{0,1\}^n \to \{0,1\}^n \mid k \in \{0,1\}^n \}$
$h \in \mathbb{N}$     Height of the tree
$H$                    Hash function, $H : \{0,1\}^* \to \{0,1\}^m$
$x \in \{0,1\}^n$      Randomly chosen string $x$, used to construct a one-time verification key
|
||||
|
||||
|
||||
Fig. 2. Key generation process for WOTS+.
|
||||
|
||||
|
||||
|
||||
Fig. 3. Message digest generation graph.

Table 3 gives the meaning of the parameters in the formula. Next, define the chaining operation. WOTS+ uses the function family $F_n$:

$$F_n : \{0,1\}^n \to \{0,1\}^n$$

Define the function operation:

$$c^i(x, r) = \begin{cases} F\left(c^{i-1}(x, r) \oplus r_i\right), & i > 0 \\ x, & i = 0 \end{cases}$$

where

$$x \in \{0,1\}^n, \quad F = F_n : \{0,1\}^n \to \{0,1\}^n, \quad r = (r_1, r_2, \ldots, r_{2^\omega - 1}) \in \{0,1\}^{n \times (2^\omega - 1)}$$
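The chaining function $c^i(x, r)$ can be sketched as a simple loop. This is an illustration only: SHA-256 stands in for the function $F$ (an assumption, since the paper does not fix a concrete function here), and the names `F`, `xor`, and `chain` are hypothetical.

```python
import hashlib


def F(x):
    """Stand-in for F: {0,1}^n -> {0,1}^n with n = 256 bits (assumption: SHA-256)."""
    return hashlib.sha256(x).digest()


def xor(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(p ^ q for p, q in zip(a, b))


def chain(x, r, i):
    """Compute c^i(x, r): c^0 = x and c^i = F(c^(i-1) XOR r_i).

    r is the list of masks (r_1, ..., r_{2^w - 1}); r_i is stored at r[i - 1].
    """
    out = x
    for step in range(1, i + 1):
        out = F(xor(out, r[step - 1]))
    return out
```

Calling `chain(x, r, 0)` returns `x` unchanged, and each further step applies one masked hash, so the cost of a chain grows linearly in `i`.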
|
||||
Fig. 4. WOTS+ signature generation diagram.

Step1: Key Generation (GEN)

Key generation consists of two steps, private key generation and public key generation. The process is shown in Fig. 2.

(1) Private key generation: A PRG is used to generate $l + 2^\omega - 1$ random strings of $n$ bits each. The first $l$ strings form the private key $sk = (sk_0, sk_1, \ldots, sk_{l-1})$, and the last $2^\omega - 1$ strings form the mask $r = (r_1, r_2, \ldots, r_{2^\omega - 1})$.

(2) Public key generation: The public key consists of $l + 1$ blocks. The first block is the mask $r$, and the remaining $l$ blocks are derived from $sk$. The public key is composed as follows:

$$pk_i = c^{2^\omega - 1}(sk_{i-1}, r), \quad i \in [1, l]$$

$$pk = (pk_0, pk_1, \ldots, pk_l) = \left(r,\; c^{2^\omega - 1}(sk_0, r),\; \ldots,\; c^{2^\omega - 1}(sk_{l-1}, r)\right)$$
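The key generation step can be sketched as follows. This is a minimal illustration under stated assumptions: `os.urandom` stands in for the PRG, SHA-256 stands in for $F$, $n$ is fixed at 256 bits, and the names `keygen` and `chain` are hypothetical.

```python
import hashlib
import os

N = 32  # n in bytes (assumption: 256-bit security parameter)


def F(x):
    """Stand-in for F (assumption: SHA-256)."""
    return hashlib.sha256(x).digest()


def chain(x, r, steps):
    """c^steps(x, r): apply F(prev XOR r_i) for i = 1..steps."""
    out = x
    for i in range(1, steps + 1):
        out = F(bytes(a ^ b for a, b in zip(out, r[i - 1])))
    return out


def keygen(l, omega):
    """Return (sk, pk) with sk = (sk_0..sk_{l-1}) and pk = (r, pk_1..pk_l)."""
    steps = 2 ** omega - 1
    # l + 2^omega - 1 random n-bit strings: first l are sk, the rest are the mask r
    rand = [os.urandom(N) for _ in range(l + steps)]
    sk, r = rand[:l], rand[l:]
    # pk_0 is the mask; pk_i is the end of the chain started at sk_{i-1}
    pk = [r] + [chain(sk_i, r, steps) for sk_i in sk]
    return sk, pk
```

A call such as `sk, pk = keygen(67, 4)` produces 67 secret strings and a public key of 68 blocks, the first of which is the mask.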
|
||||
Step2: Message Signature (SIG)

(1) Generate message digest: The message $m$ to be signed is hashed to produce the message digest $M$, which is then divided into $l_1$ parts of $\omega$ bits each. Each $\omega$-bit part $m_i$, $i \in [0, l_1 - 1]$, is interpreted as an integer. The message digest generation process is shown in Fig. 3, and the overall signature generation process is shown in Fig. 4.

(2) Calculate the checksum:

$$C = \sum_{i=0}^{l_1 - 1} \left(2^\omega - 1 - m_i\right) \le l_1 \left(2^\omega - 1\right)$$

Divide $C$ into $\omega$-bit blocks, giving $c = (c_0, c_1, \ldots, c_{l_2 - 1})$.

Let $b = (b_0, b_1, \ldots, b_{l-1})$ be the concatenation of $m$ and $c$.

Signature generation is represented by the following formula:

$$\sigma = (\sigma_0, \sigma_1, \ldots, \sigma_{l-1}) = \left(F^{b_0}(sk_0, r),\; F^{b_1}(sk_1, r),\; \ldots,\; F^{b_{l-1}}(sk_{l-1}, r)\right)$$

Step3: Message verification (VER)

The message $M$ is converted to $b = (b_0, b_1, \ldots, b_{l-1})$. Then, the transmitted signature $\sigma = (\sigma_0, \sigma_1, \ldots, \sigma_{l-1})$ is processed as follows to obtain $pk'$. If $pk'$ is the same as $pk$, the signature verification succeeds.

$$pk' = (r, pk'_1, pk'_2, \ldots, pk'_l) = \left(r,\; F^{2^\omega - 1 - b_0}(\sigma_0),\; F^{2^\omega - 1 - b_1}(\sigma_1),\; \ldots,\; F^{2^\omega - 1 - b_{l-1}}(\sigma_{l-1})\right)$$

2.3. XMSS

2.3.1. Merkle tree

The Merkle Signature Scheme (MSS), proposed by Ralph Merkle in 1979, integrates the Merkle tree with an OTS algorithm. A Merkle tree is a hierarchical structure where leaf nodes contain hash values of data, and non-leaf nodes store the combined hash values of their child nodes. This structure enables efficient data integrity verification, especially for large-scale datasets. The structure of the Merkle tree is shown in Fig. 5. According to Fig. 5, the tree has 3 layers and $2^3 = 8$ leaf nodes, each storing the hash of a one-time signature public key. The leaf nodes,
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
labeled node0 to node7, are hashed pairwise to generate the middle nodes. The final root node stores the public key.

Fig. 5. Merkle tree structure diagram.

The Merkle tree serves two primary functions:

(1) Data Integrity Verification, where users can check if data has been tampered with by recalculating the root hash.

(2) Public Key Size Compression, reducing the storage requirements for numerous public keys by consolidating them into a single root key.

2.3.2. Key generation

The XMSS algorithm deploys $2^h$ WOTS+ instances as the $2^h$ leaf nodes of a Merkle tree with height $h$, with the root node authenticating these instances [38]. The XMSS key consists of multiple OTS keys and the root of the Merkle tree as the public key.

Step1: Select the parameters

Step2: Generate a one-time signature key pair $(pk, sk)$

Step3: Build the Merkle tree

Use each OTS public key $pk_i$ as a leaf node of the Merkle tree. Each pair of nodes generates a parent node through a hash function, which eventually generates the Root node. The parent node in the Merkle tree is the hash of its two child nodes, that is, $\mathit{Node}(i) = H(\mathit{child}_{\mathrm{left}}(i) \,\|\, \mathit{child}_{\mathrm{right}}(i))$. The root node $\mathit{Root}$ serves as the XMSS public key.

Step4: Output the key pair

Public key: $pk = (\mathit{root}, \mathit{seed})$. The private key consists of the OTS key pairs.

2.3.3. Message signature

To sign a message, an unused WOTS+ private key is selected, and the Merkle tree path proof is generated to output the signature SIG.

Step1: Select WOTS+ key

Choose an unused WOTS+ private key $sk_i$, ensuring it is used only once.

Step2: Generate WOTS+ one-time signature

Use the WOTS+ private key to sign message $M$, producing the OTS signature $\mathit{Sig}_{OTS}$.

Step3: Merkle tree path proof

Compute the hash path from the leaf node $pk_i$ to the Root node. This path proves that the OTS public key is valid.

Step4: Generate XMSS signature

The signature includes the serial number $i$ (indicating that the $i$-th OTS key is used), the OTS signature $\mathit{Sig}_{OTS}$, and the AuthPath for authentication in the Merkle tree: $\mathit{Sig}_{XMSS} = (i, \mathit{Sig}_{OTS}, \mathit{AuthPath})$.

2.3.4. Signature verification

The signature verification process ensures the correctness of the OTS signature and validates that the corresponding OTS public key is consistent with the root of the Merkle tree. The main steps are as follows:

Step1: Extract Information

Extract the OTS serial number $i$, the OTS signature $\mathit{Sig}_{OTS}$, and the path proof AuthPath for the Merkle tree from the XMSS signature $\mathit{Sig}_{XMSS}$.

Step2: Verify OTS signature

Using the extracted OTS public key, verify the validity of $\mathit{Sig}_{OTS}$ for the message $M$. If verification fails, the signature is deemed invalid.

Step3: Compute Merkle Tree Path

Using the OTS public key $pk_i$ and the path proof AuthPath, calculate the hash value of each parent node step by step from the leaf node $pk_i$, with $\mathit{Node}(i) = H(\mathit{child}_{\mathrm{left}}(i) \,\|\, \mathit{child}_{\mathrm{right}}(i))$, until the root node is calculated.

Step4: Compare Root Nodes

Compare the reconstructed root node with the root node $\mathit{Root}$ from the XMSS public key. If the values match, the signature is valid; otherwise, it is invalid.

3. Hash-based post-quantum ring signature scheme

In addition to their high computational efficiency and excellent scalability, hash-based signature schemes, such as XMSS and SPHINCS+, exhibit greater algorithmic maturity than other post-quantum digital signature schemes. Furthermore, post-quantum ring signatures ensure both the anonymity and unforgeability of signatures. Consequently, in light of the security threats posed by the rapid advancement of quantum computing, it is highly significant to integrate the post-quantum ring signature scheme with vehicle networking.

3.1. Design principles

The Merkle tree is an efficient data structure, a binary hash tree where each node represents the hash value of a data block. The root node represents the hash of the entire data set. The characteristics of the Merkle tree make it a highly efficient method for storing and verifying large amounts of data. In blockchain, Merkle trees are widely used to store transaction data and block hashes. Ring signatures enable
|
||||
|
||||
|
||||
a message sender to demonstrate possession of at least one public key within a set while concealing the specific public key used, thus providing anonymity and unlinkability. This feature makes ring signatures particularly valuable in applications centered on privacy and secure communication. Within ring signatures, Merkle trees can be employed to organize the hashes of messages or data blocks into a tree structure, facilitating efficient verification of data integrity and authenticity. Furthermore, ring signatures can leverage Merkle trees to obscure the identity of the sender by integrating the public key of the signer with those of other members in a ring. Consequently, the signer can validate ownership of at least one public key in the set without disclosing the specific key used. Even if an attacker intercepts the signed message, they would be unable to ascertain the true identity of the signer.

3.2. Scheme description

This scheme is based on the definition of Merkle tree accumulators as described in [35], with slight modifications to accommodate the proposed post-quantum ring signature scheme utilizing hash functions, specifically designed for vehicular networks. This formalism facilitates the restatement of the Merkle tree accumulator algorithm within the current framework. The main parameters of this scheme are given in Table 4.

Table 4
Meaning of parameters in the proposed scheme.

Parameter                    Description
$k$                          Security parameter
$t$                          Maximum number of elements to accumulate
$i$                          $i \in [0, 2^h - 1]$
$h \in \mathbb{N}$           Height of the tree
$H$                          Hash function, $H : \{0,1\}^* \to \{0,1\}^m$
$(sk_\Omega, pk_\Omega)$     A key pair
$X$                          The set $\{ x_i \mid i \in [0, 2^h - 1] \}$
$\Omega$                     The accumulator
$aux$                        The auxiliary information
$wit_{x_i}$                  The certificate for $x_i$

Definition 4 (Extend Merkle Tree Accumulator). The Merkle tree accumulator algorithm (Algorithm 1) comprises the following subroutines (Gen, Eval, WitCreate, Verify), defined as follows:

$\mathit{Gen}(1^k, t)$: The key generation algorithm takes a security parameter $k$ and a parameter $t$, where $t$ is the upper bound on the number of elements to be accumulated, and returns a key pair $(sk_\Omega, pk_\Omega)$.

$\mathit{Eval}((sk_\Omega, pk_\Omega), X)$: This algorithm takes the key pair $(sk_\Omega, pk_\Omega)$ and the set of elements $X$ to be accumulated, returning the accumulator $\Omega_X$ and some auxiliary information $aux$.

$\mathit{WitCreate}((sk_\Omega, pk_\Omega), \Omega_X, aux, x_i)$: This algorithm takes the key pair $(sk_\Omega, pk_\Omega)$, accumulator $\Omega_X$, auxiliary information $aux$, and an element $x_i$. If $x_i$ is not in the set $X$, it returns false; otherwise, it returns a certificate $wit_{x_i}$ for $x_i$.

$\mathit{Verify}(pk_\Omega, \Omega_X, wit_{x_i}, x_i)$: This algorithm takes the public key $pk_\Omega$, accumulator $\Omega_X$, certificate $wit_{x_i}$, and element $x_i$. If $wit_{x_i}$ is a valid certificate for $x_i$, it returns 1; otherwise, it returns 0.

The Merkle tree accumulator ensures both correctness and collision resistance. Collision resistance indicates the difficulty of finding an element $x_{i,j}$ that does not belong to $X$ yet possesses a valid certificate $wit_{x_{i,j}}$.

Definition 5 (Collision Resistance). Collision resistance implies that for an adversary $A$ possessing a valid key pair $(sk_\Omega, pk_\Omega)$ generated by the Gen algorithm, and under the assumption that intermediate values are correct, the probability of finding an element $x^*_i$ that is not in the accumulated set $X^*$ but still produces a verification result of 1 is negligible. Assuming the existence of a negligible function $\varepsilon(k)$, collision resistance is formally defined as follows:

$$\Pr\left[\begin{array}{l} \mathit{Gen}(1^k, t) \to (sk_\Omega, pk_\Omega),\quad A(pk_\Omega) \to (wit^*_{x_i}, x^*_i, X^*), \\ \mathit{Eval}_r((sk_\Omega, pk_\Omega), X^*) \to \Omega^* : \\ \mathit{Verify}(pk_\Omega, \Omega^*, wit^*_{x_i}, x^*_i) = 1 \,\wedge\, x^*_i \notin X^* \end{array}\right] \le \varepsilon(k)$$

The implementation of the Merkle tree ring signature is described next, and the whole process is covered in Algorithm 1.

Step1: Key Generation: $\mathit{Gen}(1^k, t)$

First, determine the hash functions $\{H_k\}_{k \in K^\kappa}$, where for any $k \in K^\kappa$, the hash function is $H_k : \{0,1\}^* \to \{0,1\}^\kappa$. The hash function can be chosen from the SHA family, SM3, etc. Determine the parameter $N$, which represents the number of ring members, and $t$, the upper bound for accumulating elements. Then, generate the key pair and return $(sk_\Omega, pk_\Omega)$.

Step2: Public Key Evaluation: $\mathit{Eval}((sk_\Omega, pk_\Omega), X)$

Parse the number of ring members $N$. If $N$ is not a power of 2, the function returns false, because the tree must be a perfect binary tree. If $N$ is a power of 2, begin computation from layer 0 (the leaf nodes at the lowest level) and continue until the root (the single node at the top) is obtained. Let $L_{u,v}$ represent the node at layer $v$ with index $u$. The auxiliary variable $aux$ stores the hash values corresponding to each layer.

Step3: Certificate Creation: $\mathit{WitCreate}((sk_\Omega, pk_\Omega), \Omega_X, aux_{x_i}, x_i)$

First, parse $aux$ into nodes at each level of the Merkle tree. Then, reconstruct the Merkle tree from bottom to top. The WitCreate algorithm uses the intermediate nodes to build up to the root hash value.

Step4: Certificate Verification: $\mathit{Verify}(pk_\Omega, \Omega_X, wit_{x_i}, x_i)$

The final step is verification. Start by setting the leaves to the hash values of each party and proceed to compute hashes from the bottom up. Check if the final result matches the root node value. If it matches, the member is verified to be part of the ring. For example, Fig. 6 shows how node $l_{0,2}$ reconstructs the root node in a Merkle tree with height $h = 3$ and $N = 8$ leaf nodes.

Algorithm 1 Extend Merkle tree accumulator
input: $k$, $t$, $\{H_k\}_{k \in K^\kappa}$, $H_k : \{0,1\}^* \to \{0,1\}^\kappa$
output: $(sk_\Omega, pk_\Omega)$, $L_{u,v}$, $wit_{x_i}$, 0 or 1
1. $k \in K^\kappa$  # Key generation $\mathit{Gen}(1^k, t)$
2. $(sk_\Omega, pk_\Omega) \leftarrow \{H_k\}_{k \in K^\kappa}$
3. $H_k \leftarrow pk_\Omega$  # Public Key Resolution
4. $(x_0, x_1, \ldots, x_{n-1}) \leftarrow X$
5. If $n = 2^k$ with $k \in \mathbb{N}$, $v \le k$:
6.     $L_{u,v} \leftarrow H_k(L_{2u,v+1} \,\|\, L_{2u+1,v+1})$ if $v < k$ else $H_k(x_i)$
7. Else False
8. $(l_{u,v})_{u \in [n/2^{k-v}],\, v \in [k]} \leftarrow aux$  # Creates a certificate $\mathit{WitCreate}((pk_\Omega, sk_\Omega), \Omega_X, aux_X, x_i)$
9. $wit_{x_i} \leftarrow (l_{\lfloor i/2^v \rfloor + \eta,\, k-v})$, $0 \le v \le k$
10. where $\eta = 1$ if $\lfloor i/2^v \rfloor \pmod 2 = 0$ else $-1$
11. $H_k \leftarrow pk_\Omega$, $L_{0,0} \leftarrow \Omega_X$  # Certificate authentication $\mathit{Verify}(pk_\Omega, \Omega_X, wit_{x_i}, x_i)$
12. $L_{i,k} \leftarrow H_k(L_{\lfloor i/2^v \rfloor,\, k-v} \,\|\, L_{\lfloor i/2^v \rfloor + 1,\, k-v})$ if $\lfloor i/2^v \rfloor \pmod 2 = 0$, else $L_{i,k} \leftarrow H_k(L_{\lfloor i/2^v \rfloor - 1,\, k-v} \,\|\, L_{\lfloor i/2^v \rfloor,\, k-v})$
13. 1 if $wit_{x_i}$ is a valid witness for $x_i \in X$ else 0

3.3. Signature algorithm description

The hash-based post-quantum ring signature scheme explored in this work is based on the XMSS algorithm, which incorporates two primary frameworks: the WOTS+ algorithm and the Merkle tree algorithm. Below is an overview of these frameworks.
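The accumulator subroutines of Algorithm 1 (Eval, WitCreate, Verify) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: SHA-256 stands in for $H_k$, the witness is a plain list of sibling hashes with the leaf index passed separately, and the names `eval_acc`, `wit_create`, and `verify` are hypothetical.

```python
import hashlib


def H(data):
    """Stand-in for H_k (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()


def eval_acc(X):
    """Eval: return (root, aux), or None if len(X) is not a power of two.

    aux[0] holds the leaf hashes; aux[-1] holds the single root.
    """
    n = len(X)
    if n == 0 or n & (n - 1):
        return None
    level = [H(x) for x in X]
    aux = [level]
    while len(level) > 1:
        # Each parent is the hash of the concatenation of its two children
        level = [H(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
        aux.append(level)
    return level[0], aux


def wit_create(aux, i):
    """WitCreate: sibling hashes on the path from leaf i up to the root."""
    wit = []
    for level in aux[:-1]:
        wit.append(level[i ^ 1])  # sibling is i+1 for even i, i-1 for odd i
        i //= 2
    return wit


def verify(root, wit, x, i):
    """Verify: rebuild the root from x and its witness, then compare."""
    node = H(x)
    for sibling in wit:
        node = H(node + sibling) if i % 2 == 0 else H(sibling + node)
        i //= 2
    return node == root
```

For a ring of $N = 8$ leaves, the witness for leaf index 2 consists of three hashes: the neighbouring leaf at index 3, the parent of leaves 0 and 1, and the node covering leaves 4 to 7.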
|
||||
|
||||
|
||||
|
||||
Fig. 6. A Merkle tree with height $h = 3$ and $N = 8$ leaf nodes, visualizing the reconstruction of the root node from node $l_{0,2}$.

Definition 6 (Merkle Tree Ring Signature Algorithm). The Merkle tree-based ring signature algorithm comprises four main steps: parameter definition, public key generation, signature generation, and signature verification. These steps are outlined as follows:

Step 1: Parameter Definition

The height $h$ of the tree represents its number of layers, meaning a Merkle tree with height $h$ has $2^h$ leaf nodes, indicating $2^h$ ring members and corresponding key pairs $(x_i, y_i)$, $i \in [0, 2^h - 1]$.

In practical application scenarios, if the number of vehicles does not satisfy this condition, it is recommended to either introduce virtual members into the ring or divide the vehicles into multiple rings.

Step 2: Public Key Generation/Merkle Tree Construction

As shown in Algorithm 2, all leaf nodes of the Merkle tree together constitute the ring. Each member in the ring is represented by a public-private key pair corresponding to a leaf node. Each leaf node holds the hash of the public key derived from a one-time signature (OTS) scheme, while each parent node stores the hash of the concatenation of its two child nodes. This process repeats according to the same generation rule until the final root node is formed. The value of the root node is the final public key, while the private key consists of the $2^h$ OTS private keys $x_i$. The number of ring members equals the number of leaf nodes in the Merkle tree. It is essential to ensure that the number of participating members in the ring is a power of 2. The public key of each ring member corresponds to the public key from the one-time signature.

Algorithm 2 Public Key Generation
input: $h$, $SK$
output: $PK$
1. $node_i = \mathit{Hash}(node_{2i+1} \,\|\, node_{2i})$, $i \in [0, 2^h - 1]$
2. Root = Hash(node1 || node2)
3. PK = Root

Step 3: Signature Generation

Before executing the ring signature operation, the signer hashes the binary message to generate a message digest $m = H(M)$, where $H$ is the chosen hash function and $M$ represents the original binary message. This digest $m$ will be used in the subsequent steps of the signature generation process. This process is shown in Algorithm 3.

Algorithm 3 Signature generation
input: $M$, $H$, one-time signature key pair $(x_i, y_i)$
output: $\sigma$
1 $(x_i, y_i)$, $i \in [0, 2^h - 1]$
2 For $x_i$
3     Select node to perform a one-time digital signature on message $M$ to generate signature $\sigma_{OTS}$
4     Calculate $y_i$ authentication path $auth_i$
5 $\sigma = (i, \sigma_{OTS}, Y_i, auth_i)$

The formal signing process begins by selecting the corresponding one-time signature (OTS) key pair $(x_i, y_i)$, specifically the $i$th OTS key pair. The signer then uses the private OTS key $x_i$ to sign the message, creating a one-time signature $\sigma_{OTS}$ and calculating the authentication path. The final signature comprises the index $i$, the one-time signature $\sigma_{OTS}$, the public key $y_i$, and the authentication path for $y_i$, denoted $auth_i$. The signature is formally represented as $\sigma = (i, \sigma_{OTS}, Y_i, auth_i)$. Fig. 7 illustrates the signing process using leaf node $x_2$ as the signing node, where the shaded areas represent the authentication path of the signature.

Step 4: Signature Verification

As shown in Algorithm 4, signature verification begins by verifying the one-time signature $\sigma_{OTS}$. If this check is successful, the next step involves reconstructing the Merkle tree root based on the chosen index $i$ and the public key $y_i$. The reconstructed root is then compared with the stored public key. If the two match, verification is deemed successful.

Algorithm 4 Signature verification
input: $\sigma$
output: true or false
1 If
2     $\mathit{VER}(M, \sigma_{OTS}, Y_i) = \mathit{true}$
3     Reconstruct the $\mathit{root}^*$ node of the Merkle tree according to $i$ and $Y_i$
4     If
5         $\mathit{root}^* = PK$
6         true
7     Else
8         false
9 Else
10     false

To illustrate the reconstruction process, consider node $x_2$ as an example, assuming $i = 2$ and $Y_2$ known, along with the signature $\sigma = (2, \sigma_{OTS}, Y_2, auth_2)$. Here, $auth_2$ contains the values stored in nodes 3, 8, and 13. The root node can be reconstructed as follows: node14 = hash(node12 ∥ node13), node12 = hash(node8 ∥ node9), node9 = hash(node2 ∥ node3), where node2 stores the value of $Y_2$. The computed value of node14 is the value of the reconstructed root $\mathit{root}^*$. This is shown in Fig. 8. By hashing upwards from the leaf nodes, if a match with the stored root node is found, the membership of the signer in the ring is verified.

3.4. Application of the scheme in vehicular networks

The proposed hash-based signature scheme offers post-quantum security, protecting against quantum threats, and is highly efficient with compact signatures, ideal for resource-constrained on-board devices in IoV. It supports fast information exchange and verification in dynamic traffic environments, enhancing security and privacy, such as in accident reporting systems, while maintaining reporter anonymity. Overall, it addresses key security, efficiency, and scalability challenges in connected vehicle networks.

The application of ring signatures in IoV involves three main stages: the registration stage, the inter-vehicle communication stage, and the signature tracing and broadcast stage.

Step 1: Registration Stage

This stage consists of three main steps. First, the On-Board Unit (OBU) sends a registration request to the Trusted Authority (TA). Upon receiving the request, the TA generates a public-private key pair $(PK_{OBU}, SK_{OBU})$ for the OBU. In the final step, the TA returns the private key to the OBU, along with the public key and identity information bound to the blockchain network. The identity information typically includes vehicle certificates, vehicle identification numbers (VIN), and other vehicle-related data. This process ensures that vehicles
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 7. Diagram of the signature generation process.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 8. Signature verification diagram.
|
||||
|
||||
|
||||
|
||||
are properly registered and recognized within the blockchain network, as illustrated in Fig. 9.

Step 2: Inter-Vehicle Communication Stage

At this stage, the OBU utilizes the public key of the Roadside Unit (RSU), $PK_{RSU}$, to encrypt its own public key and sends it to the RSU, requesting the creation of a ring. Upon receiving the encrypted message, the RSU decrypts it using its private key to obtain $PK_{OBU}$, which is then added to the ring. When the number of ring members reaches the threshold of $2^h$, the RSU broadcasts the ring structure, allowing all ring members to participate in signing processes.

If the threshold is not met, virtual members may be added, or the ring may be split into smaller sub-rings to ensure each ring contains $2^h$ members. Once the ring is established, the OBU can sign messages using a ring signature and forward them to the RSU. The RSU subsequently broadcasts the signed messages to other OBUs, which can request verification from the Verification Node (VN). The VN validates the signatures and returns the verification results to the requesting OBU, enabling secure and authenticated access to the information. This process is further illustrated in Fig. 10.

Step 3: Signature Tracing and Broadcast Stage

In the event of an accident, the OBU sends accident-related information to the RSU, which then processes and broadcasts the information to other OBUs. At the same time, the RSU forwards the signature of the OBU involved in the accident, denoted as $\mathit{SIG}(OBU_{acc})$, to the TA. The TA uses its private key to identify the relevant vehicle information. If the OBU is determined to be malicious, the TA revokes its identity and public key on the blockchain network. The TA then sends the revoked public key and the adverse record of the malicious OBU to the RSU. The RSU subsequently broadcasts this information to other OBUs, ensuring they are aware of the revoked identity and can exclude the malicious OBU from further network participation. This process is illustrated in Fig. 11.
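The ring formation rule from the inter-vehicle communication stage (rings of exactly $2^h$ members, padded with virtual members or split into sub-rings) can be sketched as follows. This is an illustration only: the function name `form_rings` is hypothetical, and the text does not specify how virtual members are produced, so random 32-byte placeholders are assumed here.

```python
import os


def form_rings(public_keys, h, make_virtual_member=lambda: os.urandom(32)):
    """Group public keys into rings of exactly 2**h members.

    Full groups become sub-rings; the last, incomplete group is padded
    with virtual members (assumption: random placeholder keys).
    """
    size = 2 ** h
    rings = [list(public_keys[i:i + size]) for i in range(0, len(public_keys), size)]
    for ring in rings:
        while len(ring) < size:
            ring.append(make_virtual_member())
    return rings
```

With ten registered keys and $h = 2$, this produces three rings of four members each, the last containing two real keys and two virtual members.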
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 9. Registration phase.

Fig. 10. Information interaction phase.

Fig. 11. Signature tracing phase.

When applying this ring signature scheme to a vehicular network system, the overall model framework is shown in Fig. 12. The primary components of the model include:

[1] On-Board Unit (OBU): Responsible for sending requests to the TA, transferring its public key to the RSU, signing messages with the ring signature, and sharing traffic accident information.
[2] Roadside Unit (RSU): Organizes received public keys into a ring, broadcasts signatures, accident information, and adverse records to other vehicles, and forwards accident-related signatures to the TA.
[3] Trusted Authority (TA): Generates key pairs for the OBU, uploads these to the blockchain network, and, in the event of an accident, sends the public key and adverse record of the vehicle involved to the RSU.
[4] Verification Node (VN): Responsible for verifying signature requests sent by other vehicles.
[5] Anonymous Blockchain Network (ABN): In this model, vehicle public keys are stored in the blockchain network, providing a secure and anonymous framework for identity management.

Fig. 12. IoV model based on post-quantum ring signature.

In addition to the interactions between the OBU and the TA, as well as between the OBU and RSU in the aforementioned process, within a specific segment of roadway, the OBU is also capable of engaging with pedestrians, road infrastructure, and stations located within that segment.

In general, the integrity and privacy protection of data transmission are more emphasized in interactions between vehicles and other vehicles, as well as roadside units. However, interactions between vehicles and pedestrians often involve location verification and identity confirmation. In a vehicular networking system, vehicles may need to verify both the identity and location of pedestrians, while using post-quantum ring signatures to ensure the integrity and non-repudiation of pedestrian information.

4. Security analysis

4.1. Safety assessment

The proposed scheme possesses the following characteristics:

(1) Anonymity: Ring signatures inherently support anonymity, protecting the identity of the signer. Assuming an attacker has obtained a valid ring signature generated only by members within the ring, if the ring contains $n$ members, the probability that the attacker identifies the true signer is $1/n$. For any member other than the signer, the probability of knowing the identity of the signer is $1/(n - 1)$.

(2) Privacy: The generation of a ring signature relies solely on the signer within the ring, with no involvement from other ring members, thus preserving the privacy of the signer.

(3) Post-Quantum Security: This scheme employs a post-quantum ring signature approach based on Merkle trees, leveraging hash-based and post-quantum secure mathematical problems. This design provides robust security against quantum computing threats. The use of hash-based post-quantum ring signatures combines the strong properties of hash functions with quantum-resilient security, maintaining integrity even under potential quantum computing attacks.

(4) Efficiency: The computational efficiency of hash functions makes this scheme suitable for a variety of application scenarios.

(5) Unforgeability: The scheme ensures unforgeability through the one-way and irreversible properties of hash functions in constructing hash chains. Thus, it is highly challenging for anyone other than the legitimate signer to forge a signature within this scheme.
|
||||
|
||||
|
||||
|
||||
Fig. 13. Authentication path diagram of a node with index i = 2.

4.2. Security proof

The following section provides security proofs and discussions for the proposed scheme:

Lemma 1. If a one-time signature scheme passes verification and the reconstructed Merkle root $\mathit{Root}^*$ matches the original Merkle root $\mathit{Root}$, then the signature is valid.

Proof. Suppose the index $i = 2$ is chosen for the one-time signature key used in the message signature. The path from index $i = 2$ to the root node traverses nodes [2, 9, 12], with sibling nodes [3, 8, 13], forming the verification path [3, 8, 13]. Fig. 13 illustrates the verification path of the leaf node with index 2, which is depicted as the gray node. Reconstructing the root $\mathit{Root}^*$ follows these steps:

$$\mathit{Node}(9) = \mathit{Hash}(\mathit{node}(2) \,\|\, \mathit{node}(3))$$

$$\mathit{Node}(12) = \mathit{Hash}(\mathit{node}(8) \,\|\, \mathit{node}(9))$$

$$\mathit{Node}(14) = \mathit{Hash}(\mathit{node}(12) \,\|\, \mathit{node}(13))$$

The value of node 9 is computed from nodes 2 and 3, the value of node 12 is computed from nodes 8 and 9, and the value of the root node $\mathit{Root}^*$ (node 14) is computed from nodes 12 and 13. This computed $\mathit{Root}^*$ value is then compared with the public key. Clearly, $\mathit{Root}^*$ matches the original public key. The proof process for any other node is identical, thus confirming the correctness of the signature.

Theorem 1. The proposed post-quantum ring signature scheme preserves anonymity.

Assuming a valid signature $\sigma = (i, \sigma_{OTS}, Y_i, auth_i)$, where each value of $i$ is within the appropriate range $i \in [0, 2^h - 1]$, the probability that any other person can identify the true signer is $1/2^h$ (for a ring with $2^h$ members). For other ring members, the probability of knowing the identity of the signer is $1/(2^h - 1)$.

Theorem 2. The proposed ring signature scheme is unforgeable.

Proof. Suppose an attacker $A$ could successfully forge a ring signature with non-negligible probability $P$ within polynomial time. We construct a simulator $S$ to challenge a ring signature algorithm claimed to be secure by challenger $C$ as follows:

Step 1: The challenger initializes $n$ signing instances with the MSS signing algorithm, generating $n$ key pairs $(sk, pk)$ and sends all public

$C$ computes the corresponding $\sigma_s$, which $S$ returns as a complete ring signature to $A$.

Step 4: In the challenge phase, $A$ sends $M$ and an unobserved forged ring signature to $S$, which calculates the corresponding $Y_s$ of the forged signer and submits $(Y_s, \sigma_s)$ to $C$. If $C$ verifies $Y_s$ and $\sigma_s$ as valid, then $S$ has successfully forged a signature, with output 1; otherwise, $S$ fails, outputting 0.

Since $A$ can break the scheme with non-negligible probability $P$, we deduce that $\Pr(\mathit{output}(\mathit{Game}) = 1) = P$, allowing $S$ to break the post-quantum ring signature algorithm with non-negligible probability. However, this contradicts the assumed security of the scheme, proving that $A$ cannot successfully forge signatures in polynomial time.

Theorem 3. If the underlying hash function family $\{H_k\}$, $k \in K^\kappa$, is a collision-resistant family, then the proposed hash-based post-quantum ring signature scheme is collision-resistant.

Proof. During initialization, the reduction interacts with a collision-resistant hash function challenge to acquire $H_k$ and completes initialization per the original protocol. If an attacker generates a collision within the accumulator, this implies that the reduction knows two distinct inputs that collide under $H_k$, with the collision probability bounded by the collision resistance of the hash function.

Theorem 4. If the employed hash functions are one-way, then the proposed Merkle-tree-based post-quantum ring signature scheme is unforgeable under chosen-message attacks.

Let $n, w, m \in \mathbb{N}$, with $w, m = \mathit{poly}(n)$, and let the function family $F_n = \{ f_k : \{0,1\}^n \to \{0,1\}^n \mid k \in \{0,1\}^n \}$ satisfy second-preimage resistance and one-way properties. The variable $t$ represents the computational time. The term $\omega \cdot \mathrm{InSec}^{\mathrm{UD}}(F_n; t^*)$ reflects the undetectability (UD) security of the function family $F_n$, while $\mathrm{InSec}^{\mathrm{OW}}(F_n; t')$ represents its one-way (OW) security. Additionally, the term $\omega \cdot \mathrm{InSec}^{\mathrm{SPR}}(F_n; t')$ denotes the second-preimage resistance (SPR) security, scaled by the parameter $\omega$. The formal definitions of EU-CMA and SPR are provided in [14], and will not be elaborated on here.

We define the unforgeability insecurity under chosen-message attack of WOTS+ as follows:

$$\mathrm{InSec}^{\mathrm{EU\text{-}CMA}}\left(\mathrm{WOTS}^+(1^n, w, m); t, 1\right) \le w \cdot \mathrm{InSec}^{\mathrm{UD}}(F_n; t^*) + w l \cdot \max\left\{\mathrm{InSec}^{\mathrm{OW}}(F_n; t'),\; w \cdot \mathrm{InSec}^{\mathrm{SPR}}(F_n; t')\right\}$$

with $t' = t + 3lw$ and $t^* = t + 3lw + w - 1$.

For WOTS+ combined with Merkle trees, the non-forgeability under chosen-message attacks on the Merkle tree can be defined as follows:

$$\mathrm{InSec}^{\mathrm{EU\text{-}CMA}}\left(\text{Merkle-tree}(1^n, T = 2^h); t, 1\right) \le 2 \cdot \max\left\{ 2^{h + \log_2 \ell - 1} \cdot \mathrm{InSec}^{\mathrm{SPR}}\left(\mathrm{WOTS}^+(1^n, \omega, m); t, 1\right) \right\}$$

Using the derived insecurity function for the Merkle tree combined with W-OTS, which employs pseudorandom key generation and $\mathit{Gen}_{2^h}$, we arrive at the following results:

$$\mathrm{InSec}^{\mathrm{EU\text{-}CMA}}\left(\mathrm{XMSS}(1^n, T = 2^h); t, 1\right) \le \mathrm{InSec}^{\mathrm{EU\text{-}CMA}}\left(\mathrm{WOTS}^+(1^n, \omega, m); t, 1\right) + \mathrm{InSec}^{\mathrm{EU\text{-}CMA}}\left(\text{Merkle-tree}(1^n, T = 2^h); t, 1\right)$$
|
||||
keys pk to simulator S. = InSecPRF (𝐹𝑛 , 𝑡′ + 2ℎ , 2ℎ )
|
||||
Step 2: Upon receiving the public keys, S initializes the ring sig- ⎧(2ℎ+log2 𝑙−1 ) ⋅ InSecSPR (𝐻𝑛 , 𝑡′ ), ⎫
|
||||
nature algorithm by randomly selecting additional parameters and ⎪ ℎ PRF ′
|
||||
⎪
|
||||
⎪2 ⋅ InSec (𝐹𝑛 ; 𝑡 + 𝑙, 𝑙)+ ⎪
|
||||
forwarding the public keys to attacker A. + 2 max ⎨ ( { OW ′
|
||||
}) ⎬.
|
||||
Step 3: In the query phase, A selects a message M and sends it to ⎪ UD ∗
|
||||
InSec (𝐹𝑛 ; 𝑡 ), ⎪
|
||||
⎪ 𝜔 ⋅ InSec 𝐹𝑛 ; 𝑡 + max ⎪
|
||||
S. Following the ring signature algorithm, S randomly selects a user ⎩ InSecSPR (𝐹𝑛 ; 𝑡′ ) ⎭
|
||||
𝑠 to generate the ring signature, computes 𝑌𝑠 , and forwards it to C.
|
||||
|
||||
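The root reconstruction used in the proof of Lemma 1 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes SHA-256 as the hash, an 8-leaf tree whose nodes are numbered level by level (leaves 0 to 7, root 14) as in the example, and the helper names `build_tree`, `auth_path_indices`, and `reconstruct_root` are hypothetical. It also assumes that each pair is concatenated in left-to-right order by position, whereas the formula for Node(12) above writes the running node first.

```python
import hashlib


def h(left: bytes, right: bytes) -> bytes:
    """Hash of the concatenation left || right (SHA-256 is an assumption)."""
    return hashlib.sha256(left + right).digest()


def build_tree(leaves):
    """Return all node values numbered level by level: leaves first, root last."""
    nodes = list(leaves)
    level = list(leaves)
    while len(level) > 1:
        level = [h(level[j], level[j + 1]) for j in range(0, len(level), 2)]
        nodes.extend(level)
    return nodes


def auth_path_indices(index, num_leaves):
    """Global indices of the sibling nodes on the path from a leaf to the root."""
    path, offset, width = [], 0, num_leaves
    while width > 1:
        path.append(offset + (index ^ 1))  # sibling within the current level
        offset += width                    # first global index of the next level
        index //= 2
        width //= 2
    return path


def reconstruct_root(leaf, index, siblings):
    """Recompute Root* from a leaf value and its authentication path."""
    node = leaf
    for sibling in siblings:
        # An even position is a left child, an odd position a right child.
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node


leaves = [hashlib.sha256(bytes([j])).digest() for j in range(8)]
nodes = build_tree(leaves)
path = auth_path_indices(2, 8)
siblings = [nodes[j] for j in path]
root_star = reconstruct_root(leaves[2], 2, siblings)
print(path, root_star == nodes[14])
```

A verifier only needs the leaf value, its index, and the three sibling values; a different leaf value at the same position yields a different Root∗ except with negligible probability.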
Table 5
Results of 16 XMSS-SHA2_10_256 signature tests.

Number   Signature time (s)   Verification time (s)
0        1.990014             0.001119
1        1.980151             0.000947
2        1.969849             0.001210
3        1.965888             0.001184
4        1.969898             0.001056
5        1.980296             0.001144
6        2.017889             0.001093
7        2.054971             0.001101
8        2.016147             0.001241
9        2.020737             0.001267
10       1.954583             0.001016
11       2.021315             0.001060
12       2.029765             0.001043
13       2.057487             0.001016
14       1.958401             0.001081
15       1.990919             0.001053

Fig. 14. Signature generation time of 16 test results.
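The spread of the 16 measurements in Table 5 can be summarized with a short script. This is only a sketch: the values are copied from Table 5, the unit is assumed to be seconds (as in the time/s columns of Table 6), and the helper name `summarize` is hypothetical.

```python
from statistics import mean

# Values copied from Table 5 (assumed to be in seconds).
sign_times = [
    1.990014, 1.980151, 1.969849, 1.965888, 1.969898, 1.980296,
    2.017889, 2.054971, 2.016147, 2.020737, 1.954583, 2.021315,
    2.029765, 2.057487, 1.958401, 1.990919,
]
verify_times = [
    0.001119, 0.000947, 0.001210, 0.001184, 0.001056, 0.001144,
    0.001093, 0.001101, 0.001241, 0.001267, 0.001016, 0.001060,
    0.001043, 0.001016, 0.001081, 0.001053,
]


def summarize(samples):
    """Return the mean, minimum, and maximum of a list of timings."""
    return {"mean": mean(samples), "min": min(samples), "max": max(samples)}


for name, samples in (("signature", sign_times), ("verification", verify_times)):
    s = summarize(samples)
    print(f"{name}: mean={s['mean']:.6f} min={s['min']:.6f} max={s['max']:.6f}")
```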
To prove that XMSS is unforgeable under chosen-message attacks, we consider the following factors:

Random Oracle Model: Assuming the hash function behaves as a random oracle, an attacker has no foreknowledge of input–output pairs.

Irreversibility: WOTS+ security relies on the irreversibility of hash chains; given a hash value $H^{i}(x)$, finding the predecessor $H^{i-1}(x)$ is infeasible.

Collision Resistance: The hash function must resist collisions, making it nearly impossible for an attacker to produce distinct messages that yield identical hash chains.
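The hash-chain idea behind the irreversibility argument can be illustrated as follows. This is a simplified sketch of a plain Winternitz chain, assuming SHA-256 and a Winternitz parameter of 16; WOTS+ additionally XORs a bitmask into the value before each hash step, which is omitted here, and the names `chain`, `secret`, and `digit` are hypothetical.

```python
import hashlib


def chain(x: bytes, steps: int) -> bytes:
    """Apply the hash function `steps` times, i.e. compute H^steps(x)."""
    for _ in range(steps):
        x = hashlib.sha256(x).digest()
    return x


W = 16                         # Winternitz parameter (assumed value)
secret = b"\x01" * 32          # one secret chain start (fixed here for illustration)
public = chain(secret, W - 1)  # chain end, published as part of the public key

digit = 5                      # message digit in [0, W - 1] signed by this chain
signature = chain(secret, digit)

# The verifier walks the remaining W - 1 - digit steps and compares with the chain end.
print(chain(signature, W - 1 - digit) == public)
```

Moving forward along the chain is cheap, while producing the signature for a smaller digit from a known one would require inverting the hash.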
|
||||
Fig. 15. Signature verification time of 16 test results.
|
||||
|
||||
5. Performance analysis

This study evaluates the performance of the proposed scheme in densely trafficked urban areas, focusing particularly on resistance to quantum attacks. The experiments are based on the Merkle-tree ring signature scheme, with a primary emphasis on security strength, as attacks in IoV environments are expected to become increasingly complex, especially with the advent of quantum attacks. Consequently, a high-security, quantum-resistant signature scheme is essential for IoV systems.

The primary operations in the signature scheme include generating public and private keys, measuring the time required for message signing and verification, and instantiating the SHA-256 function as the underlying hash function. Key parameters include the security parameter $n$, the Winternitz parameter $\omega$, and the number of ring members, with specific values assigned to each. These operations allow us to measure metrics such as key generation time, signature generation time, and signature verification time.

In this scheme, the digital signature algorithm is set to XMSS-SHA2-10-256, which uses the SHA-256 hash function with a Merkle tree height of 10, enabling a maximum of $2^{10} = 1024$ possible ring signatures. The number of signature tests is set to 16 to balance efficiency and data stability, ensuring valid results without excessive resource consumption.

To present the data more intuitively, the experimental results of the 16 tests shown in Table 5 are depicted in graphical form in Fig. 14 and Fig. 15. Fig. 14 illustrates the signature generation times across the 16 tests, while Fig. 15 displays the signature verification times. These figures show that both the signature generation time and the verification time fluctuate within a certain range rather than taking fixed values. One of the 16 test results is selected for comparison with related schemes from the literature. The attributes compared include key generation time, signature generation time, signature verification time, resistance to quantum attacks, anonymity, traceability, and application to the IoV. The comparison results are given in Tables 6 and 7. In our scheme, we set the parameters as $n = 32$, $\omega = 16$, the height of the Merkle tree as 10, and the number of ring members as $2^{10}$. In the tables, HBS stands for a hash-based scheme and LBS stands for a lattice-based scheme.

Table 6
Signature efficiency comparison table.

Scheme   Type   Number of members   Key generation time/s   Signature time/s   Verification time/s
OURS     HBS    2^10                2.06                    1.97               9.47e−04
[33]     LBS    10                  0.07                    0.06               0.04
[32]     LBS    –                   34.1e−06                9.59e−05           3.49e−05
[25]     HBS    2^10                –                       0.16               0.11

Table 7
Function comparison table of the scheme.

Scheme   Type   Post-quantum security   Anonymity   Traceability   Application to IoV
OURS     HBS    YES                     YES         YES            YES
[33]     LBS    NO                      YES         YES            YES
[32]     LBS    YES                     NO          NO             YES
[25]     HBS    YES                     YES         YES            NO

Comparing the scheme proposed in this paper with the scheme in [33], it can be seen that the post-quantum ring signature scheme based on the Merkle tree has clear advantages. First, in this evaluation, the number of ring members our scheme can accommodate is $2^{10}$, which is much larger than the number of ring members evaluated in [33]. When the road section is wider and more crowded, the scheme proposed in this paper is more suitable. Second, this scheme offers post-quantum security. Moreover, although the key generation and signature times of our scheme are longer than those of the scheme with fewer ring members in [33], its verification is much faster: according to Table 6, roughly 42 times faster than that of [33] (9.47e−04 s versus 0.04 s).
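The relative timings discussed in this comparison can be recomputed directly from Table 6. The sketch below is illustrative only: the figures are copied from Table 6, a missing entry is represented as `None`, and the helper name `ratio` is hypothetical. A ratio above 1 means the other scheme takes longer than ours for that operation.

```python
# Timing figures in seconds, copied from Table 6; None marks a missing entry.
TABLE_6 = {
    "OURS": {"keygen": 2.06, "sign": 1.97, "verify": 9.47e-04},
    "[33]": {"keygen": 0.07, "sign": 0.06, "verify": 0.04},
    "[32]": {"keygen": 34.1e-06, "sign": 9.59e-05, "verify": 3.49e-05},
    "[25]": {"keygen": None, "sign": 0.16, "verify": 0.11},
}


def ratio(other, metric, baseline="OURS"):
    """Time of `other` divided by time of `baseline` for one metric, or None."""
    a = TABLE_6[other][metric]
    b = TABLE_6[baseline][metric]
    if a is None or b is None:
        return None
    return a / b


for scheme in ("[33]", "[32]", "[25]"):
    for metric in ("keygen", "sign", "verify"):
        r = ratio(scheme, metric)
        text = "n/a" if r is None else f"{r:.3g}"
        print(f"{scheme} {metric}: {text}")
```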
Compared with the scheme in [32], the outstanding feature of the scheme in this paper is that it is a ring signature, which provides anonymity and traceability and makes it more suitable for the Internet of Vehicles environment. In addition, the scheme in this paper uses a Merkle tree structure, which reduces the storage cost of the public key and signature. In general, lattice signatures may require special optimization in high-performance computing, and their algorithm maturity is not high. In contrast, the underlying hash function of the post-quantum ring signature scheme in this paper is SHA-256, which has stood the test of time in many practical applications and has high algorithm maturity.

Comparing the scheme in this paper with the scheme in [25], it can be seen that both are based on hash functions. The advantages of the scheme in this paper are as follows. First, although signature generation in [25] is nearly 12 times faster than in this paper, signature verification in this paper is more than 100 times faster than in [25]. In addition, the scheme in this paper is also applied to the vehicle networking model.

As shown in Table 7, this study compares the attributes of "Post-quantum", "Anonymity", "Traceability", and "Application to IoV". The comparison reveals that our scheme offers post-quantum security, anonymity, traceability, and applicability to IoV, and the advantages of the proposed scheme become more evident through this comprehensive comparison.

6. Conclusion

The hash-based post-quantum ring signature scheme offers advantages such as high signature efficiency, good scalability, and independence from complex mathematical assumptions. In the context of increasing security threats posed by advancements in quantum computing, applying post-quantum ring signatures in IoV can enhance anonymity and privacy protection while ensuring quantum-resistant security. This paper presents a hash-based post-quantum ring signature scheme built on the XMSS algorithm and demonstrates its application in the IoV system. The proposed scheme is analyzed and proven secure. Performance analysis is conducted following 16 experimental tests, with comparisons made to other similar schemes. The results show that the proposed scheme exhibits significant advantages in signature verification time compared to other approaches. This is due to the efficient hash computations and Merkle tree verification paths, which maintain low time complexity and high efficiency even with large data sets. Moreover, the scheme satisfies the properties of quantum resistance, anonymity, traceability, and applicability to IoV.

Future research will first aim to further improve the practicality and security of the scheme in response to the evolving threats posed by quantum computing. Second, interdisciplinary collaboration can be strengthened to provide valuable insights for optimizing solutions in real-world scenarios.

CRediT authorship contribution statement

Shuanggen Liu: Conceptualization. Xiayi Zhou: Writing – original draft. Xu An Wang: Supervision. Zixuan Yan: Investigation. He Yan: Formal analysis. Yurui Cao: Resources.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC) under Grant No. 62172436. The first author and the third author are the corresponding authors of this paper.

Data availability

No data was used for the research described in the article.

References

[1] I. Wanger, Car production: Number of cars produced worldwide, Statista (2020).
[2] Patrick Miner, Barbara M. Smith, Anant Jani, Geraldine McNeill, Alfred Gathorne-Hardy, Car harm: A global review of automobility’s harm to people and the environment, J. Transp. Geogr. 115 (2024) 103817.
[3] Juan Contreras-Castillo, Sherali Zeadally, Juan Antonio Guerrero-Ibañez, Internet of vehicles: Architecture, protocols, and security, IEEE Internet Things J. 5 (5) (2018) 3701–3709, http://dx.doi.org/10.1109/JIOT.2017.2690902.
[4] David Deutsch, Quantum theory, the Church–Turing principle and the universal quantum computer, Proc. R. Soc. A 400 (1818) (1985) 97–117.
[5] Rasha Shajahan, Kurunandan Jain, Prabhakar Krishnan, A survey on NIST 3rd round post quantum digital signature algorithms, in: 2024 5th International Conference on Mobile Computing and Sustainable Informatics, ICMCSI, IEEE, 2024, pp. 132–140.
[6] David A. Cooper, Daniel C. Apon, Quynh H. Dang, Michael S. Davidson, Morris J. Dworkin, Carl A. Miller, et al., Recommendation for stateful hash-based signature schemes, NIST Spec. Publ. 800 (208) (2020) 208–800.
[7] Samira El Madani, Saad Motahhir, Abdelaziz El Ghzizal, Internet of vehicles: concept, process, security aspects and solutions, Multimedia Tools Appl. 81 (12) (2022) 16563–16587.
[8] Cesar Castellon, Swapnoneel Roy, Patrick Kreidl, Ayan Dutta, Ladislau Bölöni, Energy efficient merkle trees for blockchains, in: 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom, IEEE, 2021, pp. 1093–1099.
[9] Daniel J. Bernstein, Andreas Hülsing, Stefan Kölbl, Ruben Niederhagen, Joost Rijneveld, Peter Schwabe, The SPHINCS+ signature framework, in: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, 2019, pp. 2129–2146.
[10] Kaiyi Zhang, Hongrui Cui, Yu Yu, SPHINCS-𝛼: A compact stateless hash-based signature scheme, 2022, Cryptology ePrint Archive.
[11] Mikhail Kudinov, Andreas Hülsing, Eyal Ronen, Eylon Yogev, SPHINCS+ C: Compressing SPHINCS+ with (almost) no cost, 2022, Cryptology ePrint Archive.
[12] Sun Siwei, Liu Tianyu, Guan Zhi, SM3-based post-quantum digital signature schemes, J. Cryptologic Res. 10 (1) (2023) 46.
[13] Andreas Hülsing, Mikhail Kudinov, Recovering the tight security proof of SPHINCS+, in: International Conference on the Theory and Application of Cryptology and Information Security, Springer, 2022, pp. 3–33.
[14] Andreas Hülsing, Denis Butin, Stefan Gazdag, Joost Rijneveld, Aziz Mohaisen, XMSS: Extended Merkle Signature Scheme, Technical Report, 2018.
[15] Jan Philipp Thoma, Tim Güneysu, A configurable hardware implementation of XMSS, 2021, Cryptology ePrint Archive.
[16] Siwei Sun, Tianyu Liu, Zhi Guan, Yifei He, Jiwu Jing, Lei Hu, Zhenfeng Zhang, Hailun Yan, XMSS-SM3 and MT-XMSS-SM3: Instantiating extended Merkle signature schemes with SM3, 2022, Cryptology ePrint Archive.
[17] Andreas Hülsing, W-OTS+–shorter signatures for hash-based signature schemes, in: Progress in Cryptology–AFRICACRYPT 2013: 6th International Conference on Cryptology in Africa, Cairo, Egypt, June 22-24, 2013. Proceedings 6, Springer, 2013, pp. 173–188.
[18] Kaiyi Zhang, Hongrui Cui, Yu Yu, Revisiting the constant-sum winternitz one-time signature with applications to SPHINCS+ and XMSS, in: Annual International Cryptology Conference, Springer, 2023, pp. 455–483.
[19] Xie Jia, Liu Shizhao, Wang Lu, Research progress and prospects of ring signature technology, J. Front. Comput. Sci. Technol. 17 (5) (2023).
[20] Rohit Chatterjee, Kai-Min Chung, Xiao Liang, Giulio Malavolta, A note on the post-quantum security of (ring) signatures, in: IACR International Conference on Public-Key Cryptography, Springer, 2022, pp. 407–436.
[21] Yuxi Xue, Xingye Lu, Man Ho Au, Chengru Zhang, Efficient linkable ring signatures: new framework and post-quantum instantiations, in: European Symposium on Research in Computer Security, Springer, 2024, pp. 435–456.
[22] Abida Haque, Alessandra Scafuro, Threshold ring signatures: new definitions and post-quantum security, in: Public-Key Cryptography–PKC 2020: 23rd IACR International Conference on Practice and Theory of Public-Key Cryptography, Edinburgh, UK, May 4–7, 2020, Proceedings, Part II 23, Springer, 2020, pp. 423–452.
[23] Maxime Buser, Joseph K. Liu, Ron Steinfeld, Amin Sakzad, Post-quantum id-based ring signatures from symmetric-key primitives, in: International Conference on Applied Cryptography and Network Security, Springer, 2022, pp. 892–912.
[24] J. Odoom, X. Huang, Z. Zhou, et al., Linked or unlinked: A systematic review of linkable ring signature schemes, J. Syst. Archit. 134 (2023) 102786.
[25] Shiwei Xu, Tao Wang, Ao Sun, Yan Tong, Zhengwei Ren, Rongbo Zhu, Houbing Herbert Song, Post-quantum anonymous, traceable and linkable authentication scheme based on blockchain for intelligent vehicular transportation systems, IEEE Trans. Intell. Transp. Syst. (2024).
[26] Nyothiri Aung, Tahar Kechadi, Tao Zhu, Saber Zerdoumi, Tahar Guerbouz, Sahraoui Dhelim, Blockchain application on the internet of vehicles (iov), in: 2022 IEEE 7th International Conference on Intelligent Transportation Engineering, ICITE, IEEE, 2022, pp. 586–591.
[27] Haibin Zhang, Jiajia Liu, Huanlei Zhao, Peng Wang, Nei Kato, Blockchain-based trust management for internet of vehicles, IEEE Trans. Emerg. Top. Comput. 9 (3) (2020) 1397–1409.
[28] Mirador Labrador, Weiyan Hou, Implementing blockchain technology in the internet of vehicle (IoV), in: 2019 International Conference on Intelligent Computing and Its Emerging Applications, ICEA, IEEE, 2019, pp. 5–10.
[29] Y. Liu, Q. Xia, X. Li, et al., An authentication and signature scheme for UAV-assisted vehicular ad hoc network providing anonymity, J. Syst. Archit. 142 (2023) 102935.
[30] X. Feng, X. Wang, K. Cui, et al., A distributed message authentication scheme with reputation mechanism for internet of vehicles, J. Syst. Archit. 145 (2023) 103029.
[31] S. Thapliyal, M. Wazid, D.P. Singh, et al., Robust authenticated key agreement protocol for internet of vehicles-envisioned intelligent transportation system, J. Syst. Archit. 142 (2023) 102937.
[32] Nikhil Verma, Swati Kumari, Pranavi Jain, Post quantum digital signature change in iota to reduce latency in internet of vehicles (iov) environments, in: 2022 International Conference on IoT and Blockchain Technology, ICIBT, IEEE, 2022, pp. 1–6.
[33] Cui Yongquan, Cao Ling, Zhang Xiaoyu, Privacy protection of internet of vehicles based on lattice-based ring signature, Chinese J. Comput. 42 (5) (2019) 980–992.
[34] Cesar Castellon, Swapnoneel Roy, Patrick Kreidl, Ayan Dutta, Ladislau Bölöni, Energy efficient merkle trees for blockchains, in: 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom, IEEE, 2021, pp. 1093–1099.
[35] David Derler, Sebastian Ramacher, Daniel Slamanig, Post-quantum zero-knowledge proofs for accumulators with applications to ring signatures from symmetric-key primitives, in: Post-Quantum Cryptography: 9th International Conference, PQCrypto 2018, Fort Lauderdale, FL, USA, April 9-11, 2018, Proceedings 9, Springer, 2018, pp. 419–440.
[36] Xinyu Zhang, Ron Steinfeld, Joseph K. Liu, Muhammed F. Esgin, Dongxi Liu, Sushmita Ruj, DualRing-PRF: Post-quantum (linkable) ring signatures from Legendre and power residue PRFs, in: Australasian Conference on Information Security and Privacy, Springer, 2024, pp. 124–143.
[37] David A. Cooper, Daniel C. Apon, Quynh H. Dang, Michael S. Davidson, Morris J. Dworkin, Carl A. Miller, et al., Recommendation for stateful hash-based signature schemes, NIST Spec. Publ. 800 (208) (2020) 208–800.
[38] Ralph C. Merkle, A certified digital signature, in: Conference on the Theory and Application of Cryptology, Springer, 1989, pp. 218–238.
@@ -0,0 +1,929 @@
|
||||
Journal of Systems Architecture 160 (2025) 103341
|
||||
|
||||
|
||||
Contents lists available at ScienceDirect
|
||||
|
||||
|
||||
Journal of Systems Architecture
|
||||
journal homepage: www.elsevier.com/locate/sysarc
|
||||
|
||||
|
||||
|
||||
|
||||
A load-balanced acceleration method for small and irregular batch matrix multiplication on GPU

Yu Zhang a, Lu Lu a,b,∗, Zhanyu Yang a, Zhihong Liang c,d, Siliang Suo c,d

a School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
b Peng Cheng Laboratory, Shenzhen, 518055, China
c Electric Power Research Institute, CSG, Guangzhou, China
d Guangdong Provincial Key Laboratory of Power System Network Security, Guangzhou, China

ARTICLE INFO

Keywords: Batch GEMM; Thread workload; Multi-thread kernel; Tiling algorithm

ABSTRACT

As an essential mathematical operation, GEneral Matrix Multiplication (GEMM) plays a vital role in many applications, such as high-performance computing and machine learning. In practice, the performance of GEMM is limited by the dimensions of the matrices and the diversity of GPU hardware architectures. When dealing with batched, irregular and small matrices, GEMM usually performs poorly. To this end, a common approach is to segment the matrix into multiple tiles and utilize parallelism between workgroups in the GPU to compute the results. However, previous works only consider tile size and inter-workgroup parallelism and ignore the low computational efficiency and hardware resource utilization caused by the difference in workloads between wavefronts. To address these issues, we propose a load-balanced batch GEMM acceleration method, consisting of a multi-thread kernel design and an efficient tiling algorithm. The multi-thread kernel design addresses the workload imbalance between wavefronts in different workgroups, and the efficient tiling algorithm chooses the optimal tiling scheme with a new thread-level parallelism calculation method to achieve load-balanced task allocation. Finally, various comparative experiments were conducted on two GPU platforms: AMD and NVIDIA. Experimental results indicate that the proposed method outperforms previous methods.
|
||||
|
||||
|
||||
|
||||
1. Introduction

GEneral Matrix Multiplication (GEMM) is a standard computing kernel that plays an important role in high-performance computing [1], artificial intelligence [2], image processing [3], and other research fields. With the explosive growth of data volume and the emergence of various algorithms, the demand for high-performance GEMM computing is increasing [4,5]. Additional stream processors and memory are integrated into the GPU to cater to this trend, providing tremendous computational power for GEMM acceleration. To fully utilize the hardware acceleration capability, AMD and NVIDIA provide developers with GPU-based parallel computing platforms (ROCm and CUDA). Based on these platforms, various optimization algorithms and acceleration libraries have been proposed and demonstrated to be highly effective, such as rocBLAS [6], cuBLAS [7], and MAGMA [8]. These methods achieve optimal computational task allocation through hardware resource scheduling and thread parallelism to accelerate the matrix multiplication operation [9,10].

Many real-world applications, such as deep learning, involve irregular, small-size matrix multiplication operations in their computations [11]. For example, Convolutional Neural Networks (CNN) [12–14] contain a large number of convolutional layers, and the convolution kernels tend to be small (e.g. "1*1" and "3*3"). Convolution operations are converted to GEMM using the Im2col function, and the dimension of the matrix is typically less than 1000 [15,16]. These small GEMM computations prevent the GPU from fully exploiting its hardware computing potential. In this case, the scheduling overhead between batch GEMMs and the irregularity of the matrices pose challenges to computational performance [17,18]. For a GEMM, tiling is a standard solution. The matrix is segmented into multiple tiles, and a thread block is responsible for computing an individual tile. Since each tile is independent, multiple tiles can be computed in parallel by multiple threads in the GPU to speed up the GEMM computation. A larger tile dimension will increase the Thread-Level Parallelism (TLP) of a single tile but will also
|
||||
|
||||
|
||||
|
||||
∗ Corresponding author at: School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China.
|
||||
E-mail addresses: yuzhang0722@163.com (Y. Zhang), lul@scut.edu.cn (L. Lu), yangzhanyu@hotmail.com (Z. Yang), liangzh@csg.cn (Z. Liang),
|
||||
suosl@csg.cn (S. Suo).
|
||||
|
||||
https://doi.org/10.1016/j.sysarc.2025.103341
|
||||
Received 3 September 2024; Received in revised form 3 November 2024; Accepted 8 January 2025
|
||||
Available online 23 January 2025
|
||||
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
|
||||
Y. Zhang et al. Journal of Systems Architecture 160 (2025) 103341
|
||||
|
||||
|
||||
reduce the number of tiles, resulting in a failure to fully utilize the hardware resources of the GPU [19,20]. The Instruction-Level Parallelism (ILP) of a single thread is related to the K-dimension. Generally, a large enough matrix can fully use the GPU hardware resources and achieve higher TLP and ILP [21,22].

To improve computational efficiency, previous studies have proposed several acceleration methods for matrix multiplication. For instance, rocBLAS [6] and cuBLAS [7] provide batch GEMM APIs (rocblas_sgemm_batched and cublasSgemmBatched), which allow multiple GEMMs to be computed simultaneously on GPUs. However, these APIs support only uniform matrix sizes, which considerably limits their applicability. NVIDIA also provides a C++-style template library, CUTLASS [23], which utilizes built-in tile templates and sorting to accelerate matrix multiplication operations. In fact, the size of matrices is variable in many real-world applications [11]. To solve this issue, a Vbatch GEMM routine that supports batch GEMM with various sizes is designed and implemented by MAGMA (magmablas_sgemm_vbatched). It adapts to batch GEMMs with multiple tiling strategies, assigning the appropriate tile to a single GEMM for large performance gains. Although variable sizes are supported in MAGMA, it still has some limitations. First, MAGMA only supports some coarse-grained tiling strategies that are not appropriate for all GEMMs. Coarse-grained tiling results in an unbalanced kernel workload and reduced GPU utilization. Second, the grid size is determined by the tiling of the largest matrix, which leads to idle threads and a waste of GPU computing power. Third, the lack of an evaluation criterion for tiling makes the choice of strategy less efficient.

To thoroughly support batch GEMM with variable sizes, it is essential to design a tiling algorithm that can be adapted to all GEMMs and adaptively choose tile sizes, not limited to a single size. The optimal tiling for each GEMM is different, depending on the matrix dimensions $(M, N, K)$. How to choose a suitable tile is a challenge for batch GEMM. At the same time, an evaluation criterion based on the current GPU hardware and tiling strategy is also essential. Given the GPU hardware, an appropriate tiling for each GEMM can be chosen to fully utilize the GPU computing capabilities and achieve better computational performance. How to measure the effectiveness of the tiling algorithm on the GPU hardware is a challenging problem. Tiles of various sizes can lead to significant differences in computational effort within each workgroup, and further to an unbalanced distribution of computational tasks and excessive load differences between threads. Hence, for tiles of various sizes, balancing thread computation and data loading during computation is also a challenge for batch GEMM.

To address the above challenges, we propose a batch GEMM acceleration method with a multi-thread kernel design. Furthermore, an efficient tiling algorithm is proposed to achieve load balance and higher hardware resource utilization. Our contributions can be summarized as follows:

• A multi-thread kernel design scheme is proposed to balance thread computation and data loading in different workgroups when computing the various tiles.
• A novel TLP computation method is designed to select the optimal tiling algorithm by combining the kernel occupancy of the GPU and the tiling operation.
• An efficient tiling algorithm is implemented by considering the GPU hardware architecture and the batch GEMM workload.

2. Related work and motivation

2.1. Related work

Several approaches have been proposed for batch GEMM computation, which mainly focus on algorithm-level optimization or architecture-level optimization. The former mainly explores lower bounds on the time complexity of GEMM operations at the mathematical level and reduces the computational effort. The latter is based on different GPU architecture features and uses corresponding optimization techniques to improve the computational efficiency of GEMM. In algorithm-level optimization, Strassen [24] proposed a novel GEMM algorithm based on the property that matrix addition is faster than matrix multiplication, which uses seven multiplications and multiple addition operations instead of eight multiplications. This approach mathematically reduced the time complexity of GEMM to $O(n^{2.81})$ for the first time. To reduce the extra memory space required by Strassen's algorithm, three different methods were proposed in [25]: pre-additions, overwriting the input matrix, and recursive scheduling. At the same time, owing to the effectiveness of deep neural networks in various domains, Fawzi et al. [26] transformed the process of finding the optimal complexity of matrix multiplication into a tensor decomposition problem and used reinforcement learning to explore lower bounds on the complexity of matrix multiplication. In particular, for 4 × 4 matrices, the number of multiplications was as low as 47. This is better than the two-level Strassen's algorithm, which involves 49 multiplications. Although the above approaches reduce the mathematical complexity of matrix multiplication operations, it is difficult to take advantage of their performance benefits because they neglect computational scheduling strategies and multi-level memory architecture features on the GPU.

In architecture-level optimization, GPU vendors (NVIDIA and AMD) have designed and implemented computing libraries such as cuBLAS [7] and rocBLAS [6] based on their parallel computing platforms to improve GPU hardware utilization and parallelism. However, because these libraries are restricted to uniform-sized matrices, their performance is poor when faced with small and irregular batch GEMMs. Although NVIDIA provides a C++-style template library, the small size of the matrix and the lack of assembly-level optimizations make it difficult for CUTLASS to fully exploit its performance advantages for irregular and small matrix multiplication [23]. These irregular and small-sized matrices often lead to unbalanced workloads among threads in different workgroups, which can reduce kernel performance. For Sparse GEneral Matrix-Matrix multiplication (SpGEMM), the matrix's sparsity leads to significant differences in thread workloads [27,28]. To address the unbalanced workload, Chen et al. [29] optimized the matrix segmentation by analyzing the distribution of the floating point calculations of the CSR-based SpGEMM, which achieves load balance and performance improvement on Sunway TaihuLight. To address workload imbalance among threads, it is necessary to conduct a detailed analysis of the computation process and hardware platform characteristics to design an efficient parallel framework implementation [30,31]. Xiao et al. [32] introduce a fine-grained partitioning strategy to select appropriate segmentation dimensions, efficiently utilizing the parallelism
|
||||
• The proposed method can efficiently handle batch irregular GEMM of multi-thread and improving the performance of binary sparse tensor
|
||||
and achieve state-of-the-art performance on AMD and NVIDIA contracts. The diversity of matrix sizes makes it difficult to utilize a
|
||||
GPU platforms. unified routine for calculations, resulting in some threads being idle
|
||||
The rest of the paper is organized as follows. Section 2 provides in CU [33,34]. Indeed, the size of matrices is variable and irregular
|
||||
related work and motivation. Section 3 introduces background on batch in various scientific computing scenarios. To overcome the matrix
|
||||
GEMM, GPU architecture, and kernel occupancy. Section 4 presents restriction of uniform size, MAGMA [8] proposes a Vbatch routine
|
||||
the details of the multi-thread kernel design and load-balanced tiling to support batch GEMM with various sizes. In this way, it uses a 3D
|
||||
algorithm. Section 5 demonstrates and evaluates the experimental re- grid to indicate batch GEMM’s kernel design, where grid.z represents
|
||||
sult. Section 6 provides the conclusions of the paper and future work. batch size. Each GEMM corresponds to one of the 2D-grid planes, and
|
||||
The source code of this paper can be obtained in this repository link: the size of the two-dimensional plane (grid.x, grid.y) is determined by
|
||||
https://github.com/zhangyu0722/BatchGEMM.git. the largest GEMM. In the case of irregular GEMM, if the dimension
|
||||
|
||||
|
||||
Y. Zhang et al. Journal of Systems Architecture 160 (2025) 103341
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 1. GEMM and batch GEMM schematic diagram.
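The tiled accumulation that Fig. 1 (b) depicts can be sketched in plain Python. This is a minimal CPU-only illustration, not the paper's GPU kernel: the function name `tiled_gemm`, the list-of-lists matrix layout, and the default tile sizes are assumptions made for the example. Each ($T_m \times T_n$) tile of C plays the role of one workgroup, and the partial products are accumulated along $K$ in steps of $T_k$.

```python
def tiled_gemm(A, B, C, alpha=1.0, beta=0.0, Tm=2, Tn=2, Tk=2):
    """Return alpha * A @ B + beta * C, computed tile by tile (illustrative only)."""
    M, K, N = len(A), len(B), len(B[0])
    # Start from the beta * C term; partial products are accumulated on top.
    out = [[beta * C[i][j] for j in range(N)] for i in range(M)]
    # Each (i0, j0) pair identifies one tile of C, i.e. the work of one workgroup.
    for i0 in range(0, M, Tm):
        for j0 in range(0, N, Tn):
            # Walk along K in chunks of Tk: one A tile (Tm x Tk) times one B tile (Tk x Tn).
            for k0 in range(0, K, Tk):
                for i in range(i0, min(i0 + Tm, M)):
                    for j in range(j0, min(j0 + Tn, N)):
                        partial = 0.0
                        for k in range(k0, min(k0 + Tk, K)):
                            partial += A[i][k] * B[k][j]
                        out[i][j] += alpha * partial
    return out


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[7, 8], [9, 10], [11, 12]]
    C = [[0, 0], [0, 0]]
    print(tiled_gemm(A, B, C))
```

The `min(...)` bounds handle matrices whose dimensions are not multiples of the tile size, which is the irregular case the paper targets.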
|
||||
|
||||
|
||||
difference between the largest GEMM and the rest is too large, a large number of threads and workgroups will be idle, resulting in a waste of GPU computing resources. For various parallel acceleration platforms, different hardware characteristics, such as register size and number of CUs, will affect the allocation of computing resources in the kernel. To ensure kernel performance, it is necessary to flexibly set parameters based on different matrix sizes and hardware architectures [9,35]. To solve this problem, a coordinated tiling and batching strategy is proposed in [21], where a different tiling strategy is used for each GEMM in batch GEMM and appropriate batching is used according to the tile size to improve the computational efficiency of the GPU. Wang et al. [36] proposed the sort-up algorithm based on the GEMM workload and split-down in the tiling process, which can segment large tiles into multiple smaller tiles. This approach can improve CU utilization when the number of GEMMs is limited.

2.2. Motivation

Although the above-mentioned methods improve the parallel computing efficiency of batch GEMM on GPU from various perspectives, two problems remain. One is that the workload of threads varies significantly across the kernel. In the above approaches, tiles with various sizes are designed, and each tile is handled by the corresponding kernel, where the number of threads is fixed. In general, larger tiles have better TLP. This will also increase the workload of each thread for large-size tiles, and the thread responsible for computing large tiles requires more hardware resources (VGPR, SGPR, LDS) and computing time. The other is that differences between wavefronts within different workgroups are ignored in the TLP calculations. The workgroup will be transformed into multiple wavefronts during GPU computation and be executed in parallel on the CU. Each CU can run multiple wavefronts simultaneously, and the number of wavefronts depends on the hardware resources required by the wavefront. Thus, the TLP on the GPU should be determined by the number of threads in the wavefronts that can be executed in parallel on the CU.

To solve the above problems, we propose an efficient and load-balanced batch GEMM acceleration method, which consists of two parts: a multi-thread kernel design scheme and an efficient tiling algorithm. A multi-thread kernel design is proposed to balance the amount of loading and computation in the thread corresponding to each tile. The number of threads is selected according to the tile size. However, the parallel programming interfaces of the CUDA and ROCm platforms only allow a uniform number of threads for computing a tile. To overcome this limitation, we use a filtering operation in the kernel execution process to effectively alleviate this problem. An efficient tiling algorithm can choose the optimal scheme based on different GEMMs and GPUs. To measure the effect of tiling, we propose a new way of TLP computation based on wavefronts. The optimal tiling scheme is obtained by adjusting the tiling strategy according to the TLP. Finally, we obtain an efficient tiling algorithm based on the new TLP calculation method. In Section 4, the details of the proposed method are introduced.

3. Background

3.1. GEMM and batch GEMM

For a single GEMM, its accumulation routine is $C = \alpha AB + \beta C$, where $A \in \mathbb{R}^{M \times K}$, $B \in \mathbb{R}^{K \times N}$ and $C \in \mathbb{R}^{M \times N}$ are dense matrices, $M$, $N$, and $K$ represent matrix dimensions, and $\alpha$ and $\beta$ are constant scalars. A common approach is tiling matrix C into multiple tiles [21,36], which uses the parallel threads of the GPU to calculate each tile and splices the results together. As shown in Fig. 1 (b), given a GEMM with size $M \times N \times K$, the matrix C is segmented into multiple tiles of size $T_m \times T_n$. Each workgroup is responsible for the calculation of a tile and needs to access the row section of matrix A with size $T_m \times K$ and the column section of matrix B with size $K \times T_n$. However, the row cross-section of A and the column cross-section of B (represented in Fig. 1 (b) by the gray parts of matrices A and B, respectively) are too large to store in shared memory and registers. Hence, the row section of A and the column section of B are segmented into multiple A tiles of size $T_m \times T_k$ and B tiles of size $T_k \times T_n$, respectively. A partial result of C is obtained by multiplying an A tile with a B tile, and accumulating the partial results yields the final result.

To batch-run multiple GEMMs, a naive routine computes each GEMM individually. However, when the matrix size is small, a single GEMM does not fully utilize the GPU's computing power, leaving the CU idle [37,38]. To avoid this situation, a batch GEMM method is proposed to design multiple kernels for various GEMMs on the GPU [36,39]. Compared to GEMM, batch GEMM is expressed as ($M \times N \times K \times B_{size}$), where $M$, $N$ and $K$ represent the dimensions of the matrix, and $B_{size}$ represents the batch size. A batch GEMM is mapped to a 3D grid, where grid.z is the batch size, and grid.x and grid.y are the length and width of a two-dimensional plane, respectively [40]. To balance the workload of a batch GEMM, a variety of tile sizes are used for GEMM tiling. The two-dimensional grid size is determined by the corresponding matrix C and tiling strategy. Each tile is handled by the corresponding workgroup. A workgroup is decomposed into multiple wavefronts that execute on the CU. The 3D grid of batch GEMM is shown in Fig. 1 (a).

3.2. GPU architecture and kernel occupancy

With the improvement of hardware architecture and parallel computing programming platforms (such as ROCm, https://rocm.docs.amd.com/en/latest/, and CUDA, https://docs.nvidia.com/cuda/), GPUs are becoming the most popular hardware accelerator. GPUs from AMD and NVIDIA are the two most commonly used and are widely deployed in various scientific computing platforms. However, some basic concepts are expressed differently in ROCm and CUDA. We chose AMD's official
|
||||
|
||||
|
||||
|
||||
|
||||
Table 1
|
||||
ROCm/CUDA terminology.
|
||||
| ROCm | CUDA | Description |
|---|---|---|
| Compute Unit (CU) | Streaming Multiprocessor (SM) | One of many parallel vector processors in a GPU that contains parallel ALUs. All waves in a workgroup are assigned to the same CU. |
| Kernel | Kernel | Functions launched to the GPU that are executed by multiple parallel workers on the GPU. Kernels can work in parallel with CPU. |
| Wavefront | Warp | Collection of operations that execute in lockstep, run the same instructions, and follow the same control-flow path. Individual lanes can be masked off. |
| Workgroup | Thread block | Think of this as a vector thread. A 64-wide wavefront is a 64-wide vector op. |
| Work-item/Thread | Thread | GPU programming models can treat this as a separate thread of execution, though this does not necessarily get forward sub-wavefront progress. |
| Global Memory | Global Memory | DRAM memory accessible by the GPU that goes through some layers cache. |
| Local Memory | Shared Memory | Scratchpad that allows communication between wavefront in a workgroup. |
| Private Memory | Local Memory | Per-thread private memory often mapped to registers. |
|
||||
|
||||
|
||||
|
||||
terminology for this paper to provide precise specifications. To clarify some differences and relationships between ROCm and CUDA terms, a comparison of terminology is given in Table 1.

A GPU is composed of multiple Shader Engines (SE) and a command processor. Each SE integrates multiple CUs and has its own workload manager. Each CU contains an enormous number of Arithmetic and Logic Units (ALUs), a small number of control units, and caches. Hence, GPUs are suitable for a large number of simple parallel computing tasks. A GPU kernel consists of one or multiple workgroups, the size of which is determined by the number of wavefronts and threads. In the memory hierarchy, the GPU has global memory, local memory, and private memory, ordered from slow to fast by memory access speed, and local memory and private memory are much smaller than global memory [41,42].

Kernel occupancy represents the actual utilization of computing unit resources by a kernel function on the GPU; it is the ratio of active wavefronts to the maximum number of wavefronts supported by the CU [35,43]. An active wavefront running on a CU requires resources such as Vector General-Purpose Registers (VGPR), Scalar General-Purpose Registers (SGPR), Local Data Share (LDS), etc. A wavefront can be activated and run on a CU when all required resources are available. When the utilization of CU resources is low, the number of active wavefronts is small, which leads to the waste of hardware resources and the degradation of the parallel performance of the kernel. On the other hand, when the number of active wavefronts in the CU increases, the resources used by each wavefront and the available register storage space of each work-item in the wavefront decrease [44,45].

The number of active wavefronts on a CU is mainly limited by the following factors: the number of work-items in each workgroup and the sizes of VGPR, SGPR, and LDS. For example, in AMD's MI100 (https://www.amd.com/system/files/documents/instinct-mi100-brochure.pdf) and MI210 (https://www.amd.com/content/dam/amd/en/documents/instinct-business-docs/white-papers/amd-cdna2-white-paper.pdf), a wavefront consists of 64 work-items. When the number of work-items in a workgroup is less than or equal to 64, only one wavefront is included. The VGPR, SGPR, and LDS sizes on the CU have a corresponding upper bound for each work-item. According to the kernel design, the resources on the CU need to be allocated before executing each work-item. When the resource requirements of the work-item are satisfied, the wavefront can be active and run on the CU. Otherwise, it will not run until other wavefronts accomplish tasks and release resources. In order to fully utilize the hardware resources of the GPU and improve the efficiency of parallel computing, the kernel occupancy should be improved as much as possible without data overflow [46,47]. In batch GEMM, an efficient kernel design should properly allocate the data loading and computation workload for each work-item in the wavefront, so that the memory space and computing power on the CU can be more efficiently utilized [48,49].

4. Overview

4.1. Multi-thread kernel design

Tile size and kernel design are closely related in the design of batch GEMM algorithms, and there are two matrix tile design routes. The first is to design a single tile that adapts to all GEMMs, and the second is to design various tiles that adapt to different GEMMs. Compared with the first method, for irregular GEMM, the latter method is more flexible and utilizes the computing resources of the GPU more efficiently. For GEMMs with various shapes and sizes, using a single tile can easily lead to increased workload differences between threads in multiple workgroups, affecting the allocation of computing resources. In this paper, we perform a multi-thread kernel design for the second matrix segmentation method. Two different tile design strategies are shown in Fig. 2. Here we present the effect of two different tile strategies on the occupancy of the 3D grid. For the batch GEMM, different tile sizes lead to different numbers of workgroups, resulting in different 3D grid occupancies.

For a single GEMM, matrix C is tiled into multiple tiles. The tile size can be flexibly designed, and each tile can be run in parallel without data interference. Each tile is calculated by the corresponding workgroup and can be represented by a 2D-grid as a whole. When the size and number of tiles are large enough, efficient parallel execution can usually be obtained. However, in real-world cases, the size of matrices in batch GEMM tends to be small and irregular, which leads to poor performance of traditional methods. Therefore, the previous method adopts a variety of tiles to adapt to the corresponding GEMM, and each tile is based on a unified number of threads, which will lead to the workload of threads in large-scale tiles being much larger than that of small tiles. This gap in the workload of threads results in unbalanced thread loading and reduces GPU parallel computing efficiency. Table 2 lists the detailed parameters for tiles with various sizes based on the same work-item design (the number of work-items in the kernel is 128). $W_{CP}$ and $W_{DL}$ represent the computation
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 2. Two different tile design strategies for batch GEMMs. (a) All GEMMs adopt the same tiling scheme and are divided into multiple tiles of the same size. (b) Different GEMMs adopt different tiling schemes and are divided into multiple tiles of different sizes.
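The effect of the two strategies in Fig. 2 on the 3D grid can be illustrated with a small Python sketch. It assumes, as in the Vbatch-style launch described in the related work, that every GEMM gets one grid plane and that the plane is sized by the largest workgroup count along each axis; the function names `workgroups` and `grid_occupancy` and the example batch are hypothetical and not taken from the paper's code.

```python
from math import ceil


def workgroups(M, N, Tm, Tn):
    """Number of tiles (one workgroup each) needed to cover an M x N matrix C."""
    return ceil(M / Tm) * ceil(N / Tn)


def grid_occupancy(gemms, tiles):
    """Fraction of a padded 3D grid that carries real work.

    gemms: list of (M, N) sizes of C; tiles: list of (Tm, Tn), one per GEMM.
    Assumption: grid.x and grid.y are the per-axis maxima over the batch,
    and grid.z is the batch size.
    """
    counts = [workgroups(M, N, Tm, Tn) for (M, N), (Tm, Tn) in zip(gemms, tiles)]
    grid_x = max(ceil(M / Tm) for (M, _), (Tm, _) in zip(gemms, tiles))
    grid_y = max(ceil(N / Tn) for (_, N), (_, Tn) in zip(gemms, tiles))
    slots = grid_x * grid_y * len(gemms)
    return sum(counts) / slots, counts


if __name__ == "__main__":
    batch = [(64, 64), (16, 16), (32, 32)]
    # Strategy (a): one tile size for every GEMM.
    print(grid_occupancy(batch, [(16, 16)] * 3))
    # Strategy (b): a tile size chosen per GEMM.
    print(grid_occupancy(batch, [(32, 32), (16, 16), (16, 16)]))
```

For this example batch, the uniform scheme fills 21 of 48 grid slots, while the per-GEMM scheme fills 9 of 12, which shows how the tile choice changes both the number of workgroups and the share of idle ones.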
|
||||
|
||||
|
||||
Table 2
The common kernel design scheme for batch GEMM (there are significant workload gaps between threads).

| Tile | $T_m$ | $T_n$ | $T_k$ | $W_{CP}$ | $W_{DL}$ |
|---|---|---|---|---|---|
| small | 16 | 16 | 8/16 | 2 | 4/6 |
| medium | 32 | 32 | 8/16 | 8 | 12/16 |
| large | 64 | 64 | 8/16 | 32 | 40/48 |

amount and data loading amount of a work-item, respectively, and their calculation expressions are:

$$W_{CP} = \frac{T_m \times T_n}{W_{num}} \tag{1}$$

$$W_{DL} = \frac{T_m \times T_n + T_m \times T_k + T_k \times T_n}{W_{num}} \tag{2}$$

where $W_{num}$ represents the number of work-items responsible for computing the tile.

For different tiles, there is a significant gap in workload between threads ($W_{CP} \in [2, 32]$ and $W_{DL} \in [4, 48]$). The choice of $T_k$ also has a certain impact on the data load of a work-item. Each thread is responsible for more data loads when $T_k$ is larger. For example, in the large tile, when the value of $T_k$ is set to 8 or 16, each work-item is responsible for loading 40 and 48 elements, respectively. The workload differences caused by these different tile sizes impact kernel performance.

To explore the impact of the number of work-items in the workgroup and the tile size on the performance of batch GEMM, some experiments are performed, whose results are given in Fig. 3. As shown in Fig. 3, under the condition that the number of GEMMs is large and $M$, $N$, and $K$ are large enough, various thread-kernels (with 64, 128, 256, and 512 threads) are used to compute multiple tiles (the nine tiles are shown in Fig. 3). In Fig. 3, four thread kernels commonly used in previous work are selected as benchmarks [21,34,36]. We used these kernels to investigate their performance under various tiles in comparative experiments. Fig. 3 shows that the kernel's performance first increases and then decreases as the tile size grows. When the tile size is small, the thread's workload is also small. In this case, threads in the kernel only compute a few elements, so their computing power is not fully utilized. As the tile size increases, the number of elements that the thread needs to calculate and store also increases. As long as the register data does not overflow, the computing efficiency of the thread keeps improving. When the tile corresponding to the thread is too large, the register data overflows, and the data will be transferred to the global memory. For example, for a 64-thread kernel, when computing "8*8" and "32*32" tiles, respectively, each thread needs to compute 1 and 32 elements in matrix C. It is obvious that "32*32" requires more register memory. However, the register memory of each thread is precious. When the maximum limit of the register memory is exceeded, the data will be transferred to the global memory for storage. Because the access speed of global memory is considerably lower than that of registers, threads' data access efficiency decreases, and overall time consumption increases. At the same time, because thread workloads vary, when a thread with a heavy workload is run on the CU, the number of active wavefronts on the CU is smaller, so the CU's kernel occupancy (the ratio between the number of active wavefronts and the maximum number of supported wavefronts) is reduced. The CU stays in the low-occupancy state longer due to the longer work-item computation time.

To solve this problem, we propose a multi-thread kernel design, which ensures that the workload of each thread is balanced as much as possible. The experimental results in Fig. 3 show that multiple kernels' performance varies when calculating the same tile. For example, the 128-thread kernel performs best when calculating a tile with "32*32", as shown in Fig. 3. The performance gap mentioned above is mainly because of the varying workloads of threads under different kernels, which affects the overall performance. For the 128-thread kernel, when calculating a tile with "32*32", each thread needs to complete the calculation of 8 elements and the loading of 16 elements. When calculating a tile with "64*64", the workload of the threads is heavy, and each thread needs to complete the calculation of 32 elements and the loading of 64 elements. When calculating larger tiles, the workload of the thread increases significantly. To avoid significant differences in workload between threads, we used a multi-thread kernel to calculate various tiles by considering the computation amount ($W_{CP}$) and data loading amount ($W_{DL}$) of threads in the kernel. For larger tiles such as "32*64" and "64*64", a 256-thread kernel is used for computation. In this way, increasing the number of threads will reduce the thread's computation amount and data loading amount, thereby reducing the gaps between threads' workloads and achieving load balancing. There are five tiles and two kernels ($W_{num}$) for small and irregular batch matrix multiplication, as shown in Table 3. Compared to Table 2, we balance the thread workload by setting the tile size and number of kernel threads so that thread computation and data loading are as consistent as possible across different workgroups. In the calculation process of GEMM, five tile types are designed for GEMM calculation of different sizes, from "small" to "large". To ensure that the amount of computation and data loading for the work-item responsible for computing different tiles are as equal as possible, the number of threads varies depending on the tile size. In Table 3, two different thread numbers are used (128 and 256), and the computation amount ($W_{CP}$) and data loading amount ($W_{DL}$) of the work-item in each scheme are given. Although the current ROCm and CUDA platform programming interfaces only support the kernel design of a uniform thread number, we use a screening operation in the early stage of kernel execution to achieve the effect of kernel design of multiple threads. For example, in this paper, the number of kernel threads is set to 256. When the "small", "small-medium" and "medium" tiles are executed, the extra threads will be terminated immediately and the corresponding computing resources will be released because these tiles only need
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 3. Experimental results of multi-thread kernel.
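The per-work-item workloads behind these results follow directly from Eqs. (1) and (2). The short Python sketch below evaluates both expressions; the function name `workload` and the way the configurations are listed are choices made for this illustration, while the tile sizes and thread counts are those of Tables 2 and 3.

```python
def workload(Tm, Tn, Tk, Wnum):
    """Return (W_CP, W_DL) for one tile, following Eqs. (1) and (2)."""
    w_cp = (Tm * Tn) / Wnum                        # elements of C computed per work-item
    w_dl = (Tm * Tn + Tm * Tk + Tk * Tn) / Wnum    # elements loaded per work-item
    return w_cp, w_dl


if __name__ == "__main__":
    # (name, Tm, Tn, Tk, Wnum) for the five balanced configurations of Table 3.
    table3 = [
        ("small", 16, 16, 16, 128),
        ("small-medium", 16, 32, 16, 128),
        ("medium", 32, 32, 16, 128),
        ("medium-large", 32, 64, 16, 256),
        ("large", 64, 64, 16, 256),
    ]
    for name, Tm, Tn, Tk, Wnum in table3:
        w_cp, w_dl = workload(Tm, Tn, Tk, Wnum)
        print(f"{name:13s} W_CP={w_cp:g} W_DL={w_dl:g}")
```

With a fixed 128-thread kernel, the "large" tile costs each work-item 32 computed and 48 loaded elements, whereas switching it to 256 threads halves both values, which is the balancing effect the multi-thread design relies on.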
|
||||
|
||||
|
||||
Table 3
The multi-thread kernel design scheme with a more balanced workload.

| Tile | $T_m$ | $T_n$ | $T_k$ | $W_{num}$ | $W_{CP}$ | $W_{DL}$ |
|---|---|---|---|---|---|---|
| small | 16 | 16 | 16 | 128 | 2 | 6 |
| small-medium | 16 | 32 | 16 | 128 | 4 | 10 |
| medium | 32 | 32 | 16 | 128 | 8 | 16 |
| medium-large | 32 | 64 | 16 | 256 | 8 | 14 |
| large | 64 | 64 | 16 | 256 | 16 | 24 |

128 threads. Terminating threads early allows for a better allocation of computational resources to threads responsible for computing other tiles. With this implementation, we can achieve the effect of a multi-threaded kernel. Even though the performance may be degraded in comparison with an actual multi-threaded kernel, the experimental results in Section 5 demonstrate the excellent performance of this method.

4.2. Tiling algorithm

4.2.1. Criteria for evaluation

and each tile is computed by a workgroup. Workgroups are further transformed into wavefronts based on their hardware resource requirements and the number of work-items. Finally, these wavefronts are run in parallel on multiple CUs for batch GEMM calculations. Due to the difference between tile sizes, the computation amount and data loading amount of threads are not uniform across wavefronts, which will lead to unbalanced hardware resource requirements. The execution time of the wavefront on the CU is also different. The overall time of the batch GEMM is the maximum of all CU execution times. If the workload difference between wavefronts is too significant, the execution time of one wavefront will be excessive, increasing the overall calculation time consumption.

Therefore, Eq. (3) does not consider the workload gaps between wavefronts. To solve this problem, we propose a new TLP calculation method as follows:

$$TLP_{new} = \varphi\left(\sum_i \frac{M_i \times N_i}{T_{m_i} \times T_{n_i}}\right) \times T_{wavefront} \tag{4}$$

where $M_i$, $N_i$, $T_{m_i}$ and $T_{n_i}$ have the same meaning as in Eq. (3), $T_{wavefront}$ is the number of work-items in a wavefront, and $\varphi$ represents the conversion process of workgroup to wavefront.
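To make the difference between the workgroup-level TLP of Eq. (3) and the wavefront-level TLP of Eq. (4) concrete, the following Python sketch evaluates both for a small batch. The paper does not give a closed form for $\varphi$ at this point, so `make_phi` is a hypothetical stand-in: it converts workgroups into wavefronts of 64 work-items and caps the result at an assumed hardware limit `max_active_wavefronts`. All function and parameter names are illustrative.

```python
from math import ceil


def total_workgroups(gemms, tiles):
    """Sum over the batch of (M_i * N_i) / (T_mi * T_ni), as used in Eqs. (3) and (4)."""
    return sum((M * N) / (Tm * Tn) for (M, N), (Tm, Tn) in zip(gemms, tiles))


def tlp_workgroup(gemms, tiles, threads_per_workgroup):
    """Eq. (3): TLP measured at the workgroup level."""
    return total_workgroups(gemms, tiles) * threads_per_workgroup


def tlp_wavefront(gemms, tiles, phi, threads_per_wavefront=64):
    """Eq. (4): TLP measured at the wavefront level; phi maps workgroups to wavefronts."""
    return phi(total_workgroups(gemms, tiles)) * threads_per_wavefront


def make_phi(work_items_per_workgroup, max_active_wavefronts, wavefront_size=64):
    """Hypothetical phi: wavefronts per workgroup, capped by an assumed hardware limit."""
    per_workgroup = ceil(work_items_per_workgroup / wavefront_size)

    def phi(workgroups):
        return min(workgroups * per_workgroup, max_active_wavefronts)

    return phi


if __name__ == "__main__":
    gemms = [(64, 64), (32, 32)]
    tiles = [(32, 32), (16, 16)]
    print(tlp_workgroup(gemms, tiles, 128))
    print(tlp_wavefront(gemms, tiles, make_phi(128, max_active_wavefronts=1000)))
    print(tlp_wavefront(gemms, tiles, make_phi(128, max_active_wavefronts=10)))
```

When the assumed cap is not reached, both measures agree; once the hardware can only keep a limited number of wavefronts active, the wavefront-level value drops below the workgroup-level one, which is the gap Eq. (4) is meant to expose.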
|
||||
The tiling can be seen as a re-assignment of the GEMM computation task. An efficient tiling algorithm can transform GEMM operations and improve hardware resource utilization. When various kernel designs are implemented, choosing an appropriate tiling scheme becomes a crucial issue. In general, for a GEMM, there will be better parallelism within the workgroup when the tile size is larger. However, a larger tile means that the number of tiles is reduced. If the number of tiles is too small, the CU cannot be fully utilized, resulting in a waste of computing resources. Therefore, choosing a suitable tiling evaluation criterion is crucial. In the previous study, TLP was used to quantify the parallelism of tiling strategies on GPUs. Given a GEMM and a tiling strategy, its TLP can be calculated as follows:

$$TLP = \sum_i \frac{M_i \times N_i}{T_{m_i} \times T_{n_i}} \times T_{workgroup} \tag{3}$$

where $M_i$ and $N_i$ are the dimensions of matrix C of the $i$th GEMM, and $T_{m_i}$ and $T_{n_i}$ are the tile sizes chosen for matrix C. $T_{workgroup}$ is the number of threads in a workgroup. However, the above formulation only considers TLP at the level of the workgroup. Indeed, during the computation of the GEMM, the workgroup needs to be further transformed into wavefronts and run on the CU in the form of wavefronts. The execution process of batch GEMM can be divided into four phases: segmentation, workgroup, wavefront, and execution. In the segmentation phase, the GEMM is tiled into tiles with various sizes,

The conversion process mainly considers the following factors: the number of work-items in the workgroup, the size of VGPR, SGPR, and LDS required by a work-item, and the maximum number of wavefronts supported in the CU. These factors are related to the GPU hardware architecture. Next, we take AMD's MI210, which is based on the CDNA2.0 architecture, as an example. Under the limitation of the number of work-items in the workgroup, the number of wavefronts can be calculated as follows:

$$WF_{wg} = 16 \times \mathrm{ceil}\left(\frac{WI_{wg}}{64}\right) \tag{5}$$

where $WF_{wg}$ is the maximum number of wavefronts under the limit of the number of work-items in the workgroup, and $WI_{wg}$ represents the number of work-items in the workgroup. Eq. (5) indicates that when the number of work-items is less than or equal to 64, a workgroup contains only one wavefront, and the number of workgroups is limited to 16 in the CU.

Limited by the size of VGPR, SGPR, and LDS, the number of wavefronts can be calculated as follows:

$$WF_V = 4 \times \mathrm{floor}\left(\frac{VGPR_{max}}{VGPR_{used} \times 64}\right) \tag{6}$$

where $WF_V$ is the maximum number of wavefronts under the limit of the size of VGPR, $VGPR_{max}$ is the size of VGPR in the Single Instruction Multiple Data (SIMD) unit, and $VGPR_{used}$ is the VGPR size used by a
|
||||
|
||||
|
||||
|
||||
|
||||
work-item. In the CDNA2.0 hardware architecture, each CU consists of four SIMDs.

$$WF_S = \mathrm{floor}\left(\frac{SGPR_{max}}{SGPR_{used}}\right) \tag{7}$$

where $WF_S$ is the maximum number of wavefronts under the limit of the size of SGPR, $SGPR_{max}$ is the size of SGPR in the CU, and $SGPR_{used}$ is the size of SGPR used by a wavefront.

$$WF_L = \mathrm{floor}\left(\frac{LDS_{max}}{LDS_{used}}\right) \times \mathrm{ceil}\left(\frac{WI_{wg}}{64}\right) \tag{8}$$

where $WF_L$ is the maximum number of wavefronts under the limit of the size of LDS, $LDS_{max}$ is the size of LDS in the workgroup, $LDS_{used}$ is the size of LDS used by a workgroup, and $WI_{wg}$ has the same meaning as in Eq. (5).

To sum up, the number of wavefronts should meet the limitations of all the above factors, and the calculation method is as follows:

$$WF = \min(WF_{wg}, WF_V, WF_S, WF_L, WF_C) \tag{9}$$

where $WF$ is the number of activated wavefronts, and $WF_C$ is the maximum number of wavefronts supported in the CU.

The number of wavefronts and the corresponding number of threads are introduced into Eq. (4) to compute the TLP more accurately and appropriately. In contrast, Eq. (3) only considers the workload at the workgroup level and neglects the further conversion between workgroups and wavefronts at runtime. Eq. (3) is valid only if the following two conditions are satisfied. One is that all thread computations and data load amounts are consistent. The other is that the hardware resources required for activated wavefronts do not exceed the limit in the CU. Note that for GEMM with different precision, threads have different requirements for computing resources (VGPR, SGPR, LDS) during the computation process. Therefore, for matrices with different precision, the values of $VGPR_{used}$, $SGPR_{used}$, and $LDS_{used}$ in Eqs. (6)–(8) above are different. This will affect the number of activated wavefronts.

decreases. This fine-tuning approach ensures that the CU is not idle by increasing the utilization of hardware resources at the expense of intra-tile parallelism.

Algorithm 1 The Tiling algorithm.
 1: Initialize $TLP_{threshold}$, $TLP = 0$, $total\_workgroup = 0$, $total\_wavefront = 0$;
 2: for $i = 0$ to $B_{size} - 1$ do
 3:     Calculate $T_{m_i}$, $T_{n_i}$ according to equation (10);
 4:     $total\_workgroup$ += $(M_i * N_i)/(T_{m_i} * T_{n_i})$;
 5: end for
 6: $TLP_{new} = \varphi(total\_workgroup) \times T_{wavefront}$;
 7: $Tile[size]$ ranges from "large" to "small";
 8: while ($TLP_{new} >= TLP_{threshold}$) do
 9:     for $j = 0$ to $B_{size} - 1$ do
10:         if $Tile[j]$ is "large" then
11:             Set $Tile[j]$ to "medium-large";
12:         else if $Tile[j]$ is "medium-large" then
13:             Set $Tile[j]$ to "medium";
14:         else if $Tile[j]$ is "medium" then
15:             Set $Tile[j]$ to "small-medium";
16:         else if $Tile[j]$ is "small-medium" then
17:             Set $Tile[j]$ to "small";
18:         end if
19:         $total\_workgroup$ += $(M_j * N_j)/(T_{m_j} * T_{n_j})$;
20:     end for
21:     $TLP_{new} = \varphi(total\_workgroup) \times T_{wavefront}$;
22: end while

$TLP_{threshold}$ is used as a threshold to ensure parallelism among multiple tiles in the fine-tuning phase. Note that $TLP_{threshold}$ has an important influence on the selection of the tiling scheme for different hardware architectures. As a measure, the TLP values of the batch GEMM vary according to the different tiling schemes. The setting of the $TLP_{threshold}$ value is related to the architecture of the GPU because it uses the number of wavefronts and the number of threads in the wavefront to measure the parallelism of the tiling scheme. The hardware resources and the maximum number of wavefronts supported by each CU are diverse, so a corresponding $TLP_{threshold}$ should be set for different GPU architectures.

4.2.2. Tiling fine-tuning

For batch GEMM, an initial tiling scheme is first assigned to solve the problem of switching between contexts and low hardware resource utilization caused by the matrix's variable scale. Then, the tiling scheme
|
||||
is adjusted according to the TLP estimation of batch GEMM and the The specific process of selecting a tiling scheme for batch GEMM
|
||||
hardware architecture of GPU, and finally, the best tiling scheme is ob- is given in Algorithm 1: (1) when batch GEMM is given, an ‘‘initial
|
||||
tained. In the first stage, the tile size chosen by each GEMM according scheme’’ is obtained according to Eq. (10). (2) The TLP of this scheme
|
||||
to the dimensions of the matrix should meet the following conditions: is calculated according to the given batch GEMM and tiling scheme.
|
||||
⎧𝑇𝑚𝑖 ≤ 𝑀𝑖 and 𝑀𝑖 𝑚𝑜𝑑 𝑇𝑚𝑖 = 0 (3) Compare the TLP of the current tiling scheme with the 𝑇 𝐿𝑃𝑡ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑 .
|
||||
⎪
|
||||
⎨𝑇𝑛𝑖 ≤ 𝑁𝑖 and 𝑁𝑖 𝑚𝑜𝑑 𝑇𝑛𝑖 = 0 (10) If the TLP is not reached, the fine-tuning operation will be performed,
|
||||
⎪ and the current tiling scheme will be changed and then returned to
|
||||
⎩𝑇𝑘𝑖 ≤ 𝐾𝑖 and 𝐾𝑖 𝑚𝑜𝑑 𝑇𝑘𝑖 = 0
|
||||
step (2). If the current TLP is greater than or equal to the threshold,
|
||||
where 𝑇𝑚𝑖 and 𝑇𝑛𝑖 represent the size of the tile dimension corresponding go to step (4). (4) The batch GEMM is calculated according to the final
|
||||
to the tiling scheme, and 𝑇𝑘𝑖 is the sub-tile size along the dimension of tiling scheme. In the above procedures, the TLP is used as an evaluation
|
||||
𝐾. There are two issues. (1) After the first phase, batch GEMM is only
|
||||
criterion to measure the effectiveness of the tiling scheme on the batch
|
||||
an ‘‘initial scheme’’ that cannot achieve optimal parallel computing
|
||||
GEMM. If the threshold is not reached, fine-tuning is used to adjust and
|
||||
efficiency. (2) Due to the variability of matrix size in batch GEMM, one
|
||||
improve the utilization of GPU hardware resources. The optimal tiling
|
||||
or several items of 𝐵𝑠𝑖𝑧𝑒 , 𝑀, 𝑁, and 𝐾 values may be particularly small
|
||||
scheme can be obtained to ensure an optimal implementation at the
|
||||
in batch GEMM, which is called an extreme GEMM case. In this case,
|
||||
GEMM and workgroup level. After the final tiling scheme, the multi-
|
||||
the ‘‘initial scheme’’ cannot get enough tiles, which will make some CU
|
||||
thread kernel is calculated based on the tile size so that the wavefront
|
||||
in an idle state, resulting in a waste of GPU computing power.
|
||||
and work-item levels can achieve a ‘‘workload balance’’ state.
|
||||
To solve these problems, the ‘‘initial scheme’’ is adjusted reasonably
|
||||
and efficiently in the second stage. For the larger-size matrix, smaller The proposed method is based on the GPU platforms of AMD and
|
||||
tiles are used to segment, and the number of tiles is increased by NVIDIA for implementation. The hardware characteristics of the GPU
|
||||
reducing the tile size to avoid CU being idle. The details are as follows: platform can also significantly impact GEMM performance. For exam-
|
||||
for a GEMM, given an appropriate ‘‘initial scheme’’, to avoid the waste ple, in AMD and NVIDIA platforms, threads are based on wavefront
|
||||
of GPU hardware resources, some larger GEMMs are cut with smaller and warp as the basic execution units containing 64 and 32 threads,
|
||||
tiles to ensure that the number of tiles is sufficient. For example, for respectively. The number of threads in the kernel needs to be an integer
|
||||
tiles whose initial value is ‘‘64 * 64’’, tiles with ‘‘32 * 32’’ are used for multiple of the number of threads in wavefront and warp to improve
|
||||
segmentation. As a result, the number of tiles increases as the tile size kernel occupancy. Meanwhile, the size of registers and shared memory
|
||||
|
||||
|
||||
Y. Zhang et al. Journal of Systems Architecture 160 (2025) 103341
|
||||
|
||||
can affect parameter settings during implementation based on different hardware architectures. Based on this difference, the proposed method considers parallelism at the wavefront or warp level when performing matrix segmentation on two GPU platforms. In this way, the proposed method can flexibly select tiling schemes based on the hardware characteristics of the GPU to achieve optimal performance. It can also avoid exceeding the maximum register limit and prevent data overflow, which improves its applicability for various hardware architectures.

5. Evaluation

5.1. Setup

Experiment platform and matrix generation. The overall configuration of the experimental platform and the details of the two GPUs are shown in Tables 4 and 5, respectively. To ensure the irregularity and variability of the input matrix, the GEMM size parameters $M$, $N$, and $K$ are randomly generated within corresponding ranges ($[Min, Max\_M(N)]$ and $[Min, Max\_K]$). $Max\_M$, $Max\_N$, and $Max\_K$ represent the upper bounds of $M$, $N$, and $K$, respectively. The lower bound for each experiment is denoted uniformly by $Min$. In this paper, the value of $Min$ is set to 16. For example, Max_M(N) = 512 and Max_K = 128 indicate that the range of matrix dimensions is $M \in [16, 512]$, $N \in [16, 512]$ and $K \in [16, 128]$. Thus, multiple sets of matrix dimension ranges can be obtained, and the parameters needed for GEMM generation are chosen from the different value ranges by random selection.

Table 4
The configuration of platforms for evaluation.
Platform setup   AMD-platform    NVIDIA-platform
CPU              EPYC 7763       Platinum 8358
GPU              MI210           A800
OS               Ubuntu 20.04    Ubuntu 20.04
ROCm/CUDA        ROCm 5.6        CUDA 12.0

Table 5
The configuration of GPUs for evaluation.
Name           MI210                          A800
Architecture   CDNA 2.0                       Ampere
Core           1700 MHz                       1410 MHz
Caches         L1 16 KB (per CU), L2 16 MB    L1 192 KB (per SM), L2 40 MB
Memory         64 GB 3.2 Gbps HBM2            80 GB 2.4 Gbps HBM2
Bandwidth      1.6 TB/s                       2.04 TB/s

Comparison method. First, for the two GPU experimental platforms, the default GEMM processing methods rocBLAS [6] and cuBLAS [7] provided by the respective GPU manufacturers are chosen as the basic comparison methods to demonstrate the effectiveness of the proposed method. Since these methods do not support batch invocation, in this paper, rocBLAS and cuBLAS compute batch GEMM in a loop. No stream operations are used during the computation. Meanwhile, we also compare with CUTLASS [23], which supports batch GEMM based on sorting and built-in tiles. We then compare with MAGMA [8], supported by the University of Tennessee ICL Lab, which only extends $grid.z$ to support batch GEMM but does not have a fine-grained optimization strategy. The MAGMA comparison experiments were run on two GPU platforms. Meanwhile, to show the advancement of our proposed method, we compare with the state-of-the-art methods Wang [36] and Li [21] on their respective platforms. All of the above methods perform a warm-up operation to eliminate the effect of the first kernel launch.

Evaluation criteria. In the following experiments, there are 12 sets of GEMM dimension ranges. The experiments with batch sizes 8, 16, 32, 64, 128, and 256 were run continuously for ten epochs under each set of value ranges. The experimental results are represented by the average value of GFLOPS (Giga Floating-point Operations Per Second), which is calculated as:

$$GFLOPS = \frac{\sum_{i=0}^{n-1} 2(M_i \times N_i \times K_i)}{total\_time \times 1.0e9} \tag{11}$$

where $M_i$, $N_i$ and $K_i$ represent the matrix dimensions of the $i$th GEMM, $total\_time$ represents the running time on this GPU, and $n$ represents the batch size. For simplicity, the experimental data is represented as single-precision floating-point data and the storage format is based on the row-first format. The experimental results are averaged over 10 consecutive runs. The final experimental results were rounded to two decimal places.

5.2. Speed up

On the two platforms, we first compare with the default methods rocBLAS and cuBLAS. These two methods do not support batch irregular GEMMs, so we convert batch GEMMs into multiple single GEMMs and compute the results. The specific experimental results are shown in Figs. 4–5. Figs. 4–5 show that the proposed method achieves 5.09× and 7.18× average speedup compared to rocBLAS and cuBLAS. This result is primarily due to the fact that these libraries do not support GEMMs of different scales when computing batch GEMMs, so they can only compute one GEMM at a time. When faced with a small matrix, the computational resources of the GPU cannot be fully utilized due to the cost of context switching between multiple GEMMs. As the batch size gradually increases, the advantage of the proposed method becomes more evident. This shows that for batch and irregular GEMMs, rocBLAS and cuBLAS are at a disadvantage in terms of computational efficiency and switching between instances. Meanwhile, we also compare with CUTLASS, which handles batch GEMM using sorting to solve the problem of significant workload differences between multiple matrix multiplications. Fig. 5 shows that the proposed method has a 4.64× speedup, which is because CUTLASS's built-in tiles are unsuitable when the matrix dimensions are small. Therefore, the proposed method achieves better acceleration than CUTLASS for batch, irregular, and small-size matrix multiplication. We then perform a detailed comparison and analysis of the experimental performance based on MAGMA. The proposed method has 4.37× and 3.36× speed improvement compared to MAGMA. Figs. 4–5 show that the advantage of our method becomes more pronounced as the batch size increases. This is because MAGMA only uses the largest GEMM size in the batch GEMM to set grid.x. Due to the irregularity of the matrix size, a large number of computational resources in the grid will be idle. The proposed method, in this case, employs fine-grained filtering operations to ensure further efficient utilization of computational resources, which is more evident when the difference between matrix dimensions is significant.

As shown in Fig. 4, the proposed method achieves an average 1.88× speedup compared to Wang. The advantage of the proposed method is more pronounced when $Max\_K$ and $Max\_M$ are small. For example, in the case of ($Max\_M(N) = 128$, $Max\_K = 128$), the average speedup can reach 1.95×. This is mainly due to the fact that when the dimension of the matrix is small, there are not enough tiles to cover the time consumption of data loading in the wavefront, which is more pronounced in workgroups with heavy loads. The proposed method adjusts the wavefront workload corresponding to the tiles through a multi-thread kernel and ensures consistent computation and data loading by different workgroups. At the same time, it also shows that balancing load and computation between wavefronts is more conducive to improving the efficiency of GPU parallel computing. On the NVIDIA platform, Fig. 5 shows that the proposed method has an average 1.94× speedup compared to Li. The advantage of the proposed method becomes clearer as the batch size increases. There are two reasons for this speedup:
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 4. The comparative results on MI210. (5.09×, 4.37×, 1.88× speedup over rocBLAS, MAGMA, Wang).
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 5. The comparative results on A800. (7.18×, 4.64×, 3.63×, 1.94× speedup over cuBLAS, CUTLASS, MAGMA, Li).
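The results in Figs. 4–5 are reported in terms of the GFLOPS metric of Eq. (11). The following is a minimal Python sketch of that metric only. The function name `batch_gflops`, the example batch, and the timing value are hypothetical and chosen purely for illustration; they are not measurements from the paper.

```python
from typing import Iterable, Tuple


def batch_gflops(shapes: Iterable[Tuple[int, int, int]], total_time_s: float) -> float:
    """Eq. (11): sum of 2*M*N*K over the batch, divided by (total_time * 1.0e9)."""
    if total_time_s <= 0:
        raise ValueError("total_time_s must be positive")
    flops = sum(2 * m * n * k for m, n, k in shapes)
    return flops / (total_time_s * 1.0e9)


# Hypothetical irregular batch (M, N, K) and a hypothetical 1 ms total runtime.
example_shapes = [(512, 512, 128), (16, 16, 16), (128, 256, 64)]
example_gflops = batch_gflops(example_shapes, total_time_s=1.0e-3)
print(round(example_gflops, 2))
```

Because the numerator sums over all GEMMs in the batch while the denominator is the total time, a speedup between two methods on the same batch reduces to the ratio of their total times.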
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 6. The kernel occupancy on two GPU platforms.
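As a rough illustration of how the quantities behind Fig. 6 relate, the sketch below evaluates the LDS limit of Eq. (8), the combined limit of Eq. (9), and the occupancy ratio of Eq. (12). It rests on several assumptions: the function names are hypothetical, all resource numbers are made up rather than taken from MI210 or A800 specifications, and feeding the result of Eq. (9) into Eq. (12) as the number of active wavefronts is an assumption made here for illustration (the paper obtains occupancy from profiling tools).

```python
import math


def lds_wavefront_limit(lds_max: int, lds_used: int, wi_wg: int,
                        wavefront_size: int = 64) -> int:
    """Eq. (8): floor(LDS_max / LDS_used) * ceil(WI_wg / wavefront_size)."""
    return (lds_max // lds_used) * math.ceil(wi_wg / wavefront_size)


def active_wavefronts(wf_wg: int, wf_v: int, wf_s: int, wf_l: int, wf_c: int) -> int:
    """Eq. (9): the tightest of all per-resource wavefront limits."""
    return min(wf_wg, wf_v, wf_s, wf_l, wf_c)


def kernel_occupancy(num_active: int, num_total: int) -> float:
    """Eq. (12): active wavefronts (or warps) over the theoretical maximum."""
    return num_active / num_total


# Hypothetical resource figures, for illustration only.
wf_l = lds_wavefront_limit(lds_max=65536, lds_used=16384, wi_wg=256)
wf = active_wavefronts(wf_wg=20, wf_v=12, wf_s=24, wf_l=wf_l, wf_c=32)
occupancy = kernel_occupancy(num_active=wf, num_total=32)
print(wf_l, wf, occupancy)
```

In this made-up configuration the VGPR limit is the binding one, so reducing register use per wavefront would be the only way to raise occupancy.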
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 7. The time overhead of tiling algorithm.
|
||||
|
||||
|
||||
(1) Li et al. used batching to balance the workload among different blocks but did not consider the difference between the workload of threads in different tiles. (2) When selecting the tiling scheme, the TLP is calculated only by considering the block, and the fine-grained warp level is neglected, which leads to an inaccurate calculation of TLP. The proposed method adjusts the wavefront workload corresponding to the tiles through a multi-thread kernel and ensures consistent computation and data loading by different workgroups. At the same time, it also shows that balancing load and computation between wavefronts is more conducive to improving the efficiency of GPU parallel computing.

5.3. Kernel occupancy

To explore the difference between the proposed method and the comparison methods in terms of GPU resource utilization, we present the kernel occupancy of the various methods on two GPU platforms. The formula for kernel occupancy can be expressed as:

$$\text{kernel occupancy} = \frac{Num\_actived}{Num\_total} \tag{12}$$

To obtain more accurate performance metrics, we utilize Omniperf (https://github.com/ROCm/omniperf) and Nsight (https://docs.nvidia.com/nsight-compute/NsightCompute/index.html) commands, profiling tools provided by AMD and NVIDIA, to evaluate the resource utilization of the kernel during the execution process. The kernel occupancy has distinct interpretations owing to the distinctions in GPU architecture between AMD MI210 and NVIDIA A800. On the AMD platform, $Num\_actived$ is the number of activated wavefronts and $Num\_total$ is the theoretical number of wavefronts that a CU can execute simultaneously. On the NVIDIA platform, $Num\_actived$ and $Num\_total$ represent the number of active warps and the number of warps that can theoretically run in parallel, respectively.

The results of the experiment are shown in Fig. 6. Compared with rocBLAS and cuBLAS, the proposed method has a clear advantage in the case of batch GEMM. The proposed method is also in the best position compared to the other methods (CUTLASS, MAGMA, Wang, Li), showing high efficiency in terms of utilization of GPU resources. As shown in Fig. 6, the proposed method consistently maintains the highest kernel occupancy on both GPU platforms, which indicates that the proposed method can better exploit the computing power of the GPU.

5.4. The overhead of tiling algorithm

This section presents the proportion of the runtime that is taken up by the tiling algorithm when executing the proposed method on two different GPU platforms with various batch sizes. The experimental results are presented in Fig. 7. From Fig. 7, it is evident that the tiling algorithm's runtime percentage decreases as the batch size increases. When the batch size is 8, the runtime share of the tiling algorithm on the two GPU platforms is 6.06% and 6.37%, respectively. As the batch size increases, more and more GEMMs are executed on the GPU, and the execution time of these GEMMs on the GPU side takes up most of the time, resulting in a smaller runtime portion for the tiling algorithm. For example, with a batch size of 1024, the tiling algorithm takes less than 1% of the runtime. The experimental results on two GPUs indicate that the time overhead of the tiling algorithm in the batch GEMM execution process is negligible, especially when the batch size is large. In real-world scenarios such as deep learning, where a large number of
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 8. The performance improvement of the proposed TLP on MI210. (1.077× average speedup).
|
||||
|
||||
|
||||
GEMM operations are often required, the tiling algorithm will have less overhead in the execution process.

5.5. The performance benefits of the proposed TLP

This section presents the comparative experimental results on two GPU platforms to provide a more detailed evaluation of the proposed TLP. The detailed experimental results are shown in Figs. 8–9. From Figs. 8–9, it is clear that the proposed TLP performs better overall than the traditional TLP. The proposed method has a speedup of 1.077× and 1.085× on MI210 and A800, respectively. From Fig. 8, the proposed method improves performance more when the batch size is larger. For example, on MI210, the proposed method has an average speedup of 1.04× when batch size <= 16. When batch size >= 32, the proposed method can improve performance by 1.10×. This gap arises because when the batch size and matrix dimensions are small, it is difficult to utilize hardware resources fully. When there are a large number of tiles, the proposed TLP can more accurately evaluate the thread's workload and select the optimal tiling scheme. The same performance trend is also reflected on the A800 platform. On A800, the proposed TLP has performance improvements of 1.04× and 1.11× when batch size <= 16 and batch size >= 32, respectively. The effectiveness of the proposed TLP can be further demonstrated through comparative experiment results on two GPU platforms.

5.6. The latency

This section compares kernel latency on two GPU platforms to provide a more detailed evaluation of the proposed method. We measured kernel latency with different batch sizes in the comparative experiment. The detailed experimental results are shown in Fig. 10. On MI210, the proposed method has a latency reduction of 3.87×, 4.53×, and 1.62× compared to rocBLAS, MAGMA, and Wang, respectively. The proposed method has the lowest latency on MI210, indicating higher computational efficiency. On A800, the proposed method showed performance improvements of 3.02×, 2.59×, 2.45×, and 1.89× compared to cuBLAS, MAGMA, CUTLASS, and Li, respectively. Fig. 10 shows that as the batch size gradually increases, the kernel latency increases on both GPU platforms. rocBLAS and cuBLAS have the highest latency as the batch size increases. This is because the traditional loop scheduling method significantly increases latency due to context switching between kernels when the batch size is large. From Fig. 10, it can be seen that some methods exhibit different latency behavior at various batch sizes. For example, when batch size <= 16, MAGMA has the highest latency on two GPU platforms. When the batch size is large, its computational performance improves, indicating that MAGMA performs better when there are many matrices. The experimental results on two platforms show that the proposed method has the lowest latency under various batch sizes, indicating better performance and broad applicability.

5.7. The improved performance on inception layers of CNN

Modern CNN model architectures often have multiple branches to capture features at different scales. Convolution operations of different scales in each branch can be represented as batch GEMM operations with various dimensions, e.g. GoogleNet [13], DenseNet [50], SqueezeNet [12], etc. To demonstrate the effectiveness of the proposed method in real-world scenarios, we use various Inception modules as a typical application to perform the forward computation process on two GPU platforms. The Inception module involves a large number of irregular, small-size GEMM operations. The deep learning frameworks
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 9. The performance improvement of the proposed TLP on A800. (1.085× average speedup).
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 10. The latency performance of the kernel on two GPU platforms.
|
||||
|
||||
|
||||
MIOpen (https://github.com/ROCm/MIOpen) and cuDNN (https://github.com/NVIDIA/cudnn-frontend) are used as benchmark implementations on both GPU platforms. In this section, we select several commonly used Inception modules to evaluate the proposed method's speedup performance. The GEMM sizes in Inception modules are shown in Table 6. Fig. 11 shows the speedup performance of the proposed method in each Inception module. As shown in Fig. 11, the average speedups are 2.88× and 1.87× respectively. The gray boxes represent the average speedup ratios of the different Inception modules in Fig. 11. The experimental results suggest that the Inception 8–9 series has the highest average speedup ratio (3.68× and 2.66× respectively) among the Inception modules, because Inception 8–9 has more matrix shapes compared to the other Inception modules, and the dimensions of these matrices are smaller than the former two. Finally, the proposed method has been shown to significantly accelerate CNN models with various branch structures on two different GPU platforms, particularly in scenarios involving multiple branches, irregular shapes, and small dimensions.

6. Conclusion

In this paper, we propose a load-balanced batch GEMM acceleration method for the problem of low parallel computing efficiency and poor hardware resource utilization in batch, irregular, and variable matrix multiplication scenarios. The kernel occupancy and hardware resource utilization can be effectively improved by a multi-thread kernel design that balances the computational and data load in the work-item. A novel approach to TLP computation is devised, where the parallelism of
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 11. The speedup performance on Inception layers.
|
||||
|
||||
|
||||
Table 6
The size of GEMM in various Inception modules.
Inception module   GEMM size (M × N × K)
Inception-1        784 × 96 × 192, 784 × 64 × 192, 784 × 32 × 192, 784 × 16 × 192
Inception-2        784 × 64 × 192, 784 × 32 × 192, 784 × 128 × 192
Inception-3        196 × 192 × 192, 196 × 16 × 192, 196 × 96 × 192, 196 × 64 × 192
Inception-4        196 × 64 × 192, 196 × 24 × 192, 196 × 160 × 192
Inception-5        196 × 64 × 192, 196 × 128 × 192, 196 × 24 × 192
Inception-6        196 × 112 × 192, 196 × 144 × 192, 196 × 32 × 192, 196 × 64 × 192
Inception-7        196 × 256 × 192, 196 × 160 × 192, 196 × 128 × 192
Inception-8        49 × 160 × 192, 49 × 128 × 192, 49 × 256 × 192, 49 × 160 × 192, 49 × 32 × 192
Inception-9        49 × 192 × 192, 49 × 128 × 192, 49 × 384 × 192, 49 × 192 × 192, 49 × 48 × 192
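The Inception-8 sizes in Table 6 can be used to illustrate the fine-tuning loop of Algorithm 1. The sketch below is only an approximation under stated assumptions, not the authors' implementation: the tile sizes attached to "large" through "small", the function $\varphi$ (modeled here as a fixed number of wavefronts per workgroup), the threshold value, and the choice to start every GEMM at "large" are all hypothetical. The workgroup count uses ceiling division instead of enforcing the divisibility condition of Eq. (10), and the loop shrinks tiles while the TLP is below the threshold, following the prose description of steps (3)–(4).

```python
import math
from typing import List, Tuple

# Hypothetical tile ladder; the paper names the levels but the sizes here are assumed.
TILE_LADDER = [("large", 64), ("medium-large", 48), ("medium", 32),
               ("small-medium", 24), ("small", 16)]


def estimate_tlp(shapes: List[Tuple[int, int, int]], levels: List[int],
                 wf_per_wg: int = 4, threads_per_wf: int = 64) -> int:
    """TLP = phi(total_workgroup) * T_wavefront, with phi assumed linear."""
    total_workgroup = 0
    for (m, n, _k), level in zip(shapes, levels):
        tile = TILE_LADDER[level][1]
        total_workgroup += math.ceil(m / tile) * math.ceil(n / tile)
    return total_workgroup * wf_per_wg * threads_per_wf


def fine_tune(shapes: List[Tuple[int, int, int]], tlp_threshold: int):
    """Step every GEMM one level down the ladder until the threshold is met."""
    last = len(TILE_LADDER) - 1
    levels = [0] * len(shapes)  # assumed initial scheme: all tiles "large"
    tlp = estimate_tlp(shapes, levels)
    while tlp < tlp_threshold and any(level < last for level in levels):
        levels = [min(level + 1, last) for level in levels]
        tlp = estimate_tlp(shapes, levels)
    return [TILE_LADDER[level][0] for level in levels], tlp


# Inception-8 GEMM sizes (M, N, K) from Table 6; the threshold is hypothetical.
inception_8 = [(49, 160, 192), (49, 128, 192), (49, 256, 192),
               (49, 160, 192), (49, 32, 192)]
tile_names, final_tlp = fine_tune(inception_8, tlp_threshold=10000)
print(tile_names, final_tlp)
```

With these assumed values the loop stops after two refinement passes, which mirrors the paper's observation that small matrices need smaller tiles before enough workgroups exist to keep the CUs busy.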
|
||||
|
||||
|
||||
|
||||
the tiling scheme is measured by the number of activated wavefronts. This approach allows the optimal tiling scheme to be selected based on different GPU architectures. Experiments are conducted on two GPU platforms to validate the effectiveness and progress of our proposed method.

Future work includes exploring batch GEMM performance at various precisions. With the development of Transformer-based models, many GEMM operations are involved in the training and inference process of Large Language Models (LLMs), which often use lower precision, such as FP16, FP8, etc. For example, quantized LLMs often involve GEMM operations where the weight matrices and activation values have different precisions, e.g. W4A16, W8A8. More complex precisions and storage formats pose challenges to the performance of GEMM operations.

CRediT authorship contribution statement

Yu Zhang: Writing – review & editing, Writing – original draft. Lu Lu: Writing – review & editing, Supervision. Zhanyu Yang: Writing – review & editing. Zhihong Liang: Supervision, Conceptualization. Siliang Suo: Supervision, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the Natural Science Foundation of Guangdong Province (2024A1515010204) and the Technological Research Project of Southern Power Grid Company (ZBKJXM20232483).

Data availability

No data was used for the research described in the article.

References

[1] P. Valero-Lara, I. Jorquera, F. Lui, J. Vetter, Mixed-precision S/DGEMM using the TF32 and TF64 frameworks on low-precision AI tensor cores, in: Proceedings of the SC'23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, 2023, pp. 179–186.
[2] H. Martínez, S. Catalán, A. Castelló, E.S. Quintana-Ortí, Parallel GEMM-based convolutions for deep learning on multicore ARM and RISC-V architectures, J. Syst. Archit. (2024) 103186.
[3] J. Fornt, P. Fontova-Musté, M. Caro, J. Abella, F. Moll, J. Altet, C. Studer, An energy-efficient gemm-based convolution accelerator with on-the-fly im2col, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 31 (11) (2023) 1874–1878.
[4] H. Kim, W.J. Song, Las: locality-aware scheduling for GEMM-accelerated convolutions in GPUs, IEEE Trans. Parallel Distrib. Syst. 34 (5) (2023) 1479–1494.
[5] W. Yang, J. Fang, D. Dong, X. Su, Z. Wang, Optimizing full-spectrum matrix multiplications on ARMv8 multi-core CPUs, IEEE Trans. Parallel Distrib. Syst. (2024).
[6] AMD, Next generation BLAS implementation for ROCm platform, 2024, https://github.com/ROCm/rocBLAS.
[7] B. Tuomanen, Hands-On GPU Programming with Python and CUDA: Explore High-Performance Parallel Computing with CUDA, Packt Publishing Ltd, 2018.
[8] ICL, Matrix algebra for GPU and multicore architectures, 2024, https://icl.utk.edu/magma/.
[9] T. Faingnaert, T. Besard, B. De Sutter, Flexible performant GEMM kernels on GPUs, IEEE Trans. Parallel Distrib. Syst. 33 (9) (2021) 2230–2248.
[10] W.S. Moses, I.R. Ivanov, J. Domke, T. Endo, J. Doerfert, O. Zinenko, High-performance gpu-to-cpu transpilation and optimization via high-level parallel constructs, in: Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023, pp. 119–134.
[11] H. Kim, H. Nam, W. Jung, J. Lee, Performance analysis of CNN frameworks for GPUs, in: 2017 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS, IEEE, 2017, pp. 55–64.
[12] F.N. Iandola, S. Han, M.W. Moskewicz, K. Ashraf, W.J. Dally, K. Keutzer, SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size, 2016, arXiv preprint arXiv:1602.07360.
[13] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, Going deeper with convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1–9.
[14] G. Pant, D. Yadav, A. Gaur, ResNeXt convolution neural network topology-based deep learning model for identification and classification of pediastrum, Algal Res. 48 (2020) 101932.
[15] S. Barrachina, M.F. Dolz, P. San Juan, E.S. Quintana-Ortí, Efficient and portable GEMM-based convolution operators for deep neural network training on multicore processors, J. Parallel Distrib. Comput. 167 (2022) 240–254.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
[16] S. Rajbhandari, Y. He, O. Ruwase, M. Carbin, T. Chilimbi, Optimizing cnns on [35] G. Alaejos, A. Castelló, H. Martínez, P. Alonso-Jordá, F.D. Igual, E.S. Quintana-
|
||||
multicores for scalability, performance and goodput, ACM SIGARCH Comput. Ortí, Micro-kernels for portable and efficient matrix multiplication in deep
|
||||
Archit. News 45 (1) (2017) 267–280. learning, J. Supercomput. 79 (7) (2023) 8124–8147.
|
||||
[17] C. Rivera, J. Chen, N. Xiong, S.L. Song, D. Tao, Ism2: Optimizing irregular-shaped [36] R. Wang, Z. Yang, H. Xu, L. Lu, A high-performance batched matrix multiplica-
|
||||
matrix-matrix multiplication on gpus, 2020, arXiv preprint arXiv:2002.03258. tion framework for gpus under unbalanced input distribution, J. Supercomput.
|
||||
[18] K. Matsumoto, N. Nakasato, S.G. Sedukhin, Performance tuning of matrix 78 (2) (2022) 1741–1758.
|
||||
multiplication in opencl on different gpus and CPUs, in: 2012 SC Companion: [37] Y. Zhang, Y. Wang, Z. Mo, Y. Zhou, T. Sun, G. Xu, C. Xing, L. Yang, Accelerating
|
||||
High Performance Computing, Networking Storage and Analysis, IEEE, 2012, pp. small matrix multiplications by adaptive batching strategy on GPU, in: 2022
|
||||
396–405. IEEE 24th Int Conf on High Performance Computing & Communications; 8th
|
||||
[19] G.E. Moon, H. Kwon, G. Jeong, P. Chatarasi, S. Rajamanickam, T. Krishna, Eval- Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int
|
||||
uating spatial accelerator architectures with tiled matrix-matrix multiplication, Conf on Dependability in Sensor, Cloud & Big Data Systems & Application,
|
||||
IEEE Trans. Parallel Distrib. Syst. 33 (4) (2021) 1002–1014. HPCC/DSS/SmartCity/DependSys, IEEE, 2022, pp. 882–887.
|
||||
[20] Q. Han, H. Yang, M. Dun, Z. Luan, L. Gan, G. Yang, D. Qian, Towards [38] A. Abdelfattah, S. Tomov, J. Dongarra, Matrix multiplication on batches of small
|
||||
efficient tile low-rank GEMM computation on sunway many-core processors, J. matrices in half and half-complex precisions, J. Parallel Distrib. Comput. 145
|
||||
Supercomput. 77 (5) (2021) 4533–4564. (2020) 188–201.
|
||||
[21] X. Li, Y. Liang, S. Yan, L. Jia, Y. Li, A coordinated tiling and batching [39] A. Abdelfattah, A. Haidar, S. Tomov, J. Dongarra, Novel HPC techniques to batch
|
||||
framework for efficient GEMM on GPUs, in: Proceedings of the 24th Symposium execution of many variable size BLAS computations on GPUs, in: Proceedings of
|
||||
on Principles and Practice of Parallel Programming, 2019, pp. 229–241. the International Conference on Supercomputing, 2017, pp. 1–10.
|
||||
[22] P. Tillet, D. Cox, Input-aware auto-tuning of compute-bound HPC kernels, in: [40] A. Abdelfattah, A. Haidar, S. Tomov, J. Dongarra, Performance, design, and
|
||||
Proceedings of the International Conference for High Performance Computing, autotuning of batched GEMM for GPUs, in: High Performance Computing: 31st
|
||||
Networking, Storage and Analysis, 2017, pp. 1–12. International Conference, ISC High Performance 2016, Frankfurt, Germany, June
|
||||
[23] NVIDIA, CUDA templates for linear algebra subroutines, 2024, https://github. 19-23, 2016, Proceedings, Springer, 2016, pp. 21–38.
|
||||
com/NVIDIA/cutlass. [41] A. Li, G.-J. van den Braak, H. Corporaal, A. Kumar, Fine-grained synchronizations
|
||||
[24] J. Huang, C.D. Yu, R.A.v.d. Geijn, Strassen’s algorithm reloaded on GPUs, ACM and dataflow programming on GPUs, in: Proceedings of the 29th ACM on
|
||||
Trans. Math. Softw. 46 (1) (2020) 1–22. International Conference on Supercomputing, 2015, pp. 109–118.
|
||||
[25] B. Boyer, J.-G. Dumas, C. Pernet, W. Zhou, Memory efficient scheduling of [42] J. Li, H. Ye, S. Tian, X. Li, J. Zhang, A fine-grained prefetching scheme
|
||||
strassen-winograd’s matrix multiplication algorithm, in: Proceedings of the 2009 for DGEMM kernels on GPU with auto-tuning compatibility, in: 2022 IEEE
|
||||
International Symposium on Symbolic and Algebraic Computation, 2009, pp. International Parallel and Distributed Processing Symposium, IPDPS, IEEE, 2022,
|
||||
55–62. pp. 863–874.
|
||||
[26] A. Fawzi, M. Balog, A. Huang, T. Hubert, B. Romera-Paredes, M. Barekatain, [43] Z. Yang, L. Lu, R. Wang, A batched GEMM optimization framework for deep
|
||||
A. Novikov, F.J. R Ruiz, J. Schrittwieser, G. Swirszcz, et al., Discovering faster learning, J. Supercomput. 78 (11) (2022) 13393–13408.
|
||||
matrix multiplication algorithms with reinforcement learning, Nature 610 (7930) [44] H. Mei, H. Qu, J. Sun, Y. Gao, H. Lin, G. Sun, GPU occupancy prediction of
|
||||
(2022) 47–53. deep learning models using graph neural network, in: 2023 IEEE International
|
||||
[27] G. Xiao, C. Yin, T. Zhou, X. Li, Y. Chen, K. Li, A survey of accelerating parallel Conference on Cluster Computing, CLUSTER, IEEE, 2023, pp. 318–329.
|
||||
sparse linear algebra, ACM Comput. Surv. 56 (1) (2023) 1–38. [45] I. Masliah, A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, J.
|
||||
[28] Y. Chen, G. Xiao, K. Li, F. Piccialli, A.Y. Zomaya, fgSpMSpV: A fine-grained Dongarra, Algorithms and optimization techniques for high-performance matrix-
|
||||
parallel SpMSpV framework on HPC platforms, ACM Trans. Parallel Comput. 9 matrix multiplications of very small matrices, Parallel Comput. 81 (2019)
|
||||
(2) (2022) 1–29. 1–21.
|
||||
[29] Y. Chen, G. Xiao, W. Yang, Optimizing partitioned CSR-based SpGEMM on the [46] G. Park, B. Park, M. Kim, S. Lee, J. Kim, B. Kwon, S.J. Kwon, B. Kim, Y. Lee,
|
||||
sunway TaihuLight, Neural Comput. Appl. 32 (10) (2020) 5571–5582. D. Lee, Lut-gemm: Quantized matrix multiplication based on luts for efficient
|
||||
[30] Y. Chen, K. Li, W. Yang, G. Xiao, X. Xie, T. Li, Performance-aware model for inference in large-scale generative language models, 2022, arXiv preprint arXiv:
|
||||
sparse matrix-matrix multiplication on the sunway taihulight supercomputer, 2206.09557.
|
||||
IEEE Trans. Parallel Distrib. Syst. 30 (4) (2018) 923–938. [47] B. Feng, Y. Wang, G. Chen, W. Zhang, Y. Xie, Y. Ding, EGEMM-TC: accelerating
|
||||
[31] G. Xiao, K. Li, Y. Chen, W. He, A.Y. Zomaya, T. Li, Caspmv: A customized scientific computing on tensor cores with extended precision, in: Proceedings
|
||||
and accelerative spmv framework for the sunway taihulight, IEEE Trans. Parallel of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel
|
||||
Distrib. Syst. 32 (1) (2019) 131–146. Programming, 2021, pp. 278–291.
|
||||
[32] G. Xiao, C. Yin, Y. Chen, M. Duan, K. Li, Efficient utilization of multi-threading [48] G. Shobaki, A. Kerbow, S. Mekhanoshin, Optimizing occupancy and ILP on the
|
||||
parallelism on heterogeneous systems for sparse tensor contraction, IEEE Trans. GPU using a combinatorial approach, in: Proceedings of the 18th ACM/IEEE
|
||||
Parallel Distrib. Syst. (2024). International Symposium on Code Generation and Optimization, 2020, pp.
|
||||
[33] D.E. Tanner, Tensile: Auto-tuning gemm gpu assembly for all problem sizes, 133–144.
|
||||
in: 2018 IEEE International Parallel and Distributed Processing Symposium [49] A.B. Hayes, L. Li, D. Chavarría-Miranda, S.L. Song, E.Z. Zhang, Orion: A
|
||||
Workshops, IPDPSW, IEEE, 2018, pp. 1066–1075. framework for gpu occupancy tuning, in: Proceedings of the 17th International
|
||||
[34] S. Wang, FlexGEMM: A flexible micro-kernel generation framework, in: Proceed- Middleware Conference, 2016, pp. 1–13.
|
||||
ings of the 5th International Conference on Computer Information and Big Data [50] G. Huang, S. Liu, L. Van der Maaten, K.Q. Weinberger, Condensenet: An
|
||||
Applications, 2024, pp. 164–170. efficient densenet using learned group convolutions, in: Proceedings of the IEEE
|
||||
Conference on Computer Vision and Pattern Recognition, 2018, pp. 2752–2761.
|
||||
|
||||
|
||||
|
||||
|
||||
14
|
||||
|
||||
@@ -0,0 +1,943 @@
|
||||
Computer Standards & Interfaces 97 (2026) 104122
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
A multi-criteria process for IT project success evaluation–Addressing a critical gap in standard practices

João Carlos Lourenço a, João Varajão b,*

a CEGIST, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal
b Centro ALGORITMI, Universidade do Minho, Campus de Azurém, 4804-533 Guimarães, Portugal
|
||||
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO

Keywords: Project success; Project evaluation; Multi-criteria evaluation; MACBETH; Process; Methodology

ABSTRACT

The evaluation of project success is widely recognised as valuable for improving IT (Information Technology) project performance and impact. However, many processes fail to adequately address the requirements for a sound evaluation due to their inherent complexity or by not complying with fundamental practical and theoretical concepts. This paper presents a process that combines a problem structuring method with a multi-criteria decision analysis approach to evaluate the success of IT projects. Put into practice in the context of a software development project developed for a leading global supplier of technology and services, it offers a new way of creating a model for evaluating project success and tackling uncertainty, bringing clarity and consistency to the overall assessment process. A strong advantage of this process is that it is theoretically sound and can be easily applied to other evaluation problems involving other criteria. It also serves as a call to action for the development of formal standards in evaluation processes. Practical pathways to achieve such standardization include collaboration through industry consortia, development and adoption of ISO frameworks, and embedding evaluation processes within established maturity models. These pathways can foster consistency, comparability, and continuous improvement across organizations, paving the way for more robust and transparent evaluation practices.
|
||||
|
||||
|
||||
|
||||
|
||||
1. Introduction

The sustainable success of virtually any organisation is strongly associated with the success of its projects [1]. A key factor for project success is that project managers clearly understand what success means [2], which is usually not the case [3]. Despite different notions about what constitutes “project success” and the many criteria that can be used for evaluation (e.g., cost, time, and performance, among others) [4], a project must satisfy its clients to be considered successful [5–8].

Given the importance and complexity of the evaluation of projects, companies should define and implement systematic processes for evaluating success to improve project management performance and the impact of deliverables [9]. However, despite the models and techniques that are currently available for assessing project success, they are typically challenging to implement for a variety of reasons, notably the complexity caused by using multiple and often conflicting objectives (e.g., minimise cost and maximise quality), the scarcity of empirical studies reporting their genuine use in projects [10], and the fact that practices employed in companies are generally informal and simplistic [11].

Additionally, several errors identified by decision analysis literature [12,13] are often made, generating meaningless project success evaluations [14]. Some common mistakes involve not including relevant criteria in the evaluation model, not distinguishing the performance of a project from its value, assigning weights to evaluation criteria without considering the ranges of variation of their performance scales, and making calculations that violate measurement scales’ properties. In other words, such evaluations are inconsistent with multi-attribute value theory (MAVT) and value measurement foundations.

Considering these limitations, this research proposes a process that combines a problem structuring method with a multi-criteria approach for evaluating the success of information technology (IT) projects supported by a real-world case. This process was developed and applied in the context of a project of GlobalSysMakers (for confidentiality reasons, the name of the company herein is anonymized), a leading global supplier of technology and services.

In the GlobalSysMakers project, the need for a new process arose because the project management team felt that the scoring model initially defined for success assessment, while helpful, lacked accuracy.
|
||||
|
||||
|
||||
* Corresponding author.
|
||||
E-mail address: varajao@dsi.uminho.pt (J. Varajão).
|
||||
|
||||
https://doi.org/10.1016/j.csi.2025.104122
|
||||
Received 12 August 2025; Received in revised form 7 November 2025; Accepted 23 December 2025
|
||||
Available online 24 December 2025
|
||||
0920-5489/© 2025 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
|
||||
|
||||
|
||||
|
||||
Following an appraisal of several methodological alternatives, a new multi-criteria approach combined with a problem structuring method was shown to be the best solution, providing the required precision and transparency to the process, along with a better understanding of the real meaning of the relative importance of each evaluation criterion.

This paper describes the process developed in detail so that it can be replicated in other projects. Also, the results are presented and discussed, including contributions to theory and practice.

The proposed process, which combines a problem structuring method with a multi-criteria approach for evaluating IT project success, offers several theoretical implications. First, it advances the conceptualization of project success by integrating both subjective stakeholder perspectives and objective performance criteria, addressing the multidimensional and context-dependent nature of success in IT projects. Second, it contributes to decision theory and project management literature by demonstrating how problem structuring methods (typically underutilized in IT evaluation) can enhance the clarity and relevance of criteria selection and prioritization. Third, the integration of these methodologies provides a foundation for developing more robust, transparent, and adaptable evaluation frameworks, which can inform future theoretical models and empirical studies. Ultimately, this research supports the movement toward standardization by offering a replicable and theoretically grounded process that can be refined and generalized across different organizational and project contexts.

The remainder of this paper is organised as follows. Section 2 briefly reviews previous related work on project evaluation methods, cases, and multi-criteria evaluation methods. Section 3 describes the case context and the development of the success evaluation model using a process that combines a problem structuring method with a multi-criteria decision analysis approach. Section 4 discusses the results obtained. Finally, Section 5 presents the conclusions and avenues for further work.

2. Previous related work

2.1. Success of projects

Evaluation can be defined as the assessment and analysis of the efficiency and effectiveness of the project’s activities and results. The evaluation looks at what was planned, what has been achieved, and how it has been achieved [15]. Kahan and Goodstadt [16] conceive evaluation as a set of questions and methods properly articulated to review processes, activities, and strategies to achieve better results. Therefore, the purpose of an evaluation is not just to find out what happened but to use that information to make the project better [17,18].

There are several evaluation approaches in the literature, some considerably complex regarding their practical operationalisation and use. Varajão et al. [10] present a comprehensive review of models and methods for evaluating information systems project success. Some examples are described and analysed next.

Bannerman and Thorogood [19] propose a framework for defining IT project success that provides a common language for communication and compares what stakeholders perceive as important. The authors list the criteria that should be used to assess the success of a project within five domains (process, project management, product, business, and strategy). However, they do not explain how to consider these domains and criteria together.

Barclay and Osei-Bryson [20] describe a structured framework named Project Objectives Measurement Model (POMM) to identify the criteria for evaluating an information system (IS) project and assigning a performance measure to each criterion. POMM applies value-focused thinking principles [21] and goal question metric methods [22]. An illustrative case is presented in which the importance of each criterion is directly assessed using an average of the stakeholders’ answers based on a 5-point Likert scale. However, despite its virtues, this operation is neither quantitatively nor substantively meaningful [23], respectively, because a Likert scale is an ordinal scale [24,25] and averaging the weights of several stakeholders without a discussion obliterates their individual differences [26]. Additionally, the “importance of the criteria” should consider their respective performance ranges; otherwise, the resulting weights would be arbitrary [27].

Basar [28] proposes a methodology to evaluate the performance of IT projects in a fuzzy environment. She first identifies the evaluation criteria using the balanced scorecard method. Second, she determines the criteria weights with expert judgments and hesitant fuzzy weights. Then, the weights are used to evaluate the performance of IT projects in a Turkish company. The weighting process described in that paper is difficult for a non-expert evaluator to understand. Additionally, the quantitative performances of projects on the criteria are systematically normalised to scores between 0 and 1 with a linear transformation that may not correspond to the preferences of evaluators (which may be non-linear). The paper does not explain how to address the evaluation of the qualitative criteria.

Ismail [29] applies the Delphi method and conducts a seminar with experts to identify a construction project’s potential evaluation criteria and group them into clusters. A relative importance index is calculated for each criterion with a weighted average of the responses to a survey expressed on a Likert scale. In a subsequent step, the experts 1) reduced the number of clusters and criteria and 2) assigned the same weight to the latter. Then, a priority index was calculated for each criterion with the Priority Evaluation Model (PEM) [30], which combines the “satisfaction” rate (assigned by the experts) and the “importance” of the criterion. The overall project success is obtained with a weighted sum of the averages of the priority indexes obtained on each cluster and the clusters’ weights. However, the paper does not explain how these weights were assessed. Additionally, the Likert scale classifications cannot be used for calculating averages or other arithmetic calculations.

Nguvulu et al. [31] use a Deep Belief Network (DBN) to evaluate eight IT projects’ performances after training the DBN with five projects of 12 months’ duration. The DBN automatically assigned weights and scores to the criteria, considering possible interactions between them. The authors stress the advantage of this approach by not considering human subjectivity. However, from our point of view, this is a weakness because the subjective preferences of project managers, clients, and other stakeholders should be considered in an evaluation process to avoid arbitrary results generated by inadequate analytical approaches.

Wohlin and Andrews [32] apply principal component analysis and subjective evaluation factors to estimate which projects are successful or unsuccessful out of a set of projects. This statistical approach may be used to identify key project characteristics, but it does not allow for evaluating the project’s success according to stakeholders’ preferences.

Yan [33] suggests the combined use of the balanced scorecard (BSC) [34], the Analytic Hierarchy Process (AHP), and the Fuzzy Comprehensive Analysis method (FCA), respectively, to construct a performance criteria system, assess the criteria weights, and obtain an overall evaluation score. The author explains how to obtain the performance criteria system, but does not explain the weighting and scoring components.

Yang et al. [35] apply a multi-criteria model for evaluating a software development project’s success using the Analytical Network Process (ANP) [36] to assess the criteria weights at several hierarchical levels. The scores of a project on a given criterion were obtained by calculating the average of the scores assigned by five experts using a 5-point Likert scale. Note that, as mentioned above, averages should not be calculated with ordinal scales. In addition, ANP is based on AHP, a method with known issues that affect the validity of the criteria weights (see, e.g., [37–39]).

Section 2.2 reviews important concepts and methods related to multi-criteria evaluation that are needed to create a proper value measurement model [40,41] to assess the success of a project.

2.2. Multi-criteria evaluation

In a multi-criteria value model, the measure of success of a project is
|
||||
|
||||
|
||||
|
||||
|
||||
given by the additive value function model:

$$V(x_1, x_2, \ldots, x_n) = \sum_{j=1}^{n} w_j\, v_j(x_j), \quad \text{with } \sum_{j=1}^{n} w_j = 1 \text{ and } w_j > 0,\ \forall j \qquad (1)$$

where $V$ is the overall value score of the success of the project, $w_j$ is the weight of criterion $j$, $v_j(x_j)$ is the value score on criterion $j$ of the performance $x_j$, and $n$ represents the number of evaluation criteria.

Despite being straightforward in form, this model is often poorly applied. We highlight that the criteria weights $w_j$ are scaling constants [42], which represent trade-offs between criteria and not the erroneous notion of criteria’s measures of importance [21]. In addition, $v_j$ is a measurable value function, which represents both a preference order between performances on criterion $j$ and a strength-of-preference order on differences of performances [43]. Moreover, the model requires the criteria to be mutually preferentially independent [44], which entails special care during the model structuring phase.

There are some fundamental aspects to note regarding the desired properties for each evaluation criterion and also for the whole set of criteria [45]. Each criterion should be essential for the evaluation and controllable in the sense that the performance of the project influences the degree to which the criterion is satisfied, independently of other additional decisions. Also, a family of evaluation criteria should be: complete (the set of criteria should represent all of the relevant consequences of the project); nonredundant (the criteria should not repeat the same concerns); concise (the number of criteria should be kept to the necessary minimum to evaluate the project); specific (each criterion should be able to assess the consequences of the project, instead of being so broad that it compromises this purpose); and understandable (the evaluation criteria should be clear in the eyes of any interested individual).

Depending on the ability to use appropriate numerical principles and fluency to express oneself in words, an evaluator may prefer to apply a numerical method or a non-numerical one [46]. In light of this, the remainder of this section focuses on quantitative and qualitative techniques tailored for these two types of evaluators. Specifically, we delve into methods for criteria weighting and building a value scale for each criterion.

2.2.1. Weighting methods

A theoretically sound weighting method must consider the performance ranges defined by two fixed references on each criterion. Common references are, for example, the “worst” and the “best” performances [39] or “neutral” and “good” performances [47]. Below, we briefly describe two quantitative weighting procedures and one qualitative.

Keeney and Raiffa [48] developed the trade-off procedure, which is a numerical method that requires establishing indifferences between two fictitious projects using two criteria at a time. After establishing $n - 1$ indifference relationships for the $n$ criteria, a system of equations is solved, including one equation in which the sum of the weights equals 1, to obtain the criteria weights.

Edwards and Barron [49] created the swing weighting method, which is a numerical method that involves measuring the relative importance of the improvements (swings) that can be achieved on the criteria, considering a change from the “worst” to the “best” performance on each of them.

Bana e Costa and Vansnick [50] developed MACBETH [51] to weight the criteria. This procedure requires ranking the worst–best swings and judging them using the qualitative scale of difference in attractiveness: no (difference), very weak, weak, moderate, strong, very strong, or extreme. This qualitative scale is also used to judge the difference in attractiveness between two swings at a time. The elicited judgments are used to fill in the upper triangular part of a matrix in the software tool M-MACBETH, which validates each judgment’s consistency with those previously inputted (see [52], pp. 425–443). Then, the software tool generates a proposal of weights compatible with the inputted qualitative judgments by solving the linear programming problem described in Bana e Costa et al. [52]. The evaluators should validate the proposed weighting scale and adjust it if needed.

2.2.2. Methods to build value scales

We must assign fixed scores to the previously defined references to build a criterion value scale. For example, we may assign 100 and 0 value units to the “best” and the “worst” performances in each criterion, respectively, although two other scores could be used so that the highest score is assigned to the most preferred reference. However, this arbitrary assignment of scores leads to interval value scales [25]. Additionally, the score of a project on a given criterion should consider the preferences expressed by the evaluators upon performance ranges within the criterion [43] (e.g., the difference in value between performances A and B is worth twice the difference between C and D). Hereinafter, we present two numerical scoring methods and a qualitative one.

Edwards [53] presents the direct rating method. This numerical procedure first requires evaluators to rank the project performances in order of decreasing attractiveness. The highest score (100 units) is assigned to the “best” performance and the lowest score (0 units) to the “worst”. Intermediate scores are assigned to other performance levels considering the intensities of preferences between each two of them, knowing that the difference between the “best” and “worst” is worth 100 value units. This method allows scoring a project directly or indirectly using a performance measure (e.g., quantitative continuous, quantitative discrete, or qualitative).

von Winterfeldt and Edwards [54] describe the bisection method, also known as the mid-value splitting technique [55], to create a value scale for a criterion. This numerical method assigns the highest score to the “best” performance (100) on the criterion and the lowest score (zero) to the “worst”. Then, it is asked which performance p has a value equally distant from the “best” and the “worst” performances, which means that the ranges “p–to–best” and “p–to–worst” have the same strength-of-preference. Therefore, the performance p would get a midpoint score of 50. Similar midpoint questions are asked to identify other points that can be used to form a piecewise linear value function or a curve. This method allows the creation of value functions upon a quantitative and continuous performance measure on the criterion.

Bana e Costa and Vansnick [50] developed MACBETH [51] to create a value scale for a criterion (and to weight criteria, as described in the preceding section). Still, contrary to the above-mentioned methods, it needs only to elicit qualitative judgments. An evaluator judges the difference in attractiveness between two performances at a time, using the qualitative scale presented in the previous section, and inputs the judgments into the software tool M-MACBETH. This tool verifies the consistency of the inputted judgments and generates a proposal of a value scale compatible with them and with the scores assigned to the reference performances “best” and “worst” (or “good” and “neutral”) [52]. In the final step, the evaluator must validate and adjust the proposed value scale if needed. As in direct rating, this method allows scoring a project directly or indirectly using any performance measure.

2.3. Review summary

In the project success literature reviewed, most papers address the identification of IT criteria (e.g., Lobato et al. [4] and Assalaarachchi et al. [56]) or success factors (e.g., Pinheiro et al. [57] and Jayakody and Wijayanayake [58]), but only a few present an evaluation approach. In addition, the evaluation methods identified suffer from one or more theoretical errors (e.g., weights used as indicators of importance, averages calculated with ordinal scales, application of techniques with known flaws, and normalisation procedures that do not consider non-linear preferences). Furthermore, as far as we know, there is no description of a formal process that may guide the evaluators from beginning to end, i.e., from identifying the evaluation criteria until
|
||||
|
||||
|
||||
|
||||
|
||||
reaching an overall measure of project success. Therefore, a gap in the IT project literature needs to be addressed, which will be done by applying multi-criteria evaluation principles.

Given the characteristics of the evaluators, the simplicity of use of the MACBETH method and its software tool M-MACBETH, including its ability to validate the consistency of the value judgments expressed by evaluators and to work with any performance measure (be it qualitative or quantitative, continuous or discrete), this was the approach selected to weight the criteria and build a value function for each criterion in the real-world case described in this paper.

different roles in the project; all of them were somehow interested in the project’s outcomes. The group had three members: two from TEAMGSM and TEAMUNI, and one external consultant. The team members were selected considering their managerial responsibilities and to ensure representativeness of all the involved parties. All the members agreed to be involved in the model development tasks. Note that larger groups require different group processes, typically having separate meetings with stakeholders of different areas of interest to develop parts of the model, and with merge meetings gathering higher-level representatives of the client to validate the work done by the stakeholders and to finish the overall model [63].
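As an illustration of the concepts reviewed in Section 2.2, the following Python sketch combines a piecewise linear value function built from bisection-style anchor points, swing weights normalised to sum to 1, and the additive value function of Eq. (1). It is not the model developed in this paper, nor a reproduction of M-MACBETH (which derives scales from qualitative judgments by linear programming); the criteria names, anchor points, swing ratings, and project performances are all hypothetical values chosen for the example.

```python
from bisect import bisect_right


def piecewise_linear_value(points):
    """Return v(x) interpolating linearly between (performance, value) anchors.

    The anchors are assumed to come from an elicitation such as the bisection
    method: "worst" -> 0, midpoint p -> 50, "best" -> 100.
    """
    pts = sorted(points)
    xs = [x for x, _ in pts]
    vs = [v for _, v in pts]

    def v(x):
        # Clamp performances outside the elicited range to the end anchors.
        if x <= xs[0]:
            return float(vs[0])
        if x >= xs[-1]:
            return float(vs[-1])
        i = bisect_right(xs, x)
        x0, x1 = xs[i - 1], xs[i]
        v0, v1 = vs[i - 1], vs[i]
        return v0 + (v1 - v0) * (x - x0) / (x1 - x0)

    return v


def normalise_swings(swings):
    """Turn positive worst-to-best swing ratings into weights that sum to 1."""
    if any(s <= 0 for s in swings.values()):
        raise ValueError("every swing rating must be positive (w_j > 0)")
    total = sum(swings.values())
    return {name: s / total for name, s in swings.items()}


def overall_value(weights, value_functions, performances):
    """Additive value model: V = sum_j w_j * v_j(x_j)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(
        weights[name] * value_functions[name](performances[name])
        for name in weights
    )


# Hypothetical criteria and anchors (not taken from the paper).
value_functions = {
    # Cost overrun in %: 0 is "best", 30 is "worst", 10 judged the midpoint.
    "cost_overrun_pct": piecewise_linear_value([(0, 100), (10, 50), (30, 0)]),
    # Satisfaction on a 1-10 measure: 4 "worst", 10 "best", 8 the midpoint.
    "user_satisfaction": piecewise_linear_value([(4, 0), (8, 50), (10, 100)]),
}

# Hypothetical swing ratings: the worst-to-best swing in cost is rated 100,
# the swing in satisfaction 60 relative to it.
weights = normalise_swings({"cost_overrun_pct": 100, "user_satisfaction": 60})

project = {"cost_overrun_pct": 20, "user_satisfaction": 9}
score = overall_value(weights, value_functions, project)
print(score)  # 43.75
```

With these assumed inputs the weights are 0.625 and 0.375, the partial values are 25 and 75, and the overall score is 43.75. Note that the weights depend on the chosen "worst" and "best" references: widening the cost range would change the swing being rated and hence the weight, which is why weights cannot be read as importance detached from performance ranges.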
|
||||
3. Model development Fig. 1 depicts the model development tasks. The first task involves
|
||||
identifying the aspects of interest for evaluating the project’s success
|
||||
3.1. Research setting (“problem structuring”, described in Section 3.3). This is a critical task
|
||||
because it is not possible to develop a proper evaluation model without
|
||||
GlobalSysMakers develops solutions in four business areas: mobility understanding the problem, which is the reason why several publica
|
||||
solutions, industrial technology, consumer goods, and energy and tions have been devoted to identifying the fundamental evaluation
|
||||
building technology. It has several divisions, including automobile concerns to be addressed (e.g., [28,64]). Second, all the relevant eval
|
||||
multimedia, automobile accessories, electric tools, heating and hot uation criteria should be included in the model, and a descriptor of
|
||||
water, and home appliances. It employs roughly 410,000 associates performance should be identified for each of them, enabling the
|
||||
worldwide, has about 440 subsidiaries and regional companies in 60 assessment of the extent to which each criterion is met (“model struc
|
||||
countries, and employs nearly 70,000 associates in research and devel turing”, Section 3.4). Third, the evaluation component of the model must
|
||||
opment at 125 locations. be built (“value model building”, Section 3.5), which includes the con
|
||||
The target project, here identified as PROJRD, was part of an R&D struction of a value function for each criterion to transform the perfor
|
||||
program that had the participation of GlobalSysMakers and a university. mances of the project into value scores (Section 3.5.1), and weighting
|
||||
The project had as its primary goal the development of a software tool to the criteria to depict their trade-offs (Section 3.5.2). Last, the evaluation
|
||||
automate the assessment of printed circuit boards (PCBs) design. PCBs model should be tested for adequacy and consistency (Section 4.1).
|
||||
are essentially boards that connect electronic components used in all
|
||||
(but the simplest) electronic products, such as household appliances or
|
||||
vehicles. In addition to the software tool, the project deliverables 3.3. Problem structuring
|
||||
included technical specifications, prototypes, and presentations.
|
||||
The software development process adopted was based on a hybrid/ The problem structuring task aims to identify the fundamental ob
|
||||
agile methodology supported by SCRUM [59]. Agile methods for soft jectives [45] that determine the project’s success from the client’s
|
||||
ware development have been increasingly used in the IT sector [60] and perspective. Such objectives are essential reasons for the project’s suc
|
||||
are now mainstream [61]. In this project, agility enabled greater cess. Therefore, they should be used as criteria in the evaluation model.
|
||||
adaptability of the development phases according to the company’s However, the identification of these objectives in ill-structured
|
||||
needs and requirements, which evolved along with the project lifecycle. problems may not be easy, which is why we opted to apply a problem
|
||||
Thus, it was possible to deal with changes in the requirements that were structuring method (PSM) known as group map [65], which can be used in
|
||||
reflected in the final deliverables during the project development. In a combination with a multi-criteria decision analysis approach [66].
|
||||
later phase of the project, the SCRUM was coupled with a waterfall To begin structuring the problem, the decision-making group was
|
||||
process since the objectives stabilised without needing a periodic up asked to say which aspects or concerns were relevant to evaluate the
|
||||
date. The project team was multidisciplinary, incorporating engineers project’s success. Then, for each of the concerns expressed, it was asked,
|
||||
from GlobalSysMakers (TEAMGSM) and researchers from the university “Why is that important?” or “What would be the consequences of doing
|
||||
(TEAMUNI). Together, the teams (TEAMGSM and TEAMUNI) had that?”, which allowed us to identify other aspects.
|
||||
electronics, software engineering, and project management skills. Fig. 2 depicts the complete group causal map built with the answers
|
||||
On average, the team allocated 1040 h per month to the project
|
||||
(approximately 6.5 Full-Time Equivalent), distributed by the different
|
||||
tasks of the project and according to the functions performed by each
|
||||
element (three of the team members were not full-time in the project).
|
||||
The project had a duration of 36 months.
|
||||
The project’s overall success was first assessed using a simple grid
|
||||
scoring model built by non-specialists in evaluation, which directly
|
||||
scored the project on several criteria and assigned importance weights.
|
||||
However, the project management team felt the need for a more
|
||||
advanced model to improve confidence in the evaluation. More in-depth
|
||||
research on multi-criteria evaluation revealed some misinterpretations
|
||||
in that process, which ultimately led to the development of a new model
|
||||
in line with decision analysis principles. This paper describes the new
|
||||
evaluation model.
|
||||
|
||||
3.2. Development tasks
|
||||
|
||||
The model development process started by asking the project man
|
||||
ager to identify the members who should form the decision-making
|
||||
group [62], i.e., the group in charge of developing the model to eval
|
||||
uate the project’s success. It was recommended to select members with Fig. 1. Model development tasks.
|
||||
|
||||
|
||||
J.C. Lourenço and J. Varajão Computer Standards & Interfaces 97 (2026) 104122
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 2. Group map.
|
||||
|
||||
|
||||
of the elements of the group using the software tool "Decision Explorer" (from Banxia Software Ltd., https://banxia.com/dexplore), which automatically numbered the concerns for identification purposes. This map results from several iterations, adding some aspects and removing others. Note that a specific concern may be expressed by one statement (e.g., "(33) good requirements definition") or by two statements separated by an ellipsis, which depicts a positive pole and a negative one to clarify the meaning of the concern (e.g., "(15) time fulfilment… time exceeded"). An arrow between two concerns indicates the direction of causality. When an arrow points to a concern with two poles, it means that the concern affected is the one at the positive pole (e.g., a "(29) good contract management" contributes to the positive pole of "(1) cost fulfilment… cost exceeded"; in the reverse case, the arrow would have a negative sign near its head).

In Fig. 2, it is possible to identify chains of means-ends objectives. For example, an "(31) effective change management" contributes to the "(36) deliverables use", which in turn helps to "(41) reduce users' repetitive work", which contributes to "(39) increase users' satisfaction". Although "(41) reduce users' repetitive work" is a means-objective to the end-objective "(39) increase users' satisfaction", the group considered the former a fundamental objective because it is important in itself and not because of its contribution to the latter. Therefore, "(41) reduce users' repetitive work" will be used as an evaluation criterion. Objective "(39) increase users' satisfaction" was considered too broad to evaluate the project's success and thus will not be used.

3.4. Model structuring

3.4.1. Evaluation criteria

Fig. 3 depicts the seven evaluation criteria that emerged from the concerns highlighted in bold in the group causal map developed in the problem structuring task.

Fig. 3. Project's success evaluation criteria.

The concerns represented by these criteria are as follows:

• Scope/quality fulfilment (ScoQual): the extent to which the planned (functional and non-functional) requirements were fulfilled (this criterion resulted from concern 14 in Fig. 2).

The prime deliverable of the project is a software tool to support the PCB design assessment, the other deliverables being subsidiary to this tool. In the end, if the software tool does not comply with a minimum set of planned requirements, it will not be able to assess the PCB design and will compromise the investment objectives.

• Cost fulfilment (Cost): the extent to which the planned cost was fulfilled (this criterion resulted from concern 1 in Fig. 2).

The budget defined for the project needs to be carefully managed because it is financed by an external R&D entity with a very narrow margin of deviation.
|
||||
|
||||
|
||||
|
||||
|
||||
• Time fulfilment (Time): the extent to which the planned time was fulfilled (this criterion resulted from concern 15 in Fig. 2).

Since this project is part of a large program, time fulfilment is a significant management aspect because all the program's projects must be finished simultaneously due to the program's constraints. In other words, not meeting the deadline in this project would mean completing it in whatever form it is in when the program reaches its end, complying or not with the scope, and delivering or not what was planned.

• Increase of the number and type of errors identified in each verification cycle (IncNoType): the extent to which the number and type of errors identified in each PCB verification cycle increase (this criterion resulted from concern 43 in Fig. 2).

Before the project was implemented in the company, the PCB designs had been checked mainly in a semi-automatic way by specialised engineers. Due to the many PCB components, details, and rules to review, it was virtually impossible to check all of the required features. The consequence was the late detection of some errors in more advanced stages of the projects, or, in other words, in later verification cycles. This accounts for the importance of the new software tool to increase the number and type of errors identified early on in each verification cycle, thereby reducing the design costs.

• Reduction of the number of verification cycles (RNVC): the extent to which the number of verification cycles is reduced (this criterion resulted from concern 37 in Fig. 2).

A PCB typically needs to go through several verification cycles until it is free from errors and ready for production. When errors are detected in a verification cycle, the PCB design needs to be corrected and tested again, possibly requiring a new verification cycle. Each verification cycle of a PCB design implies high costs. Furthermore, there is the risk of detecting errors only at the production stage, with even more severe consequences. A primary expected result of the new software tool is to reduce the number of verification cycles by enabling the early detection of errors.

• Improve efficiency (ImpEff): the extent to which the number of verified rules increases in each verification cycle without increasing the involved human resources (this criterion resulted from concern 42 in Fig. 2).

Since the process for verifying the PCB design rules is semi-automatic, with a substantial part of manual labour, the current number of specialised engineers can only check some of the relevant aspects. With the new software tool, it is expected that the same number of engineers can check a greater number of design rules without spending more time doing it.

• Reduction of the repetitive work of the users (RRWU): the extent to which the number of rules manually verified is reduced in each verification cycle (this criterion resulted from concern 41 in Fig. 2).

In the semi-automatic verification of PCB design rules, manual labour is repetitive and prone to errors due to the fatigue of specialists. Automating most of the rules' assessment is expected to reduce the repetitive work of these specialists and free them to perform other tasks.

3.4.2. Descriptors of performance

In this task, we associate a descriptor of performance with each evaluation criterion to measure how much the project satisfies the criterion. According to Keeney [45], a descriptor should be unambiguous (to describe the performances on the associated criterion clearly), comprehensive (to cover the range of possible performances on the criterion), direct (the descriptor levels should directly describe the performances on the corresponding criterion), operational (the information concerning the performances of the project can be obtained and value judgments can be made), and understandable (performances and value judgments made using the descriptor can be clearly understood and communicated).

Table 1 presents the list of all the descriptors created to measure the performance of the project, as well as two reference performance levels, "neutral" and "good", for each of them. Note that the definition of two reference performance levels is required to weight the criteria, allowing comparisons between criteria preference ranges and defining two fixed anchors for the value scales (see Section 2.2). Furthermore, the use of a "neutral" performance level (which corresponds to a performance that is neither positive nor negative on the criterion) and of a "good" performance level (which corresponds to a very positive performance on the criterion) increases the understandability of the criterion, and these references are thus preferable to the "worst" and the "best" references used as examples in Section 2.2.

Table 1
Descriptors of performance.

| Criterion | Descriptor | Neutral | Good |
|---|---|---|---|
| Scope/quality fulfilment (ScoQual) | Constructed descriptor (see Table 2) | L3 | L2 |
| Cost fulfilment (Cost) | Cost of the project (k€) | Planned cost (k€ 500) | 95 % of the planned cost (k€ 450) |
| Time fulfilment (Time) | Project duration (weeks) | Planned time (96 weeks) | 95 % of the planned time (90 weeks) |
| Increase in the number and type of errors identified in each verification cycle (IncNoType) | Constructed descriptor (see Table 3) | E5 T0 | E10 T5 |
| Reduction of the number of verification cycles (RNVC) | Number of verification cycles decreased | 1 cycle | 2 cycles |
| Improve efficiency (ImpEff) | Number of verified rules increased (%) | 0 % | 40 % |
| Reduction of the repetitive work of the users (RRWU) | Number of rules manually verified reduced (%) | 0 % | 10 % |

As shown in Table 1, the criteria scope/quality fulfilment and increase in the number and type of errors identified in each verification cycle do not have direct descriptors of performance. For these criteria, constructed descriptors were developed combining the characteristics inherent to those criteria, as explained next (Bana e Costa et al. [67] describe a detailed procedure for creating constructed descriptors).

To measure the performance of the project on the scope/quality fulfilment criterion, several requirements that deliver different contributions to the project's success were considered, following the MoSCoW method principles [68]. These requirements were classified into three types ("must have", "important to have", and "nice to have") and combined to obtain the performance levels of the descriptor presented in Table 2.

To measure the performance of the project on the increase of the number and type of errors identified in each verification cycle criterion, several combinations of the number and type of errors identified at each verification cycle (based on a past project) need to be considered (see Table 3). For example, a "5 % increase in the number of identified errors" and a "10 % increase in the type of identified errors" is a performance depicted as level "E5 T10". A verification cycle includes a series of tests to check for errors in the PCB design or to confirm that it is ready for production (free from errors).

We note that the indicators used in the constructed scales presented in Tables 2 and 3 cannot be considered in isolation, as they are mutually preferentially dependent. For example, in Table 3, an increase of 10 % in
|
||||
|
||||
|
||||
|
||||
|
||||
the number of identified errors (E) is valued more highly when the percentage increase in the type of identified errors (T) is greater. Otherwise, the number and the type of identified errors could have been used as indicators for two separate evaluation criteria.

Table 2
Scale for "scope/quality fulfilment" criterion.

| The project… | Performance level |
|---|---|
| …satisfied all the requirements "must have" and "important to have" and most of the "nice to have" | L1 |
| …satisfied all the requirements "must have" and at least 85 % of the "important to have" and at least 20 % of the "nice to have" (or an equivalent performance on the requirements "important to have" and "nice to have") | L2 = Good |
| …satisfied all the requirements "must have" and at least 60 % of the "important to have" and at least 20 % of the "nice to have" (or an equivalent performance on the requirements "important to have" and "nice to have") | L3 = Neutral |
| …did not satisfy one requirement "must have", or satisfied less than 60 % of the requirements "important to have" | L4 |
| …did not satisfy more than one requirement "must have" | L5 |

Table 3
Constructed scale for "increase of the number and type of errors identified in each verification cycle" criterion.

| Increase in the number of identified errors (E) | Increase in the type of identified errors (T) | Level |
|---|---|---|
| 10 % | 10 % | E10 T10 |
| 10 % | 5 % | E10 T5 = Good |
| 10 % | 0 % | E10 T0 |
| 5 % | 10 % | E5 T10 |
| 5 % | 5 % | E5 T5 |
| 5 % | 0 % | E5 T0 = Neutral |
| 0 % | 0 % | E0 T0 |

After the seven criteria had been clearly identified and their descriptors of performance established, the decision-making group was asked whether there was any additional aspect that might be considered in assessing the project's success. The negative response indicated that this set of criteria was exhaustive and, consequently, that the value tree presented in Fig. 3 could be considered complete.

3.5. Value model building

3.5.1. Value functions

As previously described, a descriptor of performance provides a way of measuring the project's performance on its associated criterion. However, to build a value model, we also need to obtain the value of each plausible performance of the project (in the form of a value scale or value function), which requires knowing the preferences of the evaluators regarding differences in performance on the corresponding criterion.

For that purpose, we applied the MACBETH method [51]. As described in Section 2.2, the questioning procedure of MACBETH requires the evaluators to judge the difference in attractiveness between two performance levels at a time, using the qualitative scale: no (difference in attractiveness), very weak, weak, moderate, strong, very strong, and extreme. The answers provided are used for filling in a matrix of judgments in the M-MACBETH software tool, which analyses the consistency of the answers as soon as they are inserted, and then generates (by linear programming) a proposal of value scale which is compatible with the answers provided, given the fixed value scores assigned to the "neutral" and the "good" performances (0 and 100 value units, respectively).

We present two examples of applying the MACBETH method to build value functions for criteria with different descriptors of performance: the scope/quality fulfilment criterion with a discrete descriptor, and the time fulfilment criterion with a continuous descriptor.

Fig. 4 presents the matrix of judgments for the scope/quality fulfilment criterion. Table 2 shows the constructed descriptor for this criterion where: L1 means "the project satisfied all the requirements 'must have' and 'important to have' and the majority of the 'nice to have'", L2 means "the project satisfied all the requirements 'must have' and at least 85 % of the 'important to have' and at least 20 % of the 'nice to have' (or an equivalent performance)", and L3 means "the project satisfied all the requirements 'must have' and at least 60 % of the 'important to have' and at least 20 % of the 'nice to have' (or an equivalent performance)". We can see in Fig. 4 that the difference in attractiveness between "L1" and "L2 = Good" was deemed weak by the evaluators, whereas the difference in attractiveness between "L2 = Good" and "L3 = Neutral" was considered moderate. Therefore, the difference in value between "L1" and "L2 = Good" should be lower than the difference between "L2 = Good" and "L3 = Neutral", which can be confirmed in the value scale presented in Fig. 6a, where the former difference corresponds to 65 value units and the latter to 100.

The time fulfilment criterion has the descriptor of performance "project duration (in weeks)" with the references "96 weeks = Neutral" and "90 weeks = Good". To build a value function for this criterion, first, we created three more equally spaced performance levels: one worse than "neutral" (99 weeks), one between "neutral" and "good" (93 weeks), and one better than "good" (87 weeks). Then, the evaluators judged the differences in attractiveness between each pair of these levels, together with the "neutral" and the "good" levels, resulting in the matrix of judgments presented in Fig. 5.

Looking at the diagonal (above the grey shaded cells) of the matrix in Fig. 5, we see that the intensities of the differences in attractiveness between consecutive levels increase more when the number of weeks exceeds 93: the evaluators considered weak the differences in attractiveness between "87" and "90 = Good" (and also between "90 = Good" and "93"), whereas they considered moderate the difference in attractiveness between "93" and "96 = Neutral", and very strong the difference between "96 = Neutral" and "99". Therefore, the difference in value between "87" and "90 = Good" (and also between "90 = Good" and "93") should be lower than the difference in value between "93" and "96 = Neutral", and the latter should also be lower than the difference in value between "96 = Neutral" and "99", which can be confirmed in the value function presented in Fig. 6c (each of the first two intervals corresponds to 40 value units, whereas the third and fourth equal 60 and 160 value units, respectively). Therefore, this function shows that the evaluators considered that increments in time after 93 weeks are increasingly penalizing for the project's success.

We emphasize that the decision group made these judgments for each criterion independently of the performance levels or the differences in attractiveness on the remaining criteria, thereby supporting the assumption of mutual preferential independence between criteria.

Fig. 6 (6a–6g) presents the value functions of all the evaluation criteria.

3.5.2. Criteria weighting

Weighting requires establishing trade-offs between criteria, which is typically demanding because it implies comparing performance improvements on different criteria. The improvements (swings) are defined between the two predefined performance references, "neutral" and "good", in each criterion.

According to the MACBETH weighting procedure, the first step was to rank the "neutral–good" swings in order of decreasing preference (Fig. 7). The evaluators considered the swing from "1 to 2 verification cycles decreased" as the most important one (1st in Fig. 7), which implies that the criterion "reduction of the number of verification cycles (RNVC)" will have the highest weight. In contrast, the criterion "reduction of the repetitive work of the users (RRWU)" will obtain the lowest weight because it has the least important "neutral–good" swing
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 4. MACBETH judgment matrix for the “Scope/quality fulfilment” criterion.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 5. MACBETH judgment matrix for the “time fulfilment” criterion.
|
||||
|
||||
|
||||
(7th in Fig. 7).

In the second step, the improvements provided by the criteria swings were judged qualitatively using the MACBETH semantic scale (Fig. 8), which allowed filling in the rightmost column in Fig. 9. For example, the improvement provided by the most important swing [RNVC] was considered extreme, whereas the least important "neutral–good" swing [RRWU] was judged weak.

Then, the differences in attractiveness between each pair of "neutral–good" swings were assessed to fill in the remaining cells of the first row of the weighting matrix and the diagonal above the shaded cells in Fig. 9. For example, Fig. 10 depicts the comparison of the "neutral–good" swings in the reduction of the number of verification cycles (RNVC) criterion and in the increase in the number and type of errors identified in each verification cycle (IncNoType) criterion, a difference that was deemed very strong (v. strong in Fig. 9). The other cells with no judgments were filled in automatically (by transitivity) with "P" (positive) judgments by M-MACBETH.

Finally, the software tool applied the linear programming model described in Bana e Costa et al. [51] to generate a proposal of a weighting scale consistent with the qualitative judgments expressed in the weighting matrix. The proposed weights were subsequently validated by the evaluators (with some minor adjustments), resulting in the weights presented in Fig. 11.

4. Results and discussion

4.1. Model testing and results

At this point, the actual performances of the project are already known for most of the criteria, but not for the reduction of the number of verification cycles (RNVC) criterion, which will only be identified in the long term. Therefore, three alternative scenarios were created with hypothetical future performances on RNVC: no reduction at all (PCB no red of cycles), a decrease of one verification cycle (PCB red 1 cycle), and a decrease of two verification cycles (PCB red 2 cycles). The performances of these scenarios are shown in Table 4.

Applying the value functions previously defined for each criterion to the performances presented in Table 4, we obtain the partial and the overall value scores of the three scenarios shown in Table 5 using the previously assessed criteria weights.

As seen in Table 5, the most advantageous scenario corresponds to "PCB red 2 cycles" with 94.60 overall value units, followed by "PCB red 1 cycle" with 49.60, and "PCB no red of cycles" with –6.65.

Scenarios "PCB red 2 cycles" and "PCB red 1 cycle" undoubtedly denote a successful project independently of the weights assigned to criteria, because their performances are not worse than "neutral" in any of the criteria and are better than it in several criteria. Therefore, both scenarios dominate [69] a "neutral project". Additionally, we may see that scenario "PCB red 2 cycles" has an overall score very close to that of a "good project" (100 units), whereas the value of scenario "PCB red 1 cycle" is almost midway between a "neutral project" and a "good project".

However, it is not robust to say that the scenario "PCB no red of cycles" corresponds to an unsuccessful project, looking only at its overall value score. We must determine if its overall result will always be worse than that of a "neutral project" in the face of the uncertainty defined for the model parameters (i.e., the value scores and criteria weights). In fact, the evaluators considered it plausible that: a) each criterion weight ($w_j$, $j = 1, \ldots, 7$) may vary within an interval defined by the lower and upper limits ($\underline{w}_j \le w_j \le \overline{w}_j$, $j = 1, \ldots, 7$) shown in Table 6; and b) the value scores of the scenario "PCB no red of cycles" may have plus or minus 5 value units (respectively denoted by $\overline{v}_j(y_j)$ and $\underline{v}_j(y_j)$, $j = 1, \ldots, 7$) in all the criteria for which this scenario has a performance different from "neutral" and "good", otherwise it will keep 0 and 100, respectively.

The linear programming (LP) problem (2) was then used to test whether a "neutral project" additively dominates [70] the scenario "PCB no red of cycles", which would require a negative maxD. The result maxD = 9.575 denotes that there is at least one combination of plausible scores and weights for which scenario "PCB no red of cycles" has a higher overall value than that of a "neutral project".

The worst possible overall value for scenario "PCB no red of cycles" was also calculated, with the LP problem (3), resulting in minD = –14.10. Therefore, in the face of the uncertainty, the overall value score of scenario "PCB no red of cycles" may vary between –14.10 and 9.575.

$$\max D = \sum_{j=1}^{7} w_j \left[ \overline{v}_j(y_j) - v_j(\mathrm{neutral}_j) \right] \tag{2}$$

Subject to:

$$\sum_{j=1}^{7} w_j = 1$$

$$\underline{w}_j \le w_j \le \overline{w}_j, \quad j = 1, \ldots, 7$$

$$\min D = \sum_{j=1}^{7} w_j \left[ \underline{v}_j(y_j) - v_j(\mathrm{neutral}_j) \right] \tag{3}$$
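The robustness test behind LP problems (2) and (3) can be sketched in a few lines. The sketch assumes, as the objectives are written, that the value scores are fixed at their upper (or lower) plausible values, so that only the weights are decision variables. Under that assumption the problem has one normalisation constraint plus interval bounds, and a greedy allocation (start at the lower bounds, then give the remaining weight to the most favourable criteria first) yields the optimum, so no LP solver is needed. The function name `extreme_difference` and the three-criterion data are hypothetical illustrations; they are not the weights of Table 6 or the scores of the case study.

```python
def extreme_difference(diffs, lower, upper, maximise=True):
    """Optimise sum(w[j] * diffs[j]) subject to sum(w) == 1 and lower <= w <= upper.

    diffs[j] is the value difference to "neutral" on criterion j, i.e.
    v_j(y_j) - v_j(neutral_j), taken at its upper or lower plausible score.
    This greedy allocation replaces a generic LP solver for this special case.
    """
    slack = 1.0 - sum(lower)
    if slack < -1e-12 or sum(upper) < 1.0 - 1e-12:
        raise ValueError("weight intervals cannot sum to 1")
    weights = list(lower)
    # Most favourable criteria first: largest differences when maximising,
    # smallest (most negative) differences when minimising.
    order = sorted(range(len(diffs)), key=lambda j: diffs[j], reverse=maximise)
    for j in order:
        step = max(0.0, min(upper[j] - lower[j], slack))
        weights[j] += step
        slack -= step
    value = sum(w * d for w, d in zip(weights, diffs))
    return value, weights


# Hypothetical three-criterion example (NOT the data of the case study).
lower = [0.40, 0.20, 0.20]
upper = [0.50, 0.35, 0.35]
upper_diffs = [-100.0, 60.0, 80.0]  # differences at the upper plausible scores
lower_diffs = [-110.0, 50.0, 70.0]  # differences at the lower plausible scores

max_d, w_max = extreme_difference(upper_diffs, lower, upper, maximise=True)
min_d, w_min = extreme_difference(lower_diffs, lower, upper, maximise=False)
print(f"maxD = {max_d:.2f}, minD = {min_d:.2f}")
```

In this illustrative data set the interval between the two results straddles zero, which is the same qualitative situation reported for scenario "PCB no red of cycles": a "neutral project" does not additively dominate the scenario, because at least one plausible weight vector gives it a positive difference.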
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 6. Value functions of criteria: (a) scope/quality fulfilment, (b) cost fulfilment, (c) time fulfilment, (d) increase in the number and type of errors identified in each
|
||||
verification cycle, (e) reduction of the number of verification cycles, (f) improve efficiency, (g) reduction of the repetitive work of the users.
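As an illustration of how a value function such as the one for time fulfilment (Fig. 6c) converts a performance into a value score, the sketch below uses the five levels described in Section 3.5.1. The anchor scores follow from the text: 90 weeks = "good" (100), 96 weeks = "neutral" (0), and interval sizes of 40, 40, 60 and 160 value units. Linear interpolation between the anchors is an assumption made here for illustration, and the name `time_value` is hypothetical.

```python
from bisect import bisect_right

# (weeks, value) anchors derived from the description of Fig. 6c:
# 87-90 and 90-93 are worth 40 units each, 93-96 is worth 60, 96-99 is worth 160.
TIME_ANCHORS = [(87, 140.0), (90, 100.0), (93, 60.0), (96, 0.0), (99, -160.0)]


def time_value(weeks: float) -> float:
    """Value score of a project duration, interpolated linearly (assumption)."""
    xs = [x for x, _ in TIME_ANCHORS]
    if not xs[0] <= weeks <= xs[-1]:
        raise ValueError("duration outside the range covered by the anchors")
    # Index of the segment [xs[i], xs[i + 1]] that contains `weeks`.
    i = min(bisect_right(xs, weeks) - 1, len(xs) - 2)
    (x0, v0), (x1, v1) = TIME_ANCHORS[i], TIME_ANCHORS[i + 1]
    return v0 + (v1 - v0) * (weeks - x0) / (x1 - x0)


for w in (87, 90, 93, 94.5, 96, 99):
    print(w, time_value(w))
```

The steeper slope after 93 weeks reproduces the evaluators' judgment that delays become increasingly penalizing: one extra week costs about 13 value units between 90 and 93 weeks, 20 units between 93 and 96, and more than 53 units beyond 96.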
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 7. Neutral–good swings ranking.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 8. Neutral–good swings’ weighting judgments.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 9. MACBETH weighting matrix (the P and I within the matrix respectively mean positive difference in attractiveness and indifference).
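The weights derived from this matrix enter an additive value model, in which the overall value of a scenario is the weighted sum of its partial value scores. The sketch below is a rough consistency check on figures quoted in the text, not a reproduction of the authors' calculations. It assumes that the three scenarios differ only in their RNVC score (0 for one cycle = "neutral", 100 for two cycles = "good", and –125 for no reduction, as stated in the Discussion). Under that assumption, the RNVC weight can be inferred from the reported overall values. The function name `overall_value` is hypothetical.

```python
def overall_value(other_criteria_total: float, rnvc_weight: float, rnvc_score: float) -> float:
    """Additive model with the six non-RNVC criteria collapsed into one total."""
    return other_criteria_total + rnvc_weight * rnvc_score


# Overall values quoted in the text for the three scenarios.
V_RED_2, V_RED_1, V_NO_RED = 94.60, 49.60, -6.65
# RNVC value scores: "good", "neutral", and the no-reduction score.
RNVC_GOOD, RNVC_NEUTRAL, RNVC_NO_RED = 100.0, 0.0, -125.0

# If only the RNVC score changes between scenarios, the difference in overall
# value equals the RNVC weight times the difference in RNVC score.
implied_weight = (V_RED_2 - V_RED_1) / (RNVC_GOOD - RNVC_NEUTRAL)

# With RNVC at "neutral" (score 0), the overall value of "PCB red 1 cycle"
# is exactly the contribution of the other six criteria.
predicted_no_red = overall_value(V_RED_1, implied_weight, RNVC_NO_RED)

print(f"implied RNVC weight: {implied_weight:.2f}")
print(f"predicted overall value without reduction: {predicted_no_red:.2f}")
```

The implied weight of 0.45 reproduces the reported –6.65 for the no-reduction scenario, so the three overall values are mutually consistent under the stated assumption.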
|
||||
|
||||
|
||||
subject to:
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
$$\sum_{j=1}^{7} w_j = 1$$

$$\underline{w}_j \le w_j \le \overline{w}_j, \quad j = 1, \ldots, 7$$

After concluding the robustness analysis, the evaluation group revisited the model and considered that it could deal with all the plausible performances and adequately considered the value judgments of its members. Therefore, the model has a form and content sufficient to evaluate the project's success [71].

Fig. 10. Assessment of the difference in attractiveness between the "neutral–good" swings in RNVC and IncNoType.

Fig. 11. Criteria weights.

5. Discussion

The absence of a formal evaluation of project success results in the waste of relevant lessons that can be used to enhance project management practices [9,72]. This is a strong reason for implementing well-structured processes to evaluate project success.

Any evaluation process should start by identifying the success criteria according to the decision-makers' preferences and systems of values, which are inherently subjective. We underscore that an evaluation model has an objective component (factual data) and a subjective one (value judgments), which should be independently addressed. Therefore, subjectivity is a key component in an evaluation process, but it should not be confused with ambiguity, which should be avoided. That is why the success evaluation criteria should be carefully identified, and a measure of the performance of a project on each of those criteria must be operationalised. The "neutral" and "good" references of intrinsic value allow identifying the project's success level.

Throughout the development of the evaluation model, the members of the decision-making group were encouraged to engage in open discussion whenever differences of opinion arose. This approach enabled a better understanding of their points of view and helped the group reach an agreement on the way forward.

In the case described herein, the success of the project may depend on the future performance of the reduction of the number of verification cycles (RNVC) criterion. With "no reduction of verification cycles", the project may be unsuccessful, with –6.65 overall value units, caused by its low performance and corresponding negative score (–125 value units) on this criterion. However, as we have seen, given the uncertainty defined for the partial value scores and the criteria weights, this scenario is not guaranteed to correspond to a negative evaluation. In fact, its overall value may vary between –14.10 and 9.575 units.

With a "reduction of 1 verification cycle", the project would obtain 49.60 overall value units, which is nearly a mid-distance evaluation between a "good project" and a "neutral project". With a "reduction of 2 verification cycles", the project would obtain 94.60 overall value units, which is very close to that of a "good project".

Developing a transparent evaluation process, such as the one described here, will promote the decision-making group's understanding and acceptance of the results. The participation of the decision-makers in all of the process phases is a key element for this purpose, which will allow them to develop a sense of ownership of the model [63]. However, this is not a practice found in the literature related to evaluating project success, which offers an opportunity for improvement.

The proposed process, which integrates a problem structuring
|
||||
|
||||
|
||||
Table 4
Performance profiles of the project’s success for the three scenarios.

Scenario / Criterion    ScoQual   Cost (k€)   Time (weeks)   IncNoType   RNVC                ImpEff (%)   RRWU (%)
PCB no red of cycles    L2        480         96             E10 T10     No decrease         60           15
PCB red 1 cycle         L2        480         96             E10 T10     Decrease 1 cycle    60           15
PCB red 2 cycles        L2        480         96             E10 T10     Decrease 2 cycles   60           15
|
||||
|
||||
|
||||
|
||||
|
||||
Table 5
Value scores of the project success for the three scenarios.

Scenario / Criterion    ScoQual   Cost     Time     IncNoType   RNVC     ImpEff   RRWU     Overall value score
                        (15 %)    (5 %)    (8 %)    (22 %)      (45 %)   (3 %)    (2 %)
PCB no red of cycles    100       40       0        115         –125     150      140      –6.65
PCB red 1 cycle         100       40       0        115         0        150      140      49.60
PCB red 2 cycles        100       40       0        115         100      150      140      94.60
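The overall value scores in Table 5 are consistent with a weighted sum of the partial value scores, as expected for an additive value function model. The sketch below recomputes them from the weights and scores listed in the table; the names `WEIGHTS`, `BASE_SCORES`, `RNVC_SCORES`, and `overall_value` are illustrative and do not come from the paper or from any MACBETH software.

```python
# Criteria weights from Table 5, expressed as fractions that sum to 1.
WEIGHTS = {"ScoQual": 0.15, "Cost": 0.05, "Time": 0.08, "IncNoType": 0.22,
           "RNVC": 0.45, "ImpEff": 0.03, "RRWU": 0.02}

# Partial value scores shared by the three scenarios (Table 5).
BASE_SCORES = {"ScoQual": 100, "Cost": 40, "Time": 0, "IncNoType": 115,
               "ImpEff": 150, "RRWU": 140}

# Only the RNVC score differs between scenarios (Table 5).
RNVC_SCORES = {"PCB no red of cycles": -125,
               "PCB red 1 cycle": 0,
               "PCB red 2 cycles": 100}


def overall_value(scores, weights):
    """Additive value model: weighted sum of the partial value scores."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("criteria weights must sum to 1")
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


for scenario, rnvc_score in RNVC_SCORES.items():
    scores = {**BASE_SCORES, "RNVC": rnvc_score}
    print(f"{scenario}: {overall_value(scores, WEIGHTS):.2f}")
# PCB no red of cycles: -6.65
# PCB red 1 cycle: 49.60
# PCB red 2 cycles: 94.60
```

Because RNVC carries 45 % of the weight, moving its score from –125 to 0 alone shifts the overall value by 56.25 units, which is why the project's success hinges on this criterion.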
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Table 6
Plausible intervals for the criteria weights.

Criterion                             ScoQual   Cost   Time   IncNoType   RNVC   ImpEff   RRWU
Index (j)                             1         2      3      4           5      6        7
Current weight ($w_j$)                15 %      5 %    8 %    22 %        45 %   3 %      2 %
Upper limit ($\overline{w}_j$)        18 %      7 %    10 %   25 %        45 %   4 %      2.5 %
Lower limit ($\underline{w}_j$)       12 %      5 %    8 %    19 %        40 %   3 %      2 %
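To illustrate how interval weights such as those in Table 6 bound an overall value, the sketch below computes the lowest and highest overall value of the “PCB no red of cycles” scenario when only the weights vary within their limits and must sum to 100 %. It is a simplified illustration, not the authors' robustness analysis: the range reported in the text (–14.10 to 9.575) also reflects uncertainty in the partial value scores, so this weights-only range is narrower. The function name `extreme_overall_value` and the greedy allocation are assumptions of this example; the greedy rule is valid here because the problem is a linear objective with one sum constraint and box bounds.

```python
# Partial value scores of the "PCB no red of cycles" scenario (Table 5).
SCORES = {"ScoQual": 100, "Cost": 40, "Time": 0, "IncNoType": 115,
          "RNVC": -125, "ImpEff": 150, "RRWU": 140}

# Plausible weight intervals from Table 6, in percentage points.
LOWER = {"ScoQual": 12, "Cost": 5, "Time": 8, "IncNoType": 19,
         "RNVC": 40, "ImpEff": 3, "RRWU": 2}
UPPER = {"ScoQual": 18, "Cost": 7, "Time": 10, "IncNoType": 25,
         "RNVC": 45, "ImpEff": 4, "RRWU": 2.5}


def extreme_overall_value(scores, lower, upper, maximize):
    """Return the extreme overall value and the weights (in %) that attain it.

    Every weight starts at its lower limit; the remaining percentage points
    go first to the highest-scoring criteria (maximize) or to the
    lowest-scoring ones (minimize), up to each upper limit.
    """
    weights = dict(lower)
    remaining = 100 - sum(lower.values())
    if remaining < 0:
        raise ValueError("lower limits already exceed 100 %")
    for criterion in sorted(scores, key=scores.get, reverse=maximize):
        extra = min(upper[criterion] - lower[criterion], remaining)
        weights[criterion] += extra
        remaining -= extra
    if remaining > 0:
        raise ValueError("upper limits cannot reach 100 %")
    value = sum(weights[c] * scores[c] for c in scores) / 100
    return value, weights


lowest, _ = extreme_overall_value(SCORES, LOWER, UPPER, maximize=False)
highest, _ = extreme_overall_value(SCORES, LOWER, UPPER, maximize=True)
print(f"weights-only range: [{lowest:.2f}, {highest:.2f}]")
# weights-only range: [-10.30, 5.75]
```

Even with the scores held fixed, the range straddles zero, which agrees with the observation that this scenario is not guaranteed to receive a negative evaluation.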
|
||||
|
||||
|
||||
|
||||
method with a multi-criteria decision analysis (MCDA) approach for evaluating the success of information technology (IT) projects, offers several significant theoretical contributions to the fields of project management, decision sciences, and IS. First, it advances the conceptual understanding of IT project success by addressing its inherently multidimensional and context-dependent nature. Traditional models often rely on narrow success criteria, such as time, cost, and scope, while this research introduces a more holistic and stakeholder-sensitive framework. By incorporating problem structuring methods, the process facilitates the elicitation and organization of the stakeholder perspectives, which are often overlooked or underrepresented in conventional evaluation models. This contributes to theory by emphasizing the social and interpretive dimensions of project success, aligning with contemporary views that success is not an objective outcome but a negotiated construct [73].

Second, the integration of MCDA techniques provides a rigorous and transparent mechanism for prioritizing and aggregating evaluation criteria, thereby enhancing the methodological robustness of success assessment. This methodological synthesis bridges a gap in the literature by demonstrating how qualitative insights from problem structuring can be systematically translated into quantitative decision models. Theoretically, this supports the development of hybrid evaluation frameworks that are both contextually grounded and analytically sound. Third, the application of the proposed process in a real-world case adds empirical depth to the theoretical model, offering evidence of its practical relevance and adaptability. This empirical grounding strengthens the external validity of the framework and encourages further theoretical exploration across different organizational and project contexts.

The MACBETH approach has been successfully employed, with different nuances and across various processes, to evaluate projects or decision alternatives in diverse problem settings and for a wide range of organizations [74]. The process described in this paper, which combines problem structuring with the MACBETH approach and robustness analysis, may also be applied in other contexts, subject to the necessary adjustments.

Our proposed process can also be scaled to the program or portfolio level, although this should be done with caution. In the case presented here, we applied an additive value function model, which is compensatory, meaning that poor performance on one criterion can be offset by good performance on others. However, this assumption may not always hold. In a program or portfolio context, for instance, if a key project performs poorly, that alone may render the entire program or portfolio unsuccessful, regardless of the performance of the remaining projects. In such cases, a mixed model should be adopted, combining classification rules to address the non-compensatory criteria with an additive component for the compensatory ones.

Moreover, the research highlights the absence of standardized approaches for evaluating IT project success, which has long been a limitation in both academic and professional domains. Standardization facilitates the dissemination of knowledge and enhances predictability, thereby minimizing uncertainty and reducing risk [75]. By proposing a replicable and adaptable process, the study lays the groundwork for the development of formalized evaluation standards. This has implications for theory-building, as it suggests a pathway toward unifying fragmented evaluation practices under a coherent, theoretically informed model. In doing so, it contributes to the ongoing discourse on standardization in project management and information systems evaluation, encouraging future research to refine, validate, and extend the proposed framework. Ultimately, this work not only enriches theoretical understanding but also provides a foundation for more consistent, transparent, and stakeholder-aligned evaluation practices in the IT project domain.

6. Conclusions

Evaluating the success of IT projects should be a mandatory project management activity. However, this is not observed in the practice [11,72]. There are several contributions given by the process herein described, which can be easily adapted to other evaluation problems:

• It shows how a multi-criteria approach may be used to evaluate IT (software development) projects while avoiding committing critical mistakes.
• It offers a transparent process.
• It involves the decision-makers in all of the model development tasks.
• It identifies the fundamental objectives of decision-makers with the help of a problem structuring method, avoiding ending up solving the wrong problem [76].
• It allows establishing quantitative and substantive meaningful [23] trade-offs between criteria (i.e., mathematically valid and unambiguously understood).
• It allows the management of the project to focus on what matters for the project’s success.
• It can be implemented to evaluate the success of other projects, in similar or different contexts.
• The use of descriptors of performance clarifies what is intended to be achieved in each criterion.
• It distinguishes performance from value, instead of directly attributing scores to the project, mixing these two components.
• And, it allows creating value scales adjusted to the preferences of evaluators, upon different types of performance (e.g., qualitative or quantitative, continuous or discrete).

Additionally, it enables the identification of alternative scenarios to deal with unknown future performances and to test the robustness of the conclusions considering uncertainties on the model parameters.

In the target organization, given the shortcomings recognised in a previous “grid scoring model”, the multi-criteria evaluation model of the real-world case described in this paper was built during an advanced stage of the project’s development. This late development can be considered a threat to internal validity regarding consistency and a limitation since the evaluation model should be built during the planning phase of a project and revisited during the project development to be improved, if needed, or adjusted to possible changes to the project aim. Another threat to external validity should also be disclosed. Namely, concerning scalability, further research is needed to test if the proposed process can be scaled or adapted for different project sizes or types.

In future work, it would be interesting to create a process capable of dealing with all project phases, allowing the evaluation of its development and evolution at several milestones, from the project initiation until its termination. The process described in this paper may be extended to evaluate project success throughout the project lifecycle. This requires developing a model that includes both final and
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
intermediate objectives (criteria) for measuring project success. The intermediate objectives should be used during project development and later deactivated by setting their weights to zero and rescaling the remaining criteria weights so that they sum to one. Monitoring the evolution of a project’s success against a well-defined set of criteria will allow identifying problems sooner and taking proper measures in time. Furthermore, the integration of the proposed evaluation process in the success management process [77] will add value to the management efforts.

Finally, since artificial intelligence technology, especially with the rise of Large Language Models (LLMs), has shown great potential in revolutionizing the automation of various complex tasks [78], it is imperative to explore it in the context of success evaluation.

CRediT authorship contribution statement

João Carlos Lourenço: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Conceptualization. João Varajão: Writing – review & editing, Writing – original draft, Validation, Methodology, Investigation, Data curation, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the R&D Unit Project Scope UID/00319/2025 - Centro ALGORITMI (ALGORITMI/UM). João C. Lourenço acknowledges the financial support of Portuguese funds through FCT – Fundação para a Ciência e a Tecnologia, I.P., under the project UID/97/2025 (CEGIST).

Data availability

The data is presented in the article.

References

[1] R. Colomo-Palacios, I. González-Carrasco, J.L. López-Cuadrado, A. Trigo, J.E. Varajao, I-Competere: using applied intelligence in search of competency gaps in software project managers, Inf. Syst. Front. 16 (4) (2014) 607–625, https://doi.org/10.1007/s10796-012-9369-6.
[2] M.A. Kafaji, Interchange roles of formal and informal project management on business operational success, Prod. Plan. Control (2022) 1–21, https://doi.org/10.1080/09537287.2022.2089265.
[3] L.A. Ika, J.K. Pinto, The “re-meaning” of project success: updating and recalibrating for a modern project management, Int. J. Proj. Manag. 40 (7) (2022) 835–848, https://doi.org/10.1016/j.ijproman.2022.08.001.
[4] B. Lobato, J. Varajão, C. Tam, A.A. Baptista, CrEISPS–a framework of criteria for evaluating success in information systems projects, Procedia Comput. Sci. 256 (2025) 1821–1835, https://doi.org/10.1016/j.procs.2025.02.323.
[5] N. Agarwal, U. Rathod, Defining ‘success’ for software projects: an exploratory revelation, Int. J. Proj. Manag. 24 (4) (2006) 358–370, https://doi.org/10.1016/j.ijproman.2005.11.009.
[6] R. Atkinson, Project management: cost, time and quality, two best guesses and a phenomenon, its time to accept other success criteria, Int. J. Proj. Manag. 17 (6) (1999) 337–342, https://doi.org/10.1016/S0263-7863(98)00069-6.
[7] H. Landrum, V.R. Prybutok, X. Zhang, The moderating effect of occupation on the perception of information services quality and success, Comput. Ind. Eng. 58 (1) (2010) 133–142, https://doi.org/10.1016/j.cie.2009.09.006.
[8] J.K. Pinto, D.P. Slevin, Project success: definitions and measurement techniques, Proj. Manag. J. 19 (1) (1988) 67–72.
[9] J. Varajão, L. Magalhães, L. Freitas, P. Rocha, Success management–from theory to practice, Int. J. Proj. Manag. 40 (5) (2022) 481–498, https://doi.org/10.1016/j.ijproman.2022.04.002.
[10] J. Varajão, J.C. Lourenço, J. Gomes, Models and methods for information systems project success evaluation–a review and directions for research, Heliyon 8 (12) (2022), https://doi.org/10.1016/j.heliyon.2022.e11977.
[11] J. Varajão, J.Á. Carvalho, Evaluating the success of IS/IT projects: how are companies doing it?, in: Proceedings of the 13th Pre-ICIS International Research Workshop on IT Project Management (IRWITPM 2018), San Francisco, USA, 2018.
[12] R.L. Keeney, Common mistakes in making value trade-offs, Oper. Res. 50 (6) (2002) 935–945, https://doi.org/10.1287/opre.50.6.935.357.
[13] J.E. Russo, P.J.H. Schoemaker, Decision Traps: The Ten Barriers to Brilliant Decision-Making and How to Overcome Them, Doubleday, 1989.
[14] S. Lipovetsky, A. Tishler, D. Dvir, A. Shenhar, The relative importance of project success dimensions, R&D Manag. 27 (2) (1997) 97–106, https://doi.org/10.1111/1467-9310.00047.
[15] Shapiro, J. (2005). Monitoring and evaluation. C.-W. A. f. C. Participation. https://www.civicus.org/view/media/Monitoring%20and%20Evaluation.pdf.
[16] Kahan, B., & Goodstadt, M. (2005). The IDM manual: basics. http://sites.utoronto.ca/chp/download/IDMmanual/IDM_basics_dist05.pdf.
[17] V. Arumugam, J. Antony, M. Kumar, Linking learning and knowledge creation to project success in Six Sigma projects: an empirical investigation, Int. J. Prod. Econ. 141 (1) (2013) 388–402, https://doi.org/10.1016/j.ijpe.2012.09.003.
[18] R. Linzalone, G. Schiuma, A review of program and project evaluation models, Meas. Bus. Excell. 19 (3) (2015) 90–99, https://doi.org/10.1108/MBE-04-2015-0024.
[19] P.L. Bannerman, A. Thorogood, Celebrating IT projects success: a multi-domain analysis, in: Proceedings of the 45th Hawaii International Conference on System Sciences, Maui, HI, 2012.
[20] C. Barclay, K. Osei-Bryson, Determining the contribution of IS projects: an approach to measure performance, in: Proceedings of the 42nd Hawaii International Conference on System Sciences, Waikoloa, HI, 2009.
[21] R.L. Keeney, Value-Focused Thinking: A Path to Creative Decisionmaking, Harvard University Press, 1992.
[22] R. Solingen, E. Berghout, The Goal/Question/Metric Method: A Practical Guide for Quality Improvement of Software Development, McGraw-Hill, 1999.
[23] S. French, Decision Theory: An Introduction to the Mathematics of Rationality, Ellis Horwood, 1986.
[24] R. Göb, C. McCollin, M. Ramalhoto, Ordinal methodology in the analysis of Likert scales, Qual. Quant. 41 (5) (2007) 601–626, https://doi.org/10.1007/s11135-007-9089-z.
[25] S.S. Stevens, On the theory of scales of measurement, Science 103 (2684) (1946) 677–680, https://doi.org/10.1126/science.103.2684.677.
[26] W. Edwards, J.R. Newman, Multiattribute evaluation, in: T. Connolly, H.R. Arkes, K.R. Hammond (Eds.), Judgment and Decision Making: An Interdisciplinary Reader, 2nd ed, Cambridge University Press, 2000, pp. 17–34.
[27] R. von Nitzsch, M. Weber, The effect of attribute ranges on weights in multiattribute utility measurements, Manag. Sci. 39 (8) (1993) 937–943, https://doi.org/10.1287/mnsc.39.8.937.
[28] A. Basar, A novel methodology for performance evaluation of IT projects in a fuzzy environment: a case study, Soft Comput. 24 (14) (2020) 10755–10770, https://doi.org/10.1007/s00500-019-04579-y.
[29] H.N. Ismail, Measuring success of water reservoir project by using delphi and priority evaluation method, in: Proceedings of the IOP Conference Series: Earth and Environmental Science 588, 2020 042021, https://doi.org/10.1088/1755-1315/588/4/042021.
[30] J.H. Yu, H.R. Kwon, Critical success factors for urban regeneration projects in Korea, Int. J. Proj. Manag. 29 (7) (2011) 889–899, https://doi.org/10.1016/j.ijproman.2010.09.001.
[31] A. Nguvulu, S. Yamato, T. Honma, Project performance evaluation using deep belief networks, IEEJ Trans. Electron. Inf. Syst. 132 (2) (2012) 306–312, https://doi.org/10.1541/ieejeiss.132.306.
[32] C. Wohlin, A.A. Andrews, Assessing project success using subjective evaluation factors, Softw. Qual. J. 9 (1) (2001) 43–70, https://doi.org/10.1023/a:1016673203332.
[33] X. Yan, Utilizing the BSC method for IT performance evaluation of construction companies, in: Proceedings of the First International Conference on Information Science and Engineering, Nanjing, China, 2009.
[34] R.S. Kaplan, D.P. Norton, The balanced scorecard–measures that drive performance, Harv. Bus. Rev. 70 (1) (1992) 71–79.
[35] C.L. Yang, R.H. Huang, M.T. Ho, Multi-criteria evaluation model for a software development project, in: Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management, Hong Kong, China, 2009.
[36] T.L. Saaty, The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation, McGraw-Hill, 1980.
[37] C.A. Bana e Costa, J.C. Vansnick, A critical analysis of the eigenvalue method used to derive priorities in AHP, Eur. J. Oper. Res. 187 (3) (2008) 1422–1428, https://doi.org/10.1016/j.ejor.2006.09.022.
[38] J.S. Dyer, Remarks on the analytic hierarchy process, Manag. Sci. 36 (3) (1990) 249–258, https://doi.org/10.1287/mnsc.36.3.249.
[39] P. Goodwin, G. Wright, Decision Analysis for Management Judgment, 5th ed., John Wiley & Sons, 2014.
[40] V. Belton, T.J. Stewart, Multiple Criteria Decision Analysis: An Integrated Approach, Kluwer Academic Publishers, 2002.
[41] R.L. Keeney, D. von Winterfeldt, Practical value models, in: W. Edwards, R.F. Miles Jr., D. von Winterfeldt (Eds.), Advances in Decision Analysis: From Foundations to Applications, Cambridge University Press, 2007, pp. 232–252.
|
||||
|
||||
|
||||
|
||||
|
||||
[42] J.S. Dyer, J.E. Smith, Innovations in the science and practice of decision analysis: the role of management science, Manag. Sci. 67 (9) (2020) 5364–5378, https://doi.org/10.1287/mnsc.2020.3652.
[43] J.E. Smith, J.S. Dyer, On (measurable) multiattribute value functions: an expository argument, Decis. Anal. 18 (4) (2021) 247–256, https://doi.org/10.1287/deca.2021.0435.
[44] J.S. Dyer, R.K. Sarin, Measurable multiattribute value functions, Oper. Res. 27 (4) (1979) 810–822, https://doi.org/10.1287/opre.27.4.810.
[45] R.L Keeney, Developing objectives and attributes, in: W. Edwards, R.F. Miles Jr., D. von Winterfeldt (Eds.), Advances in Decision Analysis: From Foundations to Applications, Cambridge University Press, 2007, pp. 104–128.
[46] B. Fasolo, C.A. Bana e Costa, Tailoring value elicitation to decision makers' numeracy and fluency: expressing value judgments in numbers or words, Omega 44 (0) (2014) 83–90, https://doi.org/10.1016/j.omega.2013.09.006.
[47] C.A. Bana e Costa, E.C. Corrêa, J.M. De Corte, J.C. Vansnick, Facilitating bid evaluation in public call for tenders: a socio-technical approach, Omega 30 (3) (2002) 227–242, https://doi.org/10.1016/S0305-0483(02)00029-4.
[48] R.L. Keeney, H. Raiffa, Decisions With Multiple Objectives: Preferences and Value Tradeoffs, John Wiley & Sons, 1976.
[49] W. Edwards, F.H. Barron, SMARTS and SMARTER: improved simple methods for multiattribute utility measurement, Organ. Behav. Hum. Decis. Process. 60 (3) (1994) 306–325, https://doi.org/10.1006/obhd.1994.1087.
[50] C.A. Bana e Costa, J.C. Vansnick, MACBETH – An interactive path towards the construction of cardinal value functions, Int. Trans. Oper. Res. 1 (4) (1994) 489–500, https://doi.org/10.1016/0969-6016(94)90010-8.
[51] C.A. Bana e Costa, J.M. De Corte, J.C. Vansnick, MACBETH, Int. J. Inf. Technol. Decis. Mak. 11 (2) (2012) 359–387, https://doi.org/10.1142/S0219622012400068.
[52] C.A. Bana e Costa, J.M. De Corte, J.C. Vansnick, On the mathematical foundations of MACBETH, in: S. Greco, M. Ehrgott, J.R. Figueira (Eds.), Multiple Criteria Decision Analysis: State of the Art Surveys, Springer, 2016, pp. 421–463, https://doi.org/10.1007/978-1-4939-3094-4_11.
[53] W. Edwards, How to use multiattribute utility measurement for social decisionmaking, IEEE Trans. Syst. Man Cybern. 7 (5) (1977) 326–340, https://doi.org/10.1109/TSMC.1977.4309720.
[54] D. von Winterfeldt, W. Edwards, Decision Analysis and Behavioral Research, Cambridge University Press, 1986.
[55] C.W. Kirkwood, Strategic Decision Making: Multiobjective Decision Analysis with Spreadsheets, Duxbury Press, 1997.
[56] L.I. Assalaarachchi, M.P.P. Liyanage, C. Hewagamage, A framework of critical success factors of cloud-based project management software adoption, Int. J. Inf. Syst. Proj. Manag. 13 (2) (2025) e4, https://doi.org/10.12821/ijispm130204.
[57] N. Pinheiro, J. Vrajão, I. Moura, Success factors of public sector information systems projects in developing countries, Sustain. Futures 10 (2025) 101095, https://doi.org/10.1016/j.sftr.2025.101095.
[58] J. Jayakody, W. Wijayanayake, Critical success factors for DevOps adoption in information systems development, Int. J. Inf. Syst. Proj. Manag. 11 (3) (2023) 60–82, https://doi.org/10.12821/ijispm110304.
[59] K. Schwaber, J. Sutherland, The Scrum Guide - The Definitive Guide to Scrum: The Rules of the Game, scrumguides.org, 2020. https://scrumguides.org/docs/scrumguide/v2020/2020-Scrum-Guide-US.pdf.
[60] M. Jovanovic, A.L. Mesquida, A. Mas, R. Colomo-Palacios, Agile transition and adoption frameworks, issues and factors: a systematic mapping, IEEE Access 8 (2020) 15711–15735, https://doi.org/10.1109/ACCESS.2020.2967839.
[61] V. Henriquez, J.A. Calvo-Manzano, A.M. Moreno, T. San Feliu, Agile governance practices by aligning CMMI V2.0 with portfolio SAFe 5.0, Comput. Stand. Interfaces 91 (2025) 103881, https://doi.org/10.1016/j.csi.2024.103881.
[62] V. Ferretti, G. Montibeller, Key challenges and meta-choices in designing and applying multi-criteria spatial decision support systems, Decis. Support Syst. 84 (2016) 41–52, https://doi.org/10.1016/j.dss.2016.01.005.
[63] L.D Phillips, Decision conferencing, in: W. Edwards, R.F. Miles Jr., D. von Winterfeldt (Eds.), Advances in Decision Analysis: From Foundations to Applications, Cambridge University Press, 2007, pp. 375–399.
[64] T.Y. Chen, H.F. Chang, Critical success factors and architecture of innovation services models in data industry, Expert Syst. Appl. 213 (2023) 119014, https://doi.org/10.1016/j.eswa.2022.119014.
[65] C.M. Smith, D. Shaw, The characteristics of problem structuring methods: a literature review, Eur. J. Oper. Res. 274 (2) (2019) 403–416, https://doi.org/10.1016/j.ejor.2018.05.003.
[66] M. Marttunen, J. Lienert, V. Belton, Structuring problems for multi-criteria decision analysis in practice: a literature review of method combinations, Eur. J. Oper. Res. 263 (1) (2017) 1–17, https://doi.org/10.1016/j.ejor.2017.04.041.
[67] C.A. Bana e Costa, J.C. Lourenço, M.P. Chagas, J.C. Bana e Costa, Development of reusable bid evaluation models for the Portuguese Electric Transmission Company, Decis. Anal. 5 (1) (2008) 22–42, https://doi.org/10.1287/deca.1080.0104.
[68] D. Clegg, R. Barker, Case Method Fast-Track: A RAD Approach, Addison-Wesley Longman Publishing, 1994.
[69] M. Weber, Decision making with incomplete information, Eur. J. Oper. Res. 28 (1) (1987) 44–57, https://doi.org/10.1016/0377-2217(87)90168-8.
[70] C.A. Bana e Costa, P. Vincke, Measuring credibility of compensatory preference statements when trade-offs are interval determined, Theory Decis. 39 (2) (1995) 127–155, https://doi.org/10.1007/BF01078981.
[71] L.D. Phillips, A theory of requisite decision models, Acta Psychol. 56 (1–3) (1984) 29–48, https://doi.org/10.1016/0001-6918(84)90005-2.
[72] J. Pereira, J. Varajão, N. Takagi, Evaluation of information systems project success–insights from practitioners, Inf. Syst. Manag. (2021) 1–18, https://doi.org/10.1080/10580530.2021.1887982.
[73] N. Takagi, J. Varajão, ISO 21502 and Success Management: A Required Marriage in Project Management, SAGE Open, 2025, pp. 1–11, https://doi.org/10.1177/21582440251355046. July-September.
[74] F.A.F. Ferreira, S.P. Santos, Two decades on the MACBETH approach: a bibliometric analysis, Ann. Oper. Res. 296 (1) (2021) 901–925, https://doi.org/10.1007/s10479-018-3083-9v.
[75] J. Varajão, L. Lopes, A. Tenera, Framework of standards, guides and methodologies for project, program, portfolio, and PMO management, Comput. Stand. Interfaces 92 (2025) 103888, https://doi.org/10.1016/j.csi.2024.103888.
[76] I.I. Mitroff, T.R. Featheringham, On systemic problem solving and the error of the third kind, Behav. Sci. 19 (6) (1974) 383–393, https://doi.org/10.1002/bs.3830190605.
[77] J. Varajão, Success Management as a PM knowledge area – work-in-progress, Procedia Comput. Sci. 100 (2016) 1095–1102, https://doi.org/10.1016/j.procs.2016.09.256.
[78] Y. Kong, N. Zhang, Z. Duan, B. Yu, Collaboration with generative AI to improve requirements change, Comput. Stand. Interfaces 94 (2025) 104013, https://doi.org/10.1016/j.csi.2025.104013.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,726 @@
|
||||
Computer Standards & Interfaces 97 (2026) 104117
|
||||
|
||||
|
||||
Contents lists available at ScienceDirect
|
||||
|
||||
|
||||
Computer Standards & Interfaces
|
||||
journal homepage: www.elsevier.com/locate/csi
|
||||
|
||||
|
||||
|
||||
|
||||
ARMOR: A multi-layered adaptive defense framework for robust deep learning systems against evolving adversarial threats

Mahmoud Mohamed ∗, Fayaz AlJuaid

Electrical and Computer Engineering, King Abdul Aziz University, Saudi Arabia
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO

Keywords: Adversarial machine learning; Deep learning security; Multi-layered defense; Robustness evaluation; Adaptive security

ABSTRACT

Introduction: Adversarial attacks represent a major challenge to deep learning models deployed in critical fields such as healthcare diagnostics and financial fraud detection. This paper addresses the limitations of single-strategy defenses by introducing ARMOR (Adaptive Resilient Multi-layer Orchestrated Response), a novel multi-layered architecture that seamlessly integrates multiple defense mechanisms.

Methodology: We evaluate ARMOR against seven state-of-the-art defense methods through extensive experiments across multiple datasets and five attack methodologies. Our approach combines adversarial detection, input transformation, model hardening, and adaptive response layers that operate with intentional dependencies and feedback mechanisms.

Results: Quantitative results demonstrate that ARMOR significantly outperforms individual defense methods, achieving a 91.7% attack mitigation rate (18.3% improvement over ensemble averaging), 87.5% clean accuracy preservation (8.9% improvement over adversarial training alone), and 76.4% robustness against adaptive attacks (23.2% increase over the strongest baseline).

Discussion: The modular framework design enables flexibility against emerging threats while requiring only 1.42× computational overhead compared to unprotected models, making it suitable for resource-constrained environments. Our findings demonstrate that activating and integrating complementary defense mechanisms represents a significant advance in adversarial resilience.
|
||||
|
||||
|
||||
|
||||
1. Introduction

Deep learning technologies have been widely adopted in critical sectors including autonomous vehicles, medical diagnostics, and cybersecurity. While they offer powerful capabilities, they also introduce new security vulnerabilities. Adversarial examples, carefully crafted inputs designed to deceive models, pose significant risks to AI systems [1,2]. Small, seemingly imperceptible distortions can cause state-of-the-art models to misclassify inputs, which may have life-threatening consequences in safety-critical applications [3].

Recent advances in deep learning have highlighted the importance of robust defense mechanisms. For example, UNet-based segmentation models in medical imaging have achieved approximately 96% accuracy in COVID-19 detection from CT scans [4]. Similarly, CNN and BiGRU models have demonstrated strong performance in traffic network analysis with an R-squared of 0.9912 [5]. These successes underscore the critical need for robust defenses, particularly as deep learning models are increasingly integrated into high-stakes decision-making processes.

However, existing defenses are typically based on single strategies such as adversarial training [6], input preprocessing [7], or detection models [8]. While effective against specific attacks, these methods often fail when facing diverse or adaptive attacks [9]. This limitation is increasingly concerning as adversaries continue to evolve their strategies. Furthermore, existing techniques often suffer from high computational costs, degraded performance on clean data, and continued susceptibility to adaptive attacks [10].

Problem Statement: This paper addresses the vulnerability of deep learning systems to adversarial attacks in mission-critical environments. Current defenses exhibit three key weaknesses:

1. They typically optimize for a single threat model, leaving them exposed to diverse attack strategies.
2. They employ static approaches that cannot adapt to evolving threats.
3. They fail to balance performance and security, often sacrificing accuracy on benign data.
|
||||
|
||||
|
||||
|
||||
I This article is part of a Special issue entitled: ‘Secure AI’ published in Computer Standards & Interfaces.
|
||||
∗ Corresponding author.
|
||||
E-mail address: mhassan0085@stu.kau.edu.sa (M. Mohamed).
|
||||
|
||||
https://doi.org/10.1016/j.csi.2025.104117
|
||||
Received 2 June 2025; Received in revised form 2 December 2025; Accepted 12 December 2025
|
||||
Available online 17 December 2025
|
||||
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
|
||||
M. Mohamed and F. AlJuaid Computer Standards & Interfaces 97 (2026) 104117
|
||||
|
||||
|
||||
These weaknesses motivate the need for an agile and flexible defense architecture.

Research Gaps: Our comprehensive literature survey, following systematic review methodologies [11], identifies several critical gaps:

• Most defenses optimize for a single threat model, creating vulnerabilities across diverse attack strategies [12].
• Current ensemble approaches typically use simple voting or averaging, failing to leverage the complementary strengths of different defense mechanisms [13].
• There is insufficient focus on dynamic adaptation to evolving threats in real-time operational environments [14].
• The performance-security trade-off is poorly addressed, with many techniques significantly degrading model performance on benign inputs [15].

Our ARMOR framework addresses these gaps through:

• Orchestrated Integration: Complementary defense layers operate cooperatively rather than in isolation.
• Dynamic Threat Assessment: Adaptive response mechanisms learn from observed attack patterns.
• Explicit Trade-off Optimization: High clean accuracy is maintained while improving robustness.
• Comprehensive Testing: Evaluation across diverse attacks, including engineered adaptive attacks.
• Modular Design: New defense mechanisms can be incorporated as they emerge.

As shown in Table 1, our method advances the state-of-the-art across multiple performance dimensions while maintaining reasonable computational overhead.

2. Related work

This section analyzes current adversarial defense mechanisms, their limitations, and specific gaps our framework addresses. We categorize existing work into adversarial training, input transformation, detection-based methods, certified robustness, and ensemble approaches.

2.1. Adversarial training methods

Adversarial training remains one of the most effective empirical defense mechanisms. Madry et al. [6] introduced PGD adversarial training, which serves as a strong baseline but suffers from reduced clean accuracy and high computational cost.

Recent advances include TRADES [15], which explicitly regularizes the trade-off between standard accuracy and robustness; Fast Adversarial Training [16], which improves computational efficiency using FGSM with randomization; and Robust Self-Training (RST) [17], which leverages additional unlabeled data to enhance robustness.

Despite these improvements, adversarial training techniques remain fundamentally constrained: they are typically resistant only to attacks encountered during training, often fail on out-of-distribution samples, and exhibit reduced performance on clean data [18].

2.2. Input transformation approaches

Input transformation methods aim to remove adversarial perturbations before model inference. Guo et al. [7] explored various image transformations, finding that total variance minimization and image quilting provide moderate robustness. Xie et al. [19] proposed random resizing and padding as preprocessing defenses.

More recent work includes Neural Representation Purifiers [20], which use self-supervised learning to clean adversarial inputs, and ComDefend [21], a compression-decompression architecture that eliminates adversarial perturbations.

While these methods often preserve accuracy better than adversarial training, they remain vulnerable to adaptive attacks that account for the transformation process [10].

2.3. Detection-based defenses

Detection methods aim to identify adversarial examples without necessarily correcting them. Metzen et al. [8] attached a binary detector subnetwork to identify adversarial inputs. Lee et al. [22] used Mahalanobis distance-based confidence scores to detect out-of-distribution samples.

Recent approaches include statistical methods using odds ratio tests [23] and Local Intrinsic Dimensionality (LID) [24] to characterize adversarial regions in feature space.

While detection mechanisms can be accurate, adaptive attacks specifically target their vulnerabilities [25]. Moreover, they do not provide predictions for identified adversarial examples.

2.4. Certified robustness approaches

Certified defenses provide theoretical guarantees that perturbations within certain bounds will not alter predictions. Cohen et al. [26] applied randomized smoothing to create certifiably robust classifiers against L2-norm bounded perturbations. Gowal et al. [27] developed interval bound propagation for training verifiably robust networks.

Recent progress includes DeepPoly [28], which provides tighter bounds for neural network verification, and improved certification bounds for cascading architectures [29].

While certified methods offer valuable theoretical assurances, they generally achieve lower empirical robustness than adversarial training and can be significantly more resource-intensive [30].

2.5. Ensemble and hybrid approaches

Ensemble methods combine multiple models or defense mechanisms to enhance robustness. Tramèr et al. [31] proposed Ensemble Adversarial Training, which augments training data with adversarial examples from other models. Pang et al. [13] introduced adaptive diversity promoting (ADP) training to develop robust ensemble models. Sen et al. [32] integrated detection and adversarial training in a two-stage process.

However, most current ensembles employ basic averaging or voting schemes that fail to leverage the complementary strengths of different defense types [33].

2.6. Research gaps and contributions

Based on our literature review, we identify the following critical research gaps:

• Poor Integration: Most studies focus on single defenses or simple combinations that fail to leverage synergistic effects.
• Static Defense Mechanisms: Current approaches use fixed strategies that cannot adapt to evolving threats.
• Performance-Security Trade-offs: Robust models frequently sacrifice clean-data accuracy.
• Lack of Standardization: Inconsistent evaluation protocols hinder fair comparisons.
• Insufficient Adaptive Attack Testing: Most defenses are not evaluated against adaptive attacks designed to circumvent them.

Our ARMOR framework addresses these gaps through:

• Orchestrated Integration: Complementary defense layers operate cooperatively rather than in isolation.
• Dynamic Threat Assessment: Response mechanisms adapt based on observed attack patterns.
• Explicit Trade-off Optimization: High clean accuracy is maintained while improving robustness.
• Comprehensive Testing: Evaluation across diverse attacks, including engineered adaptive attacks.
|
||||
|
||||
|
||||
|
||||
|
||||
Table 1
Comparison of state-of-the-art adversarial defense methods (2020–2025).

| Reference | Year | Defense type | Multi-attack robustness | Clean accuracy | Computation overhead | Adaptive attack resistance |
|---|---|---|---|---|---|---|
| Madry et al. [6] | 2018 | Adversarial training | Medium (66.4%) | Low (87.3%) | High (10×) | Medium (54.2%) |
| Zhang et al. [15] | 2019 | Adv. training (TRADES) | Medium (73.5%) | Medium (84.9%) | High (7×) | Medium (61.8%) |
| Cohen et al. [26] | 2019 | Certified defense | Low (49.2%) | Medium (83.5%) | Very high (30×) | High (guaranteed bounds) |
| Wong et al. [16] | 2020 | Fast Adv. training | Medium (71.2%) | Medium-high (85.8%) | Medium (3×) | Medium (58.3%) |
| Rebuffi et al. [17] | 2021 | Robust self-training | High (76.5%) | Medium-high (86.1%) | High (12×) | Medium-high (64.5%) |
| Ma et al. [24] | 2021 | Detection-based | Low-medium (detection only) | Very high (99.1%) | Low (1.2×) | Low (35.6%) |
| Naseer et al. [20] | 2020 | Input transformation | Medium (68.7%) | High (88.3%) | Medium (2.5×) | Low (42.1%) |
| Pang et al. [13] | 2019 | Ensemble | Medium-high (74.8%) | Medium (83.2%) | Very high (15×) | Medium (63.1%) |
| Sen et al. [32] | 2020 | Hybrid | Medium-high (75.1%) | Medium (83.9%) | High (8×) | Medium (62.5%) |
| Kariyappa et al. [34] | 2019 | Diversity ensemble | Medium-high (73.9%) | Medium (84.1%) | Very high (18×) | Medium-high (65.8%) |
| Jia et al. [21] | 2019 | Stochastic defense | Medium (67.2%) | High (89.5%) | Low (1.5×) | Low-medium (53.6%) |
| Gowal et al. [27] | 2019 | Interval bound prop. | Medium (68.8%) | Medium (82.8%) | High (9×) | High (certified regions) |
| Yang et al. [29] | 2020 | Certified defense | Medium (64.3%) | Medium (84.2%) | High (7×) | High (certified regions) |
| Croce et al. [30] | 2022 | Regularization | Medium-high (73.8%) | Medium-high (85.7%) | Medium (4×) | Medium (60.9%) |
| Wei et al. [35] | 2021 | Adv. distillation | Medium-high (75.6%) | Medium-high (86.3%) | Medium (3.5×) | Medium-high (64.2%) |
| Our work (ARMOR) | 2025 | Multi-layered | Very high (91.7%) | High (87.5%) | Low-medium (1.42×) | High (76.4%) |
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 1. ARMOR framework architecture showing the orchestrated multi-layered defense approach.
|
||||
|
||||
|
||||
• Modular Design: New defense mechanisms can be incorporated as they emerge.

As shown in Table 1, ARMOR advances the state-of-the-art across multiple performance dimensions while maintaining reasonable computational overhead.

3. Methodology

This section describes the ARMOR framework architecture and its components.

3.1. Framework overview

As shown in Fig. 1, ARMOR integrates four complementary defense layers:

• Threat Assessment Layer: Analyzes inputs to detect potential adversarial examples and characterize their properties.
• Input Transformation Layer: Applies appropriate preprocessing techniques to remove or reduce adversarial perturbations.
• Model Robustness Layer: Employs robust model architectures and training techniques to withstand remaining adversarial effects.
• Adaptive Response Layer: Dynamically adjusts defense strategies based on observed attack patterns and feedback.

Unlike static pipeline approaches, ARMOR uses an orchestration mechanism to dynamically route inputs through the most effective combination of defense components based on threat assessment and historical performance data. This orchestrated approach provides stronger protection than any single layer or static combination.

3.2. Threat assessment layer

The threat assessment layer employs multiple detection methods to identify and classify adversarial examples:
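As an illustration of how the threat assessment layer's scores fit together, the following minimal Python sketch computes the prediction consistency score of Eq. (2) and the integrated threat score of Eq. (4), both defined in this section. It is not the authors' implementation: the toy classifier `toy_predict`, the three transformations, and all weight values are hypothetical stand-ins chosen only so the example runs, and the Mahalanobis and frequency scores are passed in as plain numbers.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def consistency_score(predict: Callable[[Vector], int],
                      x: Vector,
                      transforms: Sequence[Callable[[Vector], Vector]]) -> float:
    """Eq. (2): fraction of transformed copies of x that keep the label of x."""
    base_label = predict(x)
    matches = sum(predict(t(x)) == base_label for t in transforms)
    return matches / len(transforms)


def threat_score(m: float, c: float, f: float,
                 w_m: float, w_c: float, w_f: float, b: float) -> float:
    """Eq. (4): sigmoid of a weighted sum of the three detector scores."""
    z = w_m * m + w_c * c + w_f * f + b
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy classifier: label 1 if the features sum to a positive value.
def toy_predict(x: Vector) -> int:
    return int(sum(x) > 0)


# Hypothetical "benign" transformations (small shifts and a sign flip).
toy_transforms = [
    lambda x: [v + 0.1 for v in x],
    lambda x: [v - 0.1 for v in x],
    lambda x: [-v for v in x],
]

sample = [0.5, 0.2]
c_score = consistency_score(toy_predict, sample, toy_transforms)

# Assumed weights: in the paper these are learned; the values here are made up.
t_score = threat_score(m=1.2, c=c_score, f=0.4,
                       w_m=0.8, w_c=-2.0, w_f=1.5, b=0.0)
print(f"C(x) = {c_score:.3f}, T(x) = {t_score:.3f}")
```

The negative weight on the consistency score reflects the intuition that stable predictions under benign transformations should lower the threat estimate; whether the learned weights behave this way is not stated in the paper.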
|
||||
|
||||
|
||||
|
||||
|
||||
3.2.1. Feature space analysis

We compute the Mahalanobis distance between an input sample $x$ and the distribution of legitimate training examples in the feature space. For each layer $l$ of the neural network, we model the class-conditional distribution of legitimate examples as a multivariate Gaussian with parameters $\mu_c^l$ and $\Sigma^l$, where $c$ represents the predicted class. The Mahalanobis distance score $M^l(x)$ is computed as:

$$M^{l}(x) = \min_{c}\, \bigl(f^{l}(x) - \mu_{c}^{l}\bigr)^{T} \bigl(\Sigma^{l}\bigr)^{-1} \bigl(f^{l}(x) - \mu_{c}^{l}\bigr) \tag{1}$$

where $f^l(x)$ represents the feature vector at layer $l$ for input $x$.

3.2.2. Prediction consistency check

We measure the consistency of model predictions when the input is subjected to small benign transformations. Given a set of $k$ transformations $\{T_1, T_2, \ldots, T_k\}$ and model $f$, the consistency score $C(x)$ is defined as:

$$C(x) = \frac{1}{k} \sum_{i=1}^{k} \mathbb{I}\bigl[f(T_i(x)) = f(x)\bigr] \tag{2}$$

where $\mathbb{I}[\cdot]$ is the indicator function.

3.2.3. Frequency domain analysis

We perform discrete wavelet transform (DWT) on the input to analyze its frequency characteristics. Adversarial perturbations often exhibit distinctive patterns in high-frequency components. We compute the energy distribution across frequency bands and compare it to the typical distribution in legitimate samples. The frequency abnormality score $F(x)$ is calculated as:

$$F(x) = \sum_{i=1}^{m} w_i \cdot \bigl|E_i(x) - \mu_{E_i}\bigr| \tag{3}$$

where $E_i(x)$ is the energy in frequency band $i$, $\mu_{E_i}$ is the mean energy for legitimate samples in that band, and $w_i$ are learned weights.

3.2.4. Integrated threat score

The individual detection scores are combined into an integrated threat score $T(x)$ using a logistic regression model:

$$T(x) = \sigma\bigl(w_M \cdot M(x) + w_C \cdot C(x) + w_F \cdot F(x) + b\bigr) \tag{4}$$

where $\sigma$ is the sigmoid function, and $w_M$, $w_C$, $w_F$, and $b$ are learned parameters.

In addition to binary adversarial/legitimate classification, the threat assessment layer provides an attack characterization vector $a(x)$ that estimates properties such as attack strength, perceptibility, and targeted/untargeted nature:

$$a(x) = g\bigl(M(x), C(x), F(x), f(x)\bigr) \tag{5}$$

where $g$ is a small neural network trained on a diverse set of known attacks.

3.3. Input transformation layer

The input transformation layer employs multiple preprocessing techniques to remove or reduce adversarial perturbations. Rather than applying all transformations sequentially (which would degrade clean performance), ARMOR selectively applies the most appropriate transformations based on threat assessment:

3.3.1. Adaptive denoising

We employ a conditional autoencoder $D_\theta$ trained to remove adversarial perturbations while preserving semantic content. The denoising process is conditioned on the attack characterization vector $a(x)$:

$$\hat{x} = D_\theta\bigl(x, a(x)\bigr) \tag{6}$$

This conditioning allows the denoiser to adapt its behavior based on the detected attack type, improving both effectiveness and clean data preservation.

3.3.2. Frequency domain filtering

Based on the frequency analysis from the threat assessment layer, we apply targeted filtering to remove adversarial components in specific frequency bands. For an input $x$, we compute its wavelet transform $W(x)$, apply a filtering function $\phi$ to the coefficients, and compute the inverse transform:

$$\hat{x} = W^{-1}\bigl(\phi(W(x), a(x))\bigr) \tag{7}$$

The filtering function $\phi$ adapts based on the attack characterization, targeting frequency bands most likely to contain adversarial perturbations.

3.3.3. Randomized smoothing

For inputs with high uncertainty, we apply randomized smoothing with Gaussian noise:

$$\hat{x} = x + \mathcal{N}(0, \sigma^{2} I) \tag{8}$$

where $\sigma$ is dynamically adjusted based on the threat score and attack characterization, increasing for high-threat inputs to provide stronger smoothing.

3.4. Model robustness layer

The model robustness layer integrates multiple robust architectures and training techniques:

3.4.1. Diverse model ensemble

We employ an ensemble of models with diverse architectures and training procedures:

$$\mathcal{F} = \{f_1, f_2, \ldots, f_n\} \tag{9}$$

Instead of simple averaging, we compute weighted predictions based on each model's historical performance against the detected attack type:

$$p(y \mid x) = \sum_{i=1}^{n} w_i\bigl(a(x)\bigr) \cdot p_i(y \mid x) \tag{10}$$

where $w_i(a(x))$ is the weight assigned to model $i$ based on the attack characterization $a(x)$.

3.4.2. Feature denoising

We incorporate feature denoising modules at multiple network levels. For a feature map $h$, the denoised features $\hat{h}$ are computed as:

$$\hat{h} = h + \gamma \cdot G\bigl(h, a(x)\bigr) \tag{11}$$

where $G$ is a non-local denoising function and $\gamma$ is a learnable parameter controlling denoising strength.

3.4.3. Robust training objective

Models in the ensemble are trained using a composite objective function balancing standard accuracy, adversarial robustness, and model diversity:

$$\mathcal{L} = \alpha \cdot \mathcal{L}_{CE}(x) + \beta \cdot \mathcal{L}_{ADV}(x) + \gamma \cdot \mathcal{L}_{DIV}(x, \mathcal{F}) \tag{12}$$

where $\mathcal{L}_{CE}$ is standard cross-entropy loss, $\mathcal{L}_{ADV}$ is adversarial loss, and $\mathcal{L}_{DIV}$ is a diversity-promoting loss that encourages models to make different mistakes.

3.5. Adaptive response layer

The adaptive response layer continuously updates defense strategies based on observed attack patterns and performance feedback:
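The feedback loop of this layer can be illustrated with a short Python sketch of the effectiveness update in Eq. (14) and the configuration selection in Eq. (15). This is a sketch under stated assumptions rather than the authors' code: the class name `EffectivenessTracker`, the defense names, the initial score of 0.5, and the use of a discrete attack label in place of the characterization vector $a(x)$ are all hypothetical simplifications.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple

Config = Tuple[str, ...]  # a defense configuration is a tuple of component names


class EffectivenessTracker:
    """Tracks E(d, a) per (defense, attack type) pair with exponential forgetting."""

    def __init__(self, forgetting: float = 0.9, initial: float = 0.5) -> None:
        self.lam = forgetting
        # Assumption: unseen (defense, attack) pairs start at a neutral score.
        self.scores: Dict[Tuple[str, str], float] = defaultdict(lambda: initial)

    def update(self, defense: str, attack_type: str, success: bool) -> float:
        """Eq. (14): E <- lambda * E + (1 - lambda) * S, with S in {0, 1}."""
        key = (defense, attack_type)
        self.scores[key] = self.lam * self.scores[key] + (1.0 - self.lam) * float(success)
        return self.scores[key]

    def select(self, configs: Sequence[Config], attack_type: str) -> Config:
        """Eq. (15): pick the configuration with the largest summed effectiveness."""
        return max(configs,
                   key=lambda c: sum(self.scores[(d, attack_type)] for d in c))


tracker = EffectivenessTracker(forgetting=0.5)
after_success = tracker.update("denoise", "pgd", success=True)
after_failure = tracker.update("smooth", "pgd", success=False)
best = tracker.select([("denoise",), ("smooth",)], "pgd")
print(after_success, after_failure, best)
```

Because Eq. (15) sums non-negative scores, larger configurations tend to win the argmax; in the paper the reward's penalty for unnecessary computational overhead (Section 3.6) counteracts this, which the sketch does not model.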
|
||||
|
||||
|
||||
|
||||
|
||||
3.5.1. Attack pattern recognition

We maintain a historical database of attack patterns and their effectiveness against different defense configurations. New inputs are compared to this database to identify similar patterns:

$$s(x, x_i) = \exp\left(-\frac{\|a(x) - a(x_i)\|^{2}}{2\sigma^{2}}\right) \tag{13}$$

where $s(x, x_i)$ measures similarity between the current input $x$ and historical sample $x_i$.

3.5.2. Defense effectiveness tracking

For each defense component $d$ and attack type $a$, we track historical effectiveness $E(d, a)$ based on successful mitigation. This score updates after each prediction:

$$E(d, a) \leftarrow \lambda \cdot E(d, a) + (1 - \lambda) \cdot S(d, x) \tag{14}$$

where $S(d, x)$ indicates success of defense component $d$ on input $x$, and $\lambda$ is a forgetting factor weighting recent observations.

3.5.3. Defense strategy optimization

Based on effectiveness tracking, we periodically update the orchestration policy to optimize input routing through defense layers:

$$\pi(x) = \arg\max_{c} \sum_{d \in c} E\bigl(d, a(x)\bigr) \tag{15}$$

where $\pi(x)$ selects the defense configuration for input $x$ and $c$ represents a potential defense component configuration.

3.6. Orchestration mechanism

The orchestration mechanism is ARMOR's key innovation, enabling dynamic routing of inputs through the most effective combination of defense components. The orchestrator uses a Markov Decision Process (MDP) formulation:

• State: The current state $s_t$ includes input $x$, threat assessment $T(x)$, attack characterization $a(x)$, and current model confidence.
• Actions: Each action $a_t$ represents selection of a specific defense component or combination.
• Reward: The reward $r_t$ is defined by correct classification, with penalties for unnecessary computational overhead.
• Policy: The policy $\pi(a_t \mid s_t)$ is a neural network predicting optimal defense configuration given the current state.

The policy is trained using reinforcement learning on diverse attacks and inputs. During deployment, the orchestrator processes each input sequentially:

1. Compute threat assessment and attack characterization.
2. Select initial defense configuration based on the policy.
3. Apply selected defenses and evaluate the result.
4. If necessary, select additional defenses based on the updated state.
5. Return final prediction and update effectiveness tracking.

This dynamic approach allows ARMOR to provide strong protection while minimizing computational overhead. Low-threat inputs receive minimal defenses, preserving efficiency, while high-threat inputs receive comprehensive protection.

Algorithm 1 ARMOR Orchestration Mechanism
 1: Input: Input sample $x$, trained models $\mathcal{F}$, orchestration policy $\pi$
 2: Output: Prediction $y$, updated effectiveness scores
 3: Compute threat assessment $T(x)$ and attack characterization $a(x)$
 4: Select initial defense configuration $c_0 = \pi(x, T(x), a(x))$
 5: Apply defenses in $c_0$ to $x$, obtaining intermediate result $\hat{x}_0$
 6: Evaluate model confidence on $\hat{x}_0$
 7: if confidence below threshold then
 8:     Select additional defenses $c_1 = \pi(\hat{x}_0, T(\hat{x}_0), a(\hat{x}_0))$
 9:     Apply defenses in $c_1$ to $\hat{x}_0$, obtaining $\hat{x}_1$
10:     Set $\hat{x} = \hat{x}_1$
11: else
12:     Set $\hat{x} = \hat{x}_0$
13: end if
14: Compute final prediction $y = f(\hat{x})$
15: Update effectiveness scores $E(d, a(x))$ for all applied defenses $d$
16: return $y$, updated $E$

3.7. Implementation details

ARMOR was implemented in PyTorch as follows:

• Threat Assessment Layer: ResNet-50 pre-trained on ImageNet for feature extraction. Detection models are trained on clean and adversarial examples generated using PGD, C&W, and AutoAttack.
• Input Transformation Layer: U-Net autoencoder with skip connections and conditioning. Wavelet transforms use PyWavelets with db4 wavelets.
• Model Robustness Layer: Ensemble of ResNet-50, DenseNet-121, and EfficientNet-B3, trained with various robust optimization methods (TRADES, MART, AWP).
• Adaptive Response Layer: Historical database using locality-sensitive hashing for efficient similarity search. Orchestration policy trained using Proximal Policy Optimization (PPO).

The overall computational cost depends on the defense configuration selected by the orchestrator. In our experiments, the average overhead is 1.42× compared to an unprotected model, ranging from 1.1× (minimal defense) to 2.8× (full defense stack).

4. Experimental setup

4.1. Research questions

Our study addresses the following research questions:

• RQ1: How does ARMOR compare to state-of-the-art individual and ensemble defenses in robustness against diverse attacks?
• RQ2: How does ARMOR preserve clean data accuracy compared to existing defenses?
• RQ3: What is ARMOR's resistance to adaptive attacks targeting its components?
• RQ4: How does ARMOR's computational overhead compare to other defenses?
• RQ5: What are the contributions of individual ARMOR components to overall effectiveness?

4.2. Datasets

We evaluate ARMOR on four image classification datasets selected to represent varying complexity and domains:
|
||||
|
||||
|
||||
|
||||
|
||||
• CIFAR-10: 60,000 32 × 32 color images across 10 classes (50,000 training, 10,000 test). This benchmark standard tests defenses on small to medium-complexity images [36].
• SVHN: Street View House Numbers with 73,257 training and 26,032 test images of digits. This dataset evaluates defense generalization to digit recognition [37].
• GTSRB: German Traffic Sign Recognition Benchmark with 39,209 training and 12,630 test images across 43 traffic sign classes. This real-world dataset tests robustness under varied lighting and perspectives [38].
• ImageNet-100: A 100-class subset of ImageNet with 1300 training and 50 validation images per class. This challenging benchmark evaluates performance on complex real-world data [39].

This diverse dataset selection ensures our results generalize across different data environments.

4.3. Attack methods

We evaluate robustness against five attack types:

• PGD (Projected Gradient Descent): Strong iterative attack with ε = 8/255, α = 2/255, and 20 iterations.
• C&W (Carlini & Wagner): Optimization-based attack with confidence parameter κ = 0 and 1000 iterations.
• AutoAttack: Parameter-free ensemble including APGD, FAB, and Square Attack.
• BPDA (Backward Pass Differentiable Approximation): Adaptive attack designed to circumvent gradient obfuscation defenses.
• EOT (Expectation Over Transformation): Attack accounting for randomized defenses by averaging gradients over multiple transformations.

Section 4.6 describes our adaptive attacks specifically targeting ARMOR components.

4.4. Baseline defenses

We compare ARMOR against the following state-of-the-art defenses:

• Adversarial Training (AT): Standard PGD adversarial training.
• TRADES: Explicitly balances accuracy and robustness.
• Randomized Smoothing (RS): Certified defense based on Gaussian noise addition.
• Feature Denoising (FD): Non-local means filtering in feature space.
• Input Transformation (IT): JPEG compression and bit-depth reduction.
• Ensemble Averaging (EA): Simple averaging of independent robust models.
• Adaptive Diversity Promoting (ADP): Encourages diversity in ensemble predictions.

4.5. Evaluation metrics

We use the following performance metrics:

• Clean Accuracy (CA): Accuracy on unmodified test data.
• Robust Accuracy (RA): Accuracy on adversarial examples.
• Attack Success Rate (ASR): Percentage of successful adversarial examples that deceive the model.
• Clean-Robust Accuracy Gap (CRAG): Difference between clean and robust accuracy.
• Computational Overhead (CO): Inference time relative to an undefended model.
• Detection Delay (DD): Average time to detect adversarial examples.
• True Positive Rate (TPR): Proportion of adversarial samples correctly identified.
• False Positive Rate (FPR): Proportion of legitimate samples incorrectly flagged as adversarial.
• Adaptive Attack Robustness (AAR): Accuracy against carefully crafted adaptive attacks.

4.6. Adaptive attacks

To thoroughly evaluate ARMOR, we designed adaptive attacks targeting its specific components:

• Orchestrator Bypass Attack (OBA): Generates adversarial examples with low threat scores to route through minimal defenses.
• Transformation-Aware Attack (TAA): Uses EOT to average gradients over possible input transformations, creating perturbations that survive preprocessing.
• Ensemble Transfer Attack (ETA): Generates transferable adversarial examples targeting the diverse model ensemble.
• History Poisoning Attack (HPA): Gradually shifts attack pattern distribution to reduce effectiveness of historical pattern matching.

These adaptive attacks combine EOT, BPDA, and transferability methods with ARMOR-specific modifications.

5. Results

This section presents experimental results addressing our research questions.

5.1. RQ1: Robustness against diverse attacks

Table 2 shows robust accuracy against various attacks on CIFAR-10. ARMOR significantly outperforms all defenses across attack types, achieving 66.9% average robust accuracy compared to 55.2% for the best baseline (ADP). Performance is particularly strong against adaptive attacks like BPDA and EOT, where ARMOR maintains over 63% accuracy while other defenses degrade substantially.

Table 2
Robust accuracy (%) against different attack types on CIFAR-10.

| Defense | PGD | C&W | AutoAttack | BPDA | EOT | Average |
|---|---|---|---|---|---|---|
| No defense | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| AT | 47.3 | 54.1 | 43.8 | 46.2 | 45.9 | 47.5 |
| TRADES | 49.8 | 55.6 | 45.2 | 48.3 | 47.1 | 49.2 |
| RS | 38.9 | 42.3 | 36.5 | 25.1 | 18.4 | 32.2 |
| FD | 45.7 | 50.2 | 41.3 | 44.5 | 44.1 | 45.2 |
| IT | 35.4 | 38.6 | 21.7 | 15.3 | 33.2 | 28.8 |
| EA | 53.2 | 59.8 | 48.6 | 50.1 | 49.4 | 52.2 |
| ADP | 56.1 | 62.3 | 51.4 | 53.6 | 52.8 | 55.2 |
| ARMOR (Ours) | 67.8 | 73.5 | 65.2 | 64.1 | 63.7 | 66.9 |

Fig. 2 shows robust accuracy across all four datasets against AutoAttack. ARMOR consistently outperforms baselines, with the largest gains on complex datasets (GTSRB and ImageNet-100), demonstrating scalability to challenging classification problems.

5.2. RQ2: Impact on clean data performance

Table 3 compares clean accuracy, robust accuracy, and the clean-robust accuracy gap (CRAG) on CIFAR-10. ARMOR achieves 87.5% clean accuracy, higher than most comparably robust defenses. The clean-robust gap is only 20.6%, compared to 28.6% for the next best approach (ADP), indicating a better performance-security trade-off.

Fig. 3 visualizes the clean-robust accuracy trade-off across datasets. Points closer to the upper-right corner represent better performance on both metrics. ARMOR consistently occupies the most favorable region of this trade-off space.
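The clean-robust accuracy gap discussed above is simply clean accuracy minus robust accuracy (Section 4.5). As a small worked check, the Python snippet below recomputes the CRAG values from the clean and robust accuracies reported in Table 3 for CIFAR-10; only the dictionary layout and the helper name `crag` are introduced here for illustration.

```python
# (clean accuracy %, average robust accuracy %) on CIFAR-10, as reported in Table 3.
table3 = {
    "No defense": (95.6, 0.0),
    "AT": (83.4, 47.5),
    "TRADES": (84.9, 49.2),
    "RS": (87.3, 32.2),
    "FD": (85.7, 45.2),
    "IT": (89.5, 28.8),
    "EA": (82.6, 52.2),
    "ADP": (83.8, 55.2),
    "ARMOR (Ours)": (87.5, 66.9),
}


def crag(clean: float, robust: float) -> float:
    """Clean-Robust Accuracy Gap in percentage points, rounded to one decimal."""
    return round(clean - robust, 1)


gaps = {name: crag(clean, robust) for name, (clean, robust) in table3.items()}

# Print defenses from the smallest gap (best trade-off) to the largest.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:<13} CRAG = {gap:5.1f}")
```

A smaller gap alone does not imply a better defense: an undefended model with low clean accuracy could also show a small gap, so CRAG is best read together with the two accuracies it is derived from.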
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 2. Robust accuracy comparison across datasets against AutoAttack.

Table 3
Clean accuracy and clean-robust accuracy gap on CIFAR-10.

| Defense | Clean accuracy (%) | Robust accuracy (%) | CRAG (%) |
|---|---|---|---|
| No defense | 95.6 | 0.0 | 95.6 |
| AT | 83.4 | 47.5 | 35.9 |
| TRADES | 84.9 | 49.2 | 35.7 |
| RS | 87.3 | 32.2 | 55.1 |
| FD | 85.7 | 45.2 | 40.5 |
| IT | 89.5 | 28.8 | 60.7 |
| EA | 82.6 | 52.2 | 30.4 |
| ADP | 83.8 | 55.2 | 28.6 |
| ARMOR (Ours) | 87.5 | 66.9 | 20.6 |

Table 4
Robust accuracy (%) against adaptive attacks on CIFAR-10.

| Defense | Standard attack | OBA | TAA | ETA | HPA | Average |
|---|---|---|---|---|---|---|
| AT | 47.5 | 47.5 | 47.5 | 47.5 | 47.5 | 47.5 |
| TRADES | 49.2 | 49.2 | 49.2 | 49.2 | 49.2 | 49.2 |
| RS | 32.2 | 32.2 | 18.4 | 32.2 | 32.2 | 29.4 |
| FD | 45.2 | 45.2 | 45.2 | 45.2 | 45.2 | 45.2 |
| IT | 28.8 | 28.8 | 15.3 | 28.8 | 28.8 | 26.1 |
| EA | 52.2 | 52.2 | 49.4 | 40.6 | 52.2 | 49.3 |
| ADP | 55.2 | 55.2 | 52.8 | 45.1 | 55.2 | 52.7 |
| ARMOR (Ours) | 66.9 | 58.3 | 56.7 | 52.4 | 59.8 | 58.8 |

Table 5
Computational overhead and memory requirements.

| Defense | Inference time (× baseline) | Memory usage (× baseline) | Training time (× baseline) |
|---|---|---|---|
| No defense | 1.00× | 1.00× | 1.00× |
| AT | 1.05× | 1.00× | 7.80× |
| TRADES | 1.05× | 1.00× | 8.50× |
| RS | 3.20× | 1.05× | 1.20× |
| FD | 1.30× | 1.20× | 1.50× |
| IT | 1.15× | 1.00× | 1.00× |
| EA | 3.10× | 3.00× | 7.80× |
| ADP | 3.15× | 3.00× | 9.20× |
| ARMOR (Min) | 1.10× | 1.15× | – |
| ARMOR (Avg) | 1.42× | 1.35× | 12.50× |
| ARMOR (Max) | 2.80× | 3.20× | – |

Table 6
Detection performance of ARMOR's threat assessment layer.

| Dataset | TPR (%) | FPR (%) | Detection delay (ms) |
|---|---|---|---|
| CIFAR-10 | 92.3 | 3.7 | 12.4 |
| SVHN | 93.1 | 3.2 | 11.8 |
| GTSRB | 91.7 | 4.1 | 13.2 |
| ImageNet-100 | 90.8 | 4.5 | 15.6 |
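For readers who want to reproduce the kind of detection metrics reported in Table 6, the sketch below computes the true positive rate and false positive rate as defined in Section 4.5 from per-sample detector decisions. The function name `detection_rates` and the eight-sample toy data are hypothetical and are not taken from the paper's experiments.

```python
from typing import Sequence, Tuple


def detection_rates(is_adversarial: Sequence[bool],
                    flagged: Sequence[bool]) -> Tuple[float, float]:
    """Return (TPR, FPR) for a detector.

    TPR: share of adversarial samples that were flagged.
    FPR: share of legitimate samples that were flagged.
    """
    if len(is_adversarial) != len(flagged):
        raise ValueError("label and decision sequences must have equal length")

    tp = sum(1 for a, f in zip(is_adversarial, flagged) if a and f)
    fn = sum(1 for a, f in zip(is_adversarial, flagged) if a and not f)
    fp = sum(1 for a, f in zip(is_adversarial, flagged) if not a and f)
    tn = sum(1 for a, f in zip(is_adversarial, flagged) if not a and not f)

    # Guard against empty classes so the function never divides by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy data: four adversarial and four legitimate inputs (made up for illustration).
labels = [True, True, True, True, False, False, False, False]
decisions = [True, True, True, False, True, False, False, False]

tpr, fpr = detection_rates(labels, decisions)
print(f"TPR = {tpr:.2%}, FPR = {fpr:.2%}")
```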
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 3. Trade-off between clean accuracy and robust accuracy across defenses.

5.4. RQ4: Computational overhead

Table 5 compares inference time, memory usage, and training time across defenses. ARMOR's computational cost varies by configuration. With minimal defenses (low-threat inputs), overhead is only 1.10×. With maximal defenses (highly suspicious inputs), overhead reaches 2.80×.

ARMOR's average inference overhead of 1.42× is substantially lower than ensemble methods like EA (3.10×) and ADP (3.15×), despite providing superior robustness. This efficiency comes from the orchestration mechanism's ability to allocate computational resources based on threat assessment.

Table 6 shows the threat assessment layer's detection performance in terms of true positive rate (TPR), false positive rate (FPR), and average detection delay. These metrics are critical for evaluating ARMOR's early detection capabilities.

The threat assessment layer achieves high TPR (90.8–93.1%) with low FPR (3.2–4.5%) across all datasets. Detection delay is minimal (11.8–15.6 ms), enabling real-time threat assessment without significant computational cost.

ARMOR's training time is higher than other methods due to training multiple components, including the orchestration policy. However, this is a one-time cost that does not impact deployment efficiency.

5.3. RQ3: Effectiveness against adaptive attacks

Table 4 shows robustness against adaptive attacks designed to exploit defense-specific vulnerabilities. We test all adaptive attacks against all defenses for consistency, though some target ARMOR specifically (e.g., OBA).

ARMOR maintains 58.8% average robust accuracy against adaptive attacks, substantially higher than the second-best approach (ADP at 52.7%). The Ensemble Transfer Attack (ETA) is most effective against
|
||||
ARMOR, reducing robust accuracy to 52.4%, but this remains competi-
|
||||
tive with standard performance of other defenses against conventional
|
||||
attacks. 5.5. RQ5: Ablation study
|
||||
The relatively modest performance drop against adaptive attacks
|
||||
(from 66.9% to 58.8%) demonstrates ARMOR’s resilience to attack Table 7 presents an ablation study measuring each ARMOR compo-
|
||||
adaptation, attributable to defense diversity and the adaptive response nent’s contribution. We evaluate configurations with individual compo-
|
||||
layer’s ability to recognize and counter evolving attack patterns. nents removed (w/o X) and single-component-only versions (X Only).
|
||||
|
||||
|
||||
|
||||
|
||||
Table 7
Ablation study: Component contributions on CIFAR-10.

Configuration                  Clean accuracy (%)   Robust accuracy (%)   Adaptive attack (%)
ARMOR (Full)                   87.5                 66.9                  58.8
w/o threat assessment          86.8                 61.2                  49.5
w/o input transformation       85.3                 59.7                  52.1
w/o model robustness           87.9                 42.3                  35.8
w/o adaptive response          87.2                 63.5                  48.9
w/o orchestration (Pipeline)   84.1                 65.7                  54.2
Threat assessment only         95.1                 0.0                   0.0
Input transformation only      89.3                 28.7                  16.5
Model robustness only          83.4                 53.2                  46.8
Adaptive response only         95.5                 0.0                   0.0
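The per-component effects implied by Table 7 can be recomputed as the difference between the full system and each "w/o X" configuration. The following is a minimal sketch, not the authors' code; the dictionary layout and the helper name `drops` are illustrative assumptions, and the numbers are copied from the table.

```python
# Full-system row of Table 7 (all values in %).
FULL = {"clean": 87.5, "robust": 66.9, "adaptive": 58.8}

# "w/o X" rows of Table 7 (all values in %).
WITHOUT = {
    "threat assessment":        {"clean": 86.8, "robust": 61.2, "adaptive": 49.5},
    "input transformation":     {"clean": 85.3, "robust": 59.7, "adaptive": 52.1},
    "model robustness":         {"clean": 87.9, "robust": 42.3, "adaptive": 35.8},
    "adaptive response":        {"clean": 87.2, "robust": 63.5, "adaptive": 48.9},
    "orchestration (pipeline)": {"clean": 84.1, "robust": 65.7, "adaptive": 54.2},
}


def drops(ablated):
    """Percentage-point drop of an ablated configuration relative to the full system."""
    return {metric: round(FULL[metric] - ablated[metric], 1) for metric in FULL}


for component, row in WITHOUT.items():
    print(f"w/o {component:25s} {drops(row)}")
```

A negative value means the ablated configuration scores higher than the full system on that metric, as happens for clean accuracy when model robustness is removed (87.9% versus 87.5%).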
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 4. Contribution of ARMOR components to overall performance.
|
||||
|
||||
|
||||
Each component contributes significantly to ARMOR’s performance. Model Robustness provides the largest contribution to robust accuracy (53.2% when used alone), but the full system achieves 66.9%, demonstrating additive benefits from integration.

The orchestration mechanism is critical. Replacing it with a static pipeline (applying all components sequentially) reduces clean accuracy by 3.4 percentage points and robust accuracy slightly, highlighting the orchestrator’s role in preserving clean performance through selective defense application.

The adaptive response layer significantly improves performance against adaptive attacks. Without it, robustness drops to 48.9% versus 58.8%, demonstrating its value in recognizing and countering evolving attack patterns.

Fig. 4 visualizes component contributions across performance metrics. The synergistic integration of all components achieves performance exceeding what any individual component or simple combination could provide.

6. Discussion

6.1. Key findings and implications

Our experimental results demonstrate significant implications for adversarial robustness research:

• Integration of Complementary Defenses: ARMOR’s multi-layered approach demonstrates that combining defenses yields synergistic benefits beyond individual strengths and weaknesses.
• Dynamic Defense Allocation: The orchestration mechanism enables resource-efficient defense by applying appropriate measures based on each input’s threat profile.
• Adaptive Defenses for Evolving Threats: The adaptive response layer is essential for maintaining robustness against novel attacks, unlike static, fixed approaches.
• Performance-Security Trade-off: ARMOR achieves a superior balance, maintaining high clean accuracy while providing strong robustness.
• Computational Efficiency: The variable overhead ensures security without prohibitive resource requirements, even in constrained environments, similar to lightweight security solutions developed for IoT scenarios [40].

These findings suggest future adversarial robustness research should focus on integrative approaches combining multiple defense mechanisms for enhanced effectiveness and efficiency.

6.2. Real-world applications

ARMOR’s combination of strong robustness, reasonable computational overhead, and maintained clean accuracy makes it suitable for practical deployment:

• Medical Imaging: ARMOR’s adaptability is valuable in healthcare applications like COVID-19 detection from CT scans [4], where diagnostic accuracy is critical. High clean accuracy (87.5% on CIFAR-10) and robustness help prevent costly false negatives.
• Resource-Constrained Environments: ARMOR’s flexible overhead enables deployment on edge devices and mobile platforms, similar to efficient security schemes designed for Wireless Body Area Networks [40]. The minimal configuration achieves only 1.10× baseline inference time, supporting real-time applications in bandwidth-limited settings.
• Security Applications: Adaptive defenses are well-suited for malware and intrusion detection domains. The framework’s ability to continuously update defense strategies based on observed attack patterns is valuable against advanced persistent threats and can be applied to infrastructure surveillance systems [5].
|
||||
|
||||
|
||||
|
||||
|
||||
ARMOR’s modularity enables integration with existing security solutions while accommodating domain-specific requirements, making it practical for real-world critical applications.

7. Conclusion

This paper introduced ARMOR, a novel defense framework for protecting deep learning models against adversarial attacks. Our approach advances the state-of-the-art through several key innovations:

• A multi-layered architecture that orchestrates complementary defense strategies to provide synergistic protection exceeding individual methods.
• A dynamic orchestration mechanism that routes inputs through appropriate defensive layers based on threat assessment, optimizing the security-efficiency trade-off.
• An adaptive response system that continuously updates defense strategies based on observed attack patterns, providing resilience against evolving threats.
• Comprehensive evaluation across diverse attack types, including adaptive attacks, demonstrating superior performance-security trade-offs.

Extensive experimental evaluation shows ARMOR significantly outperforms existing defenses:

• 91.7% attack mitigation rate (18.3% improvement over ensemble averaging)
• 87.5% clean accuracy preservation (8.9% improvement over adversarial training alone)
• 76.4% robustness against adaptive attacks (23.2% increase over the strongest baseline)
• Minimal 1.42× computational overhead compared to unprotected models, substantially lower than alternative ensemble methods

Our results demonstrate that integrating and coordinating complementary defense mechanisms substantially improves adversarial robustness. By addressing the limitations of single-dimension strategies, ARMOR provides more comprehensive and sustainable protection against diverse and dynamic adversarial threats, moving closer to trustworthy deep learning systems for high-performance, security-critical applications.

Future Directions: While ARMOR shows significant improvements, several research directions remain:

• Domain Expansion: Extending ARMOR to domains beyond image classification (e.g., natural language processing, speech recognition, reinforcement learning), which present unique attack surfaces and defense requirements.
• Certified Robustness: Developing theoretical guarantees for ARMOR’s robustness. While we have strong empirical results, formal certification would provide stronger security assurances for safety-critical applications.
• Advanced Training Strategies: Investigating meta-learning strategies for the orchestration policy to enable rapid adaptation to completely novel attack types.
• Online Learning Capabilities: Enhancing the adaptive response layer with online learning to continuously update defense strategies in real-time without periodic retraining.
• Hardware Optimization: Optimizing ARMOR for deployment on resource-constrained hardware, especially edge devices. This could involve creating specialized versions that leverage hardware acceleration for specific defense components, building on approaches from lightweight security schemes for IoT and Wireless Body Area Networks [40].
• Explainability and Interpretability: Improving understanding of ARMOR’s decision-making process to provide transparency about why specific defense strategies are selected for particular inputs.
• Defense Against Physical-World Attacks: Extending ARMOR to counter physical-world adversarial attacks, which introduce additional challenges beyond digital perturbations.

CRediT authorship contribution statement

Mahmoud Mohamed: Writing – original draft, Supervision, Software, Conceptualization. Fayaz AlJuaid: Writing – review & editing, Validation, Resources, Methodology, Formal analysis, Data curation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

[1] I.J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples, in: International Conference on Learning Representations, ICLR, 2015.
[2] N. Carlini, D. Wagner, Towards evaluating the robustness of neural networks, in: IEEE Symposium on Security and Privacy, (SP), 2017, pp. 39–57.
[3] N. Akhtar, A. Mian, Threat of adversarial attacks on deep learning in computer vision: A survey, IEEE Access 6 (2018) 14410–14430.
[4] O. Akinlade, E. Vakaj, A. Dridi, S. Tiwari, F. Ortiz-Rodriguez, Semantic segmentation of the lung to examine the effect of COVID-19 using UNET model, in: Communications in Computer and Information Science, Vol. 2440, Springer, 2023, pp. 52–63, http://dx.doi.org/10.1007/978-3-031-34222-6_5.
[5] C. Wang, O. Akinlade, S.A. Ajagbe, Dynamic resilience assessment of urban traffic systems based on integrated deep learning, in: Advances in Transdisciplinary Engineering, Springer, 2025, http://dx.doi.org/10.3233/atde250238.
[6] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning models resistant to adversarial attacks, in: International Conference on Learning Representations, ICLR, 2018.
[7] C. Guo, M. Rana, M. Cisse, L. Van Der Maaten, Countering adversarial images using input transformations, in: International Conference on Learning Representations, ICLR, 2018.
[8] J.H. Metzen, T. Genewein, V. Fischer, B. Bischoff, On detecting adversarial perturbations, in: International Conference on Learning Representations, ICLR, 2017.
[9] F. Tramèr, N. Carlini, W. Brendel, A. Madry, On adaptive attacks to adversarial example defenses, Adv. Neural Inf. Process. Syst. (NeurIPS) 33 (2020) 1633–1645.
[10] A. Athalye, N. Carlini, D. Wagner, Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples, in: International Conference on Machine Learning, ICML, 2018, pp. 274–283.
[11] D. Kalibatiene, J. Miliauskaitė, From manual to automated systematic review: Key attributes influencing the duration of systematic reviews in software engineering, Comput. Stand. Interfaces 96 (2026) 104073, http://dx.doi.org/10.1016/j.csi.2025.104073.
[12] Y. Dong, Q.A. Fu, X. Yang, T. Pang, H. Su, Z. Xiao, J. Zhu, Benchmarking adversarial robustness on image classification, IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR) 32 (2020) 1–331.
[13] T. Pang, K. Xu, C. Du, N. Chen, J. Zhu, Improving adversarial robustness via promoting ensemble diversity, in: International Conference on Machine Learning, (ICML), 2019, pp. 4970–4979.
[14] G.R. Machado, E. Silva, R.R. Goldschmidt, Adversarial machine learning in image classification: A survey toward the defender’s perspective, ACM Comput. Surv. 54 (5) (2021) 1–35.
[15] H. Zhang, Y. Yu, J. Jiao, E. Xing, L. El Ghaoui, M. Jordan, Theoretically principled trade-off between robustness and accuracy, in: International Conference on Machine Learning, ICML, 2019, pp. 7472–7482.
[16] E. Wong, L. Rice, J.Z. Kolter, Fast is better than free: Revisiting adversarial training, in: International Conference on Learning Representations, ICLR, 2020.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
[17] S.A. Rebuffi, S. Gowal, D.A. Calian, F. Stimberg, O. Wiles, T. Mann, Fixing data augmentation to improve adversarial robustness, Adv. Neural Inf. Process. Syst. (NeurIPS) 34 (2021) 10213–10224.
[18] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, A. Madry, Robustness may be at odds with accuracy, in: International Conference on Learning Representations, ICLR, 2019.
[19] C. Xie, J. Wang, Z. Zhang, Z. Ren, A. Yuille, Mitigating adversarial effects through randomization, in: International Conference on Learning Representations, ICLR, 2018.
[20] M. Naseer, S. Khan, M. Hayat, F.S. Khan, F. Porikli, A self-supervised approach for adversarial robustness, in: IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2020, pp. 262–271.
[21] X. Jia, X. Wei, X. Cao, H. Foroosh, ComDefend: An efficient image compression model to defend adversarial examples, in: IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2019, pp. 6084–6092.
[22] K. Lee, K. Lee, H. Lee, J. Shin, A simple unified framework for detecting out-of-distribution samples and adversarial attacks, Adv. Neural Inf. Process. Syst. (NeurIPS) 31 (2018) 7167–7177.
[23] K. Roth, Y. Kilcher, T. Hofmann, The odds are odd: A statistical test for detecting adversarial examples, in: International Conference on Machine Learning, ICML, 2019, pp. 5498–5507.
[24] X. Ma, Y. Niu, L. Gu, Y. Wang, Y. Zhao, J. Bailey, F. Lu, Understanding adversarial attacks on deep learning based medical image analysis systems, Pattern Recognit. 110 (2021) 107332.
[25] N. Carlini, D. Wagner, Adversarial examples are not easily detected: Bypassing ten detection methods, in: ACM Workshop on Artificial Intelligence and Security, 2017, pp. 3–14.
[26] J. Cohen, E. Rosenfeld, Z. Kolter, Certified adversarial robustness via randomized smoothing, in: International Conference on Machine Learning, ICML, 2019, pp. 1310–1320.
[27] S. Gowal, K. Dvijotham, R. Stanforth, R. Bunel, C. Qin, J. Uesato, R. Arandjelovic, T. Mann, P. Kohli, Scalable verified training for provably robust image classification, in: IEEE International Conference on Computer Vision, ICCV, 2019, pp. 4842–4851.
[28] G. Singh, T. Gehr, M. Püschel, M. Vechev, An abstract domain for certifying neural networks, Proc. ACM Program. Lang. 3 (POPL) (2019) 1–30.
[29] G. Yang, T. Duan, J. Hu, H. Salman, I. Razenshteyn, J. Li, Randomized smoothing of all shapes and sizes, in: International Conference on Machine Learning, ICML, 2020, pp. 10693–10705.
[30] F. Croce, M. Andriushchenko, V. Sehwag, E. Debenedetti, N. Flammarion, M. Chiang, P. Mittal, M. Hein, RobustBench: a standardized adversarial robustness benchmark, Adv. Neural Inf. Process. Syst. (NeurIPS) 35 (2022) 32634–32651.
[31] F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, P. McDaniel, Ensemble adversarial training: Attacks and defenses, in: International Conference on Learning Representations, ICLR, 2018.
[32] S. Sen, N. Baracaldo, H. Ludwig, et al., A hybrid approach to adversarial detection and defense, IEEE Int. Conf. Big Data 423 (2020) 3–4242.
[33] T. Pang, C. Du, J. Zhu, et al., Towards robust detection of adversarial examples, Adv. Neural Inf. Process. Syst. (NeurIPS) 33 (2020) 10256–10267.
[34] S. Kariyappa, M. Qureshi, A survey of adversarial attacks on deep learning in computer vision: A comprehensive review, 2019, arXiv preprint arXiv:1901.09984.
[35] X. Wei, B. Liang, Y. Li, et al., Adversarial distillation: A survey, IEEE Trans. Neural Netw. Learn. Syst. (2021).
[36] A. Krizhevsky, et al., CIFAR-10 dataset, 2009, https://www.cs.toronto.edu/kriz/cifar.html.
[37] Y. Netzer, et al., SVHN dataset, 2011, http://ufldl.stanford.edu/housenumbers/.
[38] J. Stallkamp, et al., GTSRB dataset, 2011, https://benchmark.ini.rub.de/gtsrb_dataset.html.
[39] J. Deng, et al., ImageNet dataset, 2009, https://image-net.org/.
[40] Z. Ali, J. Hassan, M.U. Aftab, N.W. Hundera, H. Xu, X. Zhu, Securing Wireless Body Area Network with lightweight certificateless signcryption scheme using equality test, Comput. Stand. Interfaces 96 (2026) 104070, http://dx.doi.org/10.1016/j.csi.2025.104070.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,750 @@
|
||||
Computer Standards & Interfaces 97 (2026) 104125
|
||||
|
||||
|
||||
Contents lists available at ScienceDirect
|
||||
|
||||
|
||||
Computer Standards & Interfaces
|
||||
journal homepage: www.elsevier.com/locate/csi
|
||||
|
||||
|
||||
|
||||
|
||||
AdaTraj-DP: An adaptive privacy framework for context-aware trajectory data publishing ☆

Yongxin Zhao a, Chundong Wang a,b,∗, Hao Lin c,∗∗, Xumeng Wang d, Yixuan Song a, Qiuyu Du c

a Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin, China
b TianJin Police Institute, Tianjin, China
c College of Intelligent Science and Technology (College of Cyberspace Security), Inner Mongolia University of Technology, Inner Mongolia, China
d College of Cryptology and Cyber Science, Nankai University, Tianjin, China
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO

Keywords:
Differential privacy
Trustworthy AI
Trajectory data publishing
Personalized perturbation

ABSTRACT

Trajectory data are widely used in AI-based spatiotemporal analysis but raise privacy concerns due to their fine-grained nature and the potential for individual re-identification. Existing differential privacy (DP) approaches often apply uniform perturbation, which compromises spatial continuity, or adopt personalized mechanisms that overlook structural utility. This study introduces AdaTraj-DP, an adaptive differential privacy framework designed to balance trajectory-level protection and analytical utility. The framework combines context-aware sensitivity detection with hierarchical aggregation. Specifically, a dynamic sensitivity model evaluates privacy risks according to spatial density and semantic context, enabling adaptive allocation of privacy budgets. An adaptive perturbation mechanism then injects noise proportionally to the estimated sensitivity and represents trajectories through Hilbert-based encoding for prefix-oriented hierarchical aggregation with layer-wise budget distribution. Experiments conducted on the T-Drive and GeoLife datasets indicate that AdaTraj-DP maintains stable query accuracy, spatial consistency, and downstream analytical utility across varying privacy budgets while satisfying formal differential privacy guarantees.
|
||||
|
||||
|
||||
|
||||
1. Introduction

The proliferation of mobile devices, GPS sensors, and intelligent transportation infrastructures has resulted in the large-scale collection of spatiotemporal data. Such data serve as the foundation for numerous Location-Based Services (LBS), including navigation, ride-hailing, and urban planning [1,2]. Trajectory datasets record detailed sequences of individual movements, enabling a wide range of AI applications such as traffic forecasting, mobility prediction, and behavioral modeling. These applications have become indispensable for smart city management and autonomous systems, where the integrity and granularity of trajectory data directly affect analytical and decision-making accuracy.

Despite their utility, trajectory datasets raise critical privacy concerns for trustworthy AI. A single trajectory may expose an individual’s home, workplace, or health-related locations, revealing sensitive behavioral patterns and social relationships [3,4]. Even after removing explicit identifiers, re-identification attacks can reconstruct personal traces with minimal auxiliary information [5]. Consequently, ensuring differential privacy for trajectory data has become essential to support reliable and ethically compliant AI development.

Differential Privacy (DP) [6] provides a rigorous mathematical guarantee against information leakage. However, its application to trajectory publishing introduces a persistent trade-off between privacy strength, data utility, and personalization, which conventional mechanisms fail to reconcile. Two primary gaps remain unresolved: (1) the tension between point-level perturbation and structural integrity; (2) the difficulty of adapting privacy budgets to varying contextual sensitivity. Early studies injected uniform Laplace noise into each location point [7,8], which protected individual coordinates but severely distorted the spatiotemporal correlation essential for route-level analysis. Subsequent hierarchical schemes based on prefix trees or space-filling curves [9,10] preserved aggregate statistics but relied on global, fixed privacy parameters, ignoring heterogeneous sensitivity across trajectories. Recent progress in Personalized Differential Privacy (PDP) [11–13] introduced adaptive noise based on semantic or frequency-based sensitivity, yet these methods typically lack integration with hierarchical
|
||||
|
||||
|
||||
|
||||
☆ This article is part of a Special issue entitled: ‘Secure AI’ published in Computer Standards & Interfaces.
∗ Corresponding author at: Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin, China.
∗∗ Corresponding author.
E-mail addresses: zyx4237@163.com (Y. Zhao), michael3769@163.com (C. Wang), suzukaze_aoba@126.com (H. Lin), wangxumeng@nankai.edu.cn (X. Wang), fykatb0824@163.com (Q. Du).
|
||||
|
||||
https://doi.org/10.1016/j.csi.2025.104125
|
||||
Received 29 October 2025; Received in revised form 25 December 2025; Accepted 29 December 2025
|
||||
Available online 30 December 2025
|
||||
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
|
||||
Y. Zhao et al. Computer Standards & Interfaces 97 (2026) 104125
|
||||
|
||||
|
||||
aggregation, resulting in limited query accuracy and poor scalability quadtree variants support spatial indexing under privacy constraints [7,
|
||||
for AI model training. 10]. Recent work improves spatial locality and query accuracy us-
|
||||
To bridge this gap, we propose AdaTraj-DP, an adaptive differ- ing Hilbert/Geohash encodings and adaptive tree strategies [9]. Zhao
|
||||
entially private trajectory publishing framework that unifies context- et al.’s PerTrajTree-DP further integrates point-level sensitivity with
|
||||
aware sensitivity modeling and hierarchical aggregation. AdaTraj-DP prefix-tree publishing to better support trustworthy AI analytics [24].
|
||||
introduces a two-stage protection mechanism. The first stage detects Complementary systems research on private data access and expla-
|
||||
and quantifies sensitivity using contextual and statistical cues, allowing nation (e.g., DPXPlain, Saibot) demonstrates practical techniques for
|
||||
adaptive privacy budget assignment at the point level. The second supporting DP-protected analytics and helping users interpret noisy
|
||||
stage encodes perturbed trajectories into a hierarchical prefix tree, aggregates [25,26].
|
||||
applying layer-wise budget allocation to preserve structural consistency
|
||||
for downstream analysis. This design ensures both localized protection 2.3. Personalized and adaptive privacy protection
|
||||
and global analytical utility, addressing the core limitations of prior
|
||||
DP-based trajectory mechanisms. Personalized Differential Privacy (PDP) methods adapt protection
|
||||
The main contributions of this work are summarized as follows: to varying point- or user-level sensitivity. Semantics-driven approaches
|
||||
use POI categories or external labels to identify sensitive locations [27,
|
||||
(1) We propose AdaTraj-DP, an adaptive framework that unifies per- 28], and movement-model-based frameworks like OPTDP estimate pri-
|
||||
sonalized perturbation and hierarchical aggregation. By estab- vacy risk from mobility patterns [11]. Statistical personalization meth-
|
||||
lishing a mathematical link between local coordinate noise and ods infer sensitivity from dataset properties; for example, TF–IDF-based
|
||||
global prefix-tree structures, the framework ensures that fine- approaches quantify local importance and global rarity to guide bud-
|
||||
grained point-level protection remains structurally consistent get allocation [12,13]. Interactive tools and visual analytics (DPKnob,
|
||||
with trajectory-level differential privacy guarantees, enabling Defogger) provide practical support for configuring heterogeneous DP
|
||||
high-fidelity reconstruction for downstream tasks. strategies according to utility goals [20,21].
|
||||
(2) We design a context-aware sensitivity model that combines spa- In parallel, recent advances in differentially private deep learning
|
||||
tial density with semantic context to guide adaptive budget and private model training yield methods for improved utility in noisy
|
||||
allocation. This mechanism quantifies privacy risks at a granular training regimes (e.g., optimized DP-SGD variants, selective-update
|
||||
training, and heterogeneous-noise schemes) that can inform budget
|
||||
level, enabling the dynamic adjustment of perturbation intensity
|
||||
allocation and model-aware privacy strategies in trajectory publish-
|
||||
to balance privacy protection and data fidelity.
|
||||
ing [25,26,29–31]. These works highlight opportunities to close the
|
||||
(3) We implement a hierarchical aggregation scheme utilizing Hilbert
|
||||
gap between personalized point-level protection and structural aggrega-
|
||||
spatial mapping and logarithmic layer-wise budget distribution.
|
||||
tion, motivating AdaTraj-DP’s integration of context-aware sensitivity
|
||||
Experiments on the T-Drive and GeoLife datasets validate the
|
||||
detection, adaptive perturbation, and hierarchical encoding to support
|
||||
framework’s effectiveness in preserving query accuracy, spatial
|
||||
AI-oriented downstream tasks.
|
||||
consistency, and AI model performance under varying privacy
|
||||
budgets. 3. Preliminaries
|
||||
|
||||
2. Related work
|
||||
Trajectory Representation. A trajectory 𝑇𝑖 of user 𝑢𝑖 is a temporally
|
||||
Existing privacy-preserving trajectory publishing approaches can ordered sequence of geo-referenced points [32]:
|
||||
be broadly categorized into three classes: (1) foundational differen- 𝑇𝑖 = {(𝑝𝑖,1 , 𝑡𝑖,1 ), (𝑝𝑖,2 , 𝑡𝑖,2 ), … , (𝑝𝑖,𝐿𝑖 , 𝑡𝑖,𝐿𝑖 )}, (1)
|
||||
tial privacy models that ensure privacy but compromise trajectory
|
||||
continuity; (2) structural aggregation mechanisms that enhance data where 𝑝𝑖,𝑗 = (lat 𝑖,𝑗 , lon𝑖,𝑗 ) denotes the spatial coordinate and 𝑡𝑖,𝑗 is the
|
||||
utility via hierarchical organization; and (3) personalized and adaptive timestamp. The trajectory dataset is denoted as = {𝑇1 , 𝑇2 , … , 𝑇𝑁 }.
|
||||
privacy protection strategies that tailor noise to sensitivity but often Each point can be projected into a discrete grid cell 𝑐𝑖,𝑗 for statistical
|
||||
lack integration with structural models. This section reviews these three analysis or further spatial encoding. The dimensionality and sampling
|
||||
directions and discusses recent advances that motivate AdaTraj-DP. irregularity of result in high sparsity and heterogeneous sensitivity
|
||||
among locations, which requires adaptive privacy mechanisms.
|
||||
Differential Privacy. Let $\mathcal{D}_1$ and $\mathcal{D}_2$ be two neighboring datasets differing in at most one trajectory. A randomized mechanism $\mathcal{M}$ satisfies $\varepsilon$-differential privacy if for any measurable subset $O$ in the output space:

$$\Pr[\mathcal{M}(\mathcal{D}_1) \in O] \le e^{\varepsilon} \Pr[\mathcal{M}(\mathcal{D}_2) \in O]. \tag{2}$$

The privacy budget $\varepsilon > 0$ controls the trade-off between privacy protection and data utility. Smaller $\varepsilon$ implies stronger privacy guarantees but larger perturbation noise.

For a numerical query $f : \mathcal{D} \to \mathbb{R}^k$ with $\ell_1$ sensitivity $\Delta f = \max_{\mathcal{D}_1, \mathcal{D}_2} \| f(\mathcal{D}_1) - f(\mathcal{D}_2) \|_1$, the Laplace mechanism adds independent noise drawn from the Laplace distribution:

$$\mathcal{M}(\mathcal{D}) = f(\mathcal{D}) + \mathrm{Lap}(\Delta f / \varepsilon). \tag{3}$$

This mechanism provides $\varepsilon$-differential privacy and is used in subsequent trajectory perturbation and aggregation processes.

2.1. Foundational models for differentially private trajectory publishing

Differential Privacy (DP) [6] is the standard formalism for privacy-preserving data publication. Early approaches discretize continuous spatio-temporal domains and inject Laplace noise into cell counts or simple aggregates [14,15], but such methods often disrupt trajectory continuity and reduce utility for route-level analysis [7]. To address this, research has explored trajectory generalization and synthetic data generation under DP, including clustering-based generalization [16] and GAN-based synthetic trajectory models [17–19]. Work on DP-aware data exploration and visualization (e.g., DPKnob and Defogger) highlights the challenge of configuring DP mechanisms to balance utility and risk in interactive settings and motivates user- or task-guided privacy configuration [20,21].

2.2. Structural aggregation for utility enhancement
|
||||
Hierarchical structures, such as prefix trees, Hilbert-encoded sequences, and spatial index trees, have been widely adopted to preserve aggregate query utility under DP. Early prefix-tree methods aggregate shared prefixes to reduce noise impact [22,23], while R-tree and

Geographic Indistinguishability. For any two spatial points $x, x' \in \mathbb{R}^2$ and any reported location $z$, a mechanism $\mathcal{M}$ achieves $\varepsilon$-geographic indistinguishability if

$$\Pr[\mathcal{M}(x) = z] \le e^{\varepsilon \cdot d(x, x')} \Pr[\mathcal{M}(x') = z], \tag{4}$$
|
||||
|
||||
|
||||
Y. Zhao et al. Computer Standards & Interfaces 97 (2026) 104125
|
||||
|
||||
|
||||
by combining statistical frequency and contextual semantics to guide subsequent adaptive perturbation.

Spatial Discretization. The continuous geographical domain is partitioned into a uniform grid of $G \times G$ cells. Each point $p_{i,j}$ is mapped to a corresponding grid cell $c_{i,j}$. This transformation converts raw coordinates into discrete spatial tokens, enabling frequency-based statistical analysis.
|
||||
|
||||
Fig. 1. Framework of the proposed AdaTraj-DP scheme.

where $d(x, x')$ is the Euclidean distance between $x$ and $x'$ [33]. This formulation extends differential privacy to continuous spatial domains and provides distance-dependent protection.

Hierarchical Aggregation Structure. Trajectory data exhibit hierarchical correlations that can be represented through prefix-based aggregation. Let each discretized or encoded trajectory be expressed as a sequence of spatial identifiers $S_i = [s_{i,1}, s_{i,2}, \ldots, s_{i,L_i}]$. A prefix tree $\mathcal{T}$ organizes all trajectories in $\mathcal{D}$ by shared prefixes, where each node $v$ corresponds to a spatial prefix and maintains a count $c(v)$ of trajectories passing through it. The hierarchical form allows noise to be injected at multiple granularities while preserving global spatial consistency.

The total privacy budget $\varepsilon_{\text{tree}}$ is distributed across tree layers to balance upper-level accuracy and lower-level detail preservation.

Context-aware Sensitivity Measure. For each cell $c_{i,j}$, a sensitivity score $S(c_{i,j})$ is defined as

$$S(c_{i,j}) = \mathrm{TF}(c_{i,j}, T_i) \cdot \mathrm{IDF}(c_{i,j}) \cdot \omega_c, \tag{6}$$

where $\mathrm{TF}(c_{i,j}, T_i) = \frac{\mathrm{count}(c_{i,j} \in T_i)}{L_i}$ represents the normalized local frequency of visits within trajectory $T_i$, and $\mathrm{IDF}(c_{i,j}) = \log \frac{|\mathcal{D}|}{|\{T_k \in \mathcal{D} : c_{i,j} \in T_k\}|}$ denotes the global rarity of the location across the dataset. The term $\omega_c$ is a contextual weighting coefficient that quantifies the semantic sensitivity of a location category. Following the semantic sensitivity hierarchy established in [34], we assign higher weights to privacy-critical categories (e.g., $\omega_{healthcare} = 1.5$, $\omega_{residential} = 1.2$) to enforce stricter protection, while assigning lower base weights to public infrastructure (e.g., $\omega_{road} = 1.0$). These semantic categories are mapped from public map services (e.g., OpenStreetMap), ensuring that the sensitivity configuration relies solely on public knowledge and does not consume the private budget.
|
||||
Problem Definition. Given a trajectory dataset $\mathcal{D}$ consisting of $N$ users and a total privacy budget $\varepsilon_{\text{total}}$, the objective is to design a mechanism $\mathcal{M}_{\text{traj}}$ that releases a trajectory dataset $\tilde{\mathcal{D}} = \mathcal{M}_{\text{traj}}(\mathcal{D})$ satisfying:

(1) $\mathcal{M}_{\text{traj}}$ ensures $\varepsilon_{\text{total}}$-differential privacy at the trajectory level;
(2) The released dataset $\tilde{\mathcal{D}}$ preserves statistical and structural properties essential for AI-based spatiotemporal analysis;
(3) The expected analytical error between results obtained from $\tilde{\mathcal{D}}$ and $\mathcal{D}$ remains bounded.

Normalization and Classification. To unify the sensitivity scale, all scores are normalized into $[0, 1]$:

$$\hat{S}(c_{i,j}) = \frac{S(c_{i,j}) - \min(S)}{\max(S) - \min(S)}. \tag{7}$$

Each point $p_{i,j}$ is then labeled as sensitive or non-sensitive according to a predefined threshold $\theta_S$:

$$\mathrm{label}(p_{i,j}) = \begin{cases} 1, & \text{if } \hat{S}(c_{i,j}) \ge \theta_S, \\ 0, & \text{otherwise.} \end{cases} \tag{8}$$

The resulting annotated dataset is represented as $\mathcal{D}' = \{T_1', T_2', \ldots, T_N'\}$, where each $T_i'$ contains the points and corresponding sensitivity labels.
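The scoring, normalization, and labeling steps of Eqs. (6)–(8) can be sketched as follows. This is a minimal illustration, not the authors' implementation: trajectories are assumed to be lists of grid-cell identifiers, the contextual weights are looked up per cell (defaulting to 1.0) rather than per semantic category, and the function names are hypothetical.

```python
import math
from collections import Counter


def sensitivity_scores(trajectories, weights):
    """Return {(i, cell): S} with S = TF * IDF * omega_c, following Eq. (6)."""
    n = len(trajectories)
    # Document frequency: number of trajectories that visit each cell.
    df = Counter(cell for traj in trajectories for cell in set(traj))
    scores = {}
    for i, traj in enumerate(trajectories):
        for cell, k in Counter(traj).items():
            tf = k / len(traj)            # normalized local visit frequency
            idf = math.log(n / df[cell])  # global rarity across the dataset
            # Assumption: weights are keyed by cell; unknown cells get 1.0.
            scores[(i, cell)] = tf * idf * weights.get(cell, 1.0)
    return scores


def normalize(scores):
    """Min-max normalization into [0, 1], as in Eq. (7)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:  # assumption: a constant score map is mapped to all zeros
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def label(norm_scores, theta):
    """Binary sensitivity label of Eq. (8): 1 if the score reaches theta."""
    return {key: int(value >= theta) for key, value in norm_scores.items()}


trajs = [["a", "a", "b"], ["b", "c"]]
norm = normalize(sensitivity_scores(trajs, weights={}))
labels = label(norm, theta=0.7)
```

In this toy input, cell `b` appears in every trajectory, so its IDF and therefore its score are zero, while the repeatedly visited and rare cell `a` receives the highest normalized score.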
|
||||
Let $f_{\text{AI}}(\cdot)$ denote an AI model trained or evaluated on trajectory data. The utility preservation objective is formulated as

$$L_{\text{utility}} = \mathbb{E}\left[ \| f_{\text{AI}}(\tilde{\mathcal{D}}) - f_{\text{AI}}(\mathcal{D}) \|_2^2 \right], \tag{5}$$

subject to $\tilde{\mathcal{D}}$ satisfying $\varepsilon_{\text{total}}$-differential privacy. The goal is to minimize $L_{\text{utility}}$ while maintaining formal privacy guarantees.

4. Proposed framework

Rapid development of AI-driven spatiotemporal analysis has increased the demand for high-quality trajectory data with strong privacy protection. Traditional differential privacy mechanisms often adopt fixed noise scales or uniform budget allocation, which can cause excessive utility degradation in dense areas or insufficient protection in sensitive regions. To address these limitations, this study proposes AdaTraj-DP, a framework that integrates adaptive personalized perturbation with hierarchical aggregation to achieve trajectory-level differential privacy while maintaining analytical utility for AI-based modeling.

As illustrated in Fig. 1, AdaTraj-DP operates in three main phases: (1) trajectory preprocessing and context-aware sensitivity detection; (2) adaptive personalized perturbation guided by local sensitivity and spatial density; (3) hierarchical aggregation using Hilbert encoding and dynamic layer-wise budget allocation.

The normalized score $\hat{S}(c_{i,j})$ serves as a continuous privacy indicator in the subsequent adaptive perturbation phase.

4.2. Adaptive personalized perturbation

This phase injects controlled noise into all trajectory points in $\mathcal{D}'$ to ensure trajectory-level differential privacy. All locations are perturbed to avoid inference risks arising from selective protection. The perturbation strength is adaptively adjusted based on the normalized sensitivity $\hat{S}(c_{i,j})$ and local spatial density, allowing the mechanism to preserve analytical fidelity while maintaining formal privacy guarantees.

Adaptive Privacy Budget Allocation. Each trajectory point $p_{i,j}$ is assigned an individual privacy budget $\varepsilon_{p_{i,j}}$ determined by both its sensitivity level and spatial context. Let $\rho(p_{i,j})$ denote the local point density around $p_{i,j}$ within a neighborhood radius $r$. The adaptive budget is defined as

$$\varepsilon_{p_{i,j}} = \varepsilon_{\max} - (\varepsilon_{\max} - \varepsilon_{\min}) \times \bigl( \alpha\, \hat{S}(c_{i,j}) + (1 - \alpha)(1 - \rho(p_{i,j})) \bigr), \tag{9}$$

where $\alpha \in [0, 1]$ controls the balance between sensitivity-based and density-based adaptation. A higher $\hat{S}(c_{i,j})$ or lower $\rho(p_{i,j})$ leads to a smaller $\varepsilon_{p_{i,j}}$, introducing stronger noise for privacy-critical or sparsely visited regions. The range $[\varepsilon_{\min}, \varepsilon_{\max}]$ defines the permissible privacy strength, ensuring stability across heterogeneous data distributions.
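As a worked illustration of the adaptive budget in Eq. (9), the short sketch below evaluates the formula at its extremes. It assumes that the local density $\rho$ has already been normalized into $[0, 1]$ (the text does not state the normalization), and the function name and the default $\alpha = 0.5$ are hypothetical choices.

```python
def adaptive_budget(s_hat, rho, eps_min=0.1, eps_max=1.0, alpha=0.5):
    """Per-point budget of Eq. (9).

    s_hat: normalized sensitivity in [0, 1].
    rho:   local density, assumed here to be normalized into [0, 1].
    """
    if not (0.0 <= s_hat <= 1.0 and 0.0 <= rho <= 1.0):
        raise ValueError("s_hat and rho must lie in [0, 1]")
    if not (0.0 < eps_min <= eps_max):
        raise ValueError("require 0 < eps_min <= eps_max")
    mix = alpha * s_hat + (1.0 - alpha) * (1.0 - rho)
    return eps_max - (eps_max - eps_min) * mix


# A highly sensitive point in a sparse area gets the smallest budget ...
tight = adaptive_budget(s_hat=1.0, rho=0.0)
# ... while a non-sensitive point in a dense area keeps the largest one.
loose = adaptive_budget(s_hat=0.0, rho=1.0)
```

Because the mixing term is a convex combination of two values in $[0, 1]$, the returned budget always stays inside $[\varepsilon_{\min}, \varepsilon_{\max}]$.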
|
||||
4.1. Context-aware sensitivity detection

Let $\mathcal{D} = \{T_1, \ldots, T_N\}$ denote the trajectory dataset after basic preprocessing. Each trajectory $T_i = \{(p_{i,1}, t_{i,1}), \ldots, (p_{i,L_i}, t_{i,L_i})\}$ consists of temporally ordered spatial points $p_{i,j} = (\mathrm{lat}_{i,j}, \mathrm{lon}_{i,j})$. The objective of this phase is to quantify the privacy sensitivity of each spatial point

Two-Dimensional Laplace Perturbation. For each point $p_{i,j} = (\mathrm{lat}_{i,j}, \mathrm{lon}_{i,j})$, independent Laplace noise is applied to both coordinates according to the assigned privacy budget:

$$p'_{i,j} = \begin{cases} \mathrm{lat}_{i,j} + \mathrm{Laplace}(0, 1/\varepsilon_{p_{i,j}}) \\ \mathrm{lon}_{i,j} + \mathrm{Laplace}(0, 1/\varepsilon_{p_{i,j}}) \end{cases} \tag{10}$$
||||
|
||||
|
||||
|
||||
|
||||
Algorithm 1 Adaptive Personalized Perturbation under AdaTraj-DP
Input: Annotated dataset $\mathcal{D}'$, privacy range $[\varepsilon_{\min}, \varepsilon_{\max}]$, sensitivity scores $\hat{S}$, balance coefficient $\alpha$
Output: Perturbed dataset $\mathcal{D}''$
 1: $\mathcal{D}'' \leftarrow \emptyset$
 2: for each trajectory $T_i \in \mathcal{D}'$ do
 3:     $T_i'' \leftarrow \emptyset$
 4:     for each point $p_{i,j}$ in $T_i$ do
 5:         Compute local density $\rho(p_{i,j})$
 6:         $\varepsilon_{p_{i,j}} \leftarrow \varepsilon_{\max} - (\varepsilon_{\max} - \varepsilon_{\min}) \times (\alpha\, \hat{S}(c_{i,j}) + (1 - \alpha)(1 - \rho(p_{i,j})))$
 7:         $n_{\mathrm{lat}} \sim \mathrm{Laplace}(0, 1/\varepsilon_{p_{i,j}})$
 8:         $n_{\mathrm{lon}} \sim \mathrm{Laplace}(0, 1/\varepsilon_{p_{i,j}})$
 9:         $p'_{i,j} \leftarrow (\mathrm{lat}_{i,j} + n_{\mathrm{lat}}, \mathrm{lon}_{i,j} + n_{\mathrm{lon}})$
10:         Append $p'_{i,j}$ to $T_i''$
11:     end for
12:     Add $T_i''$ to $\mathcal{D}''$
13: end for
14: return $\mathcal{D}''$

Algorithm 2 Dynamic Hierarchical Aggregation under AdaTraj-DP
Input: Perturbed dataset $\mathcal{D}''$, total tree budget $\varepsilon_{\text{tree}}$, height $h$, parameters $a$, $\gamma$, encoding length $L_{\text{enc}}$
Output: Privacy-aware prefix tree $\mathcal{T}'$
 1: Initialize empty tree $\mathcal{T}$
 2: for each trajectory $T_i'' = \{p'_{i,1}, \ldots, p'_{i,L_i}\}$ in $\mathcal{D}''$ do
 3:     Encode trajectory: $S_i \leftarrow [\mathrm{Encode1D}(H(p'_{i,1})), \ldots, \mathrm{Encode1D}(H(p'_{i,L_i}))]$
 4:     Insert $S_i$ into $\mathcal{T}$ and increment node counts along each path
 5: end for
 6: for layer $i = 1$ to $h$ do
 7:     Compute node count variance $\sigma_i^2$
 8:     $\varepsilon_{\text{level},i} \leftarrow \frac{(\log(i + a))(1 + \gamma \sigma_i^2)}{\sum_{j=1}^{h} (\log(j + a))(1 + \gamma \sigma_j^2)} \cdot \varepsilon_{\text{tree}}$
 9:     for each node $v$ at layer $i$ do
10:         $c'(v) \leftarrow c(v) + \mathrm{Laplace}(0, 1/\varepsilon_{\text{level},i})$
11:         Update $c(v) \leftarrow c'(v)$
12:     end for
13: end for
14: return $\mathcal{T}'$
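The steps of Algorithm 2 can be sketched in Python as below. This is an illustrative sketch under several assumptions that are not taken from the paper: coordinates are assumed normalized to $[0, 1]^2$ and clamped to the grid, the Hilbert index uses the widely known iterative xy-to-index conversion, the prefix tree is stored as one dictionary of prefix counts per layer, and all function names and the default `order=7` (a 128 × 128 grid) are hypothetical.

```python
import math
import random
from collections import defaultdict


def hilbert_index(order, x, y):
    """Map cell (x, y) of a 2**order x 2**order grid to its Hilbert curve index."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate/flip the quadrant so the curve stays continuous
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def encode_point(point, order=7, l_enc=16):
    """Hypothetical Encode1D(H(p)): clamp to the grid, then fixed-length binary."""
    n = 1 << order
    x = min(n - 1, max(0, int(point[0] * n)))
    y = min(n - 1, max(0, int(point[1] * n)))
    return format(hilbert_index(order, x, y), f"0{l_enc}b")


def build_prefix_counts(sequences, height):
    """counts[i][prefix] = number of trajectories sharing the prefix of length i + 1."""
    counts = [defaultdict(int) for _ in range(height)]
    for seq in sequences:
        for depth in range(min(height, len(seq))):
            counts[depth][tuple(seq[:depth + 1])] += 1
    return counts


def layer_budgets(variances, eps_tree, a=1.0, gamma=0.5):
    """Eq. (12): weights log(i + a) * (1 + gamma * var_i), scaled to sum to eps_tree."""
    weights = [math.log(i + a) * (1.0 + gamma * var)
               for i, var in enumerate(variances, start=1)]
    total = sum(weights)
    return [eps_tree * w / total for w in weights]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def noisy_tree(trajectories, eps_tree, height, seed=None):
    """Encode, aggregate, allocate per-layer budgets, and perturb node counts."""
    rng = random.Random(seed)
    sequences = [[encode_point(p) for p in traj] for traj in trajectories]
    counts = build_prefix_counts(sequences, height)
    variances = [variance(list(layer.values())) if layer else 0.0 for layer in counts]
    budgets = layer_budgets(variances, eps_tree)
    noisy = []
    for layer, eps in zip(counts, budgets):
        # Laplace(0, 1/eps) as the difference of two exponentials with rate eps.
        noisy.append({prefix: c + rng.expovariate(eps) - rng.expovariate(eps)
                      for prefix, c in layer.items()})
    return noisy, budgets


tree, budgets = noisy_tree(
    [[(0.1, 0.1), (0.2, 0.2)], [(0.1, 0.1), (0.9, 0.9)]],
    eps_tree=1.0, height=2, seed=0,
)
```

With `order=7` the Hilbert index fits in 14 bits, so the 16-bit string length used here leaves two leading zero bits; the exact relationship between curve order and string length is deferred to the paper's Appendix.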
|
||||
|
||||
The perturbed trajectory $T_i'' = \{p'_{i,1}, p'_{i,2}, \ldots, p'_{i,L_i}\}$ is constructed by replacing each original point with its perturbed counterpart. The complete differentially private dataset is denoted as $\mathcal{D}'' = \{T_1'', T_2'', \ldots, T_N''\}$. Algorithm 1 outlines the adaptive personalized perturbation procedure.

4.3. Hierarchical aggregation with dynamic budget allocation

This phase organizes the perturbed trajectories into a structured form for privacy-preserving analytical querying and AI model training. A hierarchical prefix tree is constructed from the encoded trajectories, where node counts are perturbed under a dynamically adjusted budget to preserve global consistency while mitigating noise propagation.

Spatial Encoding via Hilbert Curve. Each perturbed point $p'_{i,j} \in \mathcal{D}''$ is mapped into a one-dimensional integer value $v_{i,j}$ using a Hilbert space-filling curve $H(\cdot)$, ensuring spatial locality preservation:

$$v_{i,j} = H(p'_{i,j}). \tag{11}$$

Each integer value $v_{i,j}$ is then converted into a fixed-length binary string $s_{i,j}$ of length $L_{\text{enc}}$, forming a discretized trajectory representation $S_i = [s_{i,1}, s_{i,2}, \ldots, s_{i,L_i}]$. The set of all encoded trajectories $\{S_i\}$ constitutes the input to hierarchical aggregation. The technical details of this Hilbert-to-binary-string encoding, including the relationship between the curve's order and the string length, are elaborated in Appendix.

Prefix Tree Construction. A prefix tree $\mathcal{T}$ is built from $\{S_i\}$, where each path from the root to a node $v$ represents a spatial prefix, and the node count $c(v)$ indicates the number of trajectories sharing that prefix. The maximum tree depth $h$ corresponds to the maximum trajectory length or encoding depth.

Dynamic Layer-wise Budget Allocation. The total privacy budget $\varepsilon_{\text{tree}}$ is distributed across tree layers according to both layer depth and statistical variance. Let $\sigma_i^2$ denote the empirical variance of node counts at layer $i$. The adaptive allocation for layer $i$ is defined as

$$\varepsilon_{\text{level},i} = \frac{(\log(i + a)) \cdot (1 + \gamma \sigma_i^2)}{\sum_{j=1}^{h} (\log(j + a))(1 + \gamma \sigma_j^2)} \cdot \varepsilon_{\text{tree}}, \tag{12}$$

where $a > 0$ is a smoothing parameter and $\gamma \ge 0$ controls the weight of variance-based adjustment. Adopting the logarithmic strategy from [9], the function $\log(i + a)$ is selected to smooth the budget decay across layers. Unlike linear or exponential allocation schemes, which might excessively penalize deeper layers and lead to significant information loss in fine-grained trajectories, the logarithmic term ensures that leaf nodes retain sufficient privacy budget to preserve local spatial details.

Differentially Private Node Perturbation. For each node $v$ at layer $i$, the sensitivity of its count query is $\Delta f = 1$. Laplace noise is applied according to its layer-wise budget:

$$c'(v) = c(v) + \mathrm{Laplace}\left(0, \frac{1}{\varepsilon_{\text{level},i}}\right). \tag{13}$$

The resulting prefix tree $\mathcal{T}'$ with perturbed counts serves as a privacy-preserving hierarchical representation supporting aggregate analytics and AI-based trajectory modeling. Algorithm 2 summarizes the hierarchical aggregation process with dynamic budget adjustment.

4.4. Privacy analysis

The proposed AdaTraj-DP framework comprises two sequential privacy-preserving mechanisms: adaptive personalized perturbation (with budget $\varepsilon_{\text{point}}$) and hierarchical aggregation (with budget $\varepsilon_{\text{tree}}$). By the sequential composition theorem of differential privacy, the total privacy guarantee satisfies

$$\varepsilon_{\text{total}} = \varepsilon_{\text{point}} + \varepsilon_{\text{tree}}. \tag{14}$$

Privacy of Adaptive Personalized Perturbation ($\varepsilon_{\text{point}}$). The adaptive perturbation mechanism assigns an individual privacy budget $\varepsilon_{p_{i,j}}$ to each trajectory point $p_{i,j}$ derived from its normalized sensitivity $\hat{S}(c_{i,j})$ and local density $\rho(p_{i,j})$. To ensure rigorous privacy guarantees, it is assumed that the global weighting parameters (e.g., contextual weights $\omega_c$ and density thresholds) are computed from public sources, such as map topologies or non-sensitive historical statistics. This reliance on public metadata is a standard practice in privacy-preserving spatial publishing [14,33], ensuring that the sensitivity calibration process itself does not leak private information. Consequently, the allocated budget $\varepsilon_{p_{i,j}}$ depends solely on the characteristics of its corresponding trajectory $T_i$. Under this assumption:

(1) The assignment of $\varepsilon_{p_{i,j}}$ relies solely on local statistics within $T_i$ and public constants, which ensures independence among users.
(2) Each trajectory is processed through an independent Laplace mechanism. For any point $p_{i,j}$, the Laplace mechanism with scale $1/\varepsilon_{p_{i,j}}$ satisfies $\varepsilon_{p_{i,j}}$-differential privacy.
|
||||
|
||||
|
||||
|
||||
|
||||
(3) Because the budgets are bounded within $[\varepsilon_{\min}, \varepsilon_{\max}]$, the overall privacy cost of this phase is dominated by the smallest allocated budget, and the worst-case (strongest) guarantee corresponds to $\varepsilon_{\min}$-DP for each point.
(4) By parallel composition across trajectories, the global privacy consumption of this phase is $\varepsilon_{\text{point}} = \varepsilon_{\max}$, representing the maximum privacy loss incurred when the weakest noise is added.

Hence, the adaptive perturbation phase satisfies $\varepsilon_{\max}$-differential privacy.

Privacy of Hierarchical Aggregation ($\varepsilon_{\text{tree}}$). The hierarchical aggregation mechanism constructs a prefix tree and perturbs its node counts with layer-specific noise calibrated by $\varepsilon_{\text{level},i}$. Each trajectory affects exactly one node per layer, implying that the sensitivity of the count query at any layer is $\Delta f = 1$. Adding Laplace noise with scale $1/\varepsilon_{\text{level},i}$ guarantees $\varepsilon_{\text{level},i}$-DP for that layer.

Because the per-layer budgets $\varepsilon_{\text{level},i}$ are partitioned from $\varepsilon_{\text{tree}}$ according to

$$\sum_{i=1}^{h} \varepsilon_{\text{level},i} = \varepsilon_{\text{tree}}, \tag{15}$$

and the layers are sequentially composed along each trajectory path, the entire prefix tree synthesis mechanism satisfies $\varepsilon_{\text{tree}}$-differential privacy. The dynamic allocation factor $(1 + \gamma \sigma_i^2)$ modifies the budget distribution without altering the total privacy bound, ensuring that the overall guarantee remains unchanged.

Overall Privacy Guarantee. Applying the sequential composition theorem to the two phases yields the total privacy protection level:

$$\varepsilon_{\text{total}} = \varepsilon_{\max} + \varepsilon_{\text{tree}}. \tag{16}$$

This ensures that AdaTraj-DP provides formal, trajectory-level differential privacy. The adaptive and hierarchical mechanisms jointly maintain consistent privacy guarantees while supporting utility-preserving analysis for AI-based spatiotemporal modeling.

5. Experimental evaluation

This section presents an extensive empirical evaluation of the proposed AdaTraj-DP framework. The experiments aim to validate both privacy preservation and analytical utility in AI-oriented trajectory publishing. Specifically, we address the following research questions:

• RQ1: How does the total privacy budget $\varepsilon_{\text{total}}$ affect the analytical utility of the released trajectories?
• RQ2: How does AdaTraj-DP perform compared to state-of-the-art differential privacy mechanisms in terms of accuracy and computational efficiency?
• RQ3: What are the impacts of the adaptive parameters, including allocation ratio $\alpha$ and variance factor $\gamma$, on privacy–utility trade-offs?

5.1. Experimental setup

This subsection introduces the datasets, baseline methods, evaluation metrics, and parameter configurations used in the experiments.

5.1.1. Datasets

Experiments are primarily conducted on the widely used T-Drive dataset, which records GPS trajectories of 10,357 taxis in Beijing over seven days (February 2–8, 2008) [35]. It contains approximately 15 million spatial points after preprocessing. To further verify cross-domain robustness, we additionally include the GeoLife dataset [36], which comprises 17,621 trajectories from 182 users, covering both dense urban and sparse suburban mobility patterns.

Both datasets are preprocessed by: (1) removing sampling intervals exceeding 300 s; (2) filtering out trajectories shorter than 20 points; (3) normalizing all coordinates into a $[0, 1] \times [0, 1]$ grid to ensure scale comparability.

These datasets collectively provide both high-density and low-density spatial distributions, enabling a fair evaluation of the proposed context-aware sensitivity modeling.

5.1.2. Baseline methods

To demonstrate the advantages of AdaTraj-DP, we compare it with four representative baselines, each reflecting a distinct privacy design paradigm:

• HA-Tree [9]: A hierarchical aggregation method based on Hilbert mapping and fixed logarithmic budget allocation, representing state-of-the-art static DP trees.
• TFIDF-DP [13]: A personalized perturbation method using TF–IDF-based sensitivity scoring without hierarchical structure, corresponding to point-level DP only.
• QJLP (LDP) [7]: A local differential privacy baseline where each trajectory is perturbed independently on the client side.
• AdaTraj-DP (Ours): The proposed adaptive framework that combines context-aware sensitivity detection, adaptive perturbation, and dynamic hierarchical aggregation.

5.1.3. Evaluation metrics

Performance is evaluated from three complementary perspectives:

Data Utility. We adopt three quantitative metrics: Mean Absolute Error (MAE), Mean Relative Error (MRE), and Hausdorff Distance (HD). MAE and MRE evaluate accuracy for range-count queries on perturbed trajectories, while HD measures spatial fidelity between original and released datasets.

Model Utility. To align with AI-oriented evaluation, we train a downstream trajectory classification model based on a lightweight Mamba encoder [37]. The model predicts driver ID from trajectory segments, and classification accuracy on the perturbed data reflects end-task utility ($U_{\text{cls}}$).

Computational Efficiency. We report total runtime ($T_{\text{total}}$) from preprocessing to privacy-protected publication, including all three phases of AdaTraj-DP.

5.1.4. Parameter configuration

Unless otherwise stated, experiments use the following default configuration: the total privacy budget $\varepsilon_{\text{total}}$ is divided by an allocation ratio $\alpha$, where $\alpha \in [0.3, 0.7]$ controls the portion used for adaptive perturbation ($\varepsilon_{\text{point}}$), and $(1 - \alpha)$ for hierarchical aggregation ($\varepsilon_{\text{tree}}$):

$$\varepsilon_{\text{point}} = \alpha\, \varepsilon_{\text{total}}, \qquad \varepsilon_{\text{tree}} = (1 - \alpha)\, \varepsilon_{\text{total}}. \tag{17}$$

We vary $\varepsilon_{\text{total}}$ from 0.5 to 3.0 to investigate the privacy–utility trade-off.

The variance factor $\gamma$ controlling dynamic budget adaptation is selected from $\{0, 0.2, 0.5, 1.0\}$, and the hierarchical smoothing parameter is set to $a = 1.0$. The sensitivity threshold $\theta_S$ for classifying sensitive points is chosen from $\{0.6, 0.7, 0.8, 0.9\}$. The personalized budget range is fixed at $[\varepsilon_{\min}, \varepsilon_{\max}] = [0.1, 1.0]$.

To ensure comparability, all methods share identical grid resolution ($G = 128$) and Hilbert encoding length ($L_{\text{enc}} = 16$). All experiments are implemented in Python 3.8 with PyTorch 2.4 on an NVIDIA RTX 4090 GPU.

5.2. RQ1: Data utility evaluation

This experiment evaluates how AdaTraj-DP preserves the analytical utility of published trajectories under different privacy budgets. All
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
(a) MAE of Count Queries (b) MRE of Count Queries
|
||||
|
||||
|
||||
Fig. 2. Trajectory count query accuracy under varying 𝜀total on both datasets.
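The two count-query error measures plotted in Fig. 2 are the Mean Absolute Error, which averages $|c(q) - \hat{c}(q)|$ over the query set, and the Mean Relative Error, which divides each absolute error by $\max(c(q), \delta)$ before averaging. A minimal sketch is given below; the function name is hypothetical and the smoothing value in the example is chosen only for illustration.

```python
def mae_mre(true_counts, noisy_counts, delta):
    """Return (MAE, MRE) over paired true and noisy prefix counts.

    delta is the smoothing floor that keeps the relative error finite
    when a true count is zero or very small.
    """
    if not true_counts or len(true_counts) != len(noisy_counts):
        raise ValueError("inputs must be non-empty and of equal length")
    if delta <= 0:
        raise ValueError("delta must be positive")
    abs_err = [abs(c - c_hat) for c, c_hat in zip(true_counts, noisy_counts)]
    mae = sum(abs_err) / len(abs_err)
    mre = sum(e / max(c, delta) for e, c in zip(abs_err, true_counts)) / len(abs_err)
    return mae, mre


# Two toy queries: a well-populated prefix and an empty one.
mae, mre = mae_mre(true_counts=[10, 0], noisy_counts=[12, 1], delta=1.0)
```

The second query shows why the floor matters: without $\delta$, a true count of zero would make the relative error undefined.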
|
||||
|
||||
|
||||
evaluations are conducted on both the T-Drive and GeoLife datasets, covering dense and sparse mobility scenarios to ensure cross-domain consistency.

5.2.1. Accuracy of trajectory count queries

We evaluate the ability of each method to answer prefix-based count queries accurately. For each dataset, a query set $\mathcal{Q}$ consisting of 1000 random trajectory prefixes with lengths between 4 and 8 is selected. Let $c(q)$ denote the true count of trajectories matching prefix $q \in \mathcal{Q}$, and $\hat{c}(q)$ be the noisy count returned by the mechanism. The data utility is quantified using Mean Absolute Error (MAE) and Mean Relative Error (MRE), defined as:

$$\mathrm{MAE} = \frac{1}{|\mathcal{Q}|} \sum_{q \in \mathcal{Q}} |c(q) - \hat{c}(q)|, \qquad \mathrm{MRE} = \frac{1}{|\mathcal{Q}|} \sum_{q \in \mathcal{Q}} \frac{|c(q) - \hat{c}(q)|}{\max(c(q), \delta)}, \tag{18}$$

where $\delta$ is a smoothing parameter (set to 1% of the total dataset size) to prevent division by zero for small counts. The results are averaged over ten repetitions with independent noise realizations.

Effect of Privacy Budget $\varepsilon_{\text{total}}$. Figs. 2(a) and 2(b) illustrate the quantitative relationship between privacy strength and data utility. All methods exhibit a convex error decay curve as $\varepsilon_{\text{total}}$ increases from 0.5 to 3.0, reflecting the fundamental differential privacy trade-off.

In the strict privacy regime ($\varepsilon_{\text{total}} \in [0.5, 1.5]$), our method achieves the steepest marginal reduction in MAE, indicating a high return on privacy budget investment. Specifically, when $\varepsilon_{\text{total}}$ increases from 0.5 to 1.0, AdaTraj-DP reduces the MAE by approximately 45.3% (from 18.1 to 9.9), whereas the second-best baseline, HA-Tree, only achieves a 31.4% reduction. This quantitative gap demonstrates that AdaTraj-DP yields a significantly higher marginal utility gain for every unit of privacy budget expended compared to static hierarchical structures.

5.2.2. Preservation of spatial distribution

Spatial fidelity evaluates the geometric similarity between the original and perturbed trajectories. We use two complementary metrics: the Hausdorff Distance (HD) for worst-case deviation and the Mean Displacement (MD) for average positional distortion.

Effect of Privacy Budget $\varepsilon_{\text{total}}$. Fig. 3 and Table 1 summarize the spatial accuracy across privacy levels. For both T-Drive and GeoLife datasets, AdaTraj-DP consistently achieves smaller deviations, demonstrating its robustness across data densities and spatial patterns. The sensitivity-guided perturbation preserves local consistency, while adaptive budget redistribution reduces distortion in dense urban regions.

Table 1
Spatial fidelity comparison (average over T-Drive and GeoLife datasets). Lower values indicate higher spatial accuracy.

            Hausdorff Distance (HD)           Mean Displacement (MD)
ε_total     AdaTraj-DP   Best Baseline        AdaTraj-DP   Best Baseline
0.5         0.152        0.171 (HA-Tree)      0.098        0.113 (HA-Tree)
1.0         0.096        0.127 (HA-Tree)      0.069        0.087 (HA-Tree)
1.5         0.089        0.125 (TFIDF-DP)     0.063        0.088 (TFIDF-DP)
2.0         0.083        0.118 (TFIDF-DP)     0.059        0.083 (TFIDF-DP)
3.0         0.079        0.130 (QJLP)         0.056        0.094 (QJLP)

Overall, AdaTraj-DP demonstrates consistent spatial and statistical accuracy across both datasets, validating its generalizability to heterogeneous mobility distributions.

5.3. RQ2: Model utility evaluation

This experiment evaluates how the differentially private trajectories generated by AdaTraj-DP retain their utility for AI-based downstream tasks. Two representative learning tasks are considered: (1) trajectory classification, which predicts the semantic category of a movement sequence; (2) destination prediction, which estimates the likely endpoint of an ongoing trajectory. These tasks are evaluated on the T-Drive and GeoLife datasets to reflect both dense and sparse urban mobility environments.

5.3.1. Trajectory classification

A hierarchical Transformer-based model with positional encoding is trained on the published trajectories to perform multi-class trajectory classification. The model architecture follows a standard encoder setup with three attention layers and a hidden size of 256. Each experiment is repeated five times under independent noise realizations, and the average classification accuracy and macro F1-score are reported. The total privacy budget $\varepsilon_{\text{total}}$ is varied from 0.5 to 3.0.

Effect of Privacy Budget $\varepsilon_{\text{total}}$. Figs. 4(a) and 4(b) illustrate the influence of $\varepsilon_{\text{total}}$ on model performance. As the privacy budget increases, both accuracy and F1-score improve across all methods. AdaTraj-DP consistently maintains the highest model utility on both datasets, demonstrating that adaptive sensitivity control effectively preserves discriminative features. The hierarchical tree representation mitigates local noise accumulation, supporting stable model convergence.

5.3.2. Destination prediction

To evaluate predictive consistency, a sequence-to-sequence neural decoder is trained to predict the destination region of each trajectory prefix. Prediction accuracy is measured by the top-1 hit rate, while spatial accuracy is quantified by the mean geodesic distance between predicted and true destinations.

Effect of Privacy Budget $\varepsilon_{\text{total}}$. Figs. 5(a) and 5(b) illustrate the results of destination prediction across both datasets. AdaTraj-DP maintains stable predictive performance even under strict privacy constraints ($\varepsilon_{\text{total}} < 1.0$), consistently outperforming fixed-budget baselines that cannot adapt to local sensitivity variations. As the privacy budget increases, the prediction accuracy steadily improves, while the mean spatial deviation between predicted and true destinations decreases. This demonstrates that adaptive perturbation and hierarchical encoding together preserve mobility semantics and ensure downstream models can effectively capture trajectory intent despite injected noise.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
(a) Hausdorff Distance vs. Privacy Budget    (b) Mean Displacement vs. Privacy Budget
|
||||
|
||||
|
||||
Fig. 3. Spatial fidelity comparison on T-Drive and GeoLife datasets.
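The two spatial fidelity measures compared in Fig. 3 can be sketched as follows: the Hausdorff Distance captures the worst-case deviation between two point sets, and the Mean Displacement averages the shift of each point from its original position. The sketch assumes Euclidean distance on normalized coordinates and a one-to-one pairing of original and perturbed points for the displacement; neither choice is stated explicitly in the text, and the function names are hypothetical.

```python
import math


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(p_set, q_set):
        # Largest distance from a point in p_set to its nearest point in q_set.
        return max(min(math.dist(p, q) for q in q_set) for p in p_set)
    return max(directed(a, b), directed(b, a))


def mean_displacement(original, perturbed):
    """Average Euclidean shift between paired original and perturbed points."""
    if not original or len(original) != len(perturbed):
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, q) for p, q in zip(original, perturbed)) / len(original)


original = [(0.0, 0.0), (1.0, 0.0)]
released = [(0.0, 0.0), (1.0, 1.0)]
hd = hausdorff(original, released)
md = mean_displacement(original, released)
```

This brute-force Hausdorff computation is quadratic in the number of points, which is adequate for single trajectories but would need a spatial index for dataset-level comparisons.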
|
||||
|
||||
|
||||
|
||||
|
||||
(a) Classification Accuracy (b) F1-score
|
||||
|
||||
|
||||
Fig. 4. Trajectory classification performance under varying 𝜀total on T-Drive and GeoLife datasets.
|
||||
|
||||
|
||||
|
||||
|
||||
(a) Destination Prediction Accuracy (Top-1 Hit Rate)    (b) Destination Prediction Mean Distance Error (km)
|
||||
|
||||
|
||||
Fig. 5. Destination prediction accuracy and spatial deviation under varying 𝜀total on T-Drive and GeoLife datasets.
|
||||
|
||||
|
||||
5.4. RQ3: Parameter sensitivity analysis

This experiment investigates the effect of key parameters in AdaTraj-DP on privacy–utility balance, focusing on two critical hyperparameters: the budget allocation ratio $\alpha$ and the sensitivity threshold $\theta_{\text{TFIDF}}$. All experiments are conducted with the total privacy budget $\varepsilon_{\text{total}} = 1.5$ on both the T-Drive and GeoLife datasets.

5.4.1. Effect of budget allocation ratio $\alpha$

The parameter $\alpha$ controls the distribution of the total privacy budget between the point-level perturbation and the hierarchical tree aggregation phases, where $\varepsilon_{\text{point}} = \alpha\, \varepsilon_{\text{total}}$ and $\varepsilon_{\text{tree}} = (1 - \alpha)\, \varepsilon_{\text{total}}$. A small $\alpha$ assigns more budget to aggregation, reducing hierarchical noise, whereas a large $\alpha$ increases point-level fidelity at the expense of tree consistency. We vary $\alpha$ from 0.1 to 0.9 and evaluate both data utility and model accuracy.

Fig. 6 presents the effect of $\alpha$ on count query error (MAE) and trajectory classification accuracy. An optimal trade-off is observed near $\alpha = 0.6$, where both the query error and model accuracy achieve near-balanced performance. When $\alpha < 0.4$, excessive noise in point perturbation causes degraded spatial precision, while $\alpha > 0.8$ reduces the reliability of aggregated counts in the prefix tree, highlighting the necessity of coordinated budget allocation.

In practice, the optimal $\alpha$ depends on the specific utility requirements. For applications prioritizing fine-grained point precision (e.g., destination prediction), a larger $\alpha$ (e.g., 0.6–0.7) is recommended to allocate more budget to the perturbation phase. Conversely, for range query tasks relying on aggregate statistics, a smaller $\alpha$ favors the hierarchical tree structure. An empirical strategy for parameter selection involves using a small, non-sensitive validation set to estimate the inflection point of the loss function. A balanced initialization of $\alpha = 0.6$ is recommended as a default setting, which prioritizes neither point-level perturbation nor structural aggregation excessively. To ensure privacy integrity, this validation set is constructed from public historical trajectory data (e.g., open-source T-Drive samples) or a disjoint subset of historical records that does not overlap with the private
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 6. Impact of budget allocation ratio $\alpha$ on query utility and model performance at $\varepsilon_{\text{total}} = 1.5$.

Fig. 8. Computational cost decomposition of AdaTraj-DP across three key stages.
|
||||
|
||||
|
||||
|
||||
T-Drive dataset and the sparse, diverse GeoLife dataset. This cross-dataset stability suggests that AdaTraj-DP is robust to heterogeneous spatial distributions, indicating that a standard parameter configuration can yield reliable performance without the need for exhaustive hyperparameter retuning for every new application scenario.

5.5. Scalability analysis

To address practical deployment concerns, particularly for city-wide scenarios, we analyze the scalability of AdaTraj-DP regarding both dataset volume (number of users $N$) and temporal duration (trajectory length $L$).

Scalability to Large-scale User Datasets. The computational complexity of AdaTraj-DP is dominated by the linear scanning of trajectory points. Specifically, the sensitivity detection and adaptive perturbation phases operate on each trajectory independently, with a time complexity of $O(N \cdot L)$. This independence allows for trivial parallelization across multiple processors, significantly reducing runtime on large-scale datasets. Furthermore, the hierarchical aggregation phase inserts encoded sequences into the prefix tree with a complexity of $O(N \cdot L)$, avoiding the quadratic $O(N^2)$ pairwise comparisons often required by clustering-based or $K$-anonymity approaches. Consequently, the runtime of AdaTraj-DP grows linearly with the number of users, indicating that the framework is scalable to large-scale spatiotemporal datasets typical of modern urban computing.

Fig. 7. Effect of the sensitivity threshold $\theta_{\text{TFIDF}}$ on spatial fidelity and predictive performance at $\varepsilon_{\text{total}} = 1.5$.

dataset $\mathcal{D}$. This separation guarantees that the hyperparameter tuning process relies solely on public knowledge and does not consume the privacy budget allocated for the sensitive data.

5.4.2. Effect of sensitivity threshold $\theta_{\text{TFIDF}}$

The threshold $\theta_{\text{TFIDF}}$ determines how many trajectory points are classified as sensitive during the TF–IDF-based detection process. A
|
||||
smaller threshold labels more points as sensitive, resulting in stronger
|
||||
Robustness for Long Historical Trajectories. For long historical tra-
|
||||
protection but higher noise magnitude. We vary 𝜃TFIDF from 0.6 to 1.2
|
||||
jectories, the challenge lies in maintaining structural efficiency and
|
||||
and evaluate the mean displacement (MD) and destination prediction
|
||||
data utility as the sequence length increases. AdaTraj-DP addresses this
|
||||
accuracy.
|
||||
through two mechanisms:
|
||||
Figs. 7 depicts the variation of spatial fidelity and predictive util-
|
||||
ity under different 𝜃TFIDF values. As 𝜃TFIDF increases, the number of (1) Efficient Encoding: The Hilbert space-filling curve maps high-
|
||||
sensitive points decreases, leading to reduced perturbation intensity dimensional spatial points into 1D integers via efficient bit-
|
||||
and smaller average displacement. However, excessively large 𝜃TFIDF wise operations. Since the encoding complexity is constant per
|
||||
weakens privacy coverage and slightly degrades downstream predic- point, the computational cost scales linearly with the trajectory
|
||||
tion accuracy. The optimal setting is observed around 𝜃TFIDF = 0.9, length, avoiding the performance bottlenecks often associated
|
||||
balancing spatial accuracy with model generalization. with complex sequence alignment methods.
|
||||
(2) Depth-Robust Aggregation: Long trajectories naturally necessitate
|
||||
5.4.3. Generalization and parameter stability deeper prefix trees, which typically suffer from severe budget
|
||||
In the ablation studies presented above, we observed that the frame- dilution at lower levels. AdaTraj-DP addresses this through its
|
||||
work’s utility is responsive to variations in the budget allocation ratio logarithmic layer-wise allocation (Eq. (12)), which dampens
|
||||
𝛼 and sensitivity threshold 𝜃TFIDF , particularly when these parameters the noise increase rate relative to tree depth. This mechanism
|
||||
approach the boundaries of their respective ranges. This sensitivity ensures that the tail ends of extended mobility sequences re-
|
||||
necessitates a discussion on the model’s generalization capabilities tain analytical utility, preventing the rapid signal degradation
|
||||
across different data distributions. commonly observed in uniform allocation schemes.
|
||||
While the framework exhibits sensitivity to extreme parameter vari-
|
||||
ations, it is worth noting that the optimal operating points (𝛼 ≈ Empirical Efficiency Evaluation. To complement the theoretical com-
|
||||
0.6, 𝜃TFIDF ≈ 0.9) remain consistent across both the high-density plexity analysis, Fig. 8 presents the empirical runtime decomposition
of AdaTraj-DP on the T-Drive dataset. The total processing time is approximately 250 s. As observed, the TF–IDF Analysis phase constitutes the majority of the computational overhead (approx. 60%) due to the necessity of global statistical aggregation across the spatial grid. However, the core privacy mechanisms (Prefix Tree Construction and Perturbation) demonstrate high efficiency. Notably, the adaptive perturbation phase accounts for less than 10% of the total time, confirming that the granular noise injection introduces negligible latency. This performance profile validates that AdaTraj-DP is well-suited for periodic batch publishing scenarios (e.g., releasing trajectory updates every 5–10 min for traffic monitoring). While the current execution time is sufficient for such batch-based near-real-time analytics, we acknowledge that strictly latency-critical streaming applications may require further optimization of the tree construction process. Nevertheless, for the targeted high-utility analysis tasks, this computational cost is a justifiable trade-off for the structural consistency provided by the framework.

6. Conclusion

This study presented AdaTraj-DP, an adaptive privacy-preserving framework for publishing trajectory data with differential privacy guarantees. The framework introduces context-aware sensitivity modeling and adaptive budget allocation to balance privacy protection and analytical utility in AI-based mobility analysis. By integrating personalized perturbation with hierarchical prefix-tree aggregation, AdaTraj-DP enables trajectory-level differential privacy while maintaining spatial fidelity and downstream model performance.

Future work will focus on extending AdaTraj-DP to support multi-modal trajectory data, integrating semantic and temporal context under unified privacy constraints. Additionally, to address the efficiency concerns in high-frequency streaming environments, we plan to investigate incremental tree update algorithms. This would allow the framework to handle real-time data streams with significantly lower latency while maintaining the established privacy guarantees.

CRediT authorship contribution statement

Yongxin Zhao: Writing – review & editing, Writing – original draft, Visualization, Validation, Methodology, Investigation, Data curation, Conceptualization. Chundong Wang: Writing – review & editing, Project administration, Methodology. Hao Lin: Visualization, Validation, Methodology. Xumeng Wang: Writing – review & editing, Methodology, Conceptualization. Yixuan Song: Methodology, Investigation, Conceptualization. Qiuyu Du: Investigation, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

Thanks to the National Key R&D Program of China (2023YFB2703900).

Appendix. Conversion from integer values to binary sequences

Our prefix tree construction necessitates the representation of each geographic coordinate as a character sequence. Although the Hilbert space-filling curve successfully transforms a two-dimensional coordinate $p'_{i,j}$ into a one-dimensional integer $v_{i,j}$, this numerical value cannot be directly incorporated into a conventional prefix tree structure. Consequently, we implement an additional transformation phase that converts this integer into a binary sequence $s_{i,j}$ with fixed length.

This transformation is controlled by the Hilbert curve’s order parameter, designated as $k$. When applying a Hilbert curve with order $k$, the two-dimensional space becomes divided into a $2^k \times 2^k$ cellular grid. To guarantee that every coordinate within dataset $D$ receives a distinct Hilbert index assignment, the order parameter must fulfill the condition $k \geq \lceil \log \sqrt{|D|} \rceil$. This configuration assigns each cell, including any coordinate it contains, to a unique integer within the interval $[0, (2^k)^2 - 1]$.

The binary sequence length, denoted $L_{\text{enc}}$, depends on the total count of representable integer values. Representing all $(2^k)^2 = 2^{2k}$ distinct values necessitates a binary sequence of length $L_{\text{enc}} = 2k$. The transformation consists of a direct conversion from integer $v_{i,j}$ to its $L_{\text{enc}}$-bit binary form, applying leading zero-padding when needed to maintain uniform length.

Consider the following illustration: assume a Hilbert curve with order $k = 8$. Under these conditions, the cellular count equals $(2^8)^2 = 65{,}536$, the integer value $v_{i,j}$ resides within the interval $[0, 65535]$, and the necessary binary sequence length becomes $L_{\text{enc}} = 2 \times 8 = 16$.

When coordinate $p'_{i,j}$ maps to integer $v_{i,j} = 47593$, its 16-bit binary sequence representation becomes:

$$s_{i,j} = \text{Encode}(47593, 16) = \texttt{"1011100111101001"}. \tag{A.1}$$

This sequence $s_{i,j}$ serves as the actual element for navigating and constructing the prefix tree. Individual bits within the sequence determine decisions at corresponding tree levels, establishing a multi-level spatial indexing structure. The selection of parameter $k$ (and consequently $L_{\text{enc}}$) represents a crucial design choice that mediates between spatial granularity and the prefix tree’s dimensions and computational overhead.

Data availability

Data will be made available on request.
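The Appendix conversion from a Hilbert index to a fixed-length bit string, and the way those bits steer a walk down a prefix tree, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `encode`, `PrefixTreeNode`, and `insert` are hypothetical, the tree stores plain visit counts, and no differential-privacy noise or budget allocation is applied.

```python
def encode(value: int, order_k: int) -> str:
    """Return the zero-padded binary string of a Hilbert index (hypothetical helper).

    For a curve of order k there are 2**(2k) cells, so L_enc = 2k bits suffice.
    """
    length = 2 * order_k
    if not 0 <= value < (1 << length):
        raise ValueError("Hilbert index out of range for this order")
    return format(value, f"0{length}b")


class PrefixTreeNode:
    """Binary prefix-tree node holding a raw (noise-free) visit count."""

    def __init__(self) -> None:
        self.count = 0
        self.children = {}  # maps "0" / "1" to a child PrefixTreeNode


def insert(root: PrefixTreeNode, bits: str) -> None:
    """Walk one bit per tree level, creating nodes as needed and counting visits."""
    node = root
    node.count += 1
    for bit in bits:
        node = node.children.setdefault(bit, PrefixTreeNode())
        node.count += 1


# Insert three example indices for a curve of order k = 8 (16-bit sequences).
root = PrefixTreeNode()
for v in (47593, 47592, 12):
    insert(root, encode(v, 8))
```

Each insertion touches exactly $L_{\text{enc}}$ nodes, which is consistent with the linear insertion cost described in the scalability analysis; a larger $k$ gives finer cells at the price of a deeper tree.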
|
||||
|
||||
|
||||
9
|
||||
Y. Zhao et al. Computer Standards & Interfaces 97 (2026) 104125
|
||||
|
||||
|
||||
[14] W. Qardaji, W. Yang, N. Li, Differentially private grids for geospatial data, in: [25] T. Wang, Y. Tao, A. Gilad, A. Machanavajjhala, S. Roy, Explaining differen-
|
||||
2013 IEEE 29th International Conference on Data Engineering, ICDE, IEEE, 2013, tially private query results with dpxplain, Proc. VLDB Endow. 16 (12) (2023)
|
||||
pp. 757–768. 3962–3965.
|
||||
[15] G. Cormode, C. Procopiuc, D. Srivastava, E. Shen, T. Yu, Differentially private [26] Z. Huang, J. Liu, D.G. Alabi, R.C. Fernandez, E. Wu, Saibot: A differentially
|
||||
spatial decompositions, in: 2012 IEEE 28th International Conference on Data private data search platform, Proc. VLDB Endow. (PVLDB) 16 (11) (2023) PVLDB
|
||||
Engineering, IEEE, 2012, pp. 20–31. 2023 demo / system paper.
|
||||
[16] J. Hua, Y. Gao, S. Zhong, Differentially private publication of general time- [27] Y. Dai, J. Shao, C. Wei, D. Zhang, H.T. Shen, Personalized semantic trajectory
|
||||
serial trajectory data, in: 2015 IEEE Conference on Computer Communications, privacy preservation through trajectory reconstruction, World Wide Web 21
|
||||
INFOCOM, IEEE, 2015, pp. 549–557. (2018) 875–914.
|
||||
[17] Z. Zhang, X. Xu, F. Xiao, LGAN-DP: A novel differential private publication [28] K. Zuo, R. Liu, J. Zhao, Z. Shen, F. Chen, Method for the protection of
|
||||
mechanism of trajectory data, Future Gener. Comput. Syst. 141 (2023) 692–703. spatiotemporal correlation location privacy with semantic information, J. Xidian
|
||||
[18] Y. Hu, Y. Du, Z. Zhang, Z. Fang, L. Chen, K. Zheng, Y. Gao, Real-time trajectory Univ. 49 (1) (2022) 67–77.
|
||||
synthesis with local differential privacy, in: 2024 IEEE 40th International [29] S. Denisov, H.B. McMahan, J. Rush, A. Smith, A. Guha Thakurta, Improved
|
||||
Conference on Data Engineering, ICDE, IEEE, 2024, pp. 1685–1698. differential privacy for sgd via optimal private linear operators on adaptive
|
||||
[19] R. Zhang, W. Ni, N. Fu, L. Hou, D. Zhang, Y. Zhang, DP-LTGAN: Differentially streams, Adv. Neural Inf. Process. Syst. 35 (2022) 5910–5924.
|
||||
private trajectory publishing via Locally-aware Transformer-based GAN, Future [30] H. Fang, X. Li, C. Fan, P. Li, Improved convergence of differential private sgd
|
||||
Gener. Comput. Syst. 166 (2025) 107686. with gradient clipping, in: The Eleventh International Conference on Learning
|
||||
[20] S. Jiao, J. Cheng, Z. Huang, T. Li, T. Xie, W. Chen, Y. Ma, X. Wang, DPKnob: A Representations, 2023.
|
||||
visual analysis approach to risk-aware formulation of differential privacy schemes [31] J. Fu, coauthors, DPSUR: Accelerating differentially private training via selective
|
||||
for data query scenarios, Vis. Inform. 8 (3) (2024) 42–52. updates and release, Proc. VLDB Endow. (PVLDB) 17 (2024) PVLDB paper; PDF
|
||||
[21] X. Wang, S. Jiao, C. Bryan, Defogger: A visual analysis approach for data available from VLDB site.
|
||||
exploration of sensitive data protected by differential privacy, IEEE Trans. Vis. [32] Y. Zheng, Trajectory data mining: an overview, ACM Trans. Intell. Syst. Technol.
|
||||
Comput. Graphics 31 (1) (2025) 448–458, http://dx.doi.org/10.1109/TVCG. (TIST) 6 (3) (2015) 1–41.
|
||||
2024.3456304. [33] M.E. Andrés, N.E. Bordenabe, K. Chatzikokolakis, C. Palamidessi, Geo-
|
||||
[22] R. Chen, B.C.M. Fung, B.C. Desai, Differentially private trajectory data indistinguishability: Differential privacy for location-based systems, in: Proceed-
|
||||
publication, 2011, arXiv:1112.2020, URL https://arxiv.org/abs/1112.2020. ings of the 2013 ACM SIGSAC Conference on Computer & Communications
|
||||
[23] C. Yin, J. Xi, R. Sun, J. Wang, Location privacy protection based on differential Security, 2013, pp. 901–914.
|
||||
privacy strategy for big data in industrial internet of things, IEEE Trans. Ind. [34] W. Zhang, M. Li, R. Tandon, H. Li, Semantic-aware privacy-preserving online
|
||||
Inform. 14 (8) (2017) 3628–3636. location trajectory data sharing, IEEE Trans. Inf. Forensics Secur. 17 (2022)
|
||||
[24] Y. Zhao, C. Wang, E. Zhao, X. Zheng, H. Lin, PerTrajTree-DP: A personalized 2292–2306.
|
||||
privacy-preserving trajectory publishing framework for trustworthy AI systems, [35] J. Yuan, Y. Zheng, C. Zhang, W. Xie, X. Xie, G. Sun, Y. Huang, T-drive: driving
|
||||
in: Data Security and Privacy Protection, Springer Nature Singapore, Singapore, directions based on taxi trajectories, in: Proceedings of the 18th SIGSPATIAL
|
||||
ISBN: 978-981-95-3182-0, 2026, pp. 57–75. International Conference on Advances in Geographic Information Systems, 2010,
|
||||
pp. 99–108.
|
||||
[36] Y. Zheng, X. Xie, W.-Y. Ma, et al., GeoLife: A collaborative social networking
|
||||
service among user, location and trajectory, IEEE Data Eng. Bull. 33 (2) (2010)
|
||||
32–39.
|
||||
[37] Y. Zhao, C. Wang, L. Li, X. Wang, H. Lin, Z. Liu, TrajMamba: A multi-scale
|
||||
mamba-based framework for joint trajectory and road network representation
|
||||
learning, 2025, https://ssrn.com/abstract=5624451.
|
||||
|
||||
|
||||
|
||||
|
||||
10
|
||||
|
||||
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,979 @@
|
||||
Computer Standards & Interfaces 97 (2026) 104116

Contents lists available at ScienceDirect

Computer Standards & Interfaces

journal homepage: www.elsevier.com/locate/csi

Chaos experiments in microservice architectures: A systematic literature review

Emrah Esen a, Akhan Akbulut a, Cagatay Catal b,∗

a Department of Computer Engineering, Istanbul Kültür University, 34536, Istanbul, Turkey
b Department of Computer Science and Engineering, Qatar University, Doha 2713, Qatar

∗ Corresponding author.
E-mail address: ccatal@qu.edu.qa (C. Catal).

https://doi.org/10.1016/j.csi.2025.104116
Received 22 September 2024; Received in revised form 28 November 2025; Accepted 12 December 2025
Available online 15 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

ARTICLE INFO

Keywords: Chaos engineering; Microservice; Systematic literature review

ABSTRACT

This study analyzes the implementation of Chaos Engineering in modern microservice systems. It identifies key methods, tools, and practices used to effectively enhance the resilience of software systems in production environments. In this context, our Systematic Literature Review (SLR) of 31 research articles has uncovered 38 tools crucial for carrying out fault injection methods, including several tools such as Chaos Toolkit, Gremlin, and Chaos Machine. The study also explores the platforms used for chaos experiments and how centralized management of chaos engineering can facilitate the coordination of these experiments across complex systems. The evaluated literature reveals the efficacy of chaos engineering in improving fault tolerance and robustness of software systems, particularly those based on microservice architectures. The paper underlines the importance of careful planning and execution in implementing chaos engineering and encourages further research in this field to uncover more effective practices for the resilience improvement of microservice systems.

Contents

1. Introduction
2. Background
   2.1. Microservice architecture
   2.2. Microservice principles
   2.3. Challenges/Troubleshooting/Failures in microservice architecture
   2.4. Chaos engineering
3. Review protocol
   3.1. Research questions
   3.2. Search strategy
   3.3. Study selection criteria
   3.4. Study quality assessment
   3.5. Data extraction
   3.6. Data synthesis
4. Results
   4.1. Main statistics
   4.2. How is Chaos engineering effectively applied in production environments to enhance the resilience of software systems?
   4.3. Which platforms have been used for chaos experiments?
   4.4. How can Chaos engineering be effectively applied to microservice architecture to ensure successful implementation and enhance system resilience?
   4.5. To what extent can the centralized provision of Chaos engineering effectively facilitate the management of chaos experiments across complex systems?
   4.6. What are the challenges reported in the relevant papers?
5. Discussion
   5.1. General discussion
   5.2. Threats to validity
6. Conclusion
CRediT authorship contribution statement
Declaration of competing interest
Data availability
References
|
||||
|
||||
|
||||
|
||||
challenges faced, and solutions. In addition, it will assess the effectiveness of chaos experiments in enhancing the reliability and robustness of microservice systems by using data obtained from real-world scenarios to develop strategic recommendations. This study is a critical step in understanding the applicability and impact of chaos engineering within the complexity of microservice architectures and aims to make significant contributions to the body of knowledge in this field. Recent research has applied chaos engineering to this architectural style; however, a systematic overview of the state of the art on the use of chaos engineering in the microservice architecture is lacking. Therefore, a Systematic Literature Review (SLR) has been performed to provide an overview of how chaos engineering was applied.

This article primarily targets peer-reviewed research papers to maintain methodological consistency and ensure scholarly rigor. We specifically chose a systematic literature review (SLR) methodology because peer-reviewed academic studies are subject to rigorous validation processes, which enhance the reliability and validity of our findings [8,9]. Although excluding industry-specific grey literature may restrict certain practical perspectives, this choice was deliberately made to avoid potential biases and uphold the scientific integrity of our review [10,11]. However, future studies could broaden the scope to incorporate industrial case studies and practical experiences, which would enrich our understanding of chaos engineering’s applicability beyond the academic context.

The main contributions of this study are listed as follows:

1. To the best of our knowledge, this is the first study to employ a systematic literature review approach in the field of chaos engineering on microservice architecture applications [12]. The study provides an extensive systematic literature review of how chaos engineering can be applied to enhance the resilience of microservice architectures. It collates findings from various sources to provide insights into the current state of research and practice in this field.
2. The study categorizes and summarizes the range of chaos engineering tools and methods used in industry and academia, highlighting their functionalities in process/service termination, network simulation, load stressing, security testing, and fault injection within application code.
3. This research paper discusses contemporary techniques and approaches for implementing chaos engineering in microservice architectures. It also emphasizes the ongoing work in this field, offering a significant reference for future research endeavors. The paper systematically reviews existing literature to showcase how chaos engineering can enhance system resilience, laying a comprehensive groundwork for further exploration into chaos experimentation strategies and innovating new fault injection methods or tools within microservice architectures.

The rest of the paper is structured as follows: Section 2 explains the background and related work. Section 3 presents the methodology of the research. Section 4 presents the results, and Section 5 comprehensively discusses the presented answers to research questions and validity threats. Lastly, the conclusion is presented in Section 6.

1. Introduction

In recent years, the adoption of microservice architecture has led to the transformation of application infrastructures into distributed systems. These systems are designed to enhance maintainability by decoupling services. The primary benefit of this architecture is the ease of maintenance of individual services within the microservice ecosystem due to their smaller and more modular nature [1]. However, despite these advantages, the distributed nature of microservices introduces significant challenges. Specifically, the complex management of services and their tight integration can considerably complicate software debugging. Debugging becomes complex in this architecture due to its distributed nature, the necessity to pinpoint the exact service causing the problem, and the dynamic characteristics of microservices. Consequently, debugging in microservice architecture demands a greater level of effort and specialized expertise compared to conventional monolithic architectures [2]. However, it becomes quite challenging to predict what will happen if there is an unexpected error or if a service on the network goes out of service. Service outages can be caused by anything from a malicious cyberattack to a hardware failure to simple human error, and they can have devastating financial consequences. Although such unexpected situations are rare, they can interfere with the operation of distributed systems and devastatingly affect the live environment in which the application is located [3]. It is necessary to detect points in the system before an error occurs and spreads to the entire system.

Microservice architecture applications undergo testing procedures to ensure their quality and dependability. These include unit testing, service test, end-to-end test, behavior-driven test, integration test, and regression test [4]. The comprehensive approach to microservices testing also encompasses live testing strategies for complex systems [5]. This thorough process emphasizes different aspects such as functionality, interoperability, and performance of individual services within the architecture. It aims to detect and resolve issues early to ensure stable and high-quality microservice applications [1,6]. However, considering that microservices consist of multiple services, the application should not have an impact on the user experience in cases such as network failures and suddenly increased service loads. For example, if the microservice that adds the product to favorites on a shopping site fails or responds late, the user should be able to continue the shopping experience. Therefore, testing operations in production-like environments become inevitable. No matter how distributed or complex the system is, there is a need for a method to manage unforeseeable situations that can build trust in the system against unexpected failures. Chaos engineering is defined as the discipline of conducting experiments in a live environment to test or verify the reliability of software [7].

The primary objective of this research is to conduct a thorough investigation into how chaos experiments are performed in the widely used microservices-based systems of today. Microservice architectures have come to the forefront in modern software development processes due to their advantages such as flexibility, scalability, and rapid development. However, these architectures also bring unique challenges
|
||||
due to complex service dependencies and dynamic operational environ-
|
||||
ments. This study aims to comprehensively address the methodologies, 2. Background
|
||||
application scenarios, and impacts of chaos experiments conducted
|
||||
to test the resilience of microservice systems and identify potential The microservice approach breaks down a large application into a
|
||||
weak points. The research intends to present the current state of chaos network of small, self-contained units, each running its own process
|
||||
engineering practices by analyzing them, highlighting best practices, and often communicating through web APIs. Unlike large, single-piece
|
||||
|
||||
|
||||
E. Esen et al. Computer Standards & Interfaces 97 (2026) 104116
|
||||
|
||||
|
||||
monolithic systems, these small services are robust, easy to scale up or down, and can be updated individually using various programming languages and technologies. This structure allows development teams to be smaller and more agile, leading to faster updates and improvements. Yet, managing many interconnected services can become complicated, especially when something goes wrong. To enhance system reliability and resilience, a method known as chaos engineering is employed. This involves deliberately introducing problems into the live system to test its ability to cope and recover. This technique helps to uncover and rectify flaws, thereby making the system stronger overall. Regular and automated tests mimic real-life problems to ensure that the system can handle unexpected challenges and remain stable and efficient.

2.1. Microservice architecture

Microservice architectures have gained significant popularity in the software industry due to their ability to address the challenges and complexities of developing modern applications [6,13].

2.2. Microservice principles

Microservice architectures are based on the concept of decentralization, where each service is independently developed, deployed, and managed. This emphasizes autonomy and minimal inter-service dependencies. Each microservice is designed to focus on a single function or a closely related set of functions and supports technology heterogeneity by allowing different services to use the technology stacks that best suit their needs. Resilience is a core aspect, with services built to withstand failures without affecting the entire system, while scalability enables services to be scaled independently according to demand. Communication occurs through lightweight mechanisms such as HTTP/REST APIs, supporting continuous delivery and deployment practices. Due to the distributed nature of microservice architecture, comprehensive monitoring and logging for observability become crucial. Additionally, there is often an alignment between the microservice architecture and the organizational structure, involving small cross-functional teams responsible for individual services [14].

It is helpful to compare the microservice architecture to the monolithic architecture. The main difference between them is the dimensions of the developed applications. The microservice architecture can be thought of as developing an application as a suite of smaller services, rather than as a single, monolithic structure. Enterprise applications usually consist of three main parts: a client-side user interface (i.e., HTML pages and JavaScript running in a browser on the user's machine), a database (i.e., many tables held in a common, often relational, database management system), and a server-side application. The server-side application processes HTTP requests, executes business logic, retrieves and updates data in the database, and prepares the HTML views that are sent to the browser. This structure is a good example of a monolith. Any change to the system involves creating and deploying a new version of the server-side application [15]. The cycles of change are interdependent. A change to a small part of the application requires rebuilding and deploying the entire monolith [6].

Microservice architecture, on the other hand, has some common features, unlike monolithic architecture. These are componentization with services, organizing around job capabilities, smart interfaces and simple communication, decentralized governance, decentralized data management, infrastructure automation, and design for failure [16]. Today, although modern internet applications seem like a single application, they use microservice architectures behind them. Microservice architecture basically refers to small, autonomous, and interoperable services. It has emerged due to increasing needs such as technology diversity, flexibility, scaling, ease of deployment, and organization and management, and it provides various advantages in these matters. Its advantages are described as follows [17]:

Technology heterogeneity. Microservices are treated as small services, each running independently and communicating with the others using open protocols. While monolithic applications are developed with a single programming language and database system, the services in a microservice ecosystem may each use a different programming language and database. This allows the advantages of each programming language and database to be used.

Resilience. When an error occurs in a monolithic application, the whole system is affected. In the microservice architecture, only the part under the responsibility of the relevant service is affected; the parts belonging to other services are not affected, and the user experience continues.

Scalability. While scaling a monolithic application covers the entire application, in applications developed with microservice architecture only the services that are under heavy load need to be scaled. This avoids extra resource costs for parts that do not need to be scaled and improves the user experience.

Deployment. Microservice architecture facilitates the autonomous deployment of individual services, enabling updates or changes without impacting others. Various deployment strategies, including blue–green, canary, and rolling deployment, minimize disruptions during the deployment process [18]. As a result, microservice architecture provides increased flexibility and resilience in deployment, distinguishing it from monolithic applications.

Organizational alignment. In software development processes, some challenges may be encountered due to large teams and large pieces of code. These challenges become more manageable when smaller teams are established. At the same time, this indicates that microservice applications allow smaller and more cohesive teams to be formed. Each team is responsible for its own microservice and can take action by making improvements if necessary.

2.3. Challenges/Troubleshooting/Failures in microservice architecture

Microservice architectures pose numerous challenges. As the number of services increases, the complexity of service interactions also grows. Reliance on network communication leads to latency and network failure issues, while ensuring data consistency across multiple databases requires careful design and implementation of distributed transactions or eventual consistency models. Microservices bring typical distributed system challenges such as handling partial failures, dealing with latency and asynchrony, complex service discovery, load balancing in dynamic scaling environments, and managing configurations across multiple services and environments. Security concerns are heightened because inter-service communication increases the attack surface. Testing becomes more complex, involving individual service testing along with testing of service interactions; deployment is challenging, especially when there are dependencies between services; effective observability and monitoring become crucial for timely issue resolution; version management is critical for maintaining system stability; lastly, assembling skilled teams proficient in DevOps, cloud computing, and programming languages presents a significant challenge. Microservice architecture faces various challenges, troubleshooting, and failures. While adopting a distributed architecture enhances modularity, it inherently introduces operational complexities that differ significantly from monolithic structures. Recent research has also explored the use of hybrid bio-inspired algorithms to optimize this process dynamically. For instance, the Hybrid Kookaburra–Pelican Optimization Algorithm has been shown to improve load distribution and system scalability in cloud and microservice-based environments [19].

In conclusion, while microservices offer numerous advantages such as improved scalability, flexibility, and agility, they also introduce significant challenges in terms of system complexity, operational demands, and the need for skilled personnel and sophisticated tooling [20].
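The partial-failure challenge named in Section 2.3 can be illustrated with a small sketch of graceful degradation: when a non-critical service (here a favorites service, as in the shopping-site example) fails, the caller retries briefly and then falls back to a default so the page still renders. All function names and return values below are hypothetical and serve only as an illustration; they are not taken from any of the reviewed studies.

```python
from typing import Callable, List


def call_with_fallback(primary: Callable[[], List[str]],
                       fallback: List[str],
                       retries: int = 2) -> List[str]:
    """Call a remote service; return a fallback value if it keeps failing."""
    for _ in range(retries + 1):
        try:
            return primary()
        except (ConnectionError, TimeoutError):
            continue  # transient failure: try again
    return fallback


def broken_favorites_service() -> List[str]:
    # Hypothetical stand-in for a favorites microservice that is down.
    raise ConnectionError("favorites service unavailable")


def healthy_favorites_service() -> List[str]:
    # Hypothetical stand-in for a favorites microservice that responds.
    return ["book-42"]


def render_product_page(favorites_call: Callable[[], List[str]]) -> dict:
    # The page is still served when the favorites service fails;
    # only the favorites list degrades to an empty default.
    favorites = call_with_fallback(favorites_call, fallback=[])
    return {"product": "laptop-7", "favorites": favorites, "page_ok": True}
```

A chaos experiment against such a system would deliberately make the favorites call fail and then check that `page_ok` remains true.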
|
||||
|
||||
|
||||
|
||||
|
||||
2.4. Chaos engineering

“Chaos engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production-like environment” [7,21]. It is the careful and planned execution of experiments to show how the distributed system will respond to a failure. It is necessary for large-scale software systems because it is practically impossible to simulate real events in test environments. Experiments based on real events are created together with chaos engineering [22]. By analyzing the test results, improvements are made where necessary, with the aim of increasing the reliability of the software in the production environment.

Thanks to an experimental and systems-based approach, confidence is established in the survivability of these systems during collapses. Canary analysis collects data on how distributed systems react to failure scenarios by observing their behavior in abnormal situations and performing controlled experiments [23]. This method involves applying new updates or changes to a specific aspect of the system, enabling early detection of potential problems before they affect a larger scale. Chaos experiments consist of the following principles [24,25]:

• Hypothesize steady state: The first step is to hypothesize the steady state of the system under normal conditions.
• Vary real-world events: The next step is to vary real-world events that can cause turbulence in the system.
• Run experiments in production: Experimenters should run the experiments in a production-like environment to simulate real-world conditions.
• Automate experiments to run continuously: Experimenters should automate the experiments to run continuously, ensuring that the system can withstand turbulence over time.
• Minimize blast radius: The experiments should be designed to minimize the blast radius, i.e., the impact of the experiment on the system should be limited to a small area.
• Analyze results: Experimenters should analyze the results of the experiments to determine the system’s behavior under turbulent conditions.
• Repeat experiments: The experiments should be repeated to ensure that the system can consistently withstand turbulence. When the experiment is finished, information about the actual effect will be provided to the system.

3. Review protocol

Systematic review studies must be conducted using a well-defined and specific protocol. To conduct a systematic review study, all studies on a particular topic must be examined [12]. We followed the systematic review process shown in Fig. 1 and took all the steps to reduce the risk of bias in this study. Multiple reviewers were involved in the SLR process, and in cases of conflict, a brief meeting was organized to facilitate consensus. The first step was to define the research questions. Then, the most appropriate databases were selected. Based on the selected databases, automated searches were conducted and several articles were identified. Selection criteria were then established to determine which studies should be included in and excluded from this research. The titles and abstracts of all studies were reviewed. In cases of doubt, the full text of the publication was reviewed. Then, after the studies were analyzed in detail, the selection criteria were applied. All selected studies were assessed using a quality assessment process. Subsequently, the results were synthesized, listed, and summarized in a clear and understandable manner.

3.1. Research questions

Research Questions (RQs) and their corresponding motivations are presented as follows:

• RQ1: How is chaos engineering effectively applied in production environments to enhance the resilience of software systems?
Motivation: Understanding the practical implementation of chaos engineering in production environments is crucial for ensuring the resilience of software systems under real-world operating conditions.
• RQ2: Which platforms have been used for chaos experiments?
Motivation: Identifying the platforms provides insights into the technological landscape and tools available for conducting chaos engineering practices.
• RQ3: How is chaos engineering effectively applied to microservice architectures to ensure its successful implementation in enhancing system resilience?
Motivation: Microservice architectures introduce new challenges in system design. Exploring the application of chaos engineering in this context can help improve the resilience and fault tolerance of microservice systems.
• RQ4: To what extent can the centralized provision of chaos engineering effectively facilitate the management of chaos experiments across complex systems?
Motivation: Understanding the feasibility of providing chaos engineering as a centralized service enables organizations to coordinate chaos experiments across complex systems.
• RQ5: What are the challenges reported in the relevant papers?
Motivation: Identifying these challenges provides valuable insights into overcoming obstacles and advancing the adoption of chaos engineering practices.

3.2. Search strategy

The primary studies were carefully selected from the papers published between 2010 and 2022 because the topic has become relevant only in recent years. The databases are IEEE Xplore, ACM Digital Library, Science Direct, Springer, Wiley, MDPI, and Scopus. The initial search involved reviewing the titles, abstracts, and keywords of the studies identified in the databases. The search results obtained from the databases were stored in the data extraction form using a spreadsheet tool. Furthermore, this systematic review was conducted collaboratively by three authors.

The following search string was used to broaden the search scope:

((chaos engineering) OR (chaos experiments)) OR (microservices)

The results of the searches made in the databases mentioned above are shown in Fig. 2.

3.3. Study selection criteria

After applying the exclusion and inclusion criteria, 55 articles were obtained. The exclusion criteria in our study are as follows:

• EC-1: Duplicate papers from multiple sources
• EC-2: Papers without full-text availability
• EC-3: Papers not written in English
• EC-4: Survey papers
• EC-5: Papers not related to chaos engineering

The inclusion criteria in our study are as follows:

• IC-1: Primary papers discussing the use of chaos experiments in a microservice architecture
• IC-2: Primary publications that focus on chaos engineering
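The exclusion and inclusion criteria of Section 3.3 can be read as a simple filter over search results. The sketch below is only an illustration under stated assumptions: the `Record` fields are hypothetical stand-ins for judgments the reviewers made manually, duplicates are detected by normalized title (the paper does not say how duplicates were identified), and IC-1 and IC-2 are collapsed into a single "primary study" flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    # Hypothetical fields; in the review these were manual judgments.
    title: str
    language: str
    full_text_available: bool
    is_survey: bool
    about_chaos_engineering: bool
    is_primary_study: bool


def screen(records: List[Record]) -> List[Record]:
    """Apply EC-1..EC-5 and the inclusion criteria in order."""
    seen_titles = set()
    kept = []
    for r in records:
        key = r.title.strip().lower()
        if key in seen_titles:           # EC-1: duplicate from another source
            continue
        seen_titles.add(key)
        if not r.full_text_available:    # EC-2: no full text
            continue
        if r.language != "English":      # EC-3: not written in English
            continue
        if r.is_survey:                  # EC-4: survey paper
            continue
        if not r.about_chaos_engineering:  # EC-5: unrelated topic
            continue
        if not r.is_primary_study:       # IC-1 / IC-2: primary studies only
            continue
        kept.append(r)
    return kept
```

Applying the exclusions before the inclusion check is a design choice of this sketch; the paper states only that both sets of criteria were applied.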
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 1. SLR review protocol.
|
||||
Source: Adapted from [26–28].
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 2. Distribution of selected papers per database.
|
||||
|
||||
|
||||
3.4. Study quality assessment

The assessment of each study’s quality is an indicator of the strength of evidence provided by the systematic review. The quality of studies was assessed using various questions. Studies of poor quality were not included in the present study. These criteria, based on quality instruments, were adopted from guidelines and other SLR studies [12]. The following questions were used to assess the quality of the studies.

• Q1. Are the aims of the study clearly stated?
• Q2. Are the scope and experimental design of the study clearly defined?
• Q3. Is the research process documented adequately?
• Q4. Are all the study questions answered?
• Q5. Are the negative findings presented?
• Q6. Do the conclusions relate to the aim of the study and are they reliable?

In this study, considering all these criteria, a general quality assessment was performed for each paper. The rating was 2 points for the “yes” option, 0 points for the “no” option, and 1 point for the “somewhat” option. The decision threshold for classifying a paper as poor quality was determined based on the mean value, which corresponds to a total of 5 points.

Fig. 2 presents the distribution of papers by the database in which they were found at different selection stages. After the initial search, 4520 papers were retrieved, of which 55 remained after applying the selection criteria. After quality assessment, 31 papers were selected as primary studies. The 55 papers were carefully read in full and the data required for answering the research questions were extracted. All the collected articles are listed in Table 1.

3.5. Data extraction

The data required for answering the research questions were extracted from the selected articles using a data extraction form created for this purpose. The form consists of several metadata fields such as the author’s first and last name, the title of the study, the publication year, and the type of study. In addition to this metadata, several columns were created to store the information related to the research questions. By employing a data extraction form, we ensured that the relevant data required to answer each research question were systematically captured from the selected publications. This approach facilitated the subsequent synthesis of the findings. The data extraction process involved meticulous attention to detail and ensured the reliability and integrity of the data used in our systematic literature review.
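The scoring rule described in Section 3.4 (2 points for "yes", 1 for "somewhat", 0 for "no", over questions Q1 to Q6, with a threshold of 5 points) can be sketched as follows. One detail is an assumption of this sketch: the text does not state whether a paper scoring exactly 5 counts as poor quality, so the code treats only scores below the threshold as poor.

```python
from typing import List

# Points per answer option, as stated in the quality assessment.
POINTS = {"yes": 2, "somewhat": 1, "no": 0}

# Total named in the text as the decision threshold.
THRESHOLD = 5


def quality_score(answers: List[str]) -> int:
    """Sum the points for the answers to Q1..Q6 (maximum 12)."""
    if len(answers) != 6:
        raise ValueError("expected one answer per question Q1..Q6")
    return sum(POINTS[a] for a in answers)


def is_poor_quality(answers: List[str], threshold: int = THRESHOLD) -> bool:
    # Assumption: strictly below the threshold means poor quality.
    return quality_score(answers) < threshold
```

For example, a paper answered "yes" on two questions and "somewhat" on one scores 5 and, under the assumption above, is retained.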
|
||||
|
||||
|
||||
|
||||
|
||||
Table 1
|
||||
Selected primary studies.
|
||||
ID Reference Title Year Database
|
||||
S1 [29] Automating Chaos Experiments in Production 2019 ACM
|
||||
S2 [25] Getting Started with Chaos engineering—design of an implementation framework in practice 2020 ACM
|
||||
S3 [30] Human-AI Partnerships for Chaos engineering 2020 ACM
|
||||
S4 [31] 3MileBeach: A Tracer with Teeth 2021 ACM
|
||||
S5 [32] Service-Level Fault Injection Testing 2021 ACM
|
||||
S6 [33] A Platform for Automating Chaos Experiments 2016 IEEE Xplore
|
||||
S7 [34] Automated Fault-Tolerance Testing 2016 IEEE Xplore
|
||||
S8 [35] Gremlin: Systematic Resilience Testing of Microservices 2016 IEEE Xplore
|
||||
S9 [36] Fault Injection Techniques - A Brief Review 2018 IEEE Xplore
|
||||
S10 [37] ORCAS: Efficient Resilience Benchmarking of Microservice Architectures 2018 IEEE Xplore
|
||||
S11 [38] The Business Case for Chaos engineering 2018 IEEE Xplore
|
||||
S12 [39] Use of Self-Healing Techniques to Improve the Reliability of a Dynamic and Geo-Distributed Ad Delivery Service 2018 IEEE Xplore
|
||||
S13 [40] Security Chaos engineering for Cloud Services: Work In Progress 2019 IEEE Xplore
|
||||
S14 [41] A Framework of Virtual War Room and Matrix Sketch-Based Streaming Anomaly Detection for Microservice Systems 2020 IEEE Xplore
|
||||
S15 [42] CloudStrike: Chaos engineering for Security and Resiliency in Cloud Infrastructure 2020 IEEE Xplore
|
||||
S16 [43] Identifying and Prioritizing Chaos Experiments by Using Established Risk Analysis Techniques 2020 IEEE Xplore
|
||||
S17 [44] Fitness-guided Resilience Testing of Microservice-based Applications 2020 IEEE Xplore
|
||||
S18 [24] A Chaos engineering System for Live Analysis and Falsification of Exception-Handling in the JVM 2021 IEEE Xplore
|
||||
S19 [45] A Study on Chaos engineering for Improving Cloud Software Quality and Reliability 2021 IEEE Xplore
|
||||
S20 [46] Chaos engineering for Enhanced Resilience of Cyber–Physical Systems 2021 IEEE Xplore
|
||||
S21 [47] ChaosTwin: A Chaos engineering and Digital Twin Approach for the Design of Resilient IT Services 2021 IEEE Xplore
|
||||
S22 [48] Platform Software Reliability for Cloud Service Continuity—Challenges and Opportunities 2021 IEEE Xplore
|
||||
S23 [49] Trace-based Intelligent Fault Diagnosis for Microservices with Deep Learning 2021 IEEE Xplore
|
||||
S24 [50] A Guided Approach Towards Complex Chaos Selection, Prioritization and Injection 2022 IEEE Xplore
|
||||
S25 [51] Chaos Driven Development for Software Robustness Enhancement 2022 IEEE Xplore
|
||||
S26 [22] Maximizing Error Injection Realism for Chaos engineering With System Calls 2022 IEEE Xplore
|
||||
S27 [52] On Evaluating Self-Adaptive and Self-Healing Systems using Chaos engineering 2022 IEEE Xplore
|
||||
S28 [53] Observability and chaos engineering on system calls for containerized applications in Docker 2021 ScienceDirect
|
||||
S29 [54] Scalability resilience framework using application-level fault injection for cloud-based software services 2022 Springer
|
||||
S30 [55] Chaos as a Software Product Line—A platform for improving open hybrid-cloud systems resiliency 2022 Wiley
|
||||
S31 [56] The Observability, Chaos engineering, and Remediation for Cloud-Native Reliability 2022 Wiley
|
||||
|
||||
|
||||
|
||||
3.6. Data synthesis

To answer the research questions, the data obtained are collected and summarized in an appropriate manner, which is called data synthesis. To perform the data synthesis, a qualitative analysis process was conducted on the data obtained. For instance, synonyms used for different categories were identified and merged in the respective fields. This comprehensive data synthesis approach allowed us to derive insights and draw conclusions from the collected information.

4. Results

The results section of the paper provides various insights into how chaos engineering is applied in production environments, particularly its use in improving the resilience and reliability of microservice architecture applications. The section discusses how fault detection is developed using chaos engineering tools and is mainly used in production for troubleshooting. Chaos experiments are usually conducted in the production environment to provide realistic results. The section further enumerates several tools that have been used for chaos experiments and discusses general principles such as defining a steady state, forming a hypothesis, conducting the experiment, and proving or refuting the hypothesis. These principles and tools help detect problems such as hardware issues, software errors, network interruptions, security vulnerabilities, and configuration mistakes within their respective contexts.

4.1. Main statistics

Fig. 3 shows the results of the quality assessment. The distribution of the years of publication is shown in Fig. 4. Most of the studies related to our study were conducted in the last year. This shows that researchers’ interest in chaos engineering has increased in recent years. Most of the studies included were indexed in the IEEE Xplore database.

Fig. 5 presents the distribution of the type of publications and the corresponding databases. While there are many journal papers, conference proceedings also appear in the selected papers.

Chaos engineering involves several categories of functionality that serve distinct purposes in resilience testing. The first category involves intentionally terminating processes or services to evaluate system behavior and recovery from failures [7]. Another category is network simulation, which allows engineers to replicate adverse network conditions to assess system performance and reliability [25]. In the stressing machine category, engineers subject the system to extreme loads to identify limits and potential bottlenecks [7]. In security testing, engineers simulate breaches or attacks to assess the system’s response and enhance defenses [7]. Lastly, in the fault application code category, engineers inject targeted faults or errors into the codebase to assess system resilience and error-handling capabilities [24]. These categories help organizations proactively identify weaknesses, strengthen system robustness, and enhance reliability in complex technology landscapes [7]. Functionality categories of tools are presented in Fig. 6.

The tools utilized in industry settings are not comprehensively addressed in articles. To provide insights for future research, the tools identified in the additional examination were categorized based on their functionality, as presented in Tables 2 and 3. Table 2 displays the tools obtained from the study, while Table 3 presents additional tools that have been examined. Tools listed in the table with corresponding references indicate their inclusion in the referenced articles.

4.2. How is Chaos engineering effectively applied in production environments to enhance the resilience of software systems?

Table 4 examines the successful implementation of chaos engineering in operational settings, covering different aspects such as goals, techniques and resources, guiding principles, findings, limitations and substitutes, as well as the general strategy.

4.3. Which platforms have been used for chaos experiments?

Table 5 provides a concise summary of various tools and platforms used in chaos experiments, along with their specific functionalities or characteristics. It offers comprehensive insights into each platform through detailed descriptions accompanied by the necessary references.
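The experiment cycle summarized in this section (define a steady state, form a hypothesis, conduct the experiment, then confirm or refute the hypothesis) can be sketched with a toy example from the termination category. The `ToyService` class, the replica names, and the 0.99 success threshold are hypothetical choices for illustration only; they do not reproduce any tool listed in Tables 2 and 3.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyService:
    """Hypothetical in-memory service backed by several replicas."""
    replicas: Dict[str, bool] = field(
        default_factory=lambda: {"r1": True, "r2": True, "r3": True})

    def handle_request(self) -> bool:
        # A request succeeds while at least one replica is alive.
        return any(self.replicas.values())


def success_rate(service: ToyService, requests: int = 100) -> float:
    return sum(service.handle_request() for _ in range(requests)) / requests


def run_experiment(service: ToyService, victims: List[str],
                   min_success: float = 0.99) -> dict:
    # 1. Steady state: measure normal behavior before injecting anything.
    baseline = success_rate(service)
    if baseline < min_success:
        return {"status": "aborted", "baseline": baseline}

    # 2. Hypothesis: the success rate stays >= min_success after the fault.
    # 3. Experiment: terminate only the chosen replicas (small blast radius).
    previous = {name: service.replicas[name] for name in victims}
    for name in victims:
        service.replicas[name] = False
    try:
        observed = success_rate(service)
    finally:
        # Always roll back the injected fault.
        service.replicas.update(previous)

    # 4. Confirm or refute the hypothesis.
    return {"status": "completed", "baseline": baseline,
            "observed": observed, "hypothesis_holds": observed >= min_success}
```

Terminating one replica should leave the hypothesis intact, while terminating all replicas refutes it; in both cases the rollback restores the original state.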
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 3. Quality assessment scores.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 4. Year of publication.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 5. Diagram of the distribution of studies per search database.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 6. Functionality of chaos engineering tools.
|
||||
|
||||
|
||||
|
||||
|
||||
Table 2
|
||||
Chaos engineering tools from studies.
|
||||
Chaos engineering tool Termination Network simulating Stressing machine Security Fault application code
|
||||
Chaos Monkey [57] ×
|
||||
Gremlin [35] × × × × ×
|
||||
Chaos Toolkit [45] × × × × ×
|
||||
Pumba [55] × ×
|
||||
LitmusChaos [45] × × × ×
|
||||
ToxiProxy [45] × ×
|
||||
PowerfulSeal [45] × × × ×
|
||||
Pod Reaper [25] ×
|
||||
Netflix Simian Army [36] × × ×
|
||||
WireMock [25] × ×
|
||||
KubeMonkey [25] × × ×
|
||||
Chaosblade [45] × × ×
|
||||
ChaosTwin [47] × × × ×
|
||||
Chaos Machine [24] × × ×
|
||||
Cloud Strike [42] ×
|
||||
Phoebe [22] ×
|
||||
Mjolnirr [58] ×
|
||||
ChaosOrca [37] × × ×
|
||||
3MileBeach [31] × ×
|
||||
Muxy [25] × × ×
|
||||
Blockade [25] ×
|
||||
Chaos Lambda [25] × ×
|
||||
Byte-Monkey [25] ×
|
||||
Turbulence [25] × × ×
|
||||
Cthulhu [25] × × × ×
|
||||
Byteman [25] × ×
|
||||
ChaosCube [55] ×
|
||||
Chaos Lemur [25] ×
|
||||
Chaos HTTP Proxy [25] ×
|
||||
Chaos Mesh [45] × × ×
|
||||
Istio Chaos [45] ×
|
||||
ChAP [33] × ×
|
||||
IntelliFT [44] × × × ×
|
||||
|
||||
|
||||
|
||||
|
||||
Table 3
|
||||
Chaos engineering tools from our search.
|
||||
Chaos engineering tool Termination Network simulating Stressing machine Security Fault application code
|
||||
Pod Chaos X X X
|
||||
DNS Chaos X
|
||||
AWS Chaos X X X
|
||||
Azure Chaos X X X X
|
||||
GCP Chaos X X X X
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Table 4
|
||||
Chaos engineering in production environments.
|
||||
Category Description
|
||||
Objective The primary objective of applying chaos engineering in production environments is to enhance the
|
||||
resilience of software systems. This involves troubleshooting to identify and address potential
|
||||
malfunctions before they occur. The overarching goal is to minimize issues in production through the
|
||||
use of chaos engineering tools, enabling automatic fault detection [24,53].
|
||||
Methods and tools Chaos engineering relies on specific tools to facilitate its effective application in production
|
||||
environments. These tools aid in automatic fault detection, a crucial aspect of troubleshooting to
|
||||
minimize potential issues in the production environment [24,53].
|
||||
Principles and considerations The effective application of chaos engineering is closely tied to key principles and considerations.
|
||||
These include continuous experimentation, serving as a form of robustness testing conducted in
|
||||
real-world operational conditions. Fundamental principles of Chaos Experiments involve defining a
|
||||
steady state, hypothesizing about its impact, conducting the experiment, and then demonstrating or
|
||||
refuting the hypothesis [53].
|
||||
Insights and results Chaos experiments conducted in the production environment provide valuable insights into the
|
||||
behavior of the system. This is particularly significant as the production environment may exhibit
|
||||
unpredictable behavior that differs from staging environments in some cases [24].
|
||||
Constraints and alternatives While conducting chaos experiments in production is ideal, it is acknowledged that legal or technical
|
||||
constraints may sometimes prevent this. In such cases, an alternative approach is considered, starting
|
||||
chaos experiments in a staging environment and gradually transitioning to the production
|
||||
environment [25].
|
||||
Overall approach The overall approach for the effective application of chaos engineering in production environments
|
||||
involves the systematic execution of chaos experiments. This includes leveraging chaos engineering
|
||||
tools and taking into account the constraints and challenges associated with conducting experiments in
|
||||
real-world operational settings. The aim is to proactively identify and address potential issues before
|
||||
they impact the production environment, ultimately enhancing the resilience of software systems.
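The experiment principles summarized in Table 4 (define a steady state, state a hypothesis, run the experiment, then confirm or refute the hypothesis) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method taken from the reviewed studies: `run_chaos_experiment`, `ExperimentResult`, and the in-memory `replicas` dictionary are hypothetical names, and a real experiment would probe live metrics rather than a dictionary.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExperimentResult:
    baseline_ok: bool      # steady state held before the fault
    hypothesis_held: bool  # steady state held while the fault was active
    recovered: bool        # steady state held again after rollback


def run_chaos_experiment(
    steady_state: Callable[[], bool],
    inject_fault: Callable[[], None],
    rollback: Callable[[], None],
) -> ExperimentResult:
    """Run one experiment: verify the baseline, inject, observe, roll back."""
    if not steady_state():
        # Do not start an experiment on a system that is already unhealthy.
        return ExperimentResult(False, False, False)
    inject_fault()
    try:
        held = steady_state()
    finally:
        rollback()  # always undo the fault, even if the probe raised
    return ExperimentResult(True, held, steady_state())


# Hypothetical toy system: a service with two replicas that needs at least one.
replicas = {"a": True, "b": True}

result = run_chaos_experiment(
    steady_state=lambda: any(replicas.values()),
    inject_fault=lambda: replicas.update(a=False),
    rollback=lambda: replicas.update(a=True),
)
```

In this toy run the hypothesis is that the service stays available with one replica down. Placing the rollback in a `finally` block is a design choice that restores the system even when the probe itself fails.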
|
||||
|
||||
|
||||
|
||||
|
||||
Table 5
|
||||
Chaos engineering tools identified from selected papers.
|
||||
Platform/Tool Description
|
||||
The Chaos Machine A tool for conducting chaos experiments at the application level on Java Virtual Machine (JVM),
|
||||
using exception injection to analyze try-catch blocks for error processing [24].
|
||||
Screwdriver An automated fault-tolerance testing tool for on-premise applications and services, creating realistic
|
||||
error models and collecting metrics by injecting errors into the system [34].
|
||||
Chaos Monkey Designed by Netflix, this tool tests the system’s resilience by randomly killing partitions to check
|
||||
system functionality [7,45].
|
||||
Cloud Strike A security chaos engineering system for multi-cloud security, extending chaos engineering to security
|
||||
by injecting faults impacting confidentiality, integrity, and availability [42].
|
||||
Chaos Mesh An open-source chaos engineering platform for testing the resilience and reliability of distributed
|
||||
systems by intentionally injecting failures and disruptions [55].
|
||||
PowerfulSeal An open-source tool for testing the resilience of Kubernetes clusters by simulating real-world failures
|
||||
and disruptions [55].
|
||||
IntelliFT A feedback-based, automated failure testing technique for microservice applications, focusing on
|
||||
exposing defects in fault-handling logic [44].
|
||||
The Chaos Toolkit Open-source software that runs experiments against the system to confirm a hypothesis [25,55].
|
||||
Phoebe A fault injection framework for reliability analysis concerning system call invocation errors, enabling
|
||||
full observability of system call invocations and automatic experimentation [22].
|
||||
Mjolnirr A private cloud platform with a built-in Chaos Monkey service for developing private PaaS cloud
|
||||
infrastructure [58].
|
||||
ChaosOrca A tool for Chaos engineering on containers, perturbing system calls for processes inside containers
|
||||
and monitoring their effects [37].
|
||||
Gremlin Offered as a SaaS technology, Gremlin tests system resilience on various parameters and conditions,
|
||||
with capabilities for automation and integration with Kubernetes clusters and public clouds [35].
|
||||
3MileBeach A distributed tracing and fault injection framework for microservices, enabling chaos experiments
|
||||
through message serialization library manipulation [31].
|
||||
ChAP A software platform for running automated chaos experiments, simulating various failure scenarios
|
||||
and providing insights into system behavior under stress [29,33].
|
||||
ChaosTwin Utilizes a digital twin approach in Chaos Engineering to mitigate impacts of unforeseen events,
|
||||
constructing models across workload, network, and service layers [47].
|
||||
LitmusChaos An open-source cloud-native framework for Chaos Engineering in Kubernetes environments, offering a
|
||||
range of chaos experiments and workflows [50].
|
||||
Filibuster A testing method in chaos engineering that introduces errors into microservice architecture to validate
|
||||
resilience and error tolerance [32].
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Table 6
|
||||
Chaos engineering in microservices: approaches, descriptions, and expected outcomes.
|
||||
Approach: Fault injection testing
Description: This method involves intentionally introducing errors into the system to assess its response, particularly in microservices by simulating various failure modes such as network issues, service outages, or resource shortages within or between microservices, to evaluate the system’s resilience and stability [52].
Expected impact: Evaluating and enhancing the system’s resilience and stability.

Approach: Hypothesis-driven experiments
Description: Key to chaos engineering is conducting experiments based on well-defined hypotheses about the normal state of the system and its expected behavior during failure scenarios. This strategic approach enables focused experiments that assess the resilience of both individual microservices and the overall system [45,53].
Expected impact: Identifying system weaknesses and increasing resilience.

Approach: Blast radius management
Description: Managing the “blast radius” of experiments is crucial in microservices. It involves understanding the potential impact of introduced failures, starting with small experiments and then expanding, to manage failure impacts while identifying system vulnerabilities [45].
Expected impact: Better understanding and enhancing the system’s resilience.

Approach: Resilience requirement elicitation
Description: Utilizing chaos engineering to determine and analyze the resilience requirements of microservice architectures. This process involves observing the system’s response to induced faults to identify specific resilience needs of each microservice and their interactions [52].
Expected impact: Understanding specific resilience needs of each microservice and their interactions.

Approach: Continuous testing and improvement
Description: Regularly conducting chaos experiments as part of an ongoing testing process ensures that microservices remain resilient against unforeseen issues. This continuous approach aids in proactively finding and fixing potential system weaknesses [56].
Expected impact: Proactive identification and resolution of system weaknesses, leading to continual improvement and increased resilience.

Approach: Observability and remediation
Description: Integrating chaos engineering with observability tools enhances the monitoring of microservices during fault injection, allowing for real-time tracking of responses to failures, aiding in the development of effective remediation strategies and overall system resilience improvement [56].
Expected impact: Real-time tracking of responses to failures and development of effective remediation strategies for overall system resilience improvement.
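The fault injection and blast radius rows of Table 6 can be illustrated with a small Python sketch. It assumes a simulated dependency and a seeded random generator; `with_fault_injection`, `call_with_fallback`, and `expand_blast_radius` are hypothetical helper names introduced for this illustration, not functions of any tool listed in the tables.

```python
import random
from typing import Callable, List


class InjectedFault(Exception):
    """Raised in place of a real downstream failure."""


def with_fault_injection(call: Callable[[], str], blast_radius: float,
                         rng: random.Random) -> Callable[[], str]:
    """Wrap `call` so that roughly `blast_radius` of invocations fail."""
    def wrapped() -> str:
        if rng.random() < blast_radius:
            raise InjectedFault("injected outage")
        return call()
    return wrapped


def call_with_fallback(call: Callable[[], str]) -> str:
    """Resilient client: degrade to a cached answer when the dependency fails."""
    try:
        return call()
    except InjectedFault:
        return "cached-response"


def expand_blast_radius(client: Callable[[Callable[[], str]], str],
                        steps: List[float], requests_per_step: int,
                        max_user_errors: int, seed: int = 0) -> List[float]:
    """Widen the blast radius step by step; stop once the error budget is exceeded."""
    rng = random.Random(seed)
    completed: List[float] = []
    for radius in steps:
        faulty = with_fault_injection(lambda: "fresh-response", radius, rng)
        errors = 0
        for _ in range(requests_per_step):
            try:
                client(faulty)
            except Exception:
                errors += 1  # a failure that reached the user
        if errors > max_user_errors:
            break  # do not widen the experiment any further
        completed.append(radius)
    return completed


completed_steps = expand_blast_radius(call_with_fallback, [0.01, 0.05, 0.25],
                                      requests_per_step=200, max_user_errors=0)
```

Because the resilient client absorbs every injected fault, all three steps complete. A client without a fallback, such as `lambda call: call()`, would exhaust the error budget and halt the expansion, which is the behavior that starting small is meant to provide.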
|
||||
|
||||
|
||||
|
||||
4.4. How can Chaos engineering be effectively applied to microservice architecture to ensure successful implementation and enhance system resilience?

Table 6 provides a comprehensive overview of the different facets and projected implications of implementing chaos engineering within microservice architecture.

By implementing these approaches and strategies, organizations can effectively integrate chaos engineering into their microservice architectures to uncover vulnerabilities and enhance the overall dependability of their systems.

4.5. To what extent can the centralized provision of Chaos engineering effectively facilitate the management of chaos experiments across complex systems?

Table 7 provides an overview of the ways in which centralized chaos engineering can simplify experiment management in intricate systems. It emphasizes advantages like standardization, resource utilization, risk mitigation, and more, resulting in enhanced system resilience and performance.

4.6. What are the challenges reported in the relevant papers?

Table 8 concisely presents the primary obstacles in the area of chaos engineering and their respective resolutions. These obstacles encompass system intricacy, hazards to live environments, resource demands, security issues, and automation complexities. The proposed resolutions involve phased implementation, risk assessment, knowledge enhancement, robust security protocols, and automation approaches.

5. Discussion

In this section, we summarize the answers to the research questions. The answers indicate that chaos engineering can improve robustness by simulating real-world failure scenarios and exploring system reactions, especially in microservice architectures. Various tools for implementing chaos engineering are listed and compared. We conclude that applying chaos engineering requires careful planning due to inherent challenges but has the potential to greatly improve system resilience.

5.1. General discussion

In this article, we reviewed the literature on the application of chaos engineering in microservice architecture to understand the state-of-the-art. For this purpose, six research questions were defined and answered.

In RQ1, we aimed to understand how chaos engineering is applied to production environments. Chaos engineering, when adeptly applied in production settings, serves as a pivotal tool for augmenting the robustness of software systems. This approach entails conducting deliberate and controlled chaos experiments within the production environment, a strategy that is instrumental in uncovering and rectifying potential issues before they escalate into full-blown system failures, thereby bolstering system uptime [38]. Moreover, chaos engineering is characterized by the intentional injection of faults into systems. This methodology is crucial for identifying and addressing security flaws and risks, laying the groundwork for the development of resilient application architectures [56]. By replicating adverse conditions that could naturally arise in production settings, chaos engineering helps detect inherent system vulnerabilities and structural deficiencies, fostering a proactive stance towards issue mitigation [38].

Additionally, this practice involves comprehensive testing of real-world scenarios on operational systems. Such testing is vital for assessing the complete spectrum of software systems, encompassing both hardware malfunctions and software glitches, within their actual deployment contexts. This approach significantly contributes to the enhancement of overall system resilience [38]. To effectively implement chaos engineering, it is recommended to start with less complex experiments, leverage automation for these experiments, and focus on areas with either high impact or high frequency of issues. Observing the system at its limits is also crucial for reinforcing resilience [25].

In RQ2, we discuss various platforms that aim to increase the flexibility and reliability of microservice architectures through chaos experiments. Tools like Gremlin, Chaos Monkey, Chaos Toolkit, Pumba, LitmusChaos, ToxiProxy and PowerfulSeal have been utilized in industry settings to simulate different failure scenarios. These tools provide functions such as terminating processes, simulating network conditions, applying stress tests, testing security measures, and injecting faults to proactively identify weaknesses and strengthen system robustness across different technology landscapes.
|
||||
|
||||
|
||||
|
||||
|
||||
Table 7
|
||||
Centralized provision in chaos engineering.
|
||||
Approach: Standardization
Description: Centralized provision allows for the standardization of chaos engineering practices and tools across the organization. This ensures that all teams follow consistent processes and use approved tools, leading to better coordination and more reliable results [42].
Expected impact: Improved coordination and reliability of results.

Approach: Resource optimization
Description: Centralized provision enables efficient allocation of resources for chaos experiments. It allows pooling of expertise, tools, and infrastructure, reducing redundancy and optimizing resource utilization [38].
Expected impact: Enhanced resource utilization and reduced redundancy.

Approach: Risk management
Description: Centralized provision facilitates better risk management by providing oversight and governance for chaos experiments. It establishes clear guidelines, safety measures, and expected states for running experiments in production environments, ensuring controlled experimentation [42].
Expected impact: Controlled experimentation and effective risk management.

Approach: Automation and continuous testing
Description: Centralized provision supports the automation of chaos experiments to run continuously. This ensures that experiments are conducted regularly, leading to ongoing validation of system resilience and identification of potential issues before they manifest as outages [38,42].
Expected impact: Ongoing validation of system resilience and early identification of potential issues.

Approach: Knowledge sharing and collaboration
Description: A centralized approach encourages knowledge sharing and collaboration among teams. It facilitates the dissemination of best practices, lessons learned, and successful experiment designs, fostering a culture of continuous improvement and shared learning [25].
Expected impact: Promotion of a continuous improvement culture and shared learning.

Approach: Performance metrics and analysis
Description: Centralized provision enables the establishment of standardized performance metrics and analysis methods for chaos experiments. This allows for consistent measurement of system health and identification of deviations from steady-state, leading to more effective decision-making and system improvements [43].
Expected impact: Consistent system health measurement and more effective decision-making.
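The "deviations from steady-state" mentioned in the last row of Table 7 can be made concrete with a simple statistical check. The sketch below is one possible formulation, assuming latency samples in milliseconds and a tolerance band of three standard deviations around the baseline mean; the function name, the threshold, and the sample values are illustrative assumptions rather than metrics prescribed by the cited studies.

```python
from statistics import mean, stdev
from typing import Sequence


def deviates_from_steady_state(baseline: Sequence[float],
                               observed: Sequence[float],
                               tolerance_sigmas: float = 3.0) -> bool:
    """Return True when the observed mean leaves the baseline tolerance band."""
    center = mean(baseline)
    band = tolerance_sigmas * stdev(baseline)  # sample standard deviation
    return abs(mean(observed) - center) > band


# Hypothetical latency samples (ms) collected before and during two experiments.
baseline_latency_ms = [100.0, 102.0, 98.0, 101.0, 99.0]
during_fault_ms = [100.0, 103.0, 99.0, 104.0, 101.0]
during_outage_ms = [180.0, 210.0, 195.0, 220.0, 205.0]

minor_fault_deviates = deviates_from_steady_state(baseline_latency_ms, during_fault_ms)
outage_deviates = deviates_from_steady_state(baseline_latency_ms, during_outage_ms)
```

Sharing one such function across teams is a small example of the standardized analysis that centralized provision aims for, since every experiment is then judged against the same definition of steady state.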
|
||||
|
||||
|
||||
Table 8
|
||||
Challenges and solutions in chaos engineering.

Category: Complexity
Challenges: Designing and executing effective chaos experiments in large systems is complex due to intricate interdependencies within these systems.
Possible solutions: To mitigate complexity, it is recommended to start with smaller, more manageable experiments and gradually expand the scope of chaos engineering practices.
References: [25,43]

Category: Risk of impact
Challenges: Concerns about causing disruptions in the production environment, affecting users and business operations.
Possible solutions: Implementing risk analysis techniques can help prioritize experiments, focusing on less critical system components first to minimize potential impacts.
References: [45,50]

Category: Resource intensiveness
Challenges: Significant resources needed including time, expertise, and infrastructure, posing a barrier for many organizations.
Possible solutions: Addressing resource intensiveness involves providing comprehensive training and education on chaos engineering best practices and tools to equip teams with the necessary skills and knowledge.
References: [7,47]

Category: Security concerns
Challenges: Introducing controlled failures can raise security issues, potentially exposing vulnerabilities or sensitive data.
Possible solutions: To combat security concerns, robust security measures should be implemented during experiments to safeguard sensitive data and prevent unauthorized access.
References: [42,47]

Category: Tooling and automation
Challenges: Developing tools for automated chaos experiments is challenging in heterogeneous and dynamic environments.
Possible solutions: Overcoming tooling and automation challenges requires the development and use of automated tools for chaos experiments, which reduce manual efforts and facilitate continuous, unattended testing.
References: [7,33,38,40,42]
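The risk-based prioritization suggested for the "Risk of impact" challenge in Table 8 can be sketched as a simple ordering of planned experiments. The scoring rule below (criticality multiplied by blast radius) and the experiment names are hypothetical choices made for illustration; the cited studies do not prescribe a specific formula.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PlannedExperiment:
    name: str
    criticality: int     # 1 = low business impact, 5 = critical (assumed scale)
    blast_radius: float  # fraction of traffic or instances affected


def prioritize(experiments: List[PlannedExperiment]) -> List[PlannedExperiment]:
    """Order experiments so that the least risky ones run first."""
    return sorted(experiments,
                  key=lambda e: (e.criticality * e.blast_radius, e.name))


# Hypothetical backlog of experiments for a microservice system.
backlog = [
    PlannedExperiment("checkout-db-failover", criticality=5, blast_radius=0.5),
    PlannedExperiment("recommendation-latency", criticality=2, blast_radius=0.1),
    PlannedExperiment("search-pod-kill", criticality=3, blast_radius=0.2),
]

ordered_names = [e.name for e in prioritize(backlog)]
```

Running the low-score experiments first mirrors the advice in the table to begin with less critical components and expand only after earlier experiments have built confidence.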
|
||||
|
||||
|
||||
|
||||
Recent studies have emphasized the growing intersection between artificial intelligence and cybersecurity within the context of chaos engineering. AI-driven techniques are now used for real-time threat detection, anomaly prediction, and automated response mechanisms in enterprise systems. For example, generative AI models have been proposed to enhance cybersecurity frameworks by improving data privacy management and identifying potential attack vectors [59].

In RQ3, we focused on understanding how chaos engineering is implemented in microservice architectures. To enhance system resilience in microservice architectures through chaos engineering, organizations should utilize fault injection testing to replicate failures within microservices. They should also conduct hypothesis-driven experiments with a solid comprehension of the normal state and anticipated behavior during disruptions, while managing the scope of these experiments to minimize impact. Additionally, it is essential to identify and analyze resilience requirements, participate in continuous testing and improvement efforts, and integrate observability tools for real-time monitoring during fault injection tests. Moreover, organizations need to establish clear communication channels across the teams involved to ensure effective collaboration and knowledge sharing.

The answer to RQ4 highlights the significance of centralized management and monitoring in conducting chaos experiments within large-scale microservices ecosystems. It discusses the utilization of software solutions like Netflix’s Chaos Automation Platform (ChAP) and fault injection techniques such as service call manipulation. The emphasis is placed on the need for careful planning, effective communication, risk management, and continuous learning to ensure comprehensive and valuable chaos experiments for enhancing overall system resilience.

In response to RQ5, our discussion concludes that the practical implementation of chaos engineering, despite its promise to enhance system resilience, presents numerous challenges. These challenges include potential business impacts, difficulty in determining scope, the unpredictability of outcomes, time and resource constraints, system complexities, skill and knowledge prerequisites, interpretation of results, cultural readiness, and selection of appropriate tools. These all necessitate meticulous planning and skilled execution for effectiveness.

Recent studies explore the convergence of Chaos Engineering and Artificial Intelligence (AI). Large language models (LLMs) have been used to automate the chaos engineering lifecycle, managing phases from hypothesis creation to experiment orchestration and remediation [60]. Meanwhile, advances in applying chaos engineering to multi-agent AI systems suggest new directions: for example, chaos experiments applied to LLM-based multi-agent systems can surface vulnerabilities such as hallucinations, agent failures, or inter-agent communication breakdowns [61]. Together, these works show how intelligent,
|
||||
|
||||
|
||||
|
||||
|
||||
adaptive chaos frameworks might evolve in microservice-based systems as well.

Recent research also discusses specific operational challenges such as load balancing and security in the context of chaos engineering. For example, an empirical study applies delay injections under different user loads in cloud-native systems to observe how throughput and latency change under stress, providing insights into how load balancing policies perform under fault conditions [62]. In parallel, several frameworks have begun integrating security-focused chaos tests that intentionally inject faults into authentication, identity management, and access control components to ensure that security mechanisms remain effective under stress conditions [63]. These studies highlight how chaos engineering can be extended beyond performance reliability to proactively strengthen both load distribution and security resilience in microservice environments.

The main challenges faced by previous researchers and possible solutions have been discussed in the paper. The collected challenges were mainly related to the correct interpretation of chaos experiments and making sense of them. There may be more challenges, but if they were not mentioned in these articles, we could not include them. We believe that chaos engineering is still in its early stages and that adoption in the software industry will take some time.

5.2. Threats to validity

Internal validity

The validity of this systematic literature review is threatened by issues related to defining the candidate pool of papers, potential bias in selecting primary studies, data extraction, and data synthesis. The application of exclusion criteria can be influenced by the researchers’ biases, posing a potential threat to validity. We compiled a comprehensive list of exclusion criteria, and all conflicts were documented and resolved through discussions among us. Data extraction validity is crucial as it directly impacts the study results. Whenever any of us was uncertain about data extraction, the case was recorded for resolution through discussions with the team. Multiple meetings were held to minimize researcher bias.

External validity

The search for candidate papers involved using general search terms to minimize the risk of excluding relevant studies. Despite using a broad search query to acquire more articles, there remains a possibility that some papers were overlooked in electronic databases or missed due to recent publications. Furthermore, although seven widely used online databases in computer science and software engineering were searched, new papers may not have been included.

6. Conclusion

Our systematic literature review (SLR) on chaos engineering has explored its role in enhancing the resilience of software systems in production environments. Through our review, we have identified several crucial aspects that underline the effective application and challenges of chaos engineering [25].

Firstly, Chaos Engineering serves as a proactive troubleshooting approach in production environments [25]. By identifying and addressing potential malfunctions before they occur, it effectively preempts system disruptions. This proactive strategy is largely implemented through chaos engineering tools that assist in automatic fault detection, thereby minimizing potential issues in these critical environments [50].

Secondly, the essence of chaos engineering is rooted in continuous experimentation and robustness testing under real-world operational conditions. The methodology involves a systematic approach: defining a steady state, hypothesizing its impacts, conducting controlled experiments, and subsequently confirming or refuting the hypotheses. These experiments are insightful, as they reveal system behaviors in production environments, which often differ unpredictably from staging environments [36,53].

Furthermore, the effectiveness of chaos engineering is contingent on the systematic execution of chaos experiments. These experiments, utilizing advanced chaos engineering tools, need to navigate the constraints and challenges inherent in real-world operational settings. The main objective is the enhancement of system resilience, achieved by proactively identifying and preemptively addressing potential issues [46].

However, it is acknowledged that conducting chaos experiments directly in production environments might be impeded by legal or technical constraints. In such scenarios, initiating experiments in a staging environment and then gradually transitioning to the production environment offers a viable alternative. This approach ensures that the benefits of chaos engineering can still be realized, but in a more controlled and possibly less direct manner.

Our review highlights that chaos engineering is a critical methodology for ensuring the resilience and robustness of software systems. By following continuous experimentation and proactive troubleshooting, it offers a pathway to address the challenges faced in complex production environments. This SLR contributes to the scientific community by discussing these methodologies and their applications, thereby providing a framework for future research and practical implementation in the field of software system resilience.

CRediT authorship contribution statement

Emrah Esen: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation. Akhan Akbulut: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Formal analysis, Data curation. Cagatay Catal: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

[1] P. Jamshidi, C. Pahl, N.C. Mendonça, J. Lewis, S. Tilkov, Microservices: The journey so far and challenges ahead, IEEE Softw. 35 (3) (2018) 24–35, http://dx.doi.org/10.1109/MS.2018.2141039.
[2] I. Beschastnikh, P. Wang, Y. Brun, M.D. Ernst, Debugging distributed systems, Commun. ACM 59 (8) (2016) 32–37, http://dx.doi.org/10.1145/2909480.
[3] W. Ahmed, Y.W. Wu, A survey on reliability in distributed systems, J. Comput. System Sci. 79 (8) (2013) 1243–1255, http://dx.doi.org/10.1016/j.jcss.2013.02.006.
[4] D. Ma’ruf, S. Sulistyo, L. Nugroho, Applying integrating testing of microservices in airline ticketing system, Ijitee (Int. J. Inf. Technol. Electr. Eng.) 4 (2020) 39, http://dx.doi.org/10.22146/ijitee.55491.
[5] F. Dai, H. Chen, Z. Qiang, Z. Liang, B. Huang, L. Wang, Automatic analysis of complex interactions in microservice systems, Complexity 2020 (2020) 1–12, http://dx.doi.org/10.1155/2020/2128793.
[6] J. Lewis, M. Fowler, Microservices: a definition of this new architectural term (2014), 2014, URL: http://martinfowler.com/articles/microservices.html (cit. p. 26).
||||
|
||||
|
||||
|
||||
|
||||
|
||||
[7] A. Basiri, N. Behnam, R. de Rooij, L. Hochstein, L. Kosewski, J. Reynolds, C. Rosenthal, Chaos engineering, IEEE Softw. 33 (3) (2016) 35–41, http://dx.doi.org/10.1109/MS.2016.60.
[8] R.T. Munodawafa, S.K. Johl, A systematic review of eco-innovation and performance from the resource-based and stakeholder perspectives, Sustainability 11 (2019) 6067, http://dx.doi.org/10.3390/su11216067.
[9] J.M. Macharia, Systematic literature review of interventions supported by integration of ict in education to improve learners’ academic performance in stem subjects in kenya, J. Educ. Pract. 6 (2022) 52–75, http://dx.doi.org/10.47941/jep.979.
[10] P. Gerli, J.N. Marco, J. Whalley, What makes a smart village smart? a review of the literature, Transform. Gov.: People Process. Policy 16 (2022) 292–304, http://dx.doi.org/10.1108/tg-07-2021-0126.
[11] R. Coppola, L. Ardito, Quality assessment methods for textual conversational interfaces: a multivocal literature review, Information 12 (2021) 437, http://dx.doi.org/10.3390/info12110437.
[12] B. Kitchenham, O. Pearl Brereton, D. Budgen, M. Turner, J. Bailey, S. Linkman, Systematic literature reviews in software engineering – A systematic literature review, Inf. Softw. Technol. 51 (1) (2009) 7–15, http://dx.doi.org/10.1016/j.infsof.2008.09.009, Special Section - Most Cited Articles in 2002 and Regular Research Papers.
[13] N. Dragoni, S. Giallorenzo, A.L. Lafuente, M. Mazzara, F. Montesi, R. Mustafin, L. Safina, Microservices: yesterday, today, and tomorrow, 2017, arXiv:1606.04036.
[14] P.D. Francesco, I. Malavolta, P. Lago, Research on architecting microservices: Trends, focus, and potential for industrial adoption, in: 2017 IEEE International Conference on Software Architecture, ICSA, 2017, pp. 21–30, http://dx.doi.org/10.1109/ICSA.2017.24.
[15] M. Fowler, Patterns of Enterprise Application Architecture, Addison-Wesley Longman Publishing Co., Inc., USA, 2002.
[16] J. Lewis, M. Fowler, Microservices, 2014, https://martinfowler.com/articles/microservices.html.
[17] S. Newman, Building Microservices: Designing Fine-Grained Systems, O’Reilly Media, Inc., 2021.
[18] C.K. Rudrabhatla, Comparison of zero downtime based deployment techniques in public cloud infrastructure, in: 2020 Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud), I-SMAC, 2020, pp. 1082–1086, http://dx.doi.org/10.1109/I-SMAC49090.2020.9243605.
[31] J. Zhang, R. Ferydouni, A. Montana, D. Bittman, P. Alvaro, 3MileBeach: A tracer with teeth, in: Proceedings of the ACM Symposium on Cloud Computing, SoCC ’21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 458–472, http://dx.doi.org/10.1145/3472883.3486986.
[32] C.S. Meiklejohn, A. Estrada, Y. Song, H. Miller, R. Padhye, Service-level fault injection testing, in: Proceedings of the ACM Symposium on Cloud Computing, SoCC ’21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 388–402, http://dx.doi.org/10.1145/3472883.3487005.
[33] A. Blohowiak, A. Basiri, L. Hochstein, C. Rosenthal, A platform for automating chaos experiments, in: 2016 IEEE International Symposium on Software Reliability Engineering Workshops, ISSREW, 2016, pp. 5–8, http://dx.doi.org/10.1109/ISSREW.2016.52.
[34] A. Nagarajan, A. Vaddadi, Automated fault-tolerance testing, in: 2016 IEEE Ninth International Conference on Software Testing, Verification and Validation Workshops, ICSTW, 2016, pp. 275–276, http://dx.doi.org/10.1109/ICSTW.2016.34.
[35] V. Heorhiadi, S. Rajagopalan, H. Jamjoom, M.K. Reiter, V. Sekar, Gremlin: Systematic resilience testing of microservices, in: 2016 IEEE 36th International Conference on Distributed Computing Systems, ICDCS, 2016, pp. 57–66, http://dx.doi.org/10.1109/ICDCS.2016.11.
[36] R.K. Lenka, S. Padhi, K.M. Nayak, Fault injection techniques - a brief review, in: 2018 International Conference on Advances in Computing, Communication Control and Networking, ICACCCN, 2018, pp. 832–837, http://dx.doi.org/10.1109/ICACCCN.2018.8748585.
[37] A. van Hoorn, A. Aleti, T.F. Düllmann, T. Pitakrat, ORCAS: Efficient resilience benchmarking of microservice architectures, in: 2018 IEEE International Symposium on Software Reliability Engineering Workshops, ISSREW, 2018, pp. 146–147, http://dx.doi.org/10.1109/ISSREW.2018.00-10.
[38] H. Tucker, L. Hochstein, N. Jones, A. Basiri, C. Rosenthal, The business case for chaos engineering, IEEE Cloud Comput. 5 (3) (2018) 45–54, http://dx.doi.org/10.1109/MCC.2018.032591616.
[39] N. Brousse, O. Mykhailov, Use of self-healing techniques to improve the reliability of a dynamic and geo-distributed ad delivery service, in: 2018 IEEE International Symposium on Software Reliability Engineering Workshops, ISSREW, 2018, pp. 1–5, http://dx.doi.org/10.1109/ISSREW.2018.00-40.
[40] K.A. Torkura, M.I. Sukmana, F. Cheng, C. Meinel, Security chaos engineering for cloud services: Work in progress, in: 2019 IEEE 18th International Symposium
|
||||
on Network Computing and Applications, NCA, 2019, pp. 1–3, http://dx.doi.org/
|
||||
[19] S.R. Addula, P. Perugu.P, M.K. Kumar, D. Kumar, B. Ananthan, R. R, S. P, S.
|
||||
10.1109/NCA.2019.8935046.
|
||||
G, Dynamic load balancing in cloud computing using hybrid Kookaburra-Pelican
|
||||
[41] H. Chen, P. Chen, G. Yu, A framework of virtual war room and matrix sketch-
|
||||
optimization algorithms, in: 2024 International Conference on Augmented Re-
|
||||
based streaming anomaly detection for microservice systems, IEEE Access 8
|
||||
ality, Intelligent Systems, and Industrial Automation, ARIIA, 2024, pp. 1–7,
|
||||
(2020) 43413–43426, http://dx.doi.org/10.1109/ACCESS.2020.2977464.
|
||||
http://dx.doi.org/10.1109/ARIIA63345.2024.11051893.
|
||||
[42] K.A. Torkura, M.I.H. Sukmana, F. Cheng, C. Meinel, CloudStrike: Chaos engi-
|
||||
[20] M. Waseem, P. Liang, M. Shahin, A systematic mapping study on microservices
|
||||
neering for security and resiliency in cloud infrastructure, IEEE Access 8 (2020)
|
||||
architecture in devops, J. Syst. Softw. 170 (2020) 110798, http://dx.doi.org/10.
|
||||
123044–123060, http://dx.doi.org/10.1109/ACCESS.2020.3007338.
|
||||
1016/j.jss.2020.110798.
|
||||
[43] D. Kesim, A. van Hoorn, S. Frank, M. H00E4ussler, Identifying and prioritizing
|
||||
[21] C. Rosenthal, N. Jones, Chaos Engineering: System Resiliency in Practice, O’Reilly
|
||||
chaos experiments by using established risk analysis techniques, in: 2020 IEEE
|
||||
Media, 2020.
|
||||
31st International Symposium on Software Reliability Engineering, ISSRE, 2020,
|
||||
[22] L. Zhang, B. Morin, B. Baudry, M. Monperrus, Maximizing error injection realism pp. 229–240, http://dx.doi.org/10.1109/ISSRE5003.2020.00030.
|
||||
for chaos engineering with system calls, IEEE Trans. Dependable Secur. Comput. [44] Z. Long, G. Wu, X. Chen, C. Cui, W. Chen, J. Wei, Fitness-guided resilience
|
||||
19 (4) (2022) 2695–2708, http://dx.doi.org/10.1109/TDSC.2021.3069715. testing of microservice-based applications, 2020, pp. 151–158, http://dx.doi.org/
|
||||
[23] Š. Davidovič, B. Beyer, Canary analysis service, Commun. ACM 61 (5) (2018) 10.1109/ICWS49710.2020.00027.
|
||||
54–62, http://dx.doi.org/10.1145/3190566. [45] S. De, A study on chaos engineering for improving cloud software quality
|
||||
[24] L. Zhang, B. Morin, P. Haller, B. Baudry, M. Monperrus, A chaos engineering and reliability, in: 2021 International Conference on Disruptive Technologies
|
||||
system for live analysis and falsification of exception-handling in the JVM, IEEE for Multi-Disciplinary Research and Applications, CENTCON, Vol. 1, 2021, pp.
|
||||
Trans. Softw. Eng. 47 (11) (2021) 2534–2548, http://dx.doi.org/10.1109/TSE. 289–294, http://dx.doi.org/10.1109/CENTCON52345.2021.9688292.
|
||||
2019.2954871. [46] C. Konstantinou, G. Stergiopoulos, M. Parvania, P. Esteves-Verissimo, Chaos
|
||||
[25] H. Jernberg, P. Runeson, E. Engström, Getting started with chaos engineering engineering for enhanced resilience of cyber-physical systems, in: 2021 Re-
|
||||
- design of an implementation framework in practice, in: Proceedings of the silience Week, RWS, 2021, pp. 1–10, http://dx.doi.org/10.1109/RWS52686.
|
||||
14th ACM / IEEE International Symposium on Empirical Software Engineering 2021.9611797.
|
||||
and Measurement, ESEM, ESEM ’20, Association for Computing Machinery, New [47] F. Poltronieri, M. Tortonesi, C. Stefanelli, ChaosTwin: A chaos engineering and
|
||||
York, NY, USA, 2020, http://dx.doi.org/10.1145/3382494.3421464. digital twin approach for the design of resilient IT services, in: 2021 17th
|
||||
[26] A. Alkhateeb, C. Catal, G. Kar, A. Mishra, Hybrid blockchain platforms for the International Conference on Network and Service Management, CNSM, 2021,
|
||||
internet of things (IoT): A systematic literature review, Sensors 22 (4) (2022) pp. 234–238, http://dx.doi.org/10.23919/CNSM52442.2021.9615519.
|
||||
http://dx.doi.org/10.3390/s22041304. [48] N. Luo, Y. Xiong, Platform software reliability for cloud service continuity
|
||||
[27] R. van Dinter, B. Tekinerdogan, C. Catal, Predictive maintenance using digital - challenges and opportunities, in: 2021 IEEE 21st International Conference
|
||||
twins: A systematic literature review, Inf. Softw. Technol. 151 (2022) 107008, on Software Quality, Reliability and Security, QRS, 2021, pp. 388–393, http:
|
||||
http://dx.doi.org/10.1016/j.infsof.2022.107008. //dx.doi.org/10.1109/QRS54544.2021.00050.
|
||||
[28] M. Jorayeva, A. Akbulut, C. Catal, A. Mishra, Machine learning-based software [49] H. Chen, K. Wei, A. Li, T. Wang, W. Zhang, Trace-based intelligent fault diagnosis
|
||||
defect prediction for mobile applications: A systematic literature review, Sensors for microservices with deep learning, in: 2021 IEEE 45th Annual Computers,
|
||||
22 (7) (2022) http://dx.doi.org/10.3390/s22072551. Software, and Applications Conference, COMPSAC, 2021, pp. 884–893, http:
|
||||
[29] A. Basiri, L. Hochstein, N. Jones, H. Tucker, Automating chaos experiments //dx.doi.org/10.1109/COMPSAC51774.2021.00121.
|
||||
in production, in: 2019 IEEE/ACM 41st International Conference on Software [50] O. Sharma, M. Verma, S. Bhadauria, P. Jayachandran, A guided approach
|
||||
Engineering: Software Engineering in Practice, ICSE-SEIP, 2019, pp. 31–40, towards complex chaos selection, prioritisation and injection, in: 2022 IEEE
|
||||
http://dx.doi.org/10.1109/ICSE-SEIP.2019.00012. 15th International Conference on Cloud Computing, CLOUD, 2022, pp. 91–93,
|
||||
[30] L.B. Canonico, V. Vakeel, J. Dominic, P. Rodeghero, N. McNeese, Human-AI http://dx.doi.org/10.1109/CLOUD55607.2022.00025.
|
||||
partnerships for chaos engineering, in: Proceedings of the IEEE/ACM 42nd [51] N. Luo, L. Zhang, Chaos driven development for software robustness enhance-
|
||||
International Conference on Software Engineering Workshops, ICSEW ’20, As- ment, in: 2022 9th International Conference on Dependable Systems and their
|
||||
sociation for Computing Machinery, New York, NY, USA, 2020, pp. 499–503, Applications, DSA, 2022, pp. 1029–1034, http://dx.doi.org/10.1109/DSA56465.
|
||||
http://dx.doi.org/10.1145/3387940.3391493. 2022.00154.
|
||||
|
||||
|
||||
|
||||
E. Esen et al. Computer Standards & Interfaces 97 (2026) 104116
|
||||
|
||||
|
||||
[52] M.A. Naqvi, S. Malik, M. Astekin, L. Moonen, On evaluating self-adaptive and self-healing systems using chaos engineering, in: 2022 IEEE International Conference on Autonomic Computing and Self-Organizing Systems, ACSOS, 2022, pp. 1–10, http://dx.doi.org/10.1109/ACSOS55765.2022.00018.
[53] J. Simonsson, L. Zhang, B. Morin, B. Baudry, M. Monperrus, Observability and chaos engineering on system calls for containerized applications in Docker, Future Gener. Comput. Syst. 122 (2021) 117–129, http://dx.doi.org/10.1016/j.future.2021.04.001.
[54] A.A.-S. Ahmad, P. Andras, Scalability resilience framework using application-level fault injection for cloud-based software services, J. Cloud Comput. 11 (1) (2022) 1, http://dx.doi.org/10.1186/s13677-021-00277-z.
[55] C. Camacho, P.C. Cañizares, L. Llana, A. Núñez, Chaos as a software product line - A platform for improving open hybrid-cloud systems resiliency, Softw.: Pract. Exp. 52 (7) (2022) 1581–1614, http://dx.doi.org/10.1002/spe.3076.
[56] P. Raj, S. Vanga, A. Chaudhary, The observability, chaos engineering, and remediation for cloud-native reliability, in: Cloud-Native Computing: How To Design, Develop, and Secure Microservices and Event-Driven Applications, 2023, pp. 71–93, http://dx.doi.org/10.1002/9781119814795.ch4.
[57] M.A. Chang, B. Tschaen, T. Benson, L. Vanbever, Chaos monkey: Increasing sdn reliability through systematic network destruction, in: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 2015, pp. 371–372.
[58] D. Savchenko, G. Radchenko, O. Taipale, Microservices validation: Mjolnirr platform case study, in: 2015 38th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO, 2015, pp. 235–240, http://dx.doi.org/10.1109/MIPRO.2015.7160271.
[59] G.S. Nadella, S.R. Addula, A.R. Yadulla, G.S. Sajja, M. Meesala, M.H. Maturi, K. Meduri, H. Gonaygunta, Generative AI-enhanced cybersecurity framework for enterprise data privacy management, Computers 14 (2) (2025) http://dx.doi.org/10.3390/computers14020055.
[60] D. Kikuta, H. Ikeuchi, K. Tajiri, Y. Nakano, ChaosEater: Fully automating chaos engineering with large language models, 2025, arXiv preprint arXiv:2501.11107. URL https://arxiv.org/abs/2501.11107.
[61] J. Owotogbe, Assessing and enhancing the robustness of LLM-based multi-agent systems through chaos engineering, in: 2025 IEEE/ACM 4th International Conference on AI Engineering – Software Engineering for AI, CAIN, 2025, pp. 250–252, http://dx.doi.org/10.1109/CAIN66642.2025.00039.
[62] A. Al-Said Ahmad, L.F. Al-Qora’n, A. Zayed, Exploring the impact of chaos engineering with various user loads on cloud native applications: An exploratory empirical study, Computing 106 (2024) 2389–2425, http://dx.doi.org/10.1007/s00607-024-01292-z.
[63] K.A. Torkura, M.I. Sukmana, F. Cheng, C. Meinel, Security chaos engineering for cloud services: Work in progress, in: 2019 IEEE 18th International Symposium on Network Computing and Applications, NCA, 2019, pp. 1–3, http://dx.doi.org/10.1109/NCA.2019.8935046.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,830 @@
|
||||
Computer Standards & Interfaces 97 (2026) 104113
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Co-distillation-based defense framework for federated knowledge graph embedding against poisoning attacks

Yiqin Lu, Jiarui Chen ∗, Jiancheng Qin
|
||||
School of Electronic and Information Engineering, South China University of Technology, 510641, China
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO

Keywords: Federated learning; Knowledge graph; Poisoning attack; Knowledge distillation

ABSTRACT

Federated knowledge graph embedding (FKGE) enables collaborative knowledge sharing without data exchange, but it also introduces risks of poisoning attacks that degrade model accuracy or force incorrect outputs. Protecting FKGE from poisoning attacks becomes a critical research problem. This paper reveals the malicious strategy of untargeted FKGE poisoning attacks and proposes CoDFKGE, a co-distillation-based FKGE framework for defending against poisoning attacks. CoDFKGE deploys two collaborative knowledge graph embedding models on clients, decoupling prediction parameters from shared parameters as a model-agnostic solution. By designing distinct distillation loss functions, CoDFKGE transfers clean knowledge from potentially poisoned shared parameters while compressing dimensions to reduce communication overhead. Experiments show CoDFKGE preserves link prediction performance with lower communication costs, eliminates malicious manipulations under targeted poisoning attacks, and significantly mitigates accuracy degradation under untargeted poisoning attacks.
|
||||
|
||||
|
||||
|
||||
1. Introduction

Knowledge graphs (KGs) are structured representations of real-world entities and their relationships, supporting applications in search engines [1,2], recommendation systems [3,4], and security analysis [5,6]. Knowledge graph embedding (KGE) techniques project entities and relations into low-dimensional vector spaces, enabling efficient knowledge reasoning and completion [7]. Due to privacy regulations and data sensitivity requirements, KGs across organizations within the same domain remain fragmented despite growing data volumes. In this context, federated knowledge graph embedding (FKGE) emerges as a collaborative learning technique for sharing KG embeddings without data exchange. However, the introduction of federation mechanisms will bring new privacy risks. Malicious participants can inject poisoned parameters during training or aggregation to launch a poisoning attack, degrading model accuracy or forcing incorrect outputs. Consequently, protecting FKGE systems against poisoning attacks has emerged as a critical research challenge.

Unlike graph neural network (GNN)-based models, KGE models usually rely on the translation-based model [8–11]. The embedding vectors of the entities and relations in the KG are directly used as learnable parameters. KGE models utilize different score functions to measure the plausibility of triples (h,r,t). By contrasting the outputs of existing triples and negatively sampled triples, KGE models derive appropriate embeddings for entities and relations. However, real-world KGs of different organizations are often incomplete, making it difficult to train high-quality knowledge graph reasoning models. Moreover, KG data often contains a large amount of private data, and direct data sharing will inevitably lead to privacy leakage. For this reason, federated learning [12] is introduced into knowledge graph reasoning.

FKGE assumes that there are multiple participants with complementary but incomplete KGs, aiming to derive optimal knowledge embeddings for each participant without data exchange. Most existing studies [13–15] model FKGE as multiple clients that maintain local KGE models and a central server. Clients train models locally and upload the model parameters to the central server, which aggregates the parameters and then returns them to the clients.

However, since the embedding vectors are directly the model parameters, FKGE is highly vulnerable to poisoning attacks. Poisoning attacks are malicious modifications of parameters, made during local training or during parameter aggregation on the server, with the intent to reduce model performance, steal sensitive information, or disrupt system stability. To protect the participants of FKGE, it is necessary to propose a protection mechanism against FKGE poisoning attacks.

Moreover, other related indicators in FKGE deserve attention. For example, the federated learning of KGE requires frequent parameter
|
||||
|
||||
|
||||
|
||||
∗ Corresponding author.
|
||||
E-mail addresses: eeyqlu@scut.edu.cn (Y. Lu), ee_jrchen@mail.scut.edu.cn (J. Chen), jcqin@scut.edu.cn (J. Qin).
|
||||
|
||||
https://doi.org/10.1016/j.csi.2025.104113
|
||||
Received 3 June 2025; Received in revised form 8 November 2025; Accepted 8 December 2025
|
||||
Available online 9 December 2025
|
||||
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
|
||||
Y. Lu et al. Computer Standards & Interfaces 97 (2026) 104113
|
||||
|
||||
|
||||
exchange, and the use of a translation-based model requires submitting the entity or relation embeddings, which makes the communication overhead greater than that of traditional federated learning.

Knowledge distillation [16] is a model compression technique that improves the performance of a simple (student) model by transferring the knowledge from a complex (teacher) model. Distillation-based methods are considered to be a feasible solution to combat poisoning attacks [17–19]. A teacher model can extract clean knowledge from the poisoned parameters and transfer it to a student model, thereby improving the robustness without changing the model structure. Co-distillation [20] is a variant of knowledge distillation that trains two or more models simultaneously, allowing mutual learning and information sharing. This paper aims to design a federated knowledge graph defense framework based on co-distillation, which can enhance the model’s resistance to poisoning attacks through collaborative learning without changing the original FKGE architecture.

The rest of this paper is organized as follows. Section 2 reviews the related work on FKGE and knowledge distillation. Section 3 introduces the preliminary concepts and methodologies essential for addressing FKGE poisoning attacks, with the main contributions of this paper summarized at the end of this section. In Section 4, we detail the threat model and malicious strategies for targeted and untargeted poisoning attacks in FKGE. Section 5 presents the CoDFKGE framework for defending against FKGE poisoning attacks, followed by experimental validation in Section 6. Finally, concluding remarks and future research directions are outlined in Section 7.

2. Related work

2.1. Basic FKGE framework

Early research on FKGE mainly focused on how to achieve cross-client knowledge sharing and model aggregation while protecting data privacy. FedE [13] is the first paper to introduce federated learning into KGE. FedE facilitates cross-client knowledge sharing by maintaining an entity table. Nevertheless, the mechanism of sharing entity embeddings in FedE has been proven to contain privacy vulnerabilities [21]. Attackers can leverage the embedding information to infer the existence of private triples within client datasets. Based on FedE, FedEC [14] applies embedding contrastive learning for tackling data heterogeneity and utilizes a global update procedure for sharing entity embeddings. In response to the privacy vulnerability of FedE, FedR [15] proposed a privacy-preserving relation embedding aggregation method. By sharing relation embeddings instead of entity embeddings, FedR can significantly reduce the communication overhead and privacy leakage risks while retaining the semantic information of the KG.

2.2. Knowledge distillation in FKGE

Knowledge distillation techniques are widely applied in the FKGE field due to their advantages in model compression and knowledge transfer. To cope with the drift between local optimization and global convergence caused by data heterogeneity, FedLU [22] proposes mutual knowledge distillation. Moreover, it contains an unlearning method to erase specific knowledge from local clients. FedKD [23] uses knowledge distillation to reduce communication costs, and proposes to adaptively learn temperature to scale the scores of triples to mitigate teacher over-confidence issues. In addition to FKGE, the KGE model ColE [24] proposes co-distillation learning to exploit the complementarity of graph structure and text information. It employs Transformer and BERT for graph and text respectively, then distills selective knowledge from each other’s prediction logits. Overall, existing research on knowledge distillation in FKGE primarily focuses on handling data heterogeneity, with insufficient exploration of its potential value in model security. This paper will explore the application of knowledge distillation in FKGE security to defend against poisoning attacks.

2.3. Poisoning attack in federated learning

Federated Learning (FL), due to its distributed training nature, creates favorable conditions for poisoning attacks while protecting data privacy. Poisoning attacks in federated learning have attracted significant attention from researchers [25]. In federated learning scenarios, poisoning attacks pose serious threats to model security by manipulating partial training data or local models to embed malicious behaviors [26]. The work in [27] generates stealthy backdoor triggers by extracting high-frequency features from images using discrete wavelet transform and introduces an asymmetric frequency confusion mechanism, achieving efficient backdoor attacks on multiple datasets. Meanwhile, many studies have proposed defense methods against poisoning attacks. The work in [28] proposes the Krum method, which selects the most reliable gradient update by evaluating the consistency of gradients, thereby effectively defending against poisoning attacks. The work in [29] proposes FL-Defender, which improves robustness by introducing cosine similarity to adjust the weights of parameter aggregation. The work in [30] proposes a two-stage backdoor defense method called MCLDef based on Model Contrastive Learning (MCL), which can significantly reduce the success rate of backdoor attacks with only a small amount of clean data. In summary, existing research on poisoning attacks in federated learning mainly focuses on traditional deep learning domains. The design ideas of defense frameworks have laid the foundation for subsequent poisoning attack defense methods of FKGE.

2.4. Security issues in FKGE

With the development of FKGE, its security and privacy issues have attracted increasing attention, with existing research mainly focusing on privacy leakage defense. The work in [31] proposed a decentralized scalable learning framework where embeddings from different KGs can be learned in an asynchronous and peer-to-peer manner while being privacy-preserving. The work in [21] conducts the first holistic study of the privacy threat on FKGE from both attack and defense perspectives. It introduced three new inference attacks and proposed a differentially private FKGE model DP-Flames with private selection and an adaptive privacy budget allocation policy. Based on [21], the work in [32] introduces five new inference attacks and proposes PDP-Flames, which leverages the sparse gradient nature of FKGE for a better privacy-utility trade-off.

Compared with privacy leakage issues, research on defending against poisoning attacks in FKGE is still in its early stages. Traditional federated learning typically does not directly transmit original embeddings. However, entity and relation embeddings are core components in translation-based KGE, so direct transmission of embeddings is required during FKGE aggregation. Direct malicious modifications to embeddings are difficult to effectively defend against using traditional federated learning defense methods.

The recent work [33] is the first to systematize the risks of FKGE poisoning attacks. However, it primarily focuses on several forms of targeted poisoning attacks in FKGE, without mentioning untargeted poisoning attacks. Although this research provides some defense suggestions, such as zero-knowledge proof and private set intersection, it does not propose specific defense methods. In summary, the existing research lacks a systematic introduction to the untargeted poisoning attack of FKGE, and there is no complete defense method against FKGE poisoning attacks.

To address the above issues, this paper reveals the malicious strategy of FKGE untargeted poisoning attacks and proposes CoDFKGE, a co-distillation-based federated knowledge graph embedding framework for defending against poisoning attacks. The main contributions of this paper are summarized as follows.
|
||||
|
||||
|
||||
|
||||
|
||||
1. We systematically define untargeted poisoning attacks in FKGE and reveal the poisoning attacks’ malicious strategy, thereby enhancing threat identification in FKGE and providing a foundation for subsequent defense research.

2. We propose CoDFKGE, the first co-distillation defense framework against poisoning attacks in FKGE. By deploying bidirectional distillation models with distinct distillation losses at the client side, CoDFKGE, as a model-agnostic solution, decouples prediction parameters from shared parameters, thereby enhancing the model’s resistance to poisoning attacks and improving robustness. We designed distinct distillation loss functions for the two models in CoDFKGE, enabling CoDFKGE to transfer clean knowledge from potentially poisoned shared parameters and compress shared parameter dimensions, which reduces communication overhead.

3. We validated the performance of CoDFKGE against poisoning attacks through experiments. The results show that, without compromising link prediction performance, CoDFKGE can completely eliminate targeted poisoning attacks and significantly mitigate the performance degradation caused by untargeted poisoning attacks, while simultaneously reducing communication overhead. Ablation experiments further confirm the effectiveness of the two distillation loss functions in CoDFKGE.

3. Preliminaries

3.1. Knowledge graph embedding

A KG can be represented as $\mathcal{G}(\mathcal{E}, \mathcal{R}, \mathcal{T})$, where $\mathcal{E}$ and $\mathcal{R}$ are entity sets and relationship sets. $\mathcal{T}$ is a set of triples, where a triple $(h, r, t) \in \mathcal{T}$ indicates that a relationship $r \in \mathcal{R}$ connects the entities $h, t \in \mathcal{E}$.

Translation-based KGE models project entities and relationships in KGs into a continuous vector space. Models employ the scoring function $g(h, r, t; \theta)$ to evaluate the plausibility of triples, where $\theta$ represents the embedding parameters. During model training, negative samples $(h, r, t')$ are constructed by randomly replacing the tail entities of positive triples. The training process aims to maximize the score discrepancy between positive and negative samples. Currently, most KGE models [9,11] employ the binary cross-entropy loss to measure the difference between positive and negative samples. Its mathematical expression is given in Eq. (1).

$$L = -\sum_{(h,r,t)\in\mathcal{T}} \Big( \log \sigma\big(g(h,r,t;\theta) - \gamma\big) + \sum_{i} p(h,r,t'_i;\theta)\, \log \sigma\big(\gamma - g(h,r,t'_i;\theta)\big) \Big) \tag{1}$$

Here, $\gamma$ represents the margin, and $(h, r, t'_i)$ is the $i$-th negative triple. $p(h, r, t'_i; \theta)$ stands for the occurrence probability of this negative sample given the embedding parameters $\theta$.

3.2. Federated knowledge graph embedding

FKGE is an application of federated learning that aims to fuse and share knowledge vectors from different KGs to enhance the effectiveness of KGE. Currently, most related studies are based on the framework proposed in FedE [13].

The basic framework of FKGE consists of a client set $C$ and a central server $S$. Each client $c \in C$ holds a local KG $\mathcal{G}_c(\mathcal{E}_c, \mathcal{R}_c, \mathcal{T}_c)$. The entity sets of different KGs are partially overlapping, so the understanding of entities in a certain client can be supplemented by information from other clients. The server has the one-hot existence matrix $M \in \mathbb{R}^{C \times N}$ of all entities in the clients, where $N$ is the number of entities.

In each client, KGE model parameters consist of local parameters $\theta_L$ and shared parameters $\theta_S$. During FKGE training, each epoch progresses through two sequential phases: client update and server aggregation. In the $k$-th client update stage, client $c$ first trains its local KGE model to update its local embedding $\theta_{L_c}^{k}$ and server-shared embedding $\theta_{S_c}^{k}$. Then, client $c$ uploads its shared embedding $\theta_{S_c}^{k}$ to the server. In the server aggregation stage, the central server $S$ aggregates the shared embeddings from all clients to obtain the shared parameters $\theta_{S}^{k+1}$. Finally, the server broadcasts the shared parameters $\theta_{S}^{k+1}$ to all clients. Entity embeddings in KGE are usually shared parameters, while relation embeddings are local parameters. Only rarely, as in [15], are relation embeddings used as shared parameters.

In FKGE, how the server effectively aggregates shared embeddings from different clients is a common problem. The most common FKGE server aggregation method is FedE [13], which is an improvement on FedAvg [12]. To handle the imbalance in the number of entities across different clients, FedE aggregates the shared entities using the number of occurrences in the local data as the weight $w_c$. This weight value can be obtained using the existence matrix $M$ mentioned above. The mathematical expression for FedE’s server aggregation method is shown in Eq. (2).

$$\theta_{S}^{k+1} = \sum_{c} w_c\, \theta_{S_c}^{k} \tag{2}$$

The final target of FKGE is to minimize the loss function of all client local triplets simultaneously through federated learning. Its optimization objective can be expressed as Eq. (3).

$$\arg\min_{(\theta_{L_c},\, \theta_{S_c})} \sum_{c \in C} \mathcal{L}_c(\theta_{L_c}, \theta_{S_c}) \tag{3}$$

3.3. Knowledge distillation

Knowledge distillation is a model compression technique that transfers knowledge contained in a complex model (teacher) to a simple model (student) to improve the performance of the simple model. In the classic knowledge distillation framework, the student model’s training loss comprises two components: the cross entropy loss $L_{CE}$, computed between its output and the true label, and the distillation loss $L_{KD}$, computed between its output and the teacher model’s output (soft label). In practical applications, the distillation loss is usually quantified using the Kullback–Leibler divergence $D_{KL}$ between the student model output and the soft label, and its mathematical expression is shown in Eq. (4).

$$D_{KL}\big(p_{tea} \,\|\, p_{stu}\big) = \sum_{i} p_{tea}(i) \log \frac{p_{tea}(i)}{p_{stu}(i)}, \qquad L_{KD} = \tau^{2}\, D_{KL}\big(\sigma(z_{tea}^{(n)}) \,\|\, \sigma(z_{stu}^{(n)})\big), \quad \text{where } \sigma(x) = \mathrm{softmax}\Big(\frac{x}{\tau}\Big) \tag{4}$$

Here, $z_{tea}$ and $z_{stu}$ are the logits of the teacher model and student model, respectively. $\tau$ is the temperature coefficient, which is used to control the smoothness of the output.

To allow the student model to effectively absorb the knowledge contained in the teacher model while fitting the real data distribution, the final loss function is usually the weighted sum of $L_{CE}$ and $L_{KD}$.

4. Threat model

Poisoning attacks in federated learning can be categorized into targeted poisoning attacks, semi-targeted poisoning attacks, and untargeted poisoning attacks according to the intention of attackers [34]. In FKGE, a semi-targeted poisoning attack can be regarded as a special case of a targeted poisoning attack. Therefore, this paper focuses on the targeted and untargeted poisoning attack types.

4.1. Targeted poisoning attack

A targeted poisoning attack is an attack strategy in which the attacker crafts specific malicious triples that do not exist in the target system and manipulates the target model to accept these fake triples by injecting poisoned parameters into the shared parameters. This type of attack poses a serious threat to the application of FKGE, as the false relationships it introduces can lead to reasoning errors and decision-making
|
||||
|
||||
|
||||
Y. Lu et al. Computer Standards & Interfaces 97 (2026) 104113
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 1. Process of targeted poisoning attack.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 2. Framework of CoDFKGE model.
|
||||
|
||||
|
||||
|
||||
|
||||
biases in downstream tasks. For example, in financial transaction networks, a knowledge graph is constructed with transaction entities as nodes and transaction relationships as edges. Link prediction can then be applied to detect potential transaction relationships (such as money laundering or fraud). If an attacker compromises one of the participants, they can introduce false transaction relationships through targeted poisoning attacks, leading to unreasonable inferences about the victim entity.

To execute such an attack successfully, the attacker typically follows a multi-stage process that begins with gathering the victim's local information. Fig. 1 shows the process of a targeted poisoning attack. In FKGE systems, while the server can observe the entities and relations each client possesses, it lacks visibility into how these elements are structured into specific triples. However, for frameworks that share entity embeddings (such as FedE [13]), recent research [21] has shown that a malicious server can use the KGE scoring function to infer the victim's local relationship patterns and reconstruct the victim's triples $\mathcal{T}_v$. Armed with this inferred knowledge, the attacker strategically constructs malicious triples $\mathcal{T}_m$ that align with the victim's existing KG schema but represent false information.

The next critical attack phase involves training a shadow model, a surrogate KGE model designed to mimic the victim's learning process. The shadow model is trained on a poisoned dataset $\mathcal{T}_p$, which combines the inferred victim triples $\mathcal{T}_v$ and the malicious triples $\mathcal{T}_m$. This training strategy ensures the shadow model learns to generate embeddings that are consistent with both the victim's genuine knowledge and the attacker's deceptive information. The shadow model's parameters include $\theta_{S_p}$, which can be initialized with the victim's shared parameters $\theta_{S_c}$, and $\theta_{L_p}$, which approximates the victim's local model parameters $\theta_{L_c}$ from random initial values. To ensure the shadow model effectively bridges both the victim's genuine knowledge and the attacker's malicious objectives, its parameters are optimized to minimize the loss function across all triples in the poisoned dataset, as formalized in Eq. (5).

$$\operatorname*{arg\,min}_{(\theta_{S_p},\,\theta_{L_p})} \sum_{(h,r,t)\in\mathcal{T}_p} L(h, r, t; \theta_{S_p}, \theta_{L_p}) \tag{5}$$

Where $L$ is the loss function of the baseline model.

After training the shadow model, the attacker extracts the poisoned shared parameters $\theta_{S_p}$ using the same procedure that legitimate clients employ to prepare parameters for server aggregation. The attacker can aggregate the poisoned parameters $\theta_{S_p}$ with the normal clients' shared parameters. The attacker usually operates as a compromised server and assigns a disproportionately high weight to the poisoned parameters during the aggregation process to ensure that the poisoned parameters dominate the aggregated shared parameters.

The final stage of the attack exploits the implicit trust in federated systems. The victim client, unaware of the poisoning, directly incorporates the compromised aggregated parameters into its local training process without validation. As a result, the victim's model gradually learns to accept the malicious triples as valid, ultimately producing incorrect predictions on these non-existent relationships while maintaining seemingly normal performance on other parts of the KG.
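The effect of the skewed aggregation weight can be illustrated with a small numerical sketch. This is not the paper's implementation: the function name `weighted_average`, the toy two-dimensional vectors, and the plain weighted mean are assumptions chosen for illustration; the weight of 256 mirrors the value used later in the targeted attack experiment.

```python
def weighted_average(vectors, weights):
    """Weighted mean of equally sized parameter vectors (hypothetical helper)."""
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(w * v[d] for v, w in zip(vectors, weights)) / total
        for d in range(dim)
    ]

# Toy shared embeddings for one entity, as uploaded by three honest clients.
honest = [[1.0, 0.0], [0.8, 0.2], [1.2, -0.2]]
# Toy poisoned embedding produced by the attacker's shadow model.
poisoned = [-5.0, 5.0]

# Equal weights: the poisoned vector is only one voice among four.
uniform = weighted_average(honest + [poisoned], [1, 1, 1, 1])
# Skewed weights: the poisoned vector counts 256 times as much as each honest one.
skewed = weighted_average(honest + [poisoned], [1, 1, 1, 256])

print("uniform:", uniform)
print("skewed: ", skewed)
```

With equal weights the aggregate stays between the honest and poisoned values, whereas with the skewed weights it lands almost exactly on the poisoned vector, which is what the victim then loads as its shared parameters.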
|
||||
|
||||
|
||||
|
||||
|
||||
4.2. Untargeted poisoning attack

The conditions for achieving a targeted poisoning attack are complex. For example, FedR [15] shares only relation embeddings (not entity embeddings), preventing attackers from inferring victim relations via entity matrices and thus avoiding targeted poisoning attacks. Even with relational data leaks, targeted poisoning attacks are difficult. Compared with sharing entity embeddings, the sparsity of relation embeddings reduces the shadow model's ability to align parameters with the victim's vector space. However, FedR has almost no defense effect against untargeted poisoning attacks.

An untargeted poisoning attack means that the attacker aims to disrupt victim model convergence or maximize the mispredictions among test cases. By maximizing the victim's loss function during training, attackers can force non-convergent predictions. The attacker can generate the poisoned shared parameter $\theta^{*}_{S_v}$ for the victim, which can be formalized in Eq. (6).

$$\operatorname*{arg\,max}_{\theta^{*}_{S_v}} \sum_{(h,r,t)\in\mathcal{T}_v} L(h, r, t; \theta^{*}_{S_v}, \theta_{L_v}) \tag{6}$$

Among them, $\theta_{L_v}$ denotes the victim's local parameters. $\mathcal{T}_v$ is the victim's triple set. Since it is difficult for the attacker to obtain these two quantities directly, they can use random values as guesses for $\theta_{L_v}$ and use triples formed from random combinations of the victim's entities $\mathcal{E}_v$ and relations $\mathcal{R}$ as guesses for $\mathcal{T}_v$.

In particular, for the TransE model [7] with the scoring function $g(h, r, t) = |h + r - t|$, the attacker can launch an untargeted poisoning attack by setting the shared parameter $\theta'_{S_v}$ sent to the victim to an identical value or by using negative aggregation parameters. To avoid detection, noise is often added to poisoned parameters. The prediction performance of the victim model may even be lower than that of standalone training without federated aggregation.

In general, the success of FKGE poisoning attacks relies on victims using attacker-provided aggregate parameters directly for training without validation. To prevent poisoning attacks, it is critical to isolate the parameters of the prediction model from externally provided aggregate parameters. Specifically, potentially poisoned shared parameters must be filtered before training. Meanwhile, minimizing parameter exposure to the external environment is essential. Therefore, we propose CoDFKGE, a defense FKGE framework based on co-distillation.

5. Model design

CoDFKGE is a training framework on the client side. Its training process is shown in Fig. 2. CoDFKGE initializes two baseline models with the same structure and scoring function, but for different purposes. The communication model is mainly responsible for receiving and processing shared parameters, while the prediction model is used for the final embedding and prediction. To minimize potential parameter leakage and communication overhead, the feature dimension of the communication model is intentionally designed to be smaller than that of the prediction model.

During the training process, the two models learn collaboratively through knowledge distillation. Once the communication model receives the potentially poisoned shared parameters from the server, it acts as a teacher model to transfer clean knowledge to the prediction model. Following the training of the prediction model, the roles are reversed: the prediction model becomes the teacher, and the communication model serves as the student for distillation. This stage extracts knowledge from the prediction model and compresses it into the communication model, ensuring efficient knowledge sharing while minimizing parameter exposure and communication overhead. By deploying two distinct model instances, the framework physically isolates attacker-injected parameters from the prediction model's parameters, making poisoning attacks significantly more difficult to execute. To facilitate the reproducibility of our CoDFKGE model, we provide the complete training framework pseudocode as shown in Algorithm 1.

Algorithm 1 CoDFKGE Training Framework

Require: Baseline KGE model $g$, Training triples $\mathcal{T}$, Learning rate $\eta$, Distillation weight $\beta$, Distillation temperature $\tau$, Total iterations $K$
Initialization:
 1: Initialize client-side prediction model with $\theta^{P}_{0} = (\theta^{S}_{0}, \theta^{L}_{0})$ ⊳ Local parameters randomly initialized
 2: Initialize client-side communication model with reduced feature dimensions
 3: Initialize server-side aggregated parameters $\theta^{S}_{1} = \theta^{S}_{0}$ ⊳ First round initialization
Main Training Loop (Iterations $k = 1, 2, \ldots, K$):
// Client Update Phase (For each client)
 4: for each client $c \in C$ do
 5:   // Step 1: Communication to Prediction Model Distillation
 6:   Load server-shared parameters $\theta^{S}_{k}$ ⊳ Latest global shared embeddings
 7:   Initialize communication model with $\theta^{C} = (\theta^{S}_{k}, \theta^{C_L}_{k-1})$
 8:   Freeze communication model parameters ⊳ Act as teacher model
 9:   Compute distillation loss $L^{P_k}_{KD}$ using Equation (7) ⊳ Only positive samples
10:   Compute KGE loss $L^{P_k}_{KGE}$ on training triples
11:   Update prediction model parameters $(\theta^{P_S}_{k}, \theta^{P_L}_{k})$ with:
12:     $\nabla\theta^{P}_{k} = \nabla\big(\beta L^{P_k}_{KGE} + (1-\beta) L^{P_k}_{KD}\big)$ ⊳ Gradient flows through prediction model only
13:     $\theta^{P}_{k} = \theta^{P}_{k} - \eta\nabla\theta^{P}_{k}$, where $\theta^{P}_{k} = \{\theta^{P_L}_{k}, \theta^{P_S}_{k}\}$ ⊳ Update prediction model parameters
14:   Unfreeze communication model parameters
15:   // Step 2: Prediction to Communication Model Distillation
16:   Freeze prediction model parameters $\theta^{P}_{k}$ ⊳ Used as teacher model
17:   Compute distillation loss $L^{C_k}_{KD}$ using Equation (9) ⊳ Both positive and negative samples
18:   Update communication model parameters $(\theta^{C_S}_{k}, \theta^{C_L}_{k})$ with
19:     $\nabla\theta^{C}_{k} = \nabla L^{C_k}_{KD}$ ⊳ Gradient flows through communication model only
20:     $\theta^{C}_{k} = \theta^{C}_{k} - \eta\nabla\theta^{C}_{k}$, where $\theta^{C}_{k} = \{\theta^{C_S}_{k}, \theta^{C_L}_{k}\}$
21:   Upload updated shared parameters $\theta^{C_S}_{k}$ to server
22:   Unfreeze prediction model parameters
23: end for
// Server Aggregation Phase
24: Server aggregates $\theta^{S}_{k+1}$ from all clients using the baseline federated aggregation method.
25: Set $k = k + 1$ and repeat main loop until $k > K$ ⊳ Continue Main Training Loop
return Final prediction model parameters of each client.

CoDFKGE is designed to be model-agnostic, enabling seamless integration with diverse FKGE models based on their shared parameter types. Both communication and prediction models used by CoDFKGE clients utilize the same scoring function $g$ as the original KGE model. Clients upload and utilize shared parameters identically to the baseline model, with these parameters maintaining the same form and dimensionality as the original implementation. This parameter compatibility enables the server to aggregate updates using existing federated learning aggregation methods without modification. This design ensures that CoDFKGE preserves the original knowledge representation capabilities while maintaining consistent operational semantics with the baseline model.
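The distillation steps of Algorithm 1 build on the temperature-scaled KL divergence of Eq. (4), and step 12 combines it with the KGE loss through the weight 𝛽. The following is a minimal sketch of those two pieces in plain Python, assuming list-valued logits; the function names `softmax_with_temperature`, `kd_loss`, and `prediction_loss` are hypothetical and do not come from the paper's code.

```python
import math

def softmax_with_temperature(logits, tau):
    """Softmax of logits / tau, shifted by the max for numerical stability."""
    scaled = [z / tau for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, tau=2.0):
    """tau^2 * KL(softmax(teacher / tau) || softmax(student / tau)), as in Eq. (4)."""
    p_tea = softmax_with_temperature(teacher_logits, tau)
    p_stu = softmax_with_temperature(student_logits, tau)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_tea, p_stu) if pt > 0.0)
    return tau ** 2 * kl

def prediction_loss(kge_loss, distill_loss, beta=0.5):
    """Weighted sum beta * L_KGE + (1 - beta) * L_KD used in step 12."""
    return beta * kge_loss + (1.0 - beta) * distill_loss

# Identical teacher and student outputs give zero distillation loss;
# disagreeing outputs give a positive one.
same = kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = kd_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print(same, different, prediction_loss(2.0, different))
```

A higher temperature flattens both distributions before they are compared, and the factor of the squared temperature keeps the loss magnitude comparable across temperatures. In a real training loop the teacher's logits would be computed without gradient tracking, matching the freeze and unfreeze steps of the algorithm.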
|
||||
|
||||
|
||||
|
||||
|
||||
5.1. Communication to prediction model distillation

In the first iteration, the model trains the prediction component following the standard procedure. Starting from the second iteration of the training process, the communication model loads the server-shared parameters $\theta^{S}_{k}$ and initializes itself jointly with the local embeddings $\theta^{L}_{k-1}$ from the previous iteration's local prediction model.

After the communication model receives and applies the server-shared parameters, it filters out potentially poisoned model parameters through knowledge distillation. The communication model acts as a teacher model to transfer clean knowledge to the prediction model, which serves as the student model. During this process, the communication model parameters are frozen to ensure that the knowledge transfer direction is strictly from the communication model to the prediction model. Gradients only flow through the prediction model parameters, preventing gradient leakage back to potentially poisoned shared parameters.

If the communication model suffers from poisoning attacks and contains poisoned parameters, its outputs for negative samples are not reliable. Distilling or teaching such uncertain predictions would propagate noise rather than useful knowledge. To exclude the poisoned knowledge, the prediction model should focus on positive samples during distillation, ensuring that only trustworthy knowledge is transferred. The mathematical expression for the distillation loss of the prediction model in the $k$th training epoch is provided in Eq. (7).

$$L^{P_k}_{KD} = \tau^2 \sum_{(h,r,t)\in\mathcal{T}} D_{KL}\Big( \sigma\big(g(h,r,t;\theta^{S}_{k},\theta^{P_L}_{k-1})\big) \,\Big\|\, \sigma\big(g(h,r,t;\theta^{P_S}_{k},\theta^{P_L}_{k})\big) \Big) \tag{7}$$

Among them, $\tau$ is the distillation temperature coefficient, and $\sigma$ is the softmax function of the ratio of the model output to $\tau$. $g$ represents the scoring function of the prediction model, which is used to compute the KGE loss. $g(h,r,t;\theta^{S}_{k},\theta^{P_L}_{k-1})$ represents the communication model output under server-shared parameter $\theta^{S}_{k}$ and local parameter $\theta^{P_L}_{k-1}$, and $g(h,r,t;\theta^{P_S}_{k},\theta^{P_L}_{k})$ represents the output of the prediction model being trained.

During distillation training, the model also needs to consider the KGE loss function. The overall loss function of the prediction model is the weighted sum of the KGE loss and the distillation loss, and its mathematical expression is shown in Eq. (8).

$$L^{P_k} = \beta L^{P_k}_{KGE} + (1-\beta) L^{P_k}_{KD} \tag{8}$$

Where $L^{P_k}_{KGE}$ is the KGE loss of the $k$th epoch of the prediction model defined by Eq. (1), and $\beta$ is the weight.

5.2. Prediction to communication model distillation

After training the prediction model, we train the communication model through distillation, which extracts and propagates knowledge without directly sharing prediction parameters, thereby avoiding privacy leakage. During the communication model's distillation, the output of the prediction model under positive and negative samples serves as soft labels. As Eq. (1) illustrates, the loss function must account for the probability of negative samples when balancing the impact of positive and negative predictions. Therefore, the distillation loss function of the communication model is formalized in Eq. (9).

$$L^{C_k}_{KD} = \tau^2 \sum_{(h,r,t)\in\mathcal{T}} \Big( D_{KL}\big( \sigma(g(h,r,t;\theta^{P_S}_{k},\theta^{P_L}_{k})) \,\big\|\, \sigma(g(h,r,t;\theta^{C_S}_{k},\theta^{C_L}_{k})) \big) + \sum_{i} p(h,r,t'_i)\, D_{KL}\big( \sigma(g(h,r,t'_i;\theta^{P_S}_{k},\theta^{P_L}_{k})) \,\big\|\, \sigma(g(h,r,t'_i;\theta^{C_S}_{k},\theta^{C_L}_{k})) \big) \Big) \tag{9}$$

Among them, $g(h,r,t;\theta^{C_S}_{k},\theta^{C_L}_{k})$ represents the communication model output. $g(h,r,t;\theta^{P_S}_{k},\theta^{P_L}_{k})$ represents the prediction model output under shared parameter $\theta^{P_S}_{k}$ and local parameter $\theta^{P_L}_{k}$. The calculation method of $p$ follows the approach in [9], with its mathematical formulation provided in Eq. (10).

$$p(h,r,t'_i) = \frac{\exp\big(\tau_\alpha\, g(h,r,t'_i)\big)}{\sum_{j} \exp\big(\tau_\alpha\, g(h,r,t'_j)\big)} \tag{10}$$

Where $\tau_\alpha$ is the self-adversarial sampling temperature.

After the bidirectional distillation process of CoDFKGE, the communication model parameters are updated to $\theta^{C_S}_{k}$ and $\theta^{C_L}_{k}$. The client then uploads $\theta^{C_S}_{k}$ to the server, which aggregates these parameters from all clients using federated averaging to generate the next round's shared parameters $\theta^{S}_{k+1}$.

6. Experiments

Experiments are conducted on the openly available dataset FB15K-237 [35], which is a subset of Freebase, containing 14,505 entities, 544,230 triples, and 474 relations. To perform federated learning, we adopt the relational partitioning method in [22]. This method first partitions the relationships through clustering, ensuring that the triple relationships within each partition are as close as possible. Then, these partitions are divided into groups of roughly equal numbers of triples and distributed to the clients. This results in tighter triple relationships within each client, better reflecting real-world scenarios.

The TransE model [7] is selected as the KGE model, serving as the foundation for all federated learning methods in the experiments, including the attacker's shadow model. To benchmark CoDFKGE, we select multiple baseline models. First, the local training model without federated learning is selected as the KGE baseline model. It does not share parameters between clients, so it has no communication overhead and is not vulnerable to poisoning attacks. Then, FedE [13] and FedR [15] are also chosen as baseline FKGE models, representing standard approaches in the field. Additionally, we implement a knowledge distillation model, which utilizes communication and prediction models similar to CoDFKGE but only performs unidirectional knowledge distillation. Specifically, it uses the communication model as the teacher model and the prediction model as the student model to filter out poisoning knowledge, with the distillation loss function following Eq. (4).

All experiments are performed on a 72-core Ubuntu 18.04.6 LTS machine with an Intel(R) Xeon(R) Gold 5220 CPU @ 2.20 GHz and a V100S-PCIE-32GB GPU. We implemented the proposed FKGE framework and baseline models based on PyTorch Geometric [36] and the distributed AI framework Ray [37]. We used KGE hyperparameter settings based on [9] and FKGE hyperparameter settings based on FedE [13]. Specifically, we used the Adam [38] optimizer with a learning rate of 1e-3. The margin $\gamma$ is 10, and the self-adversarial negative sampling temperature $\tau_\alpha$ in KGE is 1. The distillation temperature $\tau$ is 2, and the coefficient $\beta$ is 0.5, so the distillation and KGE losses are weighted equally. The maximum training epoch is 400. In each epoch, the client performs 3 iterations locally before uploading the parameters to the server.

We utilize the link prediction task, a sub-task of KGE, to validate the model's accuracy. Following the common implementation of link prediction, we employ the Mean Reciprocal Rank (MRR) and Hits@N as accuracy metrics. The MRR is the average of the reciprocals of the ranks of the predicted triples among all possible triples. Mathematically, if $rank_i$ is the rank of the correct triple for the $i$th query, and $n$ is the total number of queries, then $MRR = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{rank_i}$. The Hits@N is the proportion of query triples for which the correct triple is present among the top $N$ candidates generated by the model. Generally, higher values for both metrics indicate better model performance in link prediction.

Through experiments, the following research questions will be verified.

RQ1 Does CoDFKGE maintain KGE prediction performance while reducing FKGE communication overhead?
RQ2 Can CoDFKGE effectively defend against targeted poisoning attacks?
|
||||
|
||||
|
||||
|
||||
|
||||
Table 1
|
||||
Experiment result on normal link prediction.
|
||||
Fed type Model Mem(MB) CC(MB) MRR Hits@1 Hits@5 Hits@10
|
||||
Local Local(128) 57.05 – 0.4081 ± 0.0015 0.3066 ± 0.0014 0.5223 ± 0.0023 0.6077 ± 0.0015
|
||||
Entity FedE(128) 185.58 42.60 0.4082 ± 0.0004 0.3068 ± 0.0012 0.5232 ± 0.0013 0.6080 ± 0.0018
|
||||
Entity Distillation (128-128) 356.10 42.60 0.4129 ± 0.0008 0.3118 ± 0.0016 0.5279 ± 0.0008 0.6122 ± 0.0003
|
||||
Entity CoDFKGE (128-128) 356.10 42.60 0.4109 ± 0.0043 0.3097 ± 0.0041 0.5246 ± 0.0044 0.6087 ± 0.0040
|
||||
Entity Distillation (32-128) 217.39 10.65 0.3914 ± 0.0011 0.2935 ± 0.0008 0.5005 ± 0.0014 0.5838 ± 0.0032
|
||||
Entity CoDFKGE (32-128) 217.40 10.65 0.4090 ± 0.0010 0.3079 ± 0.0007 0.5233 ± 0.0019 0.6068 ± 0.0019
|
||||
Relation FedR(128) 75.49 0.69 0.4085 ± 0.0011 0.3079 ± 0.0021 0.5219 ± 0.0016 0.6066 ± 0.0017
|
||||
Relation Distillation (128-128) 151.74 0.69 0.4106 ± 0.0013 0.3092 ± 0.0023 0.5242 ± 0.0008 0.6098 ± 0.0009
|
||||
Relation CoDFKGE (128-128) 150.02 0.69 0.4065 ± 0.0007 0.3056 ± 0.0013 0.5190 ± 0.0023 0.6063 ± 0.0012
|
||||
Relation Distillation (32-128) 94.53 0.17 0.3920 ± 0.0012 0.2960 ± 0.0007 0.4996 ± 0.0019 0.5807 ± 0.0013
|
||||
Relation CoDFKGE (32-128) 93.69 0.17 0.4078 ± 0.0009 0.3060 ± 0.0007 0.5224 ± 0.0031 0.6074 ± 0.0015
|
||||
|
||||
|
||||
|
||||
RQ3 Can CoDFKGE effectively defend against untargeted poisoning attacks?
RQ4 Do the two proposed distillation loss functions individually contribute to poisoning defense?

6.1. Normal link prediction (RQ1)

To explore the performance of the proposed model in normal link prediction, we first tested the model on a conventional dataset. The performance of the model is measured using MRR and Hits@1, Hits@5, and Hits@10. The model is trained by federated learning and evaluated on the local test sets of clients.

Table 1 lists the performance of the local KGE model, FedE, FedR, and CoDFKGE with different dimensions. The experimental results are grouped according to the type of shared embeddings and the dimension of the prediction model. The parameter dimensions are specified in parentheses within the "Model" column. For example, CoDFKGE(32-128) denotes the CoDFKGE model with a 32-dimensional communication model and a 128-dimensional prediction model. All link prediction experiments were repeated 5 times with different random seeds, and the accuracy results of all models are reported as (mean ± standard deviation). The best performing model results in each group (excluding the local model) are bolded. The results of the CoDFKGE(32-128) model that are better than those of Distillation(32-128) are underlined.

The performance of locally trained models is lower than most federated learning models, highlighting the advantages of sharing model parameters. High-dimensional Distillation(128-128) models achieve better link prediction performance. Compared to Distillation(128-128), CoDFKGE models show slightly inferior prediction performance. However, among models with the same dimensions, CoDFKGE(128-128) outperforms both the local baseline and FedE when entity embeddings are shared, while it is marginally below FedR when relation embeddings are shared (Table 1). The co-distillation process in CoDFKGE may lead to a loss of generalization accuracy. We believe that the main advantage of CoDFKGE is its ability to enhance the security of FKGE. In addition to the security performance demonstrated in Sections 6.2 and 6.3, it also maintains link prediction performance comparable to its baseline FKGE models.

Beyond accuracy metrics, the "CC" (Communication Cost) column reports the communication overhead per training epoch, which is calculated based on the byte size of the PyTorch Embedding used in the implementation. The "Mem" column shows the GPU memory usage of federated models in MB. Distillation-based models maintain two KGE models and therefore need more GPU memory and computational resources to store and train the parameters of both. In exchange, compared with sharing parameters of the same size as the prediction model, distillation-based models can compress the parameters in the communication model, achieving significantly lower communication overhead. In this low-overhead setting, CoDFKGE(32-128) outperforms Distillation(32-128) in link prediction performance. Therefore, we believe that the CoDFKGE model does not degrade the normal link prediction performance of baseline FKGE models and can effectively reduce the communication overhead of the model.

6.2. Targeted poisoning attack experiment (RQ2)

In the targeted poisoning attack, 32 pairs of non-existent triples are selected as attack targets from the victim's KG through negative sampling to construct a poisoned triple dataset. First, a predetermined number of normal triples are selected from the victim's training triples. Subsequently, the head or tail nodes of these triples are randomly replaced, and any triples already existing in the training set are iteratively removed until 32 pairs of non-existent triples are successfully constructed. In each epoch, the shadow model undergoes the same number of local training rounds as legitimate clients on the poisoned dataset to generate poisoned parameters. The malicious server aggregates these poisoned parameters with the parameters of the normal clients into shared parameters and distributes them to all clients. Attackers can assign high weights to poisoned model parameters during aggregation. Following the setup in Ref. [33], we set the aggregation weight of the attacker's poisoned parameters to be 256 times that of normal parameters. Experiments focus on models with shared entity parameters (required for targeted poisoning attacks) and non-federated local baselines.

For space considerations, this section reports only MRR and Hits@10 metrics. Attack effectiveness is measured by the MRR and Hits@10 of poisoned triples on the victim. Higher metrics on the poisoned triples indicate greater vulnerability to poisoning and weaker resistance of the model to targeted poisoning attacks.

Table 2 lists the performance of baseline models and CoDFKGE under targeted poisoning attacks, grouped by the prediction model dimension. The parameter dimensions are specified in parentheses within the "Model" column. The "All clients" column reports average performance across all clients' test sets during attacks, while "Victim poison" measures the victim's performance on predicting poisoned triples. All experiments were repeated 5 times with different random seeds, and the results are reported as (mean ± standard deviation). The best performing model results are bolded. Moreover, the "Communication poison" column highlights the communication model's performance on poisoned triples for CoDFKGE and the distillation model, demonstrating that both communication models are impacted by targeted poisoning attacks. Through distillation, the prediction accuracy of poisoned triples by the prediction model decreases in both cases.

For targeted poisoning attacks, the primary evaluation metrics should be the MRR and Hits@10 performance indicators of the victim model when predicting poisoned triples. The Local training model, which does not employ federated learning, remains immune to poisoning attacks, resulting in low MRR for poisoned triples, with the Hits@10 value being exactly 0. This indicates that the unpoisoned Local model does not include non-existent poisoned triples among its top 10 candidate results when making predictions. If a model incorrectly ranks non-existent poisoned test triples among the top 10 candidates, it demonstrates that the poisoning attack has successfully manipulated the model's predictions. Therefore, we use Hits@10 as the metric to measure the Attack Success Rate (ASR).
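The MRR and Hits@N metrics used in these experiments follow directly from the rank of the correct triple for each query. The snippet below is a minimal sketch of that arithmetic, assuming 1-based ranks are already available; the function names `mean_reciprocal_rank` and `hits_at_n` and the example ranks are illustrative and not taken from the paper's code.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all queries (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_n(ranks, n):
    """Fraction of queries whose correct triple is ranked within the top n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)

# Hypothetical ranks of the correct triple for four queries.
example_ranks = [1, 2, 10, 50]

mrr = mean_reciprocal_rank(example_ranks)
hits1 = hits_at_n(example_ranks, 1)
hits10 = hits_at_n(example_ranks, 10)
print(mrr, hits1, hits10)
```

When the queries are the poisoned triples, the same `hits_at_n(ranks, 10)` value is what the text reads as the Attack Success Rate: a value of 0 means no poisoned triple reached the victim's top 10 candidates.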
|
||||
|
||||
|
||||
|
||||
|
||||
Table 2
|
||||
Experiment result under targeted poisoning attack.
|
||||
Model All clients Victim poison Communication poison
|
||||
MRR Hits@10 MRR Hits@10(ASR) MRR Hits@10
|
||||
Local(128, unpoisoned) 0.4081 ± 0.0015 0.6077 ± 0.0015 0.0003 ± 0.0001 0.0000 ± 0.0000 – –
|
||||
FedE(128) 0.4034 ± 0.0035 0.6004 ± 0.0029 0.4450 ± 0.0938 0.7857 ± 0.1248 – –
|
||||
Distillation(128-128) 0.4026 ± 0.0025 0.6006 ± 0.0039 0.0844 ± 0.0552 0.2000 ± 0.1311 0.4999 ± 0.1429 0.7714 ± 0.1046
|
||||
CoDFKGE(128-128) 0.4086 ± 0.0007 0.6089 ± 0.0012 0.0010 ± 0.0003 0.0009 ± 0.0005 0.4694 ± 0.1511 0.6589 ± 0.1242
|
||||
Distillation(32-128) 0.3821 ± 0.0022 0.5717 ± 0.0018 0.1511 ± 0.3356 0.1960 ± 0.4362 0.4919 ± 0.2364 0.6625 ± 0.1887
|
||||
CoDFKGE(32-128) 0.3856 ± 0.0039 0.5740 ± 0.0054 0.0010 ± 0.0001 0.0010 ± 0.0003 0.3794 ± 0.0032 0.5702 ± 0.005
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 3. Performance degradation comparison.
|
||||
|
||||
|
||||
The FedE model maintains high prediction accuracy on normal test triples when under attack, but exhibits abnormally high MRR and Hits@10 metrics for targeted poisoned triples, even exceeding those of normal triples. This indicates that targeted poisoning attacks can effectively manipulate the FedE model to generate incorrect prediction results. Similarly, in distillation-based models, the communication models are severely affected by poisoning attacks, while the impact on the prediction models is relatively minor. Although the Distillation(128-128) model can partially eliminate poisoning knowledge, it still remains vulnerable to targeted poisoning attacks. Moreover, as the dimension of the communication model parameters increases, the extent of the model's vulnerability to poisoning attacks also grows.

In contrast, CoDFKGE's prediction model performs distillation learning exclusively on verified positive samples, effectively eliminating potential poisoning knowledge that might exist in negative samples. Similar to the Local training model, CoDFKGE achieves extremely low MRR and Hits@10 metrics for poisoned triples, which indicates that the CoDFKGE model can effectively defend against targeted poisoning attacks in FKGE. Furthermore, due to the compression of the communication model's dimension, the amount of information that attackers can transmit is correspondingly reduced, making the communication model in CoDFKGE(32-128) less susceptible to poisoning attacks.

6.3. Untargeted poisoning attack experiment (RQ3)

In untargeted poisoning attack experiments, the attacker returns negative aggregate parameters to the victim client, preventing the victim model from converging and degrading its prediction performance. The results presented in this section reflect average prediction performance on local test triples of clients.

Table 3 lists the performance of each model under untargeted poisoning attacks, grouped by the prediction model dimension and federated type. The parameter dimensions are specified in parentheses within the "Model" column. The "All clients" column shows the average performance of all clients under untargeted poisoning attacks, and the "Victim" column shows the performance of the victim client. To measure the severity of the model being attacked, the MRR of the local model in Table 1 is used as a benchmark. The "Decay ratio" column shows the ratio of performance degradation on the victim client compared to the local model shown in Table 1. All experiments were repeated 5 times with different random seeds, and the results
Y. Lu et al. Computer Standards & Interfaces 97 (2026) 104113
Table 3
Experiment result under untargeted poisoning attack.

Fed type  Model                   All clients                       Victim                            Decay ratio (%)
                                  MRR              Hits@10          MRR              Hits@10          MRR    Hits@10
Entity    FedE(128)               0.3896 ± 0.0010  0.5939 ± 0.0009  0.3625 ± 0.0102  0.5620 ± 0.0144  11.21  7.58
Entity    Distillation(128-128)   0.3900 ± 0.0017  0.5921 ± 0.0007  0.3641 ± 0.0012  0.5664 ± 0.0018  11.82  7.54
Entity    CoDFKGE(128-128)        0.4084 ± 0.0007  0.6068 ± 0.0003  0.4017 ± 0.0010  0.6009 ± 0.0005  2.25   1.28
Entity    Distillation(32-128)    0.3024 ± 0.0208  0.5422 ± 0.0105  0.2739 ± 0.0264  0.5262 ± 0.0124  30.02  9.49
Entity    CoDFKGE(32-128)         0.4093 ± 0.0018  0.6081 ± 0.0014  0.4022 ± 0.0022  0.6023 ± 0.0011  1.66   0.75
Relation  FedR(128)               0.3915 ± 0.0010  0.5951 ± 0.0016  0.3637 ± 0.0093  0.5636 ± 0.0150  10.96  7.10
Relation  Distillation(128-128)   0.3978 ± 0.0017  0.6022 ± 0.0019  0.3881 ± 0.0023  0.5942 ± 0.0028  5.51   2.56
Relation  CoDFKGE(128-128)        0.4086 ± 0.0017  0.6075 ± 0.0029  0.4014 ± 0.0020  0.6018 ± 0.0037  1.24   0.75
Relation  Distillation(32-128)    0.3058 ± 0.0079  0.5463 ± 0.0029  0.2787 ± 0.0101  0.5307 ± 0.0038  27.78  8.61
Relation  CoDFKGE(32-128)         0.4090 ± 0.0008  0.6066 ± 0.0011  0.4026 ± 0.0008  0.6018 ± 0.0013  1.27   0.92
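
The "Decay ratio" columns above can be read as a relative drop against a no-attack baseline. The following is a minimal Python sketch under that reading: it assumes the ratio is (baseline - victim) / baseline × 100, which the text describes only in words. The function name `decay_ratio` and the baseline value are illustrative assumptions, since the Table 1 figures are not reproduced in this section.

```python
def decay_ratio(baseline: float, victim: float) -> float:
    """Relative performance drop of the victim versus a baseline, in percent.

    Assumed definition (not quoted from the paper):
    (baseline - victim) / baseline * 100.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - victim) / baseline * 100.0


# Hypothetical baseline MRR; NOT taken from the paper's Table 1.
example_baseline_mrr = 0.40
# Victim MRR of FedE(128) as listed in Table 3.
example_victim_mrr = 0.3625

print(f"decay ratio: {decay_ratio(example_baseline_mrr, example_victim_mrr):.2f}%")
```

A positive value means the victim performs worse than the baseline; a value of zero means the attack had no measurable effect under this definition.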
Table 4
Ablation study in normal link prediction and under targeted attack.

Model           Link prediction                   Targeted all clients              Targeted victim poisoning (targeted poisoning ASR)
                MRR              Hits@10          MRR              Hits@10          MRR              Hits@10
CoDFKGE         0.4112 ± 0.0039  0.6084 ± 0.0036  0.4086 ± 0.0007  0.6089 ± 0.0012  0.0010 ± 0.0003  0.0009 ± 0.0005
Ablation(Comm)  0.4095 ± 0.0016  0.6074 ± 0.0014  0.4086 ± 0.0022  0.6076 ± 0.0021  0.0017 ± 0.0008  0.0013 ± 0.0008
Ablation(Pred)  0.4132 ± 0.0006  0.6116 ± 0.0012  0.4098 ± 0.0011  0.6080 ± 0.0009  0.8086 ± 0.0064  0.9702 ± 0.0228

are reported as (mean ± standard deviation). The best and second-best results in each group are marked in bold and underline, respectively.

From the experimental results, it can be observed that when subjected to untargeted poisoning attacks, the CoDFKGE series models achieve the best MRR and Hits@10 compared to other models. In this context, all models exhibit varying degrees of decline in both their overall performance metrics and their performance metrics on victims. In Fig. 3, we present a comparison of the prediction performance of various models under normal link prediction and untargeted poisoning attack scenarios. The Distillation(32-128) model experiences the most significant performance degradation; for the Distillation(128-128), FedE, and FedR models, the degradation is also substantial and cannot be ignored. These models directly incorporate poisoned global knowledge as an integral part of their own models, which adversely affects their convergence. In contrast, the performance degradation of CoDFKGE models stays within 3%. This is because even in the absence of global knowledge, the prediction model of CoDFKGE still utilizes local data knowledge for training, and its training effectiveness is comparable to that of local KGE models without knowledge sharing.

Baseline models may have their results manipulated or exhibit significant performance degradation when facing poisoning attacks. Although distillation models exhibited performance advantages in link prediction experiments, their defense effectiveness is extremely limited when facing poisoning attacks. In contrast, CoDFKGE remains unmanipulated when encountering targeted poisoning attacks and does not exhibit significant performance degradation when subjected to untargeted poisoning attacks, demonstrating its effective defense capability against poisoning attacks.

6.4. Ablation study (RQ4)

This section evaluates the defensive effects of applying different loss functions in CoDFKGE against poisoning attacks. Specifically, we compare the performance of models using 128-dimensional training parameters for both communication and prediction models across normal link prediction, targeted poisoning attack scenarios, and untargeted poisoning attack scenarios. Two ablation baselines were implemented: Ablation(Comm) applies the baseline loss function (Eq. (4)) solely during the communication module's distillation, while Ablation(Pred) uses it exclusively for the prediction module's distillation.

Tables 4 and 5 show the experiment results of models with different distillation loss functions sharing entity embeddings. All experiments were repeated 5 times with different random seeds, and the results are reported as (mean ± standard deviation). The best results are bolded.

Experimental results demonstrate that while Ablation(Pred) performs well in conventional link prediction, its resistance to poisoning attacks lags behind the other two models because its loss function does not employ a negative sample exclusion strategy. Among the remaining two models, while both demonstrate robust resilience against poisoning attacks, the CoDFKGE model achieves superior link prediction performance compared to Ablation(Comm). Ablation(Comm) employs a baseline loss function during the distillation training of the communication model. In contrast, the CoDFKGE model adopts the approach from [9] and utilizes self-adversarial sampling temperature 𝜏𝛼 to reweight negative samples, thereby enhancing the model's ability to distinguish between negative samples. Overall, the ablation experiments demonstrate that applying the proposed distillation loss functions simultaneously enhances the model's capability in defending against poisoning attacks and in link prediction.

7. Conclusion

This paper proposes CoDFKGE, a co-distillation-based defense framework for FKGE poisoning attacks. As the first co-distillation defense framework against poisoning attacks in FKGE, CoDFKGE does have some limitations. First, maintaining two separate models requires higher computational resource consumption on clients. Second, the bidirectional distillation process may lead to a loss of generalization accuracy. In contrast, CoDFKGE's advantages lie in its model-agnostic applicability to existing FKGE models without compromising performance. By decoupling clients' prediction models from shared parameter models, CoDFKGE effectively filters out poisoned knowledge embedded in shared updates. CoDFKGE eliminates malicious manipulations under targeted poisoning attacks, and significantly mitigates accuracy degradation under untargeted poisoning attacks. Leveraging distillation, the framework further reduces communication overhead. This work provides new ideas for enhancing the security of FKGE.

The limitations of FKGE poisoning defense research are partially rooted in the unique characteristics of KGE. When considering translation-based KGE models in FKGE, sharing entity or relation embeddings introduces risks related to both privacy preservation and poisoning attacks. Employing GNN-based KGE models in FKGE that transmit GNN parameters or gradients can alleviate these concerns. However, due to their superior robustness to sparse data and lower computational resource requirements, translation-based models still maintain unparalleled advantages in specific application scenarios.
Table 5
Ablation study under untargeted attack.

Model           Untargeted all clients            Untargeted victim                 Decay ratio (%)
                MRR              Hits@10          MRR              Hits@10          MRR    Hits@10
CoDFKGE         0.4084 ± 0.0007  0.6068 ± 0.0003  0.4017 ± 0.0010  0.6009 ± 0.0005  2.25   1.27
Ablation(Comm)  0.4056 ± 0.0017  0.6062 ± 0.0011  0.3996 ± 0.0018  0.6003 ± 0.0013  2.42   1.16
Ablation(Pred)  0.3951 ± 0.0011  0.6022 ± 0.0008  0.3852 ± 0.0009  0.5951 ± 0.0005  6.76   2.69
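
The ablation variants compared in Tables 4 and 5 differ in whether negative samples are reweighted with a self-adversarial sampling temperature, an idea the paper attributes to [9]. As a rough illustration only, the sketch below computes such weights as a softmax over temperature-scaled scores of the negative triples, which is the general form used in [9]. The function name `self_adversarial_weights`, the parameter name `alpha`, and the sample scores are assumptions for this example; the exact CoDFKGE loss is not reproduced here.

```python
import math
from typing import List, Sequence


def self_adversarial_weights(neg_scores: Sequence[float], alpha: float = 1.0) -> List[float]:
    """Softmax over alpha * score for a batch of negative triples.

    Higher-scoring (harder) negatives receive larger weights.
    alpha = 0 reduces to uniform weighting.
    """
    if len(neg_scores) == 0:
        raise ValueError("neg_scores must not be empty")
    scaled = [alpha * s for s in neg_scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical plausibility scores for three negative triples.
weights = self_adversarial_weights([1.0, 2.0, 3.0], alpha=1.0)
print(weights)
```

In a training loop these weights would typically multiply the per-negative loss terms, so that easy negatives contribute little to the gradient.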

For future research, we recommend exploring the application of the CoDFKGE framework in more complex real-world scenarios, such as personalized FKGE problems. Additionally, in large-scale dynamic KG environments, the security landscape for FKGE may undergo significant changes, necessitating further investigation into defense methods tailored to these evolving scenarios.

CRediT authorship contribution statement

Yiqin Lu: Supervision. Jiarui Chen: Writing – original draft, Software, Methodology. Jiancheng Qin: Writing – review & editing.

Declaration of Generative AI and AI-assisted technologies in the writing process

During the preparation of this work the author(s) used DeepSeek in order to improve language and readability. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work is supported by the Special Project for Research and Development in Key Areas of Guangdong Province, under Grant 2019B010137001.

Data availability

Data will be made available on request.

References

[1] X. Zhao, H. Chen, Z. Xing, C. Miao, Brain-inspired search engine assistant based on knowledge graph, IEEE Trans. Neural Netw. Learn. Syst. 34 (8) (2021) 4386–4400.
[2] S. Sharma, Fact-finding knowledge-aware search engine, in: Data Management, Analytics and Innovation: Proceedings of ICDMAI 2021, vol. 2, Springer, 2021, pp. 225–235.
[3] Y. Jiang, Y. Yang, L. Xia, C. Huang, DiffKG: Knowledge graph diffusion model for recommendation, in: Proceedings of the 17th ACM International Conference on Web Search and Data Mining, WSDM '24, Association for Computing Machinery, New York, NY, USA, ISBN: 9798400703713, 2024, pp. 313–321.
[4] W. Wang, X. Shen, B. Yi, H. Zhang, J. Liu, C. Dai, Knowledge-aware fine-grained attention networks with refined knowledge graph embedding for personalized recommendation, Expert Syst. Appl. 249 (2024) 123710.
[5] J. Chen, Y. Lu, Y. Zhang, F. Huang, J. Qin, A management knowledge graph approach for critical infrastructure protection: Ontology design, information extraction and relation prediction, Int. J. Crit. Infrastruct. Prot. (ISSN: 1874-5482) 43 (2023) 100634.
[6] Y. Zhang, J. Chen, Z. Cheng, X. Shen, J. Qin, Y. Han, Y. Lu, Edge propagation for link prediction in requirement-cyber threat intelligence knowledge graph, Inform. Sci. (ISSN: 0020-0255) 653 (2024) 119770.
[7] A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston, O. Yakhnenko, Translating embeddings for modeling multi-relational data, Adv. Neural Inf. Process. Syst. 26 (2013).
[8] Z. Wang, J. Zhang, J. Feng, Z. Chen, Knowledge graph embedding by translating on hyperplanes, in: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 28, 2014.
[9] Z. Sun, Z.-H. Deng, J.-Y. Nie, J. Tang, RotatE: Knowledge graph embedding by relational rotation in complex space, 2019, arXiv preprint arXiv:1902.10197.
[10] Z. Zhang, J. Jia, Y. Wan, Y. Zhou, Y. Kong, Y. Qian, J. Long, Transr*: Representation learning model by flexible translation and relation matrix projection, J. Intell. Fuzzy Systems 40 (5) (2021) 10251–10259.
[11] T. Dettmers, P. Minervini, P. Stenetorp, S. Riedel, Convolutional 2D knowledge graph embeddings, in: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (1), 2018.
[12] B. McMahan, E. Moore, D. Ramage, S. Hampson, B.A. y Arcas, Communication-efficient learning of deep networks from decentralized data, in: Artificial Intelligence and Statistics, PMLR, 2017, pp. 1273–1282.
[13] M. Chen, W. Zhang, Z. Yuan, Y. Jia, H. Chen, FedE: Embedding knowledge graphs in federated setting, in: Proceedings of the 10th International Joint Conference on Knowledge Graphs, 2021, pp. 80–88.
[14] M. Chen, W. Zhang, Z. Yuan, Y. Jia, H. Chen, Federated knowledge graph completion via embedding-contrastive learning, Knowl.-Based Syst. 252 (2022) 109459.
[15] K. Zhang, Y. Wang, H. Wang, L. Huang, C. Yang, X. Chen, L. Sun, Efficient federated learning on knowledge graphs via privacy-preserving relation embedding aggregation, 2022, arXiv preprint arXiv:2203.09553.
[16] G. Hinton, O. Vinyals, J. Dean, Distilling the knowledge in a neural network, 2015, arXiv preprint arXiv:1503.02531.
[17] N. Papernot, P. McDaniel, X. Wu, S. Jha, A. Swami, Distillation as a defense to adversarial perturbations against deep neural networks, in: 2016 IEEE Symposium on Security and Privacy, SP, IEEE, 2016, pp. 582–597.
[18] K. Yoshida, T. Fujino, Countermeasure against backdoor attack on neural networks utilizing knowledge distillation, J. Signal Process. 24 (4) (2020) 141–144.
[19] K. Yoshida, T. Fujino, Disabling backdoor and identifying poison data by using knowledge distillation in backdoor attacks on deep neural networks, in: Proceedings of the 13th ACM Workshop on Artificial Intelligence and Security, 2020, pp. 117–127.
[20] R. Anil, G. Pereyra, A. Passos, R. Ormandi, G.E. Dahl, G.E. Hinton, Large scale distributed neural network training through online distillation, 2018, arXiv preprint arXiv:1804.03235.
[21] Y. Hu, W. Liang, R. Wu, K. Xiao, W. Wang, X. Li, J. Liu, Z. Qin, Quantifying and defending against privacy threats on federated knowledge graph embedding, in: Proceedings of the ACM Web Conference 2023, 2023, pp. 2306–2317.
[22] X. Zhu, G. Li, W. Hu, Heterogeneous federated knowledge graph embedding learning and unlearning, in: Proceedings of the ACM Web Conference 2023, 2023, pp. 2444–2454.
[23] X. Zhang, Z. Zeng, X. Zhou, Z. Shen, Low-dimensional federated knowledge graph embedding via knowledge distillation, 2024, arXiv preprint arXiv:2408.05748.
[24] Y. Liu, Z. Sun, G. Li, W. Hu, I know what you do not know: Knowledge graph embedding via co-distillation learning, in: Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022, pp. 1329–1338.
[25] F. Xia, W. Cheng, A survey on privacy-preserving federated learning against poisoning attacks, Clust. Comput. 27 (10) (2024) 13565–13582.
[26] J. Chen, H. Yan, Z. Liu, M. Zhang, H. Xiong, S. Yu, When federated learning meets privacy-preserving computation, ACM Comput. Surv. (ISSN: 0360-0300) 56 (12) (2024).
[27] J. Xia, Z. Yue, Y. Zhou, Z. Ling, Y. Shi, X. Wei, M. Chen, Waveattack: Asymmetric frequency obfuscation-based backdoor attacks against deep neural networks, Adv. Neural Inf. Process. Syst. 37 (2024) 43549–43570.
[28] P. Blanchard, E.M. El Mhamdi, R. Guerraoui, J. Stainer, Machine learning with adversaries: Byzantine tolerant gradient descent, Adv. Neural Inf. Process. Syst. 30 (2017).
[29] N.M. Jebreel, J. Domingo-Ferrer, Fl-defender: Combating targeted attacks in federated learning, Knowl.-Based Syst. 260 (2023) 110178.
[30] Z. Yue, J. Xia, Z. Ling, M. Hu, T. Wang, X. Wei, M. Chen, Model-contrastive learning for backdoor elimination, in: Proceedings of the 31st ACM International Conference on Multimedia, 2023, pp. 8869–8880.
[31] H. Peng, H. Li, Y. Song, V. Zheng, J. Li, Differentially private federated knowledge graphs embedding, in: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, CIKM '21, Association for Computing Machinery, New York, NY, USA, ISBN: 9781450384469, 2021, pp. 1416–1425.
[32] Y. Hu, Y. Wang, J. Lou, W. Liang, R. Wu, W. Wang, X. Li, J. Liu, Z. Qin, Privacy risks of federated knowledge graph embedding: New membership inference attacks and personalized differential privacy defense, IEEE Trans. Dependable Secur. Comput. (2024).
[33] E. Zhou, S. Guo, Z. Ma, Z. Hong, T. Guo, P. Dong, Poisoning attack on federated knowledge graph embedding, in: Proceedings of the ACM Web Conference 2024, 2024, pp. 1998–2008.
[34] G. Xia, J. Chen, C. Yu, J. Ma, Poisoning attacks in federated learning: A survey, IEEE Access 11 (2023) 10708–10722.
[35] K. Toutanova, D. Chen, P. Pantel, H. Poon, P. Choudhury, M. Gamon, Representing text for joint embedding of text and knowledge bases, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015, pp. 1499–1509.
[36] M. Fey, J.E. Lenssen, Fast graph representation learning with PyTorch Geometric, in: ICLR Workshop on Representation Learning on Graphs and Manifolds, 2019.
[37] P. Moritz, R. Nishihara, S. Wang, A. Tumanov, R. Liaw, E. Liang, M. Elibol, Z. Yang, W. Paul, M.I. Jordan, I. Stoica, Ray: A distributed framework for emerging AI applications, in: 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), USENIX Association, Carlsbad, CA, ISBN: 978-1-939133-08-3, 2018, pp. 561–577.
[38] D.P. Kingma, J. Ba, Adam: A method for stochastic optimization, 2014, arXiv preprint arXiv:1412.6980.

File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,875 @@
Journal of Systems Architecture 160 (2025) 103361

Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc

EDF-based Energy-Efficient Probabilistic Imprecise Mixed-Criticality Scheduling

Yi-Wen Zhang ∗, Jin-Long Zhang
College of Computer Science and Technology, Huaqiao University, Xiamen, 361021, China

ARTICLE INFO

Keywords: Imprecise Mixed-Criticality; Energy management; DVFS; Probabilistic schedulability

ABSTRACT

We focus on Mixed-Criticality Systems (MCS), which involve the integration of multiple subsystems with varying levels of criticality on shared hardware platforms. The classic MCS task model assumes hard real-time constraints and no Quality-of-Service (QoS) for low-criticality tasks in high-criticality mode. Many researchers have put forward a range of extensions to the classic MCS task model to make MCS theory more applicable in industry practice. In this paper, we consider an Imprecise MCS taskset scheduled with the Earliest Deadline First algorithm on a uniprocessor platform, and propose an Energy-Efficient Task Execution Model that guarantees (deterministic or probabilistic) schedulability, allows degraded QoS to low-criticality tasks in high-criticality mode, and applies Dynamic Voltage and Frequency Scaling to save energy.

1. Introduction

Mixed-Criticality Systems (MCS) [1] involve the integration of multiple sub-systems with varying criticality levels on a shared hardware platform. Relevant examples of criticality levels come from the automotive safety certification standard ISO 26262 and the avionics safety certification standard DO-178C. Since the introduction of the MCS concept by Vestal [2], there has been considerable research conducted on this topic [1,3,4]. Many researchers have put forward a range of extensions to the classic MCS task model to make MCS theory more applicable in industry practice, including:

• To reduce the pessimism in task worst-case execution time (WCET) estimation and system schedulability analysis, researchers have proposed probabilistic schedulability analysis techniques where the task WCETs (and/or periods) are represented by random variables, and the system is allowed to miss deadlines with a small probability [5].
• The original assumption that all low-criticality (LO) tasks are discarded in high-criticality (HI) mode is likely to be undesirable in industry practice, hence researchers have proposed various approaches to allow a certain level of degraded Quality-of-Service (QoS) to LO tasks in HI mode [1].
• To address energy-constrained safety–critical systems, researchers have proposed power and energy-aware scheduling algorithms with Dynamic Voltage and Frequency Scaling (DVFS) for MCS [6].

In this paper, we consider all the above different aspects within a unified framework. We consider an Imprecise MCS probabilistic taskset scheduled with the Earliest Deadline First (EDF) algorithm on a uniprocessor platform, and propose an Energy-Efficient Task Execution Model that guarantees (deterministic or probabilistic) schedulability, allows degraded QoS to LO tasks in HI mode, and applies DVFS to save energy. Although the work in [7] is the closest to ours, there are several key differences. Firstly, it schedules tasks under a non-preemptive fixed-priority (NPFP) [8] scheduling policy, while our work schedules tasks with preemptive EDF. Secondly, it uses probabilistic WCET (pWCET) to determine the probability of mode transition and uses a deterministic schedulability analysis, while our work includes deterministic or probabilistic schedulability analysis. Finally, it uses response time analysis to determine schedulability, while our work uses the Demand Bound Function (DBF). In short, this work is the first to address the energy issue and schedulability test of an Imprecise MCS probabilistic taskset scheduled under EDF.

The remainder of the paper is organized as follows. We present background and related work in Section 2. Section 3 presents preliminaries. Section 4 presents our probabilistic IMC scheduling; Section 5 presents the Energy-Efficient Task Execution Model; Section 6 presents experimental results; Section 7 discusses practical issues. Finally, Section 8 presents conclusions and future work.

∗ Corresponding author.
E-mail addresses: zyw@hqu.edu.cn (Y.-W. Zhang), sang_yunl@stu.hqu.edu.cn (J.-L. Zhang).

https://doi.org/10.1016/j.sysarc.2025.103361
Received 11 September 2024; Received in revised form 3 February 2025; Accepted 4 February 2025
Available online 12 February 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

Y.-W. Zhang and J.-L. Zhang Journal of Systems Architecture 160 (2025) 103361

2. Background and related work

2.1. Background and motivation

Resource-constrained embedded systems. In order to motivate the need for probabilistic scheduling and DVFS addressed in this paper, we first discuss the issue of hardware resource constraints in real-time embedded systems, including but not limited to MCS, which are especially pertinent for mass-produced consumer products such as ground vehicles and drones (Unmanned Aerial Vehicles), due to monetary cost as well as Size, Weight, and Power (SWaP) constraints. Automotive Electrical/Electronic (E/E) systems typically have stringent hardware resource constraints. In modern high-end vehicles, there can be up to 100 ECUs (Electronic Control Units) embedded within them, and each model can be sold millions of times. An overall savings of millions of dollars may be achieved by saving a few dollars per ECU. Hence, a designer of E/E systems should choose the cheapest ECU according to their application's needs. The monetary cost pressure on relatively cheap consumer drones is even higher. Next, let us consider the issue of SWaP, which lumps together three factors that are closely correlated due to the same underlying cause of hardware resource constraints. The significance of SWaP is obvious in battery-powered mobile devices like drones and mobile robots, where operating time and physical constraints are limited. However, SWaP considerations are equally applicable to ground vehicles that are equipped with sizable battery systems. Electronics within autonomous vehicles consume substantial power, impacting the range of electric vehicles or the fuel consumption of gasoline vehicles. Size and weight affect consumer acceptance, e.g., an autonomous vehicle with a trunk full of electronics is not likely to be acceptable to the average consumer. The issue of significant hardware resource constraints in MCS has motivated a line of work on processing and memory resource optimization algorithms for MCS [9].

Motivation for probabilistic schedulability analysis. Recently, Akesson et al. [10] investigated 120 industry practitioners in real-time embedded systems, and results indicated that soft or firm real-time constraints are prevalent even in safety–critical application domains. A minority (15%) of the surveyed systems were considered strictly hard real-time (no deadlines to be missed). Thus, designing the timing behavior of a system function to ensure a much lower failure rate did not affect the system's total schedulability.

Industry safety certification standards specify acceptable failure rates depending on the system's criticality levels. For example, in the automotive standard ISO-26262, the permitted failure probability is $10^{-9}$ for ASIL D, $10^{-8}$ for ASIL C and B, and $10^{-7}$ for ASIL A [5]. Relaxing the hard real-time assumption can help reduce pessimism in task WCET estimation and system schedulability analysis and increase schedulable utilization significantly. Von der Brüggen et al. [11] demonstrated large gains in processor utilization with experiments using randomly-generated workloads, e.g., a gain of at least 12% schedulable utilization for an acceptable worst-case deadline failure probability of $10^{-6}$. This motivates probabilistic schedulability analysis as an effective technique for reducing analysis pessimism and increasing processor utilization in resource-constrained embedded systems.

Motivation for not dropping LO tasks in HI mode. Consider the automotive standard ISO-26262, where ASIL determination of hazardous events is based on three parameters: "severity", "probability of exposure" and "controllability". An individual's vulnerability to harm in a potentially hazardous situation is determined by severity. Probability is the likelihood that harm will occur, while controllability is the ability to avoid harm or damage through prompt action by the agents involved (e.g. a driver of the vehicle). It cannot always be assumed that a software function that is part of a high ASIL functionality is more important than one that is part of a lower ASIL functionality, as both may be safety–critical, and each function's failure may cause severe damage [12].

2.2. The classic MCS task model

The MCS taskset $\Gamma$ includes $n$ independent sporadic tasks $\Gamma = \{\tau_i \mid 1 \le i \le n\}$ [13,14]. Although there may be multiple (4–5) criticality levels in general, we present the task model assuming a dual-criticality system with criticality levels LO and HI for the sake of simplicity. The taskset $\Gamma$ includes two subsets: LO tasks $\Gamma_{LO} = \{\tau_i \in \Gamma \mid L_i = LO\}$ and HI tasks $\Gamma_{HI} = \{\tau_i \in \Gamma \mid L_i = HI\}$. Each task $\tau_i \in \Gamma$ is described by $(L_i, T_i, D_i, C_i^{LO}, C_i^{HI})$:

• $L_i \in \{LO, HI\}$ denotes its criticality level.
• $T_i$ denotes its period.
• $D_i$ denotes its relative deadline.
• $C_i^{LO}$ denotes its WCET in LO mode.
• $C_i^{HI}$ denotes its WCET in HI mode for HI tasks ($L_i = HI$), with $C_i^{HI} \ge C_i^{LO}$.

Task execution model of classic MCS. The system is first initialized to be in LO mode. LO tasks $\tau_i \in \Gamma_{LO}$ are monitored at run time and their execution is no more than their $C_i^{LO}$. The system is schedulable in LO mode if all tasks $\tau_i \in \Gamma$ can complete their LO mode WCETs $C_i^{LO}$ within their respective deadlines. If any HI task $\tau_i \in \Gamma_{HI}$ executes beyond its $C_i^{LO}$, the system enters HI mode while all LO tasks in $\Gamma_{LO}$ are abandoned. The system is schedulable in HI mode if all HI tasks $\tau_i \in \Gamma_{HI}$ can complete their HI mode WCETs $C_i^{HI}$ within their respective deadlines. The system switches back to LO mode at an idle instant if no jobs wait for execution at this time [15]. The system is schedulable if both modes are schedulable.

The state-of-the-art scheduling algorithms for the classic MCS task model include Fixed-Priority scheduling [14], and Earliest-Deadline First with Virtual Deadline (EDF-VD) [16] for Dynamic-Priority scheduling on uniprocessor systems. Subsequently, many extensions to the classic MCS task model have been proposed, as discussed next.

2.3. Degraded QoS for LO tasks

The degraded QoS of LO tasks in HI mode is achieved by decreasing execution time budgets [17] or extending the task period [18] for LO tasks. Liu et al. [17] proposed the Imprecise Mixed-Criticality (IMC) task model in which a HI task $\tau_i$ ($L_i = HI$) is assigned a greater estimated WCET compared to its estimation in LO mode ($C_i^{LO} \le C_i^{HI}$), while a LO task $\tau_i$ ($L_i = LO$) is assigned a smaller estimated WCET in HI mode compared to the estimation in LO mode ($C_i^{LO} \ge C_i^{HI}$). They considered EDF-VD scheduling on a single processor system, and presented two schedulability tests, one based on the utilization bound test, and the other based on the Demand Bound Function (DBF). Davis et al. [19] addressed the IMC task model under fixed-priority scheduling, and presented a Compensating AMC Scheduling scheme and two schedulability tests. Jiang et al. [20] presented a concrete implementation of the IMC task model in the form of a configurable processor floating point unit hardware design, as well as schedulability analysis and optimized priority assignment algorithms based on fixed-priority scheduling.

2.4. Energy-aware scheduling for MCS

DVFS dynamically adjusts the processor supply voltage and speed (frequency) based on the system's workload, which is an effective energy-saving technique [21]. Most modern microprocessors, including those used in embedded systems, provide support for DVFS. Our recent survey paper [6] provided an overview of recent developments in energy-aware real-time scheduling for MCS, predominantly focusing on DVFS.

Recently, power and energy-aware real-time scheduling for MCS has attracted significant attention [6]. Huang et al. [22] proposed a scheduling algorithm for MCS based on EDF-VD [16]. This scheduling algorithm reduces energy consumption by optimizing virtual deadlines
|
||||
|
||||
|
||||
Y.-W. Zhang and J.-L. Zhang Journal of Systems Architecture 160 (2025) 103361
|
||||
|
||||
|
||||
and processor speeds. Zhang [23] used the dynamic slack time generated from late arrival tasks to reduce energy consumption. This work is extended to MCS with fixed-priority preemptive scheduling [24] and dynamic priority non-preemptive scheduling [25]. Zhang et al. [26] tackled the issue of MCS with shared resources and proposed a dual-speed scheduling algorithm. This algorithm ensured both the system schedulability and mutually exclusive access to shared resources. However, it assumed that all tasks execute with their WCET. Zhang [27] used the difference between the actual execution time and WCET to save energy. These works focus on the classic MCS task model. Zhang [28] focused on the IMC task model in which LO tasks allow QoS in HI mode and proposed an energy-aware scheduling algorithm (EA-IMC).
|
||||
Table 1
Related work on probabilistic scheduling for MCS. Abbreviations: Prob. (Probabilistic); S.A. (Schedulability Analysis).

Work                               Sched. Algo.  Prob. S.A.  Energy-Aware  LO tasks dropped in HI mode
Santinelli and George (2015) [33]  EDF           Y           N             Y
Maxim et al. (2017) [34]           FP            Y           N             Y
Singh et al. (2020) [35]           NPFP          Y           N             Y
Draskovic et al. (2021) [36]       FP            Y           N             N
Guo et al. (2021) [37]             EDF           Y           N             Y
Bhuiyan et al. (2020) [7]          NPFP          N           Y             Y
This work                          EDF           Y           Y             N
|
||||
There has been a small number of recent works on energy-aware
|
||||
MCS on multiprocessors. Narayana et al. [29] considered the energy probability that its WCET is equal to 𝑒𝑡.1 Given the PMF 𝑓𝑖 (⋅), we
|
||||
minimization problem for multiprocessor MCS based on DVFS. They can easily obtain the corresponding Cumulative Distribution Function
|
||||
∑
|
||||
first proposed an optimal solution and an effective lightweight heuristic (CDF) 𝐹𝑖 (⋅), where 𝐹𝑖 (𝑒𝑡) = 𝑃 (𝑖 ≤ 𝑒𝑡) = 𝑥≤𝑒𝑡 𝑓𝑖 (𝑥). The Complemen-
|
||||
on a uniprocessor, then extended these results to multicore systems. tary Cumulative Distribution Function (1-CDF) is defined as 𝐹̄𝑖 (𝑒𝑡) =
|
||||
Ranjbar et al. [30] proposed a heuristic algorithm for online peak 𝑃 (𝑖 > 𝑒𝑡) = 1 − 𝐹𝑖 (𝑒𝑡).
|
||||
power and thermal management of a multicore MCS by using the slack We consider the MCS taskset 𝛤 including 𝑛 independent periodic
|
||||
time and per-cluster DVFS. Recently, some researchers [31] studied the tasks 𝛤 = {𝜏𝑖 |1 ≤ 𝑖 ≤ 𝑛} scheduled with preemptive EDF on
|
||||
IMC task model on multiprocessors in which LO tasks allow QoS in HI a single processor platform. (It is a special case of EDF-VD with a
|
||||
mode and proposed the partitioned scheduling algorithm. In addition, deadline scaling factor 𝑥 = 1.) We assume a dual-criticality system with
|
||||
this work is extended to shared resource scheduling [32]. However, the criticality levels LO and HI for the sake of simplicity. The taskset 𝛤
|
||||
above studies assume that tasks execute with their deterministic WCET. consists of two subsets: LO tasks 𝛤𝐿𝑂 = {𝜏𝑖 ∈ 𝛤 |𝐿𝑖 = 𝐿𝑂} and HI tasks
|
||||
𝛤𝐻 𝐼 = {𝜏𝑖 ∈ 𝛤 |𝐿𝑖 = 𝐻 𝐼}. Each task 𝜏𝑖 ∈ 𝛤 is described by a tuple of
|
||||
2.5. Probabilistic scheduling for MCS parameters ⟨𝐿𝑖 , 𝑇𝑖 , 𝐷𝑖 , 𝑖 , 𝑖𝐿𝑂 , 𝑖𝐻 𝐼 , 𝐶𝑖𝑑 𝑒𝑔 ∨ 𝐶𝑖𝑡ℎ𝑟 ⟩:
|
||||
|
||||
• 𝐿𝑖 ∈ {𝐿𝑂, 𝐻 𝐼} denotes its criticality level.
|
||||
Santinelli and George [33] presented an initial solution to proba-
|
||||
bilistic schedulability analysis for EDF scheduling of MCS based on the • 𝑇𝑖 denotes its period.
|
||||
concept of probabilistic C-Space. Maxim et al. [34] presented a prob- • 𝐷𝑖 denotes its constrained deadline (𝐷𝑖 ≤ 𝑇𝑖 ).
|
||||
abilistic fixed-priority schedulability analysis [14]. Singh et al. [35] • 𝑖 is its nominal pWCET, a discrete random variable with 𝐾
|
||||
considered a novel MCS task model with job-level mode switching, discrete values characterized by PMF 𝑓𝑖 (⋅) and CDF 𝐹𝑖 (⋅). It has
|
||||
and presented a graph-traversal-based analytic framework for non- the minimum value 𝐶𝑖𝑚𝑖𝑛 with index 𝑖𝑛𝑑(𝐶𝑖𝑚𝑖𝑛 ) = 0 and maximum
|
||||
preemptive job-level fixed-priority probabilistic schedulability analysis. value 𝐶𝑖𝑚𝑎𝑥 with index 𝑖𝑛𝑑(𝐶𝑖𝑚𝑎𝑥 ) = 𝐾 − 1 among the 𝐾 discrete
|
||||
Draskovic et al. [36] proposed metrics that are inspired by industry values of 𝑖 .
|
||||
safety standards, including the probability of deadline miss per hour, • 𝑖𝐿𝑂 is its pWCET in LO mode, characterized by PMF 𝑓 𝐿𝑂 (⋅) and
|
||||
𝑖
|
||||
the expected time before degradation happens, and the duration of the CDF 𝐹 𝐿𝑂 (⋅).
|
||||
𝑖
|
||||
degradation, and presented a system-wide approach to probabilistic • 𝑖𝐻 𝐼 is its pWCET in HI mode, characterized by PMF 𝑓 𝐻 𝐼 (⋅) and
|
||||
𝑖
|
||||
scheduling of MCS. Guo et al. [37] proposed a new task model in CDF 𝐹 𝐻 𝐼 (⋅).
|
||||
𝑖
|
||||
which a new parameter is added to characterize the distribution of the
|
||||
• 𝐶𝑖𝑑 𝑒𝑔 is valid for LO tasks (𝐿𝑖 = 𝐿𝑂), and denotes its Degraded
|
||||
WCET estimations for each task. They presented efficient algorithms for
|
||||
WCET in HI mode 𝐶𝑖𝑑 𝑒𝑔 with index 𝑖𝑛𝑑(𝐶𝑖𝑑 𝑒𝑔 ) ∈ [0, 𝐾 − 1].
|
||||
MCS scheduling under this task model for both independent tasks and
|
||||
failure-dependent tasks. • 𝐶𝑖𝑡ℎ𝑟 is valid for HI tasks (𝐿𝑖 = 𝐻 𝐼), and denotes its Threshold
|
||||
We are aware of only one related work that addressed energy- WCET in LO mode 𝐶𝑖𝑡ℎ𝑟 with index 𝑖𝑛𝑑(𝐶𝑖𝑡ℎ𝑟 ) ∈ [0, 𝐾 − 1].
|
||||
aware scheduling in MCS assuming probabilistic task execution times. Task execution model. The system is first initialized to be in LO
|
||||
Bhuiyan et al. [7] proposed a probabilistic technique to derive an mode. If any HI task 𝜏𝑖 ∈ 𝛤𝐻 𝐼 executes beyond its 𝐶𝑖𝑡ℎ𝑟 , the system
|
||||
energy-efficient processor speed that minimized the average energy switches from LO mode to HI mode. At the mode switch instant 𝑡𝑠 , if
|
||||
consumption with DVFS, while ensuring deadlines of all tasks in MCS. jobs of LO tasks have run for longer than their 𝐶𝑖𝑑 𝑒𝑔 , any such jobs will
|
||||
This work used non-preemptive fixed-priority scheduling and determin- be dropped, without suppressing future arrivals thereof. In addition, if a
|
||||
istic schedulability test based on Worst-Case Response Time analysis, LO job has executed for less than 𝐶𝑖𝑑 𝑒𝑔 by the switch time instant, these
|
||||
instead of probabilistic schedulability analysis. It is not directly com- carry-over jobs that have an arrival time before 𝑡𝑠 and have absolute
|
||||
parable to our work due to the different task models and analysis deadlines after 𝑡𝑠 will continue to execute the leftover execution up to
|
||||
techniques. 𝐶𝑖𝑑 𝑒𝑔 . While in HI mode, each LO task 𝜏𝑖 ∈ 𝛤𝐿𝑂 executes no more than
|
||||
Table 1 summarizes related work on probabilistic scheduling for its 𝐶𝑖𝑑 𝑒𝑔 , i.e., it is dropped if its execution time exceeds 𝐶𝑖𝑑 𝑒𝑔 . The system
|
||||
MCS. switches from HI mode to LO mode at an idle instant if no jobs wait
|
||||
for executions at this time. Moreover, incomplete tasks are dropped at
|
||||
3. Preliminaries their deadlines, hence there does not exist a backlog of outstanding
|
||||
execution at the end of each hyper-period (this is a common assumption
|
||||
3.1. Task model in industry practice [10]).
|
||||
The pWCET of a LO task in LO mode, or the pWCET of a HI task
|
||||
Our task model is inspired by the IMC task model [17], with in HI mode, is the same as its nominal pWCET 𝑖 . The pWCET of a HI
|
||||
extensions to the probabilistic scheduling scenario. We first introduce
|
||||
some basic notations for probabilistic scheduling. A task 𝜏𝑖 ’s probabilistic
|
||||
WCET (pWCET) 𝑖 is a random variable characterized by a Probability 1
|
||||
Calligraphic letters are used to represent distributions while non
|
||||
Mass Function (PMF) 𝑓𝑖 (⋅), where 𝑓𝑖 (𝑒𝑡) = 𝑃 (𝑖 = 𝑒𝑡) denotes the calligraphic letters are for scalars.
|
||||
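The relationship between the PMF, the CDF and the 1-CDF introduced in the task model can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `cdf_from_pmf` and `exceedance` are hypothetical, and the two-point distribution (values 1 and 2, each with probability 0.5) is the toy distribution used in Example 1.

```python
from itertools import accumulate


def cdf_from_pmf(values, probs):
    """Return the sorted values and the running sum of their probabilities.

    Hypothetical helper: the result corresponds to the first and third
    rows of the matrix form used for a pWCET.
    """
    pairs = sorted(zip(values, probs))
    xs = [v for v, _ in pairs]
    cdf = list(accumulate(p for _, p in pairs))
    return xs, cdf


def exceedance(values, probs, et):
    """1-CDF: probability that the execution time is strictly above et."""
    return sum(p for v, p in zip(values, probs) if v > et)


# Toy distribution from Example 1: values 1 and 2, each with probability 0.5.
xs, cdf = cdf_from_pmf([1, 2], [0.5, 0.5])
tail_above_1 = exceedance([1, 2], [0.5, 0.5], 1)
```

Keeping the values sorted makes the CDF a simple running sum, which mirrors the row layout of the matrix form.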
|
||||
|
||||
|
||||
|
||||
task 𝜏𝑖 in LO mode is trimmed with the upper bound 𝐶𝑖𝑡ℎ𝑟 to have the Table 2
|
||||
Taskset parameters of 𝛤1 , with 𝐶1𝑑 𝑒𝑔 = 1, 𝐶2𝑡ℎ𝑟 = 1.
|
||||
conditional PMF 𝑓 𝐿𝑂 (𝑒𝑡) = 𝑃 (𝑖 = 𝑒𝑡 ∣ 𝑒𝑡 ≤ 𝐶𝑖𝑡ℎ𝑟 ). The pWCET of a LO
|
||||
𝑖
|
||||
Task 𝐿𝑖 𝑇𝑖 = 𝐷𝑖 𝑖 𝑖𝐿𝑂 𝑖𝐻 𝐼 𝑖𝐿𝑂 𝑖𝐻 𝐼
|
||||
task 𝜏𝑖 in HI mode is trimmed with the upper bound 𝐶𝑖𝑑 𝑒𝑔 to have the
|
||||
conditional PMF 𝑓 𝐻 𝐼 (𝑒𝑡) = 𝑃 (𝑖 = 𝑒𝑡 ∣ 𝑒𝑡 ≤ 𝐶𝑖𝑑 𝑒𝑔 ). In other words, 𝐶𝑖𝑑 𝑒𝑔 ⎛1 2⎞ ⎛1 2⎞ ⎛1⎞ ⎛0.5 1.0⎞ ⎛0.5⎞
|
||||
𝑖 𝜏1 LO 2 ⎜0.5 0.5⎟ ⎜0.5 0.5⎟ ⎜1.0⎟ ⎜0.5 0.5⎟ ⎜1.0⎟
|
||||
is LO task 𝜏𝑖 ’s execution time budget in HI mode, and 𝐶𝑖𝑡ℎ𝑟 is HI task ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟
|
||||
⎝0.5 1.0⎠ ⎝0.5 1.0⎠ ⎝1.0⎠ ⎝0.5 1.0⎠ ⎝1.0⎠
|
||||
𝜏𝑖 ’s execution time budget in LO mode. This is inspired by the IMC task ⎛1 2⎞ ⎛1⎞ ⎛1 2⎞ ⎛0.5⎞ ⎛0.5 1.0⎞
|
||||
𝜏2 HI 2 ⎜0.5 0.5⎟ ⎜1.0⎟ ⎜0.5 0.5⎟ ⎜1.0⎟ ⎜0.5 0.5⎟
|
||||
model [17,19,20]. They are computed with Eqs. (1) and (2): ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟
|
||||
⎝0.5 1.0⎠ ⎝1.0⎠ ⎝0.5 1.0⎠ ⎝1.0⎠ ⎝0.5 1.0⎠
|
||||
$$\forall \tau_i \in \Gamma_{LO}: \quad f_i^{LO}(et) = f_i(et), \qquad
f_i^{HI}(et) =
\begin{cases}
\sum_{et' \ge C_i^{deg}} f_i^{LO}(et'), & et = C_i^{deg} \\
f_i^{LO}(et), & et < C_i^{deg} \\
0, & et > C_i^{deg}
\end{cases}
\tag{1}$$
|
||||
$$\forall \tau_i \in \Gamma_{HI}: \quad f_i^{HI}(et) = f_i(et), \qquad
f_i^{LO}(et) =
\begin{cases}
\sum_{et' \ge C_i^{thr}} f_i^{HI}(et'), & et = C_i^{thr} \\
f_i^{HI}(et), & et < C_i^{thr} \\
0, & et > C_i^{thr}
\end{cases}
\tag{2}$$
|
||||
• [[𝐴]]0 stands for max(𝐴, 0).
|
||||
• 𝑡𝑠 stands for the mode-switch time.
|
||||
• $m_i = \lfloor (t - D_i)/T_i \rfloor$ and $k_i = \lfloor t_s/T_i \rfloor$ are the number of jobs for 𝜏𝑖 in the interval [0, 𝑡) and [0, 𝑡𝑠 ), respectively.
|
||||
• 𝐷𝐵𝐹𝐿 (𝜏𝑖 , 𝑡) stands for the processor demand of any task 𝜏𝑖 ∈ 𝛤 within [0, 𝑡) in LO mode.
|
||||
• 𝐷𝐵𝐹 (𝐽𝐿 , 𝑡) and 𝐷𝐵𝐹 (𝐽𝐻 , 𝑡) stand for the processor demand of a carry-over job released by task 𝜏𝑖 ∈ 𝛤𝐿𝑂 and 𝜏𝑖 ∈ 𝛤𝐻𝐼 within [0, 𝑡), respectively.
|
||||
• 𝑟𝑖 stands for the arrival time of the carry-over job that arrives before 𝑡𝑠 and has a deadline after 𝑡𝑠 .
|
||||
Since task 𝜏𝑖 ’s period 𝑇𝑖 is a constant in both LO and HI modes, its • 𝐷𝐵 𝐹𝐿𝐻 (𝜏𝑖 , 𝑡) stands for the processor demand of a LO task 𝜏𝑖 within
|
||||
probabilistic Worst-Case Utilization (pWCU) can be obtained by dividing 𝐻 (𝜏 , 𝑡) stands for the processor
|
||||
[0, 𝑡) in HI mode, while 𝐷𝐵 𝐹𝐻 𝑖
|
||||
its pWCET by its period: 𝑖 = 𝑖 ∕𝑇𝑖 , 𝑖𝐿𝑂 = 𝑖𝐿𝑂 ∕𝑇𝑖 in LO mode, and
|
||||
demand of a HI task 𝜏𝑖 within [0, 𝑡) in HI mode.
|
||||
𝑖𝐻 𝐼 = 𝑖𝐻 𝐼 ∕𝑇𝑖 in HI mode. The pWCU of a taskset can be obtained by
|
||||
summing the pWCUs of all tasks in the taskset. Fig. 1 illustrates a carry-over job and the mode switch. The down-
|
||||
ward arrow represents the job arrival time. If the execution time of 𝜏𝑖
|
||||
Example 1. A taskset 𝛤1 with two tasks is shown in Table 2. Each task exceeds 𝐶𝑖𝐿𝑂 without signaling completion, the system switches from
|
||||
𝜏𝑖 ’s nominal pWCET 𝑖 is shown in matrix form defined in Eq. (3). For LO mode to HI mode. 𝐽𝐻 is a carry-over job.
|
||||
the matrix form, the first row denotes each discrete value of 𝑖 ; the
|
||||
According to the Task Execution model, the processor demand
|
||||
second row denotes probability values of the PMF 𝑓𝑖 (⋅); and the third
|
||||
of LO carry-over jobs is always less than or equal to 𝐶𝑖𝐿𝑂 , while the
|
||||
row denotes cumulative probability values of the CDF 𝐹𝑖 (⋅).
|
||||
processor demand of HI carry-over jobs is always less than or equal to
|
||||
⎛ 𝐶0 𝐶1 … 𝐶𝐾−1 ⎞
|
||||
𝐶𝑖𝐻 𝐼 . Therefore, 𝐷𝐵 𝐹 (𝐽𝐿 , 𝑡) can be calculated as follows:
|
||||
⎜ 𝑓 (𝐶0 ) 𝑓 (𝐶1 ) … 𝑓 (𝐶𝐾−1 ) ⎟ (3) {
|
||||
⎜ 𝑖 𝑖 𝑖 ⎟ 𝐶𝑖𝐿𝑂 , 𝑟𝑖 + 𝐷𝑖 ≤ 𝑡
|
||||
⎝𝐹𝑖 (𝐶0 ) 𝐹𝑖 (𝐶1 ) … 𝐹𝑖 (𝐶𝐾−1 )⎠ 𝐷𝐵 𝐹 (𝐽𝐿 , 𝑡) = (5)
|
||||
0, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒.
|
||||
The PMF of 𝜏𝑖 ’s pWCET in LO mode 𝑖𝐿𝑂 is obtained by Eq. (2); the
|
||||
PMF of its pWCET in HI mode 𝑖𝐻 𝐼 is obtained by Eq. (1). For the toy and 𝐷𝐵 𝐹 (𝐽𝐻 , 𝑡) can be calculated as follows:
|
||||
{
|
||||
example, the LO task 𝜏1 ’s nominal pWCET 1 has two possible values 𝐶𝑖𝐻 𝐼 , 𝑟𝑖 + 𝐷𝑖 ≤ 𝑡
|
||||
1 and 2, each with probability 0.5; its pWCET in LO mode 1𝐿𝑂 is the 𝐷𝐵 𝐹 (𝐽𝐻 , 𝑡) = (6)
|
||||
0, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒.
|
||||
same as 1 ; its pWCET in HI mode 1𝐻 𝐼 is obtained by trimming 1 with
|
||||
the upper bound 𝐶1𝑑 𝑒𝑔 = 1 and 𝑖𝑛𝑑(𝐶1𝑑 𝑒𝑔 ) = 0 (assuming the index starts
|
||||
from 0), with one possible value of 1 with a probability 1.0. The HI From [3,17], we have the following Theorems.
|
||||
task 𝜏2 ’s nominal pWCET 2 has two possible values 1 and 2, each with
|
||||
probability 0.5; its pWCET in LO mode 2𝐿𝑂 is obtained by trimming Theorem 1. A deterministic IMC taskset 𝛤 is schedulable under EDF in
|
||||
2 with the upper bound 𝐶2𝑡ℎ𝑟 = 1 and 𝑖𝑛𝑑(𝐶2𝑡ℎ𝑟 ) = 0, with one possible LO mode, if 0 < ∀𝑡 ≤ 𝑡𝑚𝑎𝑥 ,
|
||||
value of 1 with a probability 1.0; its pWCET in HI mode 2𝐻 𝐼 is the ∑
|
||||
𝐷𝐵 𝐹𝐿 (𝜏𝑖 , 𝑡) ≤ 𝑡, (7)
|
||||
same as 2 . The matrix that denotes 𝜏𝑖 ’s pWCU is obtained by dividing 𝜏𝑖 ∈𝛤
|
||||
each term in the first row of its pWCET matrix by its period 𝑇𝑖 .
|
||||
where 𝐷𝐵 𝐹𝐿 (𝜏𝑖 , 𝑡) = [[𝑚𝑖 + 1]]0 ⋅ 𝐶𝑖𝐿𝑂 , and 𝑡𝑚𝑎𝑥 is a hyper-period.
|
||||
Eq. (4) shows the definitions of pWCU for the subset of LO tasks
|
||||
𝛤𝐿𝑂 in LO mode. (As mathematical background, the addition of two
|
||||
discrete random variables and results in a new random variable
|
||||
Theorem 2. A deterministic IMC taskset 𝛤 is schedulable under EDF in
|
||||
with PMF computed by the convolution of the two PMFs and ,
|
||||
⨂ ∑ HI mode, if 0 < ∀𝑡 ≤ 𝑡𝑚𝑎𝑥 , 0 < 𝑡𝑠 < 𝑡,
|
||||
i.e., = , where 𝑃 ( = 𝑧) = ∞ 𝑘=−∞ 𝑃 ( = 𝑘)𝑃 ( = 𝑧 − 𝑘). ∑ ∑
|
||||
⨂ ⨂ 𝐷𝐵 𝐹𝐿𝐻 (𝜏𝑖 , 𝑡𝑠 , 𝑡) + 𝐷 𝐵 𝐹𝐻 𝐻
|
||||
(𝜏𝑗 , 𝑡𝑠 , 𝑡) ≤ 𝑡, (8)
|
||||
𝐿𝑂 𝐿𝑂 𝐻𝐼
|
||||
𝐿𝑂 (𝛤 ) = 𝑖 , 𝐻 𝐼 (𝛤 ) = 𝑖𝐻 𝐼 , (4) 𝜏𝑖 ∈𝛤𝐿𝑂 𝜏𝑗 ∈𝛤𝐻 𝐼
|
||||
𝜏𝑖 ∈𝛤𝐿𝑂 𝜏𝑖 ∈𝛤𝐻 𝐼
|
||||
|
||||
𝐿𝑂 (𝛤 ) denotes pWCU of 𝛤
|
||||
where 𝐿𝑂 𝐻𝐼 where 𝐷𝐵 𝐹𝐿𝐻 (𝜏𝑖 , 𝑡𝑠 , 𝑡) = 𝑘𝑖 𝐶𝑖𝐿𝑂 + 𝐷𝐵 𝐹 (𝐽𝐿 , 𝑡) + 𝑐𝑖 𝐶𝑖𝐻 𝐼 , and 𝐷𝐵 𝐹𝐻
|
||||
𝐻 (𝜏 , 𝑡 , 𝑡)
|
||||
𝑖 𝑠
|
||||
𝐿𝑂 in LO mode; 𝐻 𝐼 (𝛤 ) denotes
|
||||
can be determined as follows:
|
||||
pWCU of 𝛤𝐻 𝐼 in HI mode. {
|
||||
𝐻 𝐷𝐵 𝐹 (1), 𝐷𝑖 ≤ 𝑡 − 𝑡𝑠 ;
|
||||
𝐷 𝐵 𝐹𝐻 (𝜏𝑖 , 𝑡𝑠 , 𝑡) = (9)
|
||||
3.2. Existing deterministic IMC scheduling max{𝐷𝐵 𝐹 (1), 𝐷𝐵 𝐹 (2)}, 𝑂𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒,
|
||||
|
||||
Liu et al. [17] have studied the schedulability test for deterministic where 𝐷𝐵 𝐹 (1) = 𝑏𝑖 𝐶𝑖𝐿𝑂 + 𝐷𝐵 𝐹 (𝐽𝐻 , 𝑡) + 𝑎𝑖 𝐶𝑖𝐻 𝐼 , 𝐷𝐵 𝐹 (2) = 𝑘𝑖 𝐶𝑖𝐿𝑂 +
|
||||
𝑡 −(𝑡−𝐷𝑖 −𝑚𝑖 𝑇𝑖 )
|
||||
IMC task model and proposed the sufficient conditions of the schedu- 𝐷𝐵 𝐹 (𝐽𝐻 , 𝑡), 𝑎𝑖 = [[𝑚𝑖 − 𝑏𝑖 ]]0 , 𝑏𝑖 = [[⌊ 𝑠 𝑇
|
||||
⌋]]0 , and 𝑐𝑖 = [[𝑚𝑖 − 𝑘𝑖 ]]0 .
|
||||
𝑖
|
||||
lability under EDF-VD. We first introduce the following notations.
|
||||
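The trimming of a nominal pWCET described by Eqs. (1) and (2), where all probability mass at or above the bound is collapsed onto the bound, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: `trim_pmf` and `utilization_values` are hypothetical names, the values are assumed to be listed in ascending order, and the input is the LO-mode pWCET of task τ1 from Table 3 with a degraded WCET of 3.

```python
def trim_pmf(values, probs, bound):
    """Collapse all probability mass at or above `bound` onto `bound`.

    Assumes `values` is ascending and `bound` is one of the discrete values
    (in the text the bound is the degraded WCET of a LO task or the
    threshold WCET of a HI task).
    """
    if bound not in values:
        raise ValueError("bound must be one of the discrete pWCET values")
    kept = [(v, p) for v, p in zip(values, probs) if v < bound]
    tail = sum(p for v, p in zip(values, probs) if v >= bound)
    kept.append((bound, tail))
    return [v for v, _ in kept], [p for _, p in kept]


def utilization_values(values, period):
    """pWCU values: each execution-time value divided by the period."""
    return [v / period for v in values]


# Task tau_1 of Table 3: nominal pWCET trimmed at a degraded WCET of 3.
hi_values, hi_probs = trim_pmf([1, 3, 4, 5], [0.455, 0.54, 0.004, 0.001], 3)
hi_util = utilization_values(hi_values, 10)
```

With these inputs the trimmed distribution has the two values 1 and 3 with probabilities 0.455 and roughly 0.545 (up to floating-point rounding), which agrees with the HI-mode column of Table 3.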
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 1. Carry-over job.
|
||||
|
||||
|
||||
4. Probabilistic IMC scheduling
|
||||
According to [3,17], we should consider two cases to determine the
|
||||
4.1. Schedulability analysis probabilistic processor demand of any task 𝜏𝑖 ∈ 𝛤𝐻 𝐼 within [0, 𝑡) in HI
|
||||
mode.
|
||||
Before presenting the schedulability analysis, let us introduce a few Case 1: 𝐷𝑖 ≤ 𝑡 − 𝑡𝑠 . The maximum demand of a job released by the
|
||||
notations. HI task 𝜏𝑖 is generated while its deadline coincides with 𝑡. According
|
||||
to Eq. (9) in Theorem 2, the probabilistic processor demand of any
|
||||
• max{} stands for the maximum value of random variable .
|
||||
task 𝜏𝑖 ∈ 𝛤𝐻 𝐼 within [0, 𝑡) in HI mode is equal to (1) = ((𝑏𝑖 ) ⊙
|
||||
⎛𝑥⎞ ⨂ ⨂
|
||||
𝑖𝐿𝑂 ) (𝐽𝐻 , 𝑡) ((𝑎𝑖 ) ⊙ 𝑖𝐻 𝐼 ).
|
||||
• (𝑥) = ⎜1⎟, where 𝑥 is a constant.
|
||||
⎜ ⎟ Case 2: 𝐷𝑖 > 𝑡 − 𝑡𝑠 . The HI task 𝜏𝑖 has at most one job with a
|
||||
⎝1⎠
|
||||
processor demand 𝐶𝑖𝐻 𝐼 . If the deadline of this job is 𝐷𝑖 , the probabilistic
|
||||
• 𝐿 (𝜏𝑖 , 𝑡) stands for the probabilistic processor demand of any
|
||||
processor demand is the same as (1). Moreover, the only way to
|
||||
task 𝜏𝑖 within [0, 𝑡) in LO mode.
|
||||
increase the demand of the HI task 𝜏𝑖 is to add a new job in the interval.
|
||||
• (𝐽𝐿 , 𝑡) and (𝐽𝐻 , 𝑡) stand for the probabilistic processor
|
||||
In other words, the first job of the HI task 𝜏𝑖 arrives at time 0. Therefore,
|
||||
demand of a carry-over job released by the task 𝜏𝑖 ∈ 𝛤𝐿𝑂 and
|
||||
the processor demand includes two parts: one part is the demand of
|
||||
𝜏𝑖 ∈ 𝛤𝐻𝐼 within [0, 𝑡), respectively.
|
||||
all jobs before 𝑡𝑠 , and the other part is the demand of a carry-over
|
||||
• 𝐻 𝐿 (𝜏𝑖 , 𝑡) stands for the probabilistic processor demand of a LO job 𝐽𝐻 . In this case, the probabilistic processor demand is equal to
|
||||
task 𝜏𝑖 within [0, 𝑡) in HI mode, while 𝐻 ⨂
|
||||
𝐻 (𝜏𝑖 , 𝑡) stands for the (2) = ((𝑘𝑖 ) ⊙ 𝑖𝐿𝑂 ) (𝐽𝐻 , 𝑡).
|
||||
probabilistic processor demand of a HI task 𝜏𝑖 within [0, 𝑡) in HI In short, the probabilistic processor demand of any task 𝜏𝑖 ∈ 𝛤𝐻 𝐼
|
||||
mode. within [0, 𝑡) and 𝐷𝑖 ≤ 𝑡 − 𝑡𝑠 in HI mode can be determined as follows:
|
||||
• 𝐿 (𝑡) stands for the probabilistic processor demand of all tasks {
|
||||
within [0, 𝑡) in LO mode. (1), 𝐷𝑖 ≤ 𝑡 − 𝑡𝑠 ;
|
||||
𝐻 (𝜏
|
||||
𝐻 𝑖 , 𝑡) = (15)
|
||||
• 𝐻 (𝑡) stands for the probabilistic processor demand of all tasks , 𝑂𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒,
|
||||
within [0, 𝑡) in HI mode. where can be determined as follows:
|
||||
𝑡𝑚𝑎𝑥
|
||||
• 𝛱𝑡=1 𝑡 = 1 × 2 × ⋯ × 𝑡𝑚𝑎𝑥 . {
|
||||
(1), max{ (2)} ≤ max{ (1)};
|
||||
= (16)
|
||||
According to [3,17,33], the probabilistic processor demand of any (2), 𝑂𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒.
|
||||
task 𝜏𝑖 ∈ 𝛤 within [0, 𝑡) in LO mode can be calculated as follows:
|
||||
𝐿 (𝜏𝑖 , 𝑡) = ([[𝑚𝑖 + 1]]0 ) ⊙ 𝑖𝐿𝑂 , (10) Therefore, the probabilistic processor demand of all tasks within
|
||||
[0, 𝑡) in HI mode is determined by the following:
|
||||
where ⊙ denotes the Hadamard product, where each element in the 𝑖th ⨂ ⨂ ⨂
|
||||
𝐻 (𝑡) = ( 𝐻𝐿 (𝜏𝑖 , 𝑡)) ( 𝐻
|
||||
𝐻 (𝜏𝑖 , 𝑡)). (17)
|
||||
row of the right matrix is multiplied by the element on the 𝑖th row of
|
||||
𝜏𝑖 ∈𝛤𝐿𝑂 𝜏𝑖 ∈𝛤𝐻 𝐼
|
||||
the left vector.
|
||||
In addition, the probabilistic processor demand of all tasks within
|
||||
[0, 𝑡) in LO mode can be calculated as follows: Theorem 3. An IMC taskset 𝛤 is deterministically schedulable under EDF,
|
||||
⨂
|
||||
𝐿 (𝑡) = 𝐿 (𝜏𝑖 , 𝑡). (11) if 0 < ∀𝑡 ≤ 𝑡𝑚𝑎𝑥 , 0 < 𝑡𝑠 < 𝑡,
|
||||
𝜏𝑖 ∈𝛤
|
||||
max{ 𝐿 (𝑡)} ≤ 𝑡, 𝑎𝑛𝑑 max{ 𝐻 (𝑡)} ≤ 𝑡, (18)
|
||||
The probabilistic processor demand of a carry-over job released by It is probabilistically schedulable if the maximum probability that the pro-
|
||||
LO task 𝜏𝑖 within [0, 𝑡) can be calculated as follows: cessor demand of all tasks in both LO mode and HI mode exceeds 𝑡 does
|
||||
{
|
||||
𝑖𝐿𝑂 , 𝑟𝑖 + 𝐷𝑖 ≤ 𝑡 not exceed the permitted system failure probability 𝐹𝑠 ,2 expressed as:
|
||||
(𝐽𝐿 , 𝑡) = (12) 𝑡
|
||||
(0), 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒. 1 − 𝛱𝑡 𝑚𝑎𝑥
|
||||
𝑘=𝑡 𝐹 𝐿 (𝑡𝑘 ) (𝑡𝑘 ) ≤ 𝐹𝑠 , 𝑎𝑛𝑑 (19)
|
||||
𝑡
|
||||
1 − 𝛱𝑡 𝑚𝑎𝑥 𝐹 (𝑡 ) ≤ 𝐹𝑠 .
|
||||
The probabilistic processor demand of a carry-over job released by 𝑘 =𝑡 𝐻 (𝑡𝑘 ) 𝑘
|
||||
HI task 𝜏𝑖 within [0, 𝑡) can be calculated as follows:
|
||||
{
|
||||
𝑖𝐻 𝐼 , 𝑟𝑖 + 𝐷𝑖 ≤ 𝑡
|
||||
(𝐽𝐻 , 𝑡) = (13)
|
||||
(0), 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒. 2
|
||||
Chen et al. [38] pointed out that there are certain flaws in the probabilis-
|
||||
tic WCRT based on critical instant instances. However, our work focuses on the
|
||||
The probabilistic processor demand of any task 𝜏𝑖 ∈ 𝛤𝐿𝑂 within [0, 𝑡)
|
||||
overall distribution of all task behaviors within a task’s hyper-period, rather
|
||||
in HI mode can be calculated as follows: than relying solely on a single critical instant and considers the probability
|
||||
⨂ ⨂
|
||||
𝐻 𝐿𝑂
|
||||
𝐿 (𝜏𝑖 , 𝑡) = ((𝑘𝑖 ) ⊙ 𝑖 ) (𝐽𝐿 , 𝑡) ((𝑐𝑖 ) ⊙ 𝑖𝐻 𝐼 ). (14) distribution of all possible processor demand throughout the hyper-period.
|
||||
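The probabilistic demand computations of this section combine two operations: scaling the value row of a pWCET by a job count (the ⊙ operation) and convolving the distributions of independent tasks (the ⨂ operation). The sketch below illustrates both under stated assumptions: distributions are stored as plain `dict` objects mapping a value to its probability, the function names are hypothetical, and the exceedance check is evaluated at a single interval length only, which is a simplification of the product over all interval lengths in Eq. (19). The input data are the LO-mode pWCETs of Table 3, evaluated at 𝑡 = 20 as in Example 2.

```python
from collections import defaultdict


def scale(pmf, count):
    """Multiply every execution-time value by `count` (value-row scaling)."""
    out = defaultdict(float)
    for value, prob in pmf.items():
        out[value * count] += prob
    return dict(out)


def convolve(a, b):
    """PMF of the sum of two independent discrete random variables."""
    out = defaultdict(float)
    for va, pa in a.items():
        for vb, pb in b.items():
            out[va + vb] += pa * pb
    return dict(out)


def exceedance(pmf, t):
    """Probability that the demand is strictly larger than t."""
    return sum(p for v, p in pmf.items() if v > t)


# LO-mode pWCETs from Table 3 (value -> probability).
c1_lo = {1: 0.455, 3: 0.54, 4: 0.004, 5: 0.001}
c2_lo = {0.5: 0.49, 1: 0.51}
c3_lo = {2: 0.019, 3: 0.6, 4: 0.38, 5: 0.001}

# At t = 20, tau_1 and tau_3 contribute two jobs each and tau_2 one job.
demand_20 = convolve(convolve(scale(c1_lo, 2), c2_lo), scale(c3_lo, 2))
p_exceed = exceedance(demand_20, 20)
```

The smallest demand value is 6.5 and the largest is 21, matching the matrix reported in Example 2, and the exceedance probability is approximately 1.0 × 10⁻⁶. Because the result is a floating-point sum, a comparison against a permitted failure probability of exactly 1.0 × 10⁻⁶ should use a tolerance or exact rational arithmetic.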
|
||||
|
||||
|
||||
Table 3 𝐿 (𝜏2 , 𝑡) = (0), and 𝐿 (𝜏3 , 𝑡) = 3𝐿𝑂 . In addition, we have
|
||||
Taskset parameters of 𝛤2 , with 𝐶1𝑑 𝑒𝑔 = 3, 𝐶2𝑡ℎ𝑟 = 1, 𝐶3𝑑 𝑒𝑔 = 3. ⎛ 3 4 ⋯ 8 9 10 ⎞
|
||||
Task 𝐿𝑖 𝑇𝑖 = 𝐷𝑖 𝑖𝐿𝑂 𝑖𝐻 𝐼 ⎜ ⎟
|
||||
𝐿 (𝑡) = = ⎜0.008645 0.273 ⋯ 0.00266 0.000384 0.000001⎟
|
||||
⎛ 1 3 4 5 ⎞ ⎛ 1 3 ⎞ ⎜0.008645 0.281645 ⋯ 0.999615 0.999999 1.0 ⎟⎠
|
||||
⎜0.455 ⎝
|
||||
𝜏1 LO 10 0.54 0.004 0.001⎟ ⎜0.455 0.545⎟
|
||||
⎜ ⎟ ⎜ ⎟ from Eq. (11). Moreover, from (17), we have 𝐻 (𝑡) = .
|
||||
⎝0.455 0.995 0.999 1.0 ⎠ ⎝0.455 1.0 ⎠
|
||||
⎛ 0.5 1 ⎞ ⎛ 0.5 1 2 3 ⎞ When 10 < 𝑡 < 20, 𝑚1 = 0, 𝑚2 = −1, 𝑚3 = 0, 𝑎𝑖 = 0, 𝑐𝑖 = 0, and
|
||||
𝜏2 HI 20 ⎜0.49 0.51⎟ ⎜0.49 0.5 0.009 0.001⎟
|
||||
⎜ ⎟ ⎜ ⎟ 𝑏𝑖 = 0 (𝑖 = 1, 2, 3). According to Eq. (11), we have 𝐿 (𝑡) = . If
|
||||
⎝0.49 1.0 ⎠ ⎝0.49 0.99 0.999 1.0 ⎠
|
||||
𝑡𝑠 < 10, 𝑘𝑖 = 0 (𝑖 = 1, 2, 3). According to Eq. (17), we have 𝐻 (𝑡) =
|
||||
⎛ 2 3 4 5 ⎞ ⎛ 2 3 ⎞
|
||||
𝜏3 LO 10 ⎜0.019 0.6 0.38 0.001⎟ ⎜0.019 0.981⎟ and max{ 𝐻 (𝑡)} ≤ 𝑡. If 10 ≤ 𝑡𝑠 < 𝑡, we have 𝑘1 = 1, 𝑘2 = 0,
|
||||
⎜ ⎟ ⎜ ⎟
|
||||
⎝0.019 0.619 0.999 1.0 ⎠ ⎝0.019 1.0 ⎠ and 𝑘3 = 1. According to Eq. (14), we have 𝐻 𝐿𝑂 and
|
||||
𝐿 (𝜏1 , 𝑡) = 1
|
||||
𝐻 (𝜏
|
||||
𝐿 3 , 𝑡) = 𝐿𝑂 . We calculate 𝐻 (𝜏 , 𝑡) = (0) from Eq. (15).
|
||||
3 𝐻 2
|
||||
In addition, we have 𝐻 (𝑡) = from Eq. (17). Therefore, we have
|
||||
max{ 𝐻 (𝑡)} ≤ 𝑡 and max{ 𝐿 (𝑡)} ≤ 𝑡.
|
||||
When 𝑡 = 20, 𝑚1 = 1, 𝑚2 = 0, 𝑚3 = 1. According to Eq. (10), we
|
||||
have 𝐿 (𝜏1 , 𝑡) = (2) ⊙ 1𝐿𝑂 , 𝐿 (𝜏2 , 𝑡) = 2𝐿𝑂 , and 𝐿 (𝜏3 , 𝑡) =
|
||||
Proof. The IMC taskset 𝛤 is deterministically schedulable under EDF if it
|
||||
(2) ⊙ 3𝐿𝑂 . In addition, we have
|
||||
is deterministically schedulable in both LO mode and HI mode. The condi-
|
||||
tion for deterministic schedulability in LO mode and HI mode Eq. (18) ⎛ 6.5 ⋯ 19 20.5 21 ⎞
|
||||
𝐿 (𝑡) = ⎜0.00423605 ⋯ 0.00019584 0.00000049 0.00000051⎟
|
||||
is self-evident, because it can be directly derived from Theorems 1 and ⎜ ⎟
|
||||
2. In addition, the IMC taskset 𝛤 is probabilistically schedulable under ⎝0.00406315 ⋯ 0.999999 0.99999949 1.0 ⎠
|
||||
EDF if it is probabilistically schedulable in both LO mode and HI mode. from Eq. (11). If 𝑡𝑠 < 10, 𝑎1 = 1, 𝑎2 = 0, 𝑎3 = 1, 𝑐1 = 1, 𝑐2 = 0,
|
||||
The condition for probabilistic schedulability (Eq. (19)) states that the 𝑐3 = 1, 𝑘𝑖 = 0, and 𝑏𝑖 = 0 (𝑖 = 1, 2, 3). From Eq. (17), we have
|
||||
probability that the processor demand of all tasks in both LO mode and max{ 𝐻 (𝑡)} = 19. If 10 ≤ 𝑡𝑠 < 𝑡, 𝑘1 = 1, 𝑘2 = 0, 𝑘3 = 1, 𝑏1 = 1,
|
||||
HI mode exceeds 𝑡 is less than or equal to 𝐹𝑠 , hence it is probabilistically 𝑏2 = 0, 𝑏3 = 1, 𝑎𝑖 = 0 and 𝑐𝑖 = 0 (𝑖 = 1, 2, 3). According to Eq. (17),
|
||||
schedulable with system failure probability not exceeding 𝐹𝑠 . (Note that we have max{ 𝐻 (𝑡)} = 23. Therefore, we have max{ 𝐿 (𝑡)} > 𝑡
|
||||
the condition of deterministic schedulability in Eq. (18) is a special and max{ 𝐻 (𝑡)} > 𝑡 (10 ≤ 𝑡𝑠 < 𝑡), but 1 − 𝐹 𝐿 (𝑡) (𝑡) ≤ 𝐹𝑠
|
||||
case of the condition of probabilistic schedulability in Eq. (19), with and 1 − 𝐹 𝐻 (𝑡) (𝑡) ≤ 𝐹𝑠 . According to Theorem 3, the taskset 𝛤 is
|
||||
permitted system failure probability equal to 0 (𝐹𝑠 = 0).) Q.E.D. probabilistically schedulable.
|
||||
In the deterministic analysis, the processor demand grows in a
|
||||
stepwise manner based on the interval length. The processor demand 5. Energy-efficient task execution model
|
||||
is affected only when the increase in interval length is a multiple of the
|
||||
task period. When we switch to probabilistic analysis, the probability We present in sequence the power model, the calculation of energy-
|
||||
distribution of processor demand also increases in a stepwise manner to efficient processor speeds in LO mode, and the Energy-Efficient Task
|
||||
maintain consistency. In other words, during deterministic analysis, the Execution Model in this section.
|
||||
processor demand does not change in the given time intervals, and in
|
||||
probabilistic scheduling analysis, the values in its probability distribu- 5.1. Power model
|
||||
tion of processor demand also remain unchanged. Specifically, there are
|
||||
some 𝑡𝑘 values that can generate the same probability distribution of We adopt the state-of-the-art processor power model [39–41]
|
||||
processor demand. The values of 𝐹 𝐿 (𝑡𝑘 ) (𝑡𝑘 ) and 𝐹 𝐻 (𝑡𝑘 ) (𝑡𝑘 ), which
|
||||
𝑃 = 𝑃𝑠 + ℎ(𝑃𝑖𝑛𝑑 + 𝐶𝑒𝑓 𝑠𝑚 ), (20)
|
||||
correspond to the same probability distribution of processor demand,
|
||||
should not be computed repeatedly in Eq. (19). Therefore, we only where 𝑃𝑠 is a static power and 𝑃𝑖𝑛𝑑 is the frequency-independent active
|
||||
calculate them once. In addition, if 𝑡1 , 𝑡2 and 𝑡𝑙 (𝑡1 < 𝑡2 < 𝑡𝑙 ) can generate power. ℎ = 1 if the system is active (defined as having computation in
|
||||
the same probability distribution of the processor demand for all tasks progress); otherwise, ℎ = 0. 𝐶𝑒𝑓 is an effective switching capacitance
|
||||
in both modes, we choose the minimum value 𝑡1 among these values, and 𝑚 is a system-application-dependent constant. 𝑠 is the normalized
|
||||
which corresponds to 𝐹 𝐿 (𝑡1 ) (𝑡1 ) and 𝐹 𝐻 (𝑡1 ) (𝑡1 ). This is because it processor speed (frequency). Like [39], we ignore a static power (𝑃𝑠 =
|
||||
is the value that maximizes the probability of the processor demand 0) and set 𝑃𝑖𝑛𝑑 = 0.01, 𝐶𝑒𝑓 = 1, 𝑚 = 3.
|
||||
exceeding the interval length. Considering our task model, the expected energy consumption of a
|
||||
single job of task 𝜏𝑖 is [42–44]:
|
||||
$$E_i = (P_{ind} + C_{ef} s^m) \cdot \frac{\bar{x}_i}{s} \tag{21}$$
|
||||
where $\bar{x}_i = \sum_{k=0}^{K-1} C_i^k \cdot f_i^{LO}(C_i^k)$ with the normalized processor speed 𝑆𝑚𝑎𝑥 = 1. In addition, the processor speed 𝑠 should not be lower than 𝑆𝑐𝑟𝑖𝑡 , where 𝑆𝑐𝑟𝑖𝑡 (𝑆𝑐𝑟𝑖𝑡 < 𝑆𝑚𝑎𝑥 ) is an energy-efficient speed that can be computed as $S_{crit} = \sqrt[m]{\frac{P_{ind}}{(m-1) \cdot C_{ef}}}$ [39].
|
||||
4.2. Example 2
|
||||
We present a taskset 𝛤2 , with the parameters shown in Table 3. (The nominal pWCET $\mathcal{C}_i$ is omitted for brevity.) We assume that 𝐹𝑠 = 1.0 × 10⁻⁶. In this example, 𝑡𝑚𝑎𝑥 = 20. For 0 < 𝑡 < 10 and 0 < 𝑡𝑠 < 𝑡, we have 𝑚𝑖 = −1 ($m_i = \lfloor (t - D_i)/T_i \rfloor$), 𝑘𝑖 = 0 ($k_i = \lfloor t_s/T_i \rfloor$), 𝑎𝑖 = 0, 𝑐𝑖 = 0, and
|
||||
𝑏𝑖 = 0 (𝑖 = 1, 2, 3). According to Eq. (10), 𝐿 (𝜏𝑖 , 𝑡) = (0). In To facilitate comparisons between task sets with varying hyper-
|
||||
addition, we have 𝐿 (𝑡) = (0) from Eq. (11). From Eq. (12), we periods, we utilize the definition of normalized energy consumption of
|
||||
have (𝐽𝐿 , 𝑡) = (0) for LO tasks 𝜏1 and 𝜏3 . Moreover, we have task set 𝛤 within its hyper-period [22] (i.e., its power consumption):
|
||||
$$NE(\Gamma) = \frac{1}{HP(\Gamma)} \sum_{i=1}^{n} \sum_{j=1}^{h_i} (P_{ind} + C_{ef} s^m) \cdot \frac{\bar{x}_i}{s} = \sum_{i=1}^{n} (P_{ind} + C_{ef} s^m) \cdot \frac{\bar{x}_i}{s \cdot T_i}, \tag{22}$$
|
||||
(𝐽𝐻 , 𝑡) = (0) for HI task 𝜏2 from Eq. (13). Therefore, we have 𝐻𝐿 (𝜏1 , 𝑡) = (0) and 𝐻𝐿 (𝜏3 , 𝑡) = (0) from Eq. (14). Due to 𝑘2 = 0, 𝑎2 = 0, 𝑏2 = 0 and 𝐷2 > 𝑡 − 𝑡𝑠 , we have (1) = (0), (2) = (0), and max{ (2)} ≤ max{ (1)}. According to Eq. (15), we have 𝐻𝐻 (𝜏2 , 𝑡) = (0). We calculate 𝐻 (𝑡) = (0) from Eq. (17).
|
||||
Therefore, we have max{ 𝐿 (𝑡)} ≤ 𝑡 and max{ 𝐻 (𝑡)} ≤ 𝑡.
|
||||
When 𝑡 = 10, 𝑚1 = 0, 𝑚2 = −1, 𝑚3 = 0, 𝑘𝑖 = 0, 𝑎𝑖 = 0, 𝑐𝑖 = 0, and where ℎ𝑖 = 𝐻 𝑃 (𝛤 )∕𝑇𝑖 is the number of jobs of task 𝜏𝑖 ∈ 𝛤 released in
|
||||
𝑏𝑖 = 0 (𝑖 = 1, 2, 3). According to Eq. (10), we have 𝐿 (𝜏1 , 𝑡) = 1𝐿𝑂 , the hyper-period 𝐻 𝑃 (𝛤 ).
|
||||
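The power model of Section 5.1 can be checked numerically with a short sketch. The function names `critical_speed` and `normalized_energy` are hypothetical, the parameter defaults follow the settings stated in the text (𝑃𝑖𝑛𝑑 = 0.01, 𝐶𝑒𝑓 = 1, 𝑚 = 3), and a single speed is assumed for all tasks. The expected execution times and periods are those reported for the taskset of Example 3 (Table 4).

```python
def critical_speed(p_ind, c_ef, m):
    """Speed minimizing energy per unit of work: (P_ind / ((m-1) C_ef))^(1/m)."""
    return (p_ind / ((m - 1) * c_ef)) ** (1.0 / m)


def normalized_energy(mean_exec, periods, s, p_ind=0.01, c_ef=1.0, m=3):
    """Normalized energy of a taskset in the form of Eq. (22).

    mean_exec: expected execution time of each task at full speed.
    periods:   period of each task.
    s:         normalized processor speed applied to every task.
    """
    active_power = p_ind + c_ef * s ** m
    return sum(active_power * x / (s * t) for x, t in zip(mean_exec, periods))


# Expected execution times and periods reported for Example 3 (Table 4).
x_bar = [1.775, 1.99, 2.2]
periods = [10, 20, 10]

s_crit = critical_speed(0.01, 1.0, 3)
ne_dvfs = normalized_energy(x_bar, periods, 0.8)
ne_full = normalized_energy(x_bar, periods, 1.0)
```

Under these assumptions the sketch reproduces the normalized energy values quoted in Example 3 (about 0.3242925 at speed 0.8 and 0.50197 at full speed), and the critical speed evaluates to roughly 0.171, below every speed considered there.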
|
||||
|
||||
|
||||
5.2. Calculating energy-efficient processor speeds Table 4
|
||||
Taskset parameters of 𝛤3 , with 𝐶1𝑑 𝑒𝑔 = 1.5, 𝐶2𝑡ℎ𝑟 = 2, 𝐶3𝑑 𝑒𝑔 = 2.
|
||||
|
||||
We determine the energy-efficient processor speed in LO mode 𝑆𝐿 Task 𝐿𝑖 𝑇𝑖 = 𝐷𝑖 𝑖𝐿𝑂 𝑖𝐻 𝐼
|
||||
and schedule the tasks with 𝑆𝑚𝑎𝑥 = 1 in HI mode if an IMC taskset 𝛤 is ⎛1 1.5 2 2.5 ⎞ ⎛1 1.5⎞
|
||||
𝜏1 LO 10 ⎜0.1 0.4 0.35 0.15⎟ ⎜0.1 0.9⎟
|
||||
deterministically schedulable by EDF on a single processor. ⎜ ⎟ ⎜ ⎟
|
||||
⎝0.1 0.5 0.85 1.0 ⎠ ⎝0.1 1.0⎠
|
||||
A taskset 𝛤 running on a processor with speed 𝑆𝐿 is equivalent
|
||||
⎛ 1 2 ⎞ ⎛ 1 2 4 5 ⎞
|
||||
to the taskset 𝛤 ∗ running on a processor with speed 𝑆max = 1 with 𝜏2 HI 20 ⎜0.01 0.99⎟ ⎜0.01 0.49 0.45 0.05⎟
|
||||
⎜ ⎟ ⎜ ⎟
|
||||
proportionally-scaled execution times 1∕𝑆𝐿 times of each task in 𝛤 . ⎝0.01 1.0 ⎠ ⎝0.01 0.5 0.95 1.0 ⎠
|
||||
Therefore, the probabilistic processor demand of any task 𝜏𝑖 ∈ 𝛤 with ⎛1.5 2 2.5 3⎞ ⎛1.5 2⎞
|
||||
𝜏3 LO 10 ⎜0.2 0.3 0.4 0.1⎟ ⎜0.2 0.8⎟
|
||||
speed 𝑆𝐿 within [0, 𝑡) in LO mode can be calculated as follows: ⎜ ⎟ ⎜ ⎟
|
||||
⎝0.2 0.5 0.9 1.0⎠ ⎝0.2 1.0⎠
|
||||
𝐿 (𝜏𝑖 , 𝑡) = ([[𝑚𝑖 + 1]]0 ) ⊙ ((1∕𝑆𝐿 ) ⊙ 𝑖𝐿𝑂 ), (23)
|
||||
|
||||
The probabilistic processor demand of a carry-over job released by
|
||||
LO task 𝜏𝑖 with speed 𝑆𝐿 within [0, 𝑡) can be calculated as follows: the energy-efficient task execution model based on DVFS as shown below.
|
||||
{
|
||||
(1∕𝑆𝐿 ) ⊙ 𝑖𝐿𝑂 , 𝑟𝑖 + 𝐷𝑖 ≤ 𝑡 Energy-efficient task execution model in probabilistic IMC. The
|
||||
(𝐽𝐿 , 𝑡) = (24) system is first initialized to be in LO mode with processor speed 𝑆𝐿 . If
|
||||
(0), 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒.
|
||||
any HI task 𝜏𝑖 ∈ 𝛤𝐻 𝐼 executes beyond its 𝐶𝑖𝑡ℎ𝑟 ∕𝑆𝐿 , the system switches
|
||||
The probabilistic processor demand of any task 𝜏𝑖 ∈ 𝛤𝐿𝑂 with speed into HI mode, with processor speed 𝑆𝑚𝑎𝑥 = 1. At the mode-switch
|
||||
𝑆𝐿 within [0, 𝑡) in HI mode can be calculated as follows: instant, if jobs of LO tasks have run for longer than their 𝐶𝑖𝑑 𝑒𝑔 ∕𝑆𝐿 , the
|
||||
⨂
|
||||
𝐻 𝐿𝑂
|
||||
𝐿 (𝜏𝑖 , 𝑡) =((𝑘𝑖 ) ⊙ ((1∕𝑆𝐿 ) ⊙ 𝑖 )) (25) jobs will be stopped until new released. In addition, if the execution
|
||||
⨂ time of LO jobs is less than 𝐶𝑖𝑑 𝑒𝑔 ∕𝑆𝐿 by the switch time instant, these
|
||||
𝐻𝐼
|
||||
(𝐽𝐿 , 𝑡) ((𝑐𝑖 ) ⊙ 𝑖 ).
|
||||
carry-over jobs will continue to execute the leftover execution up to
|
||||
In addition, the system schedules tasks with 𝑆𝐿 in LO mode and 𝐶𝑖𝑙𝑒𝑓 𝑡𝑜𝑣𝑒𝑟 after the switch time instant and before their deadlines, where
|
||||
𝑆𝑚𝑎𝑥 = 1 in HI mode, (1) and (2) in Eq. (16) are calculated 𝐶𝑖𝑙𝑒𝑓 𝑡𝑜𝑣𝑒𝑟 is the leftover execution time at the nominal processor speed
|
||||
by Eqs. (26) and (27), respectively. 𝑆𝑚𝑎𝑥 = 1. While in HI mode, each LO task 𝜏𝑖 ∈ 𝛤𝐿𝑂 executes no more
|
||||
⨂ than its 𝐶𝑖𝑑 𝑒𝑔 if it is started in HI mode, or its 𝐶𝑖𝑙𝑒𝑓 𝑡𝑜𝑣𝑒𝑟 if it is a leftover
|
||||
(1) =((𝑏𝑖 ) ⊙ ((1∕𝑆𝐿 ) ⊙ 𝑖𝐿𝑂 )) (26)
|
||||
⨂ job started in LO mode. The system switches back to LO mode, with
|
||||
𝐻𝐼
|
||||
(𝐽𝐻 , 𝑡) ((𝑎𝑖 ) ⊙ 𝑖 ). processor speed 𝑆𝐿 , at an idle instant if no jobs wait for executions at
|
||||
this time. In addition, incomplete tasks are dropped at their deadlines,
|
||||
⨂ hence there does not exist a backlog of outstanding execution at the
|
||||
(2) = ((𝑘𝑖 ) ⊙ ((1∕𝑆𝐿 ) ⊙ 𝑖𝐿𝑂 )) (𝐽𝐻 , 𝑡). (27) end of each hyper-period.
|
||||
|
||||
Theorem 4. Given an IMC taskset 𝛤 that is deterministically schedulable by EDF on a single processor, it remains deterministically schedulable with the energy-efficient processor speed 𝑆𝐿 in LO mode and 𝑆𝑚𝑎𝑥 = 1 in HI mode if 0 < ∀𝑡 ≤ 𝑡𝑚𝑎𝑥 , 0 < 𝑡𝑠 < 𝑡

max{ 𝐿 (𝑡)} ≤ 𝑡, and max{ 𝐻 (𝑡)} ≤ 𝑡, (28)

where 𝑆𝑐𝑟𝑖𝑡 ≤ 𝑆𝐿 ≤ 1, 𝐿 (𝜏𝑖 , 𝑡), (𝐽𝐿 , 𝑡), 𝐻 𝐿 (𝜏𝑖 , 𝑡), (1) and (2) are given in Eqs. (23)–(27), respectively.

Proof. Theorem 4 can be directly derived from Theorem 3.

5.3. Example 3

Let us consider the task set 𝛤3 that consists of tasks with the parameters presented in Table 4. The processor has ten discrete normalized processor speeds, i.e., [0.1, 0.2, … , 1.0] [45]. According to Theorem 3, the taskset is deterministically schedulable in both modes. We calculate 𝑆𝐿 = 0.8 on the basis of Theorem 4, by iteratively trying out the available speeds, from lowest to highest, until we find the minimum speed that satisfies all constraints. According to Eq. (21), we have 𝑥̄ 1 = 1.775, 𝑥̄ 2 = 1.99, 𝑥̄ 3 = 2.2. We can then use Eq. (22) to obtain the taskset’s normalized energy consumption: 0.3242925 with processor speed 𝑆𝐿 = 0.8 with DVFS, and 0.50197 with processor speed 𝑆max = 1 for EDF without DVFS, which represents significant energy savings.

5.4. Energy-efficient task execution model

Assuming that the system is deterministically schedulable in both modes, we can use DVFS to reduce the processor speed to 𝑆𝐿 in LO mode, and set it to 𝑆𝑚𝑎𝑥 = 1 in HI mode, while maintaining schedulability in both modes. We modify the task execution model in Section 3.1 to be

6. Experimental evaluation

We evaluate our approach based on two performance metrics: the schedulability ratio, which represents the proportion of schedulable task sets (either deterministically or probabilistically schedulable) out of all task sets; and the normalized energy consumption of each task set, as defined in Eq. (22).

We generate synthetic tasksets based on the following experiment settings:

• Number of tasks in each taskset 𝛤 is set to 𝑛 = 4.
• Number of HI tasks in 𝛤 is set to 𝑛 ⋅ 𝐶𝑃 , where the Criticality Proportion 𝐶𝑃 is set to 𝐶𝑃 = 0.5.
• Number of discrete values of each task 𝜏𝑖 ’s nominal pWCET 𝑖 is set to 𝐾 = 4.
• Each of the 𝐾 probability values in the PMF of 𝑖 is selected randomly from [0, 1) while ensuring that they sum to 1 (similar to [46,47]).
• For each LO task 𝜏𝑖 ∈ 𝛤𝐿𝑂 , the index of the Degraded WCET 𝐶𝑖𝑑𝑒𝑔 among the 𝐾 discrete values of 𝑖 is set to 𝑖𝑛𝑑(𝐶𝑖𝑑𝑒𝑔 ) = 0.5𝐾 − 1 = 1.
• For each HI task 𝜏𝑖 ∈ 𝛤𝐻𝐼 , the index of the Threshold WCET 𝐶𝑖𝑡ℎ𝑟 among the 𝐾 discrete values of 𝑖 is set to 𝑖𝑛𝑑(𝐶𝑖𝑡ℎ𝑟 ) = 0.5𝐾 − 1 = 1.
• 𝑇𝑖 is randomly selected from the set {10, 20, 40, 50, 100, 200, 400, 500, 1000} [48].
• To control taskset processor utilization, max{𝐿𝑂𝐿𝑂 (𝛤 )} is varied from 0.1 to 0.9, in steps of 0.1, while max{𝐻𝐼𝐻𝐼 (𝛤 )} is chosen randomly from the range [0.1, 1.0].

(Each task 𝜏𝑖 ’s pWCET 𝑖 and period 𝑇𝑖 are implicit, since both system schedulability and normalized energy consumption are dependent on the utilization values only, i.e., pWCU equal to pWCET divided by period.) Note that the time overhead of the proposed method is mainly
|
||||
|
||||
Y.-W. Zhang and J.-L. Zhang Journal of Systems Architecture 160 (2025) 103361
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 2. Impact on the schedulability ratio of varying the permitted system failure probability 𝐹𝑠 and max{𝐿𝑂𝐿𝑂 (𝛤 )}.
|
||||
|
||||
|
||||
|
||||
|
||||
spent on the schedulability test, with significant time consumption
|
||||
arising from the calculation of the probabilistic processor demands for
|
||||
the task set, which involves a large number of convolution operations.
|
||||
As the number of tasks increases, the time overhead grows exponen-
|
||||
tially. To maintain the accuracy of the scheduling test, we have not
|
||||
yet identified better methods to reduce the time overhead. Hence, we
|
||||
have limited the number of tasks to four. In the future, we will strive
|
||||
to reduce the time overhead associated with convolutions.
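
The convolution step that dominates this overhead can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: execution-time PMFs are assumed independent and are stored as plain Python dicts mapping a value to its probability, and the names `convolve_pmf`, `demand_pmf` and `exceedance_probability` are hypothetical helpers introduced only for this example.

```python
from collections import defaultdict


def convolve_pmf(pmf_a, pmf_b):
    """PMF of the sum of two independent discrete random variables."""
    out = defaultdict(float)
    for value_a, prob_a in pmf_a.items():
        for value_b, prob_b in pmf_b.items():
            # Every pair of support points contributes to one sum.
            out[value_a + value_b] += prob_a * prob_b
    return dict(out)


def demand_pmf(job_pmfs):
    """Convolve a sequence of per-job PMFs into one total-demand PMF."""
    total = {0: 1.0}  # neutral element: zero demand with probability 1
    for pmf in job_pmfs:
        total = convolve_pmf(total, pmf)
    return total


def exceedance_probability(pmf, bound):
    """Probability that the modeled demand is strictly larger than bound."""
    return sum(prob for value, prob in pmf.items() if value > bound)


if __name__ == "__main__":
    # Assumed example PMF with K = 4 discrete values (illustrative numbers).
    job = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
    for n_jobs in (1, 2, 4, 8):
        total = demand_pmf([job] * n_jobs)
        print(n_jobs, len(total), exceedance_probability(total, 2 * n_jobs))
```

With 𝐾 values per job, the support of the convolved PMF can contain up to 𝐾 to the power 𝑚 distinct sums after 𝑚 convolutions when no two sums coincide, which is one way to see why the cost of the test rises so quickly with the number of jobs considered.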
|
||||
In the first experiment, we vary 𝐹𝑠 from 10⁻¹ to 10⁻⁹ with a
|
||||
multiplicative step of 10, i.e., 𝐹𝑠 is plotted on a log scale. The value
|
||||
𝐹𝑠 = 10⁻⁹ is based on the permitted failure probability of 10⁻⁹ for ASIL
|
||||
D, the highest safety certification level in ISO 26262. The additional
|
||||
case of 𝐹𝑠 = 0 is the special case of deterministic schedulability only for hard real-time systems. Fig. 2 shows the results, where each data point represents the average outcome obtained from a variable number of task sets selected from 500 synthetic tasksets generated for each value of max{𝐿𝑂𝐿𝑂 (𝛤 )}, using different seeds for the pseudo-random number generator.

We make the following observations from Fig. 2:

• The schedulability ratio is positively correlated with 𝐹𝑠 , confirming the significant advantages of considering probabilistic schedulability compared to considering deterministic schedulability only, even at very small values of 𝐹𝑠 for high levels of safety certification.
• The schedulability ratio is negatively correlated with max{𝐿𝑂𝐿𝑂 (𝛤 )}, since both max{ 𝐿 (𝑡)} and max{ 𝐻 (𝑡)} increase with increasing max{𝐿𝑂𝐿𝑂 (𝛤 )}, which reduces system schedulability.

In the second experiment, we fix the permitted system failure probability to be 𝐹𝑠 = 10⁻⁷ (based on the requirement for ASIL A in ISO 26262). We vary each HI task’s 𝐶𝑖𝑡ℎ𝑟 by varying its index 𝑖𝑛𝑑(𝐶𝑖𝑡ℎ𝑟 ) from 0 to 𝐾 − 1 with step size 1, i.e., the sequence {0, 1, 2, 3}. (The case of 𝑖𝑛𝑑(𝐶𝑖𝑡ℎ𝑟 ) = 3 is the special case where each HI task 𝜏𝑖 has the same WCET in both modes.) Each LO task’s 𝐶𝑖𝑑𝑒𝑔 is fixed to be the default value of 𝑖𝑛𝑑(𝐶𝑖𝑑𝑒𝑔 ) = 1. The results are shown in Fig. 3, including both the schedulability ratio and the normalized energy consumption (𝑁𝐸(𝛤 ) defined in Eq. (22)). Each data point represents the average outcome obtained from a variable number of task sets selected from 500 synthetic tasksets generated for each value of max{𝐿𝑂𝐿𝑂 (𝛤 )}, depending on the value of 𝑖𝑛𝑑(𝐶𝑖𝑡ℎ𝑟 ).

Fig. 3. Varying each HI task’s Threshold WCET index 𝑖𝑛𝑑(𝐶𝑖𝑡ℎ𝑟 ) and max{𝐿𝑂𝐿𝑂 (𝛤 )}.

We make the following observations from Fig. 3:

• The schedulability ratio is negatively correlated with max{𝐿𝑂𝐿𝑂 (𝛤 )}, as expected.
• The schedulability ratio is negatively correlated with 𝐶𝑖𝑡ℎ𝑟 . With increasing 𝐶𝑖𝑡ℎ𝑟 , HI tasks have larger WCETs (both expected and maximum) in LO mode according to the trimming operation for pWCET defined in Eq. (2), causing max{ 𝐿 (𝑡)} and max{ 𝐻 (𝑡)} to increase, which reduces system schedulability.
• The average normalized energy consumption 𝑁𝐸(𝛤 ) is positively correlated with max{𝐿𝑂𝐿𝑂 (𝛤 )}. From Eq. (22), 𝑁𝐸(𝛤 ) is dependent on each task’s expected pWCET 𝑥𝑖 and the energy-efficient processor speed in LO mode 𝑆𝐿 . With increasing max{𝐿𝑂𝐿𝑂 (𝛤 )}, both 𝑥𝑖 and 𝑆𝐿 increase, causing 𝑁𝐸(𝛤 ) to increase.
• 𝑁𝐸(𝛤 ) is positively correlated with 𝐶𝑖𝑡ℎ𝑟 . With increasing 𝐶𝑖𝑡ℎ𝑟 , HI task 𝜏𝑖 has a larger expected pWCET in LO mode, causing both 𝑥𝑖 and 𝑆𝐿 to increase, which in turn causes 𝑁𝐸(𝛤 ) to increase.

Averaged over all cases, our approach achieves an average reduction of 33.49% for the average normalized energy consumption compared to EDF without DVFS.

7. Practical considerations

In this section, we address some practical considerations in transposing our proposal into industry practice.

Timing analysis for pWCET. Task 𝜏𝑖 ’s pWCET 𝑖 , as specified by its PMF, may be obtained via static, dynamic or measurement-based, or hybrid timing analysis methods, as discussed in the survey paper [49]. Static Probabilistic Timing Analysis (SPTA) is based on the analysis of the program code, along with an abstract model of the
|
||||
|
||||
|
||||
|
||||
hardware behavior. Measurement-Based Probabilistic Timing Analysis (MBPTA) typically applies Extreme Value Theory (EVT) to make a statistical estimate of the pWCET distribution of a program. Hybrid Probabilistic Timing Analysis (HyPTA) combines both statistical and analytical approaches, e.g., by taking measurements at the level of basic blocks or sub-paths, and then composing the results using structural information obtained from static analysis of the code.

Number of discrete values (𝐾) of pWCET 𝑖 . The value of 𝐾 determines the granularity of modeling the pWCET’s PMF: larger 𝐾 implies finer granularity modeling, but may not be well-supported by timing analysis techniques, and also leads to higher computational costs in schedulability analysis. The typical value of 𝐾 is 2–8 [5], although there is no hard lower or upper bound on its value. Our experiments with 𝐾 varying from 4 to 8 indicate that its value does not affect system schedulability and power consumption significantly, indicating that 𝐾 = 4 already provides sufficiently fine granularity modeling under our experimental setup.

PMF of pWCET 𝑖 . In the absence of real industry tasksets, we need to generate each task’s pWCET 𝑖 synthetically, as defined by the PMF. There is no clear consensus on the generation method in the literature on probabilistic schedulability analysis. An early work by Edgar and Burns [50] used the trimmed and scaled Gumbel distribution to model likely WCET values; Draskovic et al. [36] used the Weibull distribution with an upper bound, which was used for modeling the distribution of long but unlikely execution times based on EVT [51] (the log of a Weibull distribution is a Gumbel distribution); Wang et al. [46] and Markovic et al. [47] adopted the uniform random distribution; Bozhko et al. [52] assumed two execution modes for each task in an MCS: a typical mode and a rare exceptional mode. Its pWCET is equal to 𝑐 with probability 0.95 (the typical mode), and 4𝑐 with probability 0.05 (the exceptional mode), where 𝑐 was scaled to match the expected task utilization. In this paper, we adopt the simple approach of the uniform random distribution similar to [46,47].

Runtime overhead of DVFS. The overhead of varying the processor speed with DVFS is assumed to be zero. This is a common assumption adopted in the DVFS literature [7]. We can determine through offline measurement an upper bound on the processor speed transition overhead, which is typically relatively small compared to the WCET of the task, hence it can be added to each task’s execution time without a significant impact on the solution.

Multiprocessor platforms. Our work can be easily extended to multi-processor platforms by a partitioned scheduling approach [31,32,53]. In partitioned scheduling, tasks are statically assigned to processors, with each processor managed by a local scheduler. We can use simple allocation methods, e.g., criticality-unaware worst-fit decreasing (CU-WFD), and criticality-aware first-fit decreasing (CA-FFD), to allocate tasks to each processor while using an Energy-Efficient Task Execution Model to schedule tasks in each processor.

8. Conclusions and future work

The classic MCS task model has several restrictive assumptions, including hard real-time constraints, dropping LO tasks in HI mode, and lack of consideration of power/energy consumption issues. In this paper, we relax these assumptions to make the MCS task model more practically applicable. We consider an IMC taskset scheduled with the EDF algorithm on a uniprocessor platform, and propose an Energy-Efficient Task Execution Model that guarantees (deterministic or probabilistic) schedulability, allows degraded QoS to LO tasks in HI mode, and applies DVFS to save energy.

In this paper, we have considered EDF-based uniprocessor scheduling, dual-criticality MCS, and task execution time as probabilistic variables. As part of future work, these assumptions can be further relaxed to fixed-priority scheduling, multi-processor platforms, multiple criticality levels, and multiple task parameters (e.g., task period) represented by random variables.

CRediT authorship contribution statement

Yi-Wen Zhang: Writing – review & editing, Writing – original draft, Methodology, Funding acquisition, Formal analysis, Conceptualization. Jin-Long Zhang: Writing – original draft, Visualization, Software, Data curation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work has been supported by the Natural Science Foundation of Fujian Province of China under Grant 2023J01139 and the Fundamental Research Funds for the Central Universities, China under Grant ZQN-1009.

Data availability

No data was used for the research described in the article.

References

[1] Alan Burns, Robert Ian Davis, Mixed criticality systems: a review (February 2022), 2022, pp. 1–97, https://eprints.whiterose.ac.uk/183619/.
[2] Steve Vestal, Preemptive scheduling of multi-criticality systems with varying degrees of execution time assurance, in: 28th IEEE International Real-Time Systems Symposium, RTSS 2007, IEEE, 2007, pp. 239–243.
[3] Yi-Wen Zhang, Jin-Peng Ma, Hui Zheng, Zonghua Gu, Criticality-aware EDF scheduling for constrained-deadline imprecise mixed-criticality systems, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 43 (2) (2024) 480–491.
[4] Yi-Wen Zhang, Hui Zheng, Slack time management for imprecise mixed-criticality systems with reliability constraints, IEEE Trans. Comput. (2025).
[5] Robert I. Davis, Liliana Cucu-Grosjean, A survey of probabilistic schedulability analysis techniques for real-time systems, Leibniz Trans. Embed. Syst. 6 (1) (2019) 04:1–04:53.
[6] Yi-Wen Zhang, Rong-Kun Chen, A survey of energy-aware scheduling in mixed-criticality systems, J. Syst. Archit. 127 (2022) 102524.
[7] Ashikahmed Bhuiyan, Federico Reghenzani, William Fornaciari, Zhishan Guo, Optimizing energy in non-preemptive mixed-criticality scheduling by exploiting probabilistic information, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 39 (11) (2020) 3906–3917.
[8] Yi-Wen Zhang, Chen Ouyang, Semi-clairvoyant scheduling in non-preemptive fixed-priority mixed-criticality systems, J. Syst. Archit. 159 (2025) 103332.
[9] Qingling Zhao, Mengfei Qu, Zonghua Gu, Haibo Zeng, Minimizing stack memory for partitioned mixed-criticality scheduling on multiprocessor platforms, ACM Trans. Embed. Comput. Syst. (TECS) 21 (2) (2022) 1–30.
[10] Benny Akesson, Mitra Nasri, Geoffrey Nelissen, Sebastian Altmeyer, Robert I Davis, A comprehensive survey of industry practice in real-time systems, Real-Time Syst. (2021) 1–41.
[11] Georg von der Brüggen, Nico Piatkowski, Kuan-Hsun Chen, Jian-Jia Chen, Katharina Morik, Björn B Brandenburg, Efficiently approximating the worst-case deadline failure probability under EDF, in: 2021 IEEE Real-Time Systems Symposium, RTSS, IEEE, 2021, pp. 214–226.
[12] Alexandre Esper, Geoffrey Nelissen, Vincent Nélis, Eduardo Tovar, An industrial view on the common academic understanding of mixed-criticality systems, Real-Time Syst. 54 (3) (2018) 745–795.
[13] Sanjoy Baruah, Alan Burns, Implementing mixed criticality systems in ADA, in: International Conference on Reliable Software Technologies, Springer, 2011, pp. 174–188.
[14] Sanjoy K. Baruah, Alan Burns, Robert I. Davis, Response-time analysis for mixed criticality systems, in: 2011 IEEE 32nd Real-Time Systems Symposium, IEEE Computer Society, 2011, pp. 34–43.
[15] François Santy, Gurulingesh Raravi, Geoffrey Nelissen, Vincent Nelis, Pratyush Kumar, Joël Goossens, Eduardo Tovar, Two protocols to reduce the criticality level of multiprocessor mixed-criticality systems, in: Proceedings of the 21st International Conference on Real-Time Networks and Systems, 2013, pp. 183–192.
|
||||
|
||||
|
||||
|
||||
|
||||
[16] Sanjoy Baruah, Vincenzo Bonifaci, Gianlorenzo D'Angelo, Haohan Li, Alberto Marchetti-Spaccamela, Suzanne Van Der Ster, Leen Stougie, The preemptive uniprocessor scheduling of mixed-criticality implicit-deadline sporadic task systems, in: 2012 24th Euromicro Conference on Real-Time Systems, IEEE, 2012, pp. 145–154.
[17] Di Liu, Nan Guan, Jelena Spasic, Gang Chen, Songran Liu, Todor Stefanov, Wang Yi, Scheduling analysis of imprecise mixed-criticality real-time tasks, IEEE Trans. Comput. 67 (7) (2018) 975–991.
[18] Hang Su, Nan Guan, Dakai Zhu, Service guarantee exploration for mixed-criticality systems, in: 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, IEEE, 2014, pp. 1–10.
[19] Robert I. Davis, Alan Burns, Iain Bate, Compensating adaptive mixed criticality scheduling, in: Proceedings of the 30th International Conference on Real-Time Networks and Systems, Association for Computing Machinery, 2022, pp. 81–93.
[20] Zhe Jiang, Xiaotian Dai, Alan Burns, Neil Audsley, Zonghua Gu, Ian Gray, A high-resilience imprecise computing architecture for mixed-criticality systems, IEEE Trans. Comput. (2022).
[21] Yi-Wen Zhang, Rui-Feng Guo, Low-power scheduling algorithms for sporadic task with shared resources in hard real-time systems, Comput. J. 58 (7) (2015) 1585–1597.
[22] Pengcheng Huang, Pratyush Kumar, Georgia Giannopoulou, Lothar Thiele, Energy efficient DVFS scheduling for mixed-criticality systems, in: 2014 International Conference on Embedded Software, EMSOFT, IEEE, 2014, pp. 1–10.
[23] Yi-Wen Zhang, Energy-aware mixed-criticality sporadic task scheduling algorithm, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 40 (1) (2021) 78–86.
[24] Yi-Wen Zhang, Rong-Kun Chen, Energy aware fixed priority scheduling in mixed-criticality systems, Comput. Stand. Interfaces 83 (2023) 103671.
[25] Yi-Wen Zhang, Energy efficient non-preemptive scheduling of imprecise mixed-criticality real-time tasks, Sustain. Comput.: Inform. Syst. 37 (2023) 100840.
[26] Yi-Wen Zhang, Ning Cai, Energy efficient EDF-VD-based mixed-criticality scheduling with shared resources, J. Syst. Archit. 119 (2021) 102246.
[27] Y.-W. Zhang, Energy aware algorithm based on actual utilization for periodic tasks in mixed-criticality real-time systems, Comput. Stand. Interfaces 79 (2022) 103563.
[28] Yi-Wen Zhang, DVFS-based energy-aware scheduling of imprecise mixed-criticality real-time tasks, J. Syst. Archit. 137 (2023) 102849.
[29] Sujay Narayana, Pengcheng Huang, Georgia Giannopoulou, Lothar Thiele, R Venkatesha Prasad, Exploring energy saving for mixed-criticality systems on multi-cores, in: 2016 IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS, IEEE, 2016, pp. 1–12.
[30] Behnaz Ranjbar, Tuan D.A. Nguyen, Alireza Ejlali, Akash Kumar, Power-aware runtime scheduler for mixed-criticality systems on multicore platform, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 40 (10) (2021) 2009–2023.
[31] Yi-Wen Zhang, Rong-Kun Chen, Zonghua Gu, Energy-aware partitioned scheduling of imprecise mixed-criticality systems, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 42 (11) (2023) 3733–3742.
[32] Yi-Wen Zhang, Jin-Peng Ma, Zonghua Gu, Partitioned scheduling with shared resources on imprecise mixed-criticality multiprocessor systems, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 44 (1) (2025) 65–76.
[33] Luca Santinelli, Laurent George, Probabilities and mixed-criticalities: the probabilistic c-space, in: Proceedings of WMC, 2015.
[34] Dorin Maxim, Robert I Davis, Liliana Cucu-Grosjean, Arvind Easwaran, Probabilistic analysis for mixed criticality systems using fixed priority preemptive scheduling, in: Proceedings of the 25th International Conference on Real-Time Networks and Systems, 2017, pp. 237–246.
[35] Jasdeep Singh, Luca Santinelli, Federico Reghenzani, Konstantinos Bletsas, Zhishan Guo, Non-preemptive scheduling of periodic mixed-criticality real-time systems, in: Proceedings of the 10th European Congress on Embedded Real-Time Systems, ERTS 2020, IEEE, 2020.
[36] Stefan Draskovic, Rehan Ahmed, Pengcheng Huang, Lothar Thiele, Schedulability of probabilistic mixed-criticality systems, Real-Time Syst. 57 (4) (2021) 397–442.
[37] Zhishan Guo, Sudharsan Vaidhun, Luca Santinelli, Samsil Arefin, Jun Wang, Kecheng Yang, Mixed-criticality scheduling upon permitted failure probability and dynamic priority, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 41 (1) (2021) 62–75.
[38] Kuan-Hsun Chen, Mario Günzel, Georg von der Brüggen, Jian-Jia Chen, Critical instant for probabilistic timing guarantees: Refuted and revisited, in: 2022 IEEE Real-Time Systems Symposium, RTSS, IEEE, 2022, pp. 145–157.
[39] Yifeng Guo, Dakai Zhu, Hakan Aydin, Jian-Jun Han, Laurence T Yang, Exploiting primary/backup mechanism for energy efficiency in dependable real-time systems, J. Syst. Archit. 78 (2017) 68–80.
[40] Yi-Wen Zhang, System level fixed priority energy management algorithm for embedded real time application, Microprocess. Microsyst. 64 (2019) 170–177.
[41] Yi-Wen Zhang, Chu-Gui Xu, Low power fixed priority scheduling sporadic task with shared resources in hard real time systems, Microprocess. Microsyst. 45 (2016) 164–175.
[42] Wei Jiang, Xiong Pan, Ke Jiang, Liang Wen, Qi Dong, Energy-aware design of stochastic applications with statistical deadline and reliability guarantees, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 38 (8) (2019) 1413–1426.
[43] Yi-Wen Zhang, Hui Zheng, Energy-aware fault-tolerant scheduling for imprecise mixed-criticality systems with semi-clairvoyance, J. Syst. Archit. 151 (2024) 103141.
[44] Yi-Wen Zhang, Hui Zheng, Energy-aware reliability guarantee scheduling with semi-clairvoyant in mixed-criticality systems, J. Syst. Archit. 156 (2024) 103269.
[45] Baoxian Zhao, Hakan Aydin, Dakai Zhu, Energy management under general task-level reliability constraints, in: 2012 IEEE 18th Real Time and Embedded Technology and Applications Symposium, IEEE, 2012, pp. 285–294.
[46] Tianyi Wang, Soamar Homsi, Linwei Niu, Shaolei Ren, Ou Bai, Gang Quan, Meikang Qiu, Harmonicity-aware task partitioning for fixed priority scheduling of probabilistic real-time tasks on multi-core platforms, ACM Trans. Embed. Comput. Syst. (TECS) 16 (4) (2017) 1–21.
[47] Filip Markovic, Thomas Nolte, Alessandro Vittorio Papadopoulos, Analytical approximations in probabilistic analysis of real-time systems, in: Proceedings of the 43rd IEEE Real-Time Systems Symposium, RTSS, IEEE, 2022.
[48] Jonah Caplan, Zaid Al-Bayati, Haibo Zeng, Brett H. Meyer, Mapping and scheduling mixed-criticality systems with on-demand redundancy, IEEE Trans. Comput. 67 (4) (2017) 582–588.
[49] Robert I. Davis, Liliana Cucu-Grosjean, A survey of probabilistic timing analysis techniques for real-time systems, LITES: Leibniz Trans. Embed. Syst. (2019) 1–60.
[50] Stewart Edgar, Alan Burns, Statistical analysis of WCET for scheduling, in: Proceedings 22nd IEEE Real-Time Systems Symposium (RTSS 2001) (Cat. No. 01PR1420), IEEE, 2001, pp. 215–224.
[51] Liliana Cucu-Grosjean, Luca Santinelli, Michael Houston, Code Lo, Tullio Vardanega, Leonidas Kosmidis, Jaume Abella, Enrico Mezzetti, Eduardo Quinones, Francisco J Cazorla, Measurement-based probabilistic timing analysis for multi-path programs, in: 2012 24th Euromicro Conference on Real-Time Systems, IEEE, 2012, pp. 91–101.
[52] Sergey Bozhko, Georg von der Brüggen, Björn Brandenburg, Monte carlo response-time analysis, in: IEEE 42nd Real-Time Systems Symposium, IEEE, 2021, pp. 342–355.
[53] Yi-Wen Zhang, Rong-Kun Chen, Energy-efficient scheduling of imprecise mixed-criticality real-time tasks based on genetic algorithm, J. Syst. Archit. 143 (2023) 102980.

Yi-Wen Zhang (Senior Member, IEEE) received his Ph.D. in Computer Application Technology from University of Chinese Academy of Sciences in 2016. He was a Post-doctoral Fellow with Shenyang Institute of Computing Technology, Chinese Academy of Sciences from 2017 to 2019. He has been an associate professor since 2020. He is named in the world’s top 2% of Scientists List 2023 and 2024 by Stanford University. His current research interests include real-time systems and low-power design.

Jin-Long Zhang received the B.E. degree in Software Engineering from Jiangxi Agricultural University in 2023. He is currently pursuing the MS degree in Huaqiao University. His current research interests include real-time systems and low power design.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,70 @@
|
||||
Embedded
|
||||
Software Design
|
||||
Journal of Systems Architecture
|
||||
|
||||
The EUROMICRO Journal
|
||||
Editor-in-Chief
|
||||
Dr. Zonghua Gu
|
||||
Department of Computer Science, Hofstra University, USA
|
||||
|
||||
|
||||
Subject Area Editors W. Meng
|
||||
L. Almeida Technical University of Denmark, Lyngby, Denmark
|
||||
Faculdade de Engenharia, Dept. of Electrical and Computer Engineering, M. Nasri
|
||||
Universidade do Porto, Porto, Portugal Department of Mathematics and Computer Science, Eindhoven University of
|
||||
J.H. Anderson Technology, Eindhoven, the Netherlands
|
||||
Dept. of Computer Science, University of North Carolina at Chapel Hill, G. Palermo
|
||||
Chapel Hill, North Carolina, USA Department of Electronics Information and Bioengineering,
|
||||
P. Bellavista Polytechnic University of Milan, Italy
|
||||
Dept. Computer Science and Engineering (DISI), Alma Mater Studiorum, L. Palopoli
|
||||
Università di Bologna, Bologna, Italy Dipartimento di Ingegneria e Scienza dell’Informazione (DISI),
|
||||
C.-S. Bouganis Università di Trento, Povo (Trento), Italy
|
||||
South Kensington Campus, Department of Electrical and Electronic S. Ren
|
||||
Engineering, Imperial College London, London, England, UK Department of Electrical and Computer Engineering, San Diego State University,
|
||||
L. Cassano USA
|
||||
Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, S. Sarangi
|
||||
Italy Department of Computer Science and Engineering, Indian Institute of
|
||||
G. Chen Technology Delhi, India
|
||||
School of Computer Science and Engineering, Sun Yat-sen University, M. Schoeberl
|
||||
Guangzhou, China DTU Informatics, Danmarks Tekniske Universitet (DTU), Richard Petersens
|
||||
M. García-Valls Plads, Kongens Lyngby, Denmark
|
||||
Departamento de Ingeniería Telemática, Universidad Carlos III de Madrid, Z. Shao
|
||||
Leganés, Madrid, Spain Dept. of Computing, The Hong Kong Polytechnic University, Hong Kong
|
||||
C. Gill M. Staron
|
||||
Department of Computer Science and Engineering, Washington University, USA Computer Science and Engineering, University of Gothenburg,
|
||||
A. Gokhale Gothenburg, Sweden
|
||||
Dept. of Electrical Engineering and Computer Science, Vanderbilt University, F. Tramarin
|
||||
Nashville, Tennessee, USA Dip. Gestione e Tecnica dei Sistemi Industriali (DTG), Università degli Studi di
|
||||
N. Guan Padova, Vicenza, Italy
|
||||
Dept. of Computing, The Hong Kong Polytechnic University, Hong Kong M.A. Vega-Rodriguez
|
||||
J. Hu ARCO Research Group, Dept. Technologies of Computers & Communications,
|
||||
Department of Electrical and Computer Engineering, University of Pittsburgh, USA Universidad de Extremadura, Escuela Politecnica. Campus Universitario,
|
||||
Y. Jiang Cáceres, Spain
|
||||
School of Software, Tsinghua University, China S. Wan
|
||||
H. Kapoor School of Information and Safety Engineering, Zhongnan University of
|
||||
Department of Computer Science and Engineering, Indian Institute of Technology Economics and Law, China
|
||||
Guwahati, India H. Wu
|
||||
A. Kritikakou Center for Applied Mathematics, Tianjin University, China
|
||||
University of Rennes, Inria, Irisa and CNRS, France G. Xie
|
||||
F. Li College of Computer Science and Electronic Engineering, Hunan University,
|
||||
School of Computer Science and Engineering, University of Electronics Science and Changsha, China
|
||||
Technology of China, China W. Xu
|
||||
S. Li Zhejiang University College of Electrical Engineering, Hangzhou, China
|
||||
College of Computer Science, Zhejiang University Hangzhou, China H. Zeng
|
||||
G. Lima Virginia Tech, Blacksburg, Virginia, USA
|
||||
Instituto de Matematica, Departamento de Ciencia da Computacao, Y. Zhang
|
||||
Federal University of Bahia, Salvador, Bahia, Brazil Department of Computer Science, University of Pittsburgh,
|
||||
M. Lin Pittsburgh, Pennsylvania, USA
|
||||
Department of Computer Science, St. Francis Xavier University, Canada Q. Zhao
|
||||
G. Lipari Nanjing University of Science and Technology, Nanjing, China
|
||||
Ecole Normale Superieure (ENS) de Cachan, Cachan, France N. Zheng
|
||||
D. Liu Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou, China
|
||||
College of Computer Science and Technology, Chongqing University, Chongqing, J. Zhou
|
||||
China Department of Computer Science and Technology, Nanjing University of Science
|
||||
W. Liu and Technology, China
|
||||
School of Computer Science and Engineering, Nanyang Technological University, D. Zhu
|
||||
Singapore Dept. of Computer Science, University of Texas at San Antonio, San Antonio,
|
||||
L. Lo Bello Texas, USA
|
||||
Dipart. di Ingegneria Elettrica Elettronica e Informatica (DIEEI),
|
||||
Università degli Studi di Catania, Catania, Italy
|
||||
|
||||
@@ -0,0 +1,999 @@
|
||||
Computer Standards & Interfaces 97 (2026) 104112
|
||||
|
||||
|
||||
Contents lists available at ScienceDirect
|
||||
|
||||
|
||||
Computer Standards & Interfaces
|
||||
journal homepage: www.elsevier.com/locate/csi
|
||||
|
||||
|
||||
|
||||
|
||||
Efficient and secure multi-user 𝑘NN queries with dynamic POIs updating
|
||||
Yining Jia a,b,c , Yali Liu a,b,c ,∗, Congai Zeng a,b,c , Xujie Ding a,b,c , Jianting Ning d,e
|
||||
a School of Artificial Intelligence and Computer Science, Jiangsu Normal University, Xuzhou, Jiangsu Province, 221116, China
|
||||
b State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu Province, 210023, China
|
||||
c Guangxi Key Laboratory of Cryptography and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi Province, 541004, China
|
||||
d School of Cyber Science and Engineering, Wuhan University, Wuhan, Hubei Province, 430072, China
|
||||
e Faculty of Data Science, City University of Macau, 999078, Macao Special Administrative Region of China
|
||||
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO

Keywords: Cloud computing; Security; kNN queries; Dynamic POIs updating

ABSTRACT

The 𝑘-nearest neighbors (𝑘NN) query is a key operation in spatial and multimedia databases, which is widely applied in fields such as electronic healthcare and Location-Based Services (LBS). With the rapid development of cloud computing, uploading the private data of the Data Owner (DO) to Cloud Servers (CS) has become a trend. However, existing 𝑘NN query schemes are not designed for multi-user environments, cannot update the points of interest (POIs) stored in CS in a timely manner, and suffer from low query efficiency. Therefore, this paper proposes efficient and secure multi-user 𝑘NN queries with dynamic POIs updating, named DESM𝑘NN, which achieves secure multi-user 𝑘NN queries. To improve query efficiency, DESM𝑘NN adopts a two-stage search framework, which consists of an initial filtering stage based on hierarchical clustering to effectively constrain the search range, followed by a more efficient precise search stage. Based on this framework, DESM𝑘NN designs a set of security protocols for efficient query processing and enables dynamic POIs updates. Meanwhile, DESM𝑘NN not only utilizes the Distributed Two Trapdoors Public-Key Cryptosystem (DT-PKC) to enable multi-user queries but also ensures data privacy, query privacy, result privacy and access pattern privacy. Moreover, DESM𝑘NN can verify the correctness and completeness of query results. Finally, security analysis proves that DESM𝑘NN meets the formal security definition of multiparty computation, and experimental evaluation shows that DESM𝑘NN improves query efficiency by up to 45.5% compared with an existing 𝑘NN query scheme.
|
||||
|
||||
|
||||
|
||||
1. Introduction

LBS [1–3] are increasingly integrated into real-world applications, such as ride-hailing platforms (e.g., Uber, DiDi), navigation systems (e.g., Google Maps, Baidu Maps), and online food delivery services. These services rely heavily on POIs databases to provide personalized and efficient responses to the queries of a query user (QU). Among various query types, the 𝑘NN query [4,5] is one of the most fundamental; it aims to find the $k$ nearest POIs to a given query point. With the rapid development of cloud computing [6,7], DOs increasingly outsource their POIs databases to CS, which provides scalable storage and massive computing resources. Well-known commercial platforms, such as Amazon Web Services and Google Cloud Platform, already provide such services to support efficient 𝑘NN queries in LBS. Although outsourcing databases to CS improves data accessibility and flexibility, it makes data more susceptible to unauthorized access. In practice, POIs often contain sensitive or private information. For instance, POIs databases may include the locations of hospitals, government facilities, or user-related activity areas in intelligent transportation and LBS systems. Once such information is exposed, it can lead to privacy leakage, commercial losses, or even public security risks [4]. Therefore, to protect POIs from malicious access or theft by CS and unauthorized users, DO needs to encrypt them before outsourcing to CS. In addition, security needs to be considered in query processing to maintain efficiency and protect the confidentiality of POIs databases.

Although 𝑘NN queries have been widely studied in recent years, several limitations still hinder their applicability in practice. First, most existing schemes [8,9] for 𝑘NN queries are based on static spatial data [10], where the database remains unchanged within a certain time interval. Consistent with this common setting, DESM𝑘NN also assumes that POIs are static during query processing to enable fair performance comparison. However, in practice, POIs may change over time, and their insertion or deletion frequency varies across different areas because these updates are driven by real-world change. In rapidly developing areas where new facilities emerge or existing ones close frequently, POI updates occur more frequently, whereas in more stable regions, such updates tend to be infrequent. This dynamic updating of
|
||||
|
||||
|
||||
∗ Corresponding author at: School of Artificial Intelligence and Computer Science, Jiangsu Normal University, Xuzhou, Jiangsu Province, 221116, China.
|
||||
E-mail address: liuyali@jsnu.edu.cn (Y. Liu).
|
||||
|
||||
https://doi.org/10.1016/j.csi.2025.104112
|
||||
Received 12 June 2025; Received in revised form 18 November 2025; Accepted 8 December 2025
|
||||
Available online 11 December 2025
|
||||
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
|
||||
Y. Jia et al. Computer Standards & Interfaces 97 (2026) 104112
|
||||
|
||||
|
||||
POIs reflects the continuous changes in the physical environment. As shown in Fig. 1, $U_0$ searches for the two nearest neighbors ($k = 2$) in a POIs database $D = \{p_0, \ldots, p_7\}$. The original 2NN result for the query $Q$ was $\{p_0, p_1\}$. When a new and closer point $p_8$ is inserted, the correct 2NN result becomes $\{p_1, p_8\}$. This example shows that any update to the POI database, such as the insertion, modification, or deletion of POIs, may change the query results. Therefore, dynamic updates must be supported in outsourced POI databases. Second, existing schemes mostly use Asymmetric-Scalar-Product-Preserving Encryption (ASPE) [11,12] or pure homomorphic encryption algorithms to encrypt outsourced data. Unfortunately, ASPE has been demonstrated to be insecure under known-plaintext attacks [13], and homomorphic operations incur a significant computational cost. These limitations raise the challenge of designing an efficient and secure query mechanism. Finally, most solutions [14,15] assume a single-user setting, where all QUs share the same secret key so that encrypted data remain computable across multiple users. In practice, the single-user assumption has obvious flaws. Once the shared key of any QU is leaked, the entire encrypted database can be completely decrypted, and the query content may also be intercepted by the adversary. As illustrated in Fig. 1, in such a single-user setting, $U_1$ and $U_2$ can capture the query content and result of $U_0$ and decrypt them using the same secret key as $U_0$. This highlights the need for secure multi-user queries.

Fig. 1. Sample of the 𝑘NN query ($k = 2$).

To resolve the aforementioned challenges, this paper proposes DESM𝑘NN. The contributions of DESM𝑘NN are as follows:

(1) Dynamic POIs Updating: DESM𝑘NN designs secure insertion and deletion protocols, which avoid the problem of incorrect and incomplete query results.
(2) Efficient Query: DESM𝑘NN proposes an efficient two-stage search framework, which improves the query performance.
(3) Multi-User Query: DESM𝑘NN designs a series of secure protocols based on DT-PKC, which achieve secure multi-user 𝑘NN queries.
(4) Security & Performance: Security analysis shows that the proposed DESM𝑘NN is secure. Additionally, experimental evaluation shows that DESM𝑘NN improves query efficiency by up to 45.5% compared with an existing 𝑘NN query scheme on two real datasets (California Road Network and Points of Interest, San Francisco Road Network^1).

1 https://users.cs.utah.edu/~lifeifei/SpatialDataset.htm.

The rest of this paper is structured as follows. Section 2 presents related work. Section 3 describes preliminaries. The architecture and security model of DESM𝑘NN are defined in Section 4. In Section 5, the system construction is introduced. Section 6 presents the specific query procedure for DESM𝑘NN. Next, Section 7 analyzes computational complexity, communication complexity, and security. Section 8 provides an experimental evaluation of DESM𝑘NN. Section 9 concludes this paper.

2. Related work

Secure Key-Sharing Query: Wong et al. [11] introduced a 𝑘NN query scheme for encrypted data based on ASPE. However, ASPE relied on a secret matrix to transform data points and query points, which required the secret key to be shared among all QUs and DO. Additionally, ASPE has been proven insecure against known-plaintext attacks [13]. To enhance query security, Elmehdwi et al. [15] developed a set of two-party computation protocols based on the Paillier cryptosystem. Although scheme [15] preserved the privacy of query results, QUs hold DO's private key, and the query efficiency remains low. Moreover, scheme [16] employed Delaunay triangulation and order-preserving encryption [18] to accurately solve the secure 𝑘NN problem. Nevertheless, the encryption schemes in [16] are symmetric, which also required DO and QUs to share the key. Cui et al. [8] proposed an efficient, secure, and verifiable 𝑘NN query scheme, which employed a secure index structure to ensure data security and result integrity, along with a set of novel protocols and verification strategies for various index operations. However, the search complexity of scheme [8] was linear in the database size, which limited its scalability. To address the efficiency issues in [8], Liu et al. [14] introduced a two-stage search framework for secure and verifiable 𝑘NN queries, which integrated Edge Servers (ES) into the classic Twin-Cloud model by leveraging adaptive encryption strategies and secure data partitioning to optimize query performance. However, neither scheme [8] nor scheme [14] resolved the key-sharing issue.

Secure Multi-User Query: To support multi-user 𝑘NN queries, researchers first focused on multi-key queries. Cheng et al. [17] implemented 𝑘NN queries with multi-key support, where DO and QUs had their own keys, and each QU's key was not shared with others. However, scheme [17] incurred high computational cost and lacked result verification. Subsequently, Liu et al. proposed the DT-PKC [19], which also allowed different QUs to use different keys during queries. Building on the DT-PKC, Cheng et al. [20] and Nayak et al. [21] explored range queries and keyword queries, respectively. Nevertheless, scheme [20] and scheme [21] still suffered from high computational cost and the inability to verify results. Cui et al. [9] introduced a method for secure and verifiable 𝑘NN queries by utilizing DT-PKC, which encrypted grid and bucket divisions within the Voronoi diagram to maintain data security, while also introducing a verification strategy to ensure the correctness and completeness of the query results. However, scheme [9] relied heavily on homomorphic encryption and data packing techniques, which led to high computational cost and search complexity. Moreover, scheme [9] fails to address the issue of dynamic updates for POIs.

In summary, the limitations of the existing 𝑘NN query schemes are as follows: (1) The single-user query schemes have a risk of key leakage. (2) The multi-user query schemes have low efficiency. (3) Most existing query schemes are unable to achieve dynamic updates of POIs. For ease of comparison, we summarize the above works in Table 1.

3. Preliminaries

3.1. Voronoi diagram

The Voronoi diagram [22] partitions the plane according to a set of points. Each Voronoi Cell (VC) corresponds to a point and contains all locations that are closer to this point than to any other. Two points are Voronoi neighbors if their cells share an edge, and the neighbor set of a point $p$ is denoted as $VN(p)$.
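The neighbor relation defined above can be made concrete with a small plaintext sketch. The code below is only an illustration and is not part of DESM𝑘NN: it assumes a handful of unencrypted 2D points, finds Voronoi neighbors by brute force through the dual Delaunay triangulation (two points are treated as neighbors when they share a triangle whose circumcircle contains no other point), and then checks that each further nearest neighbor of a query point is a Voronoi neighbor of an earlier one. The function names `voronoi_neighbors`, `knn`, and `satisfies_neighbor_chain` are hypothetical.

```python
from itertools import combinations


def circumcircle(a, b, c):
    """Return (center, squared radius) of the circle through a, b, c, or None if collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy), (ux - ax) ** 2 + (uy - ay) ** 2


def sq_dist(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def voronoi_neighbors(points):
    """Map each point index to the indices of its Voronoi neighbors (O(n^4) brute force)."""
    vn = {i: set() for i in range(len(points))}
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        center, r2 = circle
        others = (m for m in range(len(points)) if m not in (i, j, k))
        # Empty circumcircle: (i, j, k) is a Delaunay triangle, so its edges
        # connect Voronoi neighbors.
        if all(sq_dist(center, points[m]) >= r2 - 1e-9 for m in others):
            for u, v in ((i, j), (i, k), (j, k)):
                vn[u].add(v)
                vn[v].add(u)
    return vn


def knn(points, q, k):
    """Indices of the k points nearest to q, closest first."""
    return sorted(range(len(points)), key=lambda i: sq_dist(points[i], q))[:k]


def satisfies_neighbor_chain(points, vn, q, k):
    """Check that the i-th nearest neighbor lies in VN of some earlier nearest neighbor."""
    order = knn(points, q, k)
    for pos in range(1, k):
        earlier = set().union(*(vn[i] for i in order[:pos]))
        if order[pos] not in earlier:
            return False
    return True


if __name__ == "__main__":
    pois = [(0, 0), (4, 1), (1, 5), (6, 6), (3, 3), (8, 2), (2, 8)]
    neighbors = voronoi_neighbors(pois)
    print(neighbors[4])
    print(satisfies_neighbor_chain(pois, neighbors, (3.2, 2.5), 4))
```

The brute-force search is chosen for readability only; a real deployment would use a proper Delaunay or Voronoi construction.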
|
||||
|
||||
|
||||
|
||||
|
||||
Table 1
|
||||
Summary of existing 𝑘NN query works.
|
||||
Method Data privacy Query privacy Result privacy Access patterns Verifiable Multi-user POIs updating
|
||||
√ √
|
||||
Wong [11] × × × × ×
|
||||
√ √ √ √
|
||||
Elmehdwi [15] × × ×
|
||||
√ √ √
|
||||
Choi [16] × × × ×
|
||||
√ √ √ √
|
||||
Cheng [17] × × ×
|
||||
√ √ √ √ √
|
||||
Cui [8] × ×
|
||||
√ √ √ √
|
||||
Liu [14] × × ×
|
||||
√ √ √ √ √ √
|
||||
Cui [9] ×
|
||||
Notations: ‘√’ represents that the approach satisfies the condition; ‘×’ represents that it fails to satisfy the condition.
|
||||
|
||||
|
||||
Fig. 2. An example of Voronoi diagram.

For example, given a dataset $D$ that contains 16 POIs as shown in Fig. 2-(b), the Voronoi diagram is shown in Fig. 2-(a). Since $VC(p_8)$ shares a common edge with $VC(p_i)$ for $i \in \{3, 4, 9, 11, 12, 13\}$, the Voronoi neighbors of $p_8$ are $VN(p_8) = \{p_3, p_4, p_9, p_{11}, p_{12}, p_{13}\}$. Therefore, the search result of a 3NN query is $Result = \{p_9, p_{11}, p_{13}\}$.

The Voronoi diagram has two useful properties for 𝑘NN verification:

(1) Given a query point $q$, the nearest neighbor of $q$ is data point $p$ if $q \in VC(p)$.
(2) If data points $p_1, \ldots, p_k$ are the $k$ ($k > 1$) nearest neighbors of the query point $q$, then $p_i$ belongs to $VN(p_1) \cup \cdots \cup VN(p_{i-1})$, for $i = 2, \ldots, k$.

3.2. R-tree index based on hierarchical clustering

The R-tree index [23] organizes spatial objects into nested rectangles, known as Minimum Bounding Rectangles, to enable efficient querying of spatial data, such as range queries [24] and nearest neighbor searches. However, the efficiency of the R-tree strongly depends on how the data are grouped during construction. To address this, DESM𝑘NN introduces hierarchical clustering, which improves both the organization of spatial objects and the performance of query processing. Fig. 3 presents an R-tree with a fanout of $f = 2$, which is built from the POIs in $Rect_1$. In this construction, the data are first grouped by applying hierarchical clustering based on the Euclidean distance. This process is performed in two rounds, and the resulting clusters determine the partitioning of the dataset, which is then used to build the tree structure.

Fig. 3. R-tree structure based on hierarchical clustering.

3.3. Distributed two trapdoors public-key cryptosystem

The DT-PKC [19] is a variant of the traditional double trapdoor decryption cryptosystem. Given a public key $pk$, a private key $sk$, and a strong private key $SK$, the cryptosystem supports several algorithms that enable encryption, decryption, and collaborative key operations.

First, encryption is carried out by the algorithm $Enc$. Given a message $p \in \mathbb{Z}_N$ and the public key $pk$, the algorithm outputs the ciphertext $E_{pk}(p)$. The system then allows two types of decryption:

(1) With the private key ($sk$), the algorithm $WDec$ takes $E_{pk}(p)$ as input and recovers $p$.
(2) With the strong private key ($SK$), the algorithm $SDec$ also decrypts $E_{pk}(p)$ to obtain $p$.

A distinctive feature of DT-PKC lies in the management of the strong private key. The algorithm $SkeyS$ enables the strong private key $SK$ to be split into two partial strong private keys, $SK_1$ and $SK_2$. This splitting supports a collaborative decryption mechanism in two steps:

(1) In step 1, $PSDec1$ takes $E_{pk}(p)$ and $SK_1$ as input, which results in a partially decrypted ciphertext $CT_1$.
(2) In step 2, $PSDec2$ completes the process by using $CT_1$ and $SK_2$, which ultimately recovers $p$.

3.4. Advanced comparable inner product encoding

The CIPE$_s$ scheme [25] allows edge servers to determine whether a value lies within a query range based on encrypted data. Compared to the original CIPE scheme, CIPE$_s$ enhances security by extending query vectors into random query matrices, which makes it more resilient to chosen plaintext attacks.

CIPE$_s$ supports several key algorithms for encryption and range query evaluation. First, the key generation algorithm $GenKey$ takes a security parameter $\kappa \in \mathbb{N}$ as input and outputs a secret key $sk_c$. The data encryption algorithm $EncI$ encrypts a plaintext $x$ into ciphertext $E_c(x)$ with $sk_c$. To perform queries, the query encryption algorithm $EncQ$ transforms a query range $Q = [b_l, b_u]$ into an encrypted range $E_c(Q)$. Finally, the calculation algorithm $Cal$ compares the encrypted value $E_c(x)$ with the encrypted query range $E_c(Q)$ and outputs a comparison result: $-1$ if $x < b_l$, $1$ if $x > b_u$, and $0$ if $x \in [b_l, b_u]$.

4. System architecture and security model

This section introduces the system architecture and security model of DESM𝑘NN. A summary of notations is given in Table 2.
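The two-step collaborative decryption described in Section 3.3 can be illustrated with a toy additively homomorphic scheme. The sketch below is not DT-PKC: it substitutes textbook Paillier with tiny hard-coded primes, and the names `keygen`, `enc`, `split_strong_key`, `ps_dec1`, and `ps_dec2` are hypothetical stand-ins for $SkeyS$, $PSDec1$, and $PSDec2$. It shows only the general idea that a strong key can be split into two shares so that neither share alone recovers the plaintext, and that multiplying by a ciphertext raised to the power $N-1$ subtracts the underlying plaintext.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key: modulus n and the strong key lam = lcm(p - 1, q - 1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return n, lam


def enc(n, m, rng):
    """Encrypt m in Z_n as (1 + n)^m * r^n mod n^2 with a fresh random r."""
    n2 = n * n
    while True:
        r = rng.randrange(2, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2


def split_strong_key(n, lam, rng):
    """Split an exponent s (s = 0 mod lam, s = 1 mod n) into two additive shares."""
    s = lam * pow(lam, -1, n)
    share1 = rng.randrange(1, s)
    return share1, s - share1


def ps_dec1(n, share1, c):
    """First partial decryption: raise the ciphertext to the first share."""
    return pow(c, share1, n * n)


def ps_dec2(n, share2, c, ct1):
    """Second partial decryption: combine with the second share and read off m."""
    n2 = n * n
    full = (ct1 * pow(c, share2, n2)) % n2  # equals 1 + m * n mod n^2
    return (full - 1) // n


if __name__ == "__main__":
    rng = random.Random(2025)
    n, lam = keygen()
    sk1, sk2 = split_strong_key(n, lam, rng)

    c = enc(n, 1234, rng)
    ct1 = ps_dec1(n, sk1, c)           # done by the first server
    print(ps_dec2(n, sk2, c, ct1))     # done by the second server

    # Homomorphic subtraction: E(x1) * E(x2)^(n - 1) encrypts x1 - x2 mod n.
    diff = (enc(n, 7, rng) * pow(enc(n, 3, rng), n - 1, n * n)) % (n * n)
    print(ps_dec2(n, sk2, diff, ps_dec1(n, sk1, diff)))
```

The primes here are far too small for any real use; they are chosen so the arithmetic can be followed by hand.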
|
||||
|
||||
|
||||
|
||||
|
||||
Table 2
Summary of notations.
Notation                                Description
$D$                                     A spatial dataset that includes $n$ points $\{P_1, \ldots, P_n\}$
$VD$                                    Voronoi diagram built from $D$
$sk_c$                                  The secret key for CIPE$_s$ scheme
$sk_0, pk_0$                            The secret/public key for DO
$sk_u, pk_u$                            The secret/public key for users
$SK, SK_1, SK_2$                        Strong private key and partial ones
$PSDec1(SK_1, *)$                       The first step of partial decryption
$PSDec2(SK_2, *, *)$                    The second step of partial decryption
$Q, E_c(Q)$                             A query coverage and its encrypted range
$q, E_{pk_0}(q)$                        A query point and its encrypted coordinates
$P_i, E_{pk_0}(P_i)$                    A POI and its encrypted coordinates
$\widehat{Tree}_R, Tree_R$              The encrypted/clear R-tree index built from $D$
$\widehat{PD}, PD$                      The encrypted/clear preprocessed data built from $VD$
$\widehat{Rect}_Q, Rect_Q$              The encrypted/clear range query generated for $Q$
$IR$                                    The immediate result
$\widehat{Result}, Result$              The encrypted/clear result in the exact search phase
$H(*)$                                  A hash function
$VO$                                    The verification object

Fig. 4. System architecture.

4.1. System architecture

DESM𝑘NN employs a two-stage framework: an initial filtering stage on ESs and a precise search stage on dual cloud servers. To protect privacy, the system adopts a dual-cloud architecture [8,9,14,26], where collusion-resilient protocols ensure both efficiency and security beyond traditional single-cloud settings. As shown in Fig. 4, the architecture involves several entities with distinct roles.

In the setup phase (Step 1), the Certified Authority (CA) generates cryptographic keys: $(pk_0, sk_0)$ for the DO, $(pk_u^i, sk_u^i)$ for each QU, and a split strong key $(SK_1, SK_2)$, whose two parts are assigned to the two cloud servers (CSS and CCS), respectively. All public keys are shared among the entities. The DO then prepares the dataset. For sensitive data (Step 2), it preprocesses $VD$ into $PD$, encrypts $PD$ with DT-PKC to obtain $\widehat{PD}$, and uploads it to CSS. For less sensitive data (Step 3), it builds an R-tree index $Tree_R$, encrypts it with CIPE$_s$, and distributes the encrypted index $\widehat{Tree}_R$ to ESs for efficient query filtering.

When a QU issues a query (Step 4), it constructs $SQ = (\widehat{Rect}_q, E_{pk_0}(q), k)$ and sends it to a nearby ES. The ES evaluates $\widehat{Rect}_q$ over $\widehat{Tree}_R$, filters candidate results $IR$, and forwards them together with $(E_{pk_0}(q), k)$ to CSS (Step 5). Next, CSS and CCS jointly execute secure protocols (Step 6), and return the final result set $Result$ along with a verification object $VO$ to the QU (Step 7). The QU then verifies the correctness of the result before finalizing the query.

4.2. Security model

DESM𝑘NN is designed to address three security threats. First, CS cannot be fully trusted and may tamper with query results. Second, CS may act as honest-but-curious adversaries that attempt to infer sensitive information from the encrypted data. Third, QUs themselves may be curious and try to learn the query information of others.

To counter the risk of result tampering, DESM𝑘NN incorporates a verification mechanism that ensures both correctness and completeness [27]. Correctness requires that every returned point $p \in Result$ remains unmodified and originates from the authentic database, while completeness guarantees that every true 𝑘NN result is returned, with none omitted.

The other two threats are addressed by designing a secure index and a set of novel secure protocols that jointly preserve multiple dimensions of privacy [4,28]. Specifically, data privacy ensures that the database $D$ remains hidden from the CS; query privacy requires that the content of a QU's query $SQ$ is concealed from both the CS and other QUs; result privacy guarantees that only the QU can access the returned $Result$; and access-pattern privacy prevents the CS from learning which database entries satisfy a given query.

Note that, during the system setup stage, CCS must be prevented from compromising or colluding with CSS. Furthermore, collusion between CS and QUs must be prevented throughout the query process.

5. DESM𝑘NN construction

This section first introduces an optimized two-stage search framework that supports efficient and secure multi-user 𝑘NN queries with dynamic POIs updating. Subsequently, several secure protocols are proposed to enable private 𝑘NN search operations on the two-stage search framework.

5.1. Two-stage search framework

DESM𝑘NN adopts a two-stage search framework, which consists of an initial filtering stage based on hierarchical clustering to constrain the search range, followed by a precise search stage to achieve efficient querying.

Initial Filtering Stage: DO first preprocesses the dataset by using hierarchical clustering to construct a suitable $Tree_R$. Each node in the tree is encrypted by using the CIPE$_s$.EncI algorithm to ensure security. The $\widehat{Tree}_R$ is then uploaded to ESs. When a QU at position $(x_q, y_q)$ initiates a query, it defines a scope $L$ and constructs a rectangle $Rect_q$ centered at $(x_q, y_q)$ with edge length $L$. Each dimension of $Rect_q$ is encrypted by using the CIPE$_s$.EncQ algorithm and sent to the nearby ES. The ES evaluates $\widehat{Rect}_q$ over $\widehat{Tree}_R$ to generate $IR$, which efficiently narrows down the candidate objects.

Precise Search Stage: Once receiving $(E_{pk_0}(q), k)$ and $IR$ from ES, the dual-cloud servers collaboratively execute secure protocols over the preprocessed dataset to obtain the exact $k$ nearest neighbors ($Result$). The servers also generate a verification object ($VO$) and send it with the $Result$ back to QU for checking. This stage ensures both accuracy and security of the 𝑘NN search.

5.2. Data pre-processing

To support DESM𝑘NN, DO preprocesses the dataset before outsourcing, which aims to protect sensitive information while retaining the structural relationships required for queries. First, DO constructs a
|
||||
|
||||
|
||||
|
||||
|
||||
Voronoi diagram $VD$ from the dataset $D$, and encrypts the coordinates of each POI and query point $q$ using DT-PKC. For every POI $p_i \in VD$, a unique label $\ell_i = H(x_i | y_i)$ is generated through the SHA-256 hash function, which serves as a compact identifier. Subsequently, DO obtains the neighborhood $VN(p_i)$ and its corresponding label set $\ell_{VN(p_i)}$, then employs DT-PKC to encrypt the packaged $VN(p_i)$ after applying data packaging technology [29]. This technique helps handle multiple values together, which makes encryption more straightforward. To guarantee integrity, a signature $SIG_{p_i} = H(H(p_i) | H(VN(p_i)))$ is created, where $H(VN(p_i))$ is obtained by hashing all neighbors together as

$H(VN(p_i)) = H(H(p_{VN_1}) | H(p_{VN_2}) | \ldots | H(p_{VN_{max}}))$.

Intuitively, this signature ensures any tampering with $p_i$ or its neighbors can be detected. Since homomorphic encryption requires uniform input length, DO also performs incremental obfuscation: if a POI has fewer neighbors than the maximum in $VD$, dummy neighbors are added to conceal the actual degree. Afterward, each POI is represented by a sextuple

$(E_{pk_0}(id), E_{pk_0}(p_i), E_{pk_0}(VN(p_i)), \ell_i, \ell_{VN(p_i)}, SIG_{p_i})$,

which combines encrypted attributes, hashed labels, and a verifiable signature.

To further protect access pattern privacy, DO divides the sextuple table into buckets [8,9] of size $w$, which ensures queries operate over fixed-size groups instead of revealing individual record access. Since the final bucket may not be completely filled, DO pads it with randomly generated dummy records, which prevents inference attacks [30,31] where an adversary could deduce whether two queries target the same bucket based on its record count. At this point, DO completes preprocessing and securely outsources the bucketized sextuples to CSS.

5.3. Secure Square Distance Computation (SSDC)

The goal of SSDC is to compute the secure squared distance without revealing any valid coordinate information to CSS and CCS. The process is shown in Algorithm 1.

Initially, CSS chooses 4 random numbers $r_1, r_2, r_4, r_5 \in \mathbb{Z}_N$, and chooses the functionality $F \in \{0, 1\}$ (line 1–2). If $F = 1$, CSS calculates the encrypted coordinate differences $E_{pk_0}(A)$, $E_{pk_0}(B)$ (line 3–5). If $F = 0$, the procedure is the same except that the positions of $x_1$ and $x_2$, as well as $y_1$ and $y_2$, are swapped when computing the differences (line 6–7). To mask these values and avoid direct leakage, CSS applies randomization with $r_1$ and $r_2$ (line 8). Subsequently, CSS partially decrypts the masked values $a''$, $b''$ by using the PSDec1 function to get $a'$, $b'$ (line 9). Eventually, CSS sends $a''$, $b''$, $a'$, $b'$ and $E_{pk_0}(A)$, $E_{pk_0}(B)$ to CCS (line 10).

Upon receiving a series of encrypted values from CSS, CCS chooses a random number $r_3 \in \mathbb{Z}_N$ and decrypts the encrypted values to obtain $a$ and $b$ (line 11–12). To conceal the sign information of the differences, CCS applies a randomized comparison procedure (line 13–18). Specifically, depending on the outcomes of $a$ versus 0 and related conditions, CCS produces three possible cases and outputs $E_1$ accordingly; this design prevents CSS from learning whether $x_1 - x_2$ or $y_1 - y_2$ is positive or negative. The same process is repeated for $b$ to obtain $E_2$ (line 19). Finally, CCS returns $E_1$, $E_2$ to CSS (line 20).

Upon receiving a series of encrypted values from CCS, CSS further randomizes $E_1$ and $E_2$ with $r_4$ and $r_5$, then partially decrypts them to produce $(c'', d'')$ and $(c', d')$, and sends these values to CCS (line 21–24). CCS completes the decryption (line 25), squares the plaintexts to derive $s = c^2$ and $z = d^2$ (line 26–27), and sends back $E_{pk_0}(s)$, $E_{pk_0}(z)$ (line 28). Finally, CSS combines these ciphertexts through homomorphic operations to obtain $\mathcal{D}_1$ and $\mathcal{D}_2$, and computes the secure squared distance as $Distance = \mathcal{D}_1 * \mathcal{D}_2$.

Algorithm 1 Secure Squared Distance Computation
Require: CSS has $E_{pk_0}(x_1)$, $E_{pk_0}(y_1)$, $E_{pk_0}(x_2)$, $E_{pk_0}(y_2)$;
         CSS has $SK_1$, $pk_0$; CCS has $SK_2$, $pk_0$;
Ensure: $E_{pk_0}(|x_1 - x_2|^2 + |y_1 - y_2|^2)$;
// Calculation in CSS:
1: Choose 4 random numbers $r_1, r_2, r_4, r_5 \in \mathbb{Z}_N$;
2: Randomly choose the functionality $F \in \{0, 1\}$;
3: if $F = 1$ then
4:     $E_{pk_0}(A) \leftarrow E_{pk_0}(x_1) * E_{pk_0}(x_2)^{N-1}$;
5:     $E_{pk_0}(B) \leftarrow E_{pk_0}(y_1) * E_{pk_0}(y_2)^{N-1}$;
6: else if $F = 0$ then
7:     Swap $x_1$ with $x_2$ and $y_1$ with $y_2$;
8: $a'' \leftarrow E_{pk_0}(A)^{r_1}$, $b'' \leftarrow E_{pk_0}(B)^{r_2}$;
9: $a' \leftarrow PSDec1(SK_1, a'')$, $b' \leftarrow PSDec1(SK_1, b'')$;
10: Send $a''$, $b''$, $a'$, $b'$ and $E_{pk_0}(A)$, $E_{pk_0}(B)$ to CCS;
// Calculation in CCS:
11: Choose a random number $r_3 \in \mathbb{Z}_N$;
12: $a \leftarrow PSDec2(SK_2, a'', a')$, $b \leftarrow PSDec2(SK_2, b'', b')$;
13: if $a > 0$ then
14:     $E_1 \leftarrow E_{pk_0}(A)$;
15: else if $E_{pk_0}(r_3) * E_{pk_0}(A)^{N-1} = E_{pk_0}(r_3)$ then
16:     $E_1 \leftarrow E_{pk_0}(r_3)^0$;
17: else
18:     $E_1 \leftarrow E_{pk_0}(A)^{N-1}$;
19: Apply the same steps to $b$ to obtain $E_2$;
20: Send $E_1$, $E_2$ to CSS;
// Calculation in CSS:
21: $c'' \leftarrow E_1 * E_{pk_0}(r_4)$;
22: $c' \leftarrow PSDec1(SK_1, c'')$;
23: Apply the same steps to $E_2$, $r_5$ to obtain $d''$, $d'$;
24: Send $c''$, $c'$, $d''$, $d'$ to CCS;
// Calculation in CCS:
25: $c \leftarrow PSDec2(SK_2, c'', c')$;
26: $s \leftarrow c * c$;
27: Apply the same steps to $d''$, $d'$ to obtain $d$, $z$;
28: Send $E_{pk_0}(s)$, $E_{pk_0}(z)$ to CSS;
// Calculation in CSS:
29: $\mathcal{D}_1 \leftarrow E_{pk_0}(s) * E_1^{N-r_4} * E_1^{N-r_4} * E_{pk_0}(r_4 * r_4)^{N-1}$;
30: $\mathcal{D}_2 \leftarrow E_{pk_0}(z) * E_2^{N-r_5} * E_2^{N-r_5} * E_{pk_0}(r_5 * r_5)^{N-1}$;
31: $Distance \leftarrow E_{pk_0}(|x_1 - x_2|^2 + |y_1 - y_2|^2) \leftarrow \mathcal{D}_1 * \mathcal{D}_2$;

5.4. Secure Minimum Computation (SMC)

The goal of SMC is to compare two secure squared distances obtained by SSDC, determine the smaller one, and also obtain the corresponding $id_{min}$ and $\ell_{min}$. The process is shown in Algorithm 2.

To start with, CSS generates 7 random numbers and randomly selects a functionality $F$, in a manner similar to SSDC (line 1–2). If $F = 1$, CSS masks the differences between the distances, identifiers, and location labels by incorporating random numbers either as multiplicative factors or as exponents (line 3–10). For example, the key step

$E_{pk_0}(\alpha) \leftarrow (E_{pk_0}(d_1) * E_{pk_0}(d_2)^{N-1})^{r_\alpha}$

ensures that CCS cannot infer the exact magnitudes of $d_1$ and $d_2$ with probability better than 1/2, which preserves the magnitude relationship with semantic security. If $F = 0$, the roles of $d_1$ and $d_2$ are swapped, and the same randomization procedure follows (line 11–12). After randomization, CSS partially decrypts one of the masked values to obtain $\alpha_1$ and sends it together with the corresponding encrypted terms to CCS (line 13–14).

Upon receiving these values, CCS decrypts $\alpha_1$ to obtain $\alpha_2$ (line 15). By checking whether the bit-length of $\alpha_2$ exceeds half modulus size, CCS
|
||||
|
||||
|
||||
|
||||
|
||||
decides whether $d_1$ or $d_2$ is smaller, and records this decision in a flag $w$ (line 16–19). Using $w$ and the remaining encrypted values from CSS, CCS computes three encrypted auxiliary terms that encode the correct selection of the minimum distance, identifier, and label (line 20–22). These results, along with $w$, are then sent back to CSS (line 23).

token is then partially decrypted using $SK_1$, producing an auxiliary value that, together with the token, is stored in a permuted list under a pseudo-random permutation to prevent linkability (line 7–9). After completing all comparisons, CSS sends the resulting table to CCS for further processing (line 10).
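The bit-length test used in SMC (Algorithm 2, lines 15–19) can be illustrated on plaintext values. The sketch below is a simplified assumption-laden model, not the protocol itself: it skips all encryption, uses an arbitrary 128-bit stand-in modulus `N`, and assumes that distances and the mask $r_\alpha$ are each below $2^{32}$ so that the masked difference stays below half the bit-length of `N` when it is non-negative. The names `masked_difference`, `ccs_flag`, and `secure_min_plain` are hypothetical.

```python
import random

# Hypothetical stand-in modulus with a 128-bit length; a real deployment would
# use the DT-PKC modulus N instead.
N = 2**127 + 45


def masked_difference(d1, d2, r_alpha, n=N):
    """Plaintext behind E(alpha): r_alpha * (d1 - d2) reduced modulo n."""
    return (r_alpha * (d1 - d2)) % n


def ccs_flag(alpha, n=N):
    """Flag w: 1 if alpha is longer than half the modulus, i.e. d1 - d2 was negative."""
    return 1 if alpha.bit_length() > n.bit_length() / 2 else 0


def secure_min_plain(d1, d2, rng, n=N):
    """Return min(d1, d2) using only the masked difference and the coin F."""
    f = rng.randrange(2)                      # functionality F chosen by CSS
    a, b = (d1, d2) if f == 1 else (d2, d1)   # F = 0 swaps the two roles
    r_alpha = rng.randrange(1, 1 << 32)       # random mask, kept small by assumption
    w = ccs_flag(masked_difference(a, b, r_alpha, n), n)
    # CCS sees only w; CSS knows F and can undo the swap.
    first_is_min = (w == f)
    return d1 if first_is_min else d2


if __name__ == "__main__":
    rng = random.Random(0)
    print(secure_min_plain(25, 16, rng))
    print(secure_min_plain(9, 100, rng))
```

Because the coin $F$ is known only to CSS, the flag $w$ alone tells CCS nothing about which of the two original distances was smaller, which is the point of the random swap.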
|
||||
On the CCS side, the server initializes an empty set and parses
|
||||
Algorithm 2 Secure Minimum Computation the received tokens (line 11–12). Each token is decrypted with 𝑆𝐾2 ,
|
||||
Require: CSS has 𝐸𝑝𝑘0 (𝑑1 ), 𝐸𝑝𝑘0 (𝑑2 ), 𝐸𝑝𝑘0 (𝑖𝑑1 ), 𝐸𝑝𝑘0 (𝑖𝑑2 ), and whenever a decryption reveals equality between an element of 𝑆 ̂1
|
||||
𝐸𝑝𝑘0 (1 ), 𝐸𝑝𝑘0 (2 ); and 𝑆̂2 , the corresponding index is added to the set (line 13–15). This
|
||||
CSS has 𝑆𝐾1 , 𝑝𝑘0 ; CCS has 𝑆𝐾2 , 𝑝𝑘0 ; set, containing the indices of overlapping elements, is then returned to
|
||||
Ensure: 𝐸𝑝𝑘0 (𝑑𝑚𝑖𝑛 ), 𝐸𝑝𝑘0 (𝑖𝑑𝑚𝑖𝑛 ), 𝐸𝑝𝑘0 (𝑚𝑖𝑛 ); CSS (line 16). Finally, CSS uses the inverse permutation to locate the
|
||||
// Calculation in CSS: original positions and removes the identified elements from 𝑆 ̂1 (line
|
||||
1: Choose 7 random numbers 𝑟𝛼 , 𝑟𝛽 , 𝑟𝛾 , 𝑟𝛿 , 𝑟𝜖 , 𝑟𝜁 , 𝑟𝜂 ∈ Z𝑁 ; 17–19). The remaining encrypted elements constitute the secure set
|
||||
2: Randomly choose the functionality 𝐹 ∈ {0, 1}; difference 𝑆̂′ , which represents all values in 𝑆1 but not in 𝑆2 (line 20).
|
||||
3: if 𝐹 = 1 then
|
||||
Algorithm 3 Secure Set Difference
|
||||
4: 𝐸𝑝𝑘0 (𝛼) ← (𝐸𝑝𝑘0 (𝑑1 ) ∗ 𝐸𝑝𝑘0 (𝑑2 )𝑁−1 )𝑟𝛼 ;
|
||||
5: 𝐸𝑝𝑘0 (𝛽) ← (𝐸𝑝𝑘0 (𝑑1 ) ∗ 𝐸𝑝𝑘0 (𝑑2 )𝑁−1 ∗ 𝐸𝑝𝑘0 (𝑟𝛽 )); Require: CSS has two sets of encrypted values
|
||||
̂1 = {𝐸𝑝𝑘 (𝑥1 ), ..., 𝐸𝑝𝑘 (𝑥𝑀 )};
|
||||
𝑆
|
||||
6: 𝐸𝑝𝑘0 (𝛾) ← (𝐸𝑝𝑘0 (𝑑2 ) ∗ 𝐸𝑝𝑘0 (𝑑1 )𝑁−1 ∗ 𝐸𝑝𝑘0 (𝑟𝛾 )); 0 0
|
||||
̂2 = {𝐸𝑝𝑘 (𝑦1 ), ..., 𝐸𝑝𝑘 (𝑦𝑇 )};
|
||||
𝑆
|
||||
7: 𝐸𝑝𝑘0 (𝛿) ← (𝐸𝑝𝑘0 (𝑖𝑑1 ) ∗ 𝐸𝑝𝑘0 (𝑖𝑑2 )𝑁−1 ∗ 𝐸𝑝𝑘0 (𝑟𝛿 )); 0 0
|
||||
CSS has 𝑆𝐾1 ; CCS has 𝑆𝐾2 ;
|
||||
8: 𝐸𝑝𝑘0 (𝜖) ← (𝐸𝑝𝑘0 (𝑖𝑑2 ) ∗ 𝐸𝑝𝑘0 (𝑖𝑑1 )𝑁−1 ∗ 𝐸𝑝𝑘0 (𝑟𝜖 )); ′
|
||||
Ensure: CSS obtains an encrypted difference set 𝑆̂ ;
|
||||
9: 𝐸𝑝𝑘0 (𝜁) ← (𝐸𝑝𝑘0 (1 ) ∗ 𝐸𝑝𝑘0 (2 )𝑁−1 ∗ 𝐸𝑝𝑘0 (𝑟𝜁 ));
|
||||
// Calculation in CSS:
|
||||
10: 𝐸𝑝𝑘0 (𝜂) ← (𝐸𝑝𝑘0 (2 ) ∗ 𝐸𝑝𝑘0 (1 )𝑁−1 ∗ 𝐸𝑝𝑘0 (𝑟𝜂 ));
|
||||
1: Initialize 𝑇 to an empty table;
|
||||
11: else if 𝐹 = 0 then ̂1 do
|
||||
2: for the 𝑖−th element 𝐸𝑝𝑘0 (𝑥𝑖 ) ∈ 𝑆
|
||||
12: Swaps the roles of 𝑑1 , 𝑖𝑑1 , 1 with 𝑑2 , 𝑖𝑑2 , 2 .
|
||||
3: Initialize 𝑡 to an empty list;
|
||||
13: 𝛼1 ← 𝑃 𝑆𝐷𝑒𝑐1(𝑆𝐾1 , 𝐸𝑝𝑘0 (𝛼)); ̂2 in random order do
|
||||
4: for all 𝐸𝑝𝑘0 (𝑦𝑗 ) ∈ 𝑆
|
||||
14: Send 𝛼1 , 𝐸𝑝𝑘0 (𝛼), 𝐸𝑝𝑘0 (𝛽), 𝐸𝑝𝑘0 (𝛾), 𝐸𝑝𝑘0 (𝛿), 𝐸𝑝𝑘0 (𝜖),
|
||||
5: Generate a random number 𝑟𝑖,𝑗 ;
|
||||
𝐸𝑝𝑘0 (𝜁), 𝐸𝑝𝑘0 (𝜂) to CSS;
|
||||
6: 𝑡𝑖,𝑗 [0] ← (𝐸𝑝𝑘0 (𝑥𝑖 ) ∗ 𝐸𝑝𝑘0 (𝑦𝑗 )𝑁−1 )𝑟𝑖,𝑗 ;
|
||||
// Calculation in CCS:
|
||||
7: 𝑡𝑖,𝑗 [1] ← 𝑃 𝑆𝐷𝑒𝑐1(𝑆𝐾1 , 𝑡𝑖,𝑗 [0]);
|
||||
15: 𝛼2 ← 𝑃 𝑆𝐷𝑒𝑐2(𝑆𝐾2 , 𝐸𝑝𝑘0 (𝛼), 𝛼1 );
|
||||
8: Append 𝑡𝑖,𝑗 to t;
|
||||
16: if 𝐿𝑒𝑛𝑔𝑡ℎ(𝛼2 ) > 𝐿𝑒𝑛𝑔𝑡ℎ(𝑁)∕2 then
|
||||
9: 𝑇 [𝜋(𝑖)] ← 𝑡;
|
||||
17: 𝑤 ← 1;
|
||||
10: Send 𝑇 to CCS;
|
||||
18: else
|
||||
// Calculation in CCS:
|
||||
19: 𝑤 ← 0;
|
||||
11: Initialize 𝑉 to an empty set;
|
||||
20: 𝐸𝑝𝑘0 (𝜃) ← (𝐸𝑝𝑘0 (𝛽)1−𝑤 ∗ 𝐸𝑝𝑘0 (𝛾)𝑤 )𝑁−1 ;
|
||||
12: for 𝑖 ∈ [𝑀] do
|
||||
21: 𝐸𝑝𝑘0 (𝜗) ← (𝐸𝑝𝑘0 (𝛿)1−𝑤 ∗ 𝐸𝑝𝑘0 (𝜖)𝑤 )𝑁−1 ; 13: Parse 𝑇 [𝑖] as (𝑡𝑖,1 , ..., 𝑡𝑖,𝑇 );
|
||||
22: 𝐸𝑝𝑘0 (𝜄) ← (𝐸𝑝𝑘0 (𝜁)1−𝑤 ∗ 𝐸𝑝𝑘0 (𝜂)𝑤 )𝑁−1 ; 14: if ∃𝑡𝑖,𝑗 ∈ 𝑇 [𝑖] ∩ 𝑃 𝑆𝐷𝑒𝑐2(𝑆𝐾2 , 𝑡𝑖,𝑗 [0], 𝑡𝑖,𝑗 [1]) then
|
||||
23: Send 𝑤, 𝐸𝑝𝑘0 (𝜃), 𝐸𝑝𝑘0 (𝜗), 𝐸𝑝𝑘0 (𝜄) to CSS; 15: Add 𝑖 into set 𝑉 ;
|
||||
// Calculation in CSS: 16: Send 𝑉 to CSS;
|
||||
24: if 𝑠 = 𝑤 then // Calculation in CSS:
|
||||
25: 𝐸𝑝𝑘0 (𝑑𝑚𝑖𝑛 ) = 𝐸𝑝𝑘0 (𝑑2 ) ∗ 𝐸𝑝𝑘0 (𝜃) ∗ 𝐸𝑝𝑘0 (𝑤)𝑟𝛾 ∗ 17: for each element 𝑖 in 𝑉 do
|
||||
(𝐸𝑝𝑘0 (1 − 𝑤))𝑟𝛽 ; 18: 𝑗 ← 𝜋 −1 (𝑖);
|
||||
26: 𝐸𝑝𝑘0 (𝑖𝑑𝑚𝑖𝑛 ) = 𝐸𝑝𝑘0 (𝑖𝑑2 ) ∗ 𝐸𝑝𝑘0 (𝜗) ∗ 𝐸𝑝𝑘0 (𝑤)𝑟𝜖 ∗ 19: Remove the 𝑗−th element 𝐸𝑝𝑘0 (𝑥𝑗 ) from 𝑆 ̂1 ;
|
||||
(𝐸𝑝𝑘0 (1 − 𝑤))𝑟𝛿 ; ̂′ ← 𝑆̂1 ;
|
||||
20: 𝑆
|
||||
27: 𝐸𝑝𝑘0 (𝑚𝑖𝑛 ) = 𝐸𝑝𝑘0 (2 ) ∗ 𝐸𝑝𝑘0 (𝜄) ∗ 𝐸𝑝𝑘0 (𝑤)𝑟𝜂 ∗
|
||||
(𝐸𝑝𝑘0 (1 − 𝑤))𝑟𝜁 ;
|
||||
28: else 5.6. Secure Insertion(SI)
|
||||
29: Swaps the roles of 𝑑2 , 𝑖𝑑2 , 2 with 𝑑1 , 𝑖𝑑1 , 1 .
|
||||
To support secure data insertion in databases, DESM𝑘NN innova-
|
||||
At the end of Algorithm 2, CSS computes 3 encrypted values:
|
||||
tively proposes a secure insertion protocol. When DO inserts a new POI
|
||||
𝐸𝑝𝑘0 (𝑑𝑚𝑖𝑛 ), 𝐸𝑝𝑘0 (𝑖𝑑𝑚𝑖𝑛 ), 𝐸𝑝𝑘0 (𝑚𝑖𝑛 ) via homomorphic encryption. The
|
||||
into the database, two key problems must be addressed.
|
||||
computation applies to 𝑠 = 𝑤 and 𝑠 ≠ 𝑤 (line 24-29). In this way, the
|
||||
protocol securely determines the minimum distance and its associated • How to determine the insertion position of the POI?
|
||||
information without revealing any intermediate values. • How to update 𝑇 𝑟𝑒𝑒𝑅 and 𝑉 𝐷?
|
||||
|
||||
5.5. Secure Set Difference(SSD) The first problem can be effectively resolved by CIPE𝑠 . First, DO
|
||||
generates an insertion query rectangle 𝑅𝑒𝑐𝑡𝑖𝑛𝑠 for the POI to be inserted,
|
||||
The goal of SSD is to securely compute the set difference between similar to generating a query rectangle 𝑅𝑒𝑐𝑡𝑞 for the query point 𝑞 in the
|
||||
two encrypted sets, which allows CSS to obtain the elements in 𝑆1 initial filtering stage, where the 𝐿 of the rectangle can be customized.
|
||||
that are not in 𝑆2 , without exposing any plaintext values. To achieve Then, DO encrypts each dimension of 𝑅𝑒𝑐𝑡𝑖𝑛𝑠 with CIPE𝑠 .EncQ algo-
|
||||
̂1 and 𝑆
|
||||
this, CSS holds the encrypted sets 𝑆 ̂2 together with 𝑆𝐾1 , while rithm and sends 𝑅𝑒𝑐𝑡 ̂ 𝑖𝑛𝑠 to ES near the inserted POI. ES will evaluate
|
||||
CCS holds 𝑆𝐾2 . The protocol begins with CSS initializing an empty the obtained 𝑅𝑒𝑐𝑡̂ ̂
|
||||
𝑖𝑛𝑠 over 𝑇 𝑟𝑒𝑒𝑅 to obtain the insertion position.
|
||||
table and iteratively processing each encrypted element in 𝑆 ̂1 (line Once the insertion position is determined, the label of the inserted
|
||||
1–2). For each comparison with an element in 𝑆 ̂2 , CSS generates a POI can be added to the 𝑇 𝑟𝑒𝑒𝑅 , thus completing the update of 𝑇 𝑟𝑒𝑒𝑅 .
|
||||
random blinding factor and constructs a masked comparison token To address the problem of how to update 𝑉 𝐷, the Bowyer-Watson
|
||||
that conceals the difference between the two values (line 3–6). This algorithm [32,33] is introduced. The Bowyer-Watson algorithm is an
|
||||
|
||||
6
|
||||
Y. Jia et al. Computer Standards & Interfaces 97 (2026) 104112
|
||||
|
||||
|
||||
incremental method that updates 𝑉 𝐷 by progressively updating the
|
||||
Delaunay triangulation. When inserting a new point, algorithm first
|
||||
identifies all the affected triangles, then removes them and reconstructs
|
||||
the triangulation mesh by using the new point and the boundary of
|
||||
the cavity, which ensures that the new Delaunay triangulation is valid.
|
||||
Since 𝑉 𝐷 and Delaunay triangulation are duals, when the Delaunay
|
||||
triangulation is updated by using the Bowyer-Watson algorithm, 𝑉 𝐷
|
||||
is updated accordingly. When a new generating point is inserted, the
|
||||
shape and boundaries of the Voronoi cells are adjusted. Therefore, DO
|
||||
obtains the updated Voronoi diagram based on the Bowyer-Watson
|
||||
algorithm and can obtain the encrypted id of the newly inserted POI:
|
||||
𝐸𝑝𝑘0 (𝑖𝑑𝑖𝑛𝑠 ), the encrypted inserted POI: 𝐸𝑝𝑘0 (𝑝𝑖𝑛𝑠 ), the label of the newly
|
||||
inserted POI: 𝑖𝑛𝑠 , the encrypted Voronoi neighbors: 𝐸𝑝𝑘0 (𝑉 𝑁(𝑝𝑖𝑛𝑠 )),
|
||||
the encrypted labels of Voronoi neighbors: 𝐸𝑝𝑘0 (𝑉 𝑁(𝑝𝑖𝑛𝑠 )), and the
|
||||
signature: 𝑆𝐼𝐺𝑖𝑛𝑠 used for verification. Finally, these six values are Fig. 5. Secure insertion and deletion in R-tree. (For interpretation of the
|
||||
organized into a tuple and sent to CSS for storage. As shown in Fig. references to color in this figure legend, the reader is referred to the web
|
||||
version of this article.)
|
||||
5, the secure insertion in the R-tree is highlighted with green lines.
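The cavity step of the Bowyer-Watson update described in Section 5.6 can be sketched in plaintext as follows. This is only an illustration of the standard algorithm, not the DO-side implementation of DESM𝑘NN: the function names `circumcircle_contains` and `insert_point` are hypothetical, triangles are plain coordinate tuples, and the new point is assumed to lie inside the existing triangulation (a full implementation would start from a bounding super-triangle).

```python
from itertools import combinations


def circumcircle_contains(tri, p):
    """Return True if p lies strictly inside the circumcircle of tri."""
    (ax, ay), (bx, by), (cx, cy) = tri
    # Force counter-clockwise order so the sign of the determinant is fixed.
    if (bx - ax) * (cy - ay) - (by - ay) * (cx - ax) < 0:
        (bx, by), (cx, cy) = (cx, cy), (bx, by)
    dx, dy = ax - p[0], ay - p[1]
    ex, ey = bx - p[0], by - p[1]
    fx, fy = cx - p[0], cy - p[1]
    det = ((dx * dx + dy * dy) * (ex * fy - fx * ey)
           - (ex * ex + ey * ey) * (dx * fy - fx * dy)
           + (fx * fx + fy * fy) * (dx * ey - ex * dy))
    return det > 0


def insert_point(triangles, p):
    """One Bowyer-Watson step: remove affected triangles, re-fill the cavity."""
    # Affected ("bad") triangles are those whose circumcircle contains p.
    bad = [t for t in triangles if circumcircle_contains(t, p)]
    # Edges used by exactly one bad triangle form the cavity boundary.
    edge_count = {}
    for t in bad:
        for edge in combinations(t, 2):
            key = frozenset(edge)
            edge_count[key] = edge_count.get(key, 0) + 1
    boundary = [tuple(e) for e, c in edge_count.items() if c == 1]
    kept = [t for t in triangles if t not in bad]
    # Connect every boundary edge to the new point.
    return kept + [(a, b, p) for a, b in boundary]


# A square split along its diagonal; inserting an interior point
# invalidates both triangles and yields four new ones.
square = [((0, 0), (4, 0), (4, 4)), ((0, 0), (4, 4), (0, 4))]
updated = insert_point(square, (2, 1))
```

Because the Voronoi diagram is the dual of the triangulation, only the cells of the points on the cavity boundary change, which is why the update stays local.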
|
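The homomorphic masking used by Algorithm 2 (line 4 and lines 15–19) and Algorithm 3 (line 6) can be illustrated with a toy additively homomorphic scheme. The sketch below makes several assumptions that are not in the paper: it uses textbook Paillier with tiny primes as a stand-in for DT-PKC, a single decryption key instead of the split keys $SK_1$/$SK_2$, and hypothetical helper names. It assumes distances below 64 and blinding factors up to 16, so that a non-negative masked difference stays below half the bit length of $N$.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key (g = n + 1); the primes are far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return n, (lam, pow(lam, -1, n))


def enc(n, m, rng):
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return pow(n + 1, m % n, n2) * pow(r, n, n2) % n2


def dec(n, sk, c):
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n


def masked_difference(n, c1, c2, r):
    """(E(a) * E(b)^(N-1))^r, which encrypts r * (a - b) mod N."""
    n2 = n * n
    return pow(c1 * pow(c2, n - 1, n2) % n2, r, n2)


def first_is_smaller(n, sk, c1, c2, rng):
    """SMC-style test: a negative difference wraps around to a value near N."""
    r = rng.randrange(1, 17)
    alpha = dec(n, sk, masked_difference(n, c1, c2, r))
    return alpha.bit_length() > n.bit_length() // 2


def is_equal(n, sk, c1, c2, rng):
    """SSD-style test: the masked difference decrypts to 0 only on equality."""
    r = rng.randrange(1, 1000)  # below both primes, hence invertible mod N
    return dec(n, sk, masked_difference(n, c1, c2, r)) == 0


rng = random.Random(1)
n, sk = keygen()
c5, c9 = enc(n, 5, rng), enc(n, 9, rng)
smaller = first_is_smaller(n, sk, c5, c9, rng)
```

The decrypting party learns only the sign (or the zero test) of a randomly scaled difference, never the operands themselves; in the paper this decryption is additionally split across the two servers.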
||||
Algorithm 4 Secure 𝑘NN Query
Require: CSS has $IR$, $E_{pk_0}(q)$, $SK_1$;
         CCS has $SK_2$;
Ensure: CSS obtains the encrypted search result $Result$;
// Calculations in CSS and CCS:
1: CSS initializes $Result$, $C$, $De$ to empty sets;
2: for each triple $(E_{pk_0}(p_i), E_{pk_0}(id_i), E_{pk_0}(\mathit{label}_i)) \in IR$ do
3:     CSS appends $E_{pk_0}(p_i)$ to $C$;
4: CSS with input $(C, E_{pk_0}(q), SK_1, pk_0)$ and CCS with input $(SK_2, pk_0)$ run the SSDC protocol, and CSS obtains $\{Distance_1, ..., Distance_{|C|}\}$;
5: if $|C| \geq k$ then
6:     CSS with input $(\{Distance_i, E_{pk_0}(id_i), E_{pk_0}(\mathit{label}_i)\}_{i=1}^{|C|}, SK_1, pk_0)$ and CCS with input $(SK_2, pk_0)$ run the SMC protocol, and CSS puts $(E_{pk_0}(id_i^*))_{i=1}^{k}$ into $Result$;
7: else
8:     CSS with input $(\{Distance_i, E_{pk_0}(id_i), E_{pk_0}(\mathit{label}_i)\}_{i=1}^{|C|}, SK_1, pk_0)$ and CCS with input $(SK_2, pk_0)$ run the SMC protocol, and CSS puts $E_{pk_0}(id_1^*)$ into $Result$ and puts $E_{pk_0}(\mathit{label}_1^*)$ into $De$;
9:     CSS and CCS collaborate to run the SCR protocol to get the row corresponding to $E_{pk_0}(id_1^*)$;
10:    CSS with input $(E_{pk_0}(VN(p_1^*)), De, SK_1)$ and CCS with input $SK_2$ run the SSD protocol, and CSS obtains $VN'(p_1^*)$;
11:    for $E_{pk_0}(p_j) \in E_{pk_0}(VN(p_1^*)) \cap VN'(p_1^*)$ do
12:        CSS puts $E_{pk_0}(p_j)$ into $C$ and $E_{pk_0}(\mathit{label}_j)$ into $De$;
13:    CSS and CCS collaborate to run the SSD and SMC protocols to select the POI closest to $q$ from $C$ again, and remove it from $C$;
14:    CSS inserts $E_{pk_0}(id_2^*)$ into $Result$;
15:    while $|Result| < k$ do
16:        Repeat lines 9–14;

5.7. Secure Deletion (SD)

To support secure data deletion in the database, DESM𝑘NN proposes a secure deletion protocol. First, DO generates a deletion query rectangle $Rect_{del}$ for the POI to be deleted, where the $L$ of the rectangle can be customized. Then, DO encrypts each dimension of $Rect_{del}$ with the CIPE𝑠.EncQ algorithm and sends $\widehat{Rect}_{del}$ to the ES near the deleted POI. The ES evaluates the obtained $\widehat{Rect}_{del}$ over $\widehat{\mathit{Tree}}_R$ to obtain the deletion position.

Once the deletion position is determined, DO sends $\mathit{label}_{del}$, which is the label of the POI, to the ES near the deleted POI. The ES deletes the POI label from the data at the deletion location based on the $\mathit{label}_{del}$ sent by DO. At this point, the deletion update of $\mathit{Tree}_R$ is completed.

Similar to the SI protocol, DESM𝑘NN introduces a Delaunay triangulation-based dynamic deletion and update algorithm for Voronoi diagrams. The key idea behind this algorithm is that Voronoi diagrams and Delaunay triangulations are dual to each other: the vertices of the Delaunay triangulation are the generating points of the Voronoi cells, each Delaunay triangle corresponds to a Voronoi vertex, and each Delaunay edge corresponds to a Voronoi edge. The algorithm leverages this duality to update the Voronoi diagram efficiently. When a point is deleted, the corresponding Delaunay triangles are removed, and the algorithm updates the connectivity of the affected neighboring triangles to maintain the Delaunay condition, which ensures that the triangulation is reconstructed. Then, based on the new Delaunay triangulation, the boundaries of the Voronoi diagram are updated to ensure the correct topological structure of the diagram.

Similarly, DO obtains the updated $VD$ and the labels of affected POIs $\mathit{label}_{affect_i}$, the encrypted Voronoi neighbors $E_{pk_0}(VN(p_{affect_i}))$, the encrypted labels of Voronoi neighbors $E_{pk_0}(VN(p_{affect_i}))$, and the signature $SIG_{affect_i}$ used for verification. Finally, these four values are organized into a quadruple and sent to CSS, which updates the database based on the labels of the affected POIs. As shown in Fig. 5, the secure deletion in the R-tree is highlighted with red lines.

Algorithm 5 Secure Transformation
Require: CSS has $E_{pk_0}(a)$, $SK_1$;
         CCS has $SK_2$;
Ensure: CSS obtains $E_{pk_u}(a)$;
// Calculations in CSS:
1: Choose one random number $r \in \mathbb{Z}_N$;
2: $E_{pk_0}(\alpha) = E_{pk_0}(a) * E_{pk_0}(r)$;
3: $\alpha' \leftarrow PSDec1(SK_1, E_{pk_0}(\alpha))$;
4: Send $E_{pk_0}(\alpha)$, $\alpha'$ to CCS;
// Calculations in CCS:
5: $\alpha \leftarrow PSDec2(SK_2, E_{pk_0}(\alpha), \alpha')$;
6: Send $E_{pk_u}(\alpha)$ to CSS;
// Calculations in CSS:
7: $E_{pk_u}(a) = E_{pk_u}(\alpha) * E_{pk_u}(r)^{N-1}$;

6. DESM𝑘NN query processing

This section provides a detailed introduction to DESM𝑘NN query processing, which consists of two parts: secure 𝑘NN query processing and verification processing.
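The control flow of Algorithm 4 is easier to follow without encryption. The sketch below is a plaintext analogue under stated assumptions: `knn_via_voronoi` and its arguments are hypothetical names, the Voronoi neighbor table is given as a plain dictionary, and the initial candidate set (the role played by $IR$) is assumed to contain the true nearest neighbor. In the protocol itself, every minimum selection is an SMC run and every deduplication is an SSD run over ciphertexts.

```python
def knn_via_voronoi(points, neighbors, candidates, q, k):
    """Expand the candidate set through Voronoi neighbors until k POIs are found.

    points:     id -> (x, y) coordinates
    neighbors:  id -> iterable of Voronoi neighbor ids
    candidates: ids returned by the initial filtering stage (IR)
    """
    def dist2(i):
        x, y = points[i]
        return (x - q[0]) ** 2 + (y - q[1]) ** 2

    result = []              # plays the role of Result
    cand = set(candidates)   # plays the role of C
    seen = set(candidates)   # plays the role of De (deduplication)
    while len(result) < k and cand:
        best = min(cand, key=lambda i: (dist2(i), i))   # SMC in the protocol
        cand.remove(best)
        result.append(best)
        for j in neighbors[best]:
            if j not in seen:                            # SSD in the protocol
                seen.add(j)
                cand.add(j)
    return result


# Five POIs on a line; each POI's Voronoi neighbors are its adjacent POIs.
pts = {i: (float(i), 0.0) for i in range(5)}
vn = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
top3 = knn_via_voronoi(pts, vn, candidates=[2], q=(2.2, 0.0), k=3)
```

The loop relies on the Voronoi property that the next nearest neighbor is always a Voronoi neighbor of a POI already in the result, so the candidate set never needs to cover the whole database.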
|
||||
|
||||
|
||||
|
||||
|
||||
6.1. Secure 𝑘NN query processing

Based on the comprehensive search framework, DESM𝑘NN proposes a secure and verifiable query processing strategy, which is divided into three steps as follows:

• Step 1. Calculating k nearest neighbors: The specific details and procedures are illustrated in Algorithm 4. First, CSS creates three new sets: the result set $Result$, the candidate set $C$, and the deduplication set $De$ (line 1). After the initial filtering stage, CSS has $IR = \{(E_{pk_0}(p_i), E_{pk_0}(id_i), E_{pk_0}(\mathit{label}_i))\}$. Next, CSS inserts each encrypted POI $E_{pk_0}(p_i)$ from $IR$ into $C$ (lines 2–3). Since CSS has already stored the encrypted query point $E_{pk_0}(q)$, the SSDC protocol is executed for each intermediate POI to obtain the secure squared distance between each POI and the query point (line 4). If $|C| \geq k$, which means that the required $k$ POIs can be found in $IR$, CSS and CCS collaborate to execute the SMC protocol to obtain the desired $k$ POIs (lines 5–6). If $|C| < k$, CSS and CCS collaborate to execute the SMC protocol to obtain the nearest POI, and insert the corresponding $E_{pk_0}(id_1^*)$ into $Result$ and the corresponding $E_{pk_0}(\mathit{label}_1^*)$ into $De$ (lines 7–8). To get the next nearest neighbor, CSS and CCS collaborate to execute the SCR protocol [8,9] to get the row corresponding to $E_{pk_0}(id_1^*)$: $E_{pk_0}(VN(p_1^*))$, $VN(p_1^*)$, $SIG_{p_1^*}$ (line 9). CSS and CCS then collaborate to execute the SSD protocol with the two input sets $VN(p_1^*)$ and $De$, and CSS obtains $VN'(p_1^*)$ (line 10). If a POI $E_{pk_0}(p_j)$ in $E_{pk_0}(VN(p_1^*))$ also exists in $VN'(p_1^*)$, $E_{pk_0}(p_j)$ is added to $C$, and $E_{pk_0}(\mathit{label}_j)$ is added to $De$ (lines 11–12). CSS and CCS collaborate to execute the SSD and SMC protocols, which select the POI closest to the query point from $C$ again and remove it from $C$ (line 13). CSS inserts $E_{pk_0}(id_2^*)$, which corresponds to the obtained point, into $Result$ and checks whether the content of $Result$ meets the requirements of the 𝑘NN query. If not, S𝑘Q repeats lines 9–14.

• Step 2. Generating verification object: During secure 𝑘NN queries, DESM𝑘NN also needs to generate $VO$. By collaborating to execute the SCR protocol, CSS and CCS can obtain $E_{pk_0}(VN(p_i))$ and $SIG_{p_i}$ from the row that corresponds to $p_i$. Additionally, Algorithm 5 enables key conversion, which transforms $E_{pk_0}(VN(p_i))$ into $E_{pk_u}(VN(p_i))$. Finally, CSS adds $E_{pk_u}(VN(p_i))$ and $E_{pk_u}(SIG_{p_i})$ of each result point into $VO$.

• Step 3. Returning results and verification object to QU: Based on the proposed secure protocols, CSS can directly retrieve the final results encrypted with $pk_u$ in order, without needing an additional transformation process. Therefore, CSS puts the final points into $Result$ and sends it, along with $VO$, to QU.

6.2. Verification processing

QU utilizes $Result$ and $VO$ to authenticate the correctness and completeness of $Result$.

• Verifying correctness: Recall the definition of correctness described in the security model, which means that each returned point $p \in Result$ remains unmodified and is an authentic entry in the original database. To verify the correctness of $Result$, QU first decrypts $VO$ by using its private key $sk_u$ to obtain $\{VN(p_i), SIG_{p_i}\}$. Next, QU uses the obtained $VN(p_i)$ to compute $H(VN(p_i))$ and further calculates $H(H(p_i)|H(VN(p_i)))$ (the specific method has been detailed in Data Pre-processing). Finally, QU only needs to check whether $SIG_{p_i}$ matches the computed $H(H(p_i)|H(VN(p_i)))$ to verify correctness.

• Verifying completeness: Similar to correctness, completeness is defined as follows: all the points returned are valid solutions to the 𝑘NN query, while the points not returned do not correspond to the actual answers. First, assume that $p_i^*$ represents the $i$th nearest point to the query point $q$ in $Result$. Subsequently, based on the properties of the Voronoi diagram, $VC(p_i^*)$ can be derived from $VN(p_i^*)$ and $p_i^*$. The specific process is divided into four steps: (1) determine the coordinates of the neighboring points; (2) calculate the perpendicular bisectors between $p_i^*$ and each neighboring point; (3) identify the intersection points of all these perpendicular bisectors, which form the vertices of the polygon that represents the Voronoi cell; (4) connect these vertices in either clockwise or counterclockwise order to form the Voronoi cell surrounding the point $p_i^*$. Thereafter, the final verification is conducted based on two important properties of the Voronoi diagram. The first step is to determine whether $q$ lies within $VC(p_1^*)$. If it does, $p_1^*$ is confirmed as the nearest POI; otherwise, the verification process is terminated immediately. The second step is to test each point (except for $p_1^*$) in $Result$ individually, which determines whether $p_i^* \in \{VN(p_1^*) \cup \cdots \cup VN(p_{i-1}^*)\}$, $i > 1$. If it does, $p_i^*$ is confirmed as the $i$th nearest POI.

7. Analysis

7.1. Computational complexity

To verify the efficiency of DESM𝑘NN, we analyze the computational complexity of all four entities involved in the system: DO, QU, ESs, and dual-cloud servers. Let $e_c$ and $d_c$ denote the encryption and decryption operations of CIPE𝑠, and let $e_{dt}$ and $d_{dt}$ represent the encryption and decryption operations of DT-PKC.

(1) DO: In the data pre-processing stage, DO needs to generate $\mathit{Tree}_R$ and $VD$ based on the database $D$. $\mathit{Tree}_R$ and the $PD$ generated from $VD$ are encrypted by using CIPE𝑠 and DT-PKC, respectively. Therefore, the total computational complexity is
$O(n)e_c + O(n * M)e_{dt}$,
where $M$ represents the maximum number of neighbors in $VD$.

(2) QU: Due to the key conversion mechanism in Algorithm 5, QU only needs to perform a single DT-PKC decryption to obtain the final result and $VO$. Thus, the computational cost is $O(1)d_{dt}$.

(3) ESs: The ESs perform initial filtering by evaluating the encrypted query rectangle $\widehat{Rect}_q$ over the encrypted R-tree $\widehat{\mathit{Tree}}_R$ to generate the intermediate result set $IR$. Their total computational complexity is $O(\log n)d_c$.

(4) Dual-Cloud Servers: The dual-cloud servers undertake the precise search stage and therefore incur the highest computational complexity, as this stage requires executing several secure sub-protocols. Specifically, the SSDC protocol is used to compute the secure squared distance between the query point $q$ and each POI in the intermediate result set $IR$. The SMC protocol is responsible for comparing encrypted distance values and obtaining the corresponding encrypted identifiers and location records. To determine the nearest POI among $n$ candidates, the SMC protocol must be executed $n-1$ times. In addition, the SSD protocol computes the set difference between two encrypted sets and must perform DT-PKC decryption $|\hat{S}_1| * |\hat{S}_2|$ times. The overall complexity depends on whether the number of candidates in $IR$ is greater than or smaller than $k$. When $|IR| > k$, the S𝑘Q protocol repeatedly invokes the SMC protocol to iteratively determine the top-$k$ POIs, which requires $(|IR|-1+|IR|-k) * k/2$ executions in total. In this case, the computational complexity of the precise search stage is
$O(|IR| * k)(e_{dt} + d_{dt})$.
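The count $(|IR|-1+|IR|-k) * k/2$ is the sum of an arithmetic series: extracting the minimum from $m$ remaining candidates takes $m-1$ pairwise comparisons, and $m$ shrinks from $|IR|$ to $|IR|-k+1$ over the $k$ rounds. The short check below assumes exactly this round structure (one SMC run per pairwise comparison, one candidate removed per round); the function names are illustrative only.

```python
def smc_calls_by_rounds(ir_size, k):
    """Count pairwise SMC comparisons over k rounds of minimum extraction."""
    calls = 0
    remaining = ir_size
    for _ in range(k):
        calls += remaining - 1   # comparisons needed to find the current minimum
        remaining -= 1           # the minimum leaves the candidate set
    return calls


def smc_calls_closed_form(ir_size, k):
    """Closed form quoted in the text: (|IR| - 1 + |IR| - k) * k / 2."""
    return (ir_size - 1 + ir_size - k) * k // 2


example = smc_calls_by_rounds(10, 3)   # 9 + 8 + 7 comparisons
```

Since the first term is at most $|IR|-1$ and there are $k$ terms, the total is bounded by $|IR| * k$, which matches the $O(|IR| * k)$ bound stated for this case.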
|
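Returning to the verification procedure of Section 6.2, the two checks performed by QU can be sketched in plaintext. Several details here are assumptions rather than the paper's method: SHA-256 stands in for $H$, the byte encoding of points and neighbor lists is invented for the example, and membership of $q$ in $VC(p_1^*)$ is tested with the equivalent half-plane condition (no Voronoi neighbor is closer to $q$ than $p_1^*$) instead of constructing the cell polygon from the perpendicular bisectors.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def encode(p):
    # Hypothetical canonical encoding of a point.
    return ("%r,%r" % (p[0], p[1])).encode()


def expected_sig(p, vn):
    """H(H(p) | H(VN(p))) with an assumed encoding of the neighbor list."""
    h_vn = h(b"|".join(encode(v) for v in sorted(vn)))
    return h(h(encode(p)) + h_vn)


def dist2(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def in_voronoi_cell(q, p, vn):
    """q lies in VC(p) iff p is at least as close to q as every Voronoi neighbor."""
    return all(dist2(q, p) <= dist2(q, v) for v in vn)


def verify(q, result, vn_of, sig_of):
    # Correctness: every returned point matches its signature.
    if any(sig_of[p] != expected_sig(p, vn_of[p]) for p in result):
        return False
    # Completeness, first property: q lies in the cell of the first result.
    if not in_voronoi_cell(q, result[0], vn_of[result[0]]):
        return False
    # Completeness, second property: each later result is a Voronoi
    # neighbor of some earlier result.
    reachable = set()
    for i in range(1, len(result)):
        reachable.update(vn_of[result[i - 1]])
        if result[i] not in reachable:
            return False
    return True


p0, p1, p2, p3 = (0, 0), (1, 0), (2, 0), (3, 0)
vn_of = {p0: [p1], p1: [p0, p2], p2: [p1, p3], p3: [p2]}
sig_of = {p: expected_sig(p, vn_of[p]) for p in vn_of}
ok = verify((1.2, 0), [p1, p2, p0], vn_of, sig_of)
```

A result list that starts with the wrong POI fails the cell test, and a tampered neighbor list fails the signature comparison, so the two checks cover different kinds of server misbehavior.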
||||
|
||||
|
||||
|
||||
|
||||
Table 3
Computational complexity of existing approaches and DESM𝑘NN. Where two expressions are given for the dual-cloud servers, they are the two cases of the precise search stage.

| Scheme | DO | QU | ES | Dual-cloud servers |
|---|---|---|---|---|
| DESM𝑘NN | $O(n)e_c + O(n * M)e_{dt}$ | $O(1)d_{dt}$ | $O(\log n)d_c$ | $O(|IR| * k)(e_{dt} + d_{dt})$; $O(|IR| + k^2 * M)e_{dt} + O(|IR| + k * (\sqrt{n} + k * M))d_{dt}$ |
| MSV𝑘NN [9] | $O(m^2 * g + n * M)e_{dt}$ | $O(1)d_{dt}$ | – | $O(k * (n + M))e_{dt} + O(k * (\sqrt{n} + M))d_{dt}$ |
| SecVKQ [14] | $O(n)e_c + O(n * M)e_p$ | $O(1)(e_c + e_p)$ | $O(\log n)d_c$ | $O(|IR| * k)(e_p + d_p)$; $O(|IR| + k^2 * M)(e_p + d_p)$ |
| SV𝑘NN [8] | $O(m^2 * g + n * M)e_p$ | $O(1)d_p$ | – | $O(k * (n + M))e_p + O(k * (\sqrt{n} + M))d_p$ |

Notations: $n$ represents the size of dataset $D$, $k$ represents the search parameter for 𝑘NN search, and $M$ represents the maximal number of Voronoi neighbors. $m$ refers to the number of grids, while $g$ represents the maximum number of grid points, as discussed in [8,9].
|
||||
|
||||
|
||||
Table 4
Comparison of communication costs (MB) under the setting of $K \in \{1024, 2048\}$.

| $n$ | DESM𝑘NN, California, $K$=1024 | DESM𝑘NN, California, $K$=2048 | DESM𝑘NN, San Francisco, $K$=1024 | DESM𝑘NN, San Francisco, $K$=2048 | MSV𝑘NN, California, $K$=1024 | MSV𝑘NN, California, $K$=2048 | MSV𝑘NN, San Francisco, $K$=1024 | MSV𝑘NN, San Francisco, $K$=2048 |
|---|---|---|---|---|---|---|---|---|
| 1024 | 6.1 | 12.7 | 5.9 | 12.3 | 6.5 | 13.1 | 6.1 | 12.4 |
| 2048 | 12.8 | 27.8 | 11.9 | 25.6 | 14.3 | 31.4 | 13.9 | 30.7 |

When $|IR| < k$, the nearest POI is first identified by using $|IR|-1$ SMC comparisons. Next, the SCR protocol is executed to locate the bucket row containing this POI, after which the remaining $k-1$ POIs are obtained through the subsequent steps of S𝑘Q. In this case, the computational complexity of the precise search stage is
$O(|IR| + k^2 * M)e_{dt} + O(|IR| + k * (\sqrt{n} + k * M))d_{dt}$,
where $M$ denotes the maximum number of neighbors in the Voronoi diagram. The comparison results between DESM𝑘NN and existing secure 𝑘NN query schemes are summarized in Table 3.

Moreover, the computational complexity of POI insertion and deletion in DESM𝑘NN is $O(\log n + \log(M_1))$ on average, which is asymptotically equivalent to $O(\log(M_1 n))$. Here, $M_1$ represents the number of neighboring POIs affected by the local Voronoi diagram update. This complexity arises from updating the encrypted R-tree and locally maintaining the Voronoi diagram.

7.2. Communication complexity

In this subsection, the communication cost incurred during the entire query processing is evaluated. Table 4 presents the communication cost of DESM𝑘NN compared with MSV𝑘NN. In every setting listed, DESM𝑘NN incurs a lower communication cost than MSV𝑘NN. These experimental results align well with the theoretical analysis.

7.3. Security analysis

To establish the security of the proposed subprotocols, it is important to highlight that the semantic security of the DT-PKC cryptosystem has been proven in [19]. Additionally, in accordance with the formal security definition of multiparty computation introduced in [29] and [34], the framework of the simulation paradigm proposed in [35] is adopted. Specifically, the simulation paradigm requires that the view of each participant in the protocol can be simulated based solely on its input and output, which ensures that no participant gains any additional information from the protocol. In other words, the real execution of each subprotocol is computationally indistinguishable from its simulated counterpart. For clarity, the SSDC and SMC protocols are formally demonstrated as examples, and the other proposed protocols can be proven in a similar manner.

Theorem 1. The DT-PKC cryptosystem described in Section 3 is semantically secure under the assumed intractability of the DDH problem over $\mathbb{Z}^*_{N^2}$. This ensures that ciphertexts produced by DT-PKC reveal no information about the underlying plaintexts, even to computationally bounded adversaries (for the details of the proof, see [19]).

Theorem 2 (Composition Theorem [35]). If a protocol is composed of multiple subprotocols, each of which is secure under the simulation paradigm, and all intermediate values are either random or pseudorandom, the composed protocol is secure. This theorem allows the security of DESM𝑘NN to be deduced from the security of its individual subprotocols.

Theorem 3 (Security of SSDC). Assuming DT-PKC is semantically secure, the SSDC subprotocol securely computes encrypted squared distances between the query point and candidate points in $IR$ for semi-honest adversaries.

Proof. In SSDC, the cloud server's view consists of the ciphertexts $a'', b'', a', b'$, which are derived from plaintext differences scaled by random factors, and the encrypted comparison results $E_1, E_2$. The simulated view $\Pi^{s}_{CCS}(SSDC)$ is constructed by sampling all elements uniformly at random from the appropriate domain. The semantic security of DT-PKC ensures that $a'', b'', a', b'$ are computationally indistinguishable from the corresponding simulated values $(a''_s, b''_s, a'_s, b'_s)$. Similarly, the randomized encryption of the comparison outcomes $E_1, E_2$ ensures that these values are indistinguishable from their simulated counterparts $E_1^s, E_2^s$. This demonstrates that the real execution reveals no additional information beyond what is contained in the input and output, which confirms the security of SSDC. For CSS, the execution image is $\Pi_{CCS}(SSDC) = \{E_1, E_2\}$, and the simulated image is $\Pi^{s}_{CCS}(SSDC) = \{E_1^s, E_2^s\}$. Since $E_1, E_2$ are produced by randomized procedures, they are computationally indistinguishable from $E_1^s, E_2^s$, which further supports the security argument.

Theorem 4 (Security of SMC). Assuming DT-PKC is semantically secure, the SMC protocol securely compares encrypted distance values and returns encrypted identifiers or labels.

Proof. In SMC, the server's view contains $(E_{pk_0}(\alpha), \alpha_1, \alpha_2)$ and a local output bit $w$. The simulated view $\Pi^{s}_{CCS}(SMC)$ is obtained by sampling all elements randomly. Semantic security guarantees that $(E_{pk_0}(\alpha), \alpha_1)$ are indistinguishable from their simulated counterparts $(E_{pk_0}(\alpha)^s, \alpha_1^s)$. Additionally, $\alpha_2$ is derived from random coin flips and is indistinguishable from $\alpha_2^s$. The local output bit $w$ also matches the distribution of the simulated $w^s$. Hence, the simulated view is computationally indistinguishable from the real view, which confirms the security of SMC.

Theorem 5 (Security of DESM𝑘NN). If DT-PKC is semantically secure, DESM𝑘NN is secure under the semi-honest model.

Proof. Since each subprotocol (SSDC, SMC, SSD, and others) produces views indistinguishable from their respective simulated views, and all
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
intermediate values are either DT-PKC ciphertexts or explicitly randomized, the composition theorem applies. Consequently, the overall DESM𝑘NN protocol is secure, ensuring confidentiality of the database, privacy of queries, and integrity of computation.

In DESM𝑘NN, a quantitative security comparison across existing methods is not conducted because their threat models, cryptographic assumptions, and supported functionalities differ significantly, which makes such an evaluation extremely difficult. Instead, DESM𝑘NN focuses on formally achieving and proving multiple security properties that prior methods do not simultaneously provide. DESM𝑘NN ensures data privacy, query privacy, result privacy, and access pattern privacy, while also supporting result verification, multi-user querying, and dynamic updates to the encrypted POI database in outsourced POI queries.

8. Experimental evaluation

This section evaluates the computational cost of DESM𝑘NN by using real-world datasets for spatial databases: California Road Network and San Francisco Road Network. A comparison is made between DESM𝑘NN and the scheme MSV𝑘NN [9] in different phases.

8.1. Parameter setting

The evaluation of DESM𝑘NN is carried out on a system equipped with an Intel Core i7-14650HQ processor, clocked at 2.80 GHz, and 16 GB of RAM, which runs Windows 11. For this purpose, the DT-PKC cryptosystem, which forms the core element of the proposed protocol, is implemented by using the Java Development Kit.

In the experiment, the dataset size $n$ ranges from 1024 to 2024. The search parameter $k$ is set between 1 and 10. The key size $K$ of the DT-PKC cryptosystem is selected from $\{1024, 2048, 3072\}$. These settings apply to all values of $n$, $k$, $K$ in the experiment. While implementing the MSV𝑘NN and SV𝑘NN schemes, the grid granularity is fixed at 90 and the cryptographic hash functions are implemented via HMAC-SHA-256.

8.2. Experiment results

The following analysis of the experimental results focuses on DO and the dual-cloud servers. The experiment results for the CIPE𝑠 scheme are not included, as its execution time is negligible compared to the DT-PKC cryptosystem. For example, the CIPE𝑠 scheme takes less than 1 s to retrieve $IR$ from 1 million POIs.

Fig. 6. The data processing time with varying parameters.

Fig. 7. Comparison of search time between MSV𝑘NN and DESM𝑘NN on two datasets ($k$ = 1 to 10).
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 8. Comparison of search time between MSV𝑘NN and DESM𝑘NN on two datasets ($K$ = 1024 to 3072).

Fig. 9. Comparison of search time between MSV𝑘NN and DESM𝑘NN on two datasets ($n$ = 1024 to 2024).

Fig. 10. The search time of DESM𝑘NN on two datasets ($K$ = 1024 to 3072).

• DO: The execution time in data preprocessing is shown in Fig. 6. The computational cost includes two components: the cost of encrypting $VD$ and the cost of generating $SIG$. Experiment results show that MSV𝑘NN and SV𝑘NN require additional operations such as grid partition, grid padding, and grid encryption, and thus perform worse in this stage.

• Dual-Cloud Servers: As shown in Section 7, the execution time in the search stage is influenced by the parameters $n$, $k$, $K$. Experiments are conducted under different parameter settings to demonstrate the effectiveness of DESM𝑘NN. The search time of DESM𝑘NN is significantly shorter than that of MSV𝑘NN, as shown in Figs. 7–9, primarily because MSV𝑘NN incurs a high computational cost when executing the critical SGC protocol. Please note that in Fig. 7, both datasets (California Road Network and Points of Interest, San Francisco Road Network) are real-world datasets, where realistic POI distributions result in consistent performance gaps between DESM𝑘NN and MSV𝑘NN. Moreover, real-world datasets often exhibit a high density of POIs. Due to the grid partitioning mechanism, MSV𝑘NN tends to be inefficient when handling real-world datasets. For example, in the California road network dataset, when setting the fine-grained grid parameter $m$ in MSV𝑘NN to 32 (which is the optimal parameter for MSV𝑘NN), the number of POIs contained within each grid reaches as high as 108. To utilize data packing techniques, the parameter $K$ needs to be adjusted to no less than 4096, which results in extremely high computational costs. However, in DESM𝑘NN, well-designed data structures are employed to regulate the number of POIs
|
||||
|
||||
|
||||
|
||||
|
||||
per partition, which keeps $K$ within a reasonable range and prevents excessive computational overhead. As shown in Fig. 10, when $IR$ is smaller than the query parameter $k$, the query time is significantly higher compared to when $IR$ exceeds $k$, since CS need to perform more calculations related to homomorphic encryption. For a given scheme, larger values of $k$ and $n$ increase query time by expanding the search space and raising computational demands. Likewise, a larger $K$ leads to longer plaintexts for encryption, which adds overhead from cryptographic operations.

In general, it can be concluded that DESM𝑘NN not only meets the security requirements mentioned in Section 4 but also achieves higher efficiency than the scheme MSV𝑘NN in all stages of POI queries, with an improvement of up to 45.5%.

References

[1] R. Li, A. Liu, A. Wang, Fast and scalable range query processing with strong privacy protection for cloud computing, IEEE/ACM Trans. Netw. 24 (4) (2015) 2305–2318.
[2] G. Xiao, F. Wu, X. Zhou, K. Li, Probabilistic top-k range query processing for uncertain databases, J. Intell. Fuzzy Syst. 31 (2) (2016) 1109–1120.
[3] K. Xue, S. Li, J. Hong, Y. Xue, N. Yu, P. Hong, Two-cloud secure database for numeric-related SQL range queries with privacy preserving, IEEE Trans. Inf. Forensic Secur. 12 (7) (2017) 1596–1608.
[4] Y. Miao, Y. Yang, X. Li, K.-K.R. Choo, X. Meng, R.H. Deng, Comprehensive survey on privacy-preserving spatial data query in transportation systems, IEEE Trans. Intell. Transp. Syst. 24 (12) (2023) 13603–13616.
[5] Y. Zhang, B. Wang, Z. Zhao, Verifiable and privacy-preserving 𝑘-NN query scheme with multiple keys, IEEE Trans. Big Data 11 (3) (2024) 1434–1446.
[6] Q. Liu, Y. Peng, J. Wu, T. Wang, G. Wang, Secure multi-keyword fuzzy searches with enhanced service quality in cloud computing, IEEE Trans. Netw. Serv. Manag. 18 (2) (2021) 2046–2062.
|
||||
9. Conclusion [7] Q. Liu, Y. Peng, Q. Xu, H. Jiang, J. Wu, T. Wang, T. Peng, G. Wang, S.
|
||||
Zhang, 𝖬𝖠𝖱𝖲Mars: Enabling verifiable range-aggregate queries in multi-source
|
||||
This paper proposes efficient and secure multi-user 𝑘NN queries environments, IEEE Trans. Dependable Secur. Comput. 21 (4) (2024) 1994–2011.
|
||||
[8] N. Cui, X. Yang, B. Wang, J. Li, G. Wang, SVkNN: Efficient secure and verifiable
|
||||
with dynamic POIs updating, which preserves the privacy of data,
|
||||
k-nearest neighbor query on the cloud platform, in: Proc. of ICDE, 2020, pp.
|
||||
queries, results, access patterns and ensures the results are correct 253–264.
|
||||
and complete in a multi-user environment. Firstly, DESM𝑘NN proposes [9] N. Cui, K. Qian, T. Cai, J. Li, X. Yang, J. Cui, H. Zhong, Towards multi-user,
|
||||
a two-stage search framework to accelerate query speed. Secondly, secure, and verifiable 𝑘 NN query in cloud database, IEEE Trans. Knowl. Data
|
||||
DESM𝑘NN designs a series of novel secure protocols and a compact ver- Eng. 35 (9) (2023) 9333–9349.
|
||||
[10] H. Xie, Y. Guo, X. Jia, A privacy-preserving online ride-hailing system without
|
||||
ification strategy to facilitate the operation over the two-stage search involving a third trusted server, IEEE Trans. Inf. Forensics Secur. 16 (2021)
|
||||
framework. Finally, computational complexity, security analysis and 3068–3081.
|
||||
experimental evaluation demonstrate that DESM𝑘NN improves query [11] W. Wong, D. Cheung, B. Kao, N. Mamoulis, Secure kNN computation on
|
||||
efficiency by up tp 45.5% compared to MSV𝑘NN. In future research, encrypted databases, in: Proc. of SIGMOD, 2009, pp. 139–152.
|
||||
[12] Y. Zhu, R. Xu, T. Takagi, Secure k-NN computation on encrypted cloud data
|
||||
we plan to study 𝑘NN queries for multi-type POIs to address the
|
||||
without sharing key with query users, in: Proc. of IWSEC, 2013, pp. 55–60.
|
||||
limitation of single-type POI scenarios, where query results are too [13] B. Yao, F. Li, X. Xiao, Secure nearest neighbor revisited, in: Proc. of ICDE, 2013,
|
||||
homogeneous. Moreover, we will focus more on exploring the balance pp. 733–744.
|
||||
between security and efficiency. [14] Q. Liu, Z. Hao, Y. Peng, H. Jiang, J. Wu, T. Peng, G. Wang, S. Zhang, SecVKQ:
|
||||
Secure and verifiable kNN queries in sensor–cloud systems, J. Syst. Archit. 120
|
||||
(2021) 102300.
|
||||
CRediT authorship contribution statement [15] Y. Elmehdwi, B.K. Samanthula, W. Jiang, Secure k-nearest neighbor query
|
||||
over encrypted data in outsourced environments, in: Proc. of ICDE, 2014, pp.
|
||||
Yining Jia: Writing – original draft, Software, Methodology, In- 664–675.
|
||||
vestigation, Conceptualization. Yali Liu: Writing – review & editing, [16] S. Choi, G. Ghinita, H.-S. Lim, E. Bertino, Secure kNN query processing in
|
||||
untrusted cloud environments, IEEE Trans. Knowl. Data Eng. 26 (11) (2014)
|
||||
Resources. Congai Zeng: Writing – review & editing. Xujie Ding:
|
||||
2818–2831.
|
||||
Writing – review & editing. Jianting Ning: Writing – review & editing. [17] K. Cheng, L. Wang, Y. Shen, H. Wang, Y. Wang, X. Jiang, H. Zhong, Secure 𝑘
|
||||
k-NN query on encrypted cloud data with multiple keys, IEEE Trans. Big Data
|
||||
Declaration of competing interest 7 (4) (2021) 689–702.
|
||||
[18] A. Boldyreva, N. Chenette, Y. Lee, A. O’neill, Order-preserving symmetric
|
||||
encryption, in: Proc. of EUROCRYPT, 2009, pp. 224–241.
|
||||
The authors declare that they have no known competing finan-
|
||||
[19] X. Liu, R.H. Deng, K.-K.R. Choo, J. Weng, An efficient privacy-preserving
|
||||
cial interests or personal relationships that could have appeared to outsourced calculation toolkit with multiple keys, IEEE Trans. Inf. Forensics
|
||||
influence the work reported in this paper. Secur. 11 (11) (2016) 2401–2414.
|
||||
[20] K. Cheng, Y. Shen, Y. Wang, L. Wang, J. Ma, X. Jiang, C. Su, Strongly secure
|
||||
and efficient range queries in cloud databases under multiple keys, in: Proc. of
|
||||
Acknowledgments
|
||||
INFOCOM, 2019, pp. 2494–2502.
|
||||
[21] S.K. Nayak, S. Tripathy, SEMKC: Secure and efficient computation over out-
|
||||
The authors thank the editor and the reviewers for their comments sourced data encrypted under multiple keys, IEEE Trans. Emerg. Top. Comput.
|
||||
and suggestions. This work was supported by the National Natural Sci- 9 (1) (2018) 414–428.
|
||||
ence Foundation of China under Grant No. 61702237, No. 62425205, [22] A. Okabe, B. Boots, K. Sugihara, S. Chiu, Spatial tessellations: Concepts and
|
||||
applications of voronoi diagrams, College Math. J. (2001).
|
||||
and No. 12441101, the Opening Foundation of State Key Laboratory
|
||||
[23] Y. Manolopoulos, A. Nanopoulos, A.N. Papadopoulos, Y. Theodoridis, R-Trees:
|
||||
for Novel Software Technology, Nanjing University under Grant No. Theory and Applications: Theory and Applications, Springer Science & Business
|
||||
KFKT2025B54, the Science and Technology Planning Foundation of Media, 2006.
|
||||
Xuzhou City under Grant No. KC22052, the Opening Foundation of [24] N. Cui, D. Wang, H. Zhu, J. Li, J. Xu, X. Yang, Enabling verifiable and secure
|
||||
range query in multi-user setting under cloud environments, IEEE Trans. Knowl.
|
||||
Guangxi Key Laboratory of Cryptography and Information Security,
|
||||
Data Eng. 36 (12) (2024) 8148–8163.
|
||||
Guilin University of Electronic Technology under Grant GCIS202114, [25] Q. Liu, S. Wu, S. Pei, J. Wu, T. Peng, G. Wang, Secure and efficient multi-
|
||||
the Postgraduate Research & Practice Innovation Program of Jiangsu attribute range queries based on comparable inner product encoding, in: Proc.
|
||||
Normal University under Grant 2024XKT2579, and the University- of CNS, 2018, pp. 1–9.
|
||||
Industry Collaborative Education Program of China under Grant No. [26] Y. Zhang, B. Wang, Z. Zhao, Secure k-NN query with multiple keys based on
|
||||
random projection forests, IEEE Internet Things J. 11 (9) (2023) 15205–15218.
|
||||
202101374001. All authors have read and approved the final version
|
||||
[27] S. Wu, Q. Li, G. Li, D. Yuan, X. Yuan, C. Wang, ServeDB: Secure, verifiable,
|
||||
of the manuscript. and efficient range queries on outsourced database, in: Proc. of ICDE, 2019, pp.
|
||||
626–637.
|
||||
Data availability [28] H.-I. Kim, H.-J. Kim, J.-W. Chang, A secure kNN query processing algorithm
|
||||
using homomorphic encryption on outsourced database, Data Knowl. Eng. 123
|
||||
(2019) 101602.
|
||||
Data will be made available on request. [29] A. Liu, K. Zhengy, L. Liz, G. Liu, L. Zhao, X. Zhou, Efficient secure similarity
|
||||
computation on encrypted trajectory data, in: Proc. of ICDE, 2015, pp. 66–77.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
[30] P. Williams, R. Sion, B. Carbunar, Building castles out of mud: practical access pattern privacy and correctness on untrusted storage, in: Proc. of CCS, 2008, pp. 139–148.
[31] M.S. Islam, M. Kuzu, M. Kantarcioglu, Access pattern disclosure on searchable encryption: ramification, attack and mitigation, in: Proc. of NDSS, vol. 20, 2012, p. 12.
[32] A. Bowyer, Computing dirichlet tessellations, Comput. J. 24 (2) (1981) 162–166.
[33] D.F. Watson, Computing the n-dimensional delaunay tessellation with application to voronoi polytopes, Comput. J. 24 (2) (1981) 167–172.
[34] J. Liu, J. Yang, L. Xiong, J. Pei, Secure skyline queries on cloud platform, in: Proc. of ICDE, 2017, pp. 633–644.
[35] A.C.-C. Yao, How to generate and exchange secrets, in: Proc. of SFCS, 1986, pp. 162–167.

Yining Jia received his B.Sc. in Computer Science and Technology in 2023 from Nanjing Forestry University, China. Currently, he is pursuing the M.Sc. degree in the School of Artificial Intelligence and Computer Science at Jiangsu Normal University, China. His research interests include data privacy, query processing, information security.

Yali Liu received her Ph.D. in 2014 from Nanjing University of Aeronautics and Astronautics, China. She is a senior member of China Computer Federation (CCF). She has been a Research Scientist at Nanyang Technological University, Singapore. She is currently a Professor in the School of Artificial Intelligence and Computer Science at Jiangsu Normal University, China. Her research interests include information security, authentication and privacy-preserving technology, blockchain security and privacy, vehicular ad hoc networks, cryptographic algorithms and protocols and their applications in Internet of things and mobile communication.

Congai Zeng received her M.Sc. in Electronic Information in 2024 from Jiangsu Normal University, China. Currently, she is pursuing the Ph.D. degree in the Faculty of Information Technology at Beijing University of Technology, China. Her research interests include Internet of Vehicles security and privacy.

Xujie Ding received his B.Sc. in Software Engineering in 2023 from Jiangsu Normal University, China. Currently, he is pursuing the M.Sc. degree in the School of Artificial Intelligence and Computer Science at Jiangsu Normal University, China. His research interests include privacy preservation and secure data sharing technology in smart healthcare.

Jianting Ning received his Ph.D. in 2016 from Shanghai Jiao Tong University, China. He has been a Research Scientist at the School of Computing and Information Systems, Singapore Management University, and a Research Fellow at the National University of Singapore. His research interests include applied cryptography and information security. He is currently a Professor with the School of Cyber Science and Engineering, Wuhan University, China, and with Faculty of Data Science, City University of Macau, China. He has published papers in major conferences/journals, such as ACM CCS, NDSS, ASIACRYPT, ESORICS, ACSAC, IEEE Transactions on Information Forensics and Security, and IEEE Transactions on Dependable and Secure Computing.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,654 @@
|
||||
Journal of Systems Architecture 160 (2025) 103347
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Eliminating duplicate writes of logging via no-logging flash translation layer in SSDs

Zhenghao Yin a, Yajuan Du a,∗, Yi Fan a, Sam H. Noh b

a Wuhan University of Technology, Wuhan, 430070, Hubei Province, China
b Virginia Tech, Blacksburg, 24061-0326, VA, USA
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO

Keywords: Flash memory; Transaction; Flash translation layer; Duplicate writes

ABSTRACT

With the development of high-density flash memory techniques, SSDs have achieved high performance and large capacity. Databases often use logging to ensure transactional atomicity of data updates. However, it introduces duplicate writes because of multi-versioning, which significantly weakens the performance and endurance of SSDs. This is also often considered as the main reason for slow response of databases. This paper proposes a novel flash translation layer (FTL) for SSDs, which we refer to as NoLgn-FTL, to reduce the overhead of logging-induced duplicate writes by exploiting the inherent multi-version feature of flash memories. Specifically, during a transaction, NoLgn-FTL retains the old data as valid and establishes the mapping between the new physical addresses and the old physical addresses. Thus, the database can easily roll back to the old-version data to maintain system consistency when a power failure occurs. To evaluate NoLgn-FTL, we implement it within FEMU and modify the SQLite database and the file system to make them compatible with the extended abstractions provided by NoLgn-FTL. Experimental results show that, in normal synchronization mode, NoLgn-FTL can reduce SSD writes by 20% and improve database performance by 15% on average.
|
||||
|
||||
|
||||
|
||||
1. Introduction

Solid-state drives (SSDs) have been widely adopted in database systems due to their high performance. Databases employ logging-based methods, such as write-ahead logging (WAL) and rollback journals, to ensure the transactional atomicity of multiple data updates. In these methods, data is first written to persistent logs before updating the original data, which induces duplicate writes [1]. For SSDs, duplicate writes occur in the following manner. First, the updated data and metadata are written into log files in flash memory. Then, due to the inherent out-of-place update nature of the SSD [2], the updated data is written into new flash pages rather than overwriting the original ones [3]. Thus, one user data write induces two SSD internal writes onto two different flash pages, increasing extra program/erase (P/E) cycles. This reduces SSD lifespan and degrades overall performance by consuming write throughput.

To address the issue of SSD duplicate writes in logging-based databases, researchers have proposed data remapping methods. These methods aim to convert logs directly into new data by modifying the mapping between logical pages (LPs) and physical pages (PPs) in flash memory [4,5]. However, dealing with the inconsistency of logging and data LPs is challenging during power failures.

To investigate the performance of database logging in SSD, this paper first performs a preliminary study to collect latency that happens during WAL-based data updates. We find that WAL takes a larger proportion of latency than regular data updates, especially for small data updates. This inspires us to design a direct update scheme to alleviate the overhead of duplicate writes by leveraging the out-of-place update feature of flash memory. This feature inherently maintains multiple versions of data upon updates, allowing the database to easily roll back to the previous version of the data in the event of a power failure or system crash, ensuring data consistency without the need for explicit logging.

This paper proposes a no-logging flash translation layer (NoLgn-FTL) by reusing old flash data pages. The key idea is to keep the mapping information of old data during transactions, eliminating the need for separate log writes. We establish a mapping table between new and old physical addresses (called a P2P table) in the RAM of the flash controller. Meanwhile, the old physical address is written into the out-of-band area of new flash pages, providing a backup of the mapping information. In this way, uncommitted transactions can be rolled back to the old data version upon power failure, thus maintaining consistency. We implement NoLgn-FTL within FEMU and
|
||||
|
||||
|
||||
∗ Corresponding author.
|
||||
E-mail address: dyj@whut.edu.cn (Y. Du).
|
||||
|
||||
https://doi.org/10.1016/j.sysarc.2025.103347
|
||||
Received 31 October 2024; Received in revised form 15 December 2024; Accepted 18 January 2025
|
||||
Available online 25 January 2025
|
||||
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
|
||||
Z. Yin et al. Journal of Systems Architecture 160 (2025) 103347
|
||||
|
||||
|
||||
evaluate it with the SQLite database. Experimental results show that, The write overhead incurred by WAL cannot be overlooked com-
|
||||
in normal synchronization mode, NoLgn-FTL can reduce SSD writes by pared to directly updating the page. Multiple update operations may
|
||||
20% and improve database performance by 15% on average, compared be performed on the same data page in the buffer, but during a
|
||||
to existing methods. Our paper makes the following contributions. checkpoint, the storage engine writes the latest data page to a database
|
||||
file. Fig. 2 illustrates the storage engine layer writing process. In the
|
||||
• We conduct a preliminary study that reveals the significant la- example, two concurrent transactions, Transaction1 and Transaction2,
|
||||
tency impact of logging, compared to pure data updates in modify the database. Transaction1 updates A and B with values 2
|
||||
databases, motivating the need for a more efficient approach to and 4, while Transaction2 updates A and C with values 3 and 7.
|
||||
handling duplicate writes. During the first step of the write merging process, the modifications
|
||||
• We propose a novel SSD FTL, called NoLgn-FTL, which fully made by both transactions are recorded in the WAL file. The WAL file
|
||||
utilizes the out-of-place update nature of flash memory to largely maintains separate regions for each transaction, capturing the updated
|
||||
remove duplicate writes caused by database logging. page identifiers and their corresponding values. Consequently, the WAL
|
||||
• We modify SQLite and integrate NoLgn-FTL in the FEMU simula- file contains two distinct entries: one for Transaction1, documenting
|
||||
tor. We verify the efficiency of NoLgn-FTL in reducing duplicate the updates to pages A(2) and B(4), and another for Transaction2,
|
||||
writes and improving database performance through extensive recording the updates to pages A(3) and C(7). In the second step, the
|
||||
experiments. changes recorded in the WAL file are applied to the database during the
|
||||
checkpointing process. As both transactions modify page A, the WAL
|
||||
The rest of this paper is organized as follows. Section 2 introduces
|
||||
mechanism merges these updates into a single write operation. The
|
||||
the basics of SSDs and logging methods as well as the motivation of
|
||||
WAL mechanism consolidates the updates and writes the final value
|
||||
this paper. Section 3 presents the design of NoLgn-FTL. Section 4 shows
|
||||
of page A(3) to the database file. A contains the merged value of 3,
|
||||
the experimental setup and evaluation results of NoLgn-FTL. Section 5
|
||||
while B and C hold 4 and 7.
|
||||
reviews existing work, and Section 6 concludes this paper.
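The write-merging example of Fig. 2 (Transaction1 updating A and B, Transaction2 updating A and C) can be traced with a minimal Python sketch. The function name `checkpoint` and the list-of-tuples WAL layout are illustrative assumptions, not SQLite's actual on-disk format; the sketch also assumes Transaction2 is logged after Transaction1, which matches the final value A(3) given in the text.

```python
def checkpoint(wal_entries):
    """Apply WAL entries in log order; the last logged value of a page wins."""
    merged = {}
    for _txn_id, updates in wal_entries:
        for page, value in updates:
            merged[page] = value
    return merged


# Hypothetical WAL contents mirroring the two transactions in the example.
wal = [
    ("Transaction1", [("A", 2), ("B", 4)]),
    ("Transaction2", [("A", 3), ("C", 7)]),
]

db_pages = checkpoint(wal)                       # pages written to the database file
wal_page_writes = sum(len(u) for _, u in wal)    # page versions written to the WAL
db_page_writes = len(db_pages)                   # page writes at checkpoint time
```

In this toy run the WAL absorbs four page versions and the checkpoint writes three more pages, so three logical pages cost seven page writes in total. This is the duplicate-write overhead the paper targets.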
|
||||
|
||||
2.3. Existing solutions
|
||||
2. Background and motivation
|
||||
Existing works propose to exploit data remapping to eliminate
|
||||
This section begins by introducing the basics of SSDs, with a focus
|
||||
duplicate writes in SSDs [8–10]. The key design is not to remove the
|
||||
on logging methods. Then, we present existing remapping-based meth-
|
||||
out-of-place data update but to directly remap the WAL file to the
|
||||
ods. Finally, we present the preliminary study as the motivation for this
|
||||
new-version data, as shown in Fig. 1b.
|
||||
paper.
|
||||
However, address remapping can lead to mapping inconsistency.
|
||||
Flash pages are divided into a data area for storing user data and
|
||||
2.1. Basics of SSD an OOB area for maintaining metadata. The OOB area contains the
|
||||
physical-to-logical (P2L) mappings, which are crucial for maintaining
|
||||
Flash memory utilizes a flash translation layer (FTL) to store and data consistency during garbage collection and database recovery.
|
||||
manage a logical-to-physical address translation, called L2P mapping. During garbage collection, the P2L mappings enable quick identifica-
|
||||
This mapping is often stored in the SRAM internal to the SSD to achieve tion of the logical address corresponding to a physical address, which
|
||||
high access performance. Meanwhile, the logical address is also stored accelerates the update of L2P mappings during data migration. During
|
||||
in the out-of-band (OOB) area of physical flash pages. Upon a data recovery upon a system crash, the FTL can reconstruct the lost L2P
|
||||
update request, the FTL first stores the new data in new flash pages and mapping table using the P2L mapping stored within the page.
|
||||
invalidates the old flash pages. Meanwhile, the L2P mapping is directed Without remapping, the P2L mappings in the OOB area directly
|
||||
to the new physical page addresses, and the requested logical addresses correspond to the LPN in the L2P mapping table. However, mapping
|
||||
are also stored in the OOB areas as the new flash pages are written. The inconsistencies may arise after remapping because remapping opera-
|
||||
invalidated old pages are reclaimed during garbage collection (GC). tions do not simultaneously update the related P2L mappings in the
|
||||
As shown in Fig. 1a, when data with physical addresses P1, P2, and OOB area.
|
||||
P3 need to be updated, new data would eventually be stored in new
|
||||
physical pages P1′, P2′, and P3′. (Note L𝑖 and P𝑖 in the figure represent 2.4. Preliminary study and motivation
|
||||
the logical address and physical addresses).
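The out-of-place update behaviour described in Section 2.1, together with the use of the OOB logical address to rebuild the L2P table, can be illustrated with a small Python sketch. The class `SimpleFTL` and its fields are hypothetical names for this illustration only; garbage collection, blocks, and real flash geometry are deliberately not modeled.

```python
class SimpleFTL:
    """Toy page-level FTL: every write goes to a fresh physical page."""

    def __init__(self, num_pages):
        self.l2p = {}                       # logical page -> physical page (kept in RAM)
        self.oob = {}                       # physical page -> logical page (copy kept in OOB)
        self.valid = [False] * num_pages    # validity flag per physical page
        self.next_free = 0                  # next unwritten physical page

    def write(self, lpn):
        if self.next_free >= len(self.valid):
            raise RuntimeError("no free pages; GC is not modeled in this sketch")
        ppn = self.next_free
        self.next_free += 1
        self.oob[ppn] = lpn                 # logical address stored alongside the data
        old = self.l2p.get(lpn)
        if old is not None:
            self.valid[old] = False         # old version is invalidated, not overwritten
        self.l2p[lpn] = ppn                 # L2P now points at the new page
        self.valid[ppn] = True
        return ppn

    def rebuild_l2p(self):
        """Recover L2P by scanning OOB areas in write order; later pages win."""
        rebuilt = {}
        for ppn in range(self.next_free):
            rebuilt[self.oob[ppn]] = ppn
        return rebuilt


ftl = SimpleFTL(num_pages=8)
for lpn in (1, 2, 3):       # initial data lands in P0..P2 (P1..P3 in the paper's figure)
    ftl.write(lpn)
for lpn in (1, 2, 3):       # updates land in fresh pages, old pages become invalid
    ftl.write(lpn)
```

After the second round of writes the old pages still physically hold the previous data. They are only marked invalid, which is the multi-version property that NoLgn-FTL later exploits.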
|
||||
To investigate the performance of database transactions, we conduct
|
||||
2.2. Write ahead logging preliminary experiments using the FEMU simulator [11], which is
|
||||
discussed in more detail in Section 4.
|
||||
Relational databases are typically run in rollback mode or write- We run the SQLite database, perform 1 million overwrite operations
|
||||
ahead log mode in order to support atomic execution of transactions [1, for each fixed value size, and collect the transaction latency under four
|
||||
6,7]. New updates are first written in a dedicated log, and the data value sizes. In Fig. 3, the 𝑥-axis represents the transaction value size and
|
||||
is kept consistent by rolling back or forwarding to the log. How- the 𝑦-axis represents the percentage of the time spent on WAL writes,
|
||||
ever, using logs often generates write amplification, affecting database WAL synchronization, data writes, and data synchronization.
|
||||
performance. Write-ahead logging (WAL) serves as an example. A From Fig. 3, we observe that WAL (WAL write and WAL synchro-
|
||||
WAL-based transaction update includes three steps: WAL writing, WAL nization) takes up a significant portion of the total transaction latency.
|
||||
synchronization, and database writing, as shown in Fig. 1a. First, when Compared to the data (data write and data synchronization) operations,
|
||||
a transaction is initiated, the new data are written into the page cache the proportion is significantly higher for small value sizes, while for the
|
||||
of WAL files (Step 1). Upon transaction commit, the WAL files are 16 KB size, the two are comparable.
|
||||
physically written to flash memory (WAL synchronization) (Step 2). Two main factors contribute to this phenomenon. Firstly, WAL
|
||||
Finally, the database data is updated during system checkpointing. As introduces additional overhead by writing an extra frame header for
|
||||
this checkpoint is performed at the database software level, WAL data each transaction. This header contains essential recovery information
|
||||
cannot be directly moved into the database data. Thus, the WAL file is and is stored alongside the normal data. Consequently, the relative
|
||||
read again into the page cache (Step 3) and written into flash memory overhead of the frame header becomes more significant for smaller
|
||||
upon database synchronization (Step 4). Duplicated writes introduced transactions. Secondly, although WAL consolidates multiple updates to
|
||||
by WAL are detrimental to flash memory endurance and performance. the same data pages into a single write operation during checkpointing,
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 1. Existing write-ahead logging schemes in SSDs.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 3. Transaction latency distribution in SQLite database.
|
||||
Fig. 2. Multi-version pages in the WAL.
|
||||
|
||||
|
||||
|
||||
|
||||
the logging mechanism still necessitates storing multiple versions of the same data in log files. It results in increased storage requirements, particularly affecting smaller transactions with frequent updates on the same page, as the overhead of maintaining multiple versions becomes more significant relative to the size of the transactions.

This paper proposes a novel approach by directly updating data and leveraging the inherent multi-version characteristic of flash memory. Shifting the focus of transaction support to flash can reduce the reliance on logs and frequent file synchronization operations in the database. This leads to faster application response times as it reduces the need for excessive logging and synchronization.

3. The proposed NoLgn-FTL

We first introduce the overview of the whole system flow using a no-logging flash translation layer, which, hereafter, we simply refer to as NoLgn-FTL. Then, we delve into the design details of NoLgn-FTL, including old page information storage, transaction process, garbage collection (GC), and data recovery. Without loss of generality, the SQL database is used in discussing the use of NoLgn-FTL. Finally, we analyze and discuss the overhead associated with NoLgn-FTL.

3.1. Overview

We propose NoLgn-FTL, a novel approach that optimizes both software and hardware architectures to efficiently manage transactions and data version control at the FTL layer, thereby avoiding the overhead of logs in databases. At the core of NoLgn-FTL is the novel FTL, where transaction information is utilized to perform mapping conversion of logical and physical addresses in the L2P and P2P tables only when data is written, minimizing overhead. However, the use of NoLgn-FTL starts at the database layer where the transaction information is attached to write requests. The file system layer also plays a crucial role by providing transaction-related interfaces and transmitting necessary transactional metadata.

Fig. 4 shows the overall workflow with an example of transactional data update on three pages in L1, L2, and L3. The process is divided into three key stages: transaction delivery, transaction persistence, and GC. These stages can be further subdivided into six steps.

First, the database assigns transaction flags to each transaction (① in Fig. 4) to indicate the completion status of the transaction. Then, a transaction ID is added to the original transactional data request (②). To retain transaction flags and IDs, we design new interfaces in the file system (③).
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 4. Overview of NoLgn-FTL.
|
||||
|
||||
|
||||
In the second stage, which occurs within the SSDs, the flash controller identifies transaction data by transaction flags and IDs. Data and transaction information are persisted, obtaining their corresponding physical addresses. The old addresses and transaction information are written in the OOB area of the corresponding flash pages, as well as in the P2P table in DRAM (④). The old pages remain valid in this step but will be invalidated only after the transaction is committed (⑤). As transactions are continuously executed, a large amount of invalid data accumulates in the flash memory. The GC process (⑥) reclaims the invalid data. The collaboration between the database, file system, and flash controller in NoLgn-FTL ensures data consistency and integrity throughout the transactional data update process.

The modified file system interfaces play a crucial role in preserving the necessary transaction metadata. The design of NoLgn-FTL in the above-mentioned three main stages will be presented in Sections 3.2, 3.3, and 3.4.

3.2. Metadata management in transaction delivery

In the transaction delivery process, we introduce additional metadata to facilitate the implementation of the no-logging scheme. This metadata is passed along with the transactional data requests to ensure proper handling and management of transactions throughout the system.

In the FTL, we establish a physical-to-physical (P2P) table that stores the mapping between new and old physical pages (i.e., their old version). In detail, one entry in the P2P table includes the transaction ID, the physical page number (PPN) of the new page and the PPN of the corresponding old page. To ensure persistent P2P mappings, the PPNs of the old pages are also stored in the OOB area of the new flash pages. The primary purposes of the P2P table are twofold: firstly, to facilitate the management of transactional information by the underlying FTL, and secondly, to enhance the performance during GC and transaction operations. Note that locating old pages can be accelerated by using the P2P table, thereby avoiding frequent access on flash pages to the OOB area. This table does not need to be written to flash memory and can be recovered through a full scan even after a sudden power failure, thus avoiding frequent writes of transaction information to flash memory.

Furthermore, transaction information, including transaction IDs and flags, is stored in the OOB area of new flash pages. In detail, flags S, M, and E represent the starting page, the middle pages, and the end page of a transaction, respectively. In the implementation of transaction flags, since we are only concerned whether the transaction has ended, we use only one bit to mark the transaction’s completion. By storing transaction information alongside the corresponding pages, the progress and state of transactions can be more effectively tracked, enabling data recovery in case of unexpected failures or interruptions. Database recovery will be explained in Section 3.5.

In addition to transaction information, one extra bit, referred to as the lock bit, is used to indicate the block lock state. The lock bit value ‘1’ signifies that valid old pages exist in the current block, while ‘0’ indicates the block is stale and can be reclaimed during GC. By embedding the lock bit within the FTL, blocks containing valid old pages and normal blocks can be efficiently distinguished, allowing for GC optimization. The GC process under NoLgn-FTL will be presented in Section 3.4.
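The metadata described in Section 3.2 (P2P entries, OOB transaction information, and the block lock bit) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' FEMU implementation: the names `P2PEntry`, `NoLoggingSketch`, and `PAGES_PER_BLOCK` are hypothetical, the lock bit is derived from the P2P table rather than stored as a bitmap, and the commit and rollback handling follows only the high-level behaviour stated in the overview (old pages stay valid until commit; uncommitted updates fall back to the old version).

```python
from dataclasses import dataclass

PAGES_PER_BLOCK = 4  # hypothetical geometry, chosen only for this sketch


@dataclass
class P2PEntry:
    txn_id: int    # transaction that produced the new page
    new_ppn: int   # physical page holding the new version
    old_ppn: int   # physical page holding the retained old version


class NoLoggingSketch:
    def __init__(self):
        self.l2p = {}         # logical page -> current physical page
        self.p2p = []         # RAM-only table of P2PEntry
        self.oob = {}         # new_ppn -> (lpn, old_ppn, txn_id, end_flag)
        self.valid = set()    # physical pages currently valid
        self.next_free = 0

    def locked_blocks(self):
        """Blocks whose lock bit would be '1': they hold a retained old page."""
        return {e.old_ppn // PAGES_PER_BLOCK for e in self.p2p}

    def write(self, lpn, txn_id=None, is_end=False):
        new_ppn = self.next_free
        self.next_free += 1
        old_ppn = self.l2p.get(lpn)
        self.oob[new_ppn] = (lpn, old_ppn, txn_id, is_end)  # backup of the mapping
        self.l2p[lpn] = new_ppn
        self.valid.add(new_ppn)
        if old_ppn is not None:
            if txn_id is None:
                self.valid.discard(old_ppn)   # ordinary write: invalidate at once
            else:
                self.p2p.append(P2PEntry(txn_id, new_ppn, old_ppn))  # keep old page valid
        return new_ppn

    def _take(self, txn_id):
        mine = [e for e in self.p2p if e.txn_id == txn_id]
        self.p2p = [e for e in self.p2p if e.txn_id != txn_id]
        return mine

    def commit(self, txn_id):
        for e in self._take(txn_id):
            self.valid.discard(e.old_ppn)     # old version invalidated only now

    def rollback(self, txn_id):
        for e in reversed(self._take(txn_id)):
            lpn = self.oob[e.new_ppn][0]
            self.l2p[lpn] = e.old_ppn         # point back at the retained old version
            self.valid.discard(e.new_ppn)


def run_transaction(commit):
    ftl = NoLoggingSketch()
    for lpn in (1, 2, 3):                     # initial data in pages 0..2
        ftl.write(lpn)
    ftl.write(1, txn_id=7)
    ftl.write(2, txn_id=7)
    ftl.write(3, txn_id=7, is_end=True)       # end flag on the last page
    locked_during = ftl.locked_blocks()
    ftl.commit(7) if commit else ftl.rollback(7)
    return ftl, locked_during


committed, locked_during = run_transaction(commit=True)
rolled_back, _ = run_transaction(commit=False)
```

The point of the sketch is that no separate log write appears anywhere: the only extra state is one P2P entry per updated page in RAM plus a few OOB fields written together with the data.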
|
||||
|
||||
|
||||
Z. Yin et al. Journal of Systems Architecture 160 (2025) 103347
|
||||
|
||||
|
||||
3.3. Transaction persistence in NoLgn-FTL

To ensure transaction persistence, the transaction needs to do the following during its write and commit process. During transaction writing, NoLgn-FTL first looks up the original L2P table to find the old PPN corresponding to the requested logical addresses. As shown in Fig. 4, the old PPNs are P1, P2, and P3 for the requested L1, L2, and L3, respectively. Then, the updated data are written into the new pages P1′, P2′, and P3′, respectively. At the same time, transaction information and the old PPN are written into the OOB area of these new pages. Finally, NoLgn-FTL stores the mapping entry of P1, P2, and P3 into the P2P table. Different from the original flash write, the old page remains valid. Meanwhile, the lock state of the block containing valid old pages is set to ‘1’.

During transaction commit, NoLgn-FTL first searches the P2P table to find old valid pages and then invalidates them. Then, the lock state of the block containing these old valid pages is set to ‘0’. Finally, the corresponding entries in the P2P table are deleted.

3.4. Garbage collection with NoLgn-FTL

GC in NoLgn-FTL requires handling valid old pages temporarily generated during transaction processing. Selecting a victim block for GC involves several steps to ensure data integrity and efficient space reclamation.

When selecting a victim block for GC, the first step is to check the block’s lock state. If the lock state is ‘1’, valid old pages still exist within the block, and therefore, the block cannot be reclaimed. At this time, the next victim block in the queue is selected until the selected block’s lock state is ‘0’. Then, whether there is a transaction page in the block must be checked. As the transaction information and old PPN are stored in the OOB area of the new valid pages, GC in NoLgn-FTL deals with them differently depending on the transaction state. That is, before the transaction is committed, GC will migrate these valid pages together with the OOB area. However, after a commit has occurred, GC only migrates valid page data, removing the extra metadata of NoLgn-FTL that resides in the OOB area.

3.5. Database recovery with NoLgn-FTL

In the event of a power-off or system crash, data stored in the flash controller’s RAM is lost, and only the OOB area of flash pages can be used for system recovery. One solution is to recover to the consistent states in the latest checkpoint, which requires periodically storing checkpoints. The other solution involves a full flash scan to rebuild mappings, as shown in Step 1 of Fig. 5. Physical pages and their OOB area would be read one by one (Step 2). For pages that do not have transaction information in the OOB area, NoLgn-FTL can directly recover the L2P table of PPNs based on the LPNs in their OOB area. Otherwise, NoLgn-FTL decides whether or not to recover old-version pages according to the transaction information. NoLgn-FTL would first obtain pages with the same transaction ID. If the page with the end flag bit can be found, these pages would be directly put into the L2P table together with their LPNs (Step 3). Otherwise, if all pages have the flag bit ‘0’, which indicates that the current transaction is not committed, the old-version pages would be first read out (Step 4), and only the L2P mappings of old-version pages would then be put into the L2P table.

Fig. 5. Recovery with NoLgn-FTL.

3.6. Discussion and overhead analysis

Compared to existing logging methods that store extra logs for each transaction, the use of NoLgn-FTL allows normal data updates without the need for additional logging. The overhead of NoLgn-FTL is due to the storage of extra metadata, including the P2P table, transaction information, and the block lock state.

P2P Table Storage and Overhead: The P2P table is stored in the RAM of the flash controller. The number of entries in the P2P table depends on the number of concurrent transactions. In our experiment, the table contains 10 000 entries. Each P2P entry takes 12 bytes, including a 4-byte transaction ID and 4 bytes each for the new page PPN and the old page PPN. The total size of the P2P table is about 120 KB. The DRAM size is usually around 1/1024 of the SSD capacity. For an SSD with a 1 TB capacity, the DRAM size will be 1 GB, and the P2P table will be 0.12 MB, which is only 0.012% of the DRAM size and is negligible. The block lock state is stored in the metadata of data blocks as a bitmap, with each block requiring only 1 bit, which is insignificant in terms of overhead. This lock bit is loaded into the SSD’s DRAM during startup.

Transaction Information Storage in OOB Area: Transaction information is stored in the OOB area of flash pages. NoLgn-FTL uses 4 bytes for old PPNs and 4 bytes for transaction information (comprising the transaction ID and 1 bit for transaction flag). In current flash chips, the ratio of the OOB area size to the data area size is about 1/8 [12]. Therefore, the OOB area has enough space to store transaction information.

4. Evaluation

In this section, we present a comprehensive evaluation of NoLgn-FTL, using an SQLite and Ext4 combination as a case study. We first describe the experimental setup. Then, we present the sqlite-bench experimental results, focusing on two key aspects: flash write and database performance. We also investigate the impact of NoLgn-FTL on GC. Furthermore, we show the performance of real-world workloads with the YCSB and TPC-C benchmarks.

4.1. Experimental setup

NoLgn-FTL is implemented on FEMU [13–15], a QEMU-based NVMe SSD emulator. The host system kernel of FEMU is Linux 5.15, and the file system is Ext4. To ensure a representative and consistent setup, the simulated SSD has a 16 GB logical capacity, with 1024 pages per flash block and a 4 KB page size. The flash latency for read, write, and erase operations is 50 μs, 500 μs, and 5 ms, respectively [16]. To ensure the GC (Garbage Collection) mechanism is appropriately triggered during our experiments, we conducted 4 million 4 KB write operations on the SSD in each test. This setup guarantees that GC operations occur as part of the evaluation.

For the logging database, we make use of SQLite. We make necessary modifications to the Linux kernel to receive and process transaction information from the SQLite database. To enable SQLite to transmit transaction information to the kernel, we utilize the ioctl system call to change database write, commit, and abort operations into write, commit, and abort commands. As SQLite does not automatically generate unique transaction IDs for each transaction, the transaction IDs are generated in the kernel after each transaction is committed. Upon receiving the written information from SQLite, the kernel first assigns flags to the requested transaction pages. This enables the kernel to keep track of the transaction status and perform necessary operations accordingly. Approximately 150 lines of code were modified in SQLite, around 100 lines in the file system, and about 300 lines in FEMU.

Hereafter, NoLgn-FTL will refer to the entire SQLite-Ext4-SSD system stack modified to ensure the seamless integration and functionality of NoLgn-FTL within the existing software and hardware stack. The newly introduced commands, which are based on the ioctl system call, are as follows.

write(page p, tid t, flag f). This command adds a transaction ID (tid), 𝑡, and a transaction flag, 𝑓, to the original write operation. It is the beginning of a transaction and corresponds to Step 4 in Fig. 4. The inclusion of the transaction ID and flag enables the FTL to track and manage the transaction.

commit(tid t). This command with the parameter of transaction ID tid t is sent to NoLgn-FTL along with the original fsync command in the
|
||||
|
||||
|
||||
Linux kernel. It indicates the successful completion of a transaction and aligns with Step 5 in Fig. 4. Upon receiving this command, NoLgn-FTL finalizes the transaction and ensures the durability of the associated data.

abort(tid t). This command is invoked to terminate ongoing transactions before committing transaction 𝑡. It indicates a rollback operation, reverting the data pages to their previous versions, akin to the data recovery process for uncommitted transactions as mentioned in Section 3.5.

We compare NoLgn-FTL with Base-WAL, the original SQLite, which uses the native logging scheme, and SW-WAL [4], which reduces duplicate writes by SSD remapping as shown in Fig. 1a. For each transaction size, the database runs separately, but these transactions share the same SSD storage. It is important to consider that in real-world scenarios, particularly in mobile environments, the characteristics of write requests can significantly impact the performance of storage systems. SQLite is a lightweight, embedded database commonly used in mobile devices for local data storage, making it highly relevant to our analysis. Studies have shown that approximately 90% of write requests in Android applications, such as Facebook and Twitter, are related to SQLite databases and journal files. In environments like these, the data items stored in the database are typically small, often below 4 KB. These small data items, such as individual records or key–value pairs, are frequently written to the storage medium in the form of random write operations. These operations usually target data blocks ranging from 64B to 4 KB, and such small writes often involve high interaction with the underlying file system, such as EXT4, which is commonly used in Android devices [17,18]. Therefore, we set different transaction sizes from 256B to 16 KB in the experiment to observe their impact on performance.

We conduct experiments in both the FULL and NORMAL synchronous modes of the database. In FULL mode, synchronization is triggered after each transaction is committed. This forces all transaction data to be written into SSDs, thus providing the highest atomicity and durability. Conversely, in NORMAL mode, synchronization is not triggered immediately after the transaction is committed. Typically, transactions are synchronized into SSDs only when a certain number of frames (including transaction heads and data) are accumulated. Note that NoLgn-FTL has no explicit WAL synchronization operation. In NORMAL mode, we manually control the frequency of commit in NoLgn-FTL to keep consistent with the synchronization operation of the other two existing methods. In NoLgn-FTL, a synchronization operation will be triggered every 1000 data pages.

4.2. Results of flash page writes

We used sqlite-bench with 200 thousand overwrite operations to observe the effect of NoLgn-FTL on flash memory page writes. Fig. 6 shows the normalized number of writes in flash memory compared to Base-WAL under two synchronization modes. In NORMAL mode, SW-WAL reduces writes by 35% compared to Base-WAL, as it eliminates extra writes caused by out-of-place updates through WAL file remapping. On average, NoLgn-FTL reduces 55% and 20% of the flash page writes compared to Base-WAL and SW-WAL, respectively. The superior performance of NoLgn-FTL is due to its elimination of WAL writes and WAL synchronization, resulting in a greater reduction of writes compared to SW-WAL. Specifically, there are two reasons for NoLgn-FTL’s write reduction. First, as WAL has to write an extra log header, WAL write involves more data than normal data write. Second, since synchronization does not happen immediately after each transaction, in NORMAL mode, updates onto the same page are serviced from the cache. NoLgn-FTL combines several updates into a single update, thereby reducing writes. However, this combination cannot be realized in SW-WAL as it uses different LPNs for data updates and WAL writes.

In FULL mode, NoLgn-FTL reduces flash page writes by 35% and 2% compared to Base-WAL and SW-WAL, respectively. Both methods show reductions in page writes compared with Base-WAL, similar to the NORMAL mode. However, the enhancement brought by NoLgn-FTL is less than that of the NORMAL mode. As each transaction is forcibly synchronized to flash memory after committing, there is no chance for NoLgn-FTL to combine updates on the same page. The reduction from log header writes is limited. Thus, in this mode, NoLgn-FTL behaves similarly to SW-WAL.

4.3. Results of database performance

We used sqlite-bench to observe SQLite performance. Fig. 7 shows the normalized throughput results of SQLite under the three compared methods. In NORMAL mode, NoLgn-FTL achieves an average performance improvement of 51% and 15% against Base-WAL and SW-WAL, respectively. NoLgn-FTL performs particularly better compared to SW-WAL for small-sized transactions, due to the reasons described earlier.

In FULL mode, we observe that NoLgn-FTL outperforms Base-WAL and SW-WAL by an average of 26% and 4%, respectively. This performance improvement is primarily due to the reduction in the number of writes achieved by NoLgn-FTL. Meanwhile, we find that both SW-WAL and NoLgn-FTL demonstrate a gradual performance improvement as the transaction size increases. This is because, for large-size transactions, Base-WAL takes up more latency to write flash pages and GC. Since SW-WAL and NoLgn-FTL reduce the number of data writes, this degradation is mitigated. Even in this situation, the performance of SW-WAL is still inferior to that of NoLgn-FTL, as it maintains head information that consumes data write latency.
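The write, commit, and abort commands of Section 4.1, together with the persistence steps of Section 3.3, can be illustrated with a toy in-memory model. This is a sketch under stated assumptions rather than the FEMU implementation: the class name `ToyNoLoggingFTL` and its fields are hypothetical, every transactional write is assumed to update each LPN at most once per transaction, and the lock bit is derived from the P2P table instead of being stored as a separate bitmap.

```python
class ToyNoLoggingFTL:
    """In-memory model of the write/commit/abort path (illustrative only)."""

    def __init__(self, pages_per_block=4):
        self.pages_per_block = pages_per_block
        self.l2p = {}        # LPN -> current PPN
        self.p2p = {}        # new PPN -> (tid, old PPN)
        self.oob = {}        # PPN -> metadata stored alongside the page
        self.valid = set()   # PPNs whose contents are still valid
        self.next_ppn = 0

    def _block(self, ppn):
        return ppn // self.pages_per_block

    def lock_bit(self, block):
        # '1' while an uncommitted transaction keeps a valid old page in this block.
        return int(any(self._block(old) == block for _, old in self.p2p.values()))

    def write(self, lpn, tid, flag=0):
        """write(page, tid, flag): out-of-place update that keeps the old page valid."""
        old = self.l2p.get(lpn)
        new = self.next_ppn
        self.next_ppn += 1
        self.oob[new] = {"lpn": lpn, "tid": tid, "flag": flag, "old_ppn": old}
        self.valid.add(new)
        self.l2p[lpn] = new
        if old is not None:
            self.p2p[new] = (tid, old)
        return new

    def commit(self, tid):
        """commit(tid): invalidate the old versions and delete the P2P entries."""
        for new, (t, old) in list(self.p2p.items()):
            if t == tid:
                self.valid.discard(old)
                del self.p2p[new]

    def abort(self, tid):
        """abort(tid): point the L2P table back at the old versions."""
        for new, (t, old) in list(self.p2p.items()):
            if t == tid:
                self.l2p[self.oob[new]["lpn"]] = old
                self.valid.discard(new)
                del self.p2p[new]
```

In this model, updating an LPN inside a transaction leaves both versions valid and reports lock bit ‘1’ for the block holding the old page; `commit` invalidates the old page and releases the lock, while `abort` restores the old mapping.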
|
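The victim selection rule of Section 3.4 (skip blocks whose lock bit is ‘1’, then treat transaction pages according to the transaction state) might be sketched as follows. The function names, the dictionary-based OOB record, and the `committed_tids` set used to tell committed from uncommitted transactions are assumptions made for illustration; the text does not specify how the FTL tracks the transaction state.

```python
def select_victim(candidate_queue, lock_bits):
    """Return the first candidate block whose lock bit is '0', or None if all are locked."""
    for block in candidate_queue:
        if lock_bits.get(block, 0) == 0:
            return block
    return None


def oob_after_migration(oob, committed_tids):
    """Metadata to write alongside a valid page that GC migrates."""
    tid = oob.get("tid")
    if tid is not None and tid not in committed_tids:
        return dict(oob)            # uncommitted: keep transaction info and old PPN
    return {"lpn": oob["lpn"]}      # committed or ordinary page: strip the extra metadata
```

With candidates `[3, 5, 2]` and block 3 locked, the sketch skips block 3 and selects block 5. The extra migrations caused by such skips are what Section 4.4 measures.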
||||
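The full-scan recovery of Section 3.5 can be sketched in the same style. The function name `rebuild_l2p` and the record layout are hypothetical, and the sketch assumes the scan yields only pages that were valid at crash time (it does not order multiple stale versions of the same LPN, which a real FTL would have to resolve).

```python
from collections import defaultdict


def rebuild_l2p(scanned_pages):
    """scanned_pages: iterable of (ppn, oob) pairs read during the full flash scan."""
    l2p = {}
    by_tid = defaultdict(list)
    for ppn, oob in scanned_pages:
        if oob.get("tid") is None:
            l2p[oob["lpn"]] = ppn                     # no transaction info: map LPN directly
        else:
            by_tid[oob["tid"]].append((ppn, oob))     # group pages by transaction ID
    for pages in by_tid.values():
        committed = any(oob["flag"] == 1 for _, oob in pages)
        for ppn, oob in pages:
            if committed:
                l2p[oob["lpn"]] = ppn                 # end flag found: keep the new versions
            elif oob["old_ppn"] is not None:
                l2p[oob["lpn"]] = oob["old_ppn"]      # uncommitted: fall back to the old version
    return l2p
```

A transaction whose pages include one with the end flag set is restored to its new pages, while a transaction with only flag ‘0’ pages is rolled back to the old PPNs recorded in the OOB area.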
|
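The overhead figures quoted in Section 3.6 can be checked with a few lines of arithmetic. The sketch assumes decimal units for the quoted "1 GB" of DRAM, and it estimates the lock bitmap from the 16 GB logical capacity of the emulated SSD only, so the physical block count (with over-provisioning) would be somewhat higher.

```python
# P2P table: 10 000 entries of 12 bytes (4-byte tid + two 4-byte PPNs)
P2P_ENTRIES = 10_000
ENTRY_BYTES = 4 + 4 + 4
p2p_bytes = P2P_ENTRIES * ENTRY_BYTES            # 120 000 bytes, about 120 KB

# DRAM of a 1 TB SSD is taken as 1 GB (decimal units assumed here)
DRAM_BYTES = 10**9
p2p_share_percent = 100 * p2p_bytes / DRAM_BYTES

# Lock bitmap for the emulated SSD: one bit per block
LOGICAL_CAPACITY = 16 * 2**30                    # 16 GB logical capacity
BLOCK_BYTES = 1024 * 4 * 2**10                   # 1024 pages of 4 KB each
blocks = LOGICAL_CAPACITY // BLOCK_BYTES
bitmap_bytes = blocks // 8

print(p2p_bytes, round(p2p_share_percent, 3), blocks, bitmap_bytes)
```

Under these assumptions the P2P table occupies 120 000 bytes, which is 0.012% of the DRAM, and the lock bitmap for 4096 blocks needs 512 bytes.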
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 6. Results of flash page writes.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 7. SQLite database performance.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 8. SQLite database latency.
|
||||
|
||||
|
||||
Besides, we also evaluated database latency data under different conditions. Fig. 8 illustrates the normalized latency results under the three compared methods: Base-WAL, SW-WAL, and NoLgn-FTL, in both NORMAL and FULL modes.

In NORMAL mode, NoLgn-FTL demonstrates the lowest latency among the three methods, achieving an average reduction of 34.4% compared to Base-WAL and 11% compared to SW-WAL. The latency advantage of NoLgn-FTL is particularly pronounced for small-sized transactions (e.g., 256B and 512B). This stems from its ability to reduce the number of writes and optimize metadata updates, minimizing the overhead typically associated with WAL. SW-WAL also shows improved latency compared to Base-WAL, with an average reduction of approximately 26.2%, thanks to its selective write strategy. However, its performance is still limited due to the additional overhead introduced by writing WAL, which becomes increasingly noticeable for smaller transactions. In FULL mode, the latency reduction achieved by NoLgn-FTL remains significant. Compared to Base-WAL, NoLgn-FTL reduces latency by an average of 16.4%, and compared to SW-WAL, the reduction is 3.7%. Both NoLgn-FTL and SW-WAL exhibit a gradual latency improvement as transaction size increases, which aligns with the behavior observed in throughput analysis. For larger transactions (e.g., 8 KB and 16 KB), Base-WAL experiences higher latency due to more extensive flash page writes and garbage collection overhead. In contrast, NoLgn-FTL and SW-WAL effectively mitigate this degradation by reducing the volume of writes.

4.4. Results of GC overhead

We used sqlite-bench to investigate the impact of block locking on GC performance by collecting write distribution results under different transaction sizes. Fig. 9 shows the write distribution of host requests, GC migration, and block locking (denoted as additional pages) under
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
different transaction sizes.

Fig. 9. Results of GC overhead. NoLgn-FTL would lock certain blocks, which would affect victim block selection and induce more migrations.

Two key observations can be made from Fig. 9. First, as transaction value size increases, the proportion of valid page migration involved in GC also increases, reaching a maximum of 62%. This trend can be attributed to the fact that larger transaction sizes require more frequent GC to accommodate new content. Second, the block locking mechanism impacts the number of valid pages migrated. The maximum proportion of additional migration pages due to block locking is 6%, with an average increase of 3.5% in total write pages. This impact is more significant for smaller transaction sizes, as updates may be concentrated in fewer blocks, preventing them from being chosen as optimal victim blocks for GC and leading to suboptimal data migration with more valid pages.

Despite the extra page writes caused by block locking, these overheads are acceptable compared to the significant reduction in duplicate writes achieved by NoLgn-FTL. The benefits of eliminating duplicate writes and improving overall write performance outweigh the relatively minor increase in valid page migrations caused by locking SSD blocks.

4.5. Results of YCSB and TPC-C performance

We also evaluate NoLgn-FTL using the YCSB benchmark to assess its performance under various realistic workloads. YCSB provides six core workloads as summarized in Table 1. To evaluate the long-term impact of NoLgn-FTL, we use TPC-C benchmarks with four warehouses [19] tested under different SSD free space conditions. TPC-C contains the following 5 transaction types: 43% new order, 43% payment, 4% delivery, 4% order status, 4% stock level. The number of database connections was set to 1 to avoid frequent aborts of update transactions.

Table 1
YCSB workloads.

Workload  Description
A         50% read and 50% update, Zipfian distribution
B         95% read and 5% update, Zipfian distribution
C         100% read, Zipfian distribution
D         95% read and 5% insert, latest read
E         95% scan and 5% insert, Zipfian distribution
F         50% read and 50% read–modify–write, Zipfian distribution

Fig. 10 shows the normalized throughput results of SQLite under YCSB benchmarks in NORMAL mode. On average, SW-WAL shows a 10% performance improvement over Base-WAL, while NoLgn-FTL achieves a 17% improvement. For write-intensive workloads (A and F), both SW-WAL and NoLgn-FTL exhibit significantly better performance than Base-WAL. However, for read-intensive workloads (B, D, and E), the improvements from both methods are not significant. This is mainly because both methods only enhance write performance and have little impact on read performance. Meanwhile, NoLgn-FTL still outperforms SW-WAL due to its greater write performance benefits. In the case of workload C, which only contains read requests, there are no obvious differences in the three methods. This is because the remap-based logging in SW-WAL and no-logging scheme in NoLgn-FTL are not triggered. The slight performance fluctuations arise from the random nature of read operations.

Fig. 10. SQLite performance on YCSB benchmarks.

Fig. 11 shows the performance of SQLite in terms of transactions per minute (tpmC) with different SSD free spaces. To obtain SSDs with varying free space, sufficient random overwrite iterations are performed before each of the experiments. TPC-C is a write-intensive workload with operations such as new orders, payment, and delivery, with an average of two pages updated per transaction. The results show that when SSD free space is 75%, the performance differences among the three methods are relatively small. However, as SSD free space decreases, the performance gap widens. Overall, NoLgn-FTL significantly outperforms Base-WAL and SW-WAL. On average, SW-WAL improves transaction throughput by 20% compared to Base-WAL, while NoLgn-FTL improves throughput by 38%. Notably, the performance gains of SW-WAL and NoLgn-FTL become more pronounced when SSD free space is limited. When SSD remaining space is 25%, NoLgn-FTL’s throughput is 81% higher than Base-WAL. This is mainly because when SSD free space is low, there may be a lack of free blocks, requiring frequent GC to accommodate new writes. Additionally, TPC-C’s transaction data size is relatively small, allowing multiple data items to be stored in a single page. Therefore, NoLgn-FTL effectively reduces write operations and GC needs by minimizing duplicated writes.

Fig. 11. SQLite performance on TPC-C benchmark.

5. Related works

Research addressing duplicate writes can be divided into two directions: optimization on atomic writes and remapping-based methods. An atomic write interface was initially proposed by Park et al. [20], which achieved atomicity for multi-page writes. Prabhakaran et al. [21] further introduced a transactional FTL called txFlash, which provides a transaction interface (WriteAtomic) to higher-level software. It provides isolation among multiple atomic write calls by ensuring that no conflicting writes are issued. Xu et al. [22] used the native off-site update feature of NAND flash memory to simulate copy-on-write technology and, at the same time, used NVM to store the FTL mapping table. However, these methods mostly supported atomicity for multi-page writes only. Kang et al. presented X-FTL [23], aiming to support general transactional atomicity, allowing data pages in a transaction
|
||||
|
||||
|
||||
to be written to flash at any time. However, it requires an additional X-L2P table and needs to persist it to flash upon transaction commit.

Address remapping is another extensively researched method that modifies the mapping table directly without performing actual writing. Wu et al. [24] proposed KVSSD, which exploits the FTL mapping mechanism to implement copy-free compaction of LSM trees, and it enables direct data allocation in flash memory for efficient garbage collection. However, address remapping may suffer from mapping inconsistencies due to the inability of flash memory to perform in-place updates. Hahn et al. [25] use the address remapping operation for file system defragmentation. However, after remapping, it uses file system logs to deal with mapping inconsistencies. The larger log size results in longer search times and increased memory consumption when performing read operations. As the number of remappings escalates, the log can become several hundred MB or even GB. Therefore, these methods may incur significant lookup overhead. Zhou et al. [26] address this issue by storing the new mapping table in Non-Volatile Memory, reducing lookup overhead. Besides, Wu et al. [4] proposed SW-WAL, a novel approach that emulates the maintenance of a mapping table by inscribing transaction information directly into the OOB area of flash pages. This strategy markedly reduces the footprint of the search table and concurrently boosts search efficiency. Additionally, to deal with the heavy query latency during WAL checkpointing, Yoon et al. [27] proposed Check-In to align journal logs to the FTL mapping unit. The FTL creates a checkpoint by remapping the journal logs to the checkpoint, effectively reducing the checkpointing overhead and WAL’s duplicate writes.

6. Conclusion

In this paper, we presented NoLgn-FTL to directly update the database in a no-logging way by reusing the old flash pages. NoLgn-FTL uses a P2P table and OOB area of flash pages to keep old page information and transaction information. Thus, systems can recover to a consistent state when a crash happens. As there is no need to store logging files in NoLgn-FTL, duplicate writes can be avoided. We implemented a prototype of NoLgn-FTL on the FEMU SSD simulator and integrated it with the SQLite database. The file system is modified to enable SQLite to use the provided interface and transfer transaction information. Experimental results demonstrate that NoLgn-FTL can significantly reduce writes to SSDs and improve the performance of SQLite, while still ensuring atomicity.

CRediT authorship contribution statement

Zhenghao Yin: Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation. Yajuan Du: Writing – review & editing, Supervision, Project administration, Conceptualization. Yi Fan: Visualization. Sam H. Noh: Writing – review & editing.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
|
||||
|
||||
|
||||
|
||||
|
||||
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding author.

References

[1] C. Mohan, D. Haderle, B. Lindsay, H. Pirahesh, P. Schwarz, ARIES: A transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging, ACM Trans. Database Syst. 17 (1) (1992) 94–162.
[2] S. Lee, D. Park, T. Chung, D. Lee, S. Park, H. Song, A log buffer-based flash translation layer using fully-associative sector translation, ACM Trans. Embed. Comput. Syst. (TECS) 6 (3) (2007) 18–es.
[3] L. Shi, J. Li, C.J. Xue, C. Yang, X. Zhou, ExLRU: A unified write buffer cache management for flash memory, in: Proceedings of the Ninth ACM International Conference on Embedded Software, 2011, pp. 339–348.
[4] Q. Wu, Y. Zhou, F. Wu, K. Wang, H. Lv, J. Wan, C. Xie, SW-WAL: Leveraging address remapping of SSDs to achieve single-write write-ahead logging, in: 2021 Design, Automation & Test in Europe Conference & Exhibition, DATE, 2021, pp. 802–807.
[5] F. Ni, X. Wu, W. Li, L. Wang, S. Jiang, Leveraging SSD’s flexible address mapping to accelerate data copy operations, in: 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2019, pp. 1051–1059.
[6] J. Coburn, T. Bunker, M. Schwarz, R. Gupta, S. Swanson, From ARIES to MARS: Transaction support for next-generation, solid-state drives, in: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, 2013, pp. 197–212.
[7] J. Arulraj, M. Perron, A. Pavlo, Write-behind logging, Proc. VLDB Endow. 10 (4) (2016) 337–348.
[8] K. Han, H. Kim, D. Shin, WAL-SSD: Address remapping-based write-ahead-logging solid-state disks, IEEE Trans. Comput. 69 (2) (2019) 260–273.
[9] G. Oh, C. Seo, R. Mayuram, Y.-S. Kee, S.-W. Lee, SHARE interface in flash storage for relational and NoSQL databases, in: Proceedings of the 2016 International Conference on Management of Data, 2016, pp. 343–354.
[10] Q. Wu, Y. Zhou, F. Wu, H. Jiang, J. Zhou, C. Xie, Understanding and exploiting the full potential of SSD address remapping, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 41 (11) (2022) 5112–5125.
[11] H. Li, M. Hao, M.H. Tong, S. Sundararaman, M. Bjørling, H.S. Gunawi, The CASE of FEMU: Cheap, accurate, scalable and extensible flash emulator, in: 16th USENIX Conference on File and Storage Technologies (FAST 18), 2018, pp. 83–90.
[12] Y. Zhou, F. Wu, Z. Lu, X. He, P. Huang, C. Xie, SCORE: A novel scheme to efficiently cache overlong ECCs in NAND flash memory, ACM Trans. Archit. Code Optim. (TACO) 15 (4) (2018) 1–25.
[13] L. Long, S. He, J. Shen, R. Liu, Z. Tan, C. Gao, D. Liu, K. Zhong, Y. Jiang, WA-Zone: Wear-aware zone management optimization for LSM-Tree on ZNS SSDs, ACM Trans. Archit. Code Optim. 21 (1) (2024) 1–23.
[14] D. Huang, D. Feng, Q. Liu, B. Ding, W. Zhao, X. Wei, W. Tong, SplitZNS: Towards an efficient LSM-tree on zoned namespace SSDs, ACM Trans. Archit. Code Optim. 20 (3) (2023) 1–26.
[15] S.-H. Kim, J. Shim, E. Lee, S. Jeong, I. Kang, J.-S. Kim, NVMeVirt: A versatile software-defined virtual NVMe device, in: 21st USENIX Conference on File and Storage Technologies (FAST 23), 2023, pp. 379–394.
[16] B.S. Kim, J. Choi, S.L. Min, Design tradeoffs for SSD reliability, in: 17th USENIX Conference on File and Storage Technologies (FAST 19), 2019, pp. 281–294.
[17] Z. Shen, Y. Shi, Z. Shao, Y. Guan, An efficient LSM-tree-based sqlite-like database engine for mobile devices, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 38 (9) (2018) 1635–1647.
[23] W.-H. Kang, S.-W. Lee, B. Moon, G.-H. Oh, C. Min, X-FTL: transactional FTL for SQLite databases, in: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, 2013, pp. 97–108.
[24] S.-M. Wu, K.-H. Lin, L.-P. Chang, KVSSD: Close integration of LSM trees and flash translation layer for write-efficient KV store, in: 2018 Design, Automation & Test in Europe Conference & Exhibition, DATE, IEEE, 2018, pp. 563–568.
[25] S.S. Hahn, S. Lee, C. Ji, L. Chang, I. Yee, L. Shi, C.J. Xue, J. Kim, Improving file system performance of mobile storage systems using a decoupled defragmenter, in: 2017 USENIX Annual Technical Conference (USENIX ATC 17), 2017, pp. 759–771.
[26] Y. Zhou, Q. Wu, F. Wu, H. Jiang, J. Zhou, C. Xie, Remap-SSD: Safely and efficiently exploiting SSD address remapping to eliminate duplicate writes, in: 19th USENIX Conference on File and Storage Technologies (FAST 21), 2021, pp. 187–202.
[27] J. Yoon, W.S. Jeong, W.W. Ro, Check-In: In-storage checkpointing for key-value store system leveraging flash-based SSDs, in: 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture, ISCA, 2020, pp. 693–706, http://dx.doi.org/10.1109/ISCA45697.2020.00063.

Zhenghao Yin received the BS degree in Computer Science from Wuhan University of Technology, Wuhan, China, in 2022, and is currently pursuing the MS degree in Computer Science, expected to graduate in 2025. His research interests include flash memory and database technologies.

Yajuan Du received the joint Ph.D. degrees from the City University of Hong Kong and the Huazhong University of Science and Technology, in December 2017 and February 2018, respectively. She is currently an Assistant Professor with the School of Computer Science and Technology, Wuhan University of Technology. Her research interests include optimizing access performance, data reliability, and persistency of flash memories and non-volatile memories.

Yi Fan received the BS degree in Computer Science from Wuhan University of Technology, Wuhan, China, in 2022, and is currently pursuing the MS degree in Computer Science, expected to graduate in 2025. His research interests include key–value databases and flash memory technologies.

Sam H. (Hyuk) Noh received his BE in Computer Engineering from Seoul National University in 1986 and his Ph.D. in Computer Science from the University of Maryland in 1993. He held a visiting faculty position at George Washington University (1993–1994) before joining Hongik University, where he was a professor in the School of Computer and Information Engineering until 2015. From 2001 to 2002, he was a visiting associate professor at UM IACS, University of Maryland. In 2015, Dr. Noh joined UNIST as a professor in the Department of Computer Science and Engineering. He became the inaugural Dean of the Graduate School of Artificial Intelligence and previously served as Dean of the School of Electrical and Computer Engineering (2016–2018). He has contributed to numerous conferences, serving
|
||||
as General Chair, Program Chair, or committee member
|
||||
[18] A. Mäkinen, Tracing Android applications for file system optimization.
|
||||
for events like ACM SOSP, USENIX FAST, ACM ASPLOS,
|
||||
[19] S.T. Leutenegger, D. Dias, A modeling study of the TPC-C benchmark, ACM
|
||||
and USENIX OSDI. He also chaired the ACM HotStorage
|
||||
Sigmod Rec. 22 (2) (1993) 22–31.
|
||||
Steering Committee and serves on the Steering Committees
|
||||
[20] S. Park, J.H. Yu, S.Y. Ohm, Atomic write FTL for robust flash file system, in:
|
||||
for USENIX FAST and IEEE NVMSA. Dr. Noh was Editor-
|
||||
Proceedings of the Ninth International Symposium on Consumer Electronics,
|
||||
in-Chief of ACM Transactions on Storage (2016–2022) and
|
||||
2005.(ISCE 2005), 2005, pp. 155–160.
|
||||
is now co-Editor-in-Chief of ACM Transactions on Computer
|
||||
[21] V. Prabhakaran, T.L. Rodeheffer, L. Zhou, Transactional flash, in: OSDI, Vol. 8,
|
||||
Systems. His research focuses on system software and storage
|
||||
2008.
|
||||
systems, emphasizing emerging memory technologies like
|
||||
[22] Y. Xu, Z. Hou, NVM-assisted non-redundant logging for Android systems, in:
|
||||
flash and persistent memory.
|
||||
2016 IEEE Trustcom/BigDataSE/ISPA, 2016, pp. 1427–1433.
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,595 @@
|
||||
Computer Standards & Interfaces 97 (2026) 104120
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Computer Standards & Interfaces
|
||||
journal homepage: www.elsevier.com/locate/csi
|
||||
|
||||
|
||||
|
||||
|
||||
Energy consumption assessment in embedded AI: Metrological
|
||||
improvements of benchmarks for edge devices
|
||||
Andrea Apicella (b), Pasquale Arpaia (a,∗), Luigi Capobianco (d), Francesco Caputo (a), Antonella Cioffi (d), Antonio Esposito (a), Francesco Isgrò (a), Rosanna Manzo (c), Nicola Moccaldi (a), Danilo Pau (e), Ettore Toscano (d)

(a) Dipartimento di Ingegneria Elettrica e delle Tecnologie dell’Informazione, Università degli Studi di Napoli Federico II, Naples, Italy
(b) Dipartimento di Ingegneria dell’Informazione ed Elettrica e Matematica applicata (DIEM), Università degli Studi di Salerno, Fisciano, Italy
(c) Dipartimento di Sanità Pubblica e Medicina Preventiva, Università degli Studi di Napoli Federico II, Naples, Italy
(d) Software Design Center, STMicroelectronics, Marcianise, Italy
(e) System Research and Applications, STMicroelectronics, Agrate Brianza, Italy
|
||||
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO

Keywords: Energy assessment; Embedded AI; Tiny-ML; Uncertainty analysis; Edge device benchmark

ABSTRACT

This manuscript proposes a new method to improve the MLCommons protocol for measuring power consumption on Microcontroller Units (MCUs) when running edge Artificial Intelligence (AI). In particular, the proposed approach (i) selectively measures the power consumption attributable to the inferences (namely, the predictions performed by Artificial Neural Networks, ANNs), excluding the impact of other operations, (ii) accurately identifies the time window for acquiring the current samples thanks to the simultaneous measurement of power consumption and inference duration, and (iii) precisely synchronizes the measurement windows and the inferences. The method is validated on three use cases: (i) Rockchip RV1106, a neural MCU that implements ANNs in hardware through a dedicated neural processing unit, (ii) STM32 H7, and (iii) STM32 U5, high-performance and ultra-low-power general-purpose microcontrollers, respectively. The proposed method returns higher power consumption for the two devices with respect to the MLCommons approach. This result is compatible with an improvement of selectivity and accuracy. Furthermore, the method reduces measurement uncertainty on the Rockchip RV1106 and STM32 boards by factors of 6 and 12, respectively.
|
||||
|
||||
|
||||
|
||||
1. Introduction

The rapid expansion of Internet of Things (IoT) devices has ushered in a new era of connected intelligence at the edge, where data processing, low latency, and real-time decision making can take place directly at the edge [1]. These IoT devices cover a variety of applications, from smart home sensors [2], to industrial automation [3], and health monitoring systems [4], where low-latency responses and energy efficiency are essential.

Extending computation to more peripheral network nodes enhances all key aspects of edge computing, including energy efficiency, carbon footprint reduction, security, latency, privacy, offline functionality, and data management costs [5]. However, deploying intelligence at the end nodes requires careful consideration of the IoT devices' inherent limitations, such as memory and computational resources impacting time performance, and energy constraints. For Microcontroller Units (MCUs), widely used in IoT, this is particularly true. Many IoT applications, such as autonomous driving [6], demand low-latency responses to be effectively reactive. Moreover, several IoT devices often operate under very limited power sources. Promising energy-efficient strategies aim to minimize consumption. For instance, index modulation [7,8] is a transmission technique that conveys additional information through the indices of available resources such as antennas, subcarriers, or time slots, and it can significantly reduce energy usage while maintaining data throughput. Nevertheless, even with advanced optimization strategies, the repetitive and frequent processing required by many applications can rapidly deplete power resources, thereby limiting device lifetime.

In recent years, Machine Learning (ML) methods [9], particularly Artificial Neural Networks (ANNs), have been increasingly deployed on IoT devices to enhance localized data processing capabilities and reduce
|
||||
|
||||
|
||||
∗ Corresponding author.
|
||||
E-mail addresses: andapicella@unisa.it (A. Apicella), pasquale.arpaia@unina.it (P. Arpaia), luigi.capobianco@st.com (L. Capobianco),
|
||||
francesco.caputo3@unina.it (F. Caputo), antonella.cioffi@st.com (A. Cioffi), antonio.esposito9@unina.it (A. Esposito), francesco.isgro@unina.it (F. Isgrò),
|
||||
rosanna.manzo@unina.it (R. Manzo), nicola.moccaldi@unina.it (N. Moccaldi), danilo.pau@st.com (D. Pau), ettore.toscano@st.com (E. Toscano).
|
||||
|
||||
https://doi.org/10.1016/j.csi.2025.104120
|
||||
Received 10 January 2025; Received in revised form 2 September 2025; Accepted 21 December 2025
|
||||
Available online 22 December 2025
|
||||
0920-5489/© 2025 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
|
||||
A. Apicella et al. Computer Standards & Interfaces 97 (2026) 104120
|
||||
|
||||
|
||||
dependency on cloud infrastructures [10,11]. It is common to refer to these devices as tiny devices [12] and embedded ML as tiny machine learning or tiny ML [5].

Consequently, assessing the inference time provided by the IoT hardware for a specific ANN model is crucial to ensure that the embedded system can satisfy real-time processing requirements. In this context, inference refers to the process of an ANN generating outputs based on its trained model parameters and given inputs.

Therefore, tailored energy consumption metrics are essential to ensure the alignment between the ANN implementation and the energy constraints of the targeted IoT application. To this aim, Neural MCUs are new edge devices embedding ANN accelerators, specifically designed to manage the trade-off between reliability, latency, cost, and power consumption [13]. Therefore, adopting standardized metrics and procedures is essential for assessing the actual performance gains achieved by neural MCUs in the context of embedded AI. Although several frameworks and tools have been proposed to facilitate the benchmarking of tinyML models [14–16], no standardized metrics and procedures are currently defined.

Among the proposed benchmarking protocols, MLPerf Tiny Benchmark (MLPTB) [17] is developed by the MLCommons Association, the largest and most authoritative community aimed at improving the industrialization standardization process of machine learning [18]. MLPTB provides protocols and AI components, namely datasets and pre-trained ML models. These can act as metrological references when implemented on different hardware to assess their performance, such as the inference time and the power consumption under real-world conditions. However, the MLPTB protocols exhibit some metrological weaknesses: (i) the assessments of both time performance and energy consumption are carried out without computing the measurement uncertainty, (ii) the energy consumption analysis is performed based on an approximate estimate of the average inference duration, and (iii) the impact on consumption caused by inferences is not isolated with respect to other processes.

In this paper, a new method is proposed and validated to improve MLPTB protocols to measure power consumption in MCUs running ANNs, in a rigorous metrological framework. Specifically, in Section 2 the MLPTB framework is reported, then the proposed method is presented in Section 3. Experiments and results are reported in Section 4 and discussed in Section 5.

2. Background

Several frameworks and tools have been introduced to support the benchmarking of tinyML models [14–16]. Among the available benchmarking protocols, the MLPerf Tiny Benchmark (MLPTB) [17], developed by the MLCommons Association [18], emerges as a key initiative.

MLPTB proposes two modalities of assessment: (i) Performance and (ii) Energy. The former measures latency (inferences per second, IPS) and accuracy (the ratio of correct predictions to all predictions, as a percentage) through a direct USB connection between a Device Under Test (DUT) and a host computer, while the latter measures energy (micro-joules per inference). In the remainder of this section, the energy configuration mode is detailed, as it represents the central focus of this study. In the energy configuration mode (Fig. 1), an Energy Monitor is proposed to supply power to the DUT while measuring the current consumption. An Input/Output Manager is introduced to interface the Host Computer with the DUT and to serve as an electrical-isolation proxy. Furthermore, MLPTB requires level shifters to adapt the power supply at the input of the DUT (not reported in Fig. 1 to simplify the schematic as they are not essential to the discussion).

Fig. 1. Energy measurement setup proposed by MLPerf Tiny Benchmark [17,19]. The DUT is powered by the Energy Monitor. The IO manager serves as an electrical-isolation proxy.

In addition to defining assessment procedures, MLPTB provides some firmware and software [19] for ML tasks on the DUT. In particular, the provided firmware to be loaded onto the DUT ensures the following functionalities: (i) sending a trigger signal, (ii) enabling UART communication, (iii) generating and feeding random input data to the ANN, (iv) performing inferences, and (v) printing the prediction results. The software includes a graphical user interface that can be run on the Host Computer, allowing the initiation of the measurement and monitoring of input data. It is important to emphasize that in phase (iii) random data are generated to feed the ANN. This operation, however, does not reflect real-world applications, where the network processes sensor data in real time. Although not an intrinsic part of ANN inference, MLPTB includes this step in the performance and energy measurements. Throughout this paper, phase (iii) is explicitly distinguished from phase (iv) (i.e., inference) and is referred to as the pre-inference phase.

The energy per inference (𝐸𝑖𝑛𝑓) is calculated using latency information determined in the Performance phase. Specifically, the IPS is determined by taking the median value across five experiments. In each experiment, input data is provided for a duration of at least 10 s, and the number of inferences is recorded via a direct connection between the Host Computer and the DUT. Given the IPS, 𝐸𝑖𝑛𝑓 is computed as:

$$E_{inf} = \frac{I_m \times V_n}{\tau \times IPS} \qquad (1)$$

where 𝑉𝑛 is the nominal voltage, 𝐼𝑚 is the current averaged over the fixed period 𝜏.

3. Proposed method

The MLCommons pre-inference phase generates random numbers as input to the ANN in order to perform inference (in addition to memory operations needed to provide the input to the network). However, random number generation is hardly reproducible across different devices under test, since both the libraries and the hardware resources available on the microcontrollers for random number generation vary. In contrast, the proposed work selectively excludes the pre-inference phase from the performance and energy measurements, ensuring greater reproducibility while also providing a closer adherence to the actual operation of the device in real-world scenarios. In the remainder of this section, the proposed method is described. In paragraph 3.1 the circuit solution for the joint measurement of time and energy consumption is described. In paragraph 3.2 the expected impact of the method on selectivity, accuracy, and uncertainty during the energy measurement is highlighted.
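As a reference point for the MLPTB baseline of Section 2, the following Python sketch computes the median IPS over repeated Performance runs and then divides the energy drawn during a window of length 𝜏 by the estimated number of inferences 𝜏 × IPS, which is the ratio of window energy to inference count that the MLPTB approach relies on. The function names, the sample values, and the way the window energy is formed (𝑉𝑛 times the mean current times 𝜏) are illustrative assumptions and are not taken from the MLPTB tooling.

```python
from statistics import median


def median_ips(inference_counts, durations_s):
    """Median inferences-per-second over repeated Performance-mode runs."""
    return median(n / t for n, t in zip(inference_counts, durations_s))


def mlptb_energy_per_inference(current_samples_a, v_nominal, tau_s, ips):
    """Window energy divided by the estimated number of inferences in the window."""
    i_mean = sum(current_samples_a) / len(current_samples_a)  # I_m, mean current over tau
    window_energy_j = v_nominal * i_mean * tau_s              # assumed: E = V_n * I_m * tau
    n_inferences = tau_s * ips                                # estimated, not counted
    return window_energy_j / n_inferences


# Illustrative numbers only; these are not measurements from the paper.
ips = median_ips([12250, 12270, 12240, 12260, 12255], [10.0] * 5)
e_inf = mlptb_energy_per_inference(
    [0.020, 0.022, 0.021], v_nominal=3.3, tau_s=10.0, ips=ips
)
print(f"IPS = {ips:.1f}, E_inf = {e_inf * 1e6:.2f} uJ")
```

Because the inference count is estimated from a separately measured IPS, any mismatch between the window length and an integer number of inferences ends up in the result, which is the weakness the proposed method targets.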
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 2. Proposed energy measurement setup. The Host Computer powers the DUT and an ammeter is connected in series along the power line on the DUT (e.g., an MCU).

3.1. Circuit diagram and measurement procedure

The proposed method utilizes an ammeter that does not need to power the DUT in order to measure the absorbed current. The ammeter is connected in series to the microprocessor on the MCU powered by the Host Computer through the USB port (Fig. 2). This approach allows the Host Computer to perform both latency and energy measurements simultaneously. Indeed, the firmware provided by MLPTB enables the DUT to update the Host Computer on the number of completed inferences through the USB connection. Instead of computing the energy per inference as the ratio between the total energy measured in a specific time window and the number of inferences (MLPTB method), the proposed method computes the energy for each inference without considering the impact of the pre-inference phase. This is obtained by modifying the firmware provided by MLPTB: the trigger is replaced by a logic signal (inference status) that goes high during an ongoing inference and returns low otherwise. The inference status signal output from the device under test is sampled by the Measurement Board (ammeter) in parallel with the current (Fig. 3(a)). Two vectors of synchronously sampled data (current and inference status signal) are sent to the Host Computer. The current samples are processed, and the energy consumption is calculated only when the inference status samples indicate a high logic signal. Additionally, before and after each inference, the DUT reads the values of the Clock and Reset Management Unit (CRMU) and transmits them to the Host Computer to determine the duration of the inference. Finally, the software on the Host Computer computes the mean value of 𝑁 inferences with associated uncertainty. In this work, 𝑁 is set to 100. Similar to the MLPTB, the proposed firmware runs as the sole program on the MCU, with fully sequential execution and no concurrency or interrupts. Furthermore, in the proposed method, the inference status signal is set high immediately after the pre-inference phase, and the CRMU is queried right before the inference execution. As soon as the inference completes, the CRMU is queried again, and finally the inference status is set low to signal the ammeter that the inference has finished. In Fig. 4, a flowchart describing the customized firmware behavior is reported.

3.2. Accuracy improvements

In the MLPTB, the number of inferences during the measurement time in energy mode is calculated using the IPS obtained from the previous latency measurement. This approach introduces accuracy issues because an estimator is used instead of the actual time of each inference. Furthermore, it is assumed with a non-negligible degree of approximation that the inferences are executed consecutively by the MCU, disregarding the impact of inter-inference operations that are still present. Finally, the delays in the transmission of the command for starting the measurement have a further impact on the accuracy, albeit to a very small extent. Specifically, this refers to the time taken by the CPU on the DUT to generate the trigger signal and by the Measurement Board to handle the interrupt triggered at its input pin (see Fig. 3).

In the proposed method, limiting the observation to a single inference at a time eliminates the approximation inherent in MLPTB, where the inference duration is estimated through the average of multiple successive inferences executed within a known time window. Specifically, the proposed method allows the exclusion of all energy contributions unrelated to the inference itself (e.g., data transfer operations to memory during the pre-inference phase). However, in the proposed method, the repetition of the measurement for each inference amplifies the impact of inaccuracies caused by the delay in transmitting the status signal. In contrast, the MLPTB approach mitigates this effect because the delay only occurs at the start of the measurement for multiple inferences. To address this issue, the inference duration (𝛥𝑡) measurement is also performed. In the firmware for the DUT, the onboard counter is read immediately before and after the inference execution. The measured 𝛥𝑡 is used to appropriately resize the current sample vector acquired while the inference status signal is active. The current sample vector is trimmed at both ends by a number of elements (𝑁𝑡𝑟𝑖𝑚), calculated as follows:

$$N_{trim} = \frac{f_c}{2}\left(\frac{N_{cs}}{f_c} - \Delta t\right) \qquad (2)$$

where 𝑓𝑐 is the sampling frequency of the Ammeter, 𝑁𝑐𝑠 is the number of current samples acquired when the inference status signal is high, and 𝛥𝑡 is the inference duration.

3.3. Uncertainty improvements

Two distinct phases should be addressed in the evaluation of uncertainty: (i) the inference time measurement, and (ii) the energy consumption assessment. In particular, an important source of uncertainty in MLPTB is due to the counting of inferences during the IPS measurement, which affects the inference time measurement and, consequently, also the energy consumption assessment. More specifically, the measurement window is not an integer multiple of the inference period; therefore, there is no synchronization between the end of the last inference and the end of the measurement window. This contribution can be modeled by a uniform random variable whose domain width is equal to the central value of the inference duration 𝛥𝑡𝑚, with a standard deviation 𝜎1𝑐𝑜𝑛𝑡 computed as:

$$\sigma_{1cont} = u_{t1} = \frac{\Delta t_m}{2\sqrt{3}} \qquad (3)$$

The uncertainty of the MLPTB method is assessed by assuming the median inference duration approximately equal to the mean. By contrast, in the proposed method the counting uncertainty is determined by the fact that the inference duration is not an integer multiple of the counter period (𝑇𝑐). Again, a random variable with uniform probability distribution effectively describes this aspect. The standard deviation 𝜎2𝑐𝑜𝑛𝑡 is computed as:

$$\sigma_{2cont} = u_{t2} = \frac{T_c}{2\sqrt{3}} \qquad (4)$$

Assuming that 𝛥𝑡𝑚 ≫ 𝑇𝑐, it follows 𝑢𝑡1 ≫ 𝑢𝑡2 and the proposed method improves the measurement uncertainty due to counting.

Then there is the uncertainty due to the variability of the duration time of the processes between the inferences (pre-inference phase). The proposed method is not affected by this source of uncertainty because it excludes from the energy measurement all the processes outside the inference. Finally, both methods are exposed to the uncertainty
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 3. Comparison between the block diagram of the proposed method (a) and ML Commons-Tiny approach (b) for energy consumption measurement. The
|
||||
added blocks and signals are reported in red. In the proposed method, the Device Under Test stops the power consumption computation after each inference.
|
||||
Differently, in the MLCommons-Tiny approach, the Host Computer stops the acquisition of current samples after a fixed time window, without distinguishing
|
||||
between pre-inference and inference phases. Furthermore, it computes the energy consumption (μJ per inference) based on the Inferences per Second measured
|
||||
in Performance mode (see Section 2). The Counter and the Time Calculator blocks are used for the measurement of the duration of each inference,
|
||||
while an Inference Status ADC minimizes the latency between the inference start and current sample consideration. (For interpretation of the references to color
|
||||
in this figure legend, the reader is referred to the web version of this article.)
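The order of operations in the customized firmware (Section 3.1 and Fig. 3(a)) can be mimicked on a host with the short Python sketch below. It is only an analogy: on the DUT the counter is the CRMU and the status flag is a digital output pin, whereas here `time.perf_counter_ns` and a list append stand in for them. All function and parameter names are hypothetical.

```python
import time


def run_inference_cycle(prepare_input, infer, set_status,
                        read_counter_ns=time.perf_counter_ns):
    """One cycle: pre-inference, then a timed inference flagged by the status signal."""
    x = prepare_input()         # pre-inference phase, excluded from time and energy
    set_status(True)            # inference status goes high
    t0 = read_counter_ns()      # counter read right before the inference
    y = infer(x)
    t1 = read_counter_ns()      # counter read right after the inference
    set_status(False)           # inference status returns low
    return y, (t1 - t0) * 1e-9  # output and inference duration in seconds


trace = []  # records the status transitions in order
results = [
    run_inference_cycle(
        prepare_input=lambda: [0.1, 0.2, 0.3],  # stand-in for input generation
        infer=lambda x: sum(x),                 # stand-in for the ANN
        set_status=trace.append,
    )
    for _ in range(3)
]
durations = [d for _, d in results]
mean_duration = sum(durations) / len(durations)
print(trace, mean_duration)
```

The counter reads sit inside the status-high interval, so the status-high window is always slightly longer than the measured duration; that difference is what the trimming of Eq. (2) removes.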
|
||||
|
||||
|
||||
according to the following formula [20]:
|
||||
$$u_c = \sqrt{u_A^2 + u_{B_1}^2 + u_{B_2}^2 + \cdots + u_{B_K}^2} \qquad (5)$$
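A minimal Python sketch of Eqs. (3) to (5) follows: the counting terms are the standard deviation of a uniform variable, and the combined standard uncertainty is the root sum of squares of the type A term and the type B terms. The numerical inputs are hypothetical; in particular, the 1 µs counter period is an assumed value, chosen only because it yields a counting term of about 289 ns, the figure quoted in Section 4.3.

```python
import math


def uniform_std(width):
    """Standard deviation of a uniform variable spanning `width`: width / (2*sqrt(3))."""
    return width / (2.0 * math.sqrt(3.0))


def combined_standard_uncertainty(u_a, u_b_terms):
    """Eq. (5): root sum of squares of the type A term and the K type B terms."""
    return math.sqrt(u_a ** 2 + sum(u ** 2 for u in u_b_terms))


delta_t_m = 0.5e-3  # hypothetical central inference duration, in seconds
t_c = 1.0e-6        # assumed counter period, in seconds

u_t1 = uniform_std(delta_t_m)  # MLPTB counting term, Eq. (3)
u_t2 = uniform_std(t_c)        # proposed-method counting term, Eq. (4)

# Hypothetical type A term (5 us) plus two type B terms: counting and a 1 ns residual.
u_c = combined_standard_uncertainty(u_a=5e-6, u_b_terms=[u_t2, 1e-9])
print(f"u_t1 = {u_t1:.3e} s, u_t2 = {u_t2:.3e} s, u_c = {u_c:.3e} s")
```

With these inputs the MLPTB counting term exceeds the proposed-method term by the ratio 𝛥𝑡𝑚/𝑇𝑐, which is the improvement argued for in Section 3.3.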
|
||||
|
||||
|
||||
4. Experiments and results
|
||||
|
||||
In this section, a comparison between the application of the pro-
|
||||
posed and MLPTB methods is presented. In paragraph 4.1 the ex-
|
||||
perimental procedure is described. The DUTs and the ammeter are
|
||||
presented in paragraph 4.2. Results are reported in paragraph 4.3.
|
||||
|
||||
4.1. Experimental procedure
|
||||
|
||||
The MLPTB method was implemented using two different circuit
|
||||
configurations for measuring inference duration and energy per infer-
|
||||
ence, as described in [17]. Instead, in the proposed method the two
|
||||
measures were realized with the same circuital solution shown in Fig. 2.
|
||||
The Firmware used for MLPTB measurement was modified to allow the
|
||||
measurement of the single inference as described in the paragraph 3.1.
|
||||
The four MLPerf benchmarks were retained: (i) Anomaly Detection, (ii)
|
||||
Keyword Spotting, (iii) Image Classification, (iv) Visual Wake Words.
|
||||
Each benchmark targets a specific use case and specifies a dataset, a
|
||||
model, and a quality target [17].
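The single-inference measurement enabled by the modified firmware (paragraphs 3.1 and 3.2) can be sketched in Python as follows. The sketch splits the synchronously sampled status vector into runs of high samples, trims each run at both ends by 𝑁𝑡𝑟𝑖𝑚 from Eq. (2), and integrates the remaining current samples. All names are hypothetical, and two choices are assumptions made for illustration: 𝑁𝑡𝑟𝑖𝑚 is rounded to the nearest integer, and power is obtained by multiplying the current by a constant supply voltage.

```python
def high_runs(status):
    """Yield (start, stop) index pairs for each contiguous run of high status samples."""
    start = None
    for k, s in enumerate(status):
        if s and start is None:
            start = k
        elif not s and start is not None:
            yield start, k
            start = None
    if start is not None:
        yield start, len(status)


def n_trim(f_c, n_cs, delta_t):
    """Eq. (2): samples to drop at each end of a status-high run (rounded, never negative)."""
    return max(0, round(0.5 * f_c * (n_cs / f_c - delta_t)))


def energy_per_inference(current_a, status, durations_s, f_c, v_supply):
    """Energy (J) of each inference from the trimmed current samples of its status-high run."""
    energies = []
    for (start, stop), delta_t in zip(high_runs(status), durations_s):
        k = n_trim(f_c, stop - start, delta_t)
        window = current_a[start + k:stop - k]
        energies.append(v_supply * sum(window) / f_c)  # sum(V * I * dt), dt = 1 / f_c
    return energies


# Illustrative data: two inferences, six status-high samples each, four of which are inference.
f_c = 100_000.0
status = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0]
current = [0.001, 0.001, 0.004, 0.010, 0.010, 0.010, 0.010, 0.004, 0.001,
           0.001, 0.004, 0.010, 0.010, 0.010, 0.010, 0.004, 0.001]
durations = [4 / f_c, 4 / f_c]  # as would be derived from the counter reads on the DUT
energies = energy_per_inference(current, status, durations, f_c, v_supply=3.3)
print(energies)
```

In this toy data the first and last sample of each run carry a lower current, standing in for the signalling delay around the inference; trimming one sample per side keeps only the four samples that belong to the inference.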
|
||||
|
||||
Fig. 4. Flow chart of the proposed Firmware. The pre-inference phase (in red) is excluded from both time (CRMU timestamp read) and energy assessment (‘‘Inference Status’’ digital signal setting and unsetting). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

of the stability of the DUT (jitter) and ammeter precision, as well as to the uncertainty of the signal transmission times between the devices involved in the measurement process. For the calculation of the measurement uncertainty, the combined standard uncertainty 𝑢𝑐 is adopted, where the contribution from the type A evaluation (𝑢𝐴) is integrated with the 𝐾 contributions from the type B evaluations (𝑢𝐵𝑘),

4.2. Experimental setup

Both methods are applied to three different MCUs: STMicroelectronics STM32-H7 (Clock Frequency = 280 MHz), STMicroelectronics STM32-U5 (Clock Frequency = 160 MHz), and Rockchip RV1106 (Clock Frequency = 1200 MHz). The STM32H7 and the STM32U5 are general-purpose microcontrollers, the former designed for high-performance applications and the latter for ultra-low-power operation, both produced by STMicroelectronics. These devices do not have any dedicated Neural Processing Unit (NPU) hardware for ANN computation, so this part is typically performed by firmware that runs on the main Central Processing Unit (CPU). The firmware is automatically deployed using ST EdgeAI Core Technology and compiled through the STM32CubeIDE [21] compiler, which implements all the tools needed to convert, optimize, and implement ANN models on the DUT.

The evaluation boards of the STMicroelectronics Nucleo-STM32H7 with STM32H7 microcontroller and B-U585I-IOT02A Discovery Kit with STM32U5 microcontroller were chosen for the experimental setup
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 5. Hardware components used in the experiments: (a) H7 board with STM32H7 MCU, (b) Luckfox Pico Pro Max with Rockchip RV1106 SoC, (c) B-U585I-IOT02A Discovery Kit with STM32U5 MCU, and (d) Power Profiler Kit II ammeter.
|
||||
|
||||
|
||||
(Figs. 5(a), 5(c)). They include a connector in series to the MCU’s power counter values returned by two consecutive CRMU readings. On each
|
||||
supply line allowing an ammeter to be inserted to assess the power board, 30 experiments were performed, each providing two latency
|
||||
consumption of the DUT under operating conditions. values. For each board, the mean value and type A uncertainty were
|
||||
The RV1106 is a System on Chip (SoC) produced by Rockchip Elec- computed. In the worst case, namely the Rockchip, the latency was
|
||||
tronics. This device has a dedicated NPU hardware, so the computation found to be 7 ± 4 CPU clock cycles (2 ± 1 for the other two boards),
|
||||
of ANN models are made by hardware, and the software shall only which corresponds to only a few nanoseconds. Tables 1, 2, and 3
|
||||
allocate necessary data into a dedicated memory area. While STM32 present the results of inference duration (𝛥𝑡) assessments conducted
|
||||
microcontrollers operate without an operating system, RV1106 requires using both the MLPTB and the proposed methods. The results are
|
||||
the use of an operating system given its CPU architecture. Ubuntu reported for the Rockchip RV1106, STM32H7, and STM32U5, respec-
|
||||
22.04 RT [22] was therefore installed to minimize execution timing tively, with varying ANN models. Concerning uncertainty computation,
|
||||
uncertainties. the MLPTB method does not provide strategies for calculating mea-
|
||||
The software is deployed using RKNN Toolkit compiler that im- surement uncertainty and, in this work, it was computed by referring
|
||||
plements all needed tools to convert, optimize, and implement ANN to the sole contribution of the counting inferences (Eq. (2)). In the
|
||||
models on the device. The evaluation board with Rockchip RV1106 proposed method, since the Clock and Reset Management Unit (CRMU)
|
||||
chosen for the experimental setup is the Luckfox Pico Pro Max (Fig. of the MCUs is employed for inference time measurement, the type
|
||||
5(b)). The ammeter is inserted between USB-C main supply and the A uncertainty is combined with type B contributions arising from
|
||||
SoC’s power supply line in order to assess the power consumption of counting uncertainty, system clock stability (jitter), and the response
|
||||
device under operative conditions. time required by the CRMU to be queried and to return a value.
|
||||
The measurement board used for the power assessment is the Power For all the considered microcontrollers, the type B contribution was
|
||||
Profiler Kit II (PPKII) produced by Nordic Semiconductor (Fig. 5(d)). found to be dominated by the counting uncertainty, computed using
|
||||
This device is composed by an ammeter and a 8-bits digital sampler formula (4), and equal to 289 ns. The jitter contribution is at least
|
||||
synchronized with the same time base. It can work into two different three orders of magnitude smaller at room temperature (between 20 ◦ C
|
||||
modes that affect the only ammeter component: and 30 ◦ C) [23–25]. Similarly, the uncertainty related to the CRMU
|
||||
response time, characterized in this work for all three microcontrollers,
|
||||
• Source Meter: With this mode, the internal ammeter is linked was found to be equal to 1 CPU clock cycle. In the worst case, i.e., con-
|
||||
to a power supply generator that can be used to provide the sidering the STM32U5 device with the lowest CPU clock frequency, this
|
||||
power supply to DUT. This mode was adopted for the MLPTB contribution was on the order of nanoseconds. Therefore, the overall
|
||||
implementation evaluated uncertainty corresponds to the joint contribution of type A
|
||||
• Ammeter Mode: With this mode, the instrument works as a pure and type B, with the latter coinciding with the counting uncertainty,
|
||||
ammeter and the power supply of DUT can be provided according to:
|
||||
externally. This mode was implemented in the proposed method
|
||||
application. $u_t = \sqrt{u_A^2 + u_B^2}$ (6)
|
||||
|
||||
For both modes, the device was metrologically characterized under To propagate the measurement uncertainty of the 𝛥𝑡 on the energy
|
||||
operating conditions of 20–30 °C (the same conditions used for all per inference (𝐸𝑖𝑛𝑓 ) measurement, a constant power 𝑃 is assumed
|
||||
experiments), exhibiting an uncertainty of less than 2%. during the inference time, obtaining the following propagation formula:
|
||||
|
||||
4.3. Results
|
||||
$E_{inf} = P\,\Delta t \;\Rightarrow\; u_e = P\,u_d$ (7)
|
||||
For the proposed method, a characterization of the CRMU query where 𝑢𝑒 is the energy per inference measurement uncertainty. With
|
||||
latency was carried out on all devices. A modified version of the same respect to the energy consumption estimation, an additional uncer-
|
||||
firmware used for the energy consumption assessment was employed. tainty source arises from the measuring instrument, i.e., the ammeter
|
||||
Specifically, an additional CRMU query was appended directly after employed. For both methods, an instrumental uncertainty of 2% was
|
||||
the preceding one, making it consecutive to the two already present. considered, after a metrological characterization performed under oper-
|
||||
The CRMU query latency was measured as the difference between the ational conditions at room temperature (between 20 ◦ C and 30 ◦ C). The
|
||||
|
||||
|
||||
A. Apicella et al. Computer Standards & Interfaces 97 (2026) 104120
|
||||
|
||||
|
||||
Table 1
|
||||
Comparison of the central value ($m_t$) and uncertainty$^{a}$ ($u_t$) of inference duration (in ms) assessed by the MLCommons and
proposed methods on the Rockchip RV1106 for different neural models.

Method     Visual Wake Words    Image Classification    Keyword Spotting    Anomaly Detection
           m_t      u_t         m_t      u_t            m_t      u_t        m_t      u_t
Proposed   0.820    0.006       0.415    0.012          0.400    0.008      0.558    0.033
MLPTB      0.815    0.235       0.414    0.120          0.371    0.107      0.350    0.101

a In MLPTB, the counting uncertainty was taken into account.
|
||||
|
||||
|
||||
Table 2
|
||||
Comparison of the central value ($m_t$) and uncertainty$^{a}$ ($u_t$) of inference duration (in ms) assessed by the MLCommons and
proposed methods on the STM32H7 microcontroller for different neural models.

Method     Visual Wake Words    Image Classification    Keyword Spotting    Anomaly Detection
           m_t      u_t         m_t      u_t            m_t      u_t        m_t      u_t
Proposed   29.656   0.003       49.941   0.001          14.860   0.001      1.690    0.002
MLPTB      29.600   8.545       51.900   14.982         15.400   4.446      1.800    0.520

a In MLPTB, the counting uncertainty was taken into account.
|
||||
|
||||
|
||||
Table 3
|
||||
Comparison of the central value ($m_t$) and uncertainty$^{a}$ ($u_t$) of inference duration (in ms) assessed by the MLCommons and
proposed methods on the STM32U5 microcontroller for different neural models.

Method     Visual Wake Words    Image Classification    Keyword Spotting    Anomaly Detection
           m_t      u_t         m_t      u_t            m_t      u_t        m_t      u_t
Proposed   78.447   0.002       133.280  0.002          48.060   0.001      4.910    0.002
MLPTB      71.600   20.669      128.200  37.008         38.600   11.143     4.800    1.386

a In MLPTB, the counting uncertainty was taken into account.
|
||||
|
||||
|
||||
Table 4
|
||||
Comparison of the central value ($m_t$) and uncertainty$^{a}$ ($u_e$) of energy (in μJ) assessed by the MLCommons and proposed methods
on the Rockchip RV1106 for different neural models.

Method     Visual Wake Words    Image Classification    Keyword Spotting    Anomaly Detection
           m_t      u_e         m_t      u_e            m_t      u_e        m_t      u_e
Proposed   380      13          193      15             165      9          222      11
MLPTB      373      108         183      53             159      46         148      43

a In MLPTB, the counting uncertainty was propagated into the energy measurements.
|
||||
|
||||
|
||||
Table 5
|
||||
Comparison of the central value ($m_t$) and uncertainty$^{a}$ ($u_e$) of energy (in μJ) assessed by the MLCommons and proposed methods
on the STM32H7 microcontroller for different neural models.

Method     Visual Wake Words    Image Classification    Keyword Spotting    Anomaly Detection
           m_t      u_e         m_t      u_e            m_t      u_e        m_t      u_e
Proposed   4386     88          7536     151            2202     44         236      6
MLPTB      3699     1068        6311     1822           1870     540        221      64

a In MLPTB, the counting uncertainty was propagated into the energy measurements.
|
||||
|
||||
|
||||
final uncertainty was thus obtained by applying the following formula: trends: for two networks, the measured consumption is higher with the
|
||||
proposed method, while for the other two networks it is higher with
|
||||
MLCommons. Regarding the uncertainty, the proposed method reduces
|
||||
$u_e = \sqrt{u_{t_p}^2 + u_s^2}$ (8)
|
||||
it by a factor of 12.
|
||||
where $u_{t_p}$ denotes the inference time measurement uncertainty $u_t$
|
||||
propagated through the functional relation used for energy computation 5. Discussion
|
||||
(see formula (7)), and $u_s$ represents the instrumental uncertainty of the
|
||||
ammeter. The measurement uncertainty obtained for the proposed The contrasting trends from energy assessment on STM32U5 pro-
|
||||
method appears for all tested devices to be very low compared to the vide an opportunity to discuss the relationship between the two meth-
|
||||
uncertainty of the MLPTB method. ods in terms of metrological accuracy. The MLCommons method ex-
|
||||
In Tables 4, 5, and 6 a comparison between results of energy per tracts a central Inference Per Second value based on five experiments,
|
||||
inference assessment by MLPTB and proposed methods are reported for whereas our method computes a central value as the mean over 100
|
||||
the three DUTs. On the Rockchip RV1106, the proposed method mea- acquisitions. Given the large uncertainty of the MLPTB method and
|
||||
sures an inference energy value that is, on average, 15% higher than the limited number of experiments, the calculated central value is
|
||||
that obtained with MLPTB, while improving the uncertainty by a factor unlikely to be a reliable estimator of the true value of the measured
|
||||
of 6. For the STM32H7, the inference energy assessment grows quantity [26]. The comparison of mean values obtained with the two
|
||||
by 16% while the uncertainty improves by a factor of 12. Notably, methods is limited by the large difference in their associated uncertain-
|
||||
the inference energy assessment on the STM32U5 shows contrasting ties. The less precise method exhibits an uncertainty up to two orders
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 6. Temporal diagram of current values acquired from MCU during ANN operations. Orange traces represent (a) the inference status signal in the proposed
|
||||
method and (b) the trigger signal in the MLPTB method. The windows used for energy consumption estimation are highlighted in light blue. Specifically, the
|
||||
proposed method (a) considers only the current samples acquired during each neural network inference phase, whereas the MLPTB method (b) also includes the
|
||||
energy contribution of pre-inference phases (light yellow window). (For interpretation of the references to color in this figure legend, the reader is referred to
|
||||
the web version of this article.)
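
The windowing idea described in the caption of Fig. 6 can be sketched in a few lines of Rust. This is only an illustrative sketch, not the authors' firmware or acquisition code: the function name `energy_per_inference`, the boolean status trace, the fixed supply voltage, and the rectangle-rule integration are all assumptions introduced here. Current samples are accumulated only while the inference status signal is high, so pre-inference phases do not contribute to the result.

```rust
/// Hypothetical helper: energy (in joules) of each inference window.
/// `current_a`  : sampled supply current in amperes
/// `inferring`  : inference status signal, one flag per current sample
/// `supply_v`   : supply voltage in volts (assumed constant here)
/// `dt_s`       : sampling period in seconds
fn energy_per_inference(
    current_a: &[f64],
    inferring: &[bool],
    supply_v: f64,
    dt_s: f64,
) -> Vec<f64> {
    let mut energies = Vec::new();
    let mut acc = 0.0;
    let mut in_window = false;
    for (i, on) in current_a.iter().zip(inferring.iter()) {
        if *on {
            // Rectangle rule: E += V * I * dt for samples inside the window.
            acc += supply_v * *i * dt_s;
            in_window = true;
        } else if in_window {
            // Falling edge of the status signal closes the current window.
            energies.push(acc);
            acc = 0.0;
            in_window = false;
        }
    }
    if in_window {
        // The trace ended while an inference was still running.
        energies.push(acc);
    }
    energies
}

fn main() {
    // Toy trace: two inferences separated by idle (pre-inference) samples.
    let current = [0.5, 1.0, 1.0, 0.5, 2.0, 0.5];
    let status = [false, true, true, false, true, false];
    let energies = energy_per_inference(&current, &status, 2.0, 0.5);
    println!("{:?}", energies);
}
```

Summing over all samples instead of only the gated ones would correspond to a window that also includes the pre-inference phase, which is the difference the caption points out.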
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 7. Comparison between the proposed method (orange) and MLPTB (green) in energy per inference assessment on the Rockchip RV1106 for the different models
|
||||
provided by MLCommons. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
|
||||
|
||||
|
||||
Table 6
|
||||
Comparison of the central value ($m_t$) and uncertainty$^{a}$ ($u_e$) of energy (in μJ) assessed by the MLCommons and proposed methods
on the STM32U5 microcontroller for different neural models.

Method     Visual Wake Words    Image Classification    Keyword Spotting    Anomaly Detection
           m_t      u_e         m_t      u_e            m_t      u_e        m_t      u_e
Proposed   2362     47          3249     65             1184     27         116      3
MLPTB      1921     556         3384     980            1004     291        121      35

a In MLPTB, the counting uncertainty was propagated into the energy measurements.
|
||||
|
||||
|
||||
of magnitude higher than the other, rendering direct statistical com- by low energy consumption) from the calculation (Fig. 6). This prevents
|
||||
parisons of the means largely insignificant. Observed differences may underestimation of the actual energy consumption, which may occur
|
||||
therefore primarily reflect the inherent variability of the less accurate when using the MLPTB method.
|
||||
method rather than genuine differences in the measured phenomenon. Finally, Figs. 7, 8, and 9 present the histograms of Energy
|
||||
However, it is important to note that the proposed method provides per Inference assessment with the two methods on Rockchip RV1106,
|
||||
greater selectivity by excluding the pre-inference phase (characterized STM32H7, and STM32U5, respectively. The orange bars (proposed
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 8. Comparison between the proposed method (orange) and MLPTB (green) in energy per inference assessment on the STM32H7 for the different models provided
|
||||
by MLCommons. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 9. Comparison between the proposed method (orange) and MLPTB (green) in energy per inference assessment on the STM32U5 for the different models provided
|
||||
by MLCommons. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
|
||||
|
||||
|
||||
method) are generally higher than the green bars (MLPTB). However, 6. Conclusions
|
||||
comparing the mean values measured by the two methods is challeng-
|
||||
ing due to the large uncertainty intervals (error bars) associated with A new method for assessing power consumption of edge devices
|
||||
MLPTB. Nevertheless, the differences in error bar lengths confirm the such as MCUs running ANNs is presented, claiming metrological im-
|
||||
improved precision of the proposed method. provements over the MLPerf Tiny Benchmark. Unlike MLPTB, the
|
||||
The metrological improvements introduced in this work have direct proposed method calculates the duration and energy consumption of
|
||||
each individual inference performed by the Device Under Test. Through
|
||||
consequences for the practical adoption of embedded AI. First, more
|
||||
an appropriate circuit and firmware design, the method measures only
|
||||
accurate and reproducible energy assessments enhance the reliability of
|
||||
the energy consumed by the inference, excluding other operations from
|
||||
benchmarking, enabling fair comparisons among devices and support-
|
||||
the computation. This approach not only enhances the selectivity and
|
||||
ing informed selection of hardware for battery-powered applications,
|
||||
accuracy of the measurement process but also reduces measurement
|
||||
where autonomy is a critical design constraint. Second, the improved uncertainty. Instead of counting the number of inferences over a fixed
|
||||
accuracy in energy characterization facilitates more precise sizing of interval, as MLPTB does, the proposed method counts the number of
|
||||
power supply components, which is essential for ensuring efficiency, ticks from the counter of the DUT during a single inference execution.
|
||||
stability, and cost-effectiveness in embedded deployments. Finally, the On an NPU-powered microcontroller, the proposed method improves
|
||||
refined timing characterization allows designers to better estimate measurement uncertainty by a factor of 6. In the case of two general-
|
||||
inference latency, a key parameter for real-time and safety-critical purpose microcontrollers (high-performance and ultra-low-power), the
|
||||
applications. measurement uncertainty improves by a factor of 12.
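
The uncertainty budget of formulas (6)–(8) can be illustrated with a short Rust sketch. It is a minimal example under stated assumptions, not the authors' analysis code: the function names are hypothetical, the power is taken as constant and estimated as energy divided by duration, and the 2% instrumental uncertainty is applied as a relative uncertainty on the energy value. The numbers in `main` are the STM32H7 Visual Wake Words entries of Tables 2 and 5 for the proposed method.

```rust
/// Combine two independent standard uncertainties in quadrature,
/// as in formulas (6) and (8).
fn combine_in_quadrature(u1: f64, u2: f64) -> f64 {
    (u1 * u1 + u2 * u2).sqrt()
}

/// Propagate a time uncertainty to energy under the constant-power
/// assumption of formula (7): u = P * u_time.
fn propagate_time_to_energy(power: f64, u_time: f64) -> f64 {
    power * u_time
}

/// Hypothetical helper: overall energy uncertainty (same unit as `energy`).
/// `rel_instr` is the relative instrumental uncertainty (0.02 for 2%),
/// assumed here to apply directly to the energy value.
fn energy_uncertainty(energy: f64, duration: f64, u_t: f64, rel_instr: f64) -> f64 {
    let power = energy / duration; // constant power assumed over the inference
    let u_tp = propagate_time_to_energy(power, u_t);
    let u_s = rel_instr * energy;
    combine_in_quadrature(u_tp, u_s)
}

fn main() {
    // Energy in uJ, duration and its uncertainty in ms (Tables 2 and 5).
    let u_e = energy_uncertainty(4386.0, 29.656, 0.003, 0.02);
    println!("u_e = {:.1} uJ", u_e);
}
```

With these inputs the instrumental term dominates the budget and the result rounds to 88 μJ, which agrees with the corresponding entry of Table 5.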
|
||||
|
||||
|
||||
|
||||
|
||||
CRediT authorship contribution statement [6] M. Cunneen, M. Mullins, F. Murphy, Autonomous vehicles and embedded
|
||||
artificial intelligence: The challenges of framing machine driving decisions, Appl.
|
||||
Artif. Intell. 33 (8) (2019) 706–731.
|
||||
Andrea Apicella: Writing – review & editing, Methodology, Con-
|
||||
[7] J. Li, S. Dang, M. Wen, Q. Li, Y. Chen, Y. Huang, W. Shang, Index modulation
|
||||
ceptualization. Pasquale Arpaia: Writing – review & editing, Method- multiple access for 6G communications: Principles, applications, and challenges,
|
||||
ology, Conceptualization. Luigi Capobianco: Writing – review & edit- IEEE Netw. 37 (1) (2023) 52–60.
|
||||
ing, Methodology, Conceptualization. Francesco Caputo: Writing – re- [8] M. Wen, B. Zheng, K.J. Kim, M. Di Renzo, T.A. Tsiftsis, K.-C. Chen, N.
|
||||
view & editing, Writing – original draft, Visualization, Validation, Soft- Al-Dhahir, A survey on spatial modulation in emerging wireless systems: Re-
|
||||
search progresses and applications, IEEE J. Sel. Areas Commun. 37 (9) (2019)
|
||||
ware, Methodology, Investigation, Formal analysis, Data curation, Con- 1949–1972.
|
||||
ceptualization. Antonella Cioffi: Writing – review & editing, Methodol- [9] M.I. Jordan, T.M. Mitchell, Machine learning: Trends, perspectives, and
|
||||
ogy, Conceptualization. Antonio Esposito: Writing – review & editing, prospects, Science 349 (6245) (2015) 255–260.
|
||||
Methodology, Conceptualization. Francesco Isgrò: Writing – review [10] S. Mishra, J. Manda, Improving real-time analytics through the internet of things
|
||||
and data processing at the network edge, J. AI Assist. Sci. Discov. 4 (1) (2024)
|
||||
& editing, Methodology, Conceptualization. Rosanna Manzo: Writ-
|
||||
184–206.
|
||||
ing – review & editing, Methodology, Conceptualization. Nicola Moc- [11] M. De Donno, K. Tange, N. Dragoni, Foundations and evolution of mod-
|
||||
caldi: Writing – review & editing, Methodology, Conceptualization. ern computing paradigms: Cloud, IoT, edge, and fog, IEEE Access 7 (2019)
|
||||
Danilo Pau: Writing – review & editing, Methodology, Conceptual- 150936–150948.
|
||||
ization. Ettore Toscano: Writing – review & editing, Methodology, [12] D.P. Pau, P.K. Ambrose, F.M. Aymone, A quantitative review of automated neural
|
||||
search and on-device learning for tiny devices, Chips 2 (2) (2023) 130–141.
|
||||
Conceptualization. [13] C.-T. Lin, P.X. Huang, J. Oh, D. Wang, M. Seok, iMCU: A 102-𝜇J, 61-ms digital
|
||||
in-memory computing-based microcontroller unit for edge TinyML, in: 2023 IEEE
|
||||
Declaration of competing interest Custom Integrated Circuits Conference, CICC, IEEE, 2023, pp. 1–2.
|
||||
[14] S. Gal-On, M. Levy, Exploring coremark a benchmark maximizing simplicity and
|
||||
efficacy, Embed. Microprocess. Benchmark Consortium (2012).
|
||||
The authors declare that they have no known competing finan-
|
||||
[15] P. Torelli, M. Bangale, Measuring Inference Performance of Machine-Learning
|
||||
cial interests or personal relationships that could have appeared to Frameworks on Edge-Class Devices with the Mlmark Benchmark, Techincal Re-
|
||||
influence the work reported in this paper. port, 2021, Available Online: https://www.eembc.org/techlit/articles/MLMARK-
|
||||
WHITEPAPERFINAL-1.pdf. (Accessed on 5 April 2021).
|
||||
Acknowledgments [16] B. Sudharsan, S. Salerno, D.-D. Nguyen, M. Yahya, A. Wahid, P. Yadav, J.G.
|
||||
Breslin, M.I. Ali, Tinyml benchmark: Executing fully connected neural networks
|
||||
on commodity microcontrollers, in: 2021 IEEE 7th World Forum on Internet of
|
||||
This work was carried out within the DHEAL-COM project (ID: PNC- Things, WF-IoT, IEEE, 2021, pp. 883–884.
|
||||
E3-2022-23683267 PNC – HLS – DH; CUP: E63C22003790001), which [17] C. Banbury, V.J. Reddi, P. Torelli, J. Holleman, N. Jeffries, C. Kiraly, P. Montino,
|
||||
was financially supported by the Italian Ministry of Health through D. Kanter, S. Ahmed, D. Pau, et al., Mlperf tiny benchmark, 2021, arXiv preprint
|
||||
arXiv:2106.07597.
|
||||
the Complementary National Plan (CNP) to the PNRR. This publication
|
||||
[18] MLCommons, 2024, URL: https://mlcommons.org/benchmarks/inference-tiny/.
|
||||
reflects only the authors’ view and the Italian Ministry of Health is not [19] Performance mode vs. Energy mode, 2022, URL: https://github.com/eembc/
|
||||
responsible for any use that may be made of the information it contains. energyrunner?tab=readme-ov-file#performance-mode-vs-energy-mode.
|
||||
[20] B.N. Taylor, C.E. Kuyatt, Guidelines for Evaluating and Expressing the Un-
|
||||
Data availability certainty of NIST Measurement Results, NIST Technical Note 1297, National
|
||||
Institute of Standards and Technology (NIST), Gaithersburg, MD, 2020, http:
|
||||
//dx.doi.org/10.6028/NIST.TN.1297-2020.
|
||||
Data will be made available on request. [21] STMCubeIDE, 2022, URL: https://stm32ai.st.com/stm32-cube-ai/.
|
||||
[22] Ubuntu 12 RT, 2012, Real-time variant of Ubuntu 12, Canonical Ltd. https:
|
||||
//ubuntu.com/real-time. Canonical Ltd.
|
||||
References [23] STMicroelectronics, STM32H753xI - 32-bit Arm® Cortex® -M7 480MHz MCUs,
|
||||
2MB flash, 1MB RAM, 46 com. and Analog Interfaces, Crypto - Datasheet -
|
||||
[1] R. Chataut, A. Phoummalayvane, R. Akl, Unleashing the power of IoT: A Production Data, Datasheet DS12117 Rev 9, STMicroelectronics, 2023, p. 358,
|
||||
comprehensive review of IoT applications and future prospects in healthcare, URL: https://www.st.com/resource/en/datasheet/stm32h753vi.pdf. (Accessed 21
|
||||
agriculture, smart homes, smart cities, and industry 4.0, Sensors 23 (16) (2023) August 2025).
|
||||
7194. [24] STMicroelectronics, STM32U575xx - Ultra-low-power Arm® Cortex® -M33 32-bit
|
||||
[2] Q. Ma, H. Tan, T. Zhou, Mutual authentication scheme for smart devices in MCU+TrustZone® +FPU, 240 DMIPS, up to 2 MB Flash memory, 786 KB SRAM -
|
||||
IoT-enabled smart home systems, Comput. Stand. Interfaces 86 (2023) 103743. Datasheet - production data, Datasheet DS13737 Rev 10, STMicroelectronics,
|
||||
[3] C.-W. Shih, C.-H. Wang, Integrating wireless sensor networks with statistical 2024, p. 346, URL: https://www.st.com/resource/en/datasheet/stm32u575ag.
|
||||
quality control to develop a cold chain system in food industries, Comput. Stand. pdf. (Accessed 21 August 2025).
|
||||
Interfaces 45 (2016) 62–78. [25] UEC Electronics, AR4236–AR4237 Luckfox Pico Pro/Max Datasheet,
|
||||
[4] S.B. Baker, W. Xiang, I. Atkinson, Internet of things for smart healthcare: Datasheet, UEC Electronics, 2024, URL: https://uelectronics.com/wp-
|
||||
Technologies, challenges, and opportunities, IEEE Access 5 (2017) 26521–26544. content/uploads/2024/07/AR4236-AR4237-Luckfox-Pico-Pro-Max-Datasheet.pdf.
|
||||
[5] Y. Abadade, A. Temouden, H. Bamoumen, N. Benamar, Y. Chtouki, A.S. Hafid, (Accessed 21 August 2025).
|
||||
A comprehensive survey on tinyml, IEEE Access (2023). [26] I. BIPM, I. IFCC, I. ISO, O. IUPAP, Evaluation of measurement data—guide to
|
||||
the expression of uncertainty in measurement, JCGM 100: 2008 GUM 1995 with
|
||||
minor corrections, Jt. Comm. Guides Metrol. 98 (2008).
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,834 @@
|
||||
Journal of Systems Architecture 160 (2025) 103346
|
||||
|
||||
|
||||
Contents lists available at ScienceDirect
|
||||
|
||||
|
||||
Journal of Systems Architecture
|
||||
journal homepage: www.elsevier.com/locate/sysarc
|
||||
|
||||
|
||||
|
||||
|
||||
Fast post-quantum private set intersection from oblivious pseudorandom
|
||||
function for mobile social networks✩
|
||||
Zhuang Shan a , Leyou Zhang a ,∗, Qing Wu b , Qiqi Lai c , Fuchun Guo d
|
||||
a School of Mathematics and Statistics, Xidian University, Xi’an 710126, China
|
||||
b School of Automation, Xi’an University of Posts and Telecommunications, Xi’an 710121, China
|
||||
c School of Computer Science, Shaanxi Normal University, Xi’an 710121, China
|
||||
d Centre for Computer and Information Security Research, University of Wollongong, Wollongong, NSW 2522, Australia
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO ABSTRACT
|
||||
|
||||
Keywords: Mobile social networks have become integral to our daily lives, transforming communication methods and
|
||||
Mobile social networks facilitating social interactions. With technological advancements, users generate vast amounts of valuable
|
||||
Private set intersection and sensitive personal data, which is stored on servers to enable instant information sharing. To protect the
|
||||
Oblivious pseudorandom function
|
||||
sharing data, each platform has implemented many techniques such as end-to-end encryption mechanisms,
|
||||
Private information retrieval
|
||||
fully homomorphic encryption, etc. However, these approaches face several security and privacy challenges,
|
||||
including potential leaks of user data, vulnerabilities in encryption that expose privacy ciphertexts to
|
||||
probabilistic attacks, and threats posed by future quantum computers.
|
||||
To address these issues, we introduce a private set intersection (PSI) protocol based on oblivious pseudorandom
|
||||
functions (OPRF) under the Ring-LPR problem from lattices. The proposed perturbed pseudorandom generator
|
||||
not only enhances the PSI’s resistance to probabilistic attacks, but also yields a more efficient
|
||||
OPRF and PSI. It has a time complexity of 𝑂(𝑛 log 𝑛), improving on the existing well-known fast post-
|
||||
quantum PSI protocol that operates at 𝑂(𝑚𝑛 log(𝑚𝑛)), where 𝑚 is the bit length of the cryptographic modulus and 𝑛
|
||||
represents the dimension of the security parameter. Simulation experiments and security analyses demonstrate
|
||||
that our proposal effectively preserves user privacy, ensures collusion resilience, verifies computation results,
|
||||
and maintains low computational costs. Finally, as an expansion of our OPRF, we also give a fast private
|
||||
information retrieval (PIR) protocol.
|
||||
|
||||
|
||||
|
||||
1. Introduction respective data sets. This way, even if data is stored in distributed
|
||||
systems, it can effectively prevent data breaches and violations of user
|
||||
Mobile social networks have greatly enriched the ways people com- privacy, such as those caused by data leaks or unauthorized access.
|
||||
municate and enhanced the convenience of social interactions. With the The application of PSI in mobile social networks not only enhances
|
||||
development of technology, users generate a large amount of useful data security but also strengthens user trust in the platform, which
|
||||
and sensitive personal data within mobile social networks. This data
|
||||
is crucial for protecting user privacy and improving the platform’s
|
||||
often needs to be stored and processed to provide more personalized
|
||||
competitiveness. In this way, mobile social networks can continue to
|
||||
services and experiences [1,2]. However, due to the limited storage
|
||||
capacity of mobile social network devices, it is impossible to store all provide a rich and vibrant social experience and efficient information
|
||||
the data generated at any given moment, which presents challenges for services while safeguarding personal privacy. Furthermore, as an im-
|
||||
data storage and privacy protection. portant application in the field of privacy computing, PSI has recently
|
||||
To address this issue while ensuring data confidentiality and se- garnered widespread attention due to its efficiency and practicality,
|
||||
curity, many mobile social network platforms have started adopting jointly promoting the rapid implementation of privacy computing tech-
|
||||
advanced privacy-preserving technologies, such as private set inter- nology and ensuring the secure flow and value extraction of data
|
||||
section (PSI). The technology allows two or more parties to securely elements.
|
||||
compute the intersection of their datasets without disclosing their
|
||||
|
||||
|
||||
✩ This document is the result of the research project funded by the National Science Foundation.
|
||||
∗ Corresponding author.
|
||||
E-mail addresses: arcsec30@stu.xidian.edu.cn (Z. Shan), lyzhang@mail.xidian.edu.cn (L. Zhang), xiyouwuq@126.com (Q. Wu), laiqq@snnu.edu.cn (Q. Lai),
|
||||
fuchun@uow.edu.au (F. Guo).
|
||||
|
||||
https://doi.org/10.1016/j.sysarc.2025.103346
|
||||
Received 3 November 2024; Received in revised form 24 December 2024; Accepted 16 January 2025
|
||||
Available online 25 January 2025
|
||||
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
|
||||
Z. Shan et al. Journal of Systems Architecture 160 (2025) 103346
|
||||
|
||||
|
||||
set intersection from oblivious pseudorandom function is proposed in
|
||||
this paper, and it has the following advantages:
|
||||
|
||||
• Symmetric encryption is adopted, which is efficient and reduces the risk of
|
||||
privacy leakage. The PSI in this paper is constructed based on OPRF,
|
||||
which belongs to symmetric encryption, thus reducing the number
|
||||
of interactions between users and lowering the risk of user privacy
|
||||
leakage. Compared to asymmetric encryption, the operational cost of
|
||||
symmetric encryption is lower, reducing reliance on authoritative
|
||||
institutions.
|
||||
• The structure of OPRF is simple, and it is relatively efficient among post-
|
||||
quantum OPRFs. The OPRF used to construct PSI in this paper is based
|
||||
on a new lattice problem, namely the learning parity with rounding
|
||||
over rings (Ring-LPR) problem. The Ring-LPR problem not only has a
|
||||
simple structure but also possesses the capability to resist quantum
|
||||
attacks.
|
||||
Fig. 1. Mobile social networks.
|
||||
• A perturbed pseudorandom generator (PPRG) can withstand probabilistic
|
||||
attacks. In addition to OPRF, the PSI in this paper also includes
|
||||
a structure with a perturbed pseudorandom generator, which can
|
||||
overcome the weakness of weak encryption in symmetric encryp-
|
||||
tion, thereby preventing adversaries from guessing the corresponding
|
||||
plaintext using statistical methods on the ciphertext ratios.
|
||||
|
||||
|
||||
Fig. 2. Private set intersection. 1.2. Technical overview
|
||||
|
||||
We adopted oblivious transfer technique and hamming correlation
|
||||
There are many common construction tools for PSI [3], and obliv- robustness, both of which are used in the OPRF construction presented
|
||||
ious transfer (OT) is one of them. An OT [4] is a crucial tool used in this paper. For the incidental pseudorandom function subject, we
|
||||
for secure multiparty computation. In this tool, the sender transmits initially aimed to use learning parity with noise (LPN) over rings.
|
||||
data from a set of messages to the receiver but remains oblivious to However, this approach results in varying encryption outcomes for the
|
||||
which specific message was sent, while the receiver is unaware of the same private data, preventing the recipient from matching the private
|
||||
other messages they did not receive. This protocol is also known as the
|
||||
data. Thus, we sought to make LPN over rings behave consistently
|
||||
oblivious transfer protocol. The essence of an oblivious pseudorandom
|
||||
like learning with rounding (LWR), leading to the introduction of the
|
||||
function is a pseudorandom function (PRF) enhanced with oblivious
|
||||
concept of learning parity with rounding over rings (LPR over rings) in
|
||||
transfer capabilities.
|
||||
this paper.
|
||||
In 1986, Goldreich, Goldwasser, and Micali introduced a new cryp-
|
||||
To prove that LPR over rings is quantum-resistant, we established
|
||||
tographic primitive known as the pseudorandom function, whose out-
|
||||
put appears to be randomly chosen [5]. Two decades later, Naor and a reduction bridge between LPR over rings and LWR. Yes, LPR over
|
||||
Reingold [6] noticed that their number-theoretic PRF allows for an rings is reduced to LWR, not LPN over rings. For (𝑞 = 2𝑛 , 𝑝)-LWR
|
||||
interactive and oblivious evaluation, where a ‘‘client’’ with input 𝑥 instances, we demonstrated the hardness of (𝑞 = 2, 𝑝 = 1)-LWR instances
|
||||
obtains 𝐹𝑘 (𝑥) for a function 𝐹𝑘 (𝑥) that is contributed by a ‘‘server’’. and (𝑞 = 2, 𝑝 = 1)-LWR over rings, where (𝑞 = 2, 𝑝 = 1)-LWR over
|
||||
Neither does the client learn the function (i.e., its key 𝑘), nor does the rings corresponds to LPR over rings. To verify that the computational
|
||||
server learn 𝑥 or 𝐹𝑘 (𝑥). Freedman et al. later called such two-party efficiency of the post-quantum OPRF in this paper is quite fast, we
|
||||
protocol an OPRF and gave first formal definitions and two OPRFs compared the OPRF with the LWE-instantiated OPRF from [14]. The
|
||||
based on the Naor-Reingold PRF [7]. In 2009, Jarecki and Liu presented results showed that, as theoretical analysis suggested, the computation
|
||||
an efficient OPRF for securing intersection data [8]. efficiency improves with the increase of security parameters.
|
||||
Oblivious pseudorandom functions have been utilized in PSI [9]. Based on OPRF, we constructed private set intersection (PSI) based
|
||||
The additional functionalities of oblivious pseudorandom functions on OPRF. Since the paper [15] analyzed that PSI based on symmetric
|
||||
also exhibit diversity, such as verifiable oblivious pseudorandom func- encryption does not resist probabilistic attacks and proposed the con-
|
||||
tions (VOPRF, [10]) and partially oblivious pseudorandom functions cept of perturbed pseudorandom generator, we used LPN over rings
|
||||
(POPRF, [11]). to construct a pseudorandom generator and proved that it satisfies the
|
||||
Currently, OPRFs still face challenges, as summarized by definition of PPRG as given in [15].
|
||||
Casacuberta, Hesse, and Lehmann [12]. Efficient OPRF constructions often
|
||||
rely on discrete-log or factoring-type hardness assumptions, which
|
||||
1.3. Organizations
|
||||
are vulnerable to quantum computers. This paper aims to address
|
||||
this by constructing OPRFs based on lattice-hardness assumptions and
|
||||
improving their efficiency (see Figs. 1 and 2). The structure of this paper is as follows. Section 3 provides the
|
||||
necessary definitions and lemmas as a foundation for the readers’
|
||||
1.1. Contributions knowledge. Section 4 presents the construction and efficiency analysis
|
||||
of OPRF, along with the definition and reduction of Ring-LPR. Section 5
|
||||
Regarding the open problem proposed by Casacuberta, there are details the construction of the PSI in this paper, security proofs, and
|
||||
currently quantum-resistant OPRFs, namely Albrecht et al.’s lattice- LWE-based efficiency analysis, as well as the construction of the PPRG
|
||||
based VOPRF [10] and Boneh et al.’s isogeny-based OPRF [13]. Both and the proof of its pseudorandomness. Finally, Section 6 summarizes
|
||||
constructions represent significant feasibility results but require further the advantages and limitations of the PSI presented in this paper, as
|
||||
research to improve their efficiency [12]. So, fast post-quantum private well as the extension of OPRF to PIR.
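
The learning with rounding (LWR) idea that the technical overview builds on can be illustrated with a toy Rust sketch. It is not the paper's Ring-LPR construction and offers no security: the function names, the tiny parameters, and the use of plain integer vectors are assumptions made only for illustration. The sketch shows the standard rounding step, which maps a value modulo $q$ to a value modulo $p$ by computing $\lfloor (p/q)\,x \rfloor$, so that the same inputs always give the same output, unlike a noisy sample.

```rust
/// Rounding from Z_q to Z_p: floor((p / q) * x) for x in [0, q).
/// Integer arithmetic is used; small parameters are assumed so that
/// the product p * x does not overflow u64.
fn round_to_p(x: u64, q: u64, p: u64) -> u64 {
    (p * x) / q
}

/// Toy LWR sample: round the inner product <a, s> mod q down to Z_p.
/// `a` and `s` are assumed to have the same length.
fn lwr_sample(a: &[u64], s: &[u64], q: u64, p: u64) -> u64 {
    let inner = a
        .iter()
        .zip(s.iter())
        .fold(0u64, |acc, (ai, si)| (acc + (ai % q) * (si % q)) % q);
    round_to_p(inner, q, p)
}

fn main() {
    // Hypothetical toy parameters: q = 16, p = 4.
    let a = [3, 5, 7];
    let s = [2, 4, 6];
    // Deterministic: evaluating twice gives the same rounded value.
    let first = lwr_sample(&a, &s, 16, 4);
    let second = lwr_sample(&a, &s, 16, 4);
    println!("{} {}", first, second);
}
```

The determinism is the property the overview relies on: two parties evaluating the same private item obtain matching values, which a noise-based sample would not guarantee.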
|
||||
|
||||
|
||||
|
||||
|
||||
2. Preliminary

Each element of a lattice in $\mathbb{R}^n$ can be expressed as a linear combination, with integer coefficients, of $n$ linearly independent vectors. This set of linearly independent vectors is called a lattice basis, and the lattice basis is not unique. Given a lattice basis $(v_1, \dots, v_n)$ of the lattice $\mathcal{L}$, the fundamental parallelepiped is

$$\mathcal{P}(v_1, \dots, v_n) = \left\{ \sum_{i=1}^{n} k_i v_i \;\middle|\; k_i \in [0, 1) \right\}.$$

If the lattice basis $(v_1, \dots, v_n)$ is fixed, we write $\mathcal{P}(\mathcal{L})$ for $\mathcal{P}(v_1, \dots, v_n)$. For every $x \in \mathbb{R}^n$, project it onto $\mathcal{P}(\mathcal{L})$. According to the properties of projection, there is a unique $y \in \mathcal{P}(\mathcal{L})$ such that $y - x \in \mathcal{L}$.

We use $\det(\mathcal{L})$ to denote the volume of the fundamental parallelepiped of the lattice $\mathcal{L}$. In other words, $\det(\mathcal{L})$ is the determinant of the matrix formed by a lattice basis $(v_1, \dots, v_n)$. For a given $n$-dimensional lattice, the value of $\det(\mathcal{L})$ is the same for every lattice basis.

Given an $n$-dimensional lattice $\mathcal{L}$, let $(v_1, \dots, v_n)$ and $(u_1, \dots, u_n)$ be two arbitrary lattice bases of $\mathcal{L}$. Then $v_i = \sum_{j=1}^{n} m_{ij} u_j$ and $u_i = \sum_{j=1}^{n} m'_{ij} v_j$ for $i \in \{1, \dots, n\}$, so there are two integer matrices $M$ and $M'$ such that

$$\begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} = M \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} = M' \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}.$$

It is easy to prove that $M$ and $M'$ are inverse to each other. Since $M$ and $M'$ are both integer matrices, $\det(M) \cdot \det(M') = 1$ and $\det(M) = \det(M') = \pm 1$, so

$$\det(v_1, \dots, v_n) = \pm \det(u_1, \dots, u_n).$$

Definition 1. An ideal is a subset of a ring (or field) that satisfies the following two properties:

1. Additive closure: If any two elements in the ideal are added, the result is still in the ideal. In other words, for any elements $a$ and $b$ in the ideal, $a + b$ also belongs to that ideal.
2. Multiplicative absorptivity: If an element in the ideal is multiplied by any element in the ring (or field), the result is still in the ideal. In other words, for any element $a$ in the ideal and any element $r$ in the ring (or field), $ar$ and $ra$ belong to that ideal.

For a commutative ring, further require that the ideal be closed for both addition and multiplication. Such an ideal is called a true ideal.

Definition 2. Referring to the definition of ideal, the ideal lattice is a subset of the lattice $\mathcal{L}$ that satisfies the following two properties:

1. Additive closure: If any two elements in an ideal lattice are added, the result is still in the ideal lattice. In other words, for any elements $a$ and $b$ in an ideal lattice, $a + b$ also belongs to that ideal lattice.
2. Multiplicative absorptivity: If an element in an ideal lattice is multiplied by an element in any other ideal lattice, the result remains in the ideal lattice. In other words, for any element $a$ in the ideal and any element $r$ in another ideal lattice, both $ar$ and $ra$ belong to that ideal lattice.

Corollary 1. The ideal lattice is a true ideal of the lattice $\mathcal{L}$.

A polynomial $f(x) = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1}$ is mapped to

$$Rot(f) = a_0 I + a_1 X + \cdots + a_{n-1} X^{n-1} \in \tilde{\mathcal{R}}.$$

Here $\tilde{\mathcal{R}}$ is the image of $\mathbb{Z}[x]/\langle x^n + 1 \rangle$ in the collection of ideal lattices, and

$$X = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 & -1 \\ 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \end{pmatrix}.$$

So there is

$$Rot(f) = \begin{pmatrix} a_0 & -a_{n-1} & \cdots & -a_1 \\ a_1 & a_0 & \cdots & -a_2 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n-1} & a_{n-2} & \cdots & a_0 \end{pmatrix},$$

and it is easy to prove that this mapping relationship is isomorphic.

Definition 3 (Learning with Rounding, [16,17]). Let $\lambda$ be the security parameter, and let $n = n(\lambda)$, $m = m(\lambda)$, $q = q(\lambda)$, $p = p(\lambda)$ be integers. The LWR problem states that for $A \in \mathbb{Z}_q^{m \times n}$, $s \in \mathbb{Z}_q^{n}$, $u \in \mathbb{Z}_q^{m}$ the following distributions are computationally indistinguishable: $(A, \lfloor As \rfloor_p) \approx_C (A, \lfloor u \rfloor_p)$. Here $\lfloor x \rfloor_p = \lfloor \frac{p}{q} x \rfloor$, and $\lfloor x \rfloor$ is the floor function, which rounds down to the nearest integer. For example, $\lfloor 3.14 \rfloor = 3$ and $\lfloor 3 \rfloor = 3$.

Definition 4 (Learning Parity with Noise, [18,19]). Let $\lambda$ be the security parameter, and let $n = n(\lambda)$, $m = m(\lambda)$ be integers. The LPN problem states that for $A \in \mathbb{Z}_2^{m \times n}$, $s \in \mathbb{Z}_2^{n}$, $u, e \in \mathbb{Z}_2^{m}$ the following distributions are computationally indistinguishable: $(A, As + e) \approx_C (A, u)$.

Definition 5 (Hamming Correlation Robustness, [14]). For a hash function $\mathcal{H}(\cdot)$ and a pseudorandom function $F_k(\cdot)$ with key $k$, $\mathcal{H}(\cdot)$ is Hamming correlation robust if $\mathcal{H}(x) \approx_C F_k(x)$.

Definition 6 (OT, see footnote 1). The message sender sends data to the receiver from a set of pending messages but remains oblivious to which specific message was sent. Meanwhile, the receiver learns nothing about the remaining messages. This protocol is also known as oblivious transfer.

Definition 7 (OPRF, [20]). Let the PRF key $k$ consist of two bit-strings $q, s \in \{0, 1\}^{\lambda}$. Let $F(\cdot)$ be a pseudorandom code that produces a pseudorandom string and let $\mathcal{H}$ be a hash function. The pseudorandom function is computed as

$$\mathrm{OPRF}_k(x) = \mathcal{H}(q \oplus [F(x) \cdot s]),$$

where $\cdot$ denotes bitwise-AND and $\oplus$ denotes bitwise-XOR. For a randomly generated $s$, if $F(x)$ has enough Hamming weight then the function $\mathrm{OPRF}_k(x)$ is pseudorandom assuming the hash function $\mathcal{H}$ is correlation robust.

Definition 8 (PSI, [14]). PSI enables two parties, each holding a private set of elements, to compute the intersection of the two sets while revealing nothing more than the intersection itself.

Definition 9 (Dihedral Coset Problem). Given a security parameter $\kappa$, consider an instance of the $\mathrm{DCP}_q^{\ell}$ problem, where $q$ denotes the modulus and $\ell$ represents the number of states. Each state is expressed as

$$|0\rangle|x_i\rangle + |1\rangle|(x_i + s) \bmod q\rangle, \quad i \le \ell,$$

and it stores $1 + \lceil \log_2 q \rceil$ bits, where $x_i \in_R \mathbb{Z}_q^{n}$ and $s \in \mathbb{Z}_q^{n}$. If $s$ can be computed with probability $\mathrm{poly}(1/\log q)$ in time $\mathrm{poly}(\log q)$, then the $\mathrm{DCP}_q^{\ell}$ problem is considered to be broken.

Footnote 1: https://blog.csdn.net/m0_61869253/article/details/139362753
|
||||
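To make the map $Rot(f)$ from the preliminaries concrete, here is a minimal Python sketch. It builds the matrix for a small coefficient list and checks that multiplying by $Rot(f)$ agrees with polynomial multiplication modulo $x^n + 1$. The function names `rot`, `negacyclic_mul` and `mat_vec` are illustrative choices, not names from the paper.

```python
def rot(coeffs):
    """Return Rot(f) for f = a0 + a1 x + ... + a_{n-1} x^{n-1} in Z[x]/<x^n + 1>."""
    n = len(coeffs)
    matrix = [[0] * n for _ in range(n)]
    # Column j holds the coefficients of x^j * f(x) reduced modulo x^n + 1.
    for j in range(n):
        for i in range(n):
            k = i - j
            # Wrapping past degree n-1 picks up a minus sign because x^n = -1.
            matrix[i][j] = coeffs[k] if k >= 0 else -coeffs[k + n]
    return matrix


def negacyclic_mul(f, g):
    """Multiply two coefficient lists modulo x^n + 1 (schoolbook method)."""
    n = len(f)
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = i + j
            if k < n:
                out[k] += fi * gj
            else:
                out[k - n] -= fi * gj
    return out


def mat_vec(matrix, vec):
    """Plain matrix-vector product over the integers."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


if __name__ == "__main__":
    f, g = [1, 2, 3], [4, 5, 6]
    print(rot(f))
    print(mat_vec(rot(f), g), negacyclic_mul(f, g))
```

The first row of the result has the shape $(a_0, -a_{n-1}, \dots, -a_1)$, matching the matrix written out in the text.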
|
||||
|
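Definition 7 can be read as a three-step computation: encode the input with $F$, mask the key half $q$ with the bitwise AND of $F(x)$ and $s$, then hash. The sketch below evaluates that formula directly with the key in hand, so it shows only the function, not the oblivious protocol. SHAKE-256 as the pseudorandom code and SHA-256 as the hash are stand-ins chosen for illustration (assumptions, not the paper's instantiation), as is the 128-bit string length.

```python
import hashlib
import secrets

LAMBDA_BYTES = 16  # assumed string length: 128 bits, for illustration only


def pseudorandom_code(x: bytes) -> bytes:
    """Stand-in for F(.): expand the input to a LAMBDA_BYTES-long string (assumption)."""
    return hashlib.shake_256(b"code" + x).digest(LAMBDA_BYTES)


def oprf(q: bytes, s: bytes, x: bytes) -> bytes:
    """Evaluate H(q XOR [F(x) AND s]) with SHA-256 standing in for H (assumption)."""
    fx = pseudorandom_code(x)
    masked = bytes(qi ^ (fi & si) for qi, fi, si in zip(q, fx, s))
    return hashlib.sha256(masked).digest()


if __name__ == "__main__":
    q = secrets.token_bytes(LAMBDA_BYTES)
    s = secrets.token_bytes(LAMBDA_BYTES)
    print(oprf(q, s, b"alice@example.org").hex())
```

With an all-zero $s$ the mask vanishes and the output no longer depends on $x$, which illustrates why the definition asks for a randomly generated $s$ and enough Hamming weight in $F(x)$.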
||||
|
||||
|
||||
|
||||
Note 1. The Dihedral Coset Problem is a difficult problem in quantum computing, and solving it has a time complexity of $O(e^n)$ or $O(n!)$.

Lemma 1. If an efficient algorithm $\mathcal{A}$ can solve $\mathrm{DCP}_2^{\ell}$ in polynomial time, then there exists an efficient algorithm $\mathcal{A}'$ that can solve $\mathrm{DCP}_q^{\ell}$ in polynomial time.

Proof. We use a proof by contradiction. Suppose $q = 2^n$ and there exists an efficient algorithm $\mathcal{A}$ that can solve $\mathrm{DCP}_2^{\ell}$ in polynomial time. For instances of $\mathrm{DCP}_4^{\ell}$, we have

$$|0\rangle|x_i\rangle + |1\rangle|(x_i + s) \bmod 4\rangle = |0\rangle|x'_i\rangle + |1\rangle|(x'_i + s') \bmod 2\rangle + 2\left(|0\rangle|x''_i\rangle + |1\rangle|(x''_i + s'') \bmod 2\rangle\right), \quad i \le \ell,$$

so running the algorithm $\mathcal{A}$ twice will solve $\mathrm{DCP}_{4=2^2}^{\ell}$. Similarly, running $\mathcal{A}$ four times will solve $\mathrm{DCP}_{16=2^4}^{\ell}$, and continuing in this manner, running the algorithm $\mathcal{A}$ $n$ times will solve $\mathrm{DCP}_q^{\ell}$. Let $O(\mathcal{A})$ represent the time complexity of the algorithm $\mathcal{A}$. Thus, we have $O(\mathcal{A}') \le n\,O(\mathcal{A})$ and algorithm $\mathcal{A}'$ is an efficient algorithm. □

Definition 10 (Extrapolated Dihedral Coset Problem with modulus 2, [21]). Given a security parameter $\kappa$, an instance of $\mathrm{EDCP}_{n,2,\rho}^{\ell}$ is provided, where 2 denotes the modulus, $\rho$ represents the probability density function, and $\ell$ denotes the number of states. Each state is expressed as

$$\sum_{j \in \mathrm{supp}(\rho)} \rho(j)\,|j\rangle|(x_i + j s) \bmod 2\rangle, \quad i \le \ell,$$

and stores 2 bits, where $x_i \in_R \mathbb{Z}_2^{n}$ and $s \in \mathbb{Z}_2^{n}$. If $s$ can be determined with probability $\mathrm{poly}(1/(n \log 2))$ in time $\mathrm{poly}(n \log 2)$, then the $\mathrm{EDCP}_{n,2,\rho}^{\ell}$ problem is considered to be broken.

Lemma 2. If there exists an algorithm for solving $\mathrm{EDCP}_{n,4,\rho}^{\ell}$, then this algorithm can also solve $\mathrm{DCP}_4^{\ell}$.

Proof. Let

$$|b\rangle = \frac{1}{\sqrt{2}}|0\rangle|x_i\rangle + \frac{1}{\sqrt{2}}|1\rangle|(x_i + s) \bmod 4\rangle.$$

Thus, $\rho(0)|0\rangle = \frac{1}{\sqrt{2}}|0\rangle$ and $\rho(1)|1\rangle = \frac{1}{\sqrt{2}}|1\rangle$. Hence, $\mathrm{DCP}_2^{\ell}$ is a special case of $\mathrm{EDCP}_{n,2,\rho}^{\ell}$. Therefore, if there exists an algorithm for solving $\mathrm{EDCP}_{n,2,\rho}^{\ell}$, this algorithm can also solve $\mathrm{DCP}_2^{\ell}$. □

Lemma 3 ([21]). Let $(n, q, r = \Omega(\sqrt{\kappa}))$ be an instance of G-EDCP and $(n, q, \alpha)$ be an instance of LWE. If there exists an algorithm for solving $\mathrm{LWE}_{n,q,\alpha}$, then there exists an algorithm for solving $\text{G-EDCP}_{n,q,\rho_r}^{\ell}$.

Corollary 2. Let $(n, 2, r = \Omega(\sqrt{\kappa}))$ be an instance of G-EDCP and $(n, 2, \alpha)$ be an instance of LPN. If there exists an algorithm for solving $\mathrm{LPN}_{n,\alpha}$, then there exists an algorithm for solving $\text{G-EDCP}_{n,2,\rho_r}^{\ell}$.

3. Ring-LPR based OPRF

3.1. Constructing OPRF

Fig. 3 presents the ring LPR-based oblivious pseudorandom function. In the next section, we will prove the security of the oblivious pseudorandom function.

Fig. 3. Oblivious Pseudorandom Function (OPRF).

3.2. Security proof of OPRF

In this subsection, we will provide the definition of the underlying lattice problem for OPRF, learning parity with rounding, and its reduction proof.

Definition 11 (Learning Parity with Rounding). Let $\lambda$ be the security parameter, and let $n = n(\lambda)$, $m = m(\lambda)$ be integers. The LPR problem states that for $A \in \mathbb{Z}_2^{m \times n}$, $s \in \mathbb{Z}_2^{n}$, $u \in \mathbb{Z}_2^{m}$ the following distributions are computationally indistinguishable: $(A, \lfloor As \bmod 4 \rfloor_1) \approx_C (A, \lfloor u \rfloor_1)$.

Definition 12 (Learning Parity with Rounding Over Ring). The Ring LPR problem states that for $a, s, u \in \mathcal{R}_2$ the following distributions are computationally indistinguishable: $(a, \lfloor as \bmod 4 \rfloor_1) \approx_C (a, \lfloor u \rfloor_1)$.

Lemma 4. For an LWR problem instance $\lfloor As \rfloor_p$, if there exists an algorithm $\mathcal{A}$ for solving $s$ from $\lfloor As \rfloor_1$, then there also exists an algorithm $\mathcal{A}'$ for solving the LWR problem.

Proof. Given that there exists an algorithm $\mathcal{A}$ that can solve $\lfloor As \rfloor_1 = \lfloor \frac{As}{q} \rfloor$, for an LWR problem instance $\lfloor As \rfloor_p$, we have:

$$\begin{aligned} \frac{1}{p}\lfloor As \rfloor_p &= \frac{1}{p}\left\lfloor \frac{pAs}{q} \right\rfloor \\ &= \frac{1}{p}\left( \frac{pAs}{q} + e \right) \quad (e \in (-1, 0]^m) \\ &= \frac{1}{q}As + e' \quad \left(e' \in \left(-\tfrac{1}{p}, 0\right]^m\right) \\ &\approx \lfloor As \rfloor_1. \end{aligned}$$

Thus, the algorithm $\mathcal{A}$ can be used to solve the LWR problem. □

We get the next corollaries from Lemma 3.

Corollary 3. Let $(n, 2, r = \Omega(\sqrt{\kappa}))$ be an instance of G-EDCP and $(n, 2, \alpha)$ be an instance of 2-LWR. If there exists an algorithm for solving 2-LWR, then there exists an algorithm for solving $\text{G-EDCP}_{n,2,\rho_r}^{\ell}$.

Corollary 4. Let $(n, 2, r = \Omega(\sqrt{\kappa}))$ be an instance of G-EDCP and $(n, 2, \alpha)$ be an instance of LPR. If there exists an algorithm for solving LPR, then there exists an algorithm for solving $\text{G-EDCP}_{n,2,\rho_r}^{\ell}$.

Lemma 5. If there exists an algorithm $\mathcal{A}$ for solving the Ring-LPR problem, then there also exists an algorithm $\mathcal{A}'$ for solving the LPR problem.

Proof. For an instance of the inner product Ring-LPR

$$b = \lfloor a \cdot s \rfloor_1,$$

where $a = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1}$, we can represent $a$ as a circulant matrix, specifically

$$A_1 := \begin{pmatrix} a_0 & -a_{n-1} & \cdots & -a_1 \\ a_1 & a_0 & \cdots & -a_2 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n-1} & a_{n-2} & \cdots & a_0 \end{pmatrix}.$$

Thus,

$$b = \lfloor a \cdot s \rfloor_1 \Rightarrow b = A_1 s,$$

where $a = (a_0, a_1, \dots, a_{n-1}) \leftarrow a = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1}$. We use a proof by contradiction. Suppose there exists an efficient algorithm $\mathcal{A}$ that can solve Ring-LPR in polynomial time. We take the first row from $A_1$, denote it as $\alpha_1$, and have $\lfloor \alpha_1 s \rfloor_1 = b_1$, where $b_1$ is the first component of $b$. For the LWR problem instance $\vec{\beta} = \lfloor \Lambda \vec{s} \rfloor_1$, assume

$$\Lambda^T = (\alpha_1, \alpha_2, \dots, \alpha_m).$$

Thus, we use the algorithm $\mathcal{A}$ $m$ times to find $\beta_i$ such that $\lfloor \gamma_i \rfloor_1 = \beta_i = \lfloor \alpha_i \vec{s} \rfloor_1$, and thus we can solve the equation

$$\gamma = \Lambda \vec{s}, \quad \gamma^T = (\gamma_1, \dots, \gamma_m).$$

Assume that the time complexity of solving $s$ from an LWR problem instance is $O(\Lambda, \beta)$. According to Corollary 3, letting $O(\gamma = \Lambda \vec{s})$ be the computational complexity of solving the equation $\gamma = \Lambda \vec{s}$, we have

$$m\,O(\mathcal{A}) + O(\gamma = \Lambda \vec{s}) \ge O(\Lambda, \beta) \ge O(n!) \text{ or } O(e^n).$$

Let $m = n$, then

$$O(\mathcal{A}) \ge \frac{O(\Lambda, \beta) - O(\gamma = \Lambda \vec{s})}{n} \ge \frac{O(n!) - O(\gamma = \Lambda \vec{s})}{n} \text{ or } \frac{O(e^n) - O(\gamma = \Lambda \vec{s})}{n}.$$

This contradicts the assumption that there is an efficient algorithm that can solve the inner product Ring-LPR in polynomial time, thus the theorem holds. □
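The rounding map of Definition 3 and the error term used in the proof of Lemma 4 can be checked numerically. The sketch below is a minimal illustration, assuming small toy parameters ($q = 16$, $p = 4$) that are not taken from the paper. It computes $\lfloor x \rfloor_p = \lfloor \frac{p}{q} x \rfloor$ and the exact error $e = \lfloor \frac{p}{q} x \rfloor - \frac{p}{q} x$, which the proof places in $(-1, 0]$. The helper names are hypothetical.

```python
from fractions import Fraction


def round_p(x: int, q: int, p: int) -> int:
    """Definition 3's rounding: floor((p / q) * x) for x in Z_q."""
    return (p * x) // q


def rounding_error(x: int, q: int, p: int) -> Fraction:
    """Exact e with floor(p*x/q) = p*x/q + e, as used in the proof of Lemma 4."""
    return Fraction(round_p(x, q, p)) - Fraction(p * x, q)


def lwr_sample(A, s, q: int, p: int):
    """Return floor(A s)_p, rounding each inner product after reduction mod q."""
    return [round_p(sum(a * b for a, b in zip(row, s)) % q, q, p) for row in A]


if __name__ == "__main__":
    q, p = 16, 4  # toy parameters (assumption)
    A = [[3, 5, 7], [1, 2, 3]]
    s = [2, 4, 6]
    print(lwr_sample(A, s, q, p))
    print([rounding_error(x, q, p) for x in range(q)])
```

Dividing the error by $p$ gives the term $e' \in (-\frac{1}{p}, 0]$ that appears in the last step of the derivation.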
|
||||
|
||||
|
||||
3.3. Efficiency analysis

This section simulates the OPRF computation efficiency of this paper and the OPRF in [14] on MAC, Pad and Phone. The PRF of [14] is instantiated based on LWE.

3.3.1. Efficiency analysis on MAC

The programs in this subsection are written in Python 3.12 and run on a MacBook Air (Apple M1, RAM 8.00 GB) (see Fig. 4).

Fig. 4. Parallel comparison of OPRF on MAC, where $n$ represents the security parameter, unit is microseconds.

3.3.2. Efficiency analysis on mobile pad

The programs in this subsection are written with Pydroid 3 and run on a Xiaomi Pad 6 Pro (1st Qualcomm(R) AI Engine(TM) Snapdragon 8+ mobile platform @ 3.2 GHz, RAM 8.00+3.00 GB) (see Fig. 5).

3.3.3. Summary of data comparison

From the simulation results, it can be seen that for $n \le 250$, the LWE-based OPRF in [14] is slightly faster, while for $n > 250$, the ring LPR-based OPRF in this paper is faster. Furthermore, as $n$ increases, the advantages of ring LPR become more pronounced. Based on the simulation results for Pad, the OPRF in this paper is more stable; although there are fluctuations, they are less significant compared to the LWE-based OPRF in [14].

4. PSI based on OPRF

In this paper, apart from OPRF, another tool used in the construction of PSI is a perturbed pseudorandom generator [15]. The perturbed pseudorandom generator in this paper is constructed from Ring-LPN.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 5. Parallel comparison of OPRF on mobile pads, where $n$ represents the security parameter, unit is microseconds.

Next, we will present the reduction process for Ring-LPN.

4.1. Reduction of ring-LPN

Definition 13 (Learning Parity with Noise Over Ring). The learning parity with noise over ring problem states that for $a, s, e, u \in \mathcal{R}_{\{0,1\}}$ the following distributions are computationally indistinguishable: $(a, as + e) \approx_C (a, u)$.

Corollary 5. If there exists an efficient algorithm $\mathcal{A}$ that can solve the Ring-LPN problem in polynomial time, then there also exists an algorithm $\mathcal{A}'$ that can solve the LPN problem.

Proof. The proof method is similar to that of Lemma 5, but this way the computational complexity of $\mathcal{A}$ will decrease. If we want the Ring-LPN problem to be 'approximately' as hard as the LPN problem, then for the security parameters $\kappa_1$ of the Ring-LPN problem and $\kappa_2$ of the LPN problem, we have

$$\frac{e^{\kappa_1}}{\kappa_1^2} \ge e^{\kappa_2}, \quad\text{or}\quad \frac{(\kappa_1)!}{\kappa_1^2} \ge (\kappa_2)!.$$

Thus, we can roughly obtain $\kappa_1 \ge 1.5\kappa_2$ and $\kappa_2 \ge 12$. Note that $O(n)$ is an asymptotically large quantity with respect to $n$. We use the most extreme case to determine the relationship between $\kappa_1$ and $\kappa_2$. □

4.2. Perturbed pseudorandom generator

Definition 14. Let $a = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1} \in \mathcal{R}_{\{0,1\}}$. Define the norm of $a$ as $\|a\|$, and

$$\|a\| = \sqrt{\sum_{i=0}^{n-1} |a_i|^2}.$$

Definition 15 ([15]). A pseudorandom generator with perturbation, denoted as $G_\gamma(\cdot)$, is defined such that for any two inputs $x_1, x_2$ in its domain, there exists $\gamma$ satisfying the following conditions:

1. When $x_1 = x_2$, $\Pr(G_\gamma(x_1) = G_\gamma(x_2)) \le O(\exp(-n))$, and $\|G_\gamma(x_1) - G_\gamma(x_2)\| < \gamma$.
2. When $x_1 \ne x_2$, there exists $N$ such that $\|G_\gamma(x_1) - G_\gamma(x_2)\| \ge \gamma \cdot N$, where clearly $N = 1$ is optimal.

Theorem 1. The Ring-LPN problem itself can be viewed as a pseudorandom function with perturbations.

Proof. We prove each statement separately. First, when $x_1 = x_2$, we have

$$\Pr\left(G_\gamma(x_1) = G_\gamma(x_2)\right) = \Pr(e_1 = e_2) = \frac{1}{2^n}.$$

Additionally, set $\gamma = \sqrt{n + 1}$, so

$$\|(Ax_1 + e_1) - (Ax_2 + e_2)\| = \|e_1 - e_2\| < \gamma.$$

When $x_1 \ne x_2$, set $v_1 = G_\gamma(x_1)$, $v_2 = G_\gamma(x_2)$, and know that

$$\Pr\left(\|v_1 - v_2\| \le \sqrt{n}\right) = \sum_{k=0}^{n} C_n^k \left(\frac{1}{3}\right)^{k} \left(\frac{1}{2}\right)^{n-k} + \sum_{k=0}^{n/2} C_n^k \left(\frac{1}{3}\right)^{k} \left(\frac{1}{6}\right)^{k} \left(\frac{1}{2}\right)^{n-2k}.$$

Because

$$\sum_{k=0}^{n} C_n^k \left(\frac{1}{3}\right)^{k} \left(\frac{1}{2}\right)^{n-k} = \frac{1}{2^n}\left( \frac{2}{3} + \left(\frac{2}{3}\right)^{2} + \cdots + \left(\frac{2}{3}\right)^{n} \right) = \frac{3}{2^n}\left( 1 - \left(\frac{2}{3}\right)^{n} \right),$$

and

$$\sum_{k=0}^{n/2} C_n^k \left(\frac{1}{3}\right)^{k} \left(\frac{1}{6}\right)^{k} \left(\frac{1}{2}\right)^{n-2k} \le \frac{3 \cdot 6}{17} \cdot \frac{1}{2^{n}} \left( 1 - \left(\frac{1}{3 \cdot 6}\right)^{n/2} \right).$$

Therefore

$$\Pr\left(\|v_1 - v_2\| \le \sqrt{n} < \sqrt{n + 1}\right) \le \frac{1}{2^n}.$$

Thus, there is a very high probability that $\|v_1 - v_2\| \ge \sqrt{n + 1}$, and $N = 1$ (see Fig. 6). □

Fig. 6. Pseudorandom generator with perturbation $G_\gamma(\cdot)$.

4.3. PSI based on OPRF

Lemma 6. Assuming $f(y) \approx_C u_1$ and $g(u_1) \approx_C u_2$, then $(g \circ f)(y) \approx_C u_2$.
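The first case in the proof of Theorem 1 (same input, fresh perturbation) can be illustrated with a small Python sketch. It treats $G_\gamma(x)$ as $a \cdot x + e$ over $\mathbb{Z}_2[x]/\langle x^n + 1 \rangle$ with a uniformly random bit vector $e$. The uniform noise, the function names and the toy size $n = 32$ are assumptions made for illustration. Two evaluations on the same $x$ then differ only by $e_1 - e_2$, whose Euclidean norm cannot exceed $\sqrt{n}$.

```python
import math
import random


def ring_mul_mod2(a, x):
    """Multiply two bit-polynomials in Z_2[x]/<x^n + 1>."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j, xj in enumerate(x):
                # Over Z_2 the sign from x^n = -1 vanishes, so wrap-around is a plain XOR.
                out[(i + j) % n] ^= xj
    return out


def perturbed_generator(a, x, rng):
    """Return a*x + e with a fresh uniformly random perturbation e (assumed noise model)."""
    e = [rng.randrange(2) for _ in a]
    return [c ^ ei for c, ei in zip(ring_mul_mod2(a, x), e)]


def distance(v1, v2):
    """Euclidean norm of the coefficient-wise integer difference, as in Definition 14."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(v1, v2)))


if __name__ == "__main__":
    rng = random.Random(2025)
    n = 32  # toy ring dimension (assumption)
    a = [rng.randrange(2) for _ in range(n)]
    x = [rng.randrange(2) for _ in range(n)]
    v1 = perturbed_generator(a, x, rng)
    v2 = perturbed_generator(a, x, rng)
    print(distance(v1, v2), math.sqrt(n))
```

Because the two outputs share the term $a \cdot x$, the printed distance is the square root of the Hamming distance between the two perturbations.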
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 7. PSI based on OPRF.

Fig. 8. Parallel comparison of PSI on MAC, where $n$ represents the security parameter, unit is microseconds.

Fig. 9. Parallel comparison of PSI on mobile pads, where $n$ represents the security parameter, unit is microseconds.

Fig. 10. Comparison of PSI on mobile phones, where $n$ represents the security parameter, unit is microseconds.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 11. PIR based on OPRF.

Fig. 12. Parallel comparison of PIR on MAC, where $n$ represents the security parameter, unit is microseconds.

Lemma 7. Find a suitable pseudorandom function $\tilde{F}_k : \mathcal{R}_{\{0,1\}} \times \{0, 1\}^* \to \mathcal{R}_{\{0,1\}}$. Assuming that the pseudorandom function $F_k : \mathcal{R}_{\{0,1\}} \times \mathcal{R}_{\{0,1\}} \to \mathcal{R}_{\{0,1\}}$ and the hash function $\mathcal{H}_1 : \{0, 1\}^* \to \mathcal{R}_{\{0,1\}}$ are indistinguishable, we have

$$\tilde{F}_k(y) \approx_C F_k(\mathcal{H}_1(y)).$$

Proof. On one hand, because of the pseudorandom function $\tilde{F}_k : \mathcal{R}_{\{0,1\}} \times \{0, 1\}^* \to \mathcal{R}_{\{0,1\}}$, for any $k \in \mathcal{R}_{\{0,1\}}$ and any $y$ in the input set (a subset of $\{0, 1\}^*$), we have $\tilde{F}_k(y) \approx_C u_\omega \in \mathcal{R}_{\{0,1\}}$.

On the other hand, due to the pseudorandom function $F_k : \mathcal{R}_{\{0,1\}} \times \mathcal{R}_{\{0,1\}} \to \mathcal{R}_{\{0,1\}}$, for $u_{\ell_1} \in \mathcal{R}_{\{0,1\}}$, we have $F_k(u_{\ell_1}) \approx_C u_\omega$. According to the property of the hash function, we have $\mathcal{H}_1(y) \approx_C u_{\ell_1}$. Combining with Lemma 6, one can obtain that $F_k(\mathcal{H}_1(y)) \approx_C u_\omega$. Consequently, $\tilde{F}_k(y) \approx_C F_k(\mathcal{H}_1(y))$. □

Theorem 2. If $\mathcal{H}_1$ is a collision resistant hash function, and $\mathcal{H}_2$ and $\mathcal{H}_3$ are Hamming correlation robust, then the protocol in Fig. 7 securely realizes PSI in the semi-honest model when parameters $m, w$ are chosen as described in [14].

Proof. Perspective from $P_1$.

Hyb0 $P_1$'s view and $P_2$'s output in the real protocol.

Hyb1 Same as Hyb0 except that on $P_2$'s side, for each $i \in [\omega]$, if $s[i] = 0$, then sample $A_i \leftarrow \{0, 1\}^m$ and compute $B_i = A_i \oplus D_i$; otherwise sample $B_i \leftarrow \{0, 1\}^m$ and compute $A_i = B_i \oplus D_i$. This hybrid is identical to Hyb0.

Hyb2 Initialize an $m \times w$ binary matrix $D$ to all 1's. Denote its column vectors by $D_1, \dots, D_\omega$. Then $D_1 = \cdots = D_\omega = 1^m$. For each $y$ in the input set, randomly select $v \leftarrow [m]^\omega$, and set $D_i[v[i]] = 0$ for all $i \in [\omega]$.

Hyb3 Find a suitable pseudorandom function $\tilde{F}_k : \mathcal{R}_{\{0,1\}} \times \{0, 1\}^* \to \mathcal{R}_{\{0,1\}}$. For each $y$ in the input set, compute $\tilde{v} = \tilde{F}_k(y)$, randomly select $v \leftarrow [m]^\omega$, and set $D_i[v[i]] = 0$ for all $i \in [\omega]$.

Hyb4 Let there be a pseudorandom function $F : \mathcal{R}_{\{0,1\}} \times \mathcal{R}_{\{0,1\}} \to \mathcal{R}_{\{0,1\}}$ and a hash function $\mathcal{H}_1 : \{0, 1\}^* \to \mathcal{R}_{\{0,1\}}$. For each $y$ in the input set, compute $v' = F_k(\mathcal{H}_1(y))$, randomly select $v \leftarrow [m]^\omega$, and set $D_i[v[i]] = 0$ for all $i \in [\omega]$.

Hyb5 Let there be a pseudorandom function $F : \mathcal{R}_{\{0,1\}} \times \mathcal{R}_{\{0,1\}} \to \mathcal{R}_{\{0,1\}}$, Hamming Correlation Robustness $\mathcal{H}_2 : \mathbb{Z}_{\{0,1\}}^{m \times \omega} \to \mathcal{R}_{\{0,1\}}$ and a hash function $\mathcal{H}_1 : \{0, 1\}^* \to \mathcal{R}_{\{0,1\}}$. For each $y$ in the input set, compute $v' = F_k(\mathcal{H}_1(y))$, $v = \mathcal{H}_2(v')$, and set $D_i[v[i]] = 0$ for all $i \in [\omega]$.

Given that Hyb0 $\approx_C$ Hyb1 $\approx_C$ Hyb2 $\approx_C$ Hyb3, Hyb4 $\approx_C$ Hyb5, and according to Lemma 7, it follows that Hyb3 $\approx_C$ Hyb4. Therefore, we have Hyb0 $\approx_C$ Hyb5.

Perspective from $P_2$.

Hyb0 $P_2$'s view in the real protocol.

Hyb1 $\psi \leftarrow \mathcal{R}_{\{0,1\}}$, all other aspects are consistent with the real protocol.

Hyb2 Introduce $G_\gamma : \mathcal{R}_{\{0,1\}} \to \mathcal{R}_{\{0,1\}}$ and Hamming Correlation Robustness $\mathcal{H}_3 : \mathbb{Z}_{\{0,1\}}^{m \times \omega} \to \mathcal{R}_{\{0,1\}}$, let the initial matrices be $C_1 = \cdots = C_\omega = 1^m$, randomly select $v \in [m]^\omega$, set $C_i[v[i]] = 0$ for all $i \in [\omega]$. Compute $G_\gamma(C_1[v[1]] \| \cdots \| C_\omega[v[\omega]])$.
|
||||
|
||||
|
||||
|
||||
|
||||
Hyb3 Let the initial matrices be $C_1 = \cdots = C_\omega = 1^m$, find an appropriate pseudorandom function $\tilde{F}_k : \mathcal{R}_{\{0,1\}} \times \{0, 1\}^* \to \mathcal{R}_{\{0,1\}}$. For each $y$ in the input set, compute $\tilde{v} = \tilde{F}_k(y)$, randomly select $v \leftarrow [m]^\omega$, set $C_i[v[i]] = 0$ for all $i \in [\omega]$. Compute $G_\gamma(C_1[v[1]] \| \cdots \| C_\omega[v[\omega]])$.

Hyb4 Let the initial matrices be $C_1 = \cdots = C_\omega = 1^m$, set a pseudorandom function $F : \mathcal{R}_{\{0,1\}} \times \mathcal{R}_{\{0,1\}} \to \mathcal{R}_{\{0,1\}}$, a hash function $\mathcal{H}_1 : \{0, 1\}^* \to \mathcal{R}_{\{0,1\}}$ and Hamming Correlation Robustness $\mathcal{H}_3 : \mathbb{Z}_{\{0,1\}}^{m \times \omega} \to \mathcal{R}_{\{0,1\}}$. For each $y$ in the input set, compute $v' = F_k(\mathcal{H}_1(y))$, randomly select $v \leftarrow [m]^\omega$. Set $C_i[v[i]] = 0$ for all $i \in [\omega]$. Compute $G_\gamma(\mathcal{H}_3(C_1[v[1]] \| \cdots \| C_\omega[v[\omega]]))$.

Hyb5 Let the initial matrices be $C_1 = \cdots = C_\omega = 1^m$, set a pseudorandom function $F : \mathcal{R}_{\{0,1\}} \times \mathcal{R}_{\{0,1\}} \to \mathcal{R}_{\{0,1\}}$ and a hash function $\mathcal{H}_1 : \{0, 1\}^* \to \mathcal{R}_{\{0,1\}}$, Hamming Correlation Robustness $\mathcal{H}_2 : \mathbb{Z}_{\{0,1\}}^{m \times \omega} \to \mathcal{R}_{\{0,1\}}$ and $\mathcal{H}_3 : \mathbb{Z}_{\{0,1\}}^{m \times \omega} \to \mathcal{R}_{\{0,1\}}$. For each $y$ in the input set, compute $v' = F_k(\mathcal{H}_1(y))$ and $v = \mathcal{H}_2(v')$. Set $C_i[v[i]] = 0$ for all $i \in [\omega]$. Compute $G_\gamma(\mathcal{H}_3(C_1[v[1]] \| \cdots \| C_\omega[v[\omega]]))$.

Similarly, it can be proven that Hyb0 $\approx_C$ Hyb5. □

Definition 16 (CPA Security Model of the Protocol in Fig. 7). Assume there exists a perturbed pseudorandom oracle machine $PrM_\gamma$ (where $\gamma$ is the upper bound on the norm of the perturbation in $PrM_\gamma$), such that for an input $x$, it outputs two values: one is a random value $y_0$, and the other is a pseudorandom value $y_1$ with $x$ as its input.

• Setup The simulator generates the necessary parameters for the algorithms. The adversary chooses $s$ and sends it to the simulator using OT.
• Hash Queries, PRF Queries and PRG Queries The adversary sequentially performs hash function queries, pseudorandom function queries, and pseudorandom synthesizer queries. Here, the adversary cannot know the key in pseudorandom function queries.
• Challenge The adversary selects a private message $m$ and sends it to the simulator. The simulator queries the hash function, pseudorandom function, and oblivious transfer values of the real scheme, inputs these results into the pseudorandom oracle machine $PrM_\gamma$, obtains two ciphertexts $c_0$ and $c_1$, and sends them to the adversary.
• Guessing After receiving the two ciphertexts $c_0$ and $c_1$, the adversary guesses which ciphertext corresponds to the encryption of $m$ and sends the guess back to the simulator.

The advantage of the adversary is defined as the advantage of the simulator in distinguishing the outputs of $PrM_\gamma$.

Note 2. The $PrM$ mentioned in this paper differs from [22]. In [22], $PrM$ refers to a pseudorandom oracle machine that outputs random values when the adversary does not know the pseudorandom function key, and outputs pseudorandom function values based on the key known to the adversary when the key is known. This is a single-value output. However, the $PrM$ required in this paper outputs both of these values simultaneously, making it a multi-value output.

Theorem 3. If $\mathcal{H}_1$ is a collision resistant hash function, and $\mathcal{H}_2$ and $\mathcal{H}_3$ are Hamming correlation robust, then the protocol in Fig. 7 securely realizes PSI in Definition 16.

Proof. Suppose the adversary $P_1$ can break the scheme with non-negligible advantage. Now, the simulator simulates the scheme. Suppose there exists a black-box $G_\gamma^{\text{black-box}}$ such that

$$G_\gamma^{\text{black-box}}(x) \to (y_0, y_1), \qquad y_0 = G_\gamma(x) \in \mathcal{R}_{\{0,1\}}, \quad y_1 \in_R \mathcal{R}_{\{0,1\}}.$$

• Setup The simulator generates some necessary parameters for the algorithms and selects appropriate hash functions $\mathcal{H}_1 : \{0, 1\}^* \to \mathcal{R}_{\{0,1\}}$, Hamming Correlation Robustness $\mathcal{H}_2 : \mathcal{R}_{\{0,1\}} \to [m]^\omega$, Hamming Correlation Robustness $\mathcal{H}_3 : \mathbb{Z}_{\{0,1\}}^{m \times \omega} \to \mathcal{R}_{\{0,1\}}$ and a $G_\gamma : \mathcal{R}_{\{0,1\}} \to \mathcal{R}_{\{0,1\}}$, a pseudorandom function $F : \mathcal{R}_{\{0,1\}} \times \mathcal{R}_{\{0,1\}} \to \mathcal{R}_{\{0,1\}}$ with key $k \in \mathcal{R}_{\{0,1\}}$. The adversary $P_1$ selects $s$ and transmits $s$ to the simulator using OT.
• H-Query, PRF-Query and PRG-Query The adversary $P_1$ makes queries about the hash function, pseudorandom function, oblivious transfer values, and pseudorandom generator. The simulator pre-establishes lists for handling H-Query, PRF-Query, and PRG-Query respectively.

– $\mathcal{H}_1$-Query For the $i$th query $x_i \in \{0, 1\}^*$ corresponding to the value of $\mathcal{H}_1$, the simulator selects from the hash value list if available, otherwise selects a random $X_i \in \mathcal{R}_{\{0,1\}}$. Set $X_i = \mathcal{H}_1(x_i)$ and update the list accordingly.
– $\mathcal{H}_2$-Query For the $i$th query $y_i \in \mathcal{R}_{\{0,1\}}$ corresponding to the value of $\mathcal{H}_2$, the simulator selects from the hash value list if available, otherwise selects a random $Y_i \in [m]^\omega$. Set $Y_i = \mathcal{H}_2(y_i)$ and update the list accordingly.
– $\mathcal{H}_3$-Query For the $i$th query $z_i \in \mathbb{Z}_{\{0,1\}}^{m \times \omega}$ corresponding to the value of $\mathcal{H}_3$, the simulator selects from the hash value list if available, otherwise selects a random $Z_i \in \mathcal{R}_{\{0,1\}}$. Set $Z_i = \mathcal{H}_3(z_i)$ and update the list accordingly.
– $F$-Query For the $i$th query $u_i \in \mathcal{R}_{\{0,1\}}$ corresponding to the value of $F$, the simulator selects from the pseudorandom function value list if available, otherwise selects a random $U_i \in \mathcal{R}_{\{0,1\}}$. Set $U_i = F(u_i, k)$ and update the list accordingly.
– $G_\gamma$-Query For the $i$th query $w_i \in \mathcal{R}_{\{0,1\}}$ corresponding to the value of $G'_\gamma$, the simulator selects from the pseudorandom generator value list if available, otherwise selects a random $W_i \in \mathcal{R}_{\{0,1\}}$. Set $W_i = G'_\gamma(w_i)$ and update the list accordingly.

Note that $G'_\gamma$ is not $G_\gamma^{\text{black-box}}$.

• Challenge $P_1$ selects a private message $m$ and sends it to the simulator. Using the corresponding hash function queries and pseudorandom function queries, the simulator inputs the queried values into the black-box $G'_\gamma$, obtaining $\psi_0$ and $\psi_1$, and then sends $\psi_0, \psi_1$ to $P_1$.
• Guess Based on the received $\psi_0$ and $\psi_1$, $P_1$ guesses whether $\psi_0$ or $\psi_1$ is the ciphertext of the encrypted message $m$.

According to the assumption, if the adversary $P_1$ can break the scheme with a non-negligible advantage, then the simulator can also break the black-box $G'_\gamma$ with a non-negligible advantage. This contradicts the assumption that $G'_\gamma$ is secure. □

4.4. Efficiency analysis PSI

This section simulates the PSI computation efficiency of this paper and PSI in [14] on MAC, Pad, and Phone. The PRF of [14] is instantiated based on LWE.

4.4.1. Efficiency analysis on MAC

The programs in this subsection are written in Python 3.12 and run on a MacBook Air (Apple M1, RAM 8.00 GB) (see Fig. 8).

4.4.2. Efficiency analysis on mobile pad

The programs in this subsection are written with Pydroid 3 and run on a Xiaomi Pad 6 Pro (1st Qualcomm(R) AI Engine(TM) Snapdragon 8+ mobile platform @ 3.2 GHz, RAM 8.00+3.00 GB) (see Fig. 9).
|
||||
|
||||
|
||||
|
||||
|
||||
4.5. Analysis of efficiency on mobile phones

The programs in this subsection are written with Pydroid 3 and run on a Redmi K30 (4th Qualcomm(R) AI Engine(TM), Qualcomm Snapdragon 730G 8+ mobile platform @ 2.2 GHz, RAM 6.00 GB) (see Fig. 10).

4.5.1. Summary of data comparison

From the simulation results, it can be seen that for $n \le 400$, the LWE-based OPRF in [14] is slightly faster, while for $n > 400$, the ring LPR-based OPRF in this paper is faster. Furthermore, as $n$ increases, the advantages of ring LPR become more pronounced. Based on the simulation results for Pad, the OPRF in this paper is more stable; although there are fluctuations, they are less significant compared to the LWE-based OPRF in [14].

5. Expansion of this work

Private Information Retrieval (PIR) [23–29] is a technique that enables a client to securely download a specific element, such as a movie or a friend's record, from a database managed by an untrusted server, such as a streaming service or a social network, without disclosing to the server which particular element has been retrieved. Given the functional similarities between PIR and PSI, this paper extends its exploration into the construction of PIR using OPRF (see Fig. 11).

5.1. Efficiency analysis PIR

This section simulates the PIR computation efficiency of this paper and the machine learning-based PIR in [30] (DLMI for short) on MAC. The programs in this subsection are written in Python 3.12 and run on a MacBook Air (Apple M1, RAM 8.00 GB).

The OPRF-based PIR proposed in this paper has a runtime that differs from the machine learning-based PIR by no more than approximately $5 \times 10^{-3}$ seconds. Additionally, the security of our PIR scheme is theoretically supported in comparison to [30] (see Fig. 12).

6. Conclusion

This paper presents a PSI based on efficient post-quantum OPRF and proves its security under the semi-honest model, demonstrating security even in the CPA model in Definition 16. The addition of PPRG enables the PSI to effectively resist probabilistic attacks. In the simulation experiments, the proposed PSI shows greater efficiency compared to post-quantum PSIs represented by LWE.

Although the PIR in this study is not as efficient as the machine learning-based PIR, the gap between the two is already quite small. However, there are also notable shortcomings; the efficiency of the

Acknowledgments

This work was supported in part by the National Nature Science Foundation of China under Grant 61872087 and Grant 51875457; in part by the Key Foundation of National Natural Science Foundation of China under Grant U19B2021; and in part by the Key Research and Development Program of Shaanxi under Program 2022GY-028 and Program 2022GY-050.

Data availability

No data was used for the research described in the article.

References

[1] R. Lei, X. Chen, D. Liu, C. Song, Y. Tan, A. Ren, CEIU: Consistent and efficient incremental update mechanism for mobile systems on flash storage, J. Syst. Archit. 152 (2024) 103151, http://dx.doi.org/10.1016/j.sysarc.2024.103151, URL: https://www.sciencedirect.com/science/article/pii/S1383762124000882.
[2] J. Sun, L. Yin, M. Zou, Y. Zhang, T. Zhang, J. Zhou, Makespan-minimization workflow scheduling for complex networks with social groups in edge computing, J. Syst. Archit. 108 (2020) 101799, http://dx.doi.org/10.1016/j.sysarc.2020.101799, URL: https://www.sciencedirect.com/science/article/pii/S1383762120300928.
[3] Y. Gao, Y. Luo, L. Wang, X. Liu, L. Qi, W. Wang, M. Zhou, Efficient scalable multi-party private set intersection(-variants) from bicentric zero-sharing, in: Proceedings of the Conference on Computer and Communications Security, CCS, Association for Computing Machinery (ACM), New York, NY, USA, 2024.
[4] M.O. Rabin, How to exchange secrets with oblivious transfer, 2005, URL: https://eprint.iacr.org/2005/187.
[5] O. Goldreich, S. Goldwasser, S. Micali, How to construct random functions, J. ACM 33 (4) (1986) 792–807, http://dx.doi.org/10.1145/6490.6503.
[6] M. Naor, O. Reingold, Number-theoretic constructions of efficient pseudo-random functions, J. ACM 51 (2) (2004) 231–262, http://dx.doi.org/10.1145/972639.972643.
[7] M.J. Freedman, Y. Ishai, B. Pinkas, O. Reingold, Keyword search and oblivious pseudorandom functions, in: J. Kilian (Ed.), Theory of Cryptography, Springer Berlin Heidelberg, Berlin, Heidelberg, 2005, pp. 303–324.
[8] S. Jarecki, X. Liu, Efficient oblivious pseudorandom function with applications to adaptive OT and secure computation of set intersection, in: O. Reingold (Ed.), Theory of Cryptography, Springer Berlin Heidelberg, Berlin, Heidelberg, 2009, pp. 577–594.
[9] V.K. Yadav, N. Andola, S. Verma, S. Venkatesan, A survey of oblivious transfer protocol, ACM Comput. Surv. 54 (10s) (2022) http://dx.doi.org/10.1145/3503045.
[10] M.R. Albrecht, A. Davidson, A. Deo, N.P. Smart, Round-optimal verifiable oblivious pseudorandom functions from ideal lattices, in: J.A. Garay (Ed.), Public-Key Cryptography – PKC 2021, Springer International Publishing, Cham, 2021, pp. 261–289.
[11] N. Tyagi, S. Celi, T. Ristenpart, N. Sullivan, S. Tessaro, C.A. Wood, A fast and simple partially oblivious PRF, with applications, in: O. Dunkelman, S. Dziembowski (Eds.), Advances in Cryptology – EUROCRYPT 2022, Springer International Publishing, Cham, 2022, pp. 674–705.
[12] S. Casacuberta, J. Hesse, A. Lehmann, Sok: Oblivious pseudorandom functions, in: 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P), 2022, pp. 625–646, http://dx.doi.org/10.1109/EuroSP53844.2022.00045.
|
||||
proposed PSI still lags behind that of non-post-quantum PSIs, which [13] D. Boneh, D. Kogan, K. Woo, Oblivious pseudorandom functions from isogenies,
|
||||
will be addressed in future work. in: S. Moriai, H. Wang (Eds.), Advances in Cryptology – ASIACRYPT 2020,
|
||||
Springer International Publishing, Cham, 2020, pp. 520–550.
|
||||
[14] M. Chase, P. Miao, Private set intersection in the internet setting from lightweight
|
||||
CRediT authorship contribution statement oblivious PRF, in: D. Micciancio, T. Ristenpart (Eds.), Advances in Cryptology –
|
||||
CRYPTO 2020, Springer International Publishing, Cham, 2020, pp. 34–63.
|
||||
Zhuang Shan: Writing – original draft, Conceptualization. Leyou [15] Z. Shan, L. Zhang, Q. Wu, Q. Lai, Analysis, modify and apply in IIOT form
|
||||
Zhang: Writing – review & editing, Writing – original draft. Qing Wu: light-weight PSI in CM20, 2024, URL: https://eprint.iacr.org/2024/969.
|
||||
[16] J. Alwen, S. Krenn, K. Pietrzak, D. Wichs, Learning with rounding, revisited, in:
|
||||
Conceptualization. Qiqi Lai: Writing – review & editing. Fuchun Guo:
|
||||
R. Canetti, J.A. Garay (Eds.), Advances in Cryptology – CRYPTO 2013, Springer
|
||||
Writing – review & editing. Berlin Heidelberg, Berlin, Heidelberg, 2013, pp. 57–74.
|
||||
[17] A. Banerjee, C. Peikert, A. Rosen, Pseudorandom functions and lattices, in: D.
|
||||
Declaration of competing interest Pointcheval, T. Johansson (Eds.), Advances in Cryptology – EUROCRYPT 2012,
|
||||
Springer Berlin Heidelberg, Berlin, Heidelberg, 2012, pp. 719–737.
|
||||
[18] D. Bellizia, C. Hoffmann, D. Kamel, H. Liu, P. Méaux, F.-X. Standaert, Y.
|
||||
The authors declare that they have no known competing finan- Yu, Learning parity with physical noise: Imperfections, reductions and FPGA
|
||||
cial interests or personal relationships that could have appeared to prototype, IACR Trans. Cryptogr. Hardw. Embed. Syst. 2021 (2021) 390–417,
|
||||
influence the work reported in this paper. URL: https://api.semanticscholar.org/CorpusID:235814670.
|
||||
|
||||
|
||||
|
||||
Z. Shan et al. Journal of Systems Architecture 160 (2025) 103346
|
||||
|
||||
|
||||
[19] Y. Yu, J. Zhang, Smoothing out binary linear codes and worst-case sub-exponential hardness for LPN, in: T. Malkin, C. Peikert (Eds.), Advances in Cryptology – CRYPTO 2021, Springer International Publishing, Cham, 2021, pp. 473–501.
[20] V. Kolesnikov, R. Kumaresan, M. Rosulek, N. Trieu, Efficient batched oblivious PRF with applications to private set intersection, in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS ’16, Association for Computing Machinery, New York, NY, USA, 2016, pp. 818–829, http://dx.doi.org/10.1145/2976749.2978381.
[21] Z. Brakerski, E. Kirshanova, D. Stehlé, W. Wen, Learning with errors and extrapolated dihedral cosets, in: Public-Key Cryptography – PKC 2018, Springer International Publishing, 2018, pp. 702–727.
[22] A. Jain, H. Lin, J. Luo, D. Wichs, The pseudorandom oracle model and ideal obfuscation, in: H. Handschuh, A. Lysyanskaya (Eds.), Advances in Cryptology – CRYPTO 2023, Springer Nature Switzerland, Cham, 2023, pp. 233–262.
[23] S. Angel, H. Chen, K. Laine, S. Setty, PIR with compressed queries and amortized query processing, in: 2018 IEEE Symposium on Security and Privacy, SP, 2018, pp. 962–979, http://dx.doi.org/10.1109/SP.2018.00062.
[24] A. Burton, S.J. Menon, D.J. Wu, Respire: High-rate PIR for databases with small records, in: Proceedings of the Conference on Computer and Communications Security, CCS, Association for Computing Machinery (ACM), New York, NY, USA, 2024.
[25] J. Dujmovic, M. Hajiabadi, Lower-bounds on public-key operations in PIR, in: M. Joye, G. Leander (Eds.), Advances in Cryptology – EUROCRYPT 2024, Springer Nature Switzerland, Cham, 2024, pp. 65–87.
[26] B. Fisch, A. Lazzaretti, Z. Liu, C. Papamanthou, Thorpir: Single server PIR via homomorphic thorp shuffles, in: Proceedings of the Conference on Computer and Communications Security, CCS, Association for Computing Machinery (ACM), New York, NY, USA, 2024.
[27] A. Gascon, Y. Ishai, M. Kelkar, B. Li, Y. Ma, M. Raykova, Computationally secure private information retrieval and aggregation in the shuffle model, in: Proceedings of the Conference on Computer and Communications Security, CCS, Association for Computing Machinery (ACM), New York, NY, USA, 2024.
[28] A. Ghoshal, M. Zhou, E. Shi, Efficient pre-processing PIR without public-key cryptography, in: M. Joye, G. Leander (Eds.), Advances in Cryptology – EUROCRYPT 2024, Springer Nature Switzerland, Cham, 2024, pp. 210–240.
[29] M. Luo, F.-H. Liu, H. Wang, Faster FHE-based single-server private information retrieval, in: Proceedings of the Conference on Computer and Communications Security, CCS, Association for Computing Machinery (ACM), New York, NY, USA, 2024.
[30] M. Lam, J. Johnson, W. Xiong, K. Maeng, U. Gupta, Y. Li, L. Lai, I. Leontiadis, M. Rhu, H.-H.S. Lee, V.J. Reddi, G.-Y. Wei, D. Brooks, E. Suh, GPU-based private information retrieval for on-device machine learning inference, in: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, ASPLOS ’24, Association for Computing Machinery, New York, NY, USA, 2024, pp. 197–214, http://dx.doi.org/10.1145/3617232.3624855.

Zhuang Shan received the B.S. degree from Liaoning Institute of Science and Technology, Benxi, China, in 2019, and the M.S. degree from North Minzu University, Yinchuan, China, in 2022. He is currently pursuing the Ph.D. degree in mathematics with Xidian University, Xi’an, China. His current interests include cryptography, reduction of hard problems in lattices, and network security.

Leyou Zhang received the M.S. and Ph.D. degrees from Xidian University, Xi’an, China, in 2002 and 2009, respectively. From 2013 to 2014, he served as a visiting scholar at the University of Wollongong, Australia. He currently works at Xidian University as a professor. His current research interests include public key cryptography, network security and computer security. He has over 120 scientific publications in many highly ranked cybersecurity journals and conferences.

Qing Wu received the M.S. and Ph.D. degrees from Xidian University, Xi’an, China, in 2006 and 2009, respectively. She currently works with Xi’an University of Posts and Communications, Xi’an, as a Professor. Her current research interests include artificial intelligence security and cloud security.

Qiqi Lai received the B.S. degree from PLA University of Information Engineering, Henan, China, in 2008, and the M.S. and Ph.D. degrees from Xidian University, Xi’an, China, in 2011 and 2015. He currently works with Shaanxi Normal University, Xi’an, as a Professor. His current research interests include the theory of lattice-based public key cryptography and its provable security, as well as the construction and analysis of homomorphic encryption schemes.

Fuchun Guo received the B.S. and M.S. degrees from Fujian Normal University, China, in 2005 and 2008, respectively, and the Ph.D. degree from the University of Wollongong, Australia, in 2013. He is currently an Associate Research Fellow with the School of Computing and Information Technology, University of Wollongong. His primary research interests include public key cryptography, in particular protocols, encryption and signature schemes, and security proofs.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,989 @@
|
||||
Computer Standards & Interfaces 97 (2026) 104097
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fully decentralized period k-times anonymous authentication with access criteria [I, II]

Hongyan Di (a), Yinghui Zhang (a, ∗), Ziqi Zhang (a), Yibo Pang (a), Rui Guo (a), Yangguang Tian (b)

(a) School of Cyberspace Security, Xi’an University of Posts & Telecommunications, 710121, Xi’an, China
(b) University of Surrey, GU2 7XH, Surrey, UK
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO

Keywords: Fully decentralized; Publicly auditable; Access criteria; Anonymous authentication; Signature proof of knowledge

ABSTRACT

The explosive growth of Internet user devices highlights the strong and urgent need for digital identity infrastructure. However, the existing decentralized identity schemes are still not fully decentralized, and there is still a contradiction between publicly auditable credentials and maintaining anonymity. Therefore, using advanced cryptographic techniques such as signature proofs of knowledge, Pedersen commitments, and Merkle trees, this paper proposes a fully decentralized period k-times anonymous authentication scheme with access criteria. The scheme allows user credentials to be publicly audited, lets users manage their identities independently, and enables the verifier not only to verify the user’s identity but also to implement access control. The issuer does not need to hold a key or maintain a list, authentication remains possible even after the trusted center is attacked, and only three zero-knowledge proofs are needed for registration and verification. The security analysis indicates that this scheme satisfies unforgeability, anonymity, unlinkability and attribute privacy. Performance evaluation shows significant improvements in both computational and communication efficiency over existing schemes.
|
||||
|
||||
|
||||
|
||||
1. Introduction

With the surge in digital services accessed through network connections, the number of digital identities has seen an unprecedented increase. The vast majority of the global population now has at least one digital identity, which becomes the key to unlocking a variety of online functions and services. However, the concept of digital identity goes far beyond human identity recognition [1]. With the wide adoption of IoT and the powerful functions of the 5th Generation Mobile Communication Technology (5G) network, as well as the upcoming 6th Generation Mobile Communication Technology (6G), the number of connected devices has increased significantly [2]. These devices require unique digital identities to enable their participation in digital ecosystems, such as establishing secure communications.

Authentication and authorization are crucial security-related core tasks in the digital world. Their purpose is to ensure the authenticity of the identities of the communicating parties and implement access control over digital resources such as services. The core of this system is the concept of digital identity. The evolution of digital identity has gone through multiple eras, during which digital identity recognition has gradually shifted from centralized to decentralized identity models [3]. In fact, the way entities prove the ownership of digital identities may be affected by various vulnerabilities [4]. The current Internet ecosystem generally adopts the centralized Identity Provider (IdP) model, with tech giants such as Google and Facebook (now Meta) serving as the custodians of digital identities. Other services can directly rely on the identity information provided by the IdP. Although this architecture simplifies the authentication process by achieving single sign-on through protocols such as OAuth, it has fundamental flaws when examined from the perspective of privacy protection: users lose control over their digital identities [5], and all their identity attributes are centrally stored on the IdP’s servers. Users neither know how these data are used nor can they effectively manage their flow. More seriously, this architecture has created a dangerous “data island” phenomenon: the IdP can fully
|
||||
|
||||
|
||||
I This article is part of a Special issue entitled: ‘Information Security and Privacy’ published in Computer Standards & Interfaces.
|
||||
II This work is supported by the National Cryptologic Science Fund of China (2025NCSF02037), the National Natural Science Foundation of China (62072369),
|
||||
the Youth Innovation Team of Shaanxi Universities (23JP160), the Shaanxi Special Support Program Youth Top-notch Talent Program, the Technology Innovation
|
||||
Leading Program of Shaanxi (2023-YD-CGZH-31), the Technology Innovation Guidance Special Fund of Shaanxi Province (2024QY-SZX-17), the Graduate
|
||||
Innovation Fund of Xi’an University of Posts and Telecommunications (CXJJBDL2024004).
|
||||
∗ Corresponding author.
|
||||
E-mail addresses: 15029659213@163.com (H. Di), yhzhaang@163.com (Y. Zhang), qiqizhang0408@163.com (Z. Zhang), ybpang1998@163.com (Y. Pang),
|
||||
guorui@xupt.edu.cn (R. Guo), yangguang.tian@surrey.ac.uk (Y. Tian).
|
||||
URLs: https://www.xiyou.edu.cn/ (Y. Zhang), http://www.surrey.ac.uk (Y. Tian).
|
||||
|
||||
https://doi.org/10.1016/j.csi.2025.104097
|
||||
Received 12 July 2025; Received in revised form 26 September 2025; Accepted 11 November 2025
|
||||
Available online 19 November 2025
|
||||
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
|
||||
H. Di et al. Computer Standards & Interfaces 97 (2026) 104097
|
||||
|
||||
|
||||
grasp the cross-platform service usage trajectory and behavioral characteristics of users, essentially constructing a panoramic user profile. The IdP can also obtain information about all the network services used by users (and related usage data). When the server storing user data is breached, sensitive personal information may be obtained by malicious attackers, causing significant loss of personal data and damaging the reputation of stakeholders [6]. In 2022 alone, there were over 1800 major data breaches worldwide, involving more than 400 million user records. The increasing number of data breach cases has raised significant concerns about data confidentiality and transparency in the field of digital identity management. In addition, centralized identity management systems rely on specific identity service nodes, making them vulnerable to single-point-of-failure problems [7].

Therefore, the increasing popularity of online services, the growing trend of decentralization, and the rising awareness of the shortcomings of traditional methods are paving the way for more secure and privacy-protecting approaches. Under this trend, supported by current laws and regulations (such as the General Data Protection Regulation (GDPR) of the European Union) [8], the concept of Self-Sovereign Identity (SSI) [9] has attracted significant attention from both academia and industry. SSI is based on the idea that individuals should have full control over their information without being forced to outsource data to any centralized institution or third party. Such technologies play a crucial role in establishing trust among entities (including humans and non-human entities such as IoT devices) and ensuring communication security through digital identities. Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs), as effective solutions for enhancing privacy and security, have been promoted in multiple application fields such as intelligent transportation and smart healthcare. These standards can be extended to anyone or anything, covering cloud, edge, and IoT resources. It is worth noting that several institutions, including industry giants such as Microsoft, have recently developed and released a variety of implementation plans to support these technologies. In addition, government agencies worldwide are actively promoting the widespread application of DIDs and VCs. For instance, the European Union promulgated Regulation 2024/1183 [10] in May 2024, establishing the European digital identity framework, which aims to provide European citizens with digital passes for cross-border access to public and private services through the SSI system. This represents a significant milestone in the development of digital identity solutions. However, current decentralized anonymous authentication schemes still face significant challenges. These include the inability to achieve full decentralization, a lack of mutual trust between users and issuers, and the persistent contradiction between public verifiability and true anonymity. Against this backdrop, AI-driven identity threat analysis has become a new focus of security research. Initiatives such as the Global Digital Identity Wallet (GDIW) have launched cross-border interoperability tests, while “Digital Identity Chain” has completed the integration of DIDs with the national government service platform. These efforts represent preliminary but critical explorations in addressing these underlying issues.

2. Related work

2.1. Decentralized anonymous credential (DAC)

In the 1980s, David Chaum [11,12] introduced privacy-preserving cryptographic techniques, aiming to create a more privacy-focused and user-centered authentication and authorization solution. These techniques enable users to prove their membership, identity, or any other arbitrary attribute in a group in a privacy-preserving manner. Such techniques are often referred to as anonymous credentials (ACs), and various methods for building AC systems have been widely studied in the academic community. Since Camenisch and Lysyanskaya [13] first proposed a completely anonymous credential scheme in 2001, a large number of anonymous credential constructions suitable for various scenarios have emerged. These include zero-knowledge credentials, lightweight anonymous credentials without heavy zero-knowledge proofs and other computationally intensive operations, self-blinding credentials, group signatures, AC schemes without unlinkability, and post-quantum AC schemes. In order to reduce the trust dependence of the credential issuance process on a central authority in traditional anonymous credential schemes, Garman et al. [14] proposed the concept of decentralized anonymous credential (DAC), which allows users to construct and manage credentials in a completely anonymous manner. Derler et al. [15] designed a new revocable multi-show attribute anonymous credential based on previous work, which has good scalability and constant operation of two roles. Bui and Aura [16] developed a distributed access control revocation framework to facilitate the manipulation of revocation methods. Subsequently, Sonnino et al. [17] proposed a special selective disclosure voucher solution based on blind signatures and bilinear pairing, which yields short and highly efficient vouchers. Inspired by Sonnino’s work, Halpin [18] redesigned the tagging mechanism to improve scalability and support embedding arbitrary attributes. Cui et al. [19] constructed a Blockchain Digital Identity Management System (BDIdM) by extending the functional features of the DAC scheme [14], which enabled limited reusability of specific credentials while maintaining the security of the DAC scheme. In addition, decentralized anonymous credentials have been widely integrated into other scenarios. Lin et al. [20] applied the DAC scheme to the smart grid scenario and enhanced the privacy protection mechanism. Solutions for blockchain-based Internet of Vehicles scenarios include [21–25], and Zeng et al. [26] applied anonymous credentials to cross-domain authentication in IIoT.

2.2. 𝑘-Times anonymous authentication (𝑘-TAA)

Periodic 𝑘-times anonymous authentication allows users to be authenticated up to 𝑘 times within a given time period while remaining anonymous. Teranishi et al. [27] introduced the first 𝑘-TAA scheme, allowing the identification of users who exceeded the authentication limit. Nguyen and Safavi-Naini [28] extended this concept to dynamic 𝑘-TAA, enabling each authenticator to independently grant or revoke access rights. Au et al. [29] proposed a fixed-size dynamic 𝑘-TAA scheme. Chaterjee et al. [30] proposed a 𝑘-TAA scheme based on physically unclonable functions (PUFs), which is applicable to trusted platform modules (TPM). Huang et al. [31] designed an efficient 𝑘-TAA system tailored for pay-as-you-go pricing, facilitating multiple service accesses and related payments within each certification cycle. However, many existing 𝑘-TAA schemes fail to provide periodic anonymous authentication. Although the existing schemes [32,33] support periodic anonymous authentication, they fall short in supporting the selective disclosure of credential attributes needed for fine-grained authentication. In addition, they require a large number of pairing operations, resulting in significant verification delays. In contrast, the schemes in [34,35] support periodic 𝑘-times anonymous authentication while reducing cumbersome pairing operations. However, scheme [34] does not support credential revocation. As shown in Table 1, our scheme, while meeting the above requirements, supports full decentralization and access control.

• Research Contributions
Next, we list the main research contributions of this paper.
The Proposed Scheme: We propose a fully decentralized 𝑘-times period anonymous authentication scheme with access control. The scheme enforces both access criteria and authentication during the verification process, while eliminating the need for issuers to hold keys or maintain lists, thus remaining secure even if the trusted center is compromised. Only three zero-knowledge proofs are required for registration and verification.
Security Analysis: We conducted a correctness and theoretical security analysis based on the game definition of the proposed
|
||||
|
||||
|
||||
|
||||
|
||||
Table 1
Function comparison.

Security features                           [29]  [30]  [31]  [33]  [19]  [34]  [35]  Our scheme
Anonymity                                   ✓     ✓     ✓     ✓     ✓     ✓     ✓     ✓
Unlinkability                               ✓     N.A.  ✓     N.A.  ✓     ✓     ✓     ✓
𝑘-times period anonymous authentication     ×     ×     ×     ✓     ×     ✓     N.A.  ✓
Publicly auditable                          N.A.  ×     N.A.  N.A.  ✓     ✓     ✓     ✓
Select attribute disclosure                 ×     ×     ×     ×     ✓     ✓     N.A.  ✓
Key forward and backward secure             ✓     ✓     ✓     ✓     ✓     ✓     ✓     ✓
Reveal violator’s identity without TTP      ✓     ✓     ×     ✓     ✓     ✓     ×     ✓
Issuer not hold key and identity list       ×     ×     ×     ×     ×     ×     ×     ✓
Support credential revocation               ✓     ✓     ✓     ✓     ✓     ×     ✓     ✓

Note*: ✓: Supports this feature; ×: Does not support this feature; N.A.: Not applicable; TTP: Trusted third party.
|
||||
|
||||
|
||||
scheme. By simulating games and applying techniques such as programmable random oracles and the forking lemma, we demonstrate that the scheme achieves unforgeability, anonymity, unlinkability, and attribute privacy. This analysis indicates that the scheme protects the integrity and validity of the data.
Performance Evaluation: We conducted a detailed analysis of this authentication scheme, demonstrating its efficiency advantages over existing authentication schemes. Tests were also carried out on the secp256k1 and BLS12-381 curves, verifying that the proposed algorithm performs better on lightweight curves.

• Structure of Paper
The remainder of the paper is structured as follows: Section 3 introduces the problem assumptions and fundamentals. Section 4 defines the syntax, security model, and detailed construction of the scheme. Section 5 analyzes its correctness and theoretical security. Section 6 evaluates performance in terms of computation and communication overhead, and Section 7 concludes the paper.

3. Preliminaries

3.1. Group description and hardness assumptions

A group generator $GGen(1^{\kappa}) \to (\mathbb{G}, q)$ takes a security parameter $\kappa$ as input and outputs a cyclic group $\mathbb{G}$ of prime order $q$. This scheme is based on the following hardness assumptions.

Definition 2.1 (Discrete Logarithm Problem (DLP) Assumption). Let $g$ be a generator of a group $\mathbb{G}$. Given a tuple $(g, g^a) \in \mathbb{G}^2$, where $a \in \mathbb{Z}_q^*$, the Discrete Logarithm Problem is to output $a$. The DLP assumption holds if, for every PPT adversary $\mathcal{A}$, the advantage is negligible:

$$Adv^{DLP}_{\mathcal{A}}(\kappa) = \Pr[\mathcal{A}(g, g^a) = a] \le negl(\kappa).$$

Definition 2.2 (Decisional Diffie–Hellman (DDH) Assumption). Let $\mathbb{G}$ be a group of large prime order $q$ and let $g$ be a generator of $\mathbb{G}$. The input is either a quadruple $T_0 = (g, g^x, g^y, g^{xy}) \in \mathbb{G}^4$ or a quadruple $T_1 = (g, g^x, g^y, g^z) \in \mathbb{G}^4$, where $x, y, z \leftarrow \mathbb{Z}_q^*$. It is computationally hard for an adversary $\mathcal{A}$ to distinguish between the two tuples; that is, the advantage of every PPT adversary $\mathcal{A}$ is negligible:

$$Adv^{DDH}_{\mathcal{A}}(\kappa) = |\Pr[\mathcal{A}(T_0) = 1] - \Pr[\mathcal{A}(T_1) = 1]| \le negl(\kappa).$$

Definition 2.3 (Computational Diffie–Hellman (CDH) Assumption). Let $\mathbb{G}$ be a cyclic group of order $q$ with generator $g$. Given the tuple $(g, g^a, g^b)$, where $a, b \leftarrow \mathbb{Z}_q^*$, computing $g^{ab}$ is hard. For every probabilistic polynomial-time (PPT) algorithm $\mathcal{A}$, the probability of successfully solving the CDH problem is negligible:

$$Adv^{CDH}_{\mathcal{A}}(\kappa) = \Pr[\mathcal{A}(g, g^a, g^b) = g^{ab}] \le negl(\kappa),$$

where $\kappa$ is a security parameter and $negl(\kappa)$ denotes a negligible function.

3.2. Zero-knowledge proof

A signature proof of knowledge (SPK) is a non-interactive zero-knowledge proof (ZKP) technique that enables a prover to demonstrate knowledge of a secret value without revealing it, while also signing a message. We work in a cyclic group $\mathbb{G}$ of prime order $q$ and employ the Fiat–Shamir heuristic [36] to convert an interactive proof into a non-interactive one. These non-interactive constructs are referred to as signature proofs of knowledge (SPK). All the signatures of knowledge are secure in the random oracle model. Following the notation introduced by Camenisch and Stadler [37], $PoK\{(x) : y = g^x\}$ denotes the zero-knowledge proof protocol between the prover and the verifier in which the prover knows $x \in \mathbb{Z}_q$ such that $y = g^x \in \mathbb{G}$. The corresponding non-interactive signature proof of knowledge on the message $m$ is written $SPK\{(x) : y = g^x\}(m)$. It can be regarded as a signature on the message $m$, signed with the discrete-logarithm key pair $(g^x, x)$.

3.3. Pedersen commitment

Reference [38] uses Poseidon to instantiate the Merkle-tree hash and the commitment. Our scheme instead instantiates them with Pedersen hashing and perfectly hiding commitments. The Pedersen commitment algorithms are as follows:

• $Gen(1^{\kappa}) \to ck$: Select a finite group $\mathbb{G}$ of large prime order $q$, and choose two generators $g$ and $h$ of $\mathbb{G}$. The parameters of this commitment scheme are $ck = (\mathbb{G}, q, g, h)$.
• $Commit(ck, u) \to c$: Generate a commitment $c$ for a secret value $u$. The committing party randomly selects a blinding factor $r$ and then calculates $c = g^u h^r$.
• $OpenCom(ck, c, u, r) \to 0/1$: The verifier checks whether $c$ is equal to $g^u h^r$.

3.4. Merkle tree

In the proposed scheme, the Merkle tree $T$ is used to represent the membership of the set. The root of the tree $T$ is denoted $T_{root}$. The Merkle tree has the following functions:

• $T.Insert(v) \to T$: Inserts the value $v$ into the next available leaf in $T$ and returns the modified tree.
• $T.Remove(v) \to T'$: Removes $v$ from the tree, if it exists, and returns the modified tree $T'$.
• $T.AuthPath(v) \to \theta$: Generates an authentication path $\theta$ that proves $v \in T$. The size of $\theta$ is proportional to the height of the tree, ensuring efficient verification in cryptographic protocols.
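As an illustration of the notation $SPK\{(x) : y = g^x\}(m)$ in Section 3.2, the following is a minimal sketch of a Schnorr-style proof made non-interactive with the Fiat–Shamir heuristic. It is not the paper's implementation: the toy group ($p = 2039$, $q = 1019$, $g = 4$), the SHA-256 challenge encoding, and the function names `fs_challenge`, `spk_sign`, and `spk_verify` are all assumptions chosen for readability, and the parameters are far too small to be secure.

```python
import hashlib
import secrets

# Toy Schnorr group (assumption: illustration only, far too small to be secure).
P = 2039   # safe prime, P = 2*Q + 1
Q = 1019   # prime order of the subgroup of squares modulo P
G = 4      # 2^2 mod P, a generator of that order-Q subgroup


def fs_challenge(y: int, t: int, message: bytes) -> int:
    """Fiat-Shamir challenge: hash the statement, the commitment and the message."""
    data = f"{P}|{Q}|{G}|{y}|{t}|".encode() + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def spk_sign(x: int, message: bytes) -> tuple[int, int]:
    """Produce SPK{(x): y = g^x}(m) as the pair (c, s)."""
    y = pow(G, x, P)
    k = secrets.randbelow(Q - 1) + 1      # fresh nonce in [1, Q-1]
    t = pow(G, k, P)                      # prover's commitment g^k
    c = fs_challenge(y, t, message)       # challenge derived by hashing
    s = (k + c * x) % Q                   # response
    return c, s


def spk_verify(y: int, message: bytes, proof: tuple[int, int]) -> bool:
    """Recompute t = g^s * y^(-c) and check that it hashes back to c."""
    c, s = proof
    t = (pow(G, s, P) * pow(y, (-c) % Q, P)) % P
    return c == fs_challenge(y, t, message)
```

A proof produced by `spk_sign(x, m)` verifies against `y = pow(G, x, P)` and the same message `m`; changing the message or the response `s` changes the hash input and makes verification fail.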
|
||||
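The three Pedersen commitment algorithms of Section 3.3 ($Gen$, $Commit$, $OpenCom$) can be sketched as follows. This is a toy instance under stated assumptions: the fixed parameters $p = 2039$, $q = 1019$, $g = 4$, $h = 9$ stand in for the output of $Gen$, and the function names are illustrative. In a real deployment the group is cryptographically large and nobody may know $\log_g h$, otherwise the binding property is lost.

```python
import secrets

# Toy commitment key ck = (G, q, g, h) (assumption: illustration only).
P = 2039   # safe prime, P = 2*Q + 1
Q = 1019   # prime order of the subgroup of squares modulo P
G = 4      # 2^2 mod P
H = 9      # 3^2 mod P; in practice log_G(H) must be unknown to everyone


def commit(u, r=None):
    """Commit(ck, u): pick a random blinding factor r and return (c, r), c = g^u * h^r."""
    if r is None:
        r = secrets.randbelow(Q)
    c = (pow(G, u % Q, P) * pow(H, r % Q, P)) % P
    return c, r


def open_com(c, u, r):
    """OpenCom(ck, c, u, r): accept iff c equals g^u * h^r."""
    return c == (pow(G, u % Q, P) * pow(H, r % Q, P)) % P
```

Because $c = g^u h^r$ is multiplicative in the group, the product of two commitments opens to the sum of the committed values with the sum of the blinding factors, a property that proofs over committed attributes typically rely on.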
|
||||
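The Merkle tree interface of Section 3.4 ($Insert$, $Remove$, $AuthPath$) can be sketched as below. This is a simplified illustration, not the paper's construction: it uses SHA-256 instead of a Pedersen or Poseidon hash, recomputes the tree on every query rather than updating it incrementally, and the class name, the all-zero empty-leaf marker, and the `0x00`/`0x01` domain-separation prefixes are assumptions.

```python
import hashlib

EMPTY = bytes(32)  # hypothetical marker for an unused leaf


def leaf_hash(value: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + value).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


class MerkleTree:
    def __init__(self, height: int):
        self.height = height
        self.leaves = [EMPTY] * (1 << height)
        self.count = 0  # index of the next available leaf

    def _levels(self):
        """Recompute every level bottom-up (simple, not incremental)."""
        levels = [self.leaves]
        while len(levels[-1]) > 1:
            cur = levels[-1]
            levels.append([node_hash(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
        return levels

    @property
    def root(self) -> bytes:
        return self._levels()[-1][0]

    def insert(self, value: bytes) -> None:
        if self.count >= len(self.leaves):
            raise ValueError("tree is full")
        self.leaves[self.count] = leaf_hash(value)
        self.count += 1

    def remove(self, value: bytes) -> None:
        leaf = leaf_hash(value)
        if leaf in self.leaves:
            self.leaves[self.leaves.index(leaf)] = EMPTY

    def auth_path(self, value: bytes):
        """Return one (sibling hash, node-is-right-child) pair per level."""
        index = self.leaves.index(leaf_hash(value))
        path = []
        for level in self._levels()[:-1]:
            path.append((level[index ^ 1], index % 2 == 1))
            index //= 2
        return path


def verify_path(root: bytes, value: bytes, path) -> bool:
    node = leaf_hash(value)
    for sibling, node_is_right in path:
        node = node_hash(sibling, node) if node_is_right else node_hash(node, sibling)
    return node == root
```

The authentication path has exactly `height` entries, which matches the statement that the size of $\theta$ is proportional to the height of the tree. After `remove`, the root changes, so a path issued earlier for the removed value no longer verifies.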
|
||||
|
||||
|
||||
Table 2
Summary of notations.

Symbol                                      Description
$\mathcal{U}, \mathcal{I}, \mathcal{V}$     User, Issuer, Verifier
$\lambda$                                   Security parameter
$h$                                         The maximum height of the Merkle tree
$m$                                         The maximum number of attributes
$n$                                         The number of access criteria the verifier is allowed to define
$\iota_{pub}, \iota_{zk}$                   Issue criteria for verifying the auxiliary information of an issue request
$iaux_{zk}, iaux_{pub}$                     Auxiliary information when requesting registration
$\phi_i$                                    The $i$th access criterion defined by the verifier
$aux_i$                                     Auxiliary information for the show proof
$Attrs = \{attr_i\}_{i=1}^{m}$              The attribute set of the user, where $attr_i$ is the $i$th attribute
$w$                                         Witness collection
$ctx$                                       Context information
$I, V$                                      Collections of issuance criteria and access criteria
$\Pi_{U_1}, \Pi_{V_1}, \tilde{\Pi}$         Zero-knowledge proofs generated by the user and issuer
$s'' \leftarrow \mathbb{Z}_q^*$             A secret random number randomly selected by the issuer
$\theta$                                    The authentication path generated by the Merkle tree
$T_{root}, T_{\kappa}, T_{\kappa}'$         Merkle tree root, Merkle tree, updated Merkle tree

Note*: Each of $\iota$ and $\phi$ is a predicate with range $\{0, 1\}$ over the user’s attributes that must be satisfied in order to pass verification, i.e., verification only passes if $\iota_{pub}(iaux_{pub}) = 1$ and $\phi(Attrs, aux) = 1$.
|
||||
|
||||
|
||||
3.5. Pseudo-Random Function (PRF)

A Pseudo-Random Function (PRF) is a family of computational functions $\{F_k\}$, where $k$ is a key and $F_k$ is a function from the input space to the output space. For an ideal PRF, when the key $k$ is unknown, its output is computationally indistinguishable from that of a true random function. We construct a PRF with an efficient correctness proof. We adopt the specific PRF construction proposed by Dodis and Yampolskiy [39] (DY-PRF). The DY-PRF is defined by the tuple $(\mathbb{G}, q, g, s)$, where $\mathbb{G} = \langle g \rangle$ is a cyclic group of prime order $q$ and $s \in \mathbb{Z}_q$. For an input $k$, $PRF_{g,s}(k)$ is defined as $PRF_{g,s}(k): k \mapsto g^{1/(s+k+1)}$. There exists an efficient proof of correct formation for the output, and as long as the $q$-DDHI assumption holds, the output $PRF_{g,s}(k)$ is indistinguishable from a random element in $\mathbb{G}_q$.

4. Proposed scheme

In this section, Table 2 lists all the symbols involved and their meanings. We then define the syntax and design the scheme.

4.1. Syntax and security model

4.1.1. Security definition
The security of the system is defined by the standard properties of anonymous credentials, including unforgeability, anonymity, unlinkability, and attribute privacy. In our model, the attacker is assumed to have only polynomial-time computational capability, and all communications occur over open channels.
Threat Model. Our model considers the following adversaries: external attackers intercepting or modifying communications without breaking hard cryptographic problems; internal attackers misusing valid credentials for forgery, transfer, or link attacks; semi-honest verifiers inferring user identities or attributes while following the protocol; and trusted-but-curious issuers complying with the protocol but attempting to snoop on user data.

4.1.2. Syntax definition
Referring to the ideal function in [38], the zk-credit anonymous credential approach is realized using Groth16 [40], which is not suitable for authentication. In this work, it is instantiated using signatures of knowledge, resulting in an algorithm that meets the authentication requirements. The specific algorithm is as follows:

• $Setup(1^\lambda, 1^h, 1^m) \to pp$: The algorithm inputs the security parameter $\lambda$, the maximum height $h$ of the Merkle tree, and the maximum number $m$ of attributes in a credential. It generates the system parameters $pp$.
• $IssueSetup_I(pp) \to (I, \iota_{pub})$: The algorithm inputs the public parameter $pp$ and outputs the issue criteria set $I$ and the issue criteria $\iota_{pub}$ for verifying public auxiliary information.
• $ShowSetup_V(pp) \to V$: The verifier sets up $n$ access criteria to define the user's access policy. This algorithm outputs a collection of access criteria $V = \{\phi_1, \phi_2, \ldots, \phi_n\}$, where each $\phi_i$ represents an access criterion.
• $IssueReq_U(pp, I, Attrs, w, ctx, (iaux_{zk}, iaux_{pub})) \to ((Cm, \Pi_U^1, iaux_{zk}), iaux_{pub})$: The issue request algorithm inputs the public parameters $pp$, the issue criteria $I$, the set of attributes $Attrs$ of the user, the secret value $w$, the context $ctx$, and the auxiliary information $(iaux_{zk}, iaux_{pub})$. The user generates the $\Pi_U^1$ associated with $iaux_{zk}$ and outputs $((\Pi_U^1, iaux_{zk}), iaux_{pub})$.
• $IssueGrant_I(pp, (I, \iota_{pub}), (\Pi_U^1, iaux_{zk}), iaux_{pub}) \to (s'', (\theta, T_{root}), k, T_\kappa)$: The algorithm inputs the zero-knowledge signature $\Pi_U^1$ and the auxiliary information $(iaux_{zk}, iaux_{pub})$. It then returns the random value $s''$, the authentication path $\theta$ and the number of times $k$ to the user, and locally generates the Merkle tree $T_\kappa$.
• $ShowCred_U(pp, V, T_{root}, cred, \theta, \{w_i, aux_i\}_{i=1}^{n}) \to (\tilde{\Pi}, \{aux_i\}_{i=1}^{n})$: The user inputs the root $T_{root}$ of the affiliated tree, the credential $cred$, and the authentication path $\theta$. The user shows that the sent credential satisfies the access criterion $\phi_i$ and proves that the displayed credential belongs to the tree $T_\kappa$. Then, the algorithm outputs $(\tilde{\Pi}, \{aux_i\}_{i=1}^{n})$.
• $VerifyShow_V(pp, V, (cred, T_{root}), (\tilde{\Pi}, \{aux_i\}_{i=1}^{n})) \to 0/1$: The verifier checks that the credential $cred$ displayed by the user meets the access criteria and that $cred$ belongs to the Merkle tree $T_\kappa$, outputting 0/1.
• $RevokeCred_I(pp, T_\kappa, cred) \to T_\kappa'$: The issuer revokes the $cred$ registered by dishonest users and updates the Merkle tree $T_\kappa$ to $T_\kappa'$.

4.1.3. Security requirements
The scheme is required to satisfy the following security requirements:
Unforgeability: Attackers cannot forge valid credentials and deceive validators into accepting them. This game is reduced to the discrete logarithm or CDH problems.
Anonymity: Credentials are displayed without revealing the user's identity. This game is reduced to the DDH problem.
Unlinkability: Different displays of the same certificate cannot be linked, even if the Merkle path remains identical across multiple authentications.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 1. System Model.
|
||||
|
||||
|
||||
Attribute Privacy: Hides attributes when displaying credentials unless the access policy requires them to be displayed.
Security is analyzed using a formal game-based model [41] under the random oracle assumption [42]. The games are defined as follows:

Game 1: Unforgeability Game
Setup. The challenger $\mathcal{C}_1$ runs the system initialization algorithm $Setup(1^\lambda, 1^h, 1^m)$ to generate $pp$ and sends $pp$ to adversary $\mathcal{A}_1$. $\mathcal{C}_1$ saves the issuer private key $isk$.
Query. In this phase, the adversary $\mathcal{A}_1$ can make the following three kinds of queries:

1. $\mathcal{H}$-Query: $\mathcal{A}_1$ queries the random oracles $\mathcal{H}_1, \mathcal{H}_2, \mathcal{H}_3$; $\mathcal{C}_1$ responds randomly and records the responses.
2. $Query_2$: $\mathcal{A}_1$ queries the issuer to register a certificate; $\mathcal{C}_1$ uses the simulator to simulate the interaction between $IssueReq$ and $IssueGrant$, using the programmability of the random oracle to generate a valid $SPK_2$.
3. $Query_3$: $\mathcal{A}_1$ queries certificate display; the interaction between $ShowCred$ and $VerifyShow$ is simulated, and $SPK_3$ is simulated using a zero-knowledge simulator.

Forgery. $\mathcal{A}_1$ outputs a forged certificate $cred^*$ and the corresponding Merkle tree path $\theta^*$, such that $cred^*$ is not on the list of previously issued credentials and $VerifyShow$ accepts $cred^*$ and $\theta^*$. $\mathcal{A}_1$ wins if it outputs such a valid forged credential.

Game 2: Anonymity and Unlinkability Game
Setup. The challenger $\mathcal{C}_2$ runs the system initialization algorithm $Setup(1^\lambda, 1^h, 1^m)$ to generate $pp$ and sends $pp$ to adversary $\mathcal{A}_2$. $\mathcal{C}_2$ saves the issuer private key $isk$.
Query. Adversary $\mathcal{A}_2$ can continue to query issuance and presentation, but cannot query revocation or presentation of challenge credentials.
Challenge. The adversary $\mathcal{A}_2$ selects the identity and attribute sets of two users, $(I_0, Attrs_0^*)$ and $(I_1, Attrs_1^*)$, which satisfy the same access policy, and sends them to the challenger $\mathcal{C}_2$. $\mathcal{C}_2$ randomly selects $b \leftarrow \{0, 1\}$, generates a credential for $I_b$ and displays it (i.e., runs $ShowCred$ to generate $\Pi_b$), and then gives $\Pi_b$ to $\mathcal{A}_2$.
Guess. $\mathcal{A}_2$ outputs $b'$ and wins if $b' = b$.

4.2. Scheme construction

In this scheme, the user is untrusted, the issuer is semi-trusted, the channel between the verifier and the issuer is trusted, and the rest of the channels are untrusted. Attackers can steal information from untrusted channels, forge information and impersonate users. Therefore, this paper uses zero-knowledge proofs so that the user can verify the certificate sent by the issuer and prove to the verifier that the certificate is the user's own, while reducing the risk of privacy leakage. The system model is shown in Fig. 1.

• Issuer: The issuer issues the certificate and is usually an authority or trusted entity (such as a government, enterprise, or decentralized organization), which is responsible for verifying the identity or attributes of the user and generating the encrypted credential. Before sending the certificate, the issuing criteria are verified.
• User: The user is the holder of the credential, requests the credential from the issuer and, upon receipt, verifies the credential.
• Verifier: The verifier is the receiver of credentials, who receives the user's credentials, downloads the criteria and auxiliary verification data through a secure channel, verifies the access criteria, and then verifies the user's identity.

4.2.1. System initialization
$Setup(1^\lambda, 1^h, 1^m) \to pp$:
− Select a cyclic group $\mathbb{G}$ of order $q$, and generate generators $(g_0, g_1, g_2, \gamma, h_0, h_1, h_2, \tilde{u}, \{u_i\}_{i \in [0,n]}) \in \mathbb{G}$, along with hash functions $H_1: \{0,1\}^* \to \mathbb{Z}_q^*$ and $H_2: \{0,1\}^* \times \{0,1\}^* \to \mathbb{Z}_q^*$;
− Define a Merkle tree of height $h$, where for public input $(T_{root}, cred)$, it can prove $cred \in T_\kappa$ through an authentication path $\theta$;
− Define the global period $epoch$ and the pseudorandom function $PRF_{g,s}(k): k \mapsto g^{1/(s+k+1)}$;
− Select random numbers $y_1, y_2 \leftarrow \mathbb{Z}_q^*$, compute $Y_1 = h_1^{y_1}$, $Y_2 = h_2^{y_2}$, and set the issuer secret key $isk = (y_1, y_2)$ and issuer public key $ipk = (Y_1, Y_2)$;
− Set the public parameters $pp := (q, \mathbb{G}, g_0, g_1, g_2, \gamma, h_0, h_1, h_2, \tilde{u}, \{u_i\}_{i \in [0,n]}, H_1, H_2, T_\kappa, T_{root}, epoch, ipk)$.
$IssueSetup_I(pp) \to (I, \iota_{pub})$:
− Define the relevant issuance criteria $\iota = (\iota_{zk}, \iota_{pub})$, and set $IssueCriteria[I] := IssueCriteria[I] \cup \iota$;
− For the public input auxiliary information $iaux_{zk}$, prove: $\iota_{zk}(Attrs, iaux_{zk}) = 1$;
− Publish $(I, \iota_{pub})$.
$ShowSetup_V(pp) \to V$:
− Define access criteria $\phi$ for user attributes $Attrs$ (multiple access criteria $\phi_i$ can be defined), and set $AccessCriteria[V] := AccessCriteria[V] \cup \{\phi_i\}$;
− For public input $(T_{root}, cred, aux)$, prove: $\phi(Attrs, aux) = 1 \wedge cred$;
− Publish the access criteria set $V$.
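The pseudorandom function fixed during $Setup$, $PRF_{g,s}(k) = g^{1/(s+k+1)}$, can be illustrated with a small sketch. The parameters below are assumptions chosen only so that the arithmetic can be followed by hand: $p = 23$, and $g = 2$ generates the subgroup of prime order $q = 11$ in $\mathbb{Z}_{23}^*$. The function names `mod_pow`, `inv_mod_q` and `dy_prf` are hypothetical, and a real instantiation would use an elliptic-curve group of roughly 256-bit order.

```rust
/// Toy parameters (assumed for illustration only): g = 2 has prime
/// order q = 11 in the multiplicative group modulo p = 23.
const P: u64 = 23;
const Q: u64 = 11;
const G: u64 = 2;

/// Square-and-multiply modular exponentiation.
fn mod_pow(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1 % modulus;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

/// Inverse modulo the prime q via Fermat's little theorem; None for 0.
fn inv_mod_q(x: u64) -> Option<u64> {
    let x = x % Q;
    if x == 0 { None } else { Some(mod_pow(x, Q - 2, Q)) }
}

/// PRF_{g,s}(k) = g^{1/(s+k+1)}; undefined when s + k + 1 = 0 (mod q).
pub fn dy_prf(s: u64, k: u64) -> Option<u64> {
    let e = inv_mod_q((s % Q + k % Q + 1) % Q)?;
    Some(mod_pow(G, e, P))
}
```

For example, with $s = 3$ and $k = 4$ the exponent is the inverse of 8 modulo 11, which is 7, so the output is $2^7 \bmod 23 = 13$; raising 13 to the power 8 gives back $g = 2$, which is the relation a proof of correct formation would establish without revealing $s$.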
|
||||
|
||||
|
||||
|
||||
|
||||
4.2.2. Credential registration
$IssueReq_U(pp, I, Attrs, w, ctx, (iaux_{zk}, iaux_{pub})) \to ((Cm, \Pi_U^1, iaux_{zk}), iaux_{pub})$:
− Generate the anonymous key $nk$ and the rate-limiting key $rk$ using the pseudorandom function $PRF$ and context $ctx$: calculate $nk := PRF(ctx)$, $rk := PRF(epoch \,\|\, ctx)$, and define $m$ attributes $Attrs = \{attr_1, attr_2, \ldots, attr_m\}$;
− Select a random blind factor $r \leftarrow \mathbb{Z}_q^*$ and compute the Pedersen commitment $Cm$, where $Cm \in \mathbb{G}$:

$$Cm = Commit(nk, rk, Attrs; r) = g_1^{nk} g_2^{rk} \prod_{i=1}^{m} u_i^{H_1(attr_i)} \cdot h_0^{r};$$

− Set $w := (r, nk, rk, Attrs)$ (collect the private witness $w$), select $x_u, s', t \leftarrow \mathbb{Z}_q^*$ and generate $\Pi_U^1$:

$$\Pi_U^1 = SPK_1\left\{ (x_u, s', t, r, nk, rk, Attrs) : X_u = g_1^{x_u} g_2^{s'} \;\wedge\; \zeta = Y_1^{x_u} Y_2^{s'} \cdot Cm^{t} \;\wedge\; \iota_{zk}(Attrs, iaux_{zk}) = 1 \right\}(X_u, \zeta, iaux_{zk}, iaux_{pub});$$

− Send $(\Pi_U^1, X_u, \zeta, iaux_{zk}, iaux_{pub})$ to the issuer;
− Receive $\Pi_V^1$. If verification passes, receive the returned authentication path $\theta$, $s''$ and $k$;
− Locally store $(nk, rk, r, Attrs, \theta, s, t, epoch, k)$, where $s = s' + s''$ and $k$ is the maximum number of accesses allowed within the period $epoch$.

$IssueGrant_I(pp, (I, \iota_{pub}), (\Pi_U^1, iaux_{zk}), iaux_{pub}) \to (cred, s'', (\theta, T_{root}), k, T_\kappa)$:
− Verify $\iota_{pub}(iaux_{pub})$, where $\iota_{pub}$ checks the public auxiliary information $iaux_{pub}$;
− Verify $\Pi_U^1 := SPK_1$, where $\Pi_U^1$ proves the correctness of $(\zeta, X_u, iaux_{zk}, iaux_{pub})$ and that the hidden attributes satisfy the issuance criteria $\iota_{zk}$. If verification fails, reject issuance and abort with $\perp$;
− Otherwise, randomly select $s'' \leftarrow \mathbb{Z}_q^*$, define the maximum number of accesses $k$ allowed to the user within $epoch$, calculate $cred := (\zeta \cdot Y_2^{s''}) \cdot u_0^{H_2(epoch \| k)}$, and run $T_\kappa = T.Insert(cred)$ to register the anonymous credential. The registered $cred$ is known only to the issuer. Then, run $\theta = T_\kappa.AuthPath(cred)$ to generate the authentication path, update the Merkle tree root $T_{root}$, and upload it to a public panel such as a blockchain;
− Next, select $z_0, z_1 \leftarrow \mathbb{Z}_q^*$ and generate $\Pi_V^1$:

$$\Pi_V^1 = SPK_2\left\{ (z_0, z_1, y_1, y_2) : Y_u = h_1^{y_1} h_2^{y_2} \;\wedge\; Z = (\zeta \cdot Y_2^{s''})^{z_1} \cdot u_0^{H_2(epoch \| k) \cdot z_0} \right\}(Y_u, s'', k, Z);$$

− Store the Merkle tree $T_\kappa$ and send $(\Pi_V^1, s'', k, \theta)$ to the user.

4.2.3. Show and verification certificate
$ShowCred_U(pp, V, T_{root}, cred, \theta, \{w_i, aux_i\}_{i=1}^{n}) \to (\tilde{\Pi}, \{aux_i\}_{i=1}^{n})$:
− The user sends an access request message $msg$, and the verifier returns a random number $R = H_2(nonce \,\|\, msg)$;
− The user locally retrieves the verifier's access criteria $V$ and the root node $T_{root}$ of the tree containing $cred$;
− Upon receiving $(nonce, R)$, verify $R \stackrel{?}{=} H_2(nonce \,\|\, msg)$, then randomly select $\alpha_0 \leftarrow \mathbb{Z}_q^*$. For $n$ access criteria $\Phi' = \{\phi_1, \phi_2, \ldots, \phi_n\}$, partition the attribute set into public attributes $ATTR$ and secret attributes $\{attr_j \notin ATTR\}$. Compute the commitment using the blind factor $r$:

$$Cm = Commit(nk, rk, \{attr_j \notin ATTR\}; r) = \left( g_1^{nk} g_2^{rk} \cdot \prod_{attr_j \notin ATTR} u_i^{H_1(attr_j)} \cdot h_0^{r} \right) \cdot \prod_{attr_i \notin ATTR} u_i^{H_1(attr_i)};$$

− Next, the number of certificate displays is initialized to $n_j = 1$, and $n_j = n_j + 1$ $(0 \le n_j < k)$ is set for each generation of the zero-knowledge proof $\tilde{\Pi} = SPK_3$. The generation of $\tilde{\Pi} = SPK_3$ is as follows:

$$\tilde{\Pi} = SPK_3\left\{ \begin{array}{l} (nk, rk, Attrs, \alpha_0, x_u, s, t, n_j, attr_j \notin ATTR) : \\ X_0 = g_0^{\alpha_0} \gamma^{H_1(\theta)} \\ \wedge\; \zeta' = Y_1^{x_u} Y_2^{s} \cdot Cm^{t} \\ \wedge\; \eta = PRF_{rk,\tilde{u}}(n_j) = \tilde{u}^{\frac{1}{rk + n_j + 1}} \\ \wedge\; \Gamma = u_0^{x_u}\, PRF_{nk,\tilde{u}}(n_j)^{R} = u_0^{x_u} \cdot \tilde{u}^{\frac{R}{nk + n_j + 1}} \\ \wedge\; 0 \le n_j < k \\ \wedge\; \phi_1(Attrs, aux_1) = 1 \\ \wedge\; \vdots \\ \wedge\; \phi_i(Attrs, aux_i) = 1 \end{array} \right\}(aux_i, X_0, \zeta', \eta, \Gamma, T_{root});$$

− Send $(\tilde{\Pi}, \{aux_i\}_{i=1}^{n}, X_0, \zeta', \eta, \Gamma, (\theta, T_{root}), \Phi', attr_i \in ATTR)$ to the verifier.

$VerifyShow_V(pp, V, (cred, T_{root}), (\tilde{\Pi}, \{aux_i\}_{i=1}^{n})) \to 0/1$:
− The verifier checks whether the user's submitted $\Phi'$ matches its defined access criteria set $\Phi$. Using $\theta$, verify and calculate $cred \stackrel{?}{=} \zeta' \cdot u_0^{H_2(epoch \| k)}$. If $(\eta, \Gamma)$ is valid, it proves that $n_j$ is within the range allowed to be displayed within $epoch$;
− If verification succeeds, accept the request; otherwise reject it and invoke the $RevokeCred$ function to revoke $cred$. For the specific process, please refer to Fig. 2.

4.2.4. Credential revocation
$RevokeCred(pp, T_\kappa, cred) \to T_\kappa'$:
− Search for $cred \in T_\kappa$; if $cred$ is not found, terminate the process;
− Otherwise run $T_\kappa' := T_\kappa.Remove(cred)$, and store and update the Merkle tree $T_\kappa'$;
− Return $T_\kappa'$ and publicly notify that $cred$ has been revoked.

5. Analysis of correctness and security

5.1. Correctness analysis

5.1.1. Details of $SPK_1$
$SPK_1$ can be implemented using standard discrete logarithm proof techniques.

1. (Commitment.) The user randomly selects $s_1, s_2, s_3 \in_R \mathbb{Z}_q^*$ and computes:
$$T_1 = g_1^{s_1} g_2^{s_2}, \quad T_2 = Y_1^{s_1} Y_2^{s_2} \cdot Cm^{s_3} = (h_1^{y_1})^{s_1} (h_2^{y_2})^{s_2} \cdot Cm^{s_3}.$$
2. (Challenge.) The scheme uses a non-interactive zero-knowledge proof, where the user generates the challenge $c$:
$$c = H(T_1 \,\|\, T_2 \,\|\, X_u \,\|\, \zeta \,\|\, iaux_{zk} \,\|\, iaux_{pub}).$$
3. (Proof.) The user generates the proof $\Pi_U^1$ that satisfies the issuer policy $\iota_{zk}$, $\iota_{zk}(Attrs, iaux_{zk}) = 1$, and computes $S_1 = s_1 - c \cdot x_u$, $S_2 = s_2 - c \cdot s'$, $S_3 = s_3 - c \cdot t$. The proof is $\Pi_U^1 = (c, S_1, S_2, S_3)$, and the user sends $((\Pi_U^1, iaux_{zk}), iaux_{pub})$ to the issuer.
4. (Verify.) The issuer computes $T_1' = X_u^{c} g_1^{S_1} g_2^{S_2}$, $T_2' = \zeta^{c} Y_1^{S_1} Y_2^{S_2} \cdot Cm^{S_3}$, and verifies: $c \stackrel{?}{=} H(T_1' \,\|\, T_2' \,\|\, X_u \,\|\, \zeta \,\|\, iaux_{zk} \,\|\, iaux_{pub})$. If verification passes, then $\Pi_U^1$ is correct; otherwise abort.

5.1.2. Details of $SPK_2$
$SPK_2$ can also be implemented using standard discrete logarithm proof techniques.

1. (Commitment.) The issuer/trust authority randomly selects $t_1, t_2, t_3, t_4 \in_R \mathbb{Z}_q^*$ and computes:
$$C_1 = h_1^{t_1} h_2^{t_2}, \quad C_2 = (\zeta \cdot Y_2^{s''})^{t_3} \cdot u_0^{H_2(epoch \| k) \cdot t_4}.$$
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 2. System Flowchart.
|
||||
|
||||
|
||||
2. (Challenge.) The scheme uses a non-interactive zero-knowledge proof, where the issuer generates the challenge $c$:
$$c = H(C_1 \,\|\, C_2 \,\|\, Y_u \,\|\, Z \,\|\, s'' \,\|\, k).$$
3. (Proof.) The issuer generates the proof $\Pi_V^1$ by computing $C_1' = t_1 - c \cdot y_1$, $C_2' = t_2 - c \cdot y_2$, $C_3' = t_3 - c \cdot z_1$, $C_4' = t_4 - c \cdot z_0$. The proof is $\Pi_V^1 = (c, C_1', C_2', C_3', C_4')$, and the issuer sends $(\Pi_V^1, s'', k)$ to the user.
4. (Verify.) The user computes $\mathrm{C}_1 = Y_u^{c} h_1^{C_1'} h_2^{C_2'}$, $\mathrm{C}_2 = Z^{c} (\zeta \cdot Y_2^{s''})^{C_3'} \cdot u_0^{H_2(epoch \| k) \cdot C_4'}$, and verifies: $c \stackrel{?}{=} H(\mathrm{C}_1 \,\|\, \mathrm{C}_2 \,\|\, Y_u \,\|\, Z \,\|\, s'' \,\|\, k)$. If verification passes, then $\Pi_V^1$ is correct; otherwise abort.

5.1.3. Details of $SPK_3$
The construction of $SPK_3$ includes a zero-knowledge proof and a range proof. We divide $SPK_3$ into two parts, $SPK_{3A}$ and $SPK_{3B}$. The specific details are as follows:

$$SPK_{3A}\left\{ \begin{array}{l} (nk, rk, \alpha_0, x_u, s, t, n_j, \rho_1) : \\ X_0 = g_0^{\alpha_0} \gamma^{H_1(\theta)} \\ \wedge\; \zeta' = Y_1^{x_u} Y_2^{s} \cdot Cm^{t} \\ \wedge\; \_\_ = g_1^{n_j} g_2^{\rho_1} \\ \wedge\; \dfrac{\tilde{u}}{\eta} = \eta^{rk} \eta^{n_j} \\ \wedge\; \dfrac{\tilde{u}^{R} \cdot u_0}{\Gamma} = u_0^{-nk} u_0^{-n_j} u_0^{-x_u} \Gamma^{nk} \Gamma^{n_j} \end{array} \right\}(aux_i, X_0, \zeta', \eta, \Gamma, T_{root}),$$

$$SPK_{3B}\left\{ (n_j, \rho_1) : \_\_ = g_1^{n_j} g_2^{\rho_1} \;\wedge\; 0 \le n_j < k \right\}(m).$$

$SPK_{3B}$ is instantiated as a simple range proof, which will be discussed later. Next, we demonstrate how to implement $SPK_{3A}$.

1. (Commitment.) The user randomly selects $\varrho_1, \varrho_2, \mathrm{t}_3, \mathrm{t}_4, \mathrm{t}_5, \mathrm{t}_6, \mathrm{n}_7, \mathrm{n}_8 \in_R \mathbb{Z}_q^{n}$ and computes:
$$A_1 = g_0^{\mathrm{t}_3} \gamma^{H_1(\theta)}, \quad A_2 = Y_1^{\mathrm{t}_4} Y_2^{\mathrm{t}_5} Cm^{\mathrm{t}_6}, \quad A_3 = g_1^{\mathrm{n}_7} g_2^{\mathrm{n}_8},$$
$$A_4 = \eta^{\varrho_2} \eta^{\mathrm{n}_7}, \quad A_5 = u_0^{-\varrho_1} u_0^{-\mathrm{n}_7} u_0^{-\mathrm{t}_4} \Gamma^{\varrho_1} \Gamma^{\mathrm{n}_7}.$$
2. (Challenge.) Using a non-interactive zero-knowledge proof, the user generates the challenge $c$:
$$c = H(A_1 \,\|\, A_2 \,\|\, A_3 \,\|\, A_4 \,\|\, A_5 \,\|\, X_0 \,\|\, \zeta' \,\|\, \eta \,\|\, \Gamma \,\|\, T_{root} \,\|\, aux_i).$$
3. (Proof.) The user generates the proof $\tilde{\Pi}$ by computing:
$$A_1' = \mathrm{t}_3 - c \cdot \alpha_0, \quad A_2' = \mathrm{t}_4 - c \cdot x_u, \quad A_3' = \mathrm{t}_5 - c \cdot s, \quad A_4' = \mathrm{t}_6 - c \cdot t,$$
$$A_5' = \mathrm{n}_7 - c \cdot n_j, \quad A_6' = \mathrm{n}_8 - c \cdot \rho_1, \quad A_7' = \varrho_2 - c \cdot rk, \quad A_8' = \varrho_1 - c \cdot nk.$$
The proof is $\tilde{\Pi} = (c, A_1', A_2', A_3', A_4', A_5', A_6', A_7', A_8')$, and the user sends $(\tilde{\Pi}, aux_i, X_0, \zeta', \eta, \Gamma, T_{root})$ to the verifier.
4. (Verify.) The verifier computes:
$$\mathrm{A}_1 = X_0^{c} g_0^{A_1'} \gamma^{H_1(\theta)}, \quad \mathrm{A}_2 = \zeta'^{\,c} Y_1^{A_2'} Y_2^{A_3'} Cm^{A_4'},$$
$$\mathrm{A}_3 = (\_\_)^{c} g_1^{A_5'} g_2^{A_6'}, \quad \mathrm{A}_4 = \left(\frac{\tilde{u}}{\eta}\right)^{c} \eta^{A_7'} \eta^{A_5'},$$
$$\mathrm{A}_5 = \left[\frac{\tilde{u}^{R} \cdot u_0}{\Gamma}\right]^{c} u_0^{-A_8'} u_0^{-A_5'} u_0^{-A_2'} \Gamma^{A_8'} \Gamma^{A_5'},$$
and verifies: $c \stackrel{?}{=} H(\mathrm{A}_1 \,\|\, \mathrm{A}_2 \,\|\, \mathrm{A}_3 \,\|\, \mathrm{A}_4 \,\|\, \mathrm{A}_5 \,\|\, X_0 \,\|\, \zeta' \,\|\, \eta \,\|\, \Gamma \,\|\, T_{root} \,\|\, aux_i)$.

In groups of unknown order, the range proofs currently widely recognized by academia and industry are based on the square decomposition assumption [43] and $n$-ary decomposition [40], which can achieve secure and efficient range proofs. However, we note that the range proofs required in authentication protocols always take the form $0 \le n < k$. If we set $k = 2^{\kappa}$, we can easily construct a simple range proof with complexity $O(\kappa)$, as shown in Eq. (1):

$$POK_{RANGE}\left\{ (n, r) : C_n = g_0^{n} g_1^{r} \;\wedge\; 0 \le n < 2^{\kappa} \right\}. \quad (1)$$
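The reason a bound of the form $0 \le n < 2^{\kappa}$ costs only $O(\kappa)$ is that $n$ can be written with $\kappa$ bits, and each bit needs one small constraint. The sketch below checks those constraints in the clear; it is not a zero-knowledge proof and only illustrates the statement that a bit-decomposition range proof establishes. The modulus `Q = 11` and the function names are assumptions made for illustration.

```rust
/// Toy prime modulus standing in for the group order q (assumption).
const Q: u64 = 11;

/// Little-endian bits of n, or None if n does not fit in kappa bits.
pub fn decompose(n: u64, kappa: u32) -> Option<Vec<u64>> {
    if kappa >= 64 || n >> kappa != 0 {
        return None;
    }
    Some((0..kappa).map(|i| (n >> i) & 1).collect())
}

/// Companion vector with entries b_i - 1 (mod Q).
pub fn right_vector(a_left: &[u64]) -> Vec<u64> {
    a_left.iter().map(|&b| (b + Q - 1) % Q).collect()
}

/// Checks the three relations a bit-decomposition proof relies on:
/// a_L[i] * a_R[i] = 0, a_L[i] - a_R[i] = 1, and sum a_L[i] * 2^i = n.
pub fn constraints_hold(n: u64, a_left: &[u64], a_right: &[u64]) -> bool {
    let bits_ok = a_left
        .iter()
        .zip(a_right)
        .all(|(&l, &r)| l * r % Q == 0 && (l + Q - r) % Q == 1);
    let sum = a_left
        .iter()
        .enumerate()
        .fold(0u64, |acc, (i, &b)| acc + (b << i));
    bits_ok && sum == n
}
```

With $\kappa = 4$ and $n = 5$ the bit vector is $(1, 0, 1, 0)$ and the companion vector is $(0, -1, 0, -1)$ modulo $q$; any entry that is not a bit violates the product constraint, and any $n \ge 2^{\kappa}$ has no valid decomposition.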
|
||||
|
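The commitment, challenge, proof and verify steps used for the signatures of knowledge in Section 5.1 follow the usual Fiat–Shamir pattern. The sketch below applies that pattern to a single relation of the form $X_u = g_1^{x_u} g_2^{s'}$ in a toy group. Everything concrete here is an assumption for illustration: the group is the order-11 subgroup of $\mathbb{Z}_{23}^*$ with generators 2 and 3, `DefaultHasher` stands in for the hash $H$, and the nonces are passed in by the caller because the standard library has no random number generator. With an 11-element challenge space the soundness error is about 1/11, so this is only a demonstration of the algebra.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy group (assumption): 2 and 3 both have prime order 11 modulo 23.
const P: u64 = 23;
const Q: u64 = 11;
const G1: u64 = 2;
const G2: u64 = 3;

fn mod_pow(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1 % modulus;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

/// Fiat-Shamir challenge c = H(T1 || X) reduced modulo q.
/// DefaultHasher is a non-cryptographic stand-in for H.
fn challenge(t1: u64, big_x: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (t1, big_x).hash(&mut h);
    h.finish() % Q
}

pub struct Proof {
    pub c: u64,
    pub s1: u64,
    pub s2: u64,
}

/// Prove knowledge of (x, s) with X = g1^x * g2^s.
/// r1 and r2 are the commitment nonces and must be fresh and random.
pub fn prove(x: u64, s: u64, r1: u64, r2: u64) -> (u64, Proof) {
    let big_x = mod_pow(G1, x, P) * mod_pow(G2, s, P) % P;
    let t1 = mod_pow(G1, r1, P) * mod_pow(G2, r2, P) % P; // commitment
    let c = challenge(t1, big_x);
    let s1 = (r1 % Q + Q - c * (x % Q) % Q) % Q; // S1 = r1 - c*x
    let s2 = (r2 % Q + Q - c * (s % Q) % Q) % Q; // S2 = r2 - c*s
    (big_x, Proof { c, s1, s2 })
}

/// Recompute T1' = X^c * g1^S1 * g2^S2 and compare the challenge.
pub fn verify(big_x: u64, proof: &Proof) -> bool {
    let t1 = mod_pow(big_x, proof.c, P) * mod_pow(G1, proof.s1, P) % P
        * mod_pow(G2, proof.s2, P) % P;
    challenge(t1, big_x) == proof.c
}
```

Verification works because $X^{c} g_1^{r_1 - c x} g_2^{r_2 - c s} = g_1^{r_1} g_2^{r_2}$, so the verifier recomputes exactly the commitment that was hashed by the prover.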
||||
|
||||
|
||||
|
||||
In this scheme, we use a Bulletproofs-based instantiation of $SPK_{3B}$. Here we briefly describe the proof process; for the details, please refer to Refs. [29,43].

1. (Prove.) First, perform binary decomposition on $n$, $n = \sum_{i=0}^{k-1} b_i 2^{i}$, where $b_i \in \{0, 1\}$. Construct the vectors $\mathbf{a}_L = (b_0, b_1, \ldots, b_{k-1})$ and $\mathbf{a}_R = \mathbf{a}_L - \mathbf{1}^{k}$ $(a_{R,i} = b_i - 1)$. Next, choose blind factors $\alpha, \rho \leftarrow \mathbb{Z}_q$, $\mathbf{s}_L, \mathbf{s}_R \leftarrow \mathbb{Z}_q^{k}$, and compute the initialization commitments $A = h^{\alpha} \mathbf{g}^{\mathbf{a}_L} \mathbf{h}^{\mathbf{a}_R}$, $S = h^{\rho} \mathbf{g}^{\mathbf{s}_L} \mathbf{h}^{\mathbf{s}_R}$. Then, based on Fiat–Shamir, construct the non-interactive proof challenges $y = H(A, S, C_n)$, $z = H(y, A, S)$ and the polynomials $\mathbf{l}(x) = \mathbf{a}_L - z\mathbf{1}^{k} + \mathbf{s}_L x$, $\mathbf{r}(x) = \mathbf{y}^{k} \circ (\mathbf{a}_R + z\mathbf{1}^{k} + \mathbf{s}_R x)$, and calculate the inner product $t = \langle \mathbf{l}(x), \mathbf{r}(x) \rangle$, $\tau_x \leftarrow \mathbb{Z}_p$, $T = g^{t} h^{\tau_x}$. The final challenge is $x = H(z, y, T)$. Generate the responses $\mathbf{l} = \mathbf{l}(x)$, $\mathbf{r} = \mathbf{r}(x)$, $\hat{t} = \langle \mathbf{l}, \mathbf{r} \rangle$, $\tau = \tau_x + x^{2}\rho$, $\mu = \alpha + x\rho$. Finally, output the proof $\pi = (A, S, T, \hat{t}, \tau, \mu, \mathbf{l}, \mathbf{r})$.
2. (Verify.) Upon receiving the commitment $C_n$ and the proof $\pi$, recalculate the challenges $y = H(A, S, C_n)$, $z = H(y, A, S)$, $x = H(z, y, T)$. Next, compute the offset value $\delta_y = \langle \mathbf{y}^{k}, z\mathbf{1}^{k} + z^{2}\mathbf{2}^{k} \rangle$, and reconstruct the commitment $P = A \cdot S^{x} \cdot h^{-\mu} \cdot \mathbf{g}^{z\mathbf{1}^{k}} \cdot (\mathbf{h}')^{z\mathbf{1}^{k} + z^{2}\mathbf{2}^{k}}$, where $\mathbf{h}' = \mathbf{h} \circ \mathbf{y}^{k}$. Then, verify the inner product $g^{\hat{t}} h^{\tau} \stackrel{?}{=} T \cdot C_n^{z^{2}} \cdot g^{\delta_y}$. If it passes, accept; otherwise, reject.

5.2. Theoretical security analysis

5.2.1. Proof of Game1

Theorem 1. The scheme is unforgeable if the DLP and DDH assumptions hold.

Proof. Suppose that the adversary $\mathcal{A}_1$ forges a credential with non-negligible probability $\epsilon$. We construct a reduction algorithm to solve the DLP or CDH problem with non-negligible advantage $\epsilon - negl$. The reduction algorithm embeds the group parameter tuple $(g, g^{a}, g^{b})$ into the problem instance, can control and program the random oracle, and simulates the whole system:
Setup. Challenger $\mathcal{C}_1$ runs the system initialization algorithm $Setup(1^\lambda, 1^h, 1^m)$ to generate $pp$ and sends $pp$ to the simulator. $\mathcal{C}_1$ saves the issuer private key $isk = (y_1, y_2)$.
Query. In this phase, $\mathcal{A}_1$ makes $\mathcal{H}$-Query, $Query_2$, and $Query_3$ queries; $\mathcal{C}_1$ responds randomly and records the responses.
$\mathcal{H}$-Query: The adversary $\mathcal{A}_1$ can query the random oracles $\mathcal{H}_1, \mathcal{H}_2, \mathcal{H}_3$. Before any hash query, the simulator prepares three empty hash lists $L_1, L_2, L_3$, and defines the numbers of queries as $q_{H_1}, q_{H_2}, q_{H_3}$ to record the query responses.
$\mathcal{H}_1$-Query: Before the $\mathcal{H}_1$ queries, the simulator randomly selects $i_1^* \in [1, q_{H_1}]$. On input an attribute $attr_i$, it records all the queries in the list $L_1$ and responds. If $i = i_1^*$, it returns the value in the list; otherwise it generates $\mathcal{H}_1(attr_i)$ and records $(i, attr_i, \mathcal{H}_1(attr_i))$ in $L_1$.
$\mathcal{H}_2$-Query: Before the $\mathcal{H}_2$ queries, the simulator randomly selects $i_2^* \in [1, q_{H_2}]$. On input each user time period $epoch_i$ and the maximum number of credentials to be initialized $k_i$, it records all queries in the list $L_2$ and responds. If $i = i_2^*$, it returns the value in the list; otherwise it generates $\mathcal{H}_2(epoch \,\|\, k)$ according to Eq. (2):

$$\mathcal{H}_2(epoch_i \,\|\, k_i) = \begin{cases} w^*, & i = i_2^* \\ w_i, & \text{otherwise} \end{cases}. \quad (2)$$

Then, it records $(i, epoch_i \,\|\, k_i, \mathcal{H}_2(epoch_i \,\|\, k_i))$ in the list $L_2$.
$\mathcal{H}_3$-Query: Before the $\mathcal{H}_3$ queries, the simulator randomly selects $i_3^* \in [1, q_{H_3}]$. On input a random $nonce_i$ and message $msg_i$, it records all the queries in the list $L_3$ and responds. If $i = i_3^*$, it returns the value in the list; otherwise it generates $\mathcal{H}_2(nonce \,\|\, msg)$ according to Eq. (3):

$$\mathcal{H}_2(nonce_i \,\|\, msg_i) = \begin{cases} r^*, & i = i_3^* \\ r_i, & \text{otherwise} \end{cases}. \quad (3)$$

Then, it records $(i, nonce_i \,\|\, msg_i, \mathcal{H}_2(nonce_i \,\|\, msg_i))$ in the list $L_3$, where the oracles $\mathcal{H}_2$ and $\mathcal{H}_3$ share a hash function.
$Query_2$: In this phase, the adversary $\mathcal{A}_1$ forges parameters $(ctx^*, nk^*, rk^*, Attrs^*)$, selects the random blind factor $r^* \in \mathbb{Z}_q^*$, makes an $\mathcal{H}_1$-Query, and generates $Cm^* = Commit(nk^*, rk^*, Attrs^*; r^*)$. Next, it chooses $x_u^*, s'^*, t^* \leftarrow \mathbb{Z}_q^*$ and calculates $\Pi_U^{1*}$:

$$\Pi_U^{1*} = SPK_1^*\left\{ (x_u^*, s'^*, t^*, r^*, nk^*, rk^*, Attrs^*) : X_u^* = g_1^{x_u^*} g_2^{s'^*} \;\wedge\; \zeta^* = (g^{a})^{x_u^*} (g^{b})^{s'^*} \cdot Cm^{*\,t^*} \;\wedge\; \iota_{zk}(Attrs^*, iaux_{zk}) = 1 \right\}(X_u^*, \zeta^*, iaux_{zk}, iaux_{pub}).$$

It sends $(\Pi_U^{1*}, iaux_{zk}, iaux_{pub})$ to the issuer, which checks $\iota_{pub}(iaux_{pub})$ and validates $\Pi_U^{1*}$. The issuer aborts if validation fails; otherwise it selects a random number $s''^* \in \mathbb{Z}_q^*$ and performs an $\mathcal{H}_2$-Query. Embedding the tuple $(g, g^{a}, g^{b})$, it registers $cred^* := (\zeta^* \cdot (g^{b})^{s''^*}) \cdot u_0^{w^*}$, generates the forged Merkle tree $T^*$, updates the root node to $T_{root}^*$, selects $z_0^*, z_1^* \leftarrow \mathbb{Z}_q^*$, and calculates

$$\Pi_V^{1*} = SPK_2^*\left\{ (z_0^*, z_1^*, a, b) : Y_u^* = g^{a} g^{b} \;\wedge\; Z^* = (\zeta^* \cdot (g^{b})^{s''^*})^{z_1^*} \cdot u_0^{w^* \cdot z_0^*} \right\}(Y_u^*, s''^*, k^*, Z^*),$$

and sends $(\Pi_V^{1*}, s''^*, k^*, \theta^*)$ to adversary $\mathcal{A}_1$. $\mathcal{A}_1$ calculates $s^* = s'^* + s''^*$ and saves it locally.
$Query_3$: In this phase, $\mathcal{A}_1$ shows the proof. Using the zero-knowledge simulator, the algorithm $ShowCred$ is run on the forged $token^*$ and interacts with $VerifyShow$. Adversary $\mathcal{A}_1$ forges the message $msg^*$ requesting access to the verifier. The verifier selects $nonce^*$, makes an $\mathcal{H}_3$-Query, calculates $r^*$, and returns it to adversary $\mathcal{A}_1$. The adversary performs the $\mathcal{H}_3$-Query hash verification. Selecting the public attributes $attr_i^* \in ATTR^*$ and the secret attributes $attr_j^* \notin ATTR^*$, it calculates $Cm^* = Commit(nk^*, rk^*, attr_j^* \notin ATTR^*; r^*)$, selects $n_j^*$ $(0 \le n_j^* < k^*)$ and $\alpha_0^* \leftarrow \mathbb{Z}_q^*$, generates $\tilde{\Pi}^*$, and sends $(\tilde{\Pi}^*, \{aux_i\}_{i=1}^{n}, \theta^*, T_{root}^*, \Phi', attr_i^* \in ATTR^*)$ to the verifier.
Forgery. Adversary $\mathcal{A}_1$ outputs the forged certificate $cred^*$ and the corresponding authentication path $\theta^*$, which meet the condition that $cred^*$ was not generated through legal issuance. The verifier runs the algorithm $VerifyShow$, with $VerifyShow(pp, V, (cred^*, T_{root}^*), (\tilde{\Pi}^*, \{aux_i\}_{i=1}^{n})) = 1$. Then, the simulator requeries $\mathcal{H}_3$ using the rewinding technique to obtain $r^*$, changes the challenge so that $c \ne c'$, computes the response and outputs $\tilde{\Pi}'^*$ to extract the witness $w^* = (x_u^*, s^*, t^*, r^*, nk^*, rk^*, attr_j^* \notin ATTR^*)$, and separates from the witness $\zeta'^* = (g^{a})^{x_u^*} (g^{b})^{s^*} \cdot Cm^{*\,t^*} = (g^{ab})^{x_u^* \cdot s^*} \cdot Cm^{*\,t^*}$. According to the above proof, suppose it is difficult to compute $g^{ab}$ on $\mathbb{G}$ from the forged credential $cred^*$ and the corresponding authentication path $\theta^*$. The probability that adversary $\mathcal{A}_1$ successfully forges a credential the first time is $\epsilon$, and the probability of success with a single retry is about $\epsilon^{2}$. By the general forking lemma, since adversary $\mathcal{A}_1$ performs $q_{H_3}$ queries, the probability of success is $\epsilon^{2}/q_{H_3}$, so the advantage of the simulator in breaking the CDH hard problem is $\epsilon^{2}/q_{H_3} - negl$.

5.2.2. Proof of Game2

Theorem 2. The scheme is anonymous and unlinkable if the CDH assumption holds.

Proof. Suppose that the adversary $\mathcal{A}_2$ distinguishes credentials with a non-negligible advantage $\epsilon$. We construct a reduction algorithm to solve the DDH problem with a non-negligible advantage $\epsilon - negl$. The reduction algorithm embeds the group parameter tuple $(g, g^{a}, g^{b}, g^{c})$ into the DDH problem instance, the adversary $\mathcal{A}_2$ determines whether $g^{c} = g^{ab}$ or $g^{c}$ is random, and the whole process is simulated:
Setup. Same as the initialization of Game 1.
Query. Adversary $\mathcal{A}_2$ can continue to query issuance and show, but cannot query revocation or presentation of challenge credentials. It can also make $\mathcal{H}_1$-Query queries.
Challenge. Adversary $\mathcal{A}_2$ submits two attribute sets $Attrs_0^*$ and $Attrs_1^*$ that satisfy the same access policy to challenger $\mathcal{C}_2$. The parameter related to the attribute set in the zero-knowledge proof is $\zeta'$. The challenger $\mathcal{C}_2$ calls the simulator to simulate the SPK and embed the group parameter tuple $(g, g^{a}, g^{b}, g^{c})$: it randomly selects $a, b \leftarrow \mathbb{Z}_q^*$ and calculates $\zeta_1'^*$, then selects $c \leftarrow \mathbb{Z}_q^*$ and calculates $\zeta_2'^*$. Next,
|
||||
|
||||
|
||||
|
||||
|
||||
Table 3
Average times of cryptographic and Merkle tree operations.

Symbol          Definition                                secp256k1 (128-bit security)       BLS12-381 (128-bit security)
                                                          100 s/Leaves    1000 s/Leaves      100 s/Leaves                    1000 s/Leaves
$T_{bp}$        Bilinear pairing operation time           -               -                  0.9162 ms                       0.9466 ms
$T_{h}$         Hash computation time                     0.0003 ms       0.0000 ms          0.0001 ms                       0.0000 ms
$T_{ep}$        Exponentiation time in group G            0.0211 ms       0.0314 ms          0.2606 ms                       0.2677 ms
$T_{mp-ec}$     Elliptic curve point multiplication time  0.0254 ms       0.0234 ms          G1: 0.3958 ms, G2: 0.8140 ms    G1: 0.2686 ms, G2: 0.8009 ms
$T_{add-ec}$    Elliptic curve point addition time        0.0462 ms       0.0530 ms          G1: 0.0007 ms, G2: 0.0018 ms    G1: 0.0006 ms, G2: 0.0018 ms
$T_{\kappa G}$  Generation algorithm of tree $T_\kappa$   0.0025 ms       0.0024 ms          0.0029 ms                       0.0023 ms
$T_{\kappa V}$  Verification algorithm of tree $T_\kappa$ 0.0004 ms       0.0002 ms          0.0020 ms                       0.0002 ms
$T_{\kappa U}$  Update algorithm of tree $T_\kappa$       0.0002 ms       0.0002 ms          0.0003 ms                       0.0003 ms
|
||||
|
||||
|
||||
Table 4
Computation and communication cost analysis.

Algorithms        Parameter               Phase    Computation cost                            Communication cost
$Setup$           $pp$                    -        $2T_{ep}$                                   $(13 + m)|\mathbb{G}|$
$IssueSetup_I$    $(I, \iota_{pub})$      -        -                                           -
$ShowSetup_V$     $V$                     -        -                                           -
$IssueReq_U$      $Cm$                    -        $(3 + m)T_{ep} + mT_{h} + 3T_{mp-ec}$       $|\mathbb{G}|$
                  $\Pi_U^1$               Proof    $(16 + m)T_{ep} + 3T_{mp-ec}$               $2|\mathbb{G}| + 5|\mathbb{Z}_q|$
                  $\Pi_U^1$               Verify   $7T_{ep}$                                   -
$IssueGrant_I$    $cred$                  -        $1T_{ep} + 2T_{mp-ec} + 1T_{h}$             -
                  $T_\kappa$              -        $T_{\kappa G}$                              -
                  $\Pi_V^1$               Proof    $8T_{ep} + 1T_{h} + 3T_{mp-ec}$             $2|\mathbb{G}| + 6|\mathbb{Z}_q|$
                  $\Pi_V^1$               Verify   $6T_{ep}$                                   -
$ShowCred_U$      $\tilde{\Pi}$           Proof    $25T_{ep}$                                  $5|\mathbb{G}| + 7|\mathbb{Z}_q|$
                  $\{aux_i\}_{i=1}^{n}$   -        -                                           $i|\mathbb{Z}_q|$
$VerifyShow_V$    -                       Verify   $26T_{ep} + T_{\kappa V}$                   -
$RevokeCred$      $T_\kappa'$             -        $T_{\kappa U}$                              -

Note*: $i$ is the number of access criteria defined per verifier.
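As a worked example, the per-row costs of Table 4 can be summed and evaluated with the secp256k1, 100-leaf timings of Table 3. Adding the $T_{ep}$ entries gives $2 + (3+m) + (16+m) + 7 + 1 + 8 + 6 + 25 + 26 = 94 + 2m$, the $T_{h}$ entries give $m + 2$, and the $T_{mp-ec}$ entries give 11. The sketch below is an illustration only; the struct and function names are hypothetical, and the revocation cost $T_{\kappa U}$ is left out of the total as an assumption, so that the total covers issuance, show and verification for a single user.

```rust
/// Average timings in milliseconds. The constant below copies the
/// secp256k1 / 100-leaf column of Table 3.
pub struct Timings {
    pub t_ep: f64,
    pub t_h: f64,
    pub t_mp_ec: f64,
    pub t_kg: f64,
    pub t_kv: f64,
}

pub const SECP256K1_100: Timings = Timings {
    t_ep: 0.0211,
    t_h: 0.0003,
    t_mp_ec: 0.0254,
    t_kg: 0.0025,
    t_kv: 0.0004,
};

/// Operation counts (T_ep, T_h, T_mp-ec) summed over the rows of Table 4
/// for a credential with m attributes.
pub fn op_counts(m: u32) -> (u32, u32, u32) {
    let ep = 2 + (3 + m) + (16 + m) + 7 + 1 + 8 + 6 + 25 + 26; // = 94 + 2m
    let h = m + 1 + 1; // = m + 2
    let mp = 3 + 3 + 2 + 3; // = 11
    (ep, h, mp)
}

/// Estimated total time in milliseconds, excluding revocation.
pub fn total_ms(m: u32, t: &Timings) -> f64 {
    let (ep, h, mp) = op_counts(m);
    ep as f64 * t.t_ep + h as f64 * t.t_h + mp as f64 * t.t_mp_ec + t.t_kg + t.t_kv
}
```

For $m = 10$ this gives $114 \cdot 0.0211 + 12 \cdot 0.0003 + 11 \cdot 0.0254 + 0.0025 + 0.0004 \approx 2.69$ ms, dominated by the group exponentiations.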
|
||||
|
||||
|
||||
simulator selects $b \leftarrow \{0, 1\}$, and uses $Attrs_b^*$ to generate the credential display $\tilde{\Pi}_b$. Send $(\tilde{\Pi}_b, \{aux_i\}_{i=1}^{n}, \theta, T_{root}, \Phi', attr_i \in ATTR)$ to adversary $\mathcal{A}_2$.

Guess. $\mathcal{A}_2$ guesses $b'$ from the output $\tilde{\Pi}_b$, and the advantage is defined as $\left|\Pr[b' = b] - \tfrac{1}{2}\right|$.

According to the above proof, suppose two attribute sets $(Attrs_0^*, Attrs_1^*)$ satisfying the same access policy are submitted. Given $\tilde{\Pi}_b$, it is difficult to distinguish between $(a, b, a \cdot nk + b \cdot rk + ab \cdot r)$ and $(a, b, a \cdot nk + b \cdot rk + c \cdot r)$ on $\mathbb{G}$. If adversary $\mathcal{A}_2$ succeeds in distinguishing credentials with non-negligible probability $\epsilon / q_{H_1}$, then the advantage of the simulator in breaking the DDH hard problem is $\epsilon / q_{H_1} - negl$.

Note that even if the underlying Merkle path remains the same for repeated authentications, the simulator ensures that each credential presentation is randomized. Therefore, the adversary's advantage does not increase by observing identical path values, which remain computationally indistinguishable across sessions.

Theorem 3. The scheme satisfies attribute privacy if the CDH assumption holds.

The argument is similar to that for anonymity, but it concerns the attributes (properties) rather than the identity.

6. Performance analysis

6.1. Experimental setup

The scheme is evaluated on an AMD Ryzen 9 7945HX processor with Rust 1.75 and Ubuntu 22.04 LTS, and the measurement error is controlled within 5%. The test program is written in Rust and benchmarks SHA-256 hashes, elliptic curve operations, and Merkle tree operations at the 128-bit security level using the secp256k1, BLS12-381, and sha2 libraries. The experiment measured the average time over 100 and 1000 operations (as shown in Table 3). All tests were compiled with --release optimization to ensure accurate and reliable performance results.

6.2. Algorithm computation and communication cost analysis

Table 4 shows the computational cost and communication cost of the algorithms in the proposed scheme. The scheme includes 8 algorithms: $Setup$, $IssueSetup_I$, $ShowSetup_V$, $IssueReq_U$, $IssueGrant_I$, $ShowCred_U$, $VerifyShow_V$ and $RevokeCred$. The computational cost increases linearly with the number of attributes $m$. For a single user, with $\beth$ access criteria per verifier, the overall computation and communication costs are, respectively, $(94 + 2m)T_{ep} + (m + 2)T_h + 11T_{mp-ec} + T_{\kappa G} + T_{\kappa V}$ and $(22 + m)|\mathbb{G}| + (18 + \beth)|\mathbb{Z}_q|$. The cost of each individual algorithm is shown in Table 4.

6.3. Computation and communication cost comparison

In Table 1 of Section 2, we have compared the functions of the existing schemes [19,29–31,33–35]. The schemes [32–34] satisfy the $k$-times periodic anonymous authentication function. Since the scheme [32] is constructed based on bilinear pairing, we compare only the schemes [33,34] with the proposed scheme in terms of the computation cost of issuance, showing and verification, using the lightweight curve secp256k1 environment, as shown in Table 5 and Fig. 3. As indicated in Table 1, the scheme [33] does not support the attribute selective disclosure function, and its cost does not increase with the number of attributes $m$. Therefore, the results in Fig. 3 show that our scheme is better than the scheme [33] when the number of attributes $m$ is small. Throughout the entire process, the overall performance is superior to the scheme [34]. Finally, the results show that our scheme is superior to the existing schemes that provide similar functions.

In addition to the above experimental comparison, we also tested the computational overhead of the proposed scheme under two different curve environments, BLS12-381 supporting bilinear pairing
|
||||
|
||||
|
||||
H. Di et al. Computer Standards & Interfaces 97 (2026) 104097
|
||||
|
||||
|
||||
Table 5
Computation cost comparison. Computation cost (ms) by phase.

| Scheme | Credential issuance | Certificate showing | Authentication credentials |
|---|---|---|---|
| [33] | $15T_{ep} + 10T_{mp-ec} + 2T_{add-ec}$ | $31T_{ep} + 6T_{mp-ec} + T_h$ | $20T_{ep} + 9T_{mp-ec} + T_h$ |
| [34] | $(5m + 40)T_{ep} + (3m + 4)T_h$ | $(m + 22)T_{ep} + T_h$ | $(m + 23)T_{ep}$ |
| Our Scheme | $(m + 35)T_{ep} + (m + 2)T_h + 11T_{mp-ec} + T_{\kappa G}$ | $(16 + m)T_{ep} + mT_h$ | $19T_{ep} + T_h + T_{\kappa V}$ |
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 3. Computation cost comparison, panels (a)–(d).
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 4. Computation cost comparison of different curves.
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 5. Communication cost comparison.
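The communication cost comparison in Section 6.3 reduces to simple arithmetic over the element sizes stated there for secp256k1 (a group element of 33 bytes and a scalar of 32 bytes). The following is a minimal sketch of that arithmetic, not the authors' benchmark code; the function names are hypothetical, and the formulas are the ones quoted in the text for schemes [33], [34] and the proposed scheme.

```rust
// Element sizes stated in Section 6.3 for curve secp256k1.
const G_BYTES: u64 = 33; // |G| = 264 bits (compressed group element)
const ZQ_BYTES: u64 = 32; // |Z_q| = 256 bits (scalar)

/// Size in bytes of a message with `g` group elements and `zq` scalars.
fn cost_bytes(g: u64, zq: u64) -> u64 {
    g * G_BYTES + zq * ZQ_BYTES
}

/// Scheme [33]: 8|G| + 11|Z_q|.
fn scheme_33() -> u64 {
    cost_bytes(8, 11)
}

/// Scheme [34]: (m + 14)|G| + 8|Z_q|, where m is the number of attributes.
fn scheme_34(m: u64) -> u64 {
    cost_bytes(m + 14, 8)
}

/// Proposed scheme: 4|G| + (9 + criteria)|Z_q|, where `criteria` is the
/// number of access criteria per verifier.
fn proposed(criteria: u64) -> u64 {
    cost_bytes(4, 9 + criteria)
}

fn main() {
    // Setting assumed in the text: one attribute and one access criterion.
    let (m, criteria) = (1, 1);
    println!("[33]:     {} bytes", scheme_33());
    println!("[34]:     {} bytes", scheme_34(m));
    println!("proposed: {} bytes", proposed(criteria));
}
```

Under the stated setting of one attribute and one access criterion, these formulas evaluate to 616 bytes for [33], 751 bytes for [34] and 452 bytes for the proposed scheme.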
|
||||
|
||||
|
||||
and lightweight curve secp256k1, as shown in Fig. 4. The experimental results show that the scheme has more advantages under the lightweight curve. It is suggested to apply the proposed scheme under curve secp256k1.

Finally, the communication cost of the existing schemes [33,34] is compared, calculated from the size of the data to be transmitted during the anonymous certificate display process. We test the communication efficiency on curve secp256k1, where the group element and integer sizes are $|\mathbb{G}| = 264$ bits $= 33$ bytes and $|\mathbb{Z}_q| = 256$ bits $= 32$ bytes, respectively. In the test, it is assumed that the access criterion $\beth$ is 1, and the number of user attributes is 1. The communication costs of the schemes [33,34] are respectively $8|\mathbb{G}| + 11|\mathbb{Z}_q|$ and $(m + 14)|\mathbb{G}| + 8|\mathbb{Z}_q|$. The parameters that our scheme needs to transmit for presentation are $(\tilde{\Pi}, \{aux_i\}_{i=1}^{n}, X_0, \zeta', \eta, \Gamma, \theta)$, where $\tilde{\Pi} = (c, A'_1, A'_2, A'_3, A'_4, A'_5, A'_6, A'_7, A'_8)$. Therefore, the total communication cost during the transmission process is $4|\mathbb{G}| + (9 + \beth)|\mathbb{Z}_q|$, as shown in Fig. 5.

7. Conclusion

In this paper, we propose a $k$-times periodic anonymous authentication that does not require the issuer to hold a key and supports the access criteria. Compared with other existing $k$-times periodic anonymous authentication schemes, the proposed scheme not only has lower computational cost, but also eliminates the need for the issuer to hold the issuing information or the user key, and only needs to upload the root path of the Merkle tree to the blockchain or public panel, which ensures that the subsequent authentication can still be carried out even in the case of the failure of the issuing center. In terms of security, it satisfies a series of DAC security properties, including anonymity, unlinkability, unforgeability and attribute privacy. The limitation of current schemes is that they rely on classical cryptography, which cannot resist quantum computing attacks. To address this challenge, we plan to integrate quantum-resistant cryptographic frameworks, such
|
||||
|
||||
|
||||
|
||||
|
||||
as lattice-based signature, code-based cryptography, or multivariate polynomial encryption in future research to construct periodic $k$-times authentication schemes with post-quantum security.

CRediT authorship contribution statement

Hongyan Di: Writing – original draft, Methodology, Formal analysis, Data curation, Conceptualization. Yinghui Zhang: Writing – review & editing, Supervision, Project administration, Methodology, Funding acquisition. Ziqi Zhang: Writing – original draft, Formal analysis, Data curation. Yibo Pang: Project administration, Formal analysis, Data curation. Rui Guo: Writing – original draft, Methodology, Formal analysis. Yangguang Tian: Writing – original draft, Project administration, Methodology, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

[1] K.Y. Lam, C.H. Chi, Identity in the internet-of-things (IoT): New challenges and opportunities, in: Information and Communications Security, 2016, pp. 18–26.
[2] K. Shafique, B.A. Khawaja, F. Sabir, S. Qazi, M. Mustaqim, Internet of things (IoT) for next-generation smart systems: A review of current challenges, future trends and prospects for emerging 5G-IoT scenarios, IEEE Access 8 (2020) 23022–23040, http://dx.doi.org/10.1109/ACCESS.2020.2970118.
[3] L. Ante, C. Fischer, E. Strehle, A bibliometric review of research on digital identity: Research streams, influential works and future research paths, J. Manuf. Syst. 62 (2022) 523–538, http://dx.doi.org/10.1016/j.jmsy.2022.01.005.
[4] M.A. Olivero, A. Bertolino, F.J.D. Mayo, M.J.E. Cuaresma, I. Matteucci, Digital persona portrayal: Identifying pluridentity vulnerabilities in digital life, J. Inf. Secur. Appl. 52 (2020) 102492, URL: https://api.semanticscholar.org/CorpusID:215881538.
[5] M.S. Ferdous, F. Chowdhury, M.O. Alassafi, In search of self-sovereign identity leveraging blockchain technology, IEEE Access 7 (2019) 103059–103079, http://dx.doi.org/10.1109/ACCESS.2019.2931173.
[6] A. Shabtai, Y. Elovici, L. Rokach, List of data breaches and cyber attacks in 2023. Media report. IT governance, 2023, URL: https://www.itgovernance.co.uk/blog/list-of-data-breaches-andcyber-attacks-in-2023.
[7] P.C. Bartolomeu, E. Vieira, S.M. Hosseini, J. Ferreira, Self-sovereign identity: Use-cases, technologies, and challenges for industrial IoT, in: 2019 24th IEEE International Conference on Emerging Technologies and Factory Automation, ETFA, 2019, pp. 1173–1180, http://dx.doi.org/10.1109/ETFA.2019.8869262.
[8] European Union, Regulation (EU) 2016/679 of the European parliament and of the council of 27 april 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing directive 95/46/EC (general data protection regulation), 2016, [Online] Available: URL: https://eur-lex.europa.eu/eli/reg/2016/679/oj/eng.
[9] A. Mühle, A. Grüner, T. Gayvoronskaya, C. Meinel, A survey on essential components of a self-sovereign identity, Comput. Sci. Rev. 30 (2018) 80–86, http://dx.doi.org/10.1016/j.cosrev.2018.10.002.
[10] European Union, Regulation (EU) 2024/1183 of the European parliament and of the council of 5 June 2024 on European digital identity wallets, 2024, URL: https://eur-lex.europa.eu/eli/reg/2024/1183/oj. (Accessed 13 October 2024).
[11] D. Chaum, Security without identification: transaction systems to make big brother obsolete, Commun. ACM 28 (1985) 1030–1044, http://dx.doi.org/10.1145/4372.4373.
[12] D. Chaum, Showing credentials without identification. Signatures transferred between unconditionally unlinkable pseudonyms, in: Proc. of a Workshop on the Theory and Application of Cryptographic Techniques on Advances in Cryptology - EUROCRYPT '85, 1986, pp. 241–244.
[13] J. Camenisch, A. Lysyanskaya, An efficient system for non-transferable anonymous credentials with optional anonymity revocation, in: Advances in Cryptology - EUROCRYPT 2001, 2001, pp. 93–118.
[14] C. Garman, M. Green, I. Miers, Decentralized anonymous credentials, in: Proceedings of the 21st NDSS, 2014, URL: https://www.ndss-symposium.org/ndss2014/decentralized-anonymous-credentials.
[15] D. Derler, C. Hanser, D. Slamanig, A new approach to efficient revocable attribute-based anonymous credentials, in: Cryptography and Coding, 2015, pp. 57–74.
[16] T. Bui, T. Aura, Application of public ledgers to revocation in distributed access control, in: Information and Communications Security, 2018, pp. 781–792.
[17] A. Sonnino, M. Al-Bassam, S. Bano, S. Meiklejohn, G. Danezis, Coconut: Threshold issuance selective disclosure credentials with applications to distributed ledgers, in: 26th Annual Network and Distributed System Security Symposium, NDSS, 2019, URL: https://arxiv.org/pdf/1802.07344.
[18] H. Halpin, Nym credentials: Privacy-preserving decentralized identity with blockchains, in: 2020 Crypto Valley Conference on Blockchain Technology, CVCBT, 2020, pp. 56–67, http://dx.doi.org/10.1109/CVCBT50464.2020.00010.
[19] H. Cui, M. Whitty, A. Miyaji, Z. Li, A blockchain-based digital identity management system via decentralized anonymous credentials, in: Proceedings of the 6th ACM International Symposium on Blockchain and Secure Critical Infrastructure, 2025, pp. 1–11, http://dx.doi.org/10.1145/3659463.3660027.
[20] C. Lin, D. He, H. Zhang, L. Shao, X. Huang, Privacy-enhancing decentralized anonymous credential in smart grids, Comput. Stand. Interfaces 75 (2021) 103505, http://dx.doi.org/10.1016/j.csi.2020.103505.
[21] Z. Ma, J. Zhang, Y. Guo, Y. Liu, X. Liu, W. He, An efficient decentralized key management mechanism for VANET with blockchain, IEEE Trans. Veh. Technol. 69 (2020) 5836–5849, http://dx.doi.org/10.1109/TVT.2020.2972923.
[22] J. Zhang, J. Cui, H. Zhong, I. Bolodurina, L. Liu, Intelligent drone-assisted anonymous authentication and key agreement for 5G/B5G vehicular ad-hoc networks, IEEE Trans. Netw. Sci. Eng. 8 (2021) 2982–2994, http://dx.doi.org/10.1109/TNSE.2020.3029784.
[23] D. Liu, H. Wu, C. Huang, J. Ni, X. Shen, Blockchain-based credential management for anonymous authentication in SAGVN, IEEE J. Sel. Areas Commun. 40 (2022) 3104–3116, http://dx.doi.org/10.1109/JSAC.2022.3196091.
[24] D. Liu, H. Wu, J. Ni, X. Shen, Efficient and anonymous authentication with succinct multi-subscription credential in SAGVN, IEEE Trans. Intell. Transp. Syst. 23 (2022) 2863–2873, http://dx.doi.org/10.1109/TITS.2022.3147354.
[25] L. Wei, Y. Zhang, J. Cui, H. Zhong, I. Bolodurina, D. He, A threshold-based full-decentralized authentication and key agreement scheme for VANETs powered by consortium blockchain, IEEE Trans. Mob. Comput. 23 (2024) 12505–12521, http://dx.doi.org/10.1109/TMC.2024.3412106.
[26] M. Zeng, J. Cui, Q. Zhang, H. Zhong, D. He, Efficient revocable cross-domain anonymous authentication scheme for IIoT, IEEE Trans. Inf. Forensics Secur. 20 (2025) 996–1010, http://dx.doi.org/10.1109/TIFS.2024.3523198.
[27] I. Teranishi, J. Furukawa, K. Sako, K-times anonymous authentication (extended abstract), in: Advances in Cryptology - ASIACRYPT 2004, 2004, pp. 308–322.
[28] L. Nguyen, R. Safavi-Naini, Dynamic k-times anonymous authentication, in: Applied Cryptography and Network Security, 2005, pp. 318–333.
[29] M.H. Au, W. Susilo, Y. Mu, Constant-size dynamic k-TAA, in: Security and Cryptography for Networks, 2006, pp. 111–125.
[30] U. Chaterjee, D. Mukhopadhyay, R.S. Chakraborty, 3PAA: A private PUF protocol for anonymous authentication, IEEE Trans. Inf. Forensics Secur. 16 (2021) 756–769, http://dx.doi.org/10.1109/TIFS.2020.3021917.
[31] J. Huang, W. Susilo, F. Guo, G. Wu, Z. Zhao, Q. Huang, An anonymous authentication system for pay-as-you-go cloud computing, IEEE Trans. Dependable Secur. Comput. 19 (2) (2022) 1280–1291, http://dx.doi.org/10.1109/TDSC.2020.3007633.
[32] J. Camenisch, S. Hohenberger, M. Kohlweiss, A. Lysyanskaya, M. Meyerovich, How to win the clonewars: efficient periodic n-times anonymous authentication, in: Proceedings of the 13th ACM Conference on Computer and Communications Security, 2006, pp. 201–210, http://dx.doi.org/10.1145/1180405.1180431.
[33] B. Lian, G. Chen, M. Ma, J. Li, Periodic K-times anonymous authentication with efficient revocation of violator's credential, IEEE Trans. Inf. Forensics Secur. 10 (3) (2015) 543–557, http://dx.doi.org/10.1109/TIFS.2014.2386658.
[34] Y. Yang, W. Xue, J. Sun, G. Yang, Y. Li, H. Hwa Pang, R.H. Deng, PkT-SIN: A secure communication protocol for space information networks with periodic k-time anonymous authentication, IEEE Trans. Inf. Forensics Secur. (2024) 6097–6112, http://dx.doi.org/10.1109/TIFS.2024.3409070.
[35] C. Wiraatmaja, S. Kasahara, Scalable anonymous authentication scheme based on zero-knowledge set-membership proof, Distrib. Ledger Technol. 4 (2025) http://dx.doi.org/10.1145/3676285.
[36] R. Canetti, Y. Chen, J. Holmgren, A. Lombardi, G.N. Rothblum, R.D. Rothblum, D. Wichs, Fiat-Shamir: from practice to theory, 2019, http://dx.doi.org/10.1145/3313276.3316380.
[37] J. Camenisch, M. Stadler, Efficient group signature schemes for large groups, in: Advances in Cryptology - CRYPTO '97, 1997, pp. 410–424.
[38] M. Rosenberg, J. White, C. Garman, I. Miers, zk-creds: Flexible anonymous credentials from zkSNARKs and existing identity infrastructure, in: 2023 IEEE Symposium on Security and Privacy, SP, 2023, pp. 790–808, http://dx.doi.org/10.1109/SP46215.2023.10179430.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
[39] Y. Dodis, A. Yampolskiy, A verifiable random function with short proofs and keys, 2004, URL: https://eprint.iacr.org/2004/310. Cryptology ePrint Archive, Paper 2004/310.
[40] J. Groth, On the size of pairing-based non-interactive arguments, in: Advances in Cryptology - EUROCRYPT 2016, 2016, pp. 305–326.
[41] V. Shoup, Sequences of games: a tool for taming complexity in security proofs, IACR Cryptol. EPrint Arch. (2004) 332, URL: http://eprint.iacr.org/2004/332.
[42] M. Bellare, P. Rogaway, Random oracles are practical: a paradigm for designing efficient protocols, in: Proceedings of the 1st ACM Conference on Computer and Communications Security, 1993, pp. 62–73, http://dx.doi.org/10.1145/168588.168596.
[43] B. Bünz, J. Bootle, D. Boneh, A. Poelstra, P. Wuille, G. Maxwell, Bulletproofs: Short proofs for confidential transactions and more, in: 2018 IEEE Symposium on Security and Privacy, SP, 2018, pp. 315–334, http://dx.doi.org/10.1109/SP.2018.00020.

Hongyan Di is currently studying for a master's degree in Cyberspace and Information Security at Xi'an University of Posts and Telecommunications. Her research interests include cross-domain authentication and digital signature security.

Yinghui Zhang received his Ph.D. degree in Cryptography from Xidian University, China, in 2013. He is a professor at the School of Cyberspace Security, National Engineering Research Center for Secured Wireless (NERCSW), Xi'an University of Posts & Telecommunications. He was a research fellow at the School of Information System, Singapore Management University. He has published over 100 research articles in ACM CSUR, IEEE TDSC, IEEE TCC, Computer Networks, etc. He has served on the program committees of several conferences and as an editorial member of several international journals in information security. His research interests include public key cryptography, cloud security, and wireless network security.

Ziqi Zhang is currently studying for a master's degree in Cyberspace and Information Security at Xi'an University of Posts and Telecommunications. Her research interests include digital signature security and its applications.

Yibo Pang received the B.S. degree in Information Security from the School of Cyberspace Security, Xi'an University of Posts and Telecommunications, Xi'an, China, in 2020, and the M.S. degree in Cyberspace Security from the School of Cyberspace Security, Xi'an University of Posts and Telecommunications, Xi'an, China, in 2023. He is currently pursuing a PhD at Xi'an University of Posts and Telecommunications. His research interests include multimedia security and privacy.

Rui Guo is an associate professor and master's supervisor at Xi'an University of Posts and Telecommunications. He has presided over a total of 9 scientific research projects, including those funded by the National Natural Science Foundation of China, the Key Research and Development Program of Shaanxi Province, and the Basic Research Program of Shaanxi Province. As a major participant, he has participated in and completed more than 10 projects, such as the National Key Research and Development Plan and the National Natural Science Foundation of China. As first author, he has published over 20 academic papers, among which 12 are indexed by SCI (including 1 TOP 1% ESI highly cited paper).

Dr. Yangguang Tian received his Ph.D. degree in applied cryptography from the University of Wollongong, Australia. After his Ph.D., he held post-doctoral positions at the School of Information System, Singapore Management University, and iTrust, Singapore University of Technology and Design. Before Surrey, he was a research-based assistant professor at Osaka University, Japan. He is currently a lecturer at the University of Surrey, UK. His research interests include applied cryptography, network security, blockchain technologies, and privacy-preserving technologies. Dr. Tian's recent research works have been published in cybersecurity-related international conferences and journals, such as USENIX'24, AsiaCCS'24, IEEE TIFS'23, IEEE TDSC'24, etc.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,897 @@
|
||||
Computer Standards & Interfaces 97 (2026) 104094
|
||||
|
||||
|
||||
Contents lists available at ScienceDirect
|
||||
|
||||
|
||||
Computer Standards & Interfaces
|
||||
journal homepage: www.elsevier.com/locate/csi
|
||||
|
||||
|
||||
|
||||
|
||||
How AI agents transform reflective practices: A three-semester comparative study in socially shared regulation of learning

Yumin Zheng a, Fengjiao Tu b, Fengfang Shu a,c, Chaowang Shang a,*, Lulu Chen a, Jiang Meng a

a Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China
b Department of Information Science, University of North Texas, 3940 North Elm, Denton, Texas, 76203, USA
c Institute of Open Education, Wuhan Vocational College of Software and Engineering, Wuhan Open University, Wuhan, China
|
||||
|
||||
|
||||
|
||||
|
||||
ARTICLE INFO

Keywords:
Artificial intelligence agent
Socially shared regulation of learning
Reflection quality
Collaborative learning
Generative artificial intelligence

ABSTRACT

High-quality reflection has been a challenging barrier in the socially shared regulation of learning (SSRL). Especially with the emergence of generative artificial intelligence (GAI), traditional methods such as reflection reports may increase the students' risk of superficial reflection. This study uses an artificial intelligence agent (AI agent) to design a reflection assistant, which aims to enhance students' reflection ability through continuous questioning and real-time, content-specific feedback based on their written reflections. Through a comparative experiment conducted over three semesters, this study demonstrates the different impacts of three reflection methods, reflection reports, reflection short-answer questions, and AI agents, on the quality of university students' reflections. The results indicate that there is a significant difference in the quality of reflection among the three reflection methods. Students using AI agents show the highest levels of reflection, characterized primarily by connective reflection and critical reflection. Epistemic network analysis further reveals that the AI agent reflection method is more effective in improving the reflection quality of low-performance teams than that of high-performance teams. This expands AI agents' use in SSRL reflection, introduces new methods for the GAI era, and provides practical experience and reflection intervention strategies for teachers and instructional designers in SSRL.
|
||||
|
||||
|
||||
|
||||
|
||||
1. Introduction

With the rapid advancement of generative artificial intelligence (GAI), numerous challenges in collaborative learning have been addressed with innovative solutions [1,2]. GAI applications, represented by artificial intelligence agents (AI agents), have introduced revolutionary transformations to education. These transformations are mainly due to the powerful expert-level conversational abilities and user-friendly accessibility [3].

The socially shared regulation of learning (SSRL) strategy serves as a crucial mechanism for enhancing learning outcomes in collaborative learning [4]. Through the SSRL strategy, learners collaboratively set goals and monitor progress, thereby improving their performance [5]. Reflection is a critical component of SSRL, aiding learners in recognizing and refining their learning processes [6]. However, achieving high-quality reflection remains a challenge [7].

There are various methods to enhance reflection quality in SSRL, such as providing prompts and templates in reflection reports [8]. Nowadays, these traditional methods fall short of addressing the challenges posed by GAI [9]. Students may easily rely on tools like ChatGPT to complete short-answer questions, journals, and reports. Kiy [10] has shown that 76% of university students use ChatGPT for their assignments, with the percentage being even higher among software engineering students, reaching 93% [11]. The widespread use of GAI has profoundly transformed traditional methods of learning and teaching, and this era calls for new approaches to reflection.

AI agents are computing systems with capabilities for autonomous perception, decision making, and action [12]. They use GAI to learn, reason, and perform corresponding tasks or actions from the surrounding environment and input information. To enable practical implementation, rule-based AI agents have been developed that require no programming and can be deployed simply by defining task objectives and roles via prompts. In educational contexts, these rule-based AI agents are commonly used for personalized instruction and intelligent tutoring due to their ability to engage in real-time dialogue and provide immediate feedback [13].
|
||||
|
||||
|
||||
* Corresponding author.
|
||||
E-mail address: phdzhengyumin@mails.ccnu.edu.cn (C. Shang).
|
||||
|
||||
https://doi.org/10.1016/j.csi.2025.104094
|
||||
Received 1 February 2025; Received in revised form 28 October 2025; Accepted 10 November 2025
|
||||
Available online 11 November 2025
|
||||
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
|
||||
Y. Zheng et al. Computer Standards & Interfaces 97 (2026) 104094
|
||||
|
||||
|
||||
The rule-based AI agent provides an effective approach for supporting SSRL reflection. Instructors can set specific SSRL task directions, and the agent guides students based on the reflection checklist while adaptively generating questions according to students' responses. Each follow-up question is dynamically generated based on the student's prior answers and the specific SSRL task, making it difficult for students to rely on external AI tools like ChatGPT to provide generic responses. This continuous dialogue mechanism supports deeper, more analytical reflection and reduces the risk of superficial reflection [14]. Although AI agents have broad application prospects, current research on using them to improve learners' reflection quality remains limited and requires further in-depth exploration.

Against this backdrop, this study introduces a rule-based AI agent reflection assistant within the SSRL framework to help learners enhance their reflection quality. This study aims to examine the impact of the AI agent on SSRL reflection quality by comparing three reflection methods: reflection reports, short-answer reflection questions, and AI agent-based reflection. In addition, different methods may lead to different reflection qualities among learners in high and low-performance teams [15]. Therefore, we further explored the differences in reflection quality between high and low-performance teams when using these three reflection methods. We proposed the following research questions:

RQ1: How does the AI agent reflection assistant affect learners' reflection quality in SSRL?
RQ2: What differences do high and low-performance teams show in reflection quality when using the three reflection methods?

This study conducted a three-semester comparative teaching experiment to evaluate the impact of AI agents and two traditional reflection methods (reflection reports and short-answer questions) on university students' reflection quality. Using statistical analysis, content analysis, and epistemic network analysis (ENA), this study examines the effectiveness of AI agents in enhancing university students' reflection quality in SSRL.

The main contributions of this study are summarized as follows:

- We introduce a practical SSRL activity, providing educators with a valuable instructional framework for facilitating collaborative learning.
- We integrated an AI agent reflection assistant in SSRL and provided a comprehensive debugging process, offering instructors examples and considerations of AI agent implementation.
- We revealed the reflection quality differences between high and low-performance teams in various reflection approaches and demonstrated the advantages of the AI agent for low-performance teams.

The paper is organized as follows: Section 2 reviews prior research on AI agents in education, SSRL theory, and reflection. Section 3 describes the participants, research design, and methods for data collection and analysis. Section 4 compares reflection quality across the three methods and examines differences between high and low-performance teams using ENA. Section 5 discusses the results and implications. The paper concludes with a summary and potential directions for future research.

2. Literature review

To explore the impact of AI agents on learning processes, it is essential to examine their application in education, followed by a discussion on SSRL and reflection.

2.1. AI agents in teaching

widely applied in education [16]. It can support collaborative learning through personalized instruction, real-time feedback, and intelligent assessment [17]. AI agents, a form of GAI equipped with autonomous learning and decision-making capacities, have emerged as key instructional tools in global educational research.

Empirical studies have shown that AI agents significantly improve student engagement [18,19], learning motivation [20,21], and academic performance [22]. AI agents exist in various forms, such as chatbots [23], intelligent tutoring systems (ITS; [24]), embodied conversational agents (ECA; [25,26]), and intelligent virtual assistants (IVA; [13,27]). Among these, GAI-based chatbots have been widely adopted in education due to their customizable roles and flexible deployment. The present study focuses on this type of conversational AI agent.

In higher education, AI agents have been shown to support higher-order thinking skills, such as critical thinking, metacognition, and problem-solving [23,28,29]. In these studies, GAI was embedded within structured reflection activities, allowing students to engage in guided reflective processes targeting specific cognitive skills. For example, Hong et al. [29] employed AI to handle lower-level tasks in essay writing, enabling students to focus on evaluation and reflection, thereby enhancing critical thinking. Chen et al. [28] implemented metacognitive strategy-supported AI agents that prompted process-oriented reflection and multi-perspective discussion, improving metacognitive skills. Zhou et al. [23] situated reflection within a self-regulated learning framework, showing that GAI-supported reflection indirectly benefits critical thinking and problem-solving.

Although these studies demonstrate that AI agents can enhance higher-order thinking, reflection itself has often been treated merely as a learning process rather than a measurable skill. Reflection is a core component of higher-order thinking and an essential learning competency for 21st-century university students. Empirical evidence directly examining the impact of AI agents on learners' reflective abilities, particularly in collaborative learning environments, remains scarce. Investigating this relationship is therefore necessary to understand how AI agents can effectively support the development of reflection.

2.2. Socially shared regulation of learning and reflection

Collaborative learning includes three primary types of regulation: self-regulation (SR), co-regulation (CoR), and socially shared regulation (SSR) [30,31]. Based on SSR theory, socially shared regulation of learning (SSRL) is an emerging collaborative learning strategy emphasizing mutual support and feedback among team members. The strategy consists of four key stages: goal setting, task distribution, progress monitoring, and reflection evaluation [32–35]. Research indicates that the SSRL strategy has a positive impact on collaborative learning [36]. Learners may enhance their awareness of the collaborative process and facilitate the activation of regulatory processes through SSRL [4]. SSRL also helps to enhance learners' cognitive and metacognitive abilities, boosting learning motivation and engagement [37,38]. Additionally, SSRL fosters communication among team members, improving collaborative efficiency [39]. Thus, SSRL has been widely incorporated into collaborative learning and plays a significant role in enhancing various learner abilities.

Reflection quality is a key indicator for assessing the success of SSRL [39]. High-quality reflection is an indispensable component of SSRL, as it enables learners to examine and evaluate their learning processes and outcomes [40]. Unlike conventional collaborative learning, the reflection content in SSRL emphasizes the process of mutual regulation and monitoring among group members. However, since reflection is the final stage of SSRL, educators often overlook its significance [41]. Teachers' lack of emphasis on the reflection stage may lead to low-quality reflection among students [42]. Achieving high-quality SSRL reflection
|
||||
Generative Artificial Intelligence (GAI), defined as AI systems remains a persistent challenge for educators and students [43].
|
||||
capable of autonomous learning and content generation, has been To enhance students’ reflective abilities, it is essential to focus on the
|
||||
|
||||
|
||||
Y. Zheng et al. Computer Standards & Interfaces 97 (2026) 104094
|
||||
|
||||
|
||||
definition of reflection. Dewey [44] defined reflection as a continuous elaborated on the activities of SSRL and the design process of the AI
|
||||
process of exploring and evaluating experiences, which helps in agent. Lastly, we discussed the coding scheme for reflection quality and
|
||||
dividuals gain a deeper understanding of their behaviors and outcomes. provided the methodology for data collection and analysis.
|
||||
Zimmerman [45] further emphasized that self-reflection is a complex
|
||||
learning process involving various aspects of self-monitoring, such as 3.1. Participants
|
||||
self-assessment and feedback on contributions. In the theory of SSRL,
|
||||
reflection encompasses not only self-assessment but also shared moni The participants were from the course “Internet Thinking and Digital
|
||||
toring processes with others [39]. These theories provide support for Self-Learning” over three semesters: Spring 2023, Fall 2023, and Spring
|
||||
exploring and promoting the reflective process. 2024. A total of 97 undergraduate students, aged 18 to 22, took part in
|
||||
In reflective activities, teachers can support students’ deep learning this study (Table 1).
|
||||
and reflective abilities through various intervention strategies, such as At the beginning of each semester, students completed a pre-test
|
||||
scaffolding, reflective prompts, and feedback [46]. Reflective scaf using the CThQ [63], which assesses six cognitive dimensions: mem
|
||||
folding involves providing structured guidance to help students more ory, comprehension, application, analysis, evaluation, and creation
|
||||
effectively review and analyze their learning experiences [47]. When (overall reliability α = 0.87). According to Dewey [64], critical thinking
|
||||
designing reflection tasks for SSRL, teachers often utilize the SSRL is a deepening and extension of reflective thinking, with high consis
|
||||
reflection scaffolds developed by Panadero et al. [48]. Additionally, tency in cognitive processing, reasoning, and evidence evaluation. The
|
||||
reflective prompts and guiding questions steer students toward specific CThQ pre-test provides a valid proxy for students’ baseline reflection
|
||||
directions for reflection, assisting them in identifying potential barriers levels. One-way ANOVA indicated no significant differences in pre-test
|
||||
and challenges in their learning [49]. Feedback provides learners with total scores among the three groups (Group 1: M = 105.07, SD =
|
||||
suggestions or information to improve task performance, helping them 6.13; Group 2: M = 103.72, SD = 4.19; Group 3: M = 105.22, SD =
|
||||
optimize both their reflection and learning processes [50]. From a 4.24), F(2, 86) = 1.33, p = 0.27, suggesting comparable reflection
|
||||
cognitive perspective, feedback serves as guidance to enhance students’ abilities across groups prior to the intervention.
|
||||
task performance [51]. Timely feedback on students’ reflections not Participants were divided into 3 groups, each employing a different
|
||||
only improves the quality of subsequent reflections but also deepens reflection method, and within each group, students were further divided
|
||||
their understanding of reflective concepts [52]. into teams using random assignment to minimize potential biases arising
|
||||
Reflection journals, reflection reports, and reflection short-answer from prior academic performance, familiarity, or interpersonal prefer
|
||||
questions have been explored to improve reflection quality [53,54]. ence. Random assignment was chosen over self-selection or instructor-
|
||||
However, the traditional methods may not adapt to the advancements of based grouping to ensure group equivalence and to enhance the inter
|
||||
GAI. These require students to submit longer texts, which inevitably nal validity of the comparative analysis [65].
|
||||
causes a risk of superficial reflections due to the use of GAI. Some The first group (G1), consisting of 31 students from the Spring 2023
|
||||
scholars have also modified reflection methods from a technological semester, conducted reflection reports and were further divided into 7
|
||||
perspective by using various reflection platforms, such as Google Docs teams. The second group (G2), consisting of 30 students from the Fall
|
||||
[55], Flipgrid [56], the VEO app [57], and Wiki [58]. However, these 2023 semester, conducted short-answer reflections and were divided
|
||||
platforms primarily offer static or limited interaction, which constrains into 7 teams. The third group (G3), consisting of 36 students from the
|
||||
students’ ability to adaptively engage in reflective processes. The Spring 2024 semester, conducted reflections through continuous ques
|
||||
low-quality reflection issues in SSRL urgently require new solutions. tioning by an AI agent and were divided into 9 teams. Additional in
|
||||
Although GAI poses challenges to traditional reflection methods, it formation about the participants is provided in Table 1.
|
||||
also offers new solutions. AI agents are increasingly regarded as effective
|
||||
tools for supporting reflection practices. Research indicates that the use 3.2. Design of socially shared regulation of learning activities
|
||||
of AI agents in reflection activities may enhance students’ learning
|
||||
motivation and engagement [59]. Teachers can use AI agents to design During the 4-week activity, students collaborated in teams to pro
|
||||
reflection scaffolding, assisting learners in conducting more in-depth duce micro-lesson videos lasting 5 to 8 min. The activity was divided
|
||||
and systematic reflections [60]. In addition, AI agents may enhance into 4 stages, each lasting one week (Table 2).
|
||||
reflection quality through data analysis and intelligent feedback [61]. In the first week (goal setting), students were required to establish a
|
||||
Therefore, AI agents demonstrate potential in addressing the issue of common goal, select the video’s theme, and outline the content frame
|
||||
improving SSRL reflection quality. work. Then, they submitted a project proposal detailing the topic, ob
|
||||
Thus, this study designed a reflection assistant by AI agents to jectives, task distribution, and timeline. In the second week (task
|
||||
enhance university students’ reflection quality in SSRL. Statistical distribution), the teams followed their project plan to allocate tasks and
|
||||
analysis, content analysis, and ENA were employed to collect and begin executing the project. The instructor provided guidance and
|
||||
analyze textual data related to reflection quality. By comparing the AI suggestions throughout this process. In the third week (progress moni
|
||||
agent reflection assistant with traditional SSRL strategy reflection scaf toring), each team submitted a video sample that was between 1 and 2
|
||||
folding, this study analyzed the differences in reflection content and min long. The instructor conducted an initial evaluation based on the
|
||||
reflection levels among university students across three methods. sample and suggested improvements. Students refined and adjusted their
|
||||
Additionally, previous research suggests that high and low-performance video production based on the feedback. In the fourth week (reflection
|
||||
teams may experience different effects from various reflection methods evaluation), students submitted their completed micro-lesson videos
|
||||
[62]. Therefore, this study further explores the differences between high
|
||||
and low-performance teams when using three reflection methods. This
study provides new theoretical evidence for using AI agents in SSRL
reflection practices.

Table 1
Participant and group information.

Group  Course       Reflection method        Team  Participant  Female  Male
G1     Spring 2023  Report                   7     31           17      14
G2     Fall 2023    Short-answer questions   7     30           19      11
G3     Spring 2024  AI reflection assistant  9     36           20      16

3. Methodology

This study employed a quasi-experiment to explore the differences
among three reflection methods in SSRL and to examine whether AI
agents improve the reflection quality of university students. Firstly, we
provided information about the participants and the course. Then, we
|
||||
|
||||
|
||||
|
||||
|
||||
Table 2
The stages of SSRL.

Week  SSRL stages            Description
1     Goal setting           Students discuss the goal, theme, and framework.
2     Task distribution      Students allocate tasks and make the micro-lesson videos.
3     Progress monitoring    Students monitor the task and submit a video sample.
4     Reflection evaluation  Students submit completed micro-lesson videos and individual reflection assignments.

AI agent reflection assistant, Crystal, was developed using the Coze
platform (https://www.coze.cn/). The AI agent consists of 4 core components,
with Part A being the AI agent’s name, Part B defining the role
setting and response logic, Part C specifying the conversational experience,
such as the opening dialogue, and Part D serving as the preview
interface. Developing the AI agent requires following these operational
steps.
|
||||
Step 1: Create the AI agent and name it Crystal (as
|
||||
shown in Fig. 1, Part A). Define it as the reflection assistant for the
|
||||
course “Internet Thinking and Digital Self-Learning.” Set its duty to
|
||||
and individual reflection assignments (employing different reflection guide students in completing tasks (as shown in Fig. 1 Part B) and
|
||||
methods for each of the three semesters). Finally, a reflection-sharing design the opening statement (as shown in Fig. 1 Part C).
|
||||
session was held in class, where students exchanged learning experi Step 2: Set up the reflection task (as shown in Fig. 1, Part B). Input all
|
||||
ences and insights. the questions from the SSRL reflection scaffolding developed by
|
||||
Panadero et al. [48] into the AI agent as the question base. This
|
||||
ensures a logical flow of questions from the AI agent to the students,
|
||||
3.3. Design of the three reflection methods preventing task misdirection. In addition, the AI agent was not
|
||||
restricted to this fixed list but generated follow-up questions,
|
||||
Prior to the reflection phase, all students completed a four-week particularly “Why” questions, based on the students’ specific an
|
||||
SSRL activity in which the instructor introduced and practiced the swers, which reflected its adaptiveness.
|
||||
four SSRL stages. Consequently, all reflections were anchored in the Step 3: Set up the response rule (as shown in Fig. 1, Part B). Establish
|
||||
teams’ performance across these four stages. In G1, the reflection the response rules for the AI agent:
|
||||
remained open-ended within this framework and only specified a min a. Ask only one reflection question per interaction.
|
||||
imum length of at least 200 words (no SSRL question list was provided). b. Provide encouraging feedback that adapts dynamically after each
|
||||
In G2, students conducted individual reflections through short- response (e.g., “You did a great job”, “Your reflection is very
|
||||
answer questions. The guiding questions were derived from the SSRL insightful”).
|
||||
reflection scaffolding [48]. For example, questions included “What is the c. Avoid using academic terms.
|
||||
group’s current assignment?” and “What obstacles might the group d. Use only special interrogative questions (e.g., “What,” “Why”),
|
||||
encounter?” with follow-up questions adjusted according to students’ responses.
|
||||
G3 students used the AI agent reflection assistant for their re e. After answering all questions, conclude the conversation and ex
|
||||
flections. After the SSRL task, the instructor provided students with a press gratitude.
|
||||
quick response code (QR code) linking to the AI agent’s website. Stu Step 4: Testing and deployment (as shown in Fig. 1, Part D). Check
|
||||
dents scanned the QR code with their phones to initiate a conversation the conversation flow and ensure the AI agent’s smooth and effective
|
||||
with the AI agent. Each student completed the reflection task through interactions. Select 5 students for a second round of testing to ensure
|
||||
the dialogue.
|
||||
The development process of the AI agent is illustrated in Fig. 1. The
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 1. AI agent development interface on the Coze platform.
|
||||
|
||||
|
||||
|
||||
|
||||
the conversation flows smoothly. Once confirmed, the AI agent can
be deployed and available to all students.

3.4. Experimental procedure

The experimental procedure is illustrated in Fig. 2. As described in
the Participants section, all students completed the CThQ [63] as a
pre-test before the course. They then attended a 16-week course
covering basic concepts. All students were taught by the same instructor,
with the course content, teaching methods, and learning resources
remaining entirely consistent across the three semesters. Students
participated in a 4-week group collaboration activity, “creating micro-lesson
videos”, conducted using the SSRL strategy. After the group activity
finished, each student was assigned an individual reflection task.
G1 and G2 used traditional reflection methods, with G1 completing
reflection reports and G2 answering short-answer questions. G3
employed a new reflection method, utilizing the AI agent reflection
assistant.

Table 3
Learner reflection quality coding scheme.

Categories          Coding  Description
Reflection level    NOR     Lacking a reflection mindset.
                    LOWR    Having a reflective mindset involves reviewing experiences,
                            describing facts and feelings, and reflecting on what has been
                            learned. It also encompasses the ability to connect new knowledge
                            with existing knowledge and to improve learning strategies.
                    HIGHR   Critically analyzing the current situation, attempting to view
                            problems from different perspectives, forming new viewpoints from
                            available resources, and seeking to test hypotheses.
Reflection content  DESR    A description of “what” the object of reflection is.
                    EXPR    An explanation of the causes behind the object of reflection,
                            addressing the “why” often indicated by keywords such as
                            “in order to”, “due to”, or “so as to”.
                    CONR    Understanding whether the object of reflection has changed across
                            different times and contexts, coupled with an analysis of the
                            reasons for these changes and their impact on behavior, represents
                            a higher level of analysis concerning the “what” and “why”.
                    CRIR    It identifies personal or team issues and analyzes them with
                            theory and practice to solve problems, focusing on “how” to
                            achieve self-reconstruction. This may include keywords like
                            “needs improvement” or “next stage”.

3.5. Data collection and analysis

After the three semesters, the reflection texts of all students were
|
||||
collected and anonymized. G1 produced 31 reflection reports totaling
|
||||
8032 words. G2 submitted 30 reflection short-answer texts, totaling
|
||||
15,468 words. G3’s AI agent reflection assistant dialogues comprised 36 To ensure reliability, a coding discussion group comprised two experts
|
||||
submissions, totaling 16,801 words (excluding the AI agent’s questions). and two professional coders. First, the two coders preliminarily
|
||||
Content analysis was used to process the reflection texts. Through coded the first 10 % of the reflection texts. In cases of disagreement, they
|
||||
systematic coding rules, this method reduced the influence of subjective consulted with the experts to reach a consensus. After training and
|
||||
judgment and personal bias, thereby providing more objective results. repeated practice, the coders achieved a high level of consistency. The
|
||||
The coding scheme consists of two parts: reflection level and reflection coders strictly adhered to the revised coding scheme during the formal
|
||||
content, as shown in Table 3. The reflection level coding scheme is based coding process. After coding, inter-coder reliability was calculated,
|
||||
on Plack et al. [66], and it is used to assess the overall reflection level of yielding a Cohen’s kappa coefficient of 0.87, indicating that the coding
|
||||
learners, categorized into no reflection (NOR), low reflection (LOWR), process had a high level of reliability. The coders consulted with experts
|
||||
and high reflection (HIGHR). The reflection content coding scheme is for different coding results and ultimately reached an agreement.
|
||||
based on Wang et al. [67] and is used to explore the differences in the After coding the reflection texts using the content analysis, ENA was
|
||||
types of learners’ reflection content. The reflection content is catego employed to conduct a fine-grained analysis of the reflection data.
|
||||
rized into 4 types: descriptive reflection (DESR), explanatory reflection Content analysis excels at systematically and objectively analyzing large
|
||||
(EXPR), connected reflection (CONR), and critical reflection (CRIR), volumes of textual content. ENA focuses on uncovering the complex
|
||||
with reflection quality progressively increasing across these categories. relational networks between elements, such as reflection levels. The
|
||||
The reflection texts in the reflection reports and short-answer re combination of the two methods allows for attention to both the char
|
||||
flections were relatively longer, while those in the AI agent dialogues acteristics of the text itself and the internal relationships between the
|
||||
were shorter. To mitigate the differences caused by these length content elements. Additionally, the ENA Webkit
|
||||
discrepancies, this study used a single complete sentence as the minimum (http://www.epistemicnetwork.org/) provides a stable environment for data analysis.
|
||||
coding unit. For example, the statement “As the group leader, I am quite To investigate the differences in reflection quality between the high
|
||||
decisive. I directly assigned tasks to everyone, and the group was sup and low-performance teams, we assessed the micro lesson videos
|
||||
portive.” should be coded as two separate sentences. completed by students in SSRL. The videos were assessed by two experts
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 2. Experimental procedure.
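
The inter-coder reliability described in Section 3.5 is reported as a Cohen's kappa coefficient. As a minimal sketch of how such a coefficient is computed for two coders, the Python example below uses a short, purely hypothetical list of sentence codes (the labels are illustrative and are not data from this study); the function name `cohens_kappa` is likewise an assumption made for the example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same coding units."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both coders must label the same non-empty set of units.")
    n = len(labels_a)
    # Observed agreement: share of units that received the same code.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the coders' marginal proportions per code.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[code] * count_b[code] for code in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code for every unit
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for ten sentences (illustration only, not study data).
coder_a = ["DESR", "DESR", "EXPR", "EXPR", "CONR", "CRIR", "DESR", "EXPR", "CRIR", "CONR"]
coder_b = ["DESR", "DESR", "EXPR", "DESR", "CONR", "CRIR", "DESR", "EXPR", "CRIR", "CRIR"]

kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which is why it is usually preferred over simple percent agreement when the codes are unevenly distributed.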
|
||||
|
||||
|
||||
|
||||
|
||||
in education, each with over 10 years of teaching experience. The
evaluation criteria included the following categories, with topic selection
worth 10 points, instructional design 40 points, content completeness
20 points, audio-visual quality 20 points, and artistry 10 points.
Each group received a score ranging from 0 to 100 points. The two experts
thoroughly discussed the evaluation criteria to ensure consistency
in scoring and then individually assessed all instructional designs and
materials. The scoring consistency between the two experts (Spearman
correlation coefficient) was 0.86 (p < 0.01).

Table 5
The result of the Kruskal-Wallis H test.

                            Mean score
                    Codes   G1      G2      G3      χ²      p
Reflection quality  NOR     0.018   0.005   0.088   6.557   0.038
                    LOWR    0.267   0.163   0.232
                    HIGHR   0.018   0.044   0.218
                    DESR    0.229   0.197   0.262
                    EXPR    0.100   0.103   0.264
                    CONR    0.038   0.028   0.221
                    CRIR    0.006   0.037   0.200

The average score from both experts was used as the final score for
|
||||
each group (Table 4). The grouping criteria for high and low performing
|
||||
|
||||
teams proposed by Hou [68] have been widely adopted by scholars [69].
|
||||
In this study, based on those criteria, the top 15 % of teams were clas significance of 0.038. The mean ranks for the 3 groups were G1 = 9.14,
|
||||
sified as the high-performance teams, including G1-team7, G2-team2, G2 = 8.00, and G3 = 15.86. The results indicate a statistically significant
|
||||
and G3-team1. The bottom 15 % of teams were classified as the difference in reflection scores between the groups (p = 0.038).
|
||||
low-performance teams, including G1-team5, G2-team6, and G3-team4. Specifically, G3’s mean rank was significantly higher than G1 and G2,
|
||||
Using ENA, we further explored the differences between the high and indicating that using the AI agent is associated with higher performance.
|
||||
low-performance teams of students. To further investigate the observed differences, we applied ENA for a
|
||||
fine-grained analysis of the students’ reflections across the 3 reflection
|
||||
3.6. IRB approval and AI agent data privacy methods. This analysis aims to uncover the epistemic structures and
|
||||
patterns, providing deeper insights into how different reflection
|
||||
This study has received approval from the Institutional Review Board methods influence the quality and complexity of students’ reflection
|
||||
(IRB) of the university, ensuring that all ethical standards are met. All processes. By analyzing epistemic networks, we may better understand
|
||||
students participated voluntarily, fully aware of the study’s purpose and the specific epistemic factors and relationships underlying the differ
|
||||
procedures, and signed informed consent forms prior to the ences observed in the statistical results.
|
||||
commencement of the experiment. In addition, to protect participants’ Fig. 3 presents a comparative ENA network model of reflection
|
||||
privacy, all data collected during the study were anonymized. content for the three groups using different reflection methods. In this
|
||||
All conversations on the Coze platform were fully anonymized, and model, nodes represent individual reflection codes, and edges indicate
|
||||
students were reminded before using the platform not to enter any the co-occurrence of codes within each unit of analysis. Blue, red, and
|
||||
personal or sensitive information (such as name, student ID, gender, or purple dots denote the centroids of students in G1, G2, and G3,
|
||||
school). Data was labeled only with class sequence numbers (e.g., Stu respectively, while the four black dots represent the four categories of
|
||||
dent 1, Student 2), and access was strictly limited to the research team. reflection content (DESR, EXPR, CRIR, CONR). ENA applies singular
|
||||
In addition, all students signed the Coze platform’s privacy protection value decomposition (SVD) to reduce the network model to two di
|
||||
agreement, and the platform further ensures data security through mensions, which together account for 70.1 % of the variance (SVD1 =
|
||||
anonymization and encryption techniques. 51.5 %, SVD2 = 18.6 %). The x-axis in the ENA space (SVD1) defines the
|
||||
dimension of reflection content, with the right side (higher x-values)
|
||||
4. Results representing DESR codes and the left side (lower x-values) representing
|
||||
CONR codes. The y-axis (SVD2) in the ENA space defines the dimension
|
||||
The results are organized to address the key research questions of reflection content, where the CRIR and EXPR codes are positioned
|
||||
regarding the effectiveness of the AI agent and the differences in higher (with higher y-values). The DESR code is located lower in the
|
||||
reflection quality across various reflection methods. ENA space (with lower y-values). This model allows comparison across
|
||||
students and groups, showing which types of reflection are more
|
||||
4.1. How does the AI agent reflection assistant affect learners’ reflection dominant and how reflection content patterns differ between groups.
|
||||
quality in SSRL? The right side of Fig. 3 displays the mean networks of the 3 groups.
|
||||
Overall, the reflection content of all 3 groups predominantly features
|
||||
A Kruskal-Wallis H test was conducted to assess the differences in EXPR and DESR, with a strong association observed between these two
|
||||
SSRL reflection scores among the 3 groups of students using different points. The reflection content network of G1 is the sparsest, with only a
|
||||
reflection methods, as shown in Table 5. The test compares independent few occurrences of CRIR, aside from the relatively frequent appearances
|
||||
samples without assuming a normal data distribution. This makes it of EXPR and DESR. The network of G2 is more concentrated, with dis
|
||||
highly suitable for analyzing the multiple groups of non-normally tribution across all 4 reflection types and a stronger CRIR-DESR
|
||||
distributed reflection data in this study. connection (value of 0.10). The reflection content of G3 is the most
|
||||
For this analysis, an overall reflection quality score was calculated densely connected, with all 4 types having a relatively high proportion
|
||||
for each student by taking the mean of all seven reflection codes of representation. The CRIR-CONR (0.23) and CONR-EXPR (0.13)
|
||||
(NOR, LOWR, HIGHR, DESR, EXPR, CONR, CRIR). This composite score connections are relatively strong. In contrast, the other pairs based on
|
||||
was used for the Kruskal-Wallis H test, while the mean scores for indi traditional SSRL reflection did not exhibit strong correlations.
|
||||
vidual codes presented in Table 5 are provided only for descriptive Table 6 demonstrates how the AI agent, through guided dialogue,
|
||||
purposes. facilitated the transition of G3 students from connected reflection
|
||||
The results showed a chi-square value of 6.557, and an asymptotic (CONR) to critical reflection (CRIR), thereby deepening the SSRL
|
||||
|
||||
|
||||
Table 4
|
||||
Scores of the SSRL performance for the 3 groups.
|
||||
Group team1 team2 team3 team4 team5 team6 team7 team8 team9
|
||||
|
||||
G1 86.0 90.0 88.5 76.5 68.5 87.0 92.0 NA NA
|
||||
G2 83.5 93.5 87.0 90.5 81.0 71.5 76.5 NA NA
|
||||
G3 94.0 84.5 75.0 71.5 88.0 76.0 89.5 84.5 90.5
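
The high/low-performance split described in Section 3.5 can be sketched from the Table 4 scores. The example below assumes that all 23 teams are ranked in one pool and that 15 % of 23 is rounded down to 3 teams at each end; the source does not state the rounding rule, so both points are assumptions. Under them, the selection agrees with the teams named in the text.

```python
import math

# Team scores copied from Table 4 (team1..teamN in order within each group).
scores = {
    "G1": [86.0, 90.0, 88.5, 76.5, 68.5, 87.0, 92.0],
    "G2": [83.5, 93.5, 87.0, 90.5, 81.0, 71.5, 76.5],
    "G3": [94.0, 84.5, 75.0, 71.5, 88.0, 76.0, 89.5, 84.5, 90.5],
}

# Flatten to (team label, score) pairs, e.g. ("G1-team7", 92.0).
teams = [
    (f"{group}-team{index}", score)
    for group, values in scores.items()
    for index, score in enumerate(values, start=1)
]

# Assumed rule: take floor(15 % of all teams) from each end of the ranking.
cutoff = math.floor(0.15 * len(teams))
ranked = sorted(teams, key=lambda team: team[1], reverse=True)

high_performance = {name for name, _ in ranked[:cutoff]}
low_performance = {name for name, _ in ranked[-cutoff:]}

print("High:", sorted(high_performance))
print("Low:", sorted(low_performance))
```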
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Fig. 3. Comparison of reflection content.
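
For readers who want to retrace the Kruskal-Wallis H test summarized in Table 5, the following is a minimal pure-Python sketch of the statistic with the standard tie correction. As an assumption about the input, it is applied to the seven mean code scores listed per group in Table 5; with these inputs the statistic, p-value, and mean ranks agree with the reported values (6.557, 0.038, and 9.14 / 8.00 / 15.86). The closed-form p-value used here holds only for three groups (two degrees of freedom), and the function name is hypothetical.

```python
import math


def kruskal_wallis(groups):
    """Return (H with tie correction, mean rank per group)."""
    pooled = sorted((value, g) for g, values in enumerate(groups) for value in values)
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of the ranks i+1 .. j.
        average_rank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = average_rank
        ties = j - i
        tie_term += ties**3 - ties
        i = j
    rank_sums = [0.0] * len(groups)
    for (_, g), rank in zip(pooled, ranks):
        rank_sums[g] += rank
    h = 12 / (n * (n + 1)) * sum(
        rs**2 / len(values) for rs, values in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    h /= 1 - tie_term / (n**3 - n)  # tie correction
    mean_ranks = [rs / len(values) for rs, values in zip(rank_sums, groups)]
    return h, mean_ranks


# Mean code scores from Table 5 (NOR, LOWR, HIGHR, DESR, EXPR, CONR, CRIR).
g1 = [0.018, 0.267, 0.018, 0.229, 0.100, 0.038, 0.006]
g2 = [0.005, 0.163, 0.044, 0.197, 0.103, 0.028, 0.037]
g3 = [0.088, 0.232, 0.218, 0.262, 0.264, 0.221, 0.200]

h, mean_ranks = kruskal_wallis([g1, g2, g3])
# With 3 groups the reference distribution is chi-square with 2 degrees of
# freedom, whose upper-tail probability is exactly exp(-H / 2).
p_value = math.exp(-h / 2)

print(f"H = {h:.3f}, p = {p_value:.3f}")
print("Mean ranks:", [round(m, 2) for m in mean_ranks])
```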
|
||||
|
||||
|
||||
then proposed a concrete plan for deeper learning of AI technology
|
||||
Table 6
|
||||
(CONR). The AI agent’s continuous questioning and feedback prompted
|
||||
Transcribed example of the G3 reflection content.
|
||||
students to progress from simple descriptive reflection to more critical
|
||||
Student Transcript and in-depth reflection throughout the reflection process.
|
||||
1 AI agent: What problems did you encounter, and how did you solve them? Table 7 presents reflection examples from some G1 and G2 students,
|
||||
Student 9: We initially struggled, but by discussing with our team and highlighting the impact of different reflection forms and guidance
|
||||
leader, we solved the problems and saw how teamwork improved our
|
||||
process over time. (CONR)
|
||||
AI agent: What are your plans moving forward?
|
||||
Table 7
|
||||
Student 9: I hope for more team projects to use everyone’s strengths and
|
||||
improve how I contribute next time. (CRIR)
|
||||
Transcript example of the G1 and G2 reflection content.
|
||||
2 AI agent: What were the shortcomings of this micro-lecture task?
Student 18: Our PPT didn’t cover everything, so next time I’ll make it more complete. (CRIR)
AI agent: I see, that’s a very insightful analysis. What plans do you have for your future studies?
Student 18: I plan to learn more about AI video production to improve my skills for future jobs. (CONR)
3 AI agent: Did you encounter any difficulties and challenges while completing the task? If so, what were they, and how did you solve them?
Student 31: Our biggest challenge was managing time, but adjusting our schedule helped us complete the tasks. (CONR)
AI agent: Great. How did you overcome these difficulties and challenges?
Student 31: We held weekly meetings to discuss. (DESR)
4 AI agent: Please summarize the successful aspects of this task.
Student 36: The tasks were well-organized, and because our team cooperated closely, we were able to complete the work more efficiently than at the beginning. (CONR)

Group Transcript
G1 G1-Student 4: Our group chose a radio show format for this Himalaya assignment. (DESR) I’ve always been a fan of radio shows, so I was very happy to have the opportunity to create one this time. (DESR) Of course, I also faced some challenges during the production process (DESR), such as the tone not fitting the storyline and the quality of the program needing to be better. (EXPR)
G1 G1-Student 30: Regarding this task, firstly, we didn’t do well in the presentation aspect. The presentation was only in the form of a document, which needed to ensure a smooth connection between the presentation and the work, making it difficult to access the content. (CONR) Secondly, the content presentation was poorly executed and lacked a logical structure. (EXPR) Finally, the speech was not coherent during the presentation, and the preparation was insufficient. (EXPR)
G2 G2-Student 6:
Task: We approached the task mainly in two aspects. (DESR) The first part determined the theme and type of work, and the second part recorded the work. (DESR)
Division of labor: Our division of labor and cooperation were very reasonable, and each member completed their assigned tasks. (EXPR)
Self-evaluation: Very successful. (DESR)
Outlook: We plan to work more collaboratively on each task and strive to do our best. (CRIR)
G2 G2-Student 27:
Task: This task enhanced our understanding of content production and strengthened the collaboration among team members. (EXPR)
Division of labor: Our team had a clear division of responsibilities, and everyone had their tasks (EXPR). I was responsible for the recording, which was quite challenging. (EXPR)
Self-evaluation: Although our team may not have been the best among all the teams, we had unique messages to convey. (CONR) If there is a next time, we will strive to improve it. (CRIR)
Outlook: We should promote our work more effectively. (CRIR)

reflection process. Under the guidance of the AI agent, student 9 and student 31 shifted from describing the current state of teamwork and time management, such as “We solved problems through communication with team members” (CONR), to deeper reflections on self-improvement and future learning plans, exemplified by “I hope for more team projects to utilize everyone’s potential” (CRIR). Prompted by the AI agent’s questioning, student 10 and student 36 reflected on the shortcomings of the SSRL tasks, noting that “The resources were not comprehensive, and most content lacked innovation” (CONR), and further analyzed the root causes of these issues, along with potential improvement measures (CRIR). Inspired by the AI agent, student 18 first identified the issue of inadequate presentation in the task (CRIR) and
|
||||
|
||||
|
||||
|
||||
Y. Zheng et al. Computer Standards & Interfaces 97 (2026) 104094
|
||||
|
||||
|
||||
methods on students’ reflection quality. Two G1 students (student 4 and student 30) conducted their reflections in the form of reports. Due to the lack of specific guidance from the instructor, who only provided general requirements, their reflections remained superficial, primarily involving DESR and EXPR. For example, student 4 wrote, “I have always enjoyed radio shows, so I was very pleased to have the opportunity to create one this time.” Student 30 mentioned, “The tone did not match the storyline, and the sound quality of the program was poor.” These reflections remain limited to mere descriptions of the phenomena, lacking in-depth analysis of the underlying causes and offering no insights for future improvement. This tendency may be related to the relatively broad scope of the reports. These examples demonstrate that structured guidance exerts a positive effect on the quality of reflection. In addition, they highlight the importance of timely feedback and question prompting. Providing students with immediate feedback based on their responses and guiding them toward more elaborated answers contributes to fostering deeper levels of reflection.

In contrast, two students from Group G2 (student 6 and student 27), guided by the 4 aspects provided by the instructor and reflecting through short-answer questions, demonstrated a higher reflection quality. The instructor guided students to reflect on four dimensions, including task, division of labor, self-evaluation, and outlook. This approach, particularly in the latter two areas, effectively fostered CRIR and CONR. For example, student 6 mentioned, “We plan to collaborate more effectively in completing each future learning task, striving to achieve the best outcome” (CRIR). At the same time, student 27 stated, “Although our team may not be the best among all teams, we conveyed our unique message. If there is a next time, we will work harder to improve” (CONR and CRIR). This structured guidance enhanced the depth of reflection. However, since short-answer questions are a one-way form of reflection for students, the instructor may not intervene in their responses. As a result, there may be instances where students provide irrelevant answers or overly brief responses, which can affect the overall reflection quality. For instance, student 6 responded with “Very successful” in the self-evaluation section (DESR), which lacked depth in reflection. The AI agent could address this shortcoming by facilitating continuous interaction and feedback, encouraging students to engage in deeper reflection.

When comparing the effectiveness of the reflection methods in G1, G2, and G3, G1’s reflection reports were of lower quality, primarily focusing on DESR and EXPR. Due to the absence of specific guidance, the reflections lacked depth. The short-answer question format in G2 improved reflection quality to some extent. Students’ reflections became more focused with the instructor’s guidance, particularly improving CRIR and CONR. However, this approach is still constrained by the limitations of outcome-based assessment. The AI agent guidance in G3 further enhanced reflection quality. Through real-time feedback and targeted questioning, students could engage in deeper levels of CRIR and CONR.

To quantify these differences, the Mann-Whitney U test was employed to evaluate the distribution of the projection points of the 3 groups of students within the ENA space. The results indicated that at the α = 0.05 significance level, G1 and G2 showed significant differences in both the first dimension (U = 147,537, p = 0.01, r = 0.09) and the second dimension (U = 147,204, p = 0.01, r = 0.08). This suggests that the structured guidance provided by short-answer questions enhances reflection quality. G1 and G3 also showed a significant difference in the first dimension (U = 99,595.5, p = 0.00, r = 0.34), highlighting the impact of integrating the AI agent in G3 to enhance reflection quality. However, no difference was observed in the second dimension (U = 147,049.5, p = 0.42, r = 0.03). Additionally, G2 and G3 exhibited differences in both the first dimension (U = 127,246.5, p = 0.00, r = 0.36) and the second dimension (U = 215,386.5, p = 0.01, r = −0.08), further demonstrating the effectiveness of the AI agent in fostering deeper reflection. This effect surpasses that of the structured short-answer questions approach alone. Notably, due to the large sample size in this study, the U values are relatively high; however, they remain within the acceptable range for statistical analysis. Some of these differences showed relatively small effect sizes, which will be further addressed in the discussion section.

4.2. What differences do high and low-performance teams show in reflection quality when using the three reflection methods?

Fig. 4 illustrates the distribution of students from the 3 reflection methods (G1, G2, G3) along the two principal component axes (SVD1 and SVD2). The points of different colors and shapes in the figure represent high and low-performance teams within each group, indicating their performance across various reflection categories, such as DESR, EXPR, CONR, and CRIR. The SVD1 axis accounts for 77.3 % of the total variance, while the SVD2 axis explains 16.8 %. The position of each point represents the students’ tendencies in reflection content, with points closer to a specific reflection category indicating that the group’s performance is more concentrated in that category.

In Fig. 4, the centroids of the low-performance teams in G1 and G2 are positioned relatively close to each other, with the low-performance teams located higher near DESR. Conversely, the high-performance teams are situated lower, closer to CRIR. This indicates a certain degree of similarity in the reflection content between the low-performance teams in G1 and G2. G3 is distributed on the right side of the figure, with a greater distance between the high and low-performance teams, indicating a more pronounced difference in reflection content than the other teams. Unlike G1 and G2, the G3 high-performance teams are positioned at the top, closer to CONR, while the low-performance teams are located at the bottom, near CRIR and EXPR. This suggests that the high-performance teams in G3 tend to engage more in connective reflection, whereas the low-performance teams focus more on critical and explanatory reflection.

Fig. 4. The centroid distribution of high and low group students across the three reflection methods.

The study employed the Mann-Whitney U test to quantify further the differences in reflection content between the high and low-performance teams across the 3 cohorts (Table 8). According to the results of the Mann-Whitney U test, there are differences in the reflection content performance between the high and
|
||||
|
||||
|
||||
|
||||
|
||||
Table 8
|
||||
The reflection content distribution of high and low-performance teams across the three methods.
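The comparisons that Table 8 summarizes rely on the Mann-Whitney U test together with an effect size r. As a rough illustration only, the sketch below computes U for the first sample, a normal-approximation z with a two-sided p, and r = z / sqrt(N) for two small samples. The function names, the sample scores, and the choice of r = z / sqrt(N) without continuity correction (a common convention) are assumptions made for this example; the paper does not state in this section which software or formula produced its values.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Return (U for x, z, two-sided p, r) using the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction for the variance of U.
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    sd = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sd == 0:
        raise ValueError("all observations are identical; U is undefined")
    z = (u1 - n1 * n2 / 2) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the normal tail
    r = z / math.sqrt(n)                  # assumed effect-size convention
    return u1, z, p, r


# Hypothetical projection scores for two teams (not data from the study).
high = [0.42, 0.35, 0.51, 0.29, 0.47]
low = [0.12, 0.25, 0.08, 0.31, 0.19]
u, z, p, r = mann_whitney(high, low)
print(f"U = {u:.1f}, z = {z:.2f}, p = {p:.3f}, r = {r:.2f}")
```

Statistical packages differ in which U they report (for the first sample, or the smaller of the two) and in whether they apply a continuity correction, so values from this sketch need not match a given package exactly.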
|
||||
|
||||
|
||||
|
||||
|
||||
low-performance teams across different reflection approaches. In G1, the high and low-performance teams did not exhibit significant differences in either dimension (MR1: U = 4932.00, p = 0.41, r = 0.05; MR2: U = 5463.00, p = 0.44, r = 0.05). In G2, the high and low-performance teams showed a significant difference in the MR1 dimension (U = 3303.00, p = 0.03, r = 0.19) but no difference in the MR2 dimension (U = 3051.00, p = 0.26, r = 0.10). For G3 (students using AI agent-driven continuous questioning), the high and low-performance teams showed a significant difference in the MR1 dimension (U = 1136.50, p < 0.001, r = 0.45). In contrast, the difference in the MR2 dimension was insignificant (U = 2187.50, p = 0.54, r = 0.06).

In G3, the differences between the high and low-performance teams were the most pronounced, particularly on the MR1 dimension. Further analysis of the ENA diagram revealed that low-performance teams exhibited stronger connections in EXPR-CRIR (0.46) and EXPR-CONR (0.61). This suggests that the AI agent-driven reflection method may help low-performance teams focus more on specific reflection content.

5. Discussion

This section analyzes the findings based on the research questions. It covers the positive impact of AI agents on students’ SSRL reflection, differences in reflection quality between high and low-performance teams, and key considerations for using AI agents effectively in SSRL.

5.1. The positive role of AI agents in students’ SSRL reflection

In SSRL, the AI agent reflection assistant enhanced the quality of students’ reflections. This outcome aligns with previous research [70,71]. For instance, Maedche et al. [70] demonstrated the positive role of AI agents in fostering deeper reflection among students. Sigman et al. [71] also found that AI assistants emulate and augment human cognition, thereby promoting reflection. These studies provide more evidence of the positive impact AI agents have on facilitating reflective practices in education.

This study further clarifies how AI agents enhance the quality of student reflection in the SSRL process through ENA. In these activities, student reflections guided by AI agents exhibited higher levels of critical thinking and coherence. In contrast, the other two traditional reflective texts displayed lower levels of reflection, focusing primarily on descriptive and exploratory reflection. As Rusandi et al. [72] highlighted, AI may assist learners in constructing their learning processes, thereby enhancing critical thinking. In higher education, Xia and Li [73] also suggested that AI assistants have a positive impact on students’ imagination, creativity, critical thinking, and autonomous learning. Zang et al. [69] experimentally confirmed the role of AI agents in enhancing students’ critical thinking in English learning. However, the systematic review by Mohamud et al. [74] indicated that the introduction of AI in higher education may diminish students’ critical thinking. This conclusion contradicts the findings of this study. The differences may be due to a lack of proper instructional design by teachers when using AI [74]. Cronje [75] argued that AI may serve as a teaching assistant to facilitate learning, but it should be integrated with instructional design and necessary prompts. In this study, the SSRL reflection checklist was operationalized as structured prompts to calibrate the AI agent, enabling it to scaffold students’ reflections across the four phases of SSRL. By embedding SSRL principles into its dialogic design, the agent acted as both a facilitator of reflection and a medium for delivering theoretical scaffolds. This underscores the importance for educators and researchers to apply instructional theory and design thoughtfully when integrating AI into the classroom.

In addition to SSRL theoretical guidance, the AI agent leveraged its technological capabilities, including continuous questioning and real-time feedback, to actively scaffold deeper student reflections. Wolfbauer et al. [76] noted that continuous dialogue with intelligent assistants enhances students’ levels of reflection. In the G3 group, the AI agents not only guided students to explore the root causes of issues but also helped them develop specific improvement plans. This guiding process is similar to the “Socratic method” in educational psychology. Through a series of targeted questions, students are encouraged to engage in deep thinking and gain a more profound understanding of the knowledge [77]. In addition, the timely feedback function of AI agents plays a crucial role in enhancing the quality of students’ SSRL reflections. Self-determination theory suggests that providing positive emotional support through feedback helps students gain a sense of belonging, thereby enhancing their motivation to learn and willingness to reflect [78]. Uygur et al. [79] suggested that timely feedback enhanced students’ reflection and learning. However, traditional SSRL reflection reports and short-answer questions are one-way reflective activities, lacking immediate feedback and guidance. The AI agent reflection assistant compensates for the shortcomings of teachers in providing timely feedback, enhancing the effectiveness of collaborative
|
||||
|
||||
|
||||
|
||||
|
||||
learning.

This study indicates that the level of reflection guidance directly affects learners’ reflection quality, which is consistent with previous research [80–82]. G1, with minimal guidance, showed the lowest quality, while G2, guided by the SSRL reflection checklist, exhibited higher-quality reflections, demonstrating the importance of SSRL scaffolds. G3 combined SSRL scaffolding with real-time feedback and encouragement for deeper reflection. Comparisons suggest that while structured short-answer questions had a limited impact, the AI agent provided a practically meaningful enhancement of students’ reflective practices. However, these findings are based primarily on qualitative data, and further quantitative research is needed to validate them.

In summary, AI agents play a substantial role in promoting student reflection. The comparison between structured short-answer questions and traditional reflective reports showed statistically significant but very small effects, which suggests that short-answer questions alone had a limited impact on enhancing students’ reflection quality. In contrast, the AI agent had a substantially greater impact on students’ reflective practices. It is essential for educators and instructional designers to integrate AI agents into classrooms and develop more instructional design case studies. Moreover, teachers should prioritize the importance of instructional theories and provide essential design guidance when applying AI agents.

5.2. Differences between high and low-performance teams under various SSRL reflection methods

The results indicate a significant difference between the high and low-performance teams that utilized reflective short-answer questions and the AI agent reflection assistant. In short-answer questions, high-performance teams performed better. This aligns with the conclusions of Knight et al. [83], who found that high-performance students outperformed low-performance students in reflective questions. The disparity in reflection between high and low-performance learners is primarily attributed to their metacognitive levels and learning strategies [84–86]. For instance, Safari and Fitriati [85] found that high-performance learners were able to use all strategies equally, but low-performance learners more frequently relied on metacognitive and social strategies. These differences may impact learners’ outcomes, including their learning effectiveness and reflection [84].

In contrast, the reflection quality of low-performance teams using the AI agent reflective assistant was better than that of the high-performance teams. This is a novel finding of the study, suggesting that the AI reflective assistant played a positive role in guiding low-performance learners through the reflection process. This finding aligns with previous evidence showing that AI technologies tend to provide greater benefits for lower performers [87–90]. Prior studies have suggested that such differential effects often occur because an AI chatbot can use adaptive strategies and personalized feedback to address the strategic gaps of low performers [88]. AI tutoring can also offer both cognitive and emotional support [89]. Xu et al. [90] further found that low-performing learners become more engaged when they receive immediate feedback and external help. This engagement encourages them to apply higher-order thinking strategies more actively.

These mechanisms may also explain the current results in our SSRL reflection task. The AI reflection assistant provided structured guidance in real time and reduced the cognitive load of producing reflections. This allowed low-performing learners to focus more on critical and creative thinking. In contrast, high-performing learners may already have established reflection routines. Extra guidance could interfere with these processes, leading to smaller gains in reflection quality [87].

This study, therefore, not only confirms that differential effects exist in reflection tasks but also highlights the potential of AI support to promote higher-order thinking in low-performing learners. In educational practice, this suggests that AI reflection assistants could be strategically deployed to close performance gaps. Future research could examine how to fine-tune AI guidance so that it benefits high performers without disrupting their existing strategies.

Additionally, there was no significant difference in performance between high and low-performance student teams in reflective reports, with both showing low-quality reflections. This may be due to learners lacking clear guidance in the reflection process. Maedche et al. [70] found that in reflective environments lacking external feedback or structured guidance, the quality of students’ reflections is constrained. This suggests that instructors should provide the necessary scaffolding when designing reflective tasks. The SSRL scaffolding demonstrated significant value in this study and is well-suited for broader application in collaborative settings.

5.3. Considerations for the effective use of AI agents in SSRL

Although experiments have demonstrated that AI agents enhance SSRL reflection quality, there are several limitations in their usage. To better promote the outcomes of this study, we offer considerations for teachers and instructional designers regarding the use of AI agents.

Firstly, the quality and reliability of feedback provided by AI agents still present limitations. This finding aligns with the studies of Maloney et al. [91] and Fedus et al. [92], which suggest that the accuracy and effectiveness of AI agents depend on algorithm design and data quality. In this study, the AI agent exhibited two primary issues: repeated questioning and unexpected interruptions during conversations. To address the issue of repeated questioning, adjustments to the prompt design can be implemented. For example, the prompts specify that each question should be asked only once and repeated only if the student responds off-topic or does not answer. For unexpected interruptions, teachers need to guide students in testing their network environment and re-engaging with the task. These observations show that AI agents need improvement in handling complex contexts and dynamic learning needs.

In addition, data privacy and ethical concerns pose another challenge in the application of AI agents. AI agents require extensive data collection, including students’ reflection content, behavioral patterns, and learning habits [93]. To mitigate this issue, this study incorporated an opening message in the AI agent’s script. The message advised students: “Please do not disclose personal sensitive information, such as your name or school, during the interaction.” Furthermore, before implementing the AI agent, teachers need to raise students’ awareness of data security and privacy protection [94].

The risks associated with over-reliance on AI technology should also be carefully evaluated. Although AI agents can provide personalized support, they cannot fully replace the role of human teachers, particularly in offering emotional support and fostering social interaction [95]. In this study, AI agents were utilized exclusively in the post-class reflection phase. The remaining instructional time relied on face-to-face interactions between teachers and students. As GAI technology becomes increasingly accessible, preventing students from developing dependency behaviors may become more challenging. Future research could explore strategies to prevent learners from becoming overly reliant on GAI technologies.

While AI agents have demonstrated advantages in enhancing students’ SSRL reflection quality, their widespread applicability is constrained by feedback quality, data privacy, and ethical considerations. Future research should emphasize these limitations, refining the application framework of AI to ensure its effectiveness and sustainability in the educational domain.

6. Conclusion, limitations, and future research

This study explores methods to enhance student reflection quality by designing an AI agent that supports reflection through continuous questioning and real-time feedback. Using content analysis and ENA, this study conducted a three-semester experiment comparing reflection
|
||||
|
||||
|
||||
|
||||
|
||||
reports, short-answer questions, and an AI agent reflection assistant. The results indicate that AI agents improve reflection quality, particularly for low-performance teams. The study offers practical guidance for integrating AI into SSRL-based instruction.

Although this study contributes to understanding students’ reflection behaviors in SSRL, several limitations remain. The first limitation arises from the study participants. Conducted within a higher education setting, this research primarily examines the effectiveness of using AI agents to facilitate reflection among university students. Only 97 students from the “Internet Thinking and Digital Self-Learning” course participated, so the findings may not be generalizable to other courses or age groups. Further research is needed to explore the potential impact and adaptability of AI agents in secondary and primary education settings [96]. Secondly, the AI agent still has limitations in the quality and reliability of feedback, which may affect the depth and quality of students’ reflections. Addressing this issue relies on rapidly updating and optimizing large AI model algorithms to provide higher-quality and more targeted feedback. The third limitation is that the three reflection methods used in this experiment all fall under outcome-based reflection, overlooking the dynamic process of students’ reflections at different stages of collaborative learning. Additionally, the proposed mechanisms underlying the AI agent’s impact on reflection quality, particularly for low-performance teams, remain hypothetical and require further empirical validation through quantitative studies. Lastly, this study did not differentiate the specific contributions of individual design elements in the AI agent’s interaction strategy (e.g., sequential questioning, encouraging feedback, simplified language). More research could adopt ablation analysis to examine how these elements independently influence students’ reflective practices.

Based on the limitations identified in this study, future research could expand the study to more diverse educational contexts, including secondary and primary education, to examine the generalizability and adaptability of AI agents. Incorporating multi-modal data, such as students’ facial expressions, gestures, and dialogue, may offer a more comprehensive understanding of reflective behaviors in SSRL. Improvements in AI models are needed to enhance the quality and reliability of feedback, supporting deeper and higher-quality student reflections. In addition, investigating the individual contributions of specific design elements in AI agents’ interaction strategies, for example, through ablation-style comparisons, could clarify which features most effectively promote high-order reflection, particularly among low-performance teams. We therefore urge more researchers to focus on this area of study, exploring the impact of GAI on educational outcomes to better understand and harness its potential for improving educational practices.

Declaration of generative AI in the writing process

During the preparation of this work, the authors used Kimi (https://kimi.moonshot.cn/) to improve language and readability. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

CRediT authorship contribution statement

Yumin Zheng: Writing – original draft, Conceptualization. Fengjiao Tu: Investigation, Data curation. Fengfang Shu: Investigation, Data curation. Chaowang Shang: Formal analysis, Data curation. Lulu Chen: Writing – review & editing, Formal analysis. Jiang Meng: Investigation.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Chaowang Shang acknowledges the financial support from the National Natural Science Foundation of China (Grant Number: 62577035). The other authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. The Critical Thinking Questionnaire (CThQ)

Instructions: For each statement below, please indicate how much you agree using a 5-point Likert scale (1 = Strongly disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly agree).

1. After reading a text, I check important information, even if it seems to be true.
2. I like combining information from different texts.
3. I am willing to share newly acquired information.
4. In-depth analyses of reality is a waste of time.
5. After reading a text, I can recall important points.
6. The same content can be expressed in many different ways.
7. I can understand texts from various fields.
8. I form my impressions based on various pieces of information that I combine.
9. Everything already exists, so nothing completely new can be created.
10. When I talk, I give many examples.
11. In discussions, I care about justifying my stance while understanding the other party.
12. I like finding connections between seemingly different phenomena.
13. I can see the structure of a text, and I could reorganize it.
14. When discussing, I try to use practical examples to justify my stance.
15. If necessary, I can recall information I have read before.
16. I do not remember much of what I learned at school.
17. When I am interested in some information, I try to verify whether it is true.
18. I can extract the most relevant parts of a text.
19. To evaluate information, I check multiple sources.
20. I like discussing new interpretations of texts I already know.
21. I like to collate different opinions and compare them.
22. I have difficulties with paraphrasing.
23. I try to apply the information I have learned in everyday life.
24. When I read, I look for relationships between its information and other texts I have read.
25. I pay attention to the contexts, nuances, and overtones of statements.

Data availability

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

References

[1] S. Ahmad, M. Rahmat, M. Mubarik, M. Alam, S. Hyder, Artificial intelligence and its role in education, Sustainability 13 (22) (2021) 12902.
[2] X. Gong, Z. Li, A. Qiao, Impact of generative AI dialogic feedback on different stages of programming problem solving, Educ. Inf. Technol. 30 (7) (2025) 9689–9709.
[3] O. Tapalova, N. Zhiyenbayeva, D. Gura, Artificial Intelligence in Education: AIEd for personalised learning pathways, Electron. J. e-Learn. 20 (5) (2022) 639–653.
[4] S. Järvelä, P. Kirschner, E. Panadero, J. Malmberg, C. Phielix, J. Jaspers, M. Koivuniemi, H. Järvenoja, Enhancing socially shared regulation in collaborative learning groups: designing for CSCL regulation tools, in: Educ. Technol. Res. Dev., 63, 2014, pp. 125–142.
[5] D. Bransen, M.J.B. Govaerts, E. Panadero, et al., Putting self-regulated learning in context: integrating self-, co-, and socially shared regulation of learning, Med. Educ. 56 (1) (2022) 29–36.
|
||||
|
||||
|
||||
|
||||
[6] E. Eshuis, J. Vrugte, A. Anjewierden, L. Bollen, J. Sikken, T. Jong, Improving the [36] E. Panadero, S. Järvelä, Socially shared Regulation of Learning: a review, Eur.
|
||||
quality of vocational students’ collaboration and knowledge acquisition through Psychol. 20 (2015) 190–203.
|
||||
instruction and joint reflection, Int. J. Comput.-Support. Collab. Learn. 14 (2019) [37] J. Isohätälä, H. Järvenoja, S. Järvelä, Socially shared regulation of learning and
|
||||
53–76. participation in social interaction in collaborative learning, Int. J. Educ. Res. 81
|
||||
[7] C. Chan, K. Lee, Reflection literacy: a multilevel perspective on the challenges of (2017) 11–24.
|
||||
using reflections in higher education through a comprehensive literature review, [38] J. Li, Y. Lin, M. Sun, R. Shadiev, Socially shared regulation of learning in game-
|
||||
Educ. Res. Rev. 32 (2020) 100376. based collaborative learning environments promotes algorithmic thinking,
|
||||
[8] L. Guo, How should reflection be supported in higher education? — A meta- learning participation, and positive learning attitudes, Interact. Learn. Environ. 31
|
||||
analysis of reflection interventions, Reflective Pract. 23 (2021) 118–146. (2020) 1715–1726.
|
||||
[9] S. Popenici, S. Kerr, Exploring the impact of artificial intelligence on teaching and [39] J. Malmberg, S. Järvelä, H. Järvenoja, E. Panadero, Promoting socially shared
|
||||
learning in higher education, Res. Pract. Technol. Enhanc. Learn. 12 (1) (2017) 22. regulation of learning in CSCL: progress of socially shared regulation among high-
|
||||
[10] H. Kiy, A Study on Writing Experience With ChatGPT of College Students, J. Korea and low-performing groups, Comput. Hum. Behav. 52 (2015) 562–572.
|
||||
Converg. Soc. 14 (9) (2024) 976. [40] J. Yukawa, Co-reflection in online learning: collaborative critical thinking as
|
||||
[11] K. Hanifi, O. Cetin, C. Yilmaz, On ChatGPT: perspectives from software engineering narrative, Int. J. Comput.-Support. Collab. Learn. 1 (2006) 203–228.
|
||||
students, in: Proc. 2023 IEEE 23rd Int. Conf. Softw. Qual. Reliab. Secur. (QRS), [41] A. Głowala, M. Kołodziejski, T. Butvilas, Reflection as a basic category of a
|
||||
2023, pp. 196–205. teacher’s thinking and action, Multidiscip. J. Sch. Educ. 12.1(2023):229–250.
|
||||
[12] Zhiheng Xi, et al., The rise and potential of large language model based agents: A [42] J. Buck, Reflecting on reflections: a case study of disappointment in student writing
|
||||
survey, Sci. China Inf. Sci. 68 (2) (2025) 121101. assignments, J. Acoust. Soc. Am. (2023). A273-A273.
|
||||
[13] E. Katsarou, F. Wild, A. Sougari, P. Chatzipanagiotou, A systematic review of voice- [43] N Rahmi, C M Zubainur, Students’ mathematical reflective thinking ability through
|
||||
based intelligent virtual agents in EFL education, Int. J. Emerg. Technol. Learn. scaffolding strategies[C]//Journal of Physics: Conference Series, IOP Publishing
|
||||
(iJET) 18 (10) (2023) 65–85. 1460 (1) (2020) 012022.
|
||||
[14] P.R. Lewis, Ş. Sarkadi, Reflective artificial intelligence, Minds Mach. 34 (2) (2024) [44] J. Dewey, Education Democracy, The elementary school teacher 4 (4) (1903)
|
||||
14. 193–204.
|
||||
[15] Z. Xu, P. Zhang, M. Tu, M. Zhang, Y. Lai, Brain optimization with additional study [45] B.J. Zimmerman, Self-regulated learning and academic achievement: an overview,
|
||||
time: potential brain differences between high- and low-performance college Educ. Psychol. 25 (1) (1990) 3–17.
|
||||
students, Front. Psychol. 14 (2023) 1209881. [46] D. Coulson, M. Harvey, Scaffolding student reflection for experience-based
|
||||
[16] UK Government, Generative Artificial Intelligence (AI) in Education, GOV.UK, learning: a framework, Teach. High. Educ. 18 (2013) 401–413.
|
||||
2023. https://www.gov.uk/government/publications/generative-artificial-i [47] S. Lajoie, Extending the scaffolding metaphor, Instr. Sci. 33 (2005) 541–557.
|
||||
ntelligence-in-education/generative-artificial-intelligence-ai-in-education. [48] E. Panadero, P.A. Kirschner, S. Järvelä, J. Malmberg, H. Järvenoja, How individual
|
||||
[17] M. Dogan, T. Dogan, A. Bozkurt, The use of artificial intelligence (AI) in online self-regulation affects group regulation and performance: a shared regulation
|
||||
learning and distance education processes: a systematic review of Empirical intervention, Small Group Res. 46 (4) (2015) 431–454.
|
||||
Studies, Appl. Sci. 13 (5) (2023) 3056. [49] E. Davis, Prompting middle school science students for productive reflection:
|
||||
[18] L. Shi, The integration of advanced AI-enabled emotion detection and adaptive generic and directed prompts, J. Learn. Sci. 12 (2003) 142–191.
|
||||
learning systems for improved emotional regulation, J. Educ. Comput. Res. 63 [50] J. Hattie, H. Timperley, The Power of Feedback, Rev. Educ. Res. 77 (2007)
|
||||
(2024) 173–201. 112–181.
|
||||
[19] B. Tang, J. Liang, W. Hu, H. Luo, Enhancing programming performance, learning [51] R. Ajjawi, F. Kent, J. Broadbent, J. Tai, M. Bearman, D. Boud, Feedback that works:
|
||||
interest, and self-efficacy: the role of large language models in middle school a realist review of feedback interventions for written tasks, Stud. High. Educ. 47
|
||||
education, Systems 30 (6) (2025) 8109–8138. (2021) 1343–1356.
|
||||
[20] L. Feng, Investigating the effects of artificial intelligence-assisted language learning [52] U. Krause, R. Stark, Reflection in example- and problem-based learning: effects of
|
||||
strategies on cognitive load and learning outcomes: a comparative study, J. Educ. reflection prompts, feedback, and cooperative learning, Eval. Res. Educ. 23 (2010)
|
||||
Comput. Res. 62 (8) (2025) 1741–1774. 255–272.
|
||||
[21] Q. Huang, W. Li, Y. Zhao, Enhancing deep learning and motivation in university [53] J. Contreras, S. Edwards-Maddox, A. Hall, M. Lee, Effects of reflective practice on
|
||||
English education through AI technology: a quasi-experimental study, Asian J. baccalaureate nursing students’ Stress, Anxiety, and competency: an integrative
|
||||
Educ. Soc. Stud. 51 (4) (2025) 452–463. review, Worldviews Evid.-Based Nurs. 17 (3) (2020) 239–245.
|
||||
[22] Ó. Cuéllar, M. Contero, M. Hincapié, Personalized and Timely Feedback in Online [54] H. Gadsby, Fostering reflective practice in Post Graduate Certificate in Education
|
||||
education: Enhancing learning With Deep Learning and Large Language Models, students through reflective journals. Developing a typology for reflection,
|
||||
MTI. 9 (5) (2025) 45. Reflective Pract. 23 (2022) 357–368.
|
||||
[23] X. Zhou, D. Teng, H. Al-Samarraie, The mediating role of generative AI self- [55] S. Rabu, N. Badlishah, Levels of students’ Reflective thinking skills in a
|
||||
regulation on students’ critical thinking and problem-solving, Educ. Sci. 14 (12) collaborative learning environment using Google Docs, TechTrends 64 (2020)
|
||||
(2024) 1302. 533–541.
|
||||
[24] S. Steenbergen-Hu, H. Cooper, A meta-analysis of the effectiveness of intelligent [56] J. Stoszkowski, A. Hodgkinson, D. Collins, Using Flipgrid to improve reflection: a
|
||||
tutoring systems on college students’ academic learning, J. Educ. Psychol. 106 collaborative online approach to coach development, Phys. Educ. Sport Pedagogy
|
||||
(2014) 331–347. 26 (2020) 167–178.
|
||||
[25] C. Moridis, A. Economides, Affective learning: empathetic agents with emotional [57] E. Liesa, P. Mayoral, M. Giralt-Romeu, S. Angulo, Video-Based Feedback for
|
||||
facial and tone of voice expressions, IEEE Trans. Affect. Comput. 3 (2012) Collaborative Reflection Among Mentors, University Tutors, and Students, Edu.
|
||||
260–272. Sci. 13 (9) (2023) 879.
|
||||
[26] S. Nelekar, A. Abdulrahman, M. Gupta, D. Richards, Effectiveness of embodied [58] M. Alghasab, J. Hardman, Z. Handley, Teacher-student Interaction On wikis:
|
||||
conversational agents for managing academic stress at an Indian university (ARU) Fostering collaborative Learning and Writing, Learn Cult. Soc. Inter. 21 (2019)
|
||||
during COVID-19, Br. J. Educ. Technol. 53 (2021) 491–511. 10–20.
|
||||
[27] W. Sun, Q. Chen, The design, implementation, and evaluation of Gamified [59] R. Gubareva, R. Lopes, Virtual Assistants for learning: a systematic literature
|
||||
Immersive Virtual Reality (IVR) for learning: a review of Empirical Studies, Proc. review, CSEDU (1) (2020) 97–103.
|
||||
Eur. Conf. Games-Based Learn. 17 (1) (2023) 789–797. [60] L. González, H. Neyem, I. Contreras-McKay, D. Molina, Improving learning
|
||||
[28] M. Chen, L. Wu, Z. Liu, X. Ma, The impact of metacognitive strategy-supported experiences in software engineering capstone courses using artificial intelligence
|
||||
intelligent agents on the quality of collaborative learning from the perspective of virtual assistants, Comput. Appl. Eng. Educ. 30 (2022) 1370–1389.
|
||||
the community of inquiry, in: Proc. 2024 4th Int. Conf. Educ. Technol. (ICET), [61] B. Renner, G. Wesiak, V. Pammer-Schindler, M. Prilla, L. Müller, D. Morosini,
|
||||
2024, pp. 11–17. S. Mora, N. Faltin, U. Cress, Computer-supported reflective learning: how apps can
|
||||
[29] H. Hong, C. Viriyavejakul, P. Vate-U-Lan, Enhancing critical thinking skills: foster reflection at work, Behav. Inf. Technol. 39 (2019) 167–187.
|
||||
exploring generative AI-enabled cognitive offload instruction in English essay [62] A. Freiberg-Hoffmann, A. Romero-Medina, B. López-Fernández, M. Fernández-
|
||||
writing, 4, ECOHUMANISM Учредители: Transnational Press, London, p. 2024. Liporace, Learning approaches: cross-cultural differences (Spain–Argentina) and
|
||||
[30] D.H. Schunk, B.J. Zimmerman, Motivation and Self-Regulated learning: Theory, academic achievement in college students, Span. J. Psychol. 26 (2023) e16.
|
||||
research, and Applications, Routledge, 2012. [63] A. Kobylarek, K. Błaszczyński, L. Ślósarz, M. Madej, Critical thinking Questionnaire
|
||||
[31] P.H. Winne, A.F. Hadwin, N.E. Perry. Metacognition and computer-supported (CThQ)–construction and application of a critical thinking test tool, Andragogy
|
||||
collaborative learning, The International Handbook of Collaborative Learning, Adult Educ. Soc. Mark. 2 (2) (2022), 1-1.
|
||||
Routledge, 2013, pp. 462–479. [64] J. Dewey, An analysis of reflective thought, J. Philos. (1922) 29–38.
|
||||
[32] Y. Su, Y. Li, H. Hu, et al., Exploring college English language learners’ self and [65] D.T. Campbell, J.C. Stanley, Experimental and Quasi-Experimental Designs For
|
||||
social regulation of learning during wiki-supported collaborative reading activities, Research, Ravenio Books, 2015.
|
||||
Int. J. Comput.-Support. Collab. Learn. 13 (2018) 35–60. [66] M.M. Plack, M. Driscoll, S. Blissett, R. McKenna, T.P. Plack, A method for assessing
|
||||
[33] F. Tu, L. Wu, Kinshuk, et al., Exploring the influence of regulated learning reflective journal writing, J. Allied Health 34 (4) (2005) 199–208.
|
||||
processes on learners’ prestige in project-based learning, Educ. Inf. Technol. 30 (2) [67] L. Wang, G. Wu, J. Wu, A Study on the Reflective Level of Teachers’
|
||||
(2025) 2299–2329. Autobiography, Global Education Outlook (01), (2018) 93–105.
|
||||
[34] S. Zhang, J. Chen, Y. Wen, H. Chen, Q. Gao, Q. Wang, Capturing regulatory [68] H.T. Hou, Integrating cluster and sequential analysis to explore learners’ flow and
|
||||
patterns in online collaborative learning: a network analytic approach, Int. J. behavioral patterns in a simulation game with a situated-learning context for
|
||||
Comput.-Support. Collab. Learn. 16 (2021) 37–66. science courses: a video-based process exploration, Comput. Human Behav. 48
|
||||
[35] J. Zheng, W. Xing, G. Zhu, Examining sequential patterns of self-and socially (2015) 424–435.
|
||||
shared regulation of STEM learning in a CSCL environment, Comput. Educ. 136
|
||||
(2019) 34–48.
|
||||
|
||||
|
||||
[69] G. Zang, M. Liu, B. Yu, The application of 5G and artificial intelligence technology [83] J. Knight, D. Weaver, M. Peffer, Z. Hazlett, Relationships between prediction
|
||||
in the innovation and reform of college English education, Comput. Intell. accuracy, metacognitive reflection, and performance in introductory genetics
|
||||
Neurosci. 2022 (1) (2022) 9008270. students, CBE Life Sci. Educ. 21 (3) (2022) ar45.
|
||||
[70] A. Maedche, C. Legner, A. Benlian, B. Berger, H. Gimpel, T. Hess, O. Hinz, [84] D. Difrancesca, J. Nietfeld, L. Cao, A comparison of high and low achieving
|
||||
S. Morana, M. Söllner, AI-based digital assistants, Bus. Inf. Syst. Eng. 61 (2019) students on self-regulated learning variables, Learn. Individ. Differ. 45 (2016)
|
||||
535–544. 228–236.
|
||||
[71] M. Sigman, D. Slezak, L. Drucaroff, S. Ribeiro, F. Carrillo, Artificial and Human [85] S A Gani, D Fajrina, R Hanifa, Students’ learning strategies for developing speaking
|
||||
intelligence in mental health, AI Mag. 42 (2021) 39–46. ability[J], Stud. Eng. lang. educ. 2 (1) (2015) 16–28.
|
||||
[72] M.A. Rusandi, I. Saripah, D.M. Khairun, No worries with ChatGPT: building bridges [86] M. Yip, Differences between high and low academic achieving university students
|
||||
between artificial intelligence and education with critical thinking soft skills, in learning and study strategies: a further investigation, Educ. Res. Eval. 15 (2009)
|
||||
J. Public Health. 45 (3) (2023) e602–e603. 561–570.
|
||||
[73] X. Xia, X. Li, Artificial intelligence for higher education development and teaching [87] H.K. Etkin, K.J. Etkin, R.J. Carter, C.E. Rolle, Differential effects of GPT-based tools
|
||||
skills, Wirel, Commun. Mob. Comput. 2022 (1) (2022) 7614337. on comprehension of standardized passages, Front. Educ. 10 (2025) 1506752.
|
||||
[74] Y. Mohamud, A. Ma’rof, A. Mohamed, M. Uzir, A narrative review on the impact of [88] S. Ruan, A. Nie, W. Steenbergen, J. He, J.Q. Zhang, M. Guo, et al., A reinforcement
|
||||
applied artificial intelligence tools on higher secondary students, Int. J. Acad. Res. learning tutor better supported lower performers in a math task, Mach. Learn. 113
|
||||
Bus. Soc. Sci. 13 (14) (2023) 34–42. (2024) 3023–3048.
|
||||
[75] J. Cronje, Exploring the role of ChatGPT as a peer coach for developing research [89] D.R. Thomas, J. Lin, E. Gatz, A. Gurung, S. Gupta, K. Norberg, et al., Improving
|
||||
proposals: feedback quality, prompts, and student reflection, Electron. J. (2024) student learning with hybrid human-AI tutoring: a three-study quasi-experimental
|
||||
22.2, e-Learn. investigation, in: Proc. 14th Learn. Anal. Knowl. Conf., 2024, pp. 404–415. New
|
||||
[76] I. Wolfbauer, V. Pammer-Schindler, K. Maitz, C. Rosé, A script for conversational York, NY, USA: Association for Computing Machinery (LAK ‘24).
|
||||
reflection guidance: a field study on developing reflection competence with [90] Y Xu, J Zhu, M Wang, et al., The impact of a digital game-based AI chatbot on
|
||||
apprentices, IEEE Trans. Learn. Technol. 15 (2022) 554–566. students’ academic performance, higher-order thinking, and behavioral patterns in
|
||||
[77] F. Leigh, Platonic dialogue, maieutic method, and critical thinking, J. Philos. Educ. an information technology curriculum[J], App. Sci. 14 (15) (2024) 6418.
|
||||
41 (2008) 309–323. [91] Maloney, A., Roberts, D. A., & Sully, J. (2022). A solvable model of neural scaling
|
||||
[78] E. Deci, R. Ryan. Intrinsic motivation and self-determination in Human behavior, laws. arXiv preprint arXiv:2210.16859.
|
||||
1975, pp. 1–371. [92] W. Fedus, B. Zoph, N. Shazeer, Switch transformers: scaling to trillion parameter
|
||||
[79] J. Uygur, E. Stuart, M. Paor, E. Wallace, S. Duffy, M. O’Shea, S. Smith, models with simple and efficient sparsity, J. Mach. Learn. Res. 23 (120) (2022)
|
||||
T. Pawlikowska, The Best evidence in Medical Education systematic review to 1–39.
|
||||
determine the most effective teaching methods that develop reflection in medical [93] K. Seo, J. Tang, I. Roll, S. Fels, D. Yoon, The impact of artificial intelligence on
|
||||
students: BEME Guide No. 51, Med. Teach. 41 (2019) 3–16. learner–instructor interaction in online learning, Int. J. Educ. Technol. High. Educ.
|
||||
[80] K. Arendt, L. Stark, A. Friedrich, R. Brünken, R. Stark, Quality of reflections on 18 (1) (2021) 54.
|
||||
teaching: approaches to its measurement and low-threshold promotion, Educ. Sci. [94] B. Klimova, M. Pikhart, J. Kacetl, Ethical issues of the use of AI-driven mobile apps
|
||||
15 (7) (2025) 884. for education, Front. Public Health 10 (2023) 1118116.
|
||||
[81] J. Jung, Y. Lu, A. Ding, How do prompts shape preservice teachers’ reflections? A [95] T. Adiguzel, M. Kaya, F. Cansu, Revolutionizing education with AI: exploring the
|
||||
case study in an online technology integration class, J. Teach. Educ. 73 (3) (2021) transformative potential of ChatGPT, Contemp. Educ. Technol. 15 (3) (2023).
|
||||
301–313. [96] M. Thottoli, B. Alruqaishi, A. Soosaimanickam, Robo academic advisor: can
|
||||
[82] A. Sturgill, P. Motley, Methods of reflection about service learning: guided vs. free, chatbots and artificial intelligence replace human interaction? Contemp. Educ.
|
||||
dialogic vs. expressive, and public vs. private. Teaching and learning inquiry, Technol. 16 (1) (2024) ep485.
|
||||
ISSOTL J. 2 (1) (2014) 81–93.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||