
                                                               Computer Standards & Interfaces 97 (2026) 104125




                                                         Computer Standards & Interfaces
                                                             journal homepage: www.elsevier.com/locate/csi


AdaTraj-DP: An adaptive privacy framework for context-aware trajectory data publishing
Yongxin Zhao a, Chundong Wang a,b,∗, Hao Lin c,∗∗, Xumeng Wang d, Yixuan Song a, Qiuyu Du c
a Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin, China
b TianJin Police Institute, Tianjin, China
c College of Intelligent Science and Technology (College of Cyberspace Security), Inner Mongolia University of Technology, Inner Mongolia, China
d College of Cryptology and Cyber Science, Nankai University, Tianjin, China


ARTICLE INFO

Keywords: Differential privacy; Trustworthy AI; Trajectory data publishing; Personalized perturbation

ABSTRACT

Trajectory data are widely used in AI-based spatiotemporal analysis but raise privacy concerns due to their fine-grained nature and the potential for individual re-identification. Existing differential privacy (DP) approaches often apply uniform perturbation, which compromises spatial continuity, or adopt personalized mechanisms that overlook structural utility. This study introduces AdaTraj-DP, an adaptive differential privacy framework designed to balance trajectory-level protection and analytical utility. The framework combines context-aware sensitivity detection with hierarchical aggregation. Specifically, a dynamic sensitivity model evaluates privacy risks according to spatial density and semantic context, enabling adaptive allocation of privacy budgets. An adaptive perturbation mechanism then injects noise proportionally to the estimated sensitivity and represents trajectories through Hilbert-based encoding for prefix-oriented hierarchical aggregation with layer-wise budget distribution. Experiments conducted on the T-Drive and GeoLife datasets indicate that AdaTraj-DP maintains stable query accuracy, spatial consistency, and downstream analytical utility across varying privacy budgets while satisfying formal differential privacy guarantees.


1. Introduction

The proliferation of mobile devices, GPS sensors, and intelligent transportation infrastructures has resulted in the large-scale collection of spatiotemporal data. Such data serve as the foundation for numerous Location-Based Services (LBS), including navigation, ride-hailing, and urban planning [1,2]. Trajectory datasets record detailed sequences of individual movements, enabling a wide range of AI applications such as traffic forecasting, mobility prediction, and behavioral modeling. These applications have become indispensable for smart city management and autonomous systems, where the integrity and granularity of trajectory data directly affect analytical and decision-making accuracy.

Despite their utility, trajectory datasets raise critical privacy concerns for trustworthy AI. A single trajectory may expose an individual’s home, workplace, or health-related locations, revealing sensitive behavioral patterns and social relationships [3,4]. Even after removing explicit identifiers, re-identification attacks can reconstruct personal traces with minimal auxiliary information [5]. Consequently, ensuring differential privacy for trajectory data has become essential to support reliable and ethically compliant AI development.

Differential Privacy (DP) [6] provides a rigorous mathematical guarantee against information leakage. However, its application to trajectory publishing introduces a persistent trade-off between privacy strength, data utility, and personalization, which conventional mechanisms fail to reconcile. Two primary gaps remain unresolved: (1) the tension between point-level perturbation and structural integrity; (2) the difficulty of adapting privacy budgets to varying contextual sensitivity. Early studies injected uniform Laplace noise into each location point [7,8], which protected individual coordinates but severely distorted the spatiotemporal correlation essential for route-level analysis. Subsequent hierarchical schemes based on prefix trees or space-filling curves [9,10] preserved aggregate statistics but relied on global, fixed privacy parameters, ignoring heterogeneous sensitivity across trajectories. Recent progress in Personalized Differential Privacy (PDP) [11–13] introduced adaptive noise based on semantic or frequency-based sensitivity, yet these methods typically lack integration with hierarchical


    This article is part of a Special issue entitled: ‘Secure AI’ published in Computer Standards & Interfaces.
     ∗ Corresponding author at: Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin, China.
    ∗∗ Corresponding author.
    E-mail addresses: zyx4237@163.com (Y. Zhao), michael3769@163.com (C. Wang), suzukaze_aoba@126.com (H. Lin), wangxumeng@nankai.edu.cn
(X. Wang), fykatb0824@163.com (Q. Du).

https://doi.org/10.1016/j.csi.2025.104125
Received 29 October 2025; Received in revised form 25 December 2025; Accepted 29 December 2025
Available online 30 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.


aggregation, resulting in limited query accuracy and poor scalability for AI model training.

To bridge this gap, we propose AdaTraj-DP, an adaptive differentially private trajectory publishing framework that unifies context-aware sensitivity modeling and hierarchical aggregation. AdaTraj-DP introduces a two-stage protection mechanism. The first stage detects and quantifies sensitivity using contextual and statistical cues, allowing adaptive privacy budget assignment at the point level. The second stage encodes perturbed trajectories into a hierarchical prefix tree, applying layer-wise budget allocation to preserve structural consistency for downstream analysis. This design ensures both localized protection and global analytical utility, addressing the core limitations of prior DP-based trajectory mechanisms.

The main contributions of this work are summarized as follows:

   (1) We propose AdaTraj-DP, an adaptive framework that unifies personalized perturbation and hierarchical aggregation. By establishing a mathematical link between local coordinate noise and global prefix-tree structures, the framework ensures that fine-grained point-level protection remains structurally consistent with trajectory-level differential privacy guarantees, enabling high-fidelity reconstruction for downstream tasks.
   (2) We design a context-aware sensitivity model that combines spatial density with semantic context to guide adaptive budget allocation. This mechanism quantifies privacy risks at a granular level, enabling the dynamic adjustment of perturbation intensity to balance privacy protection and data fidelity.
   (3) We implement a hierarchical aggregation scheme utilizing Hilbert spatial mapping and logarithmic layer-wise budget distribution. Experiments on the T-Drive and GeoLife datasets validate the framework’s effectiveness in preserving query accuracy, spatial consistency, and AI model performance under varying privacy budgets.

2. Related work

Existing privacy-preserving trajectory publishing approaches can be broadly categorized into three classes: (1) foundational differential privacy models that ensure privacy but compromise trajectory continuity; (2) structural aggregation mechanisms that enhance data utility via hierarchical organization; and (3) personalized and adaptive privacy protection strategies that tailor noise to sensitivity but often lack integration with structural models. This section reviews these three directions and discusses recent advances that motivate AdaTraj-DP.

2.1. Foundational models for differentially private trajectory publishing

Differential Privacy (DP) [6] is the standard formalism for privacy-preserving data publication. Early approaches discretize continuous spatio-temporal domains and inject Laplace noise into cell counts or simple aggregates [14,15], but such methods often disrupt trajectory continuity and reduce utility for route-level analysis [7]. To address this, research has explored trajectory generalization and synthetic data generation under DP, including clustering-based generalization [16] and GAN-based synthetic trajectory models [17–19]. Work on DP-aware data exploration and visualization (e.g., DPKnob and Defogger) highlights the challenge of configuring DP mechanisms to balance utility and risk in interactive settings and motivates user- or task-guided privacy configuration [20,21].

2.2. Structural aggregation for utility enhancement

Hierarchical structures, such as prefix trees, Hilbert-encoded sequences, and spatial index trees, have been widely adopted to preserve aggregate query utility under DP. Early prefix-tree methods aggregate shared prefixes to reduce noise impact [22,23], while R-tree and quadtree variants support spatial indexing under privacy constraints [7,10]. Recent work improves spatial locality and query accuracy using Hilbert/Geohash encodings and adaptive tree strategies [9]. Zhao et al.’s PerTrajTree-DP further integrates point-level sensitivity with prefix-tree publishing to better support trustworthy AI analytics [24]. Complementary systems research on private data access and explanation (e.g., DPXPlain, Saibot) demonstrates practical techniques for supporting DP-protected analytics and helping users interpret noisy aggregates [25,26].

2.3. Personalized and adaptive privacy protection

Personalized Differential Privacy (PDP) methods adapt protection to varying point- or user-level sensitivity. Semantics-driven approaches use POI categories or external labels to identify sensitive locations [27,28], and movement-model-based frameworks like OPTDP estimate privacy risk from mobility patterns [11]. Statistical personalization methods infer sensitivity from dataset properties; for example, TF-IDF-based approaches quantify local importance and global rarity to guide budget allocation [12,13]. Interactive tools and visual analytics (DPKnob, Defogger) provide practical support for configuring heterogeneous DP strategies according to utility goals [20,21].

In parallel, recent advances in differentially private deep learning and private model training yield methods for improved utility in noisy training regimes (e.g., optimized DP-SGD variants, selective-update training, and heterogeneous-noise schemes) that can inform budget allocation and model-aware privacy strategies in trajectory publishing [25,26,29–31]. These works highlight opportunities to close the gap between personalized point-level protection and structural aggregation, motivating AdaTraj-DP’s integration of context-aware sensitivity detection, adaptive perturbation, and hierarchical encoding to support AI-oriented downstream tasks.

3. Preliminaries

Trajectory Representation. A trajectory $T_i$ of user $u_i$ is a temporally ordered sequence of geo-referenced points [32]:

\[ T_i = \{(p_{i,1}, t_{i,1}), (p_{i,2}, t_{i,2}), \ldots, (p_{i,L_i}, t_{i,L_i})\}, \tag{1} \]

where $p_{i,j} = (\mathrm{lat}_{i,j}, \mathrm{lon}_{i,j})$ denotes the spatial coordinate and $t_{i,j}$ is the timestamp. The trajectory dataset is denoted as $\mathcal{D} = \{T_1, T_2, \ldots, T_N\}$.

Each point can be projected into a discrete grid cell $c_{i,j}$ for statistical analysis or further spatial encoding. The dimensionality and sampling irregularity of $\mathcal{D}$ result in high sparsity and heterogeneous sensitivity among locations, which requires adaptive privacy mechanisms.

Differential Privacy. Let $\mathcal{D}_1$ and $\mathcal{D}_2$ be two neighboring datasets differing in at most one trajectory. A randomized mechanism $\mathcal{M}$ satisfies $\varepsilon$-differential privacy if for any measurable subset $O$ in the output space:

\[ \Pr[\mathcal{M}(\mathcal{D}_1) \in O] \le e^{\varepsilon} \Pr[\mathcal{M}(\mathcal{D}_2) \in O]. \tag{2} \]

The privacy budget $\varepsilon > 0$ controls the trade-off between privacy protection and data utility. Smaller $\varepsilon$ implies stronger privacy guarantees but larger perturbation noise.

For a numerical query $f : \mathcal{D} \to \mathbb{R}^k$ with $\ell_1$ sensitivity $\Delta f = \max_{\mathcal{D}_1, \mathcal{D}_2} \| f(\mathcal{D}_1) - f(\mathcal{D}_2) \|_1$, the Laplace mechanism adds independent noise drawn from the Laplace distribution:

\[ \mathcal{M}(\mathcal{D}) = f(\mathcal{D}) + \mathrm{Lap}(\Delta f / \varepsilon). \tag{3} \]

This mechanism provides $\varepsilon$-differential privacy and is used in subsequent trajectory perturbation and aggregation processes.

Geographic Indistinguishability. For any two spatial points $x, x' \in \mathbb{R}^2$ and any reported location $z$, a mechanism $\mathcal{M}$ achieves $\varepsilon$-geographic indistinguishability if

\[ \Pr[\mathcal{M}(x) = z] \le e^{\varepsilon \cdot d(x, x')} \Pr[\mathcal{M}(x') = z], \tag{4} \]
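As a minimal sketch of the Laplace mechanism in Eq. (3), the following Python snippet adds Laplace noise to a scalar query answer. The function name `laplace_mechanism` and its arguments are illustrative and not part of AdaTraj-DP. The sampler relies on the fact that the difference of two i.i.d. exponential variables with mean $b$ follows a zero-mean Laplace distribution with scale $b$.

```python
import random


def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Return true_value + Lap(sensitivity / epsilon) for a scalar query.

    Illustrative helper (not from the paper): `rng` is a random.Random instance.
    """
    scale = sensitivity / epsilon
    # Difference of two i.i.d. Exp(mean=scale) draws is Laplace(0, scale).
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_value + noise


rng = random.Random(0)
# A count query (sensitivity 1) released with epsilon = 0.5, i.e. noise scale 2.
noisy_count = laplace_mechanism(10.0, sensitivity=1.0, epsilon=0.5, rng=rng)
```

Halving `epsilon` doubles the noise scale, which is the privacy-utility trade-off described for Eq. (2).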



In Eq. (4), $d(x, x')$ is the Euclidean distance between $x$ and $x'$ [33]. This formulation extends differential privacy to continuous spatial domains and provides distance-dependent protection.

Hierarchical Aggregation Structure. Trajectory data exhibit hierarchical correlations that can be represented through prefix-based aggregation. Let each discretized or encoded trajectory be expressed as a sequence of spatial identifiers $S_i = [s_{i,1}, s_{i,2}, \ldots, s_{i,L_i}]$. A prefix tree $\mathcal{T}$ organizes all trajectories in $\mathcal{D}$ by shared prefixes, where each node $v$ corresponds to a spatial prefix and maintains a count $c(v)$ of trajectories passing through it. The hierarchical form allows noise to be injected at multiple granularities while preserving global spatial consistency.

The total privacy budget $\varepsilon_{\text{tree}}$ is distributed across tree layers to balance upper-level accuracy and lower-level detail preservation.

Problem Definition. Given a trajectory dataset $\mathcal{D}$ consisting of $N$ users and a total privacy budget $\varepsilon_{\text{total}}$, the objective is to design a mechanism $\mathcal{M}_{\text{traj}}$ that releases a trajectory dataset $\tilde{\mathcal{D}} = \mathcal{M}_{\text{traj}}(\mathcal{D})$ satisfying:

   (1) $\mathcal{M}_{\text{traj}}$ ensures $\varepsilon_{\text{total}}$-differential privacy at the trajectory level;
   (2) The released dataset $\tilde{\mathcal{D}}$ preserves statistical and structural properties essential for AI-based spatiotemporal analysis;
   (3) The expected analytical error between results obtained from $\tilde{\mathcal{D}}$ and $\mathcal{D}$ remains bounded.

Let $f_{\text{AI}}(\cdot)$ denote an AI model trained or evaluated on trajectory data. The utility preservation objective is formulated as

\[ L_{\text{utility}} = \mathbb{E}\left[ \| f_{\text{AI}}(\tilde{\mathcal{D}}) - f_{\text{AI}}(\mathcal{D}) \|_2^2 \right], \tag{5} \]

subject to $\tilde{\mathcal{D}}$ satisfying $\varepsilon_{\text{total}}$-differential privacy. The goal is to minimize $L_{\text{utility}}$ while maintaining formal privacy guarantees.

4. Proposed framework

Rapid development of AI-driven spatiotemporal analysis has increased the demand for high-quality trajectory data with strong privacy protection. Traditional differential privacy mechanisms often adopt fixed noise scales or uniform budget allocation, which can cause excessive utility degradation in dense areas or insufficient protection in sensitive regions. To address these limitations, this study proposes AdaTraj-DP, a framework that integrates adaptive personalized perturbation with hierarchical aggregation to achieve trajectory-level differential privacy while maintaining analytical utility for AI-based modeling. As illustrated in Fig. 1, AdaTraj-DP operates in three main phases: (1) trajectory preprocessing and context-aware sensitivity detection; (2) adaptive personalized perturbation guided by local sensitivity and spatial density; (3) hierarchical aggregation using Hilbert encoding and dynamic layer-wise budget allocation.

Fig. 1. Framework of the proposed AdaTraj-DP scheme.

4.1. Context-aware sensitivity detection

Let $\mathcal{D} = \{T_1, \ldots, T_N\}$ denote the trajectory dataset after basic preprocessing. Each trajectory $T_i = \{(p_{i,1}, t_{i,1}), \ldots, (p_{i,L_i}, t_{i,L_i})\}$ consists of temporally ordered spatial points $p_{i,j} = (\mathrm{lat}_{i,j}, \mathrm{lon}_{i,j})$. The objective of this phase is to quantify the privacy sensitivity of each spatial point by combining statistical frequency and contextual semantics to guide subsequent adaptive perturbation.

Spatial Discretization. The continuous geographical domain is partitioned into a uniform grid of $G \times G$ cells. Each point $p_{i,j}$ is mapped to a corresponding grid cell $c_{i,j}$. This transformation converts raw coordinates into discrete spatial tokens, enabling frequency-based statistical analysis.

Context-aware Sensitivity Measure. For each cell $c_{i,j}$, a sensitivity score $S(c_{i,j})$ is defined as

\[ S(c_{i,j}) = \mathrm{TF}(c_{i,j}, T_i) \cdot \mathrm{IDF}(c_{i,j}) \cdot \omega_c, \tag{6} \]

where $\mathrm{TF}(c_{i,j}, T_i) = \frac{\mathrm{count}(c_{i,j} \in T_i)}{L_i}$ represents the normalized local frequency of visits within trajectory $T_i$, and $\mathrm{IDF}(c_{i,j}) = \log \frac{|\mathcal{D}|}{|\{T_k \in \mathcal{D} : c_{i,j} \in T_k\}|}$ denotes the global rarity of the location across the dataset. The term $\omega_c$ is a contextual weighting coefficient that quantifies the semantic sensitivity of a location category. Following the semantic sensitivity hierarchy established in [34], we assign higher weights to privacy-critical categories (e.g., $\omega_{healthcare} = 1.5$, $\omega_{residential} = 1.2$) to enforce stricter protection, while assigning lower base weights to public infrastructure (e.g., $\omega_{road} = 1.0$). These semantic categories are mapped from public map services (e.g., OpenStreetMap), ensuring that the sensitivity configuration relies solely on public knowledge and does not consume the privacy budget.

Normalization and Classification. To unify the sensitivity scale, all scores are normalized into $[0, 1]$:

\[ \hat{S}(c_{i,j}) = \frac{S(c_{i,j}) - \min(S)}{\max(S) - \min(S)}. \tag{7} \]

Each point $p_{i,j}$ is then labeled as sensitive or non-sensitive according to a predefined threshold $\theta_S$:

\[ \mathrm{label}(p_{i,j}) = \begin{cases} 1, & \text{if } \hat{S}(c_{i,j}) \ge \theta_S, \\ 0, & \text{otherwise.} \end{cases} \tag{8} \]

The resulting annotated dataset is represented as $\mathcal{D}' = \{T_1', T_2', \ldots, T_N'\}$, where each $T_i'$ contains the points and corresponding sensitivity labels. The normalized score $\hat{S}(c_{i,j})$ serves as a continuous privacy indicator in the subsequent adaptive perturbation phase.

4.2. Adaptive personalized perturbation

This phase injects controlled noise into all trajectory points in $\mathcal{D}'$ to ensure trajectory-level differential privacy. All locations are perturbed to avoid inference risks arising from selective protection. The perturbation strength is adaptively adjusted based on the normalized sensitivity $\hat{S}(c_{i,j})$ and local spatial density, allowing the mechanism to preserve analytical fidelity while maintaining formal privacy guarantees.

Adaptive Privacy Budget Allocation. Each trajectory point $p_{i,j}$ is assigned an individual privacy budget $\varepsilon_{p_{i,j}}$ determined by both its sensitivity level and spatial context.

Let $\rho(p_{i,j})$ denote the local point density around $p_{i,j}$ within a neighborhood radius $r$. The adaptive budget is defined as

\[ \varepsilon_{p_{i,j}} = \varepsilon_{\max} - (\varepsilon_{\max} - \varepsilon_{\min}) \times \left( \alpha \hat{S}(c_{i,j}) + (1 - \alpha)(1 - \rho(p_{i,j})) \right), \tag{9} \]

where $\alpha \in [0, 1]$ controls the balance between sensitivity-based and density-based adaptation.

A higher $\hat{S}(c_{i,j})$ or lower $\rho(p_{i,j})$ leads to a smaller $\varepsilon_{p_{i,j}}$, introducing stronger noise for privacy-critical or sparsely visited regions. The range $[\varepsilon_{\min}, \varepsilon_{\max}]$ defines the permissible privacy strength, ensuring stability across heterogeneous data distributions.

Two-Dimensional Laplace Perturbation. For each point $p_{i,j} = (\mathrm{lat}_{i,j}, \mathrm{lon}_{i,j})$, independent Laplace noise is applied to both coordinates according to the assigned privacy budget:

\[ p'_{i,j} = \begin{cases} \mathrm{lat}_{i,j} + \mathrm{Laplace}(0, 1/\varepsilon_{p_{i,j}}) \\ \mathrm{lon}_{i,j} + \mathrm{Laplace}(0, 1/\varepsilon_{p_{i,j}}) \end{cases} \tag{10} \]
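The pipeline of Eqs. (6) to (10) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: trajectories are given as lists of grid-cell identifiers, the density $\rho$ is assumed to be pre-normalized to $[0, 1]$ (the text does not specify its scale), and all function names are hypothetical.

```python
import math
import random


def sensitivity(cell, traj, dataset, omega=1.0):
    """Eq. (6): TF * IDF * omega for one cell of a trajectory in `dataset`."""
    tf = traj.count(cell) / len(traj)
    containing = sum(1 for t in dataset if cell in t)
    idf = math.log(len(dataset) / containing)
    return tf * idf * omega


def min_max(scores):
    """Eq. (7): rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def adaptive_budget(s_hat, rho, eps_min, eps_max, alpha):
    """Eq. (9); rho is ASSUMED to lie in [0, 1]."""
    return eps_max - (eps_max - eps_min) * (alpha * s_hat + (1 - alpha) * (1 - rho))


def laplace(scale, rng):
    """Zero-mean Laplace sample as a difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def perturb(lat, lon, eps, rng):
    """Eq. (10): independent Laplace(0, 1/eps) noise on each coordinate."""
    return lat + laplace(1 / eps, rng), lon + laplace(1 / eps, rng)


# Toy data: three trajectories over cells "a", "b", "c" (hypothetical values).
dataset = [["a", "a", "b"], ["b", "c"], ["b"]]
raw = [sensitivity(c, dataset[0], dataset, omega=1.5) for c in dataset[0]]
s_hat = min_max(raw)
budgets = [adaptive_budget(s, rho=0.5, eps_min=0.1, eps_max=1.0, alpha=0.5)
           for s in s_hat]
rng = random.Random(0)
noisy_point = perturb(39.9, 116.4, budgets[0], rng)
```

In the toy data, cell "b" appears in every trajectory, so its IDF and therefore its sensitivity score are zero, and it receives a larger budget (less noise) than the rarer cell "a".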



Algorithm 1 Adaptive Personalized Perturbation under AdaTraj-DP                            Algorithm 2 Dynamic Hierarchical Aggregation under AdaTraj-DP
Input: Annotated dataset           ′ ,    privacy range [𝜀min , 𝜀max ], sensitivity       Input: Perturbed dataset ′′ , total tree budget 𝜀tree , height ℎ,
    scores 𝑆, ̂ balance coefficient 𝛼                                                          parameters 𝑎, 𝛾, encoding length 𝐿enc
Output: Perturbed dataset ′′                                                              Output: Privacy-aware prefix tree  ′
 1: ′′ ← ∅                                                                                 1: Initialize empty tree 
 2: for each trajectory 𝑇𝑖 ∈ 𝒟′ do                                                         2: for each trajectory 𝑇𝑖′′ = {𝑝′𝑖,1 , … , 𝑝′𝑖,𝐿𝑖 } in 𝒟′′ do
 3:   𝑇𝑖′′ ← ∅                                                                              3:    Encode trajectory:
 4:   for each point 𝑝𝑖,𝑗 in 𝑇𝑖 do                                                                𝑆𝑖 ← [Encode1D(𝐻(𝑝′𝑖,1 )), … , Encode1D(𝐻(𝑝′𝑖,𝐿𝑖 ))]
 5:      Compute local density 𝜌(𝑝𝑖,𝑗 )                                                     4:    Insert 𝑆𝑖 into 𝒯 and increment node counts along each path
 6:      𝜀𝑝𝑖,𝑗 ← 𝜀max − (𝜀max − 𝜀min ) × (𝛼 𝑆̂(𝑐𝑖,𝑗 ) + (1 − 𝛼)(1 − 𝜌(𝑝𝑖,𝑗 )))              5: end for
 7:      𝑛lat ∼ Laplace(0, 1∕𝜀𝑝𝑖,𝑗 )                                                        6: for layer 𝑖 = 1 to ℎ do
 8:      𝑛lon ∼ Laplace(0, 1∕𝜀𝑝𝑖,𝑗 )                                                        7:    Compute node count variance 𝜎𝑖²
 9:      𝑝′𝑖,𝑗 ← (lat𝑖,𝑗 + 𝑛lat , lon𝑖,𝑗 + 𝑛lon )                                           8:    𝜀level,𝑖 ← [(log(𝑖+𝑎))(1+𝛾𝜎𝑖²) ∕ ∑_{𝑗=1}^{ℎ} (log(𝑗+𝑎))(1+𝛾𝜎𝑗²)] ⋅ 𝜀tree
10:      Append 𝑝′𝑖,𝑗 to 𝑇𝑖′′                                                               9:    for each node 𝑣 at layer 𝑖 do
11:   end for                                                                              10:     𝑐′(𝑣) ← 𝑐(𝑣) + Laplace(0, 1∕𝜀level,𝑖 )
12:   Add 𝑇𝑖′′ to 𝒟′′                                                                      11:     Update 𝑐(𝑣) ← 𝑐′(𝑣)
13: end for                                                                                12:   end for
14: return 𝒟′′                                                                             13: end for
                                                                                           14: return 𝒯′
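As a minimal sketch of the per-point loop of Algorithm 1 (steps 5 to 9), the following Python fragment assumes that the normalized sensitivity scores and local densities are already available as values in [0, 1]. The function names and the default parameter values are illustrative assumptions and are not taken from the authors' implementation.

```python
import numpy as np

def adaptive_budget(sens, density, eps_min=0.1, eps_max=1.0, alpha=0.5):
    """Per-point budget following step 6 of Algorithm 1.

    sens and density are assumed to be normalized to [0, 1]; alpha weights
    sensitivity against density. All defaults are illustrative only.
    """
    sens = np.asarray(sens, dtype=float)
    density = np.asarray(density, dtype=float)
    score = alpha * sens + (1.0 - alpha) * (1.0 - density)
    return eps_max - (eps_max - eps_min) * score

def perturb_trajectory(points, sens, density, rng, **budget_kwargs):
    """Add independent Laplace(0, 1/eps) noise to both coordinates of each point."""
    points = np.asarray(points, dtype=float)                 # shape (L, 2)
    eps = adaptive_budget(sens, density, **budget_kwargs)    # shape (L,)
    scale = (1.0 / eps)[:, None]                             # one noise scale per point
    noise = rng.laplace(loc=0.0, scale=scale, size=points.shape)
    return points + noise, eps

rng = np.random.default_rng(0)
pts = [[0.20, 0.30], [0.21, 0.31], [0.80, 0.75]]
noisy, eps = perturb_trajectory(
    pts, sens=[1.0, 0.5, 0.0], density=[0.0, 0.5, 1.0], rng=rng
)
```

In this toy input, the first point (highly sensitive, sparse region) receives the smallest budget 𝜀min and therefore the largest noise, while the last point (non-sensitive, dense region) receives 𝜀max.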

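The main ingredients of Algorithm 2 can be sketched in the same spirit. The fragment below uses the standard iterative Hilbert index computation on a power-of-two grid, zero-pads the index to a fixed-length binary string, splits 𝜀tree across layers with the logarithmic, variance-weighted rule, and adds Laplace noise to the node counts. The clipping of out-of-range coordinates, the zero-padding scheme, and all function names are assumptions made for illustration; the paper's own Encode1D definition is given in its Appendix.

```python
import numpy as np

def hilbert_index(n, x, y):
    """Hilbert-curve index of cell (x, y) on an n x n grid, n a power of two."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                       # rotate/flip the current quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def encode_point(u, v, grid=128, l_enc=16):
    """Quantize normalized coordinates (clipped to [0, 1]) to a binary string."""
    x = min(int(np.clip(u, 0.0, 1.0) * grid), grid - 1)
    y = min(int(np.clip(v, 0.0, 1.0) * grid), grid - 1)
    return format(hilbert_index(grid, x, y), f"0{l_enc}b")

def layer_budgets(variances, eps_tree, a=1.0, gamma=0.5):
    """Split eps_tree over layers 1..h with weights log(i + a) * (1 + gamma * var_i)."""
    var = np.asarray(variances, dtype=float)
    layers = np.arange(1, len(var) + 1)
    weights = np.log(layers + a) * (1.0 + gamma * var)
    return eps_tree * weights / weights.sum()

def perturb_counts(counts_by_layer, eps_levels, rng):
    """Add Laplace(0, 1/eps_level) noise to every node count of each layer."""
    return [
        np.asarray(c, dtype=float) + rng.laplace(0.0, 1.0 / e, size=len(c))
        for c, e in zip(counts_by_layer, eps_levels)
    ]

rng = np.random.default_rng(0)
counts = [[5, 3], [2, 3, 1, 2]]                      # toy node counts for two layers
variances = [np.var(c) for c in counts]
eps_levels = layer_budgets(variances, eps_tree=0.6)
noisy_counts = perturb_counts(counts, eps_levels, rng)
```

The layer budgets always sum to 𝜀tree, so changing 𝛾 or 𝑎 only redistributes the budget across layers.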
   The perturbed trajectory 𝑇𝑖′′ = {𝑝′𝑖,1 , 𝑝′𝑖,2 , … , 𝑝′𝑖,𝐿𝑖 } is constructed by replacing each original
point with its perturbed counterpart. The complete differentially private dataset is denoted as
𝒟′′ = {𝑇1′′ , 𝑇2′′ , … , 𝑇𝑁′′ }.
   Algorithm 1 outlines the adaptive personalized perturbation procedure.

4.3. Hierarchical aggregation with dynamic budget allocation

    This phase organizes the perturbed trajectories into a structured form for privacy-preserving
analytical querying and AI model training. A hierarchical prefix tree is constructed from the encoded
trajectories, where node counts are perturbed under a dynamically adjusted budget to preserve global
consistency while mitigating noise propagation.

Spatial Encoding via Hilbert Curve. Each perturbed point 𝑝′𝑖,𝑗 ∈ 𝒟′′ is mapped into a one-dimensional
integer value 𝑣𝑖,𝑗 using a Hilbert space-filling curve 𝐻(⋅), ensuring spatial locality preservation:

𝑣𝑖,𝑗 = 𝐻(𝑝′𝑖,𝑗 ).                                                             (11)

    Each integer value 𝑣𝑖,𝑗 is then converted into a fixed-length binary string 𝑠𝑖,𝑗 of length 𝐿enc ,
forming a discretized trajectory representation 𝑆𝑖 = [𝑠𝑖,1 , 𝑠𝑖,2 , … , 𝑠𝑖,𝐿𝑖 ]. The set of all encoded
trajectories {𝑆𝑖 } constitutes the input to hierarchical aggregation. The technical details of this
Hilbert-to-binary-string encoding, including the relationship between the curve's order and the string
length, are elaborated in the Appendix.

Prefix Tree Construction. A prefix tree 𝒯 is built from {𝑆𝑖 }, where each path from the root to a node 𝑣
represents a spatial prefix, and the node count 𝑐(𝑣) indicates the number of trajectories sharing that
prefix. The maximum tree depth ℎ corresponds to the maximum trajectory length or encoding depth.

Dynamic Layer-wise Budget Allocation. The total privacy budget 𝜀tree is distributed across tree layers
according to both layer depth and statistical variance. Let 𝜎𝑖² denote the empirical variance of node
counts at layer 𝑖. The adaptive allocation for layer 𝑖 is defined as

\varepsilon_{\mathrm{level},i} = \frac{\log(i+a)\,(1+\gamma\sigma_i^2)}{\sum_{j=1}^{h}\log(j+a)\,(1+\gamma\sigma_j^2)} \cdot \varepsilon_{\mathrm{tree}},    (12)

where 𝑎 > 0 is a smoothing parameter and 𝛾 ≥ 0 controls the weight of variance-based adjustment.
Adopting the logarithmic strategy from [9], the function log(𝑖 + 𝑎) is selected to smooth the budget
decay across layers. Unlike linear or exponential allocation schemes, which might excessively penalize
deeper layers and lead to significant information loss in fine-grained trajectories, the logarithmic
term ensures that leaf nodes retain sufficient privacy budget to preserve local spatial details.

Differentially Private Node Perturbation. For each node 𝑣 at layer 𝑖, the sensitivity of its count
query is 𝛥𝑓 = 1. Laplace noise is applied according to its layer-wise budget:

c'(v) = c(v) + \mathrm{Laplace}\!\left(0, \frac{1}{\varepsilon_{\mathrm{level},i}}\right).    (13)

    The resulting prefix tree 𝒯′ with perturbed counts serves as a privacy-preserving hierarchical
representation supporting aggregate analytics and AI-based trajectory modeling.
    Algorithm 2 summarizes the hierarchical aggregation process with dynamic budget adjustment.

4.4. Privacy analysis

    The proposed AdaTraj-DP framework comprises two sequential privacy-preserving mechanisms: adaptive
personalized perturbation (with budget 𝜀point ) and hierarchical aggregation (with budget 𝜀tree ). By
the sequential composition theorem of differential privacy, the total privacy guarantee satisfies

𝜀total = 𝜀point + 𝜀tree .                                                             (14)

Privacy of Adaptive Personalized Perturbation (𝜀point ). The adaptive perturbation mechanism assigns an
individual privacy budget 𝜀𝑝𝑖,𝑗 to each trajectory point 𝑝𝑖,𝑗 derived from its normalized sensitivity
𝑆̂(𝑐𝑖,𝑗 ) and local density 𝜌(𝑝𝑖,𝑗 ). To ensure rigorous privacy guarantees, it is assumed that the
global weighting parameters (e.g., contextual weights 𝜔𝑐 and density thresholds) are computed from
public sources, such as map topologies or non-sensitive historical statistics. This reliance on public
metadata is a standard practice in privacy-preserving spatial publishing [14,33], ensuring that the
sensitivity calibration process itself does not leak private information. Consequently, the allocated
budget 𝜀𝑝𝑖,𝑗 depends solely on the characteristics of its corresponding trajectory 𝑇𝑖 . Under this
assumption:

   (1) The assignment of 𝜀𝑝𝑖,𝑗 relies solely on local statistics within 𝑇𝑖 and public constants, which
       ensures independence among users.
   (2) Each trajectory is processed through an independent Laplace mechanism. For any point 𝑝𝑖,𝑗 , the
       Laplace mechanism with scale 1∕𝜀𝑝𝑖,𝑗 satisfies 𝜀𝑝𝑖,𝑗 -differential privacy.



   (3) Because the budgets are bounded within [𝜀min , 𝜀max ], the per-point guarantee ranges from
       𝜀min -DP (strongest protection, largest noise) to 𝜀max -DP (weakest protection, smallest noise).
   (4) By parallel composition across trajectories, the global privacy consumption of this phase is
       𝜀point = 𝜀max , representing the maximum privacy loss incurred when the weakest noise is added.

   Hence, the adaptive perturbation phase satisfies 𝜀max -differential privacy.

Privacy of Hierarchical Aggregation (𝜀tree ). The hierarchical aggregation mechanism constructs a prefix
tree and perturbs its node counts with layer-specific noise calibrated by 𝜀level,𝑖 . Each trajectory
affects exactly one node per layer, implying that the sensitivity of the count query at any layer is
𝛥𝑓 = 1. Adding Laplace noise with scale 1∕𝜀level,𝑖 guarantees 𝜀level,𝑖 -DP for that layer.
    Because the per-layer budgets 𝜀level,𝑖 are partitioned from 𝜀tree according to

\sum_{i=1}^{h} \varepsilon_{\mathrm{level},i} = \varepsilon_{\mathrm{tree}},    (15)

and the layers are sequentially composed along each trajectory path, the entire prefix tree synthesis
mechanism satisfies 𝜀tree -differential privacy. The dynamic allocation factor (1 + 𝛾𝜎𝑖² ) modifies the
budget distribution without altering the total privacy bound, ensuring that the overall guarantee
remains unchanged.

Overall Privacy Guarantee. Applying the sequential composition theorem to the two phases yields the
total privacy protection level:

𝜀total = 𝜀max + 𝜀tree .                                               (16)

    This ensures that AdaTraj-DP provides formal, trajectory-level differential privacy. The adaptive
and hierarchical mechanisms jointly maintain consistent privacy guarantees while supporting
utility-preserving analysis for AI-based spatiotemporal modeling.

5. Experimental evaluation

   This section presents an extensive empirical evaluation of the proposed AdaTraj-DP framework. The
experiments aim to validate both privacy preservation and analytical utility in AI-oriented trajectory
publishing. Specifically, we address the following research questions:

      • RQ1: How does the total privacy budget 𝜀total affect the analytical utility of the released
        trajectories?
      • RQ2: How does AdaTraj-DP perform compared to state-of-the-art differential privacy mechanisms
        in terms of accuracy and computational efficiency?
      • RQ3: What are the impacts of the adaptive parameters, including allocation ratio 𝛼 and variance
        factor 𝛾, on privacy–utility trade-offs?

5.1. Experimental setup

    This subsection introduces the datasets, baseline methods, evaluation metrics, and parameter
configurations used in the experiments.

5.1.1. Datasets
   Experiments are primarily conducted on the widely used T-Drive dataset, which records GPS
trajectories of 10,357 taxis in Beijing over seven days (February 2–8, 2008) [35]. It contains
approximately 15 million spatial points after preprocessing. To further verify cross-domain
robustness, we additionally include the GeoLife dataset [36], which comprises 17,621 trajectories from
182 users, covering both dense urban and sparse suburban mobility patterns.
   Both datasets are preprocessed by: (1) removing sampling intervals exceeding 300 s; (2) filtering out
trajectories shorter than 20 points; (3) normalizing all coordinates into a [0, 1] × [0, 1] grid to
ensure scale comparability.
   These datasets collectively provide both high-density and low-density spatial distributions,
enabling a fair evaluation of the proposed context-aware sensitivity modeling.

5.1.2. Baseline methods
   To demonstrate the advantages of AdaTraj-DP, we compare the following four methods, of which the
first three are baselines reflecting distinct privacy design paradigms:

      • HA-Tree [9]: A hierarchical aggregation method based on Hilbert mapping and fixed logarithmic
        budget allocation, representing state-of-the-art static DP trees.
      • TFIDF-DP [13]: A personalized perturbation method using TF–IDF-based sensitivity scoring
        without hierarchical structure, corresponding to point-level DP only.
      • QJLP (LDP) [7]: A local differential privacy baseline where each trajectory is perturbed
        independently on the client side.
      • AdaTraj-DP (Ours): The proposed adaptive framework that combines context-aware sensitivity
        detection, adaptive perturbation, and dynamic hierarchical aggregation.

5.1.3. Evaluation metrics
   Performance is evaluated from three complementary perspectives:

Data Utility. We adopt three quantitative metrics: Mean Absolute Error (MAE), Mean Relative Error
(MRE), and Hausdorff Distance (HD). MAE and MRE evaluate accuracy for range-count queries on perturbed
trajectories, while HD measures spatial fidelity between original and released datasets.

Model Utility. To align with AI-oriented evaluation, we train a downstream trajectory classification
model based on a lightweight Mamba encoder [37]. The model predicts driver ID from trajectory segments,
and classification accuracy on the perturbed data reflects end-task utility (𝑈cls ).

Computational Efficiency. We report total runtime (𝑇total ) from preprocessing to privacy-protected
publication, including all three phases of AdaTraj-DP.

5.1.4. Parameter configuration
    Unless otherwise stated, experiments use the following default configuration: the total privacy
budget 𝜀total is split according to an allocation ratio 𝛼 ∈ [0.3, 0.7], where 𝛼 is the portion used for
adaptive perturbation (𝜀point ) and (1 − 𝛼) the portion used for hierarchical aggregation (𝜀tree ):

𝜀point = 𝛼𝜀total , 𝜀tree = (1 − 𝛼)𝜀total .                                       (17)

    We vary 𝜀total from 0.5 to 3.0 to investigate the privacy–utility trade-off.
    The variance factor 𝛾 controlling dynamic budget adaptation is selected from {0, 0.2, 0.5, 1.0}, and
the hierarchical smoothing parameter is set to 𝑎 = 1.0. The sensitivity threshold 𝜃𝑆 for classifying
sensitive points is chosen from {0.6, 0.7, 0.8, 0.9}. The personalized budget range is fixed at
[𝜀min , 𝜀max ] = [0.1, 1.0].
    To ensure comparability, all methods share identical grid resolution (𝐺 = 128) and Hilbert encoding
length (𝐿enc = 16). All experiments are implemented in Python 3.8 with PyTorch 2.4 on an NVIDIA RTX
4090 GPU.

5.2. RQ1: Data utility evaluation

    This experiment evaluates how AdaTraj-DP preserves the analytical utility of published trajectories
under different privacy budgets. All



Fig. 2. Trajectory count query accuracy under varying 𝜀total on both datasets: (a) MAE of count queries; (b) MRE of count queries.


evaluations are conducted on both the T-Drive and GeoLife datasets, covering dense and sparse mobility
scenarios to ensure cross-domain consistency.

5.2.1. Accuracy of trajectory count queries
    We evaluate the ability of each method to answer prefix-based count queries accurately. For each
dataset, a query set 𝒬 consisting of 1000 random trajectory prefixes with lengths between 4 and 8 is
selected. Let 𝑐(𝑞) denote the true count of trajectories matching prefix 𝑞 ∈ 𝒬, and 𝑐̂(𝑞) be the noisy
count returned by the mechanism. The data utility is quantified using Mean Absolute Error (MAE) and
Mean Relative Error (MRE), defined as:

\mathrm{MAE} = \frac{1}{|\mathcal{Q}|}\sum_{q\in\mathcal{Q}} |c(q) - \hat{c}(q)|, \qquad \mathrm{MRE} = \frac{1}{|\mathcal{Q}|}\sum_{q\in\mathcal{Q}} \frac{|c(q) - \hat{c}(q)|}{\max(c(q), \delta)},    (18)

where 𝛿 is a smoothing parameter (set to 1% of the total dataset size) to prevent division by zero for
small counts. The results are averaged over ten repetitions with independent noise realizations.

Effect of Privacy Budget 𝜀total . Figs. 2(a) and 2(b) illustrate the quantitative relationship between
privacy strength and data utility. All methods exhibit a convex error decay curve as 𝜀total increases
from 0.5 to 3.0, reflecting the fundamental differential privacy trade-off.
    In the strict privacy regime (𝜀total ∈ [0.5, 1.5]), our method achieves the steepest marginal
reduction in MAE, indicating a high return on privacy budget investment. Specifically, when 𝜀total
increases from 0.5 to 1.0, AdaTraj-DP reduces the MAE by approximately 45.3% (from 18.1 to 9.9),
whereas the second-best baseline, HA-Tree, only achieves a 31.4% reduction. This quantitative gap
demonstrates that AdaTraj-DP yields a significantly higher marginal utility gain for every unit of
privacy budget expended compared to static hierarchical structures.

5.2.2. Preservation of spatial distribution
   Spatial fidelity evaluates the geometric similarity between the original and perturbed trajectories.
We use two complementary metrics: the Hausdorff Distance (HD) for worst-case deviation and the Mean
Displacement (MD) for average positional distortion.

Effect of Privacy Budget 𝜀total . Fig. 3 and Table 1 summarize the spatial accuracy across privacy
levels. For both T-Drive and GeoLife datasets, AdaTraj-DP consistently achieves smaller deviations,
demonstrating its robustness across data densities and spatial patterns. The sensitivity-guided
perturbation preserves local consistency, while adaptive budget redistribution reduces distortion in
dense urban regions.

Table 1
Spatial fidelity comparison (average over T-Drive and GeoLife datasets). Lower values indicate higher
spatial accuracy.

 𝜀total   Hausdorff Distance (HD)               Mean Displacement (MD)
          AdaTraj-DP     Best Baseline          AdaTraj-DP      Best Baseline
 0.5      0.152          0.171 (HA-Tree)        0.098           0.113 (HA-Tree)
 1.0      0.096          0.127 (HA-Tree)        0.069           0.087 (HA-Tree)
 1.5      0.089          0.125 (TFIDF-DP)       0.063           0.088 (TFIDF-DP)
 2.0      0.083          0.118 (TFIDF-DP)       0.059           0.083 (TFIDF-DP)
 3.0      0.079          0.130 (QJLP)           0.056           0.094 (QJLP)

   Overall, AdaTraj-DP demonstrates consistent spatial and statistical accuracy across both datasets,
validating its generalizability to heterogeneous mobility distributions.

5.3. RQ2: Model utility evaluation

   This experiment evaluates how the differentially private trajectories generated by AdaTraj-DP retain
their utility for AI-based downstream tasks. Two representative learning tasks are considered: (1)
trajectory classification, which predicts the semantic category of a movement sequence; (2) destination
prediction, which estimates the likely endpoint of an ongoing trajectory. These tasks are evaluated on
the T-Drive and GeoLife datasets to reflect both dense and sparse urban mobility environments.

5.3.1. Trajectory classification
    A hierarchical Transformer-based model with positional encoding is trained on the published
trajectories to perform multi-class trajectory classification. The model architecture follows a
standard encoder setup with three attention layers and a hidden size of 256. Each experiment is
repeated five times under independent noise realizations, and the average classification accuracy and
macro F1-score are reported. The total privacy budget 𝜀total is varied from 0.5 to 3.0.

Effect of Privacy Budget 𝜀total . Figs. 4(a) and 4(b) illustrate the influence of 𝜀total on model
performance. As the privacy budget increases, both accuracy and F1-score improve across all methods.
AdaTraj-DP consistently maintains the highest model utility on both datasets, demonstrating that
adaptive sensitivity control effectively preserves discriminative features. The hierarchical tree
representation mitigates local noise accumulation, supporting stable model convergence.

5.3.2. Destination prediction
   To evaluate predictive consistency, a sequence-to-sequence neural decoder is trained to predict the
destination region of each trajectory prefix. Prediction accuracy is measured by the top-1 hit rate,
while spatial accuracy is quantified by the mean geodesic distance between predicted and true
destinations.

Effect of Privacy Budget 𝜀total . Figs. 5(a) and 5(b) illustrate the results of destination prediction
across both datasets. AdaTraj-DP maintains stable predictive performance even under strict privacy
constraints (𝜀total < 1.0), consistently outperforming fixed-budget baselines that cannot adapt to
local sensitivity variations. As the privacy budget increases, the prediction accuracy steadily
improves, while the mean spatial deviation between predicted and true destinations decreases. This
demonstrates that adaptive perturbation and hierarchical encoding together preserve mobility semantics
and ensure downstream models can effectively capture trajectory intent despite injected noise.
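The count-query metrics of Eq. (18) can be illustrated with a short Python sketch. The function name and the toy inputs are hypothetical; the true and noisy counts are assumed to be aligned arrays with one entry per query, and 𝛿 is passed in directly rather than derived from the dataset size.

```python
import numpy as np

def count_query_errors(true_counts, noisy_counts, delta):
    """Return (MAE, MRE) over a set of count queries, following Eq. (18)."""
    c = np.asarray(true_counts, dtype=float)
    c_hat = np.asarray(noisy_counts, dtype=float)
    abs_err = np.abs(c - c_hat)
    mae = abs_err.mean()
    # max(c, delta) keeps the denominator away from zero for rare prefixes
    mre = (abs_err / np.maximum(c, delta)).mean()
    return mae, mre

# Toy example: three queries, one of which has a true count of zero.
mae, mre = count_query_errors([10, 0, 4], [12, 1, 4], delta=2.0)
```

For the toy input the absolute errors are 2, 1, and 0, so the MAE is 1.0, and the relative errors are 2/10, 1/2, and 0/4.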

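The two spatial fidelity metrics of Section 5.2.2 can be sketched as follows. The paper does not spell out how HD and MD are aggregated over a dataset, so this fragment makes two assumptions: HD is the symmetric Hausdorff distance between the point sets of one original and one perturbed trajectory, and MD is the mean Euclidean distance between corresponding points. The function names are illustrative.

```python
import numpy as np

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two point sets of shape (n, 2) and (m, 2)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise Euclidean distances, shape (n, m)
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1).max()   # farthest point of a from its nearest point in b
    b_to_a = dists.min(axis=0).max()   # farthest point of b from its nearest point in a
    return max(a_to_b, b_to_a)

def mean_displacement(original, perturbed):
    """Mean Euclidean distance between corresponding points of two trajectories."""
    original = np.asarray(original, dtype=float)
    perturbed = np.asarray(perturbed, dtype=float)
    return np.linalg.norm(original - perturbed, axis=1).mean()

orig_traj = [[0.0, 0.0], [1.0, 0.0]]
pert_traj = [[0.0, 0.0], [1.0, 1.0]]
hd = hausdorff_distance(orig_traj, pert_traj)
md = mean_displacement(orig_traj, pert_traj)
```

HD reacts to the single worst-matched point, whereas MD averages over all points, which is why the two metrics are reported side by side.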


Fig. 3. Spatial fidelity comparison on T-Drive and GeoLife datasets: (a) Hausdorff Distance vs. privacy budget; (b) Mean Displacement vs. privacy budget.

Fig. 4. Trajectory classification performance under varying 𝜀total on T-Drive and GeoLife datasets: (a) classification accuracy; (b) F1-score.

Fig. 5. Destination prediction accuracy and spatial deviation under varying 𝜀total on T-Drive and GeoLife datasets: (a) destination prediction accuracy (top-1 hit rate); (b) destination prediction mean distance error (km).


5.4. RQ3: Parameter sensitivity analysis

    This experiment investigates the effect of key parameters in AdaTraj-DP on privacy–utility balance,
focusing on two critical hyperparameters: the budget allocation ratio 𝛼 and the sensitivity threshold
𝜃TFIDF . All experiments are conducted with the total privacy budget 𝜀total = 1.5 on both the T-Drive
and GeoLife datasets.

5.4.1. Effect of budget allocation ratio 𝛼
    The parameter 𝛼 controls the distribution of the total privacy budget between the point-level
perturbation and the hierarchical tree aggregation phases, where 𝜀point = 𝛼𝜀total and
𝜀tree = (1 − 𝛼)𝜀total . A small 𝛼 assigns more budget to aggregation, reducing hierarchical noise,
whereas a large 𝛼 increases point-level fidelity at the expense of tree consistency. We vary 𝛼 from 0.1
to 0.9 and evaluate both data utility and model accuracy.
    Fig. 6 presents the effect of 𝛼 on count query error (MAE) and trajectory classification accuracy.
An optimal trade-off is observed near 𝛼 = 0.6, where both the query error and model accuracy achieve
near-balanced performance. When 𝛼 < 0.4, excessive noise in point perturbation causes degraded spatial
precision, while 𝛼 > 0.8 reduces the reliability of aggregated counts in the prefix tree, highlighting
the necessity of coordinated budget allocation.
    In practice, the optimal 𝛼 depends on the specific utility requirements. For applications
prioritizing fine-grained point precision (e.g., destination prediction), a larger 𝛼 (e.g., 0.6–0.7)
is recommended to allocate more budget to the perturbation phase. Conversely, for range query tasks
relying on aggregate statistics, a smaller 𝛼 favors the hierarchical tree structure. An empirical
strategy for parameter selection involves using a small, non-sensitive validation set to estimate the
inflection point of the loss function. A balanced initialization of 𝛼 = 0.6 is recommended as a default
setting, which prioritizes neither point-level perturbation nor structural aggregation excessively. To
ensure privacy integrity, this validation set is constructed from public historical trajectory data
(e.g., open-source T-Drive samples) or a disjoint subset of historical records that does not overlap
with the private



Fig. 6. Impact of budget allocation ratio 𝛼 on query utility and model performance at 𝜀total = 1.5.

Fig. 8. Computational cost decomposition of AdaTraj-DP across three key stages.


                                                                                         T-Drive dataset and the sparse, diverse GeoLife dataset. This cross-
                                                                                         dataset stability suggests that AdaTraj-DP is robust to heterogeneous
                                                                                         spatial distributions, indicating that a standard parameter configura-
                                                                                         tion can yield reliable performance without the need for exhaustive
                                                                                         hyperparameter retuning for every new application scenario.

Fig. 7. Effect of the sensitivity threshold 𝜃TFIDF on spatial fidelity and predictive performance at 𝜀total = 1.5.

dataset. This separation guarantees that the hyperparameter tuning process relies solely on public knowledge and does not consume the privacy budget allocated for the sensitive data.

5.4.2. Effect of sensitivity threshold 𝜃TFIDF

    The threshold 𝜃TFIDF determines how many trajectory points are classified as sensitive during the TF–IDF-based detection process. A smaller threshold labels more points as sensitive, resulting in stronger protection but higher noise magnitude. We vary 𝜃TFIDF from 0.6 to 1.2 and evaluate the mean displacement (MD) and destination prediction accuracy.
    Fig. 7 depicts the variation of spatial fidelity and predictive utility under different 𝜃TFIDF values. As 𝜃TFIDF increases, the number of sensitive points decreases, leading to reduced perturbation intensity and smaller average displacement. However, an excessively large 𝜃TFIDF weakens privacy coverage and slightly degrades downstream prediction accuracy. The optimal setting is observed around 𝜃TFIDF = 0.9, balancing spatial accuracy with model generalization.

5.4.3. Generalization and parameter stability

    In the ablation studies presented above, we observed that the framework’s utility is responsive to variations in the budget allocation ratio 𝛼 and sensitivity threshold 𝜃TFIDF, particularly when these parameters approach the boundaries of their respective ranges. This sensitivity necessitates a discussion on the model’s generalization capabilities across different data distributions.
    While the framework exhibits sensitivity to extreme parameter variations, it is worth noting that the optimal operating points (𝛼 ≈ 0.6, 𝜃TFIDF ≈ 0.9) remain consistent across both the high-density

5.5. Scalability analysis

    To address practical deployment concerns, particularly for city-wide scenarios, we analyze the scalability of AdaTraj-DP regarding both dataset volume (number of users 𝑁) and temporal duration (trajectory length 𝐿).

Scalability to Large-scale User Datasets. The computational complexity of AdaTraj-DP is dominated by the linear scanning of trajectory points. Specifically, the sensitivity detection and adaptive perturbation phases operate on each trajectory independently, with a time complexity of 𝑂(𝑁 ⋅ 𝐿). This independence allows for trivial parallelization across multiple processors, significantly reducing runtime on large-scale datasets. Furthermore, the hierarchical aggregation phase inserts encoded sequences into the prefix tree with a complexity of 𝑂(𝑁 ⋅ 𝐿), avoiding the quadratic 𝑂(𝑁^2) pairwise comparisons often required by clustering-based or 𝐾-anonymity approaches. Consequently, the runtime of AdaTraj-DP grows linearly with the number of users, indicating that the framework is scalable to large-scale spatiotemporal datasets typical of modern urban computing.

Robustness for Long Historical Trajectories. For long historical trajectories, the challenge lies in maintaining structural efficiency and data utility as the sequence length increases. AdaTraj-DP addresses this through two mechanisms:

  (1) Efficient Encoding: The Hilbert space-filling curve maps two-dimensional spatial points into 1D integers via efficient bitwise operations. Since the encoding complexity is constant per point, the computational cost scales linearly with the trajectory length, avoiding the performance bottlenecks often associated with complex sequence alignment methods.
  (2) Depth-Robust Aggregation: Long trajectories naturally necessitate deeper prefix trees, which typically suffer from severe budget dilution at lower levels. AdaTraj-DP addresses this through its logarithmic layer-wise allocation (Eq. (12)), which dampens the noise increase rate relative to tree depth. This mechanism ensures that the tail ends of extended mobility sequences retain analytical utility, preventing the rapid signal degradation commonly observed in uniform allocation schemes.

Empirical Efficiency Evaluation. To complement the theoretical complexity analysis, Fig. 8 presents the empirical runtime decomposition
of AdaTraj-DP on the T-Drive dataset. The total processing time is approximately 250 s. As observed, the TF–IDF Analysis phase constitutes the majority of the computational overhead (approx. 60%) due to the necessity of global statistical aggregation across the spatial grid. However, the core privacy mechanisms (Prefix Tree Construction and Perturbation) demonstrate high efficiency. Notably, the adaptive perturbation phase accounts for less than 10% of the total time, confirming that the granular noise injection introduces negligible latency. This performance profile validates that AdaTraj-DP is well-suited for periodic batch publishing scenarios (e.g., releasing trajectory updates every 5–10 min for traffic monitoring). While the current execution time is sufficient for such batch-based near-real-time analytics, we acknowledge that strictly latency-critical streaming applications may require further optimization of the tree construction process. Nevertheless, for the targeted high-utility analysis tasks, this computational cost is a justifiable trade-off for the structural consistency provided by the framework.

6. Conclusion

    This study presented AdaTraj-DP, an adaptive privacy-preserving framework for publishing trajectory data with differential privacy guarantees. The framework introduces context-aware sensitivity modeling and adaptive budget allocation to balance privacy protection and analytical utility in AI-based mobility analysis. By integrating personalized perturbation with hierarchical prefix-tree aggregation, AdaTraj-DP enables trajectory-level differential privacy while maintaining spatial fidelity and downstream model performance.
    Future work will focus on extending AdaTraj-DP to support multimodal trajectory data, integrating semantic and temporal context under unified privacy constraints. Additionally, to address the efficiency concerns in high-frequency streaming environments, we plan to investigate incremental tree update algorithms. This would allow the framework to handle real-time data streams with significantly lower latency while maintaining the established privacy guarantees.

CRediT authorship contribution statement

    Yongxin Zhao: Writing – review & editing, Writing – original draft, Visualization, Validation, Methodology, Investigation, Data curation, Conceptualization. Chundong Wang: Writing – review & editing, Project administration, Methodology. Hao Lin: Visualization, Validation, Methodology. Xumeng Wang: Writing – review & editing, Methodology, Conceptualization. Yixuan Song: Methodology, Investigation, Conceptualization. Qiuyu Du: Investigation, Conceptualization.

Declaration of competing interest

    The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

    Thanks to the National Key R&D Program of China (2023YFB2703900).

Appendix. Conversion from integer values to binary sequences

    Our prefix tree construction necessitates the representation of each geographic coordinate as a character sequence. Although the Hilbert space-filling curve successfully transforms a two-dimensional coordinate 𝑝′𝑖,𝑗 into a one-dimensional integer 𝑣𝑖,𝑗, this numerical value cannot be directly incorporated into a conventional prefix tree structure. Consequently, we implement an additional transformation phase that converts this integer into a binary sequence 𝑠𝑖,𝑗 with fixed length.
    This transformation is controlled by the Hilbert curve’s order parameter, designated as 𝑘. When applying a Hilbert curve with order 𝑘, the two-dimensional space becomes divided into a (2^𝑘) × (2^𝑘) cellular grid. To guarantee that every coordinate within dataset 𝐷 receives a distinct Hilbert index assignment, the order parameter must fulfill the condition 𝑘 ≥ ⌈log √|𝐷|⌉. This configuration assigns each cell, including any coordinate it contains, to a unique integer within the interval [0, (2^𝑘)^2 − 1].
    The binary sequence length, denoted 𝐿enc, depends on the total count of representable integer values. Representing all (2^𝑘)^2 = 2^(2𝑘) distinct values necessitates a binary sequence of length 𝐿enc = 2𝑘. The transformation consists of a direct conversion from integer 𝑣𝑖,𝑗 to its 𝐿enc-bit binary form, applying leading zero-padding when needed to maintain uniform length.
    Consider the following illustration: assume a Hilbert curve with order 𝑘 = 8. Under these conditions, the cell count equals (2^8)^2 = 65,536, the integer value 𝑣𝑖,𝑗 resides within the interval [0, 65535], and the necessary binary sequence length becomes 𝐿enc = 2 × 8 = 16.
    When coordinate 𝑝′𝑖,𝑗 maps to integer 𝑣𝑖,𝑗 = 47593, its 16-bit binary sequence representation becomes:

    𝑠𝑖,𝑗 = Encode(47593, 16) = "1011100111101001".                                    (A.1)

    This sequence 𝑠𝑖,𝑗 serves as the actual element for navigating and constructing the prefix tree. Individual bits within the sequence determine decisions at corresponding tree levels, establishing a multi-level spatial indexing structure. The selection of parameter 𝑘 (and consequently 𝐿enc) represents a crucial design choice that mediates between spatial granularity and the prefix tree’s dimensions and computational overhead.

Data availability

    Data will be made available on request.
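
The encoding steps described in the Appendix and in Section 5.5 (Hilbert indexing by bitwise operations, followed by fixed-length binary conversion and bit-wise prefix-tree insertion) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names min_order, hilbert_index, encode_bits and insert_bits are hypothetical, the Hilbert routine is the standard iterative cell-to-index conversion (its curve orientation may differ from the one used in the paper), the logarithm in the order condition is assumed to be base 2, and the dictionary-based tree only counts visits without adding any noise.

```python
def min_order(num_points: int) -> int:
    """Smallest order k whose (2^k) x (2^k) grid has at least num_points cells.

    Assumption: the condition k >= ceil(log sqrt(|D|)) uses a base-2 logarithm,
    which is equivalent to 4^k >= |D|.
    """
    k = 0
    while (1 << (2 * k)) < num_points:
        k += 1
    return k


def hilbert_index(x: int, y: int, k: int) -> int:
    """Map grid cell (x, y), with 0 <= x, y < 2^k, to its Hilbert index."""
    n = 1 << k
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("cell lies outside the 2^k x 2^k grid")
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip so the next (finer) level sees a canonical orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def encode_bits(v: int, length: int) -> str:
    """Convert integer v to a zero-padded binary string of exactly `length` bits."""
    if not (0 <= v < (1 << length)):
        raise ValueError("value does not fit in the requested length")
    return format(v, f"0{length}b")


def insert_bits(root: dict, bits: str) -> None:
    """Walk one tree level per bit and increment the visit count of each node."""
    node = root
    for b in bits:
        node = node["children"].setdefault(b, {"count": 0, "children": {}})
        node["count"] += 1


if __name__ == "__main__":
    k = 8                      # order used in the Appendix illustration
    l_enc = 2 * k              # L_enc = 2k = 16 bits
    print(encode_bits(47593, l_enc))   # 1011100111101001, as in Eq. (A.1)

    root = {"count": 0, "children": {}}
    for x, y in [(3, 5), (3, 6), (200, 17)]:   # made-up grid cells
        insert_bits(root, encode_bits(hilbert_index(x, y, k), l_enc))
```

Each point costs a loop of 𝑘 iterations for indexing and 𝐿enc steps for insertion, so for a fixed order 𝑘 the work per point is constant and the total cost grows linearly with the number of points, in line with the 𝑂(𝑁 ⋅ 𝐿) behavior stated in Section 5.5.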


                                                                              9
Y. Zhao et al.                                                                                                                       Computer Standards & Interfaces 97 (2026) 104125


[14] W. Qardaji, W. Yang, N. Li, Differentially private grids for geospatial data, in:         [25] T. Wang, Y. Tao, A. Gilad, A. Machanavajjhala, S. Roy, Explaining differen-
     2013 IEEE 29th International Conference on Data Engineering, ICDE, IEEE, 2013,                 tially private query results with dpxplain, Proc. VLDB Endow. 16 (12) (2023)
     pp. 757–768.                                                                                   3962–3965.
[15] G. Cormode, C. Procopiuc, D. Srivastava, E. Shen, T. Yu, Differentially private           [26] Z. Huang, J. Liu, D.G. Alabi, R.C. Fernandez, E. Wu, Saibot: A differentially
     spatial decompositions, in: 2012 IEEE 28th International Conference on Data                    private data search platform, Proc. VLDB Endow. (PVLDB) 16 (11) (2023) PVLDB
     Engineering, IEEE, 2012, pp. 20–31.                                                            2023 demo / system paper.
[16] J. Hua, Y. Gao, S. Zhong, Differentially private publication of general time-             [27] Y. Dai, J. Shao, C. Wei, D. Zhang, H.T. Shen, Personalized semantic trajectory
     serial trajectory data, in: 2015 IEEE Conference on Computer Communications,                   privacy preservation through trajectory reconstruction, World Wide Web 21
     INFOCOM, IEEE, 2015, pp. 549–557.                                                              (2018) 875–914.
[17] Z. Zhang, X. Xu, F. Xiao, LGAN-DP: A novel differential private publication               [28] K. Zuo, R. Liu, J. Zhao, Z. Shen, F. Chen, Method for the protection of
     mechanism of trajectory data, Future Gener. Comput. Syst. 141 (2023) 692–703.                  spatiotemporal correlation location privacy with semantic information, J. Xidian
[18] Y. Hu, Y. Du, Z. Zhang, Z. Fang, L. Chen, K. Zheng, Y. Gao, Real-time trajectory               Univ. 49 (1) (2022) 67–77.
     synthesis with local differential privacy, in: 2024 IEEE 40th International               [29] S. Denisov, H.B. McMahan, J. Rush, A. Smith, A. Guha Thakurta, Improved
     Conference on Data Engineering, ICDE, IEEE, 2024, pp. 1685–1698.                               differential privacy for sgd via optimal private linear operators on adaptive
[19] R. Zhang, W. Ni, N. Fu, L. Hou, D. Zhang, Y. Zhang, DP-LTGAN: Differentially                   streams, Adv. Neural Inf. Process. Syst. 35 (2022) 5910–5924.
     private trajectory publishing via Locally-aware Transformer-based GAN, Future             [30] H. Fang, X. Li, C. Fan, P. Li, Improved convergence of differential private sgd
     Gener. Comput. Syst. 166 (2025) 107686.                                                        with gradient clipping, in: The Eleventh International Conference on Learning
[20] S. Jiao, J. Cheng, Z. Huang, T. Li, T. Xie, W. Chen, Y. Ma, X. Wang, DPKnob: A                 Representations, 2023.
     visual analysis approach to risk-aware formulation of differential privacy schemes        [31] J. Fu, coauthors, DPSUR: Accelerating differentially private training via selective
     for data query scenarios, Vis. Inform. 8 (3) (2024) 42–52.                                     updates and release, Proc. VLDB Endow. (PVLDB) 17 (2024) PVLDB paper; PDF
[21] X. Wang, S. Jiao, C. Bryan, Defogger: A visual analysis approach for data                      available from VLDB site.
     exploration of sensitive data protected by differential privacy, IEEE Trans. Vis.         [32] Y. Zheng, Trajectory data mining: an overview, ACM Trans. Intell. Syst. Technol.
     Comput. Graphics 31 (1) (2025) 448–458, http://dx.doi.org/10.1109/TVCG.                        (TIST) 6 (3) (2015) 1–41.
     2024.3456304.                                                                             [33] M.E. Andrés, N.E. Bordenabe, K. Chatzikokolakis, C. Palamidessi, Geo-
[22] R. Chen, B.C.M. Fung, B.C. Desai, Differentially private trajectory data                       indistinguishability: Differential privacy for location-based systems, in: Proceed-
     publication, 2011, arXiv:1112.2020, URL https://arxiv.org/abs/1112.2020.                       ings of the 2013 ACM SIGSAC Conference on Computer & Communications
[23] C. Yin, J. Xi, R. Sun, J. Wang, Location privacy protection based on differential              Security, 2013, pp. 901–914.
     privacy strategy for big data in industrial internet of things, IEEE Trans. Ind.          [34] W. Zhang, M. Li, R. Tandon, H. Li, Semantic-aware privacy-preserving online
     Inform. 14 (8) (2017) 3628–3636.                                                               location trajectory data sharing, IEEE Trans. Inf. Forensics Secur. 17 (2022)
[24] Y. Zhao, C. Wang, E. Zhao, X. Zheng, H. Lin, PerTrajTree-DP: A personalized                    2292–2306.
     privacy-preserving trajectory publishing framework for trustworthy AI systems,            [35] J. Yuan, Y. Zheng, C. Zhang, W. Xie, X. Xie, G. Sun, Y. Huang, T-drive: driving
     in: Data Security and Privacy Protection, Springer Nature Singapore, Singapore,                directions based on taxi trajectories, in: Proceedings of the 18th SIGSPATIAL
     ISBN: 978-981-95-3182-0, 2026, pp. 57–75.                                                      International Conference on Advances in Geographic Information Systems, 2010,
                                                                                                    pp. 99–108.
                                                                                               [36] Y. Zheng, X. Xie, W.-Y. Ma, et al., GeoLife: A collaborative social networking
                                                                                                    service among user, location and trajectory, IEEE Data Eng. Bull. 33 (2) (2010)
                                                                                                    32–39.
                                                                                               [37] Y. Zhao, C. Wang, L. Li, X. Wang, H. Lin, Z. Liu, TrajMamba: A multi-scale
                                                                                                    mamba-based framework for joint trajectory and road network representation
                                                                                                    learning, 2025, https://ssrn.com/abstract=5624451.


                                                                                          10