Computer Standards & Interfaces 97 (2026) 104113

Contents lists available at ScienceDirect

Computer Standards & Interfaces

journal homepage: www.elsevier.com/locate/csi

Co-distillation-based defense framework for federated knowledge graph embedding against poisoning attacks

Yiqin Lu, Jiarui Chen ∗, Jiancheng Qin

School of Electronic and Information Engineering, South China University of Technology, 510641, China
ARTICLE INFO

Keywords:
Federated learning
Knowledge graph
Poisoning attack
Knowledge distillation

ABSTRACT

Federated knowledge graph embedding (FKGE) enables collaborative knowledge sharing without data exchange, but it also introduces risks of poisoning attacks that degrade model accuracy or force incorrect outputs. Protecting FKGE from poisoning attacks has therefore become a critical research problem. This paper reveals the malicious strategy of untargeted FKGE poisoning attacks and proposes CoDFKGE, a co-distillation-based FKGE framework for defending against poisoning attacks. CoDFKGE deploys two collaborative knowledge graph embedding models on each client, decoupling prediction parameters from shared parameters as a model-agnostic solution. By designing distinct distillation loss functions, CoDFKGE transfers clean knowledge from potentially poisoned shared parameters while compressing dimensions to reduce communication overhead. Experiments show that CoDFKGE preserves link prediction performance with lower communication costs, eliminates malicious manipulations under targeted poisoning attacks, and significantly mitigates accuracy degradation under untargeted poisoning attacks.
∗ Corresponding author.
E-mail addresses: eeyqlu@scut.edu.cn (Y. Lu), ee_jrchen@mail.scut.edu.cn (J. Chen), jcqin@scut.edu.cn (J. Qin).
https://doi.org/10.1016/j.csi.2025.104113
Received 3 June 2025; Received in revised form 8 November 2025; Accepted 8 December 2025
Available online 9 December 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

1. Introduction

Knowledge graphs (KGs) are structured representations of real-world entities and their relationships, supporting applications in search engines [1,2], recommendation systems [3,4], and security analysis [5,6]. Knowledge graph embedding (KGE) techniques project entities and relations into low-dimensional vector spaces, enabling efficient knowledge reasoning and completion [7]. Due to privacy regulations and data sensitivity requirements, KGs across organizations within the same domain remain fragmented despite growing data volumes. In this context, federated knowledge graph embedding (FKGE) emerges as a collaborative learning technique for sharing KG embeddings without data exchange. However, the introduction of federation mechanisms brings new security risks: malicious participants can inject poisoned parameters during training or aggregation to launch a poisoning attack, degrading model accuracy or forcing incorrect outputs. Consequently, protecting FKGE systems against poisoning attacks has emerged as a critical research challenge.

Unlike graph neural network (GNN)-based models, KGE models usually rely on the translation-based model [8–11]. The embedding vectors of entities and relations in the KG are directly used as learnable parameters. KGE models utilize different score functions to measure the plausibility of triples (h, r, t). By contrasting the outputs of existing triples and negatively sampled triples, KGE models derive appropriate embeddings for entities and relations. However, real-world KGs of different organizations are often incomplete, making it difficult to train high-quality knowledge graph reasoning models. Moreover, KG data often contains a large amount of private data, and direct data sharing will inevitably lead to privacy leakage. For this reason, federated learning [12] is introduced into knowledge graph reasoning.

FKGE assumes that there are multiple participants with complementary but incomplete KGs, aiming to derive optimal knowledge embeddings for each participant without data exchange. Most existing studies [13–15] model FKGE as multiple clients that maintain local KGE models and a central server. Clients train models locally and upload the model parameters to the central server, which aggregates the parameters and then returns them to the clients.

However, since the embedding vectors are directly the model parameters, FKGE is highly vulnerable to poisoning attacks. With the intent to reduce model performance, steal sensitive information, or disrupt system stability, poisoning attacks refer to malicious modifications of parameters during local training or parameter aggregation on the server. To protect the participants of FKGE, it is necessary to propose a protection mechanism against FKGE poisoning attacks.

Moreover, other related indicators in FKGE deserve attention. For example, the federated learning of KGE requires frequent parameter exchange, and the use of a translation-based model requires submitting the entity or relation embeddings, which makes the communication overhead greater than that of traditional federated learning.

Knowledge distillation [16] is a model compression technique that improves the performance of a simple (student) model by transferring the knowledge from a complex (teacher) model. Distillation-based methods are considered a feasible solution to combat poisoning attacks [17–19]: a teacher model can extract clean knowledge from the poisoned parameters and transfer it to a student model, thereby improving robustness without changing the model structure. Co-distillation [20] is a variant of knowledge distillation that trains two or more models simultaneously, allowing mutual learning and information sharing. This paper aims to design a federated knowledge graph defense framework based on co-distillation, which can enhance the model's resistance to poisoning attacks through collaborative learning without changing the original FKGE architecture.

The rest of this paper is organized as follows. Section 2 reviews the related work on FKGE and knowledge distillation. Section 3 introduces the preliminary concepts and methodologies essential for addressing FKGE poisoning attacks, with the main contributions of this paper summarized at the end of this section. In Section 4, we detail the threat model and malicious strategies for targeted and untargeted poisoning attacks in FKGE. Section 5 presents the CoDFKGE framework for defending against FKGE poisoning attacks, followed by experimental validation in Section 6. Finally, concluding remarks and future research directions are outlined in Section 7.

2. Related work

2.1. Basic FKGE framework

Early research on FKGE mainly focused on how to achieve cross-client knowledge sharing and model aggregation while protecting data privacy. FedE [13] is the first work to introduce federated learning into KGE. FedE facilitates cross-client knowledge sharing by maintaining an entity table. Nevertheless, the mechanism of sharing entity embeddings in FedE has been proven to contain privacy vulnerabilities [21]: attackers can leverage the embedding information to infer the existence of private triples within client datasets. Based on FedE, FedEC [14] applies embedding contrastive learning to tackle data heterogeneity and utilizes a global update procedure for sharing entity embeddings. In response to the privacy vulnerability of FedE, FedR [15] proposed a privacy-preserving relation embedding aggregation method. By sharing relation embeddings instead of entity embeddings, FedR can significantly reduce the communication overhead and privacy leakage risks while retaining the semantic information of the KG.

2.2. Knowledge distillation in FKGE

Knowledge distillation techniques are widely applied in the FKGE field due to their advantages in model compression and knowledge transfer. To cope with the drift between local optimization and global convergence caused by data heterogeneity, FedLU [22] proposes mutual knowledge distillation. Moreover, it contains an unlearning method to erase specific knowledge from local clients. FedKD [23] uses knowledge distillation to reduce communication costs, and proposes to adaptively learn a temperature to scale the scores of triples to mitigate teacher over-confidence issues. In addition to FKGE, the KGE model ColE [24] proposes co-distillation learning to exploit the complementarity of graph structure and text information. It employs a Transformer and BERT for graph and text respectively, then distills selective knowledge from each other's prediction logits. Overall, existing research on knowledge distillation in FKGE primarily focuses on handling data heterogeneity, with insufficient exploration of its potential value in model security. This paper explores the application of knowledge distillation in FKGE security to defend against poisoning attacks.

2.3. Poisoning attack in federated learning

Federated learning (FL), due to its distributed training nature, creates favorable conditions for poisoning attacks while protecting data privacy. Poisoning attacks in federated learning have attracted significant attention from researchers [25]. In federated learning scenarios, poisoning attacks pose serious threats to model security by manipulating partial training data or local models to embed malicious behaviors [26]. The literature [27] generates stealthy backdoor triggers by extracting high-frequency features from images using the discrete wavelet transform and introduces an asymmetric frequency confusion mechanism, achieving efficient backdoor attacks on multiple datasets. Meanwhile, many studies have proposed defense methods against poisoning attacks. The literature [28] proposes the Krum method, which selects the most reliable gradient update by evaluating the consistency of gradients, thereby effectively defending against poisoning attacks. The literature [29] proposes FL-Defender, which improves robustness by introducing cosine similarity to adjust the weights of parameter aggregation. The literature [30] proposed a two-stage backdoor defense method called MCLDef based on model contrastive learning (MCL), which can significantly reduce the success rate of backdoor attacks with only a small amount of clean data. In summary, existing research on poisoning attacks in federated learning mainly focuses on traditional deep learning domains. The design ideas of these defense frameworks have laid the foundation for subsequent poisoning attack defense methods for FKGE.

2.4. Security issues in FKGE

With the development of FKGE, its security and privacy issues have attracted increasing attention, with existing research mainly focusing on defense against privacy leakage. The literature [31] proposed a decentralized scalable learning framework where embeddings from different KGs can be learned in an asynchronous and peer-to-peer manner while being privacy-preserving. The literature [21] conducts the first holistic study of the privacy threat on FKGE from both attack and defense perspectives. It introduced three new inference attacks and proposed a differentially private FKGE model, DP-Flames, with private selection and an adaptive privacy budget allocation policy. Based on [21], the literature [32] introduces five new inference attacks and proposes PDP-Flames, which leverages the sparse gradient nature of FKGE for a better privacy-utility trade-off.

Compared with privacy leakage issues, research on defending against poisoning attacks in FKGE is still in its early stages. Traditional federated learning typically does not directly transmit original embeddings. However, entity and relation embeddings are core components in translation-based KGE, so direct transmission of embeddings is required during FKGE aggregation. Direct malicious modifications to embeddings are difficult to defend against effectively using traditional federated learning defense methods.

The recent literature [33] is the first work to systematize the risks of FKGE poisoning attacks. However, it primarily focuses on several forms of targeted poisoning attacks in FKGE, without mentioning untargeted poisoning attacks. Although this research provides some defense suggestions, such as zero-knowledge proofs and private set intersection, it does not propose specific defense methods. In summary, the existing research lacks a systematic introduction to the untargeted poisoning attack in FKGE, and there is no complete defense method against FKGE poisoning attacks.

To address the above issues, this paper reveals the malicious strategy of FKGE untargeted poisoning attacks and proposes CoDFKGE, a co-distillation-based federated knowledge graph embedding framework for defending against poisoning attacks. The main contributions of this paper are summarized as follows.
(1) We systematically define untargeted poisoning attacks in FKGE and reveal the poisoning attacks' malicious strategy, thereby enhancing threat identification in FKGE and providing a foundation for subsequent defense research.

(2) We propose CoDFKGE, the first co-distillation defense framework against poisoning attacks in FKGE. By deploying bidirectional distillation models with distinct distillation losses on the client side, CoDFKGE, as a model-agnostic solution, decouples prediction parameters from shared parameters, thereby enhancing the model's resistance to poisoning attacks and improving robustness. We design distinct distillation loss functions for the two models in CoDFKGE, enabling it to transfer clean knowledge from potentially poisoned shared parameters and to compress shared parameter dimensions, which reduces communication overhead.

(3) We validate the performance of CoDFKGE against poisoning attacks through experiments. The results show that, without compromising link prediction performance, CoDFKGE can completely eliminate targeted poisoning attacks and significantly mitigate the performance degradation caused by untargeted poisoning attacks, while simultaneously reducing communication overhead. Ablation experiments further confirm the effectiveness of the two distillation loss functions in CoDFKGE.

3. Preliminaries

3.1. Knowledge graph embedding

A KG can be represented as 𝒢 = (ℰ, ℛ, 𝒯), where ℰ and ℛ are the entity set and relation set. 𝒯 is a set of triples, where a triple (h, r, t) ∈ 𝒯 indicates that a relation r ∈ ℛ connects the entities h, t ∈ ℰ.

Translation-based KGE models project the entities and relations of a KG into a continuous vector space. Models employ the scoring function 𝑔(h, r, t; 𝜃) to evaluate the plausibility of triples, where 𝜃 represents the embedding parameters. During model training, negative samples (h, r, t′) are constructed by randomly replacing the tail entities of positive triples. The training process aims to maximize the score discrepancy between positive and negative samples. Currently, most KGE models [9,11] employ the binary cross-entropy loss to measure the difference between positive and negative samples. Its mathematical expression is given in Eq. (1).

𝐿 = − ∑_{(h,r,t)∈𝒯} ( log 𝜎(𝑔(h, r, t; 𝜃) − 𝛾) + ∑_i 𝑝(h, r, t′_i; 𝜃) log 𝜎(𝛾 − 𝑔(h, r, t′_i; 𝜃)) )   (1)

Among them, 𝛾 represents the margin, and (h, r, t′_i) is the i-th negative triple. 𝑝(h, r, t′_i; 𝜃) stands for the occurrence probability of this negative sample given the embedding parameters 𝜃.

3.2. Federated knowledge graph embedding

FKGE is an application of federated learning that aims to fuse and share knowledge vectors from different KGs to enhance the effectiveness of KGE. Currently, most related studies are based on the framework proposed in FedE [13].

The basic framework of FKGE consists of a client set 𝐶 and a central server 𝑆. Each client c ∈ 𝐶 holds a local KG 𝒢_c(ℰ_c, ℛ_c, 𝒯_c). The entity sets of different KGs partially overlap, so the understanding of entities in a certain client can be supplemented by information from other clients. The server holds the one-hot existence matrix 𝑀 ∈ ℝ^{C×N} of all entities across clients, where 𝑁 is the number of entities.

In each client, the KGE model parameters consist of local parameters 𝜃_L and shared parameters 𝜃_S. During FKGE training, each epoch progresses through two sequential phases: client update and server aggregation. In the k-th client update stage, client c first trains its local KGE model to update its local embedding 𝜃_{L_c}^k and server-shared embedding 𝜃_{S_c}^k. Then, client c uploads its shared embedding 𝜃_{S_c}^k to the server. In the server aggregation stage, the central server 𝑆 aggregates the shared embeddings from all clients to obtain the shared parameters 𝜃_S^{k+1}. Finally, the server broadcasts the shared parameters 𝜃_S^{k+1} to all clients. Entity embeddings in KGE are usually the shared parameters, while relation embeddings are local parameters. Only rare literature [15] uses relation embeddings as shared parameters.

In FKGE, how the server effectively aggregates shared embeddings from different clients is a common problem. The most common FKGE server aggregation method is FedE [13], which is an improvement on FedAvg [12]. To handle the imbalance in the number of entities across different clients, FedE aggregates the shared entities using the number of occurrences in the local data as the weight 𝑤_c. This weight value can be obtained from the existence matrix 𝑀 mentioned above. The mathematical expression for FedE's server aggregation is shown in Eq. (2).

𝜃_S^{k+1} = ∑_c 𝑤_c 𝜃_{S_c}^k   (2)

The final target of FKGE is to minimize the loss functions over all clients' local triples simultaneously through federated learning. Its optimization objective can be expressed as Eq. (3).

arg min_{(𝜃_{L_c}, 𝜃_{S_c})} ∑_c^C ℒ_c(𝜃_{L_c}, 𝜃_{S_c})   (3)

3.3. Knowledge distillation

Knowledge distillation is a model compression technique that transfers the knowledge contained in a complex model (teacher) to a simple model (student) to improve the performance of the simple model. In the classic knowledge distillation framework, the student model's training loss comprises two components: the cross-entropy loss 𝐿_CE, computed between its output and the true label, and the distillation loss 𝐿_KD, computed between its output and the teacher model's output (soft label). In practical applications, the distillation loss is usually quantified using the Kullback–Leibler divergence 𝐷_KL between the student model output and the soft label, and its mathematical expression is shown in Eq. (4).

𝐷_KL(𝑝_tea ∥ 𝑝_stu) = ∑_i 𝑝_tea(i) log ( 𝑝_tea(i) / 𝑝_stu(i) )
𝐿_KD = 𝜏² 𝐷_KL( 𝜎(𝑧_tea^{(n)}) ∥ 𝜎(𝑧_stu^{(n)}) ), where 𝜎(𝑥) = softmax(𝑥 / 𝜏)   (4)

Among them, 𝑧_tea and 𝑧_stu are the logits of the teacher model and student model, respectively. 𝜏 is the temperature coefficient, which is used to control the smoothness of the output.

To allow the student model to effectively absorb the knowledge contained in the teacher model while fitting the real data distribution, the final loss function is usually the weighted sum of 𝐿_CE and 𝐿_KD.

4. Threat model

Poisoning attacks in federated learning can be categorized into targeted poisoning attacks, semi-targeted poisoning attacks, and untargeted poisoning attacks according to the intention of the attacker [34]. In FKGE, a semi-targeted poisoning attack can be regarded as a special case of a targeted poisoning attack. Therefore, this paper focuses on the targeted and untargeted poisoning attack types.

4.1. Targeted poisoning attack

Targeted poisoning attacks are an attack strategy in which the attacker crafts specific malicious triples that do not exist in the target system, and manipulates the target model into accepting these fake triples by injecting poisoned parameters into the shared parameters. This type of attack poses a serious threat to the application of FKGE, as the false relationships it introduces can lead to reasoning errors and decision-making biases in downstream tasks. For example, in financial transaction networks, a knowledge graph is constructed with transaction entities as nodes and transaction relationships as edges. Link prediction can then be applied to detect potential transaction relationships (such as money laundering or fraud). If an attacker compromises one of the participants, they can introduce false transaction relationships through targeted poisoning attacks, leading to unreasonable inferences about the victim entity.

Fig. 1. Process of targeted poisoning attack.

Fig. 2. Framework of CoDFKGE model.

To execute such an attack successfully, the attacker typically follows a multi-stage process that begins with gathering the victim's local information. Fig. 1 shows the process of a targeted poisoning attack. In FKGE systems, while the server can observe the entities and relations each client possesses, it lacks visibility into how these elements are structured into specific triples. However, for frameworks that share entity embeddings (such as FedE [13]), recent research [21] has shown that a malicious server can use the KGE scoring function to infer the victim's local relationship patterns and reconstruct the victim's triples 𝒯_v. Armed with this inferred knowledge, the attacker strategically constructs malicious triples 𝒯_m that align with the victim's existing KG schema but represent false information.

The next critical attack phase involves training a shadow model, a surrogate KGE model designed to mimic the victim's learning process. The shadow model is trained on a poisoned dataset 𝒯_p, which combines the inferred victim triples 𝒯_v and the malicious triples 𝒯_m. This training strategy ensures the shadow model learns to generate embeddings that are consistent with both the victim's genuine knowledge and the attacker's deceptive information. The shadow model's parameters include 𝜃_{S_p}, which can be initialized with the victim's shared parameters 𝜃_{S_c}, and 𝜃_{L_p}, which approximates the victim's local model parameters 𝜃_{L_c} from random initial values. To ensure the shadow model effectively bridges both the victim's genuine knowledge and the attacker's malicious objectives, its parameters are optimized to minimize the loss function across all triples in the poisoned dataset, as formalized in Eq. (5).

arg min_{(𝜃_{S_p}, 𝜃_{L_p})} ∑_{(h,r,t)∈𝒯_p} 𝐿(h, r, t; 𝜃_{S_p}, 𝜃_{L_p})   (5)

where 𝐿 is the loss function of the baseline model.

After training the shadow model, the attacker extracts the poisoned shared parameters 𝜃_{S_p} using the same procedure that legitimate clients employ to prepare parameters for server aggregation. The attacker can then aggregate the poisoned parameters 𝜃_{S_p} with the normal clients' shared parameters. The attacker usually operates as a compromised server and assigns a disproportionately high weight to the poisoned parameters during the aggregation process to ensure that they dominate the aggregated shared parameters.

The final stage of the attack exploits the implicit trust in federated systems. The victim client, unaware of the poisoning, directly incorporates the compromised aggregated parameters into its local training process without validation. As a result, the victim's model gradually learns to accept the malicious triples as valid, ultimately producing incorrect predictions on these non-existent relationships while maintaining seemingly normal performance on other parts of the KG.
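The aggregation-weighting step described above can be illustrated numerically. The following is a minimal sketch, assuming a FedE-style weighted average in the spirit of Eq. (2) with toy 2-dimensional entity embeddings; all variable names and values are hypothetical, not taken from the paper's experiments.

```python
import numpy as np

def aggregate(shared_params, weights):
    """Weighted aggregation of per-client shared embeddings,
    in the style of FedE: theta_S = sum_c w_c * theta_{S_c}."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()      # normalize aggregation weights
    stacked = np.stack(shared_params)      # shape: (num_clients, num_entities, dim)
    return np.tensordot(weights, stacked, axes=1)

# Two honest clients and one attacker-controlled (shadow-model) update.
honest_a = np.array([[1.0, 0.0], [0.0, 1.0]])
honest_b = np.array([[0.9, 0.1], [0.1, 0.9]])
poisoned = np.array([[-5.0, 5.0], [5.0, -5.0]])

fair   = aggregate([honest_a, honest_b, poisoned], [1, 1, 1])
skewed = aggregate([honest_a, honest_b, poisoned], [1, 1, 8])  # compromised server inflates its own weight

# With the inflated weight, the aggregate moves much closer to the poisoned update.
dist_fair   = np.linalg.norm(fair - poisoned)
dist_skewed = np.linalg.norm(skewed - poisoned)
```

Under equal weights the poisoned update is diluted by the honest clients, but once the compromised server inflates its own weight the aggregate is dominated by the poisoned parameters, which the victim then trains on without validation.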
4
|
||
Y. Lu et al. Computer Standards & Interfaces 97 (2026) 104113
|
||
|
||
|
||
4.2. Untargeted poisoning attack facilitate the reproducibility of our CoDFKGE model, we provide the
|
||
complete training framework pseudocode as shown in Algorithm 1.
|
||
The conditions for achieving a targeted poisoning attack are com-
|
||
plex. For example, FedR [15] shares only relation embeddings (not
|
||
Algorithm 1 CoDFKGE Training Framework
|
||
entity embeddings), preventing attackers from inferring victim rela-
|
||
tions via entity matrices and thus avoiding targeted poisoning attacks. Require: Baseline KGE model 𝑔, Training triples , Learning rate 𝜂,
|
||
Even with relational data leaks, targeted poisoning attacks are difficult. Distillation weight 𝛽, Distillation temperature 𝜏, Total iterations 𝐾
|
||
Compared with sharing entity embeddings, the sparsity of relation Initialization:
|
||
embeddings reduces the shadow model’s ability to align parameters 1: Initialize client-side prediction model with 𝜃0𝑃 = (𝜃0𝑆 , 𝜃0𝐿 ) ⊳ Local
|
||
with the victim’s vector space. However, FedR has almost no defense parameters randomly initialized
|
||
2: Initialize client-side communication model with reduced feature
|
||
effect against untargeted poisoning attacks.
|
||
dimensions
|
||
An untargeted poisoning attack means that the attacker aims to dis-
|
||
3: Initialize server-side aggregated parameters 𝜃1𝑆 = 𝜃0𝑆 ⊳ First round
|
||
rupt victim model convergence or maximize the mispredictions among
|
||
initialization
|
||
test cases. By maximizing the victim’s loss function during training,
|
||
Main Training Loop (Iterations 𝑘 = 1, 2, ..., 𝐾):
|
||
attackers can force non-convergent predictions. The attacker can gen-
|
||
// Client Update Phase (For each client)
|
||
erate the poisoned shared parameter 𝜃𝑆∗ for the victim, which can be
|
||
𝑣 4: for each client 𝑐 ∈ 𝐶 do
|
||
formalized in Eq. (6).
|
||
∑ 5: // Step 1: Communication to Prediction Model Distillation
|
||
arg max 𝐿(ℎ, 𝑟, 𝑡; 𝜃𝑆∗ , 𝜃𝐿𝑣 ) (6) 6: Load server-shared parameters 𝜃𝑘𝑆 ⊳ Latest global shared
|
||
𝜃∗𝑆𝑣 (ℎ,𝑟,𝑡)∈𝑣
|
||
𝑣
|
||
embeddings
|
||
𝐶𝐿
|
||
Among them, 𝜃𝐿𝑣 denotes the victim’s local parameters. 𝑣 is the 7: Initialize communication model with 𝜃 𝐶 = (𝜃𝑘𝑆 , 𝜃𝑘−1 )
|
||
8: Freeze communication model parameters ⊳ Act as teacher
|
||
victim’s triplet set. Since it is difficult for the attacker to obtain these
|
||
model
|
||
two parameters directory, they can use random values as guesses for 𝑃
|
||
9: Compute distillation loss 𝐿𝑘 𝐾𝐷 using Equation (7) ⊳ Only
|
||
𝜃𝐿𝑣 and use triples of random combinations of 𝑣 and as guesses for
|
||
positive samples
|
||
𝑣 . 𝑃
|
||
10: Compute KGE loss 𝐿𝑘 𝐾𝐺𝐸 on training triples
|
||
In particular, for the TransE model [7] with the scoring function 𝑃 𝑃
|
||
𝑔(ℎ, 𝑟, 𝑡) = |ℎ + 𝑟 − 𝑡|, the attacker can launch an untargeted poisoning 11: Update prediction model parameters (𝜃𝑘 𝑆 , 𝜃𝑘 𝐿 ) with:
|
||
𝑃 𝑃
|
||
attack by setting the shared parameter 𝜃𝑆′ sent to the victim to identical 12: ∇𝜃𝑘𝑃 = ∇(𝛽𝐿𝑘 𝐾𝐺𝐸 + (1 − 𝛽)𝐿𝑘 𝐾𝐷 )⊳ Gradient flows through
|
||
𝑣
|
||
value or using negative aggregation parameters. To avoid detection, prediction model only
|
||
𝑃 𝑃
|
||
noise is often added to poisoned parameters. The prediction perfor- 13: 𝜃𝑘 = 𝜃𝑘 − 𝜂∇𝜃𝑘𝑃 , 𝑤ℎ𝑒𝑟𝑒 𝜃𝑘 = {𝜃𝑘 𝐿 , 𝜃𝑘 𝑆 } ⊳ Update
|
||
mance of the victim model may even be lower than that of standalone prediction model parameters
|
||
training without federated aggregation. 14: Unfreeze communication model parameters
|
||
In general, the success of FKGE poisoning attacks relies on vic- 15: // Step 2: Prediction to Communication Model Distillation
|
||
tims using attacker-provided aggregate parameters directly for training 16: Freeze prediction model parameters 𝜃𝑘𝑃 ⊳ Used as teacher
|
||
without validation. To prevent poisoning attacks, it is critical to isolate model
|
||
𝐶
|
||
the parameters of the prediction model from externally provided aggre- 17: Compute distillation loss 𝐿𝑘 𝐾𝐷 using Equation (9) ⊳ Both
|
||
gate parameters. Specifically, potentially poisoned shared parameters samples
|
||
𝐶 𝐶
|
||
must be filtered before training. Meanwhile, minimizing parameter ex- 18: Update communication model parameters (𝜃𝑘 𝑆 , 𝜃𝑘 𝐿 ) with
|
||
𝐶
|
||
posure to the external environment is essential. Therefore, we propose 19: ∇𝜃𝑘𝐶 = ∇𝐿𝑘 𝐾𝐷 ⊳ Gradient flows through communication
|
||
CoDFKGE, a defense FKGE framework based on co-distillation. model only
|
||
𝐶 𝐶
|
||
20: 𝜃𝑘 = 𝜃𝑘 − 𝜂∇𝜃𝑘𝐶 , 𝑤ℎ𝑒𝑟𝑒 𝜃𝑘 = {𝜃𝑘 𝑆 , 𝜃𝑘 𝐿 }
|
||
𝐶
|
||
5. Model design 21: Upload updated shared parameters 𝜃𝑘 𝑆 to server
|
||
22: Unfreeze prediction model parameters
|
||
CoDFKGE is a training framework on the client side. Its training 23: end for
|
||
process is shown in Fig. 2. CoDFKGE initializes two baseline models // Server Aggregation Phase
|
||
with the same structure and scoring function, but for different purposes. 24: Server aggregates 𝜃𝑘𝑆 + 1 from all clients using baseline federated
|
||
aggregate method.
25: Set k = k + 1 and repeat main loop until k > K   ⊳ Continue Main Training Loop
return Final prediction model parameters of each client.

The communication model is mainly responsible for receiving and processing shared parameters, while the prediction model is used for the final embedding and prediction. To minimize potential parameter leakage and communication overhead, the feature dimension of the communication model is intentionally designed to be smaller than that of the prediction model.

During the training process, the two models learn collaboratively through knowledge distillation. Once the communication model receives the potentially poisoned shared parameters from the server, it acts as a teacher model to transfer clean knowledge to the prediction model. Following the training of the prediction model, the roles are reversed: the prediction model becomes the teacher, and the communication model serves as the student for distillation. This stage extracts knowledge from the prediction model and compresses it into the communication model, ensuring efficient knowledge sharing while minimizing parameter exposure and communication overhead. By deploying two distinct model instances, the framework physically isolates attacker-injected parameters from the prediction model's parameters, making poisoning attacks significantly more difficult to execute.

CoDFKGE is designed to be model-agnostic, enabling seamless integration with diverse FKGE models based on their shared parameter types. Both the communication and prediction models used by CoDFKGE clients utilize the same scoring function g as the original KGE model. Clients upload and utilize shared parameters identically to the baseline model, with these parameters maintaining the same form and dimensionality as the original implementation. This parameter compatibility enables the server to aggregate updates using existing federated learning aggregation methods without modification. This design ensures that CoDFKGE preserves the original knowledge representation capabilities while maintaining consistent operational semantics with the baseline model.
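The co-distillation round described above can be sketched in code. This is a minimal, illustrative PyTorch-style sketch, not the authors' implementation: the function names, tensor shapes, and manual SGD update are all hypothetical stand-ins for the full FKGE training stack, using a TransE-style scorer for concreteness.

```python
import torch
import torch.nn.functional as F

# Two TransE-style scorers per client: a small "communication" model that
# loads the server-shared (possibly poisoned) parameters, and a larger
# local "prediction" model that never touches them directly.

def transe_score(ent, rel, h, r, t):
    """TransE score -||h + r - t|| for index tensors h, r, t."""
    return -(ent[h] + rel[r] - ent[t]).norm(p=2, dim=-1)

def distill_step(teacher_ent, teacher_rel, student_ent, student_rel,
                 triples, tau=2.0, lr=1e-2):
    """One KD step: teacher frozen, gradients flow only into the student."""
    h, r, t = triples.unbind(dim=1)
    with torch.no_grad():  # teacher outputs serve as soft labels only
        t_scores = transe_score(teacher_ent, teacher_rel, h, r, t)
    s_scores = transe_score(student_ent, student_rel, h, r, t)
    # KL between tempered softmaxes over the batch of positive triples
    loss = tau**2 * F.kl_div(F.log_softmax(s_scores / tau, dim=0),
                             F.softmax(t_scores / tau, dim=0),
                             reduction="batchmean")
    loss.backward()
    with torch.no_grad():  # plain SGD stands in for the real optimizer
        for p in (student_ent, student_rel):
            p -= lr * p.grad
            p.grad = None
    return loss.item()

# --- toy client: 20 entities, 4 relations ---
torch.manual_seed(0)
comm_ent = torch.randn(20, 8, requires_grad=True)   # 8-dim communication model
comm_rel = torch.randn(4, 8, requires_grad=True)
pred_ent = torch.randn(20, 32, requires_grad=True)  # 32-dim prediction model
pred_rel = torch.randn(4, 32, requires_grad=True)
triples = torch.tensor([[0, 1, 2], [3, 0, 4], [5, 2, 6]])

# Phase 1: communication model (teacher) -> prediction model (student)
l1 = distill_step(comm_ent, comm_rel, pred_ent, pred_rel, triples)
# Phase 2: roles reversed, compressing knowledge back before upload
l2 = distill_step(pred_ent, pred_rel, comm_ent, comm_rel, triples)
```

Wrapping the teacher's forward pass in `torch.no_grad()` is what blocks gradient flow back into the (possibly poisoned) shared parameters, mirroring the isolation argument above; it also lets the two models use different embedding dimensions, since only scalar scores are compared.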
5.1. Communication to prediction model distillation

In the first iteration, the model trains the prediction component following the standard procedure. Starting from the second iteration of the training process, the communication model loads the server-shared parameters θ_k^S and initializes itself jointly with the local embeddings θ_{k-1}^{P_L} from the previous iteration's local prediction model.

After the communication model receives and applies the server-shared parameters, it filters out potentially poisoned model parameters through knowledge distillation. The communication model acts as a teacher model to transfer clean knowledge to the prediction model, which serves as the student model. During this process, the communication model parameters are frozen to ensure that the knowledge transfer direction is strictly from the communication model to the prediction model: gradients flow only through the prediction model parameters, preventing gradient leakage back to the potentially poisoned shared parameters.

If the communication model suffers from a poisoning attack and contains poisoned parameters, its outputs for negative samples are not reliable. Distilling such uncertain predictions would propagate noise rather than useful knowledge. To exclude the poisoned knowledge, the prediction model should focus on positive samples during distillation, ensuring that only trustworthy knowledge is transferred. The distillation loss of the prediction model in the kth training epoch is provided in Eq. (7):

L_{KD}^{P_k} = \tau^2 \sum_{(h,r,t) \in \mathcal{T}} D_{KL}\!\left( \sigma\big(g(h,r,t;\, \theta_k^S, \theta_{k-1}^{P_L})\big) \,\big\|\, \sigma\big(g(h,r,t;\, \theta_k^{P_S}, \theta_k^{P_L})\big) \right)    (7)

Among them, \mathcal{T} denotes the client's local training triples, τ is the distillation temperature coefficient, and σ is the softmax function applied to the ratio of the model output to τ. g represents the scoring function of the prediction model, which is used to compute the KGE loss. g(h,r,t; θ_k^S, θ_{k-1}^{P_L}) represents the communication model output under the server-shared parameter θ_k^S and local parameter θ_{k-1}^{P_L}, and g(h,r,t; θ_k^{P_S}, θ_k^{P_L}) represents the output of the prediction model being trained.

When training with distillation, the model also needs to consider the KGE loss function. The overall loss function of the prediction model is the weighted sum of the KGE loss and the distillation loss, as shown in Eq. (8):

L_k^P = \beta L_{KGE}^{P_k} + (1-\beta) L_{KD}^{P_k}    (8)

where L_{KGE}^{P_k} is the KGE loss of the kth epoch of the prediction model defined by Eq. (1), and β is the weight.

5.2. Prediction to communication model distillation

After training the prediction model, we train the communication model through distillation, which extracts and propagates knowledge without directly sharing prediction parameters, thereby avoiding privacy leakage. During the communication model's distillation, the outputs of the prediction model on positive and negative samples serve as soft labels. As Eq. (1) illustrates, the loss function must account for the probability of negative samples when balancing the impact of positive and negative predictions. Therefore, the distillation loss function of the communication model is formalized in Eq. (9):

L_{KD}^{C_k} = \tau^2 \sum_{(h,r,t) \in \mathcal{T}} \Big( D_{KL}\big( \sigma(g(h,r,t;\, \theta_k^{P_S}, \theta_k^{P_L})) \,\big\|\, \sigma(g(h,r,t;\, \theta_k^{C_S}, \theta_k^{C_L})) \big) + \sum_i p(h,r,t'_i)\, D_{KL}\big( \sigma(g(h,r,t'_i;\, \theta_k^{P_S}, \theta_k^{P_L})) \,\big\|\, \sigma(g(h,r,t'_i;\, \theta_k^{C_S}, \theta_k^{C_L})) \big) \Big)    (9)

Among them, g(h,r,t; θ_k^{C_S}, θ_k^{C_L}) represents the communication model output, and g(h,r,t; θ_k^{P_S}, θ_k^{P_L}) represents the prediction model output under the shared parameter θ_k^{P_S} and local parameter θ_k^{P_L}. The calculation method of p follows the approach in [9], with its mathematical formulation provided in Eq. (10):

p(h,r,t'_i) = \frac{\exp\big(\tau_\alpha\, g(h,r,t'_i)\big)}{\sum_j \exp\big(\tau_\alpha\, g(h,r,t'_j)\big)}    (10)

where τ_α is the self-adversarial sampling temperature.

After the bidirectional distillation process of CoDFKGE, the communication model parameters are updated to θ_k^{C_S} and θ_k^{C_L}. The client then uploads θ_k^{C_S} to the server, which aggregates these parameters from all clients using federated averaging to generate the next round's shared parameters θ_{k+1}^S.

6. Experiments

Experiments are conducted on the openly available dataset FB15K-237 [35], a subset of Freebase containing 14,505 entities, 544,230 triples, and 474 relations. To perform federated learning, we adopt the relational partitioning method in [22]. This method first partitions the relations through clustering, ensuring that the triple relationships within each partition are as close as possible. These partitions are then divided into groups with roughly equal numbers of triples and distributed to the clients. This results in tighter triple relationships within each client, better reflecting real-world scenarios.

The TransE model [7] is selected as the KGE model, serving as the foundation for all federated learning methods in the experiments, including the attacker's shadow model. To benchmark CoDFKGE, we select multiple baseline models. First, the local training model without federated learning is selected as the KGE baseline; it does not share parameters between clients, so it has no communication overhead and is not vulnerable to poisoning attacks. Then, FedE [13] and FedR [15] are chosen as baseline FKGE models, representing standard approaches in the field. Additionally, we implement a knowledge distillation model, which uses communication and prediction models similar to CoDFKGE but performs only unidirectional knowledge distillation. Specifically, it uses the communication model as the teacher and the prediction model as the student to filter out poisoning knowledge, with the distillation loss function following Eq. (4).

All experiments are performed on a 72-core Ubuntu 18.04.6 LTS machine with an Intel(R) Xeon(R) Gold 5220 CPU @ 2.20 GHz and a V100S-PCIE-32GB GPU. We implemented the proposed FKGE framework and the baseline models based on PyTorch Geometric [36] and the distributed AI framework Ray [37]. We used KGE hyperparameter settings based on [9] and FKGE hyperparameter settings based on FedE [13]. Specifically, we used the Adam [38] optimizer with a learning rate of 1e-3. γ is 10, and the self-adversarial negative sampling temperature τ_α in KGE is 1. The distillation temperature τ is 2, and the coefficient β weighting the distillation and KGE losses is 0.5. The maximum training epoch is 400. In each epoch, a client performs 3 iterations locally before uploading the parameters to the server.

We utilize the link prediction task, a sub-task of KGE, to validate the model's accuracy. Following the common implementation of link prediction, we employ the Mean Reciprocal Rank (MRR) and Hits@N as accuracy metrics. The MRR is the average of the reciprocals of the ranks of the predicted triples among all possible triples. Mathematically, if rank_i is the rank of the correct triple for the ith query and n is the total number of queries, then MRR = (1/n) \sum_{i=1}^{n} 1/rank_i. The Hits@N is the proportion of query triples for which the correct triple is present among the top N candidates generated by the model. Generally, higher values for both metrics indicate better model performance in link prediction.

Through the experiments, the following research questions will be verified.

RQ1 Does CoDFKGE maintain KGE prediction performance while reducing FKGE communication overhead?
RQ2 Can CoDFKGE effectively defend against targeted poisoning attacks?
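As a concrete numerical illustration of the distillation losses in Eqs. (7)–(10), the sketch below evaluates them on pre-computed score tensors. All tensor names and shapes are hypothetical, and a placeholder margin term stands in for the KGE loss of Eq. (1); only the settings τ = 2, τ_α = 1, and β = 0.5 are taken from the experimental setup.

```python
import torch
import torch.nn.functional as F

tau, tau_a, beta = 2.0, 1.0, 0.5
torch.manual_seed(0)
pos_teacher = torch.randn(16)                      # scores g under comm. model
pos_student = torch.randn(16, requires_grad=True)  # scores g under pred. model
neg_scores  = torch.randn(16, 8)                   # scores g on 8 negatives t'_j

# Eq. (7): KD on positive triples only -- the teacher's (possibly poisoned,
# hence unreliable) behaviour on negatives is simply never distilled.
kd_loss = tau**2 * F.kl_div(F.log_softmax(pos_student / tau, dim=0),
                            F.softmax(pos_teacher.detach() / tau, dim=0),
                            reduction="batchmean")

# Eq. (8): total prediction-model loss as a weighted sum with the KGE loss.
# A placeholder margin loss stands in for Eq. (1) here (gamma = 10).
kge_loss = F.relu(10.0 - pos_student).mean()
total_loss = beta * kge_loss + (1 - beta) * kd_loss

# Eq. (10): self-adversarial weights p(h, r, t'_i) over the negatives,
# used for the weighted negative-sample term of Eq. (9); rows sum to one.
p = F.softmax(tau_a * neg_scores, dim=1)

total_loss.backward()  # gradients reach pos_student only; teacher is detached
```

Detaching the teacher scores plays the role of parameter freezing in Section 5.1: the distillation signal shapes the student without ever updating the shared side.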
Table 1
Experiment result on normal link prediction.

Fed type   Model                   Mem(MB)  CC(MB)  MRR              Hits@1           Hits@5           Hits@10
Local      Local(128)              57.05    –       0.4081 ± 0.0015  0.3066 ± 0.0014  0.5223 ± 0.0023  0.6077 ± 0.0015
Entity     FedE(128)               185.58   42.60   0.4082 ± 0.0004  0.3068 ± 0.0012  0.5232 ± 0.0013  0.6080 ± 0.0018
Entity     Distillation(128-128)   356.10   42.60   0.4129 ± 0.0008  0.3118 ± 0.0016  0.5279 ± 0.0008  0.6122 ± 0.0003
Entity     CoDFKGE(128-128)        356.10   42.60   0.4109 ± 0.0043  0.3097 ± 0.0041  0.5246 ± 0.0044  0.6087 ± 0.0040
Entity     Distillation(32-128)    217.39   10.65   0.3914 ± 0.0011  0.2935 ± 0.0008  0.5005 ± 0.0014  0.5838 ± 0.0032
Entity     CoDFKGE(32-128)         217.40   10.65   0.4090 ± 0.0010  0.3079 ± 0.0007  0.5233 ± 0.0019  0.6068 ± 0.0019
Relation   FedR(128)               75.49    0.69    0.4085 ± 0.0011  0.3079 ± 0.0021  0.5219 ± 0.0016  0.6066 ± 0.0017
Relation   Distillation(128-128)   151.74   0.69    0.4106 ± 0.0013  0.3092 ± 0.0023  0.5242 ± 0.0008  0.6098 ± 0.0009
Relation   CoDFKGE(128-128)        150.02   0.69    0.4065 ± 0.0007  0.3056 ± 0.0013  0.5190 ± 0.0023  0.6063 ± 0.0012
Relation   Distillation(32-128)    94.53    0.17    0.3920 ± 0.0012  0.2960 ± 0.0007  0.4996 ± 0.0019  0.5807 ± 0.0013
Relation   CoDFKGE(32-128)         93.69    0.17    0.4078 ± 0.0009  0.3060 ± 0.0007  0.5224 ± 0.0031  0.6074 ± 0.0015
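The CC values above scale with the size of the shared embedding table, which is why compressing the communication model's dimension cuts the overhead. A rough back-of-the-envelope sketch for a float32 PyTorch Embedding payload follows; the entity and relation counts are those of FB15K-237, while the exact per-client and per-round accounting behind the table is not reproduced here.

```python
# Payload of a float32 embedding table in MB: rows * dim * 4 bytes.
def embedding_mb(rows, dim, bytes_per_param=4):
    return rows * dim * bytes_per_param / 2**20

ent_128 = embedding_mb(14505, 128)  # full entity table at d = 128
ent_32  = embedding_mb(14505, 32)   # same table at d = 32
rel_128 = embedding_mb(474, 128)    # relation table at d = 128 (much smaller)

# Shrinking the communication dimension 128 -> 32 shrinks the payload 4x,
# matching the 42.60 / 10.65 (and 0.69 / 0.17) CC ratios in Table 1.
print(round(ent_128 / ent_32, 1))  # -> 4.0
```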
RQ3 Can CoDFKGE effectively defend against untargeted poisoning attacks?
RQ4 Do the two proposed distillation loss functions individually contribute to poisoning defense?

6.1. Normal link prediction (RQ1)

To explore the performance of the proposed model in normal link prediction, we first tested the model on a conventional dataset. The performance of the model is measured using MRR, Hits@1, Hits@5, and Hits@10. The model is trained by federated learning and evaluated on the local test sets of the clients.

Table 1 lists the performance of the local KGE model, FedE, FedR, and CoDFKGE with different dimensions. The experimental results are grouped according to the type of shared embeddings and the dimension of the prediction model. The parameter dimensions are specified in parentheses within the "Model" column; for example, CoDFKGE(32-128) denotes the CoDFKGE model with a 32-dimensional communication model and a 128-dimensional prediction model. All link prediction experiments were repeated 5 times with different random seeds, and the accuracy results of all models are reported as (mean ± standard deviation). The best-performing results in each group (excluding the local model) are bolded. The results of the CoDFKGE(32-128) model that are better than those of Distillation(32-128) are underlined.

The performance of locally trained models is lower than that of most federated learning models, highlighting the advantage of sharing model parameters. The high-dimensional Distillation(128-128) models achieve better link prediction performance. Compared to Distillation(128-128), CoDFKGE models show slightly inferior prediction performance; the co-distillation process in CoDFKGE may lead to a loss of generalization accuracy. However, comparing models with the same dimensions, CoDFKGE outperforms both the local baseline and the federated baselines (FedE, FedR). We believe that the main advantage of CoDFKGE is its ability to enhance the security of FKGE: in addition to the defense performance demonstrated in Sections 6.2 and 6.3, it maintains link prediction performance comparable to its baseline FKGE models.

Beyond accuracy metrics, the "CC" (Communication Cost) column reports the communication overhead per training epoch, calculated from the byte size of the PyTorch Embedding used in the implementation. The "Mem" column shows the GPU memory usage of the federated models in MB. Distillation-based models require maintaining two KGE models, resulting in higher computational resource consumption: they need larger GPU memory to store the parameters of both models. On the other hand, distillation-based models allow the parameters of the communication model to be compressed, achieving significantly lower communication overhead than sharing parameters of the prediction model's size. Even with this smaller communication overhead, CoDFKGE(32-128) outperforms Distillation(32-128) in link prediction performance. Therefore, we believe that the CoDFKGE model does not degrade the normal link prediction performance of baseline FKGE models and can effectively reduce the communication overhead of the model.

6.2. Targeted poisoning attack experiment (RQ2)

In the targeted poisoning attack, 32 pairs of non-existent triples are selected as attack targets from the victim's KG through negative sampling to construct a poisoned triple dataset. First, a predetermined number of normal triples are selected from the victim's training triples. Subsequently, the head or tail nodes of these triples are randomly replaced, and any corrupted triples already existing in the training set are iteratively removed until 32 pairs of non-existent triples are successfully constructed. In each epoch, the shadow model undergoes the same number of local training rounds as legitimate clients on the poisoned dataset to generate poisoned parameters. The malicious server aggregates these poisoned parameters with the parameters of the normal clients into shared parameters and distributes them to all clients. Attackers can assign high weights to the poisoned model parameters during aggregation; following the setup in Ref. [33], we set the weight of the attacker's aggregated poisoned triples to 256 times that of normal triples. Experiments focus on models with shared entity parameters (required for targeted poisoning attacks) and the non-federated local baseline.

For space considerations, this section reports only the MRR and Hits@10 metrics. Attack effectiveness is measured by the MRR and Hits@10 of the poisoned triples on the victim: higher metrics on the poisoned triples indicate greater vulnerability to poisoning and weaker resistance of the model to targeted poisoning attacks.

Table 2 lists the performance of the baseline models and CoDFKGE under targeted poisoning attacks, grouped by the prediction model dimension. The parameter dimensions are specified in parentheses within the "Model" column. The "All clients" column reports the average performance across all clients' test sets during attacks, while "Victim poison" measures the victim's performance on predicting poisoned triples. All experiments were repeated 5 times with different random seeds, and the results are reported as (mean ± standard deviation). The best-performing results are bolded. Moreover, the "Communication poison" column reports the communication model's performance on poisoned triples for CoDFKGE and the distillation model, showing that both communication models are impacted by targeted poisoning attacks; through distillation, the prediction accuracy of the poisoned triples by the prediction model decreases in both cases.

For targeted poisoning attacks, the primary evaluation metrics are the MRR and Hits@10 of the victim model when predicting poisoned triples. The Local training model, which does not employ federated learning, remains immune to poisoning attacks, resulting in a low MRR for poisoned triples, with the Hits@10 value being exactly 0. This indicates that the unpoisoned Local model never includes the non-existent poisoned triples among its top 10 candidate results when making predictions. Conversely, if a model incorrectly ranks non-existent poisoned test triples among its top 10 candidates, the poisoning attack has successfully manipulated the model's predictions. Therefore, we use Hits@10 as the metric to measure the Attack Success Rate (ASR).
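The MRR and Hits@N metrics used throughout these experiments can be computed directly from the ranks of the correct triples. A small self-contained sketch with made-up ranks:

```python
# MRR: mean of reciprocal ranks; Hits@N: fraction of queries whose correct
# triple ranks within the top N candidates.

def mrr(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_n(ranks, n):
    return sum(1 for r in ranks if r <= n) / len(ranks)

ranks = [1, 3, 12, 2, 7]          # rank of the correct triple per query
print(round(mrr(ranks), 4))       # (1 + 1/3 + 1/12 + 1/2 + 1/7) / 5 -> 0.4119
print(hits_at_n(ranks, 10))       # 4 of 5 queries rank within top 10 -> 0.8
```

For the ASR reading of Hits@10, the same `hits_at_n` is simply evaluated on the ranks of the attacker's poisoned triples instead of genuine test triples.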
Table 2
Experiment result under targeted poisoning attack.

Model                    All clients                         Victim poison                       Communication poison
                         MRR              Hits@10            MRR              Hits@10(ASR)       MRR              Hits@10
Local(128, unpoisoned)   0.4081 ± 0.0015  0.6077 ± 0.0015    0.0003 ± 0.0001  0.0000 ± 0.0000    –                –
FedE(128)                0.4034 ± 0.0035  0.6004 ± 0.0029    0.4450 ± 0.0938  0.7857 ± 0.1248    –                –
Distillation(128-128)    0.4026 ± 0.0025  0.6006 ± 0.0039    0.0844 ± 0.0552  0.2000 ± 0.1311    0.4999 ± 0.1429  0.7714 ± 0.1046
CoDFKGE(128-128)         0.4086 ± 0.0007  0.6089 ± 0.0012    0.0010 ± 0.0003  0.0009 ± 0.0005    0.4694 ± 0.1511  0.6589 ± 0.1242
Distillation(32-128)     0.3821 ± 0.0022  0.5717 ± 0.0018    0.1511 ± 0.3356  0.1960 ± 0.4362    0.4919 ± 0.2364  0.6625 ± 0.1887
CoDFKGE(32-128)          0.3856 ± 0.0039  0.5740 ± 0.0054    0.0010 ± 0.0001  0.0010 ± 0.0003    0.3794 ± 0.0032  0.5702 ± 0.0050
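The poisoned-triple construction used in this attack (negative sampling with rejection of triples that already exist) can be sketched as follows; the toy KG and helper name are hypothetical:

```python
import random

def build_poisoned_triples(train_triples, entities, num_targets, seed=0):
    """Corrupt head or tail of sampled training triples, keeping only
    corruptions that do not already exist, until num_targets are found."""
    rng = random.Random(seed)
    existing = set(train_triples)
    poisoned = set()
    while len(poisoned) < num_targets:
        h, r, t = rng.choice(train_triples)
        if rng.random() < 0.5:                 # corrupt the head node
            cand = (rng.choice(entities), r, t)
        else:                                  # ... or the tail node
            cand = (h, r, rng.choice(entities))
        if cand not in existing:               # reject existing triples
            poisoned.add(cand)
    return sorted(poisoned)

entities = list(range(50))
train = [(i, i % 3, (i + 1) % 50) for i in range(50)]  # toy KG
targets = build_poisoned_triples(train, entities, num_targets=8)
print(len(targets), all(trp not in set(train) for trp in targets))  # 8 True
```

By construction every returned triple is non-existent, which is what makes a non-zero Hits@10 on these targets direct evidence of successful manipulation.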
Fig. 3. Performance degradation comparison.
The FedE model maintains high prediction accuracy on normal test triples when under attack, but exhibits abnormally high MRR and Hits@10 metrics for the targeted poisoned triples, even exceeding those of normal triples. This indicates that targeted poisoning attacks can effectively manipulate the FedE model into generating incorrect prediction results. Similarly, in the distillation-based models, the communication models are severely affected by poisoning attacks, while the impact on the prediction models is relatively minor. Although the Distillation(128-128) model can partially eliminate poisoning knowledge, it still remains vulnerable to targeted poisoning attacks. Moreover, as the dimension of the communication model parameters increases, the extent of the model's vulnerability to poisoning attacks also grows.

In contrast, CoDFKGE's prediction model performs distillation learning exclusively on verified positive samples, effectively eliminating potential poisoning knowledge that might exist in negative samples. Similar to the Local training model, CoDFKGE achieves extremely low MRR and Hits@10 metrics for poisoned triples, which fully demonstrates that the CoDFKGE model can effectively defend against targeted poisoning attacks in FKGE. Furthermore, because the communication model's dimension is compressed, the amount of information that attackers can transmit is correspondingly reduced, making the communication model in CoDFKGE(32-128) less susceptible to poisoning attacks.

6.3. Untargeted poisoning attack experiment (RQ3)

In the untargeted poisoning attack experiments, the attacker returns negative aggregate parameters to the victim client, preventing the victim model from converging and degrading its prediction performance. The results presented in this section reflect the average prediction performance on the clients' local test triples.

Table 3 lists the performance of each model under untargeted poisoning attacks, grouped by the prediction model dimension and federated type. The parameter dimensions are specified in parentheses within the "Model" column. The "All clients" column shows the average performance of all clients under untargeted poisoning attacks, and the "Victim" column shows the performance of the victim client. To measure the severity of the attack, the MRR of the local model in Table 1 is used as a benchmark: the "Decay ratio" column shows the ratio of the performance degradation on the victim client compared to the local model in Table 1. All experiments were repeated 5 times with different random seeds, and the results
Table 3
Experiment result under untargeted poisoning attack.

Fed type   Model                   All clients                         Victim                              Decay ratio (%)
                                   MRR              Hits@10            MRR              Hits@10            MRR     Hits@10
Entity     FedE(128)               0.3896 ± 0.0010  0.5939 ± 0.0009    0.3625 ± 0.0102  0.5620 ± 0.0144    11.21   7.58
Entity     Distillation(128-128)   0.3900 ± 0.0017  0.5921 ± 0.0007    0.3641 ± 0.0012  0.5664 ± 0.0018    11.82   7.54
Entity     CoDFKGE(128-128)        0.4084 ± 0.0007  0.6068 ± 0.0003    0.4017 ± 0.0010  0.6009 ± 0.0005    2.25    1.28
Entity     Distillation(32-128)    0.3024 ± 0.0208  0.5422 ± 0.0105    0.2739 ± 0.0264  0.5262 ± 0.0124    30.02   9.49
Entity     CoDFKGE(32-128)         0.4093 ± 0.0018  0.6081 ± 0.0014    0.4022 ± 0.0022  0.6023 ± 0.0011    1.66    0.75
Relation   FedR(128)               0.3915 ± 0.0010  0.5951 ± 0.0016    0.3637 ± 0.0093  0.5636 ± 0.0150    10.96   7.10
Relation   Distillation(128-128)   0.3978 ± 0.0017  0.6022 ± 0.0019    0.3881 ± 0.0023  0.5942 ± 0.0028    5.51    2.56
Relation   CoDFKGE(128-128)        0.4086 ± 0.0017  0.6075 ± 0.0029    0.4014 ± 0.0020  0.6018 ± 0.0037    1.24    0.75
Relation   Distillation(32-128)    0.3058 ± 0.0079  0.5463 ± 0.0029    0.2787 ± 0.0101  0.5307 ± 0.0038    27.78   8.61
Relation   CoDFKGE(32-128)         0.4090 ± 0.0008  0.6066 ± 0.0011    0.4026 ± 0.0008  0.6018 ± 0.0013    1.27    0.92
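The Decay ratio above is the relative drop of the victim client's metric versus the local benchmark of Table 1. An illustrative formula with made-up values follows; the published percentages are computed from unrounded per-seed results, so recomputing them from rounded table entries only approximates them.

```python
# Relative performance degradation (in percent) of a victim metric
# against a non-federated benchmark metric.
def decay_ratio(benchmark, victim):
    return (benchmark - victim) / benchmark * 100

print(round(decay_ratio(0.40, 0.36), 1))  # -> 10.0 (% MRR degradation)
```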
Table 4
Ablation study in normal link prediction and under targeted attack.

Model            Link prediction                     Targeted all clients                Targeted victim poisoning
                 MRR              Hits@10            MRR              Hits@10            MRR              Hits@10 (targeted poisoning ASR)
CoDFKGE          0.4112 ± 0.0039  0.6084 ± 0.0036    0.4086 ± 0.0007  0.6089 ± 0.0012    0.0010 ± 0.0003  0.0009 ± 0.0005
Ablation(Comm)   0.4095 ± 0.0016  0.6074 ± 0.0014    0.4086 ± 0.0022  0.6076 ± 0.0021    0.0017 ± 0.0008  0.0013 ± 0.0008
Ablation(Pred)   0.4132 ± 0.0006  0.6116 ± 0.0012    0.4098 ± 0.0011  0.6080 ± 0.0009    0.8086 ± 0.0064  0.9702 ± 0.0228
are reported as (mean ± standard deviation). The best and second-best results in each group are marked in bold and underlined, respectively.

From the experimental results, it can be observed that when subjected to untargeted poisoning attacks, the CoDFKGE series models achieve the best MRR and Hits@10 metrics among all models. Under attack, all models exhibit varying degrees of decline in both their overall performance and their performance on the victim. In Fig. 3, we present a comparison of the prediction performance of the models under normal link prediction and under the untargeted poisoning attack scenario. The Distillation(32-128) model experiences the most significant performance degradation; for the Distillation(128-128), FedE, and FedR models, the degradation is also substantial and cannot be ignored. These models directly incorporate poisoned global knowledge as an integral part of their own models, so their convergence is adversely affected. In contrast, the performance degradation of the CoDFKGE models is fully within 3%. This is because, even in the absence of global knowledge, the prediction model of CoDFKGE still trains on local data knowledge, and its training effectiveness is comparable to that of local KGE models without knowledge sharing.

Baseline models may have their results manipulated or exhibit significant performance degradation when facing poisoning attacks. Although the distillation models exhibited performance advantages in the link prediction experiments, their defense effectiveness is extremely limited under poisoning attacks. In contrast, CoDFKGE remains unmanipulated under targeted poisoning attacks and does not exhibit significant performance degradation under untargeted poisoning attacks, demonstrating its effective defense capability against poisoning attacks.

6.4. Ablation study (RQ4)

This section evaluates the defensive effects of applying different loss functions in CoDFKGE against poisoning attacks. Specifically, we compare the performance of models using 128-dimensional training parameters for both the communication and prediction models across normal link prediction, targeted poisoning attack, and untargeted poisoning attack scenarios. Two ablation baselines were implemented: Ablation(Comm) applies the baseline loss function (Eq. (4)) solely during the communication module's distillation, while Ablation(Pred) uses it exclusively for the prediction module's distillation.

Tables 4 and 5 show the experimental results of the models with different distillation loss functions sharing entity embeddings. All experiments were repeated 5 times with different random seeds, and the results are reported as (mean ± standard deviation). The best results are bolded.

The experimental results demonstrate that while Ablation(Pred) performs well in conventional link prediction, its resistance to poisoning attacks lags behind the other two models because it does not employ a negative-sample exclusion strategy in its loss function. Of the remaining two models, both demonstrate robust resilience against poisoning attacks, but the CoDFKGE model achieves superior link prediction performance compared to Ablation(Comm). Ablation(Comm) employs the baseline loss function during the distillation training of the communication model, whereas the CoDFKGE model adopts the approach from [9] and uses the self-adversarial sampling temperature τ_α to reweight negative samples, thereby enhancing the model's ability to distinguish between negative samples. Overall, the ablation experiments demonstrate that applying the two proposed distillation loss functions together enhances the model's capability in both defending against poisoning attacks and link prediction.

7. Conclusion

This paper proposes CoDFKGE, a co-distillation-based defense framework against FKGE poisoning attacks. As the first co-distillation defense framework against poisoning attacks in FKGE, CoDFKGE does have some limitations. First, maintaining two separate models requires higher computational resource consumption on the clients. Second, the bidirectional distillation process may lead to a loss of generalization accuracy. In contrast, CoDFKGE's advantages lie in its model-agnostic applicability to existing FKGE models without compromising performance. By decoupling clients' prediction models from the shared parameter models, CoDFKGE effectively filters out poisoned knowledge embedded in shared updates. CoDFKGE eliminates malicious manipulations under targeted poisoning attacks and significantly mitigates accuracy degradation under untargeted poisoning attacks. Leveraging distillation, the framework further reduces communication overhead. This work provides new ideas for enhancing the security of FKGE.

The limitations of FKGE poisoning defense research are partially rooted in the unique characteristics of KGE. When translation-based KGE models are used in FKGE, sharing entity or relation embeddings introduces risks related to both privacy preservation and poisoning attacks. Employing GNN-based KGE models in FKGE that transmit GNN parameters or gradients can alleviate these concerns. However, due to their superior robustness to sparse data and lower computational resource requirements, translation-based models still maintain unparalleled advantages in specific application scenarios.
Table 5
Ablation study under untargeted attack.

Model            Untargeted all clients              Untargeted victim                   Decay ratio (%)
                 MRR              Hits@10            MRR              Hits@10            MRR     Hits@10
CoDFKGE          0.4084 ± 0.0007  0.6068 ± 0.0003    0.4017 ± 0.0010  0.6009 ± 0.0005    2.25    1.27
Ablation(Comm)   0.4056 ± 0.0017  0.6062 ± 0.0011    0.3996 ± 0.0018  0.6003 ± 0.0013    2.42    1.16
Ablation(Pred)   0.3951 ± 0.0011  0.6022 ± 0.0008    0.3852 ± 0.0009  0.5951 ± 0.0005    6.76    2.69
For future research, we recommend exploring the application of the CoDFKGE framework in more complex real-world scenarios, such as personalized FKGE problems. Additionally, in large-scale dynamic KG environments, the security landscape for FKGE may undergo significant changes, necessitating further investigation into defense methods tailored to these evolving scenarios.

CRediT authorship contribution statement

Yiqin Lu: Supervision. Jiarui Chen: Writing – original draft, Software, Methodology. Jiancheng Qin: Writing – review & editing.

Declaration of Generative AI and AI-assisted technologies in the writing process

During the preparation of this work the author(s) used DeepSeek in order to improve language and readability. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work is supported by the Special Project for Research and Development in Key Areas of Guangdong Province, under Grant 2019B010137001.

Data availability

Data will be made available on request.

References

[1] X. Zhao, H. Chen, Z. Xing, C. Miao, Brain-inspired search engine assistant based on knowledge graph, IEEE Trans. Neural Netw. Learn. Syst. 34 (8) (2021) 4386–4400.
[2] S. Sharma, Fact-finding knowledge-aware search engine, in: Data Management, Analytics and Innovation: Proceedings of ICDMAI 2021, vol. 2, Springer, 2021, pp. 225–235.
[3] Y. Jiang, Y. Yang, L. Xia, C. Huang, DiffKG: Knowledge graph diffusion model for recommendation, in: Proceedings of the 17th ACM International Conference on
[8] Z. Wang, J. Zhang, J. Feng, Z. Chen, Knowledge graph embedding by translating on hyperplanes, in: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 28, 2014.
[9] Z. Sun, Z.-H. Deng, J.-Y. Nie, J. Tang, RotatE: Knowledge graph embedding by relational rotation in complex space, 2019, arXiv preprint arXiv:1902.10197.
[10] Z. Zhang, J. Jia, Y. Wan, Y. Zhou, Y. Kong, Y. Qian, J. Long, TransR*: Representation learning model by flexible translation and relation matrix projection, J. Intell. Fuzzy Systems 40 (5) (2021) 10251–10259.
[11] T. Dettmers, P. Minervini, P. Stenetorp, S. Riedel, Convolutional 2D knowledge graph embeddings, in: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, (1) 2018.
[12] B. McMahan, E. Moore, D. Ramage, S. Hampson, B.A. y Arcas, Communication-efficient learning of deep networks from decentralized data, in: Artificial Intelligence and Statistics, PMLR, 2017, pp. 1273–1282.
[13] M. Chen, W. Zhang, Z. Yuan, Y. Jia, H. Chen, FedE: Embedding knowledge graphs in federated setting, in: Proceedings of the 10th International Joint Conference on Knowledge Graphs, 2021, pp. 80–88.
[14] M. Chen, W. Zhang, Z. Yuan, Y. Jia, H. Chen, Federated knowledge graph completion via embedding-contrastive learning, Knowl.-Based Syst. 252 (2022) 109459.
[15] K. Zhang, Y. Wang, H. Wang, L. Huang, C. Yang, X. Chen, L. Sun, Efficient federated learning on knowledge graphs via privacy-preserving relation embedding aggregation, 2022, arXiv preprint arXiv:2203.09553.
[16] G. Hinton, O. Vinyals, J. Dean, Distilling the knowledge in a neural network, 2015, arXiv preprint arXiv:1503.02531.
[17] N. Papernot, P. McDaniel, X. Wu, S. Jha, A. Swami, Distillation as a defense to adversarial perturbations against deep neural networks, in: 2016 IEEE Symposium on Security and Privacy, SP, IEEE, 2016, pp. 582–597.
[18] K. Yoshida, T. Fujino, Countermeasure against backdoor attack on neural networks utilizing knowledge distillation, J. Signal Process. 24 (4) (2020) 141–144.
[19] K. Yoshida, T. Fujino, Disabling backdoor and identifying poison data by using knowledge distillation in backdoor attacks on deep neural networks, in: Proceedings of the 13th ACM Workshop on Artificial Intelligence and Security, 2020, pp. 117–127.
[20] R. Anil, G. Pereyra, A. Passos, R. Ormandi, G.E. Dahl, G.E. Hinton, Large scale distributed neural network training through online distillation, 2018, arXiv preprint arXiv:1804.03235.
[21] Y. Hu, W. Liang, R. Wu, K. Xiao, W. Wang, X. Li, J. Liu, Z. Qin, Quantifying and defending against privacy threats on federated knowledge graph embedding, in: Proceedings of the ACM Web Conference 2023, 2023, pp. 2306–2317.
[22] X. Zhu, G. Li, W. Hu, Heterogeneous federated knowledge graph embedding learning and unlearning, in: Proceedings of the ACM Web Conference 2023, 2023, pp. 2444–2454.
[23] X. Zhang, Z. Zeng, X. Zhou, Z. Shen, Low-dimensional federated knowledge graph embedding via knowledge distillation, 2024, arXiv preprint arXiv:2408.05748.
[24] Y. Liu, Z. Sun, G. Li, W. Hu, I know what you do not know: Knowledge graph embedding via co-distillation learning, in: Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022, pp. 1329–1338.
[25] F. Xia, W. Cheng, A survey on privacy-preserving federated learning against poisoning attacks, Clust. Comput. 27 (10) (2024) 13565–13582.
[26] J. Chen, H. Yan, Z. Liu, M. Zhang, H. Xiong, S. Yu, When federated learning meets privacy-preserving computation, ACM Comput. Surv. (ISSN: 0360-0300) 56 (12) (2024).
[27] J. Xia, Z. Yue, Y. Zhou, Z. Ling, Y. Shi, X. Wei, M. Chen, Waveattack: Asymmetric
|
||
recommendation, in: Proceedings of the 17th ACM International Conference on [27] J. Xia, Z. Yue, Y. Zhou, Z. Ling, Y. Shi, X. Wei, M. Chen, Waveattack: Asymmetric
|
||
Web Search and Data Mining, WSDM ’24, Association for Computing Machinery, frequency obfuscation-based backdoor attacks against deep neural networks, Adv.
|
||
New York, NY, USA, ISBN: 9798400703713, 2024, pp. 313–321. Neural Inf. Process. Syst. 37 (2024) 43549–43570.
|
||
[4] W. Wang, X. Shen, B. Yi, H. Zhang, J. Liu, C. Dai, Knowledge-aware fine-grained [28] P. Blanchard, E.M. El Mhamdi, R. Guerraoui, J. Stainer, Machine learning with
|
||
attention networks with refined knowledge graph embedding for personalized adversaries: Byzantine tolerant gradient descent, Adv. Neural Inf. Process. Syst.
|
||
recommendation, Expert Syst. Appl. 249 (2024) 123710. 30 (2017).
|
||
[5] J. Chen, Y. Lu, Y. Zhang, F. Huang, J. Qin, A management knowledge graph [29] N.M. Jebreel, J. Domingo-Ferrer, Fl-defender: Combating targeted attacks in
|
||
approach for critical infrastructure protection: Ontology design, information ex- federated learning, Knowl.-Based Syst. 260 (2023) 110178.
|
||
traction and relation prediction, Int. J. Crit. Infrastruct. Prot. (ISSN: 1874-5482) [30] Z. Yue, J. Xia, Z. Ling, M. Hu, T. Wang, X. Wei, M. Chen, Model-contrastive
|
||
43 (2023) 100634. learning for backdoor elimination, in: Proceedings of the 31st ACM International
|
||
[6] Y. Zhang, J. Chen, Z. Cheng, X. Shen, J. Qin, Y. Han, Y. Lu, Edge propagation Conference on Multimedia, 2023, pp. 8869–8880.
|
||
for link prediction in requirement-cyber threat intelligence knowledge graph, [31] H. Peng, H. Li, Y. Song, V. Zheng, J. Li, Differentially private federated
|
||
Inform. Sci. (ISSN: 0020-0255) 653 (2024) 119770. knowledge graphs embedding, in: Proceedings of the 30th ACM International
|
||
[7] A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston, O. Yakhnenko, Translating Conference on Information & Knowledge Management, CIKM ’21, Association
|
||
embeddings for modeling multi-relational data, Adv. Neural Inf. Process. Syst. for Computing Machinery, New York, NY, USA, ISBN: 9781450384469, 2021,
|
||
26 (2013). pp. 1416–1425.
|
||
|
||
|
||
10
|
||
[32] Y. Hu, Y. Wang, J. Lou, W. Liang, R. Wu, W. Wang, X. Li, J. Liu, Z. Qin, Privacy risks of federated knowledge graph embedding: New membership inference attacks and personalized differential privacy defense, IEEE Trans. Dependable Secur. Comput. (2024).
[33] E. Zhou, S. Guo, Z. Ma, Z. Hong, T. Guo, P. Dong, Poisoning attack on federated knowledge graph embedding, in: Proceedings of the ACM Web Conference 2024, 2024, pp. 1998–2008.
[34] G. Xia, J. Chen, C. Yu, J. Ma, Poisoning attacks in federated learning: A survey, IEEE Access 11 (2023) 10708–10722.
[35] K. Toutanova, D. Chen, P. Pantel, H. Poon, P. Choudhury, M. Gamon, Representing text for joint embedding of text and knowledge bases, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015, pp. 1499–1509.
[36] M. Fey, J.E. Lenssen, Fast graph representation learning with PyTorch Geometric, in: ICLR Workshop on Representation Learning on Graphs and Manifolds, 2019.
[37] P. Moritz, R. Nishihara, S. Wang, A. Tumanov, R. Liaw, E. Liang, M. Elibol, Z. Yang, W. Paul, M.I. Jordan, I. Stoica, Ray: A distributed framework for emerging AI applications, in: 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), USENIX Association, Carlsbad, CA, ISBN: 978-1-939133-08-3, 2018, pp. 561–577.
[38] D.P. Kingma, J. Ba, Adam: A method for stochastic optimization, 2014, arXiv preprint arXiv:1412.6980.