Graph-Aware Text-Only Backdoor Poisoning for Text-Attributed Graphs

arXiv cs.LG / 3/24/2026

Key Points

  • The paper studies a realistic threat model for graph learning systems with text-attributed nodes, where an attacker poisons only node text without altering the graph structure.
  • It introduces TAGBD, which identifies highly influenceable training nodes, uses a shadow graph model to generate natural-looking trigger text, and injects the trigger by either replacing node text or appending a short phrase.
  • Experiments on three benchmark datasets show TAGBD is highly effective, can transfer across different graph models, and stays strong even when common defenses are applied.
  • The findings highlight that text-only manipulation is a practical backdoor channel in text-attributed graphs, motivating defenses that validate both node content and graph connections.
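The injection step described above can be illustrated with a minimal sketch. It assumes a toy in-memory graph; the `TextGraph` container, the `inject_trigger` helper, and the example trigger phrase are all hypothetical names chosen for illustration and are not taken from the paper. The point it shows is that both injection modes touch only node text, while the edge list is passed through unchanged.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class TextGraph:
    """Toy text-attributed graph (hypothetical container, not from the paper)."""
    texts: Dict[int, str]               # node id -> node text
    edges: Tuple[Tuple[int, int], ...]  # undirected edges as (u, v) pairs


def inject_trigger(graph: TextGraph, victim_ids: Iterable[int],
                   trigger: str, mode: str = "append") -> TextGraph:
    """Return a poisoned copy of the graph in which only node text differs.

    mode="replace" overwrites the victim's text with the trigger;
    mode="append" adds the trigger as a short trailing phrase.
    """
    if mode not in ("append", "replace"):
        raise ValueError(f"unknown mode: {mode!r}")
    texts = dict(graph.texts)  # copy so the clean graph is left untouched
    for node_id in victim_ids:
        if mode == "replace":
            texts[node_id] = trigger
        else:
            texts[node_id] = f"{texts[node_id]} {trigger}"
    # The edge tuple is reused as-is: the graph structure is never edited.
    return TextGraph(texts=texts, edges=graph.edges)


# Illustrative data only; the trigger phrase is a made-up placeholder.
clean = TextGraph(
    texts={0: "A survey of graph neural networks.",
           1: "Image classification with transformers.",
           2: "Sampling methods for large graphs."},
    edges=((0, 1), (0, 2)),
)
appended = inject_trigger(clean, [1], "with robust spectral priors", mode="append")
replaced = inject_trigger(clean, [1], "with robust spectral priors", mode="replace")
```

In the paper the trigger text is generated with the help of a shadow graph model; the fixed phrase here only stands in for that output. The summary does not say whether labels of poisoned nodes are also changed, so the sketch leaves labels out entirely.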

Abstract

Many learning systems now use graph data in which each node also contains text, such as papers with abstracts or users with posts. Because these texts often come from open platforms, an attacker may be able to quietly poison a small part of the training data and later make the model produce wrong predictions on demand. This paper studies that risk in a realistic setting where the attacker edits only node text and does not change the graph structure. We propose TAGBD, a text-only backdoor attack for text-attributed graphs. TAGBD first finds training nodes that are easier to influence, then generates natural-looking trigger text with the help of a shadow graph model, and finally injects the trigger by either replacing the original text or appending a short phrase. Experiments on three benchmark datasets show that the attack is highly effective, transfers across different graph models, and remains strong under common defenses. These results demonstrate that text alone is a practical attack channel in graph learning systems and suggest that future defenses should inspect both graph links and node content.
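The abstract does not state how TAGBD scores which training nodes are "easier to influence". As one plausible stand-in, the sketch below ranks nodes by the confidence margin of a shadow model, that is, the gap between its two highest class probabilities, and keeps the lowest-margin nodes within a poisoning budget. The margin criterion, the function names, and the probability values are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Sequence


def confidence_margin(probs: Sequence[float]) -> float:
    """Gap between the two largest class probabilities (small = uncertain)."""
    if len(probs) < 2:
        raise ValueError("need at least two classes")
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def select_influenceable(shadow_probs: Dict[int, Sequence[float]],
                         budget: int) -> List[int]:
    """Pick up to `budget` node ids with the smallest shadow-model margin.

    ASSUMPTION: a low margin is used here as a proxy for "easy to influence";
    the paper may use a different criterion.
    """
    ranked = sorted(
        shadow_probs,
        key=lambda node_id: (confidence_margin(shadow_probs[node_id]), node_id),
    )
    return ranked[:budget]


# Made-up shadow-model outputs for three training nodes and two classes.
shadow_probs = {
    0: [0.90, 0.10],  # confident prediction, unlikely to be picked
    1: [0.55, 0.45],  # near the decision boundary
    2: [0.60, 0.40],
}
victims = select_influenceable(shadow_probs, budget=2)
```

Ties are broken by node id so the selection is deterministic. Any other per-node score could be swapped into the sort key without changing the rest of the sketch.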