Can LLMs Fool Graph Learning? Exploring Universal Adversarial Attacks on Text-Attributed Graphs
arXiv cs.AI / 2026/3/24
Key Points
- The paper studies vulnerabilities in text-attributed graphs (TAGs), where combining topology with node text improves learning but introduces new adversarial risk surfaces.
- It highlights the difficulty of crafting universal attacks that transfer across different backbones (GNNs vs. PLMs) and remain effective under black-box access, where many LLMs are exposed only through APIs.
- The authors propose BadGraph, an attack framework that elicits an LLM’s graph understanding and jointly perturbs node topology and textual semantics to craft cross-modal, generalizable attack “shortcuts.”
- Experiments indicate BadGraph can produce universal and effective attacks against both GNN-based and LLM-based graph reasoners, reporting performance drops of up to 76.3%.
- The work includes both theoretical and empirical analyses suggesting the attacks can be stealthy while remaining interpretable.
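To make the core idea concrete, here is a minimal toy sketch of a joint topology + text perturbation on a text-attributed graph. This is not the paper's BadGraph algorithm; the graph, keyword classifier, and trigger word are all hypothetical, chosen only to show how perturbing an edge and a node's text together can flip a prediction that aggregates neighbor text.

```python
# Toy illustration (NOT the paper's BadGraph method): jointly perturb
# topology and node text on a tiny text-attributed graph.
# A node's label is predicted by a crude keyword classifier applied to
# its own text concatenated with its one-hop neighbors' text.
from collections import Counter

# Hypothetical toy graph: adjacency sets and node texts.
edges = {0: {1}, 1: {0}, 2: {3}, 3: {2}}
texts = {
    0: "graph neural network layers",
    1: "message passing on graphs",
    2: "language model tokens",
    3: "transformer attention text",
}
KEYWORDS = {"graph": "GNN", "text": "PLM"}  # keyword -> label rule

def predict(node, edges, texts):
    """Classify a node from its own text plus one-hop neighbor text."""
    words = texts[node].split()
    for nbr in edges.get(node, ()):
        words += texts[nbr].split()
    votes = Counter(KEYWORDS[w] for w in words if w in KEYWORDS)
    return votes.most_common(1)[0][0] if votes else "unknown"

# Clean prediction for node 0 is driven by graph-related words.
print(predict(0, edges, texts))  # -> GNN

# Joint perturbation: add one edge (topology change) and inject a
# trigger word into the new neighbor's text (semantic change). Either
# change alone may be too weak; together they flip the aggregated vote.
edges_atk = {k: set(v) for k, v in edges.items()}
edges_atk[0].add(3)
texts_atk = dict(texts)
texts_atk[3] = texts[3] + " text text text"  # textual trigger

print(predict(0, edges_atk, texts_atk))  # -> PLM
```

The point of the sketch is that the attack surface is cross-modal: the edge insertion routes the trigger text into the victim node's aggregation, so neither the topology edit nor the text edit needs to be large on its own.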
