Topology-Aware Reasoning over Incomplete Knowledge Graph with Graph-Based Soft Prompting

arXiv cs.CL / 4/15/2026


Key Points

  • The paper addresses hallucinations in knowledge-intensive QA by grounding LLM generation in incomplete Knowledge Graphs (KGs) rather than relying on brittle explicit edge traversal.
  • It introduces a graph-based soft prompting method where a Graph Neural Network (GNN) encodes structural subgraphs into soft prompts, enabling subgraph-level reasoning for better entity/relation relevance when edges are missing.
  • A two-stage framework is proposed to cut computation: a lightweight LLM first uses the soft prompts to select relevant entities and relations, then a stronger LLM performs evidence-aware answer generation.
  • Experiments on four multi-hop KBQA benchmarks report state-of-the-art results on three of them, and the authors release their code in a public GitHub repository.
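The core encoding idea can be sketched in a few lines: a small message-passing GNN summarizes a retrieved subgraph, and the pooled representation is projected into a handful of "soft prompt" vectors living in the LLM's embedding space. The sketch below is illustrative only, assuming random weights and a toy subgraph; the function names, dimensions, and the single mean-aggregation round are assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def gnn_soft_prompts(node_feats, edges, w_msg, w_proj, num_prompts=4):
    """Hypothetical sketch: encode a subgraph into soft prompt vectors.

    node_feats: (N, d) entity features; edges: list of (src, dst) pairs.
    Returns a (num_prompts, d_llm) array to prepend to the LLM input.
    """
    n, _ = node_feats.shape
    # One round of mean neighbor aggregation (a minimal GNN layer).
    agg = np.zeros_like(node_feats)
    deg = np.zeros(n)
    for src, dst in edges:
        agg[dst] += node_feats[src]
        deg[dst] += 1
    deg = np.maximum(deg, 1)  # avoid division by zero for isolated nodes
    h = np.tanh((node_feats + agg / deg[:, None]) @ w_msg)  # (N, d)
    # Pool the whole subgraph, then project into LLM embedding space.
    pooled = h.mean(axis=0)                                  # (d,)
    return (pooled @ w_proj).reshape(num_prompts, -1)        # (k, d_llm)

# Toy subgraph: 5 entities, 16-dim features, LLM embedding dim 32.
N, d, d_llm, k = 5, 16, 32, 4
feats = rng.normal(size=(N, d))
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
w_msg = rng.normal(size=(d, d), scale=0.1)
w_proj = rng.normal(size=(d, k * d_llm), scale=0.1)

prompts = gnn_soft_prompts(feats, edges, w_msg, w_proj, num_prompts=k)
print(prompts.shape)  # (4, 32)
```

Because the prompts summarize the subgraph as a whole rather than a single traversal path, a missing edge changes the pooled representation only slightly instead of breaking the reasoning chain outright.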

Abstract

Large Language Models (LLMs) have shown remarkable capabilities across various tasks but remain prone to hallucinations in knowledge-intensive scenarios. Knowledge Base Question Answering (KBQA) mitigates this by grounding generation in Knowledge Graphs (KGs). However, most multi-hop KBQA methods rely on explicit edge traversal, making them fragile to KG incompleteness. In this paper, we propose a novel graph-based soft prompting framework that shifts the reasoning paradigm from node-level path traversal to subgraph-level reasoning. Specifically, we employ a Graph Neural Network (GNN) to encode extracted structural subgraphs into soft prompts, enabling the LLM to reason over richer structural context and identify relevant entities beyond immediate graph neighbors, thereby reducing sensitivity to missing edges. Furthermore, we introduce a two-stage paradigm that reduces computational cost while preserving performance: a lightweight LLM first leverages the soft prompts to identify question-relevant entities and relations, followed by a more powerful LLM for evidence-aware answer generation. Experiments on four multi-hop KBQA benchmarks show that our approach achieves state-of-the-art performance on three of them, demonstrating its effectiveness. Code is available at the repository: https://github.com/Wangshuaiia/GraSP.
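The two-stage paradigm described above can be outlined as a pipeline: a cheap first stage filters the candidate evidence, and only the filtered subset reaches the expensive generator. In the minimal sketch below, both stages are hypothetical stand-ins (a token-overlap scorer in place of the lightweight soft-prompt-conditioned LLM, and a string formatter in place of the stronger generator); none of these names or heuristics come from the paper.

```python
import re

def stage1_select(question, candidates, top_k=2):
    """Stage 1 stand-in: rank candidate triples by shared-token overlap
    with the question (in the paper, a lightweight LLM conditioned on
    graph soft prompts would do this selection)."""
    q_tokens = set(re.findall(r"\w+", question.lower()))

    def score(cand):
        c_tokens = set(re.findall(r"\w+", cand.lower().replace("_", " ")))
        return len(q_tokens & c_tokens)

    return sorted(candidates, key=score, reverse=True)[:top_k]

def stage2_generate(question, evidence):
    """Stage 2 stand-in: the stronger, evidence-aware generator LLM."""
    return f"Answer({question!r}) grounded in: {'; '.join(evidence)}"

question = "Which film did Christopher Nolan direct after Inception?"
candidates = [
    "Inception director_of Christopher_Nolan",
    "Christopher_Nolan directed Interstellar",
    "Paris capital_of France",
]

evidence = stage1_select(question, candidates)  # irrelevant triple is dropped
print(stage2_generate(question, evidence))
```

The design point is that the costly model never sees the full candidate set, only the top-k evidence, which is where the computational savings of the two-stage split come from.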