KG-Reasoner: A Reinforced Model for End-to-End Multi-Hop Knowledge Graph Reasoning

arXiv cs.CL · April 15, 2026


Key Points

  • The paper presents KG-Reasoner, an end-to-end framework that performs multi-hop knowledge graph (KG) reasoning within a single unified “thinking” phase of a reasoning LLM rather than a fixed step-by-step pipeline.
  • It trains the LLM using reinforcement learning (RL) so it can internalize KG traversal, dynamically explore different reasoning paths, and backtrack when needed to maintain coherence and preserve intermediate information.
  • Experiments across eight multi-hop, knowledge-intensive benchmarks show KG-Reasoner achieves competitive or superior results compared with state-of-the-art approaches.
  • The authors release their code in a public repository, enabling other researchers and practitioners to test and build on the framework.
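To make the traversal behavior concrete, here is a minimal sketch (not the authors' code; the toy graph, entity names, and `find_path` helper are illustrative assumptions) of the kind of multi-hop KG traversal with backtracking that KG-Reasoner trains the LLM to internalize: follow a chain of relations from a start entity, and back up to try another branch when a path dead-ends.

```python
# Hypothetical sketch of multi-hop KG traversal with backtracking.
# Not the paper's implementation: the toy graph and helper are assumptions.

# Toy knowledge graph: (head entity, relation) -> list of tail entities
KG = {
    ("Inception", "directed_by"): ["Christopher Nolan"],
    ("Christopher Nolan", "born_in"): ["London"],
    ("London", "located_in"): ["United Kingdom"],
    ("Inception", "starring"): ["Leonardo DiCaprio"],
}

def find_path(entity, relations, hops=None):
    """Depth-first multi-hop traversal: follow `relations` in order from
    `entity`, backtracking when a branch yields no continuation."""
    hops = hops or []
    if not relations:
        return hops  # every hop resolved: return the full reasoning path
    rel, rest = relations[0], relations[1:]
    for tail in KG.get((entity, rel), []):
        result = find_path(tail, rest, hops + [(entity, rel, tail)])
        if result is not None:
            return result  # this branch completed the remaining hops
    return None  # dead end: caller backtracks and tries another branch

# 2-hop query: "In which city was the director of Inception born?"
path = find_path("Inception", ["directed_by", "born_in"])
print(path[-1][-1])  # → London
```

The end-to-end framework differs in that this exploration happens inside the model's single "thinking" phase rather than as an external search procedure; the sketch only shows the explore-and-backtrack pattern the RL training is meant to instill.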

Abstract

Large Language Models (LLMs) exhibit strong abilities in natural language understanding and generation, yet they struggle with knowledge-intensive reasoning. Structured Knowledge Graphs (KGs) provide an effective form of external knowledge representation and have been widely used to enhance performance in classical Knowledge Base Question Answering (KBQA) tasks. However, performing precise multi-hop reasoning over KGs for complex queries remains highly challenging. Most existing approaches decompose the reasoning process into a sequence of isolated steps executed through a fixed pipeline. While effective to some extent, such designs constrain reasoning flexibility and fragment the overall decision process, often leading to incoherence and the loss of critical intermediate information from earlier steps. In this paper, we introduce KG-Reasoner, an end-to-end framework that integrates multi-step reasoning into a unified "thinking" phase of a Reasoning LLM. Through Reinforcement Learning (RL), the LLM is trained to internalize the KG traversal process, enabling it to dynamically explore reasoning paths and perform backtracking when necessary. Experiments on eight multi-hop and knowledge-intensive reasoning benchmarks demonstrate that KG-Reasoner achieves competitive or superior performance compared with state-of-the-art methods. Code is available at the repository: https://github.com/Wangshuaiia/KG-Reasoner.