Knowledge Boundary Discovery for Large Language Models

arXiv cs.AI / 2026-03-24


Key Points

  • The paper introduces Knowledge Boundary Discovery (KBD), a reinforcement learning framework that maps where an LLM can and cannot answer questions with confidence.
  • KBD distinguishes between an “within-knowledge boundary” set of answerable questions and a “beyond-knowledge boundary” set of unanswerable ones by iteratively probing the model.
  • Because hallucinations make the LLM's responses unreliable, KBD models question generation as an agent exploring a partially observable environment, using entropy reduction as the reward signal.
  • The method incrementally builds belief states from the LLM’s responses and generates a set of non-trivial answerable/unanswerable questions.
  • Validation against manually crafted benchmark datasets finds the automatically generated question sets are comparable to human-created evaluations, suggesting KBD as a new LLM evaluation direction.

Abstract

We propose Knowledge Boundary Discovery (KBD), a reinforcement-learning-based framework for exploring the knowledge boundaries of Large Language Models (LLMs). We define the knowledge boundary by automatically generating two types of questions: (i) those the LLM can confidently answer (within-knowledge boundary) and (ii) those it cannot (beyond-knowledge boundary). Iteratively exploring and exploiting the LLM's responses to find its knowledge boundaries is challenging because of the hallucination phenomenon. To find an LLM's knowledge boundaries, the agent interacts with the LLM, which is modeled as a partially observable environment: it generates a progressive question as the action, adopts entropy reduction as the reward, receives the LLM's response as the observation, and updates its belief states. We demonstrate that KBD detects the knowledge boundaries of LLMs by automatically finding a set of non-trivial answerable and unanswerable questions. We validate KBD by comparing its generated knowledge boundaries with manually crafted LLM benchmark datasets. Experiments show that the KBD-generated question set is comparable to human-generated datasets. Our approach paves a new way to evaluate LLMs.
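The agent loop in the abstract (question as action, LLM response as observation, entropy reduction as reward) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `sample_fn`, the sampling budget `k`, the entropy threshold, and the way answer consistency stands in for confidence are all assumptions introduced here.

```python
import math
from collections import Counter

def answer_entropy(answers):
    # Shannon entropy (in nats) of the empirical distribution of sampled answers.
    # Consistent answers -> low entropy; contradictory answers -> high entropy.
    counts = Counter(answers)
    n = len(answers)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def classify_question(sample_fn, question, k=8, threshold=0.5):
    # Sample k answers to the same question (sample_fn is a hypothetical
    # stand-in for querying the LLM). Low answer entropy suggests the question
    # falls within the knowledge boundary; high entropy suggests it is beyond.
    answers = [sample_fn(question) for _ in range(k)]
    h = answer_entropy(answers)
    return ("within" if h <= threshold else "beyond"), h

def entropy_reduction_reward(entropy_before, entropy_after):
    # The agent is rewarded for actions (questions) that reduce the entropy of
    # its belief state about where the boundary lies.
    return entropy_before - entropy_after

# Toy run with a deterministic "LLM" that always answers "Paris".
label, h = classify_question(lambda q: "Paris", "What is the capital of France?")
# A perfectly consistent model yields zero entropy, i.e. "within" the boundary.
```

In this sketch the belief state is reduced to a single entropy number; the paper's agent maintains a richer belief over the LLM's knowledge, but the reward shape (uncertainty before minus uncertainty after) is the same idea.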