IUQ: Interrogative Uncertainty Quantification for Long-Form Large Language Model Generation

arXiv cs.CL / 4/17/2026

Key Points

The paper addresses a core problem in long-form LLM generation: models can generate semantically coherent text that may still contain factual inaccuracies.
It proposes Interrogative Uncertainty Quantification (IUQ), which estimates uncertainty in long-form outputs using inter-sample consistency and intra-sample faithfulness.
IUQ uses an “interrogate-then-respond” paradigm to produce claim-level uncertainty measures as well as an assessment of the model’s faithfulness.
Experiments across multiple model families and sizes show that IUQ achieves superior performance on two widely used long-form generation datasets.
The authors release their code on GitHub at https://github.com/louisfanhz/IUQ.
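
To make the inter-sample consistency idea more concrete, here is a minimal Python sketch. It is not the authors' implementation: this summary does not say how IUQ computes its scores, so the scoring rule (fraction of sampled responses that disagree with a claim), the function names `claim_level_uncertainty`, `interrogate`, `respond`, and `normalized_match`, and the toy stand-ins for LLM calls are all assumptions made for illustration. The sketch does not attempt to model intra-sample faithfulness.

```python
from typing import Callable, Dict, Sequence


def normalized_match(a: str, b: str) -> bool:
    """Hypothetical agreement check: case- and whitespace-insensitive equality."""
    return a.strip().lower() == b.strip().lower()


def claim_level_uncertainty(
    claims: Sequence[str],
    samples: Sequence[str],
    interrogate: Callable[[str], str],
    respond: Callable[[str, str], str],
    agree: Callable[[str, str], bool] = normalized_match,
) -> Dict[str, float]:
    """Score each claim by how often sampled responses disagree with it.

    interrogate(claim) -> question probing the claim (assumed LLM call).
    respond(question, text) -> answer to the question given text (assumed LLM call).
    Returns a mapping claim -> score in [0, 1]; higher means less consistent.
    """
    if not samples:
        raise ValueError("at least one sampled response is required")
    scores: Dict[str, float] = {}
    for claim in claims:
        question = interrogate(claim)
        # Answer implied by the claim itself serves as the reference.
        reference = respond(question, claim)
        # Answer the same question from every sampled response.
        answers = [respond(question, sample) for sample in samples]
        disagreements = sum(not agree(reference, answer) for answer in answers)
        scores[claim] = disagreements / len(answers)
    return scores


# Toy stand-ins for the LLM calls, for demonstration only.
def toy_interrogate(claim: str) -> str:
    return "Which city is the capital of France?"


def toy_respond(question: str, text: str) -> str:
    for city in ("Paris", "Lyon"):
        if city in text:
            return city
    return "unknown"


samples = [
    "Paris is the capital of France.",
    "The capital of France is Paris.",
    "Lyon is the capital of France.",
]
claims = ["Paris is the capital of France."]
scores = claim_level_uncertainty(claims, samples, toy_interrogate, toy_respond)
```

In this toy run, one of the three sampled responses contradicts the claim, so the claim receives a score of one third. In practice the two callables would wrap model calls, and exact string matching would likely give way to a semantic agreement check.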

Abstract

Despite the rapid advancement of Large Language Models (LLMs), uncertainty quantification in LLM generation is a persistent challenge. Although recent approaches have achieved strong performance by restricting LLMs to produce short or constrained answer sets, many real-world applications require long-form and free-form text generation. A key difficulty in this setting is that LLMs often produce responses that are semantically coherent yet factually inaccurate, while the underlying semantics are multifaceted and the linguistic structure is complex. To tackle this challenge, this paper introduces Interrogative Uncertainty Quantification (IUQ), a novel framework that leverages inter-sample consistency and intra-sample faithfulness to quantify the uncertainty in long-form LLM outputs. By utilizing an interrogate-then-respond paradigm, our method provides reliable measures of claim-level uncertainty and the model's faithfulness. Experimental results across diverse model families and model sizes demonstrate the superior performance of IUQ over two widely used long-form generation datasets. The code is available at https://github.com/louisfanhz/IUQ.