Conditional Factuality Controlled LLMs with Generalization Certificates via Conformal Sampling

arXiv cs.LG · March 31, 2026


Key Points

  • The paper introduces Conditional Factuality Control (CFC), a post-hoc conformal method that produces set-valued LLM outputs with conditional (prompt-difficulty-aware) hallucination coverage guarantees rather than only marginal ones.
  • CFC uses a continuous, feature-conditional acceptance threshold learned via augmented quantile regression on a latent “success” score, then applies it at inference with a fixed-point threshold rule.
  • The authors prove CFC’s conditional coverage under exchangeability assumptions and show it is more sample-efficient than marginal conformal prediction for the same target coverage under mild distributional conditions.
  • A PAC-style variant, CFC-PAC, provides a finite-sample certificate bounding how much conditional miscoverage can deviate from the target, with an explicit dependence on N and a confidence parameter δ.
  • Experiments on synthetic data, reasoning/QA benchmarks, and a Flickr8k VLM setting indicate CFC and CFC-PAC achieve near-target conditional coverage across difficulty groups while using smaller prediction sets than conformal baselines and non-conformal methods.
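The marginal-vs-conditional distinction in the bullets above can be illustrated with a toy binned-conformal sketch. This is not the paper's augmented-quantile-regression method, only a crude stand-in: the synthetic "difficulty" feature, the score model, and the three-bin split are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.1  # target miscoverage (10% hallucination budget)

# Hypothetical calibration data: a difficulty feature in [0, 1] and a
# nonconformity score whose spread grows with difficulty
# (harder prompts -> noisier scores).
n_cal = 4000
d_cal = rng.uniform(0, 1, n_cal)
s_cal = np.abs(rng.normal(0.0, 0.5 + d_cal))

# Marginal conformal rule: one global (1 - alpha) quantile for everyone.
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
t_marginal = np.sort(s_cal)[k - 1]

# Crude conditional stand-in: a separate conformal quantile per difficulty
# bin (the paper instead learns a continuous feature-conditional threshold).
edges = [1 / 3, 2 / 3]

def bin_thresholds(d, s):
    ts = []
    for b in range(3):
        sb = np.sort(s[np.digitize(d, edges) == b])
        kb = min(int(np.ceil((len(sb) + 1) * (1 - alpha))), len(sb))
        ts.append(sb[kb - 1])
    return np.array(ts)

t_cond = bin_thresholds(d_cal, s_cal)

# Fresh test draws: per-bin coverage of the two rules.
n_test = 4000
d_te = rng.uniform(0, 1, n_test)
s_te = np.abs(rng.normal(0.0, 0.5 + d_te))
b_te = np.digitize(d_te, edges)
for b in range(3):
    m = b_te == b
    print(f"bin {b}: marginal {np.mean(s_te[m] <= t_marginal):.3f}, "
          f"conditional {np.mean(s_te[m] <= t_cond[b]):.3f}")
```

On data like this, the single global threshold over-covers the easy bin and under-covers the hard bin, while the per-bin thresholds sit near the 90% target in every bin, which is exactly the failure mode the conditional guarantee is meant to remove.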

Abstract

Large language models (LLMs) need reliable test-time control of hallucinations. Existing conformal methods for LLMs typically provide only *marginal* guarantees and rely on a single global threshold, which can under-cover hard prompts, over-cover easy ones, and produce oversized prediction sets. We propose *Conditional Factuality Control* (CFC), a post-hoc conformal framework that returns *set-valued* outputs with *conditional* coverage guarantees. CFC defines a continuous, feature-conditional acceptance threshold through augmented quantile regression on a latent "success" score, and deploys it via a fixed-point threshold rule at inference time. Theoretically, we show that CFC satisfies a conditional coverage guarantee under exchangeability and analyze its *efficiency*, proving that, under mild assumptions on the score distributions, the conditional rule is strictly more sample-efficient than marginal conformal prediction at the same target coverage. We further derive a PAC-style variant, CFC-PAC, which shrinks the nominal risk level based on a stability bound, yielding a finite-sample certificate that the conditional miscoverage deviates from the target by at most O(√(log(1/δ)/N)). Empirically, on synthetic data, real-world reasoning and QA benchmarks, and a Flickr8k VLM setting, CFC and CFC-PAC consistently attain near-target coverage across difficulty groups while using smaller prediction sets than conformal and non-conformal baselines.
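The PAC-style correction in the abstract amounts to running the conditional procedure at a slightly stricter nominal level. A minimal sketch of that shrinkage, assuming only the stated O(√(log(1/δ)/N)) order; the leading constant `c` is a placeholder assumption, not a value from the paper:

```python
import math

def pac_adjusted_alpha(alpha: float, n: int, delta: float, c: float = 1.0) -> float:
    """Shrink the nominal miscoverage level alpha by a stability-style slack
    of order sqrt(log(1/delta)/n), so that with probability >= 1 - delta the
    realized conditional miscoverage stays within that slack of the target.
    The constant c stands in for the (unspecified here) constant in the bound."""
    slack = c * math.sqrt(math.log(1.0 / delta) / n)
    return max(alpha - slack, 0.0)

# Example: target 10% miscoverage, 10,000 calibration prompts,
# a 95%-confidence certificate -> run slightly below alpha = 0.10.
print(pac_adjusted_alpha(0.10, 10_000, 0.05, c=0.5))
```

Note the expected behavior: the adjusted level approaches the target α as N grows, and clamps to 0 when the calibration set is too small for the certificate to be informative.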