Geometry-Calibrated Conformal Abstention for Language Models

arXiv cs.CL / 5/1/2026

📰 News · Models & Research

Key Points

  • The paper addresses a common LLM failure mode: when models lack relevant knowledge, they often still produce plausible but potentially hallucinated answers instead of admitting uncertainty.
  • It proposes a post-hoc framework called Conformal Abstention (CA), adapted from conformal prediction, to decide whether the model should abstain on a per-query basis.
  • CA provides finite-sample guarantees for both participation (not abstaining) and the accuracy of generated responses, while basing the abstention decision on prediction confidence rather than conformal non-conformity scores, which are intractable for open-ended generation.
  • To connect prediction confidence to true ignorance, the authors introduce a calibration method that uses representation-geometry measurements (knowledge involvement) inside the model.
  • Experiments show significantly improved selective answering, reaching about 75% conditional correctness (accuracy among the queries the model chooses to answer).
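The selection mechanism the bullets describe, calibrating a confidence threshold on held-out data so that answered queries meet a target accuracy, can be sketched as below. The function name `calibrate_threshold`, the greedy top-k search, and the synthetic inputs are illustrative assumptions, not the paper's exact procedure or its finite-sample guarantee machinery.

```python
import numpy as np

def calibrate_threshold(conf, correct, target_acc=0.75):
    """Pick a confidence threshold tau so that, on the calibration set,
    queries answered (conf >= tau) reach the target accuracy.

    conf    : array of per-query confidence scores
    correct : 0/1 array, whether the model's answer was correct
    Returns np.inf (abstain on everything) if no threshold works.
    """
    order = np.argsort(-conf)          # most confident queries first
    conf_sorted = conf[order]
    correct_sorted = correct[order]
    # accuracy if we answer only the top-k most confident queries
    running_acc = np.cumsum(correct_sorted) / np.arange(1, len(conf) + 1)
    valid = np.where(running_acc >= target_acc)[0]
    if len(valid) == 0:
        return np.inf
    k = valid[-1] + 1                  # largest answering set meeting the target
    return conf_sorted[k - 1]

# At test time: answer a query only if its confidence clears tau;
# otherwise abstain.
```

In practice the threshold would be chosen with a conformal correction (e.g., a quantile adjusted for calibration-set size) to obtain the finite-sample guarantees the paper claims; the sketch above shows only the basic select-by-confidence idea.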

Abstract

When language models lack relevant knowledge for a given query, they frequently generate plausible responses that can be hallucinations, rather than admitting ignorance. Retraining models to reward admitting ignorance can lead to overly conservative behavior and poor generalization, owing to the scarcity of suitable evaluation benchmarks. We propose a post-hoc framework, Conformal Abstention (CA), adapted from conformal prediction (CP), to determine whether to abstain from answering a query. CA provides finite-sample guarantees on both the probability of participation (i.e., not abstaining) and the probability that the generated response is correct. Importantly, the abstention decision relies on prediction confidence rather than the non-conformity scores used in CP, which are intractable for open-ended generation. To better align prediction confidence with the model's ignorance, we introduce a calibration strategy that uses representation geometry within the model to measure how much knowledge is involved in shaping the response. Experiments demonstrate that we improve selective answering significantly, with 75% conditional correctness.
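The abstract's "knowledge involvement" signal can be illustrated with a toy representation-geometry proxy: score a query by how close its hidden representation lies to representations of queries the model is known to handle correctly, then blend that with the raw confidence. Everything here, including `knowledge_involvement`, the nearest-neighbor cosine measure, and the linear blend, is a hypothetical stand-in; the paper's actual geometric measurement is not specified in this summary.

```python
import numpy as np

def knowledge_involvement(query_emb, known_embs, k=2):
    """Toy proxy for knowledge involvement: mean cosine similarity of the
    query's representation to its k nearest 'known-good' calibration
    representations. Illustrative only."""
    known = known_embs / np.linalg.norm(known_embs, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb)
    sims = known @ q                    # cosine similarities
    topk = np.sort(sims)[-k:]           # k nearest neighbors
    return float(topk.mean())

def calibrated_confidence(raw_conf, involvement, weight=0.5):
    # Hypothetical blend of raw confidence with the geometry signal.
    return weight * raw_conf + (1 - weight) * involvement
```

A query whose representation sits inside a cluster of well-handled calibration queries gets a higher calibrated confidence than an equally confident query in an unfamiliar region, which is the alignment between confidence and ignorance the abstract aims for.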