A Formal Framework for Uncertainty Analysis of Text Generation with Large Language Models

arXiv cs.LG / March 30, 2026


Key Points

  • The paper proposes a formal framework to measure uncertainty in LLM text generation by considering uncertainty from prompting, generation, and downstream interpretation.
  • It represents prompting, generation, and interpretation as interconnected autoregressive processes that can be unified into a single “sampling tree.”
  • The authors introduce filters and objective functions that let different aspects of uncertainty be expressed over the sampling tree.
  • The framework is used to show formal relationships among existing uncertainty methods and to identify additional, previously understudied sources of uncertainty.
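To make the key points above concrete, here is a minimal, hypothetical sketch of the sampling-tree idea: prompt, generation, and interpretation steps form a tree of weighted branches, a filter selects which leaves count, and an objective function (here Shannon entropy, a common uncertainty measure) aggregates over them. The `Node`, `leaves`, and `objective_entropy` names are illustrative assumptions, not the paper's actual API.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    stage: str          # "prompt", "generation", or "interpretation" (assumed labels)
    prob: float         # conditional probability of taking this branch
    children: list = field(default_factory=list)

def leaves(node, path_prob=1.0):
    """Yield (leaf, joint probability) pairs for every root-to-leaf path."""
    p = path_prob * node.prob
    if not node.children:
        yield node, p
    else:
        for child in node.children:
            yield from leaves(child, p)

def objective_entropy(root, keep=lambda n: True):
    """Shannon entropy over leaf path probabilities that pass the filter `keep`."""
    probs = [p for leaf, p in leaves(root) if keep(leaf)]
    total = sum(probs)
    probs = [p / total for p in probs]  # renormalise over the filtered subtree
    return -sum(p * math.log(p) for p in probs if p > 0)

# Toy tree: one prompt, two sampled generations, each with interpretations.
root = Node("prompt", 1.0, [
    Node("generation", 0.6, [Node("interpretation", 1.0)]),
    Node("generation", 0.4, [Node("interpretation", 0.5),
                             Node("interpretation", 0.5)]),
])

print(round(objective_entropy(root), 3))
```

Swapping the filter (e.g. `keep=lambda n: n.stage == "interpretation"`) or the objective function would, in the spirit of the framework, recover different existing uncertainty measures from the same tree.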

Abstract

Text generation with Large Language Models (LLMs) is inherently uncertain: uncertainty arises not only from the generation step itself, but also from the prompt used and the downstream interpretation. In this work, we provide a formal framework for measuring uncertainty that takes these different aspects into account. Our framework models prompting, generation, and interpretation as interconnected autoregressive processes that can be combined into a single sampling tree. We introduce filters and objective functions that describe how different aspects of uncertainty can be expressed over the sampling tree, and we demonstrate how existing approaches to uncertainty can be expressed through these functions. With our framework we show not only how different methods are formally related and can be reduced to a common core, but also point out additional aspects of uncertainty that have not yet been studied.