Abstract
Even when decoding with temperature $T=0$, large language models (LLMs) can produce divergent outputs for identical inputs. Recent work by Thinking Machines Lab highlights implementation-level sources of nondeterminism, including batch-size variation, kernel non-invariance, and floating-point non-associativity. In this short note we formalize this behavior by introducing the notion of \emph{background temperature} $T_{\mathrm{bg}}$, the effective temperature induced by an implementation-dependent perturbation process that is observed even when the nominal temperature is $T=0$. We provide precise definitions, show how $T_{\mathrm{bg}}$ relates to a stochastic perturbation governed by the inference environment $I$, and propose an empirical protocol to estimate $T_{\mathrm{bg}}$ via the equivalent temperature $T_n(I)$ of an ideal reference system. We conclude with a set of pilot experiments, run on a representative pool of models from the major LLM providers, that demonstrate the idea, and we outline implications for reproducibility, evaluation, and deployment.
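As an illustration of how an equivalent-temperature estimate might be operationalized, the sketch below matches the entropy of observed next-token frequencies (collected over repeated $T=0$ queries) against the entropy of a tempered softmax over reference logits, solving for the matching temperature by bisection. This is a minimal, hypothetical protocol: the logits, token counts, and function names are illustrative assumptions, not the paper's actual estimator.

```python
import math

def softmax_entropy(logits, T):
    """Entropy (in nats) of softmax(logits / T), computed stably."""
    scaled = [l / T for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    Z = sum(exps)
    ps = [e / Z for e in exps]
    return -sum(p * math.log(p) for p in ps if p > 0)

def empirical_entropy(counts):
    """Entropy (in nats) of the empirical token-frequency distribution."""
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values() if c > 0)

def equivalent_temperature(logits, counts, lo=1e-4, hi=10.0, iters=60):
    """Find T such that the tempered softmax over `logits` has the same
    entropy as the observed output frequencies. Uses bisection, relying on
    the fact that softmax entropy is monotone increasing in T."""
    target = empirical_entropy(counts)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if softmax_entropy(logits, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data: reference logits for three candidate tokens, and
# observed outputs over 100 nominally greedy (T=0) runs.
logits = [2.0, 1.0, 0.0]
counts = {"A": 96, "B": 3, "C": 1}
t_eq = equivalent_temperature(logits, counts)
```

Under these toy numbers the slight divergence from pure argmax behavior maps to a small but nonzero equivalent temperature, which is the quantity $T_{\mathrm{bg}}$ is meant to capture.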