The Format Tax

arXiv cs.CL / 4/7/2026


Key Points

  • The paper argues that asking an LLM to output structured formats (JSON/XML/LaTeX/Markdown) should be a formatting choice with no effect on reasoning quality, yet structured-output requirements significantly degrade performance in open-weight models.
  • The study finds that constrained decoding explains only a small part of the loss; most accuracy degradation comes directly from the prompt-level instructions demanding a specific format.
  • Across six open-weight models (and additional comparisons with closed-weight models), the authors show that separating/decoupling reasoning from formatting substantially recovers lost accuracy.
  • The proposed recovery strategies include generating freeform first and then reformatting in a second pass, or allowing extended “thinking” within a single generation before producing the final structured output.
  • Closed-weight models are observed to exhibit little to no “format tax,” implying the issue is largely a current open-weight-model capability gap rather than a fundamental limitation of structured generation.
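The two-pass recovery strategy above can be sketched as a minimal pipeline: the first call reasons freeform with no format instructions, and a second call performs a pure reformatting step. The `generate` function below is a hypothetical stub standing in for an LLM call; it is not the paper's actual code or prompts.

```python
import json

def generate(prompt: str) -> str:
    """Stand-in for an LLM call (hypothetical stub, not the paper's code).

    A real pipeline would send the prompt to a model and return its text.
    """
    if "Reformat" in prompt:
        # Second pass: wrap the freeform answer in the requested JSON shape.
        answer = prompt.split("ANSWER:", 1)[1].strip()
        return json.dumps({"answer": answer})
    # First pass: pretend the model solved the task in freeform prose.
    return "The result is 42."

def solve_decoupled(question: str) -> dict:
    """Two-pass decoupling: reason freeform first, then reformat to JSON."""
    # Pass 1: no format instructions, so reasoning is unconstrained.
    freeform = generate(f"Solve step by step: {question}")
    # Pass 2: a pure formatting task; no new reasoning is required.
    structured = generate(f"Reformat as JSON. ANSWER: {freeform}")
    return json.loads(structured)

print(solve_decoupled("What is 6 x 7?"))
```

The single-generation alternative the paper describes (extended "thinking" before the final structured answer) follows the same principle inside one call: the format constraint is only applied after the reasoning is complete.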

Abstract

Asking a large language model to respond in JSON should be a formatting choice, not a capability tax. Yet we find that structured output requirements (JSON, XML, LaTeX, Markdown) substantially degrade reasoning and writing performance across open-weight models. The research response has focused on constrained decoding, but sampling bias accounts for only a fraction of the degradation. The dominant cost enters at the prompt: format-requesting instructions alone cause most of the accuracy loss, before any decoder constraint is applied. This diagnosis points to a simple principle: decouple reasoning from formatting. Whether by generating freeform first and reformatting in a second pass, or by enabling extended thinking within a single generation, separating the two concerns substantially recovers lost accuracy. Across six open-weight models, four API models, four formats, and tasks spanning math, science, logic, and writing, decoupling recovers most lost accuracy. Notably, most recent closed-weight models show little to no format tax, suggesting the problem is not inherent to structured generation but a gap that current open-weight models have yet to close. Code is available at https://github.com/ivnle/the-format-tax.