RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners

arXiv cs.AI / 5/4/2026

Key Points

The paper introduces RSAT, a training method that teaches small language models (1–8B parameters) to answer table questions step by step, with cell-level citations tying each reasoning step to its evidence in the table.
RSAT uses a two-phase approach: supervised fine-tuning (SFT) to enforce a structured, verifiable JSON reasoning format, followed by GRPO (Group Relative Policy Optimization) to optimize a composite reward centered on faithfulness (checked with natural language inference, NLI), alongside citation validity and parsimony.
Experiments across six models from Qwen 2.5 (1.5B/3B/7B) and Llama 3 (1B/3B/8B) show RSAT boosts faithfulness by 3.7× compared with SFT alone (0.224 → 0.826), while keeping citation validity near-perfect (0.992).
The study finds that post-hoc attribution fails (below 13% format success), indicating that evidence attribution must be built into the reasoning process rather than added after the fact.
Ablation results confirm the faithfulness reward is critical, as removing it sharply reduces faithfulness from 0.97 to 0.03.
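
To make the idea of a structured, cell-cited reasoning trace concrete, here is a minimal Python sketch. The summary does not give the paper's actual JSON schema or its exact definition of citation validity, so the field names (`steps`, `claim`, `cells`, `answer`), the toy table, and the scoring rule (fraction of cited cells that exist in the table) are all assumptions made for illustration.

```python
import json

# Hypothetical table, not taken from the paper.
table = {
    "columns": ["city", "population", "year"],
    "rows": [
        ["Oslo", 709000, 2023],
        ["Bergen", 289000, 2023],
    ],
}

# Hypothetical structured output; every field name here is an assumption.
model_output = json.dumps({
    "steps": [
        {"claim": "Oslo has a population of 709000.",
         "cells": [{"row": 0, "column": "population"}]},
        {"claim": "Bergen has a population of 289000.",
         "cells": [{"row": 1, "column": "population"}]},
        {"claim": "Oslo is larger than Bergen.",
         "cells": [{"row": 0, "column": "population"},
                   {"row": 1, "column": "area"}]},  # "area" is not a column
    ],
    "answer": "Oslo",
})


def parse_trace(text):
    """Return the parsed trace, or None when the output is not in the expected format."""
    try:
        trace = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(trace, dict) or not isinstance(trace.get("steps"), list):
        return None
    return trace


def citation_validity(trace, table):
    """Fraction of cited cells that point at a real cell in the table (assumed definition)."""
    cited = [cell for step in trace["steps"] for cell in step.get("cells", [])]
    if not cited:
        return 0.0
    valid = sum(
        1 for cell in cited
        if cell.get("column") in table["columns"]
        and isinstance(cell.get("row"), int)
        and 0 <= cell["row"] < len(table["rows"])
    )
    return valid / len(cited)


trace = parse_trace(model_output)
score = citation_validity(trace, table)  # 3 of the 4 cited cells exist
```

Because every citation is a (row, column) pair, a check like this can run mechanically on each output, which is what makes the format verifiable.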

Abstract

When a language model answers a table question, users have no way to verify which cells informed which reasoning steps. We introduce RSAT, a method that trains small language models (SLMs, 1–8B) to produce step-by-step reasoning with cell-level citations grounded in table evidence. Phase 1 (SFT) teaches a structured JSON output format from verified reasoning traces. Phase 2 (GRPO) optimizes a composite reward centered on NLI-based faithfulness, alongside citation validity and parsimony. Across six models from two families, Qwen 2.5 (1.5B/3B/7B) and Llama 3 (1B/3B/8B), RSAT improves faithfulness 3.7× over SFT alone (0.224 → 0.826), with near-perfect citation validity (0.992). Post-hoc attribution collapses below 13% format success, confirming that attribution must be integrated into reasoning, not retrofitted. Ablations show the faithfulness reward is essential: removing it drops faithfulness from 0.97 to 0.03.
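
The Phase 2 objective described above can be sketched as a weighted sum of three component scores, followed by the group-relative standardization that GRPO applies to rewards sampled for the same prompt. This is an illustration under stated assumptions, not the authors' implementation: the weights, the assumption that each component lies in [0, 1], and the example scores are hypothetical, and the NLI scorer that would produce the faithfulness value is not modeled here.

```python
from statistics import mean, pstdev

# Hypothetical weights; the source does not state the actual values.
WEIGHTS = {"faithfulness": 0.6, "validity": 0.3, "parsimony": 0.1}


def composite_reward(faithfulness, validity, parsimony, weights=WEIGHTS):
    """Weighted sum of three component scores, each assumed to lie in [0, 1]."""
    return (weights["faithfulness"] * faithfulness
            + weights["validity"] * validity
            + weights["parsimony"] * parsimony)


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one sampled group.

    Uses the population standard deviation; implementations differ on this detail.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical completions sampled for the same table question.
group = [
    composite_reward(0.9, 1.0, 0.8),  # faithful, valid citations
    composite_reward(0.2, 1.0, 0.9),  # valid citations, unfaithful reasoning
    composite_reward(0.7, 0.5, 0.6),  # partly valid citations
    composite_reward(0.1, 0.0, 1.0),  # terse but unsupported
]
advantages = group_relative_advantages(group)
```

With the faithfulness weight set to zero, the second completion (valid citations, unfaithful reasoning) would outscore the first in this toy group, which gives some intuition for why the ablation reports a collapse in faithfulness when that term is removed.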
