RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners

arXiv cs.AI / 5/4/2026


Key Points

  • The paper introduces RSAT, a training method for small language models (1–8B) that generates step-by-step answers to table questions with cell-level citations tied to the evidence.
  • RSAT uses a two-phase approach: SFT to enforce a structured, verifiable JSON reasoning format, and GRPO to optimize for faithfulness (via NLI-based checks) plus citation validity and brevity.
  • Experiments across six models from Qwen 2.5 (1.5B/3B/7B) and Llama 3 (1B/3B/8B) show RSAT boosts faithfulness by 3.7× compared with SFT alone (0.224 → 0.826), while keeping citation validity near-perfect (0.992).
  • The study finds that post-hoc attribution fails (below 13% format success), indicating that evidence attribution must be built into the reasoning process rather than added after the fact.
  • Ablation results confirm the faithfulness reward is critical, as removing it sharply reduces faithfulness from 0.97 to 0.03.
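The cell-level citations described above can be made mechanically checkable. Below is a minimal sketch of what an RSAT-style structured reasoning trace and a citation-validity check might look like; the JSON schema and the `citation_validity` helper are illustrative assumptions, since the paper only specifies that the format is structured and verifiable.

```python
import json

# Hypothetical RSAT-style trace: each reasoning step cites the
# (row, col) table cells it relies on. Schema is an assumption.
trace = json.loads("""
{
  "steps": [
    {"text": "Revenue in 2021 was 50.", "citations": [[1, 2]]},
    {"text": "Revenue in 2022 was 70.", "citations": [[2, 2]]},
    {"text": "Growth is 70 - 50 = 20.", "citations": [[1, 2], [2, 2]]}
  ],
  "answer": "20"
}
""")

def citation_validity(trace, n_rows, n_cols):
    """Fraction of cited cells that actually exist in the table."""
    cited = [c for step in trace["steps"] for c in step["citations"]]
    if not cited:
        return 0.0
    valid = sum(1 for r, c in cited if 0 <= r < n_rows and 0 <= c < n_cols)
    return valid / len(cited)

print(citation_validity(trace, n_rows=3, n_cols=3))  # all citations in bounds -> 1.0
```

A validity score like this can be computed automatically for every generated trace, which is what makes the near-perfect 0.992 citation-validity figure measurable at scale.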

Abstract

When a language model answers a table question, users have no way to verify which cells informed which reasoning steps. We introduce RSAT, a method that trains small language models (SLMs, 1–8B) to produce step-by-step reasoning with cell-level citations grounded in table evidence. Phase 1 (SFT) teaches a structured JSON output format from verified reasoning traces. Phase 2 (GRPO) optimizes a composite reward centered on NLI-based faithfulness, alongside citation validity and parsimony. Across six models from two families, Qwen 2.5 (1.5B/3B/7B) and Llama 3 (1B/3B/8B), RSAT improves faithfulness 3.7× over SFT alone (0.224 → 0.826), with near-perfect citation validity (0.992). Post-hoc attribution collapses below 13% format success, confirming that attribution must be integrated into reasoning, not retrofitted. Ablations show the faithfulness reward is essential: removing it drops faithfulness from 0.97 to 0.03.
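The composite reward in Phase 2 combines the three signals the abstract names: NLI-based faithfulness, citation validity, and parsimony. A minimal sketch of how such a reward could be mixed is below; the weights, the brevity term, and `max_steps` are illustrative assumptions, not values from the paper.

```python
def composite_reward(faithfulness, citation_validity, n_steps,
                     w_faith=0.6, w_cite=0.3, w_brief=0.1, max_steps=10):
    """Hypothetical GRPO reward mixing the three signals RSAT optimizes:
    - faithfulness: NLI-based score in [0, 1] for each step vs. cited cells
    - citation_validity: fraction of citations pointing at real cells
    - parsimony: shorter traces score higher (linear penalty, an assumption)
    Weights are illustrative, not the paper's values."""
    brevity = max(0.0, 1.0 - n_steps / max_steps)
    return w_faith * faithfulness + w_cite * citation_validity + w_brief * brevity
```

Because faithfulness carries the largest weight, removing that term leaves the policy free to emit valid-looking but unsupported citations, consistent with the ablation's collapse from 0.97 to 0.03.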
