Reasoning-Intensive Regression
arXiv cs.CL / 5/4/2026
Key Points
- The paper defines reasoning-intensive regression (RiR) as the task of inferring subtle numerical scores from text, and distinguishes it from standard language regression tasks such as sentiment or similarity scoring.
- It casts four realistic problems as RiR tasks to form an initial benchmark, and uses that benchmark to test the hypothesis that both prompting frozen LLMs and fine-tuning Transformer encoders via gradient descent often perform poorly on RiR.
- The authors propose MENTAT, a lightweight approach that combines batch-reflective prompt optimization with neural ensemble learning to improve RiR performance.
- Experiments show MENTAT can deliver up to a 65% improvement over the two baseline strategies, though substantial room remains for further research on RiR.
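The ensemble idea mentioned in the key points can be illustrated with a toy sketch. This is not the authors' MENTAT implementation: the rollout scores below are synthetic stand-ins for what repeated LLM prompting might return, and a small linear aggregator trained by gradient descent stands in for the neural ensemble. All names (`fit_linear_aggregator`, `rollouts`, and so on) are hypothetical. The sketch only shows why a learned combination of several noisy scores can beat a plain average.

```python
import random

def fit_linear_aggregator(rollouts, targets, lr=0.05, epochs=500):
    """Fit weights w and bias b so that sum(w_i * s_i) + b approximates each target."""
    k = len(rollouts[0])
    n = len(rollouts)
    w = [1.0 / k] * k  # start from the plain mean of the rollout scores
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * k
        grad_b = 0.0
        for scores, y in zip(rollouts, targets):
            err = sum(wi * si for wi, si in zip(w, scores)) + b - y
            for i, si in enumerate(scores):
                grad_w[i] += 2 * err * si / n
            grad_b += 2 * err / n
        # Full-batch gradient descent step on mean squared error
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, scores):
    return sum(wi * si for wi, si in zip(w, scores)) + b

def mse(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

# Synthetic data (an assumption for illustration, not from the paper):
# three "rollouts" per example, each a differently distorted view of the true score.
random.seed(0)
targets = [random.uniform(0.0, 1.0) for _ in range(200)]
rollouts = [
    [
        y + 0.3 + random.gauss(0.0, 0.05),  # consistently too high
        0.5 * y + random.gauss(0.0, 0.05),  # compressed scale
        y + random.gauss(0.0, 0.2),         # unbiased but noisy
    ]
    for y in targets
]

# Baseline: average the rollout scores for each example.
mean_mse = mse([sum(s) / len(s) for s in rollouts], targets)

# Learned aggregation: fit weights and a bias, then score the same examples.
w, b = fit_linear_aggregator(rollouts, targets)
learned_mse = mse([predict(w, b, s) for s in rollouts], targets)

print(f"plain mean MSE:         {mean_mse:.4f}")
print(f"learned aggregator MSE: {learned_mse:.4f}")
```

Both errors here are measured on the same examples used for fitting, so the comparison is in-sample. A real evaluation would hold out data, and a neural aggregator would replace the linear one.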