TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas
arXiv cs.AI / 3/18/2026
Key Points
- TRUST-SQL introduces a tool-augmented reinforcement learning framework for Text-to-SQL in the Unknown Schema setting, grounding queries in verified metadata rather than pre-loading the full schema.
- It formulates the task as a Partially Observable Markov Decision Process (POMDP), structures interaction with a four-phase protocol, and trains with a Dual-Track GRPO strategy that separates exploration rewards from execution outcomes.
- The approach yields a 9.9% relative improvement over standard GRPO, and average absolute improvements of 30.6% (4B model) and 16.6% (8B model) over the respective base models, despite not using pre-loaded metadata.
- Extensive experiments across five benchmarks demonstrate that TRUST-SQL matches or surpasses strong baselines that rely on schema prefilling.
- By addressing the Unknown Schema scenario common in enterprise databases, the framework efficiently identifies the relevant subset of the schema and reduces the need for upfront metadata.
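
To make "grounding in verified metadata" concrete, here is a minimal sketch of schema exploration through tools, assuming a SQLite database. The helper names (`list_tables`, `describe_table`, `answer_total_per_customer`), the demo tables, and the hard-coded exploration order are all hypothetical illustrations. They are not TRUST-SQL's actual tools or four-phase protocol, which the summary above does not specify, and no learned policy or reward is involved.

```python
import sqlite3


def list_tables(conn):
    """Hypothetical tool: return table names without loading any column metadata."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [r[0] for r in rows]


def describe_table(conn, table):
    """Hypothetical tool: return (column, type) pairs for one table."""
    if table not in list_tables(conn):
        raise ValueError(f"unknown table: {table}")
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [(r[1], r[2]) for r in rows]


def build_demo_db():
    """Create a small in-memory database (assumed demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
        CREATE TABLE audit_log (id INTEGER PRIMARY KEY, message TEXT);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Lin');
        INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 5.0), (3, 2, 7.5);
        """
    )
    return conn


def answer_total_per_customer(conn):
    """Stand-in for an agent: inspect only the tables it needs, then query."""
    verified = {}
    tables = list_tables(conn)
    # A trained policy would choose what to inspect; here the choice is hard-coded.
    for table in ("customers", "orders"):
        if table in tables:
            verified[table] = describe_table(conn, table)
    # The SQL below uses only columns confirmed by describe_table.
    sql = (
        "SELECT c.name, SUM(o.total) FROM customers c "
        "JOIN orders o ON o.customer_id = c.id "
        "GROUP BY c.name ORDER BY c.name"
    )
    return verified, conn.execute(sql).fetchall()
```

In this sketch, `audit_log` is never described, so its columns never enter the agent's context. That is the sense in which exploration can surface only the relevant subset of a schema instead of prefilling all of it.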