R$^3$-SQL: Ranking Reward and Resampling for Text-to-SQL

arXiv cs.CL / 4/29/2026

Key Points

The paper introduces R$^3$-SQL, a Text-to-SQL framework that improves how systems rank generated SQL candidates.
It tackles inconsistent scoring of functionally equivalent SQL by grouping candidates with identical execution results and ranking those groups for consistency.
The proposed ranking reward combines pairwise preferences across result groups with a pointwise utility derived from the best group’s rank and size to capture relative preference, consistency, and candidate quality.
To address cases where the correct SQL is missing from the candidate pool, it adds agentic resampling that evaluates the pool and selectively resamples when the correct answer is likely absent.
Experiments report 75.03 execution accuracy on BIRD-dev, which the authors describe as a new state of the art among methods using models with disclosed sizes, with consistent gains across five benchmarks.
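The grouping step described above can be illustrated with a minimal sketch. It assumes a SQLite database and treats two candidates as equivalent when they return the same rows regardless of order. The helper names `execution_signature` and `group_by_execution`, the toy `singer` table, and the order-insensitive comparison are all assumptions made for illustration, not details taken from the paper.

```python
import sqlite3
from collections import defaultdict


def execution_signature(conn, sql):
    """Return a hashable summary of a query's result, or None if it fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Assumption: row order is ignored, so ORDER BY differences are not detected.
    return tuple(sorted(rows, key=repr))


def group_by_execution(conn, candidates):
    """Bucket candidates whose execution results match; largest group first."""
    groups = defaultdict(list)
    for sql in candidates:
        signature = execution_signature(conn, sql)
        if signature is not None:  # drop candidates that fail to execute
            groups[signature].append(sql)
    return sorted(groups.values(), key=len, reverse=True)


# Toy database and candidate pool (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bo', 41), (3, 'Cy', 25);
""")
candidates = [
    "SELECT name FROM singer WHERE age > 28",
    "SELECT name FROM singer WHERE NOT age <= 28",  # equivalent to the first
    "SELECT name FROM singer WHERE age >= 41",
    "SELECT nme FROM singer",                        # invalid column, discarded
]
groups = group_by_execution(conn, candidates)
for group in groups:
    print(len(group), group)
```

Here the first two candidates differ textually but land in the same group, so any score assigned at the group level is consistent across them by construction.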

Abstract

Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to select a final prediction. However, existing methods face two limitations. First, they often score functionally equivalent SQL queries inconsistently despite identical execution results. Second, ranking cannot recover when the correct SQL is absent from the candidate pool. We propose R$^3$-SQL, a Text-to-SQL framework that addresses both issues through a unified reward for ranking and resampling. R$^3$-SQL first groups candidates by execution result and ranks groups for consistency. To score each group, it combines a pairwise preference across groups with a pointwise utility from the best group rank and size, capturing relative preference, consistency, and candidate quality. To improve candidate recall, R$^3$-SQL introduces agentic resampling, which judges the generated candidate pool and selectively resamples when the correct SQL is likely absent. R$^3$-SQL achieves 75.03 execution accuracy on BIRD-dev, a new state of the art among methods using models with disclosed sizes, with consistent gains across five benchmarks.
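The control flow of selective resampling can be sketched as a loop that keeps drawing candidates until a judge deems the pool adequate. This is only a rough illustration: the abstract does not specify how the pool is judged or how groups are scored, so the sketch substitutes a simple agreement threshold for the judge and majority vote for the learned ranking reward. The names `select_with_resampling`, `generate`, `execute`, and `pool_is_adequate` are hypothetical.

```python
from collections import Counter


def select_with_resampling(generate, execute, pool_is_adequate, n=2, max_rounds=3):
    """Sample candidates in rounds, stopping once the pool is judged adequate.

    generate(n)              -> list of candidate SQL strings (hypothetical sampler)
    execute(sql)             -> hashable execution result, or None on failure
    pool_is_adequate(counts) -> bool, a stand-in for the paper's pool judge
    """
    counts = Counter()   # execution result -> number of candidates producing it
    first_sql = {}       # execution result -> first candidate that produced it
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        for sql in generate(n):
            result = execute(sql)
            if result is None:
                continue
            counts[result] += 1
            first_sql.setdefault(result, sql)
        if counts and pool_is_adequate(counts):
            break
    if not counts:
        return None, rounds
    # Stand-in for the learned ranker: pick the largest result group.
    best_result, _ = counts.most_common(1)[0]
    return first_sql[best_result], rounds


# Toy setup: canned batches and canned execution results (hypothetical data).
batches = iter([["q_bad1", "q_bad2"], ["q_good", "q_good2", "q_good3"], ["q_bad3"]])
fake_results = {"q_bad1": "A", "q_bad2": "B",
                "q_good": "C", "q_good2": "C", "q_good3": "C"}


def generate(n):
    return next(batches)  # the toy sampler ignores n


def adequate(counts):
    # Assumed heuristic: the largest group holds at least 55% of the pool.
    return max(counts.values()) / sum(counts.values()) >= 0.55


chosen, rounds = select_with_resampling(generate, fake_results.get, adequate)
print(chosen, rounds)
```

In this toy run the first round produces two disagreeing candidates, so the loop resamples; the second round adds three agreeing candidates and sampling stops without touching the third batch.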
