PV-SQL: Synergizing Database Probing and Rule-based Verification for Text-to-SQL Agents

arXiv cs.AI / 4/21/2026


Key Points

  • PV-SQL is a new agentic text-to-SQL framework designed to improve performance on complex queries that require deeper contextual understanding.
  • It has two parts. The first, Probe, iteratively issues probing queries to fetch concrete database records, resolving ambiguities in value formats, column meanings, and relationships across tables.
  • It complements this with Verify, a rule-based component that extracts verifiable conditions and builds an executable checklist to guide iterative SQL refinement.
  • On the BIRD benchmarks, PV-SQL improves execution accuracy by 5% and valid efficiency score by 20.8% over the strongest text-to-SQL baseline, while using fewer tokens.
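
To make the probing idea concrete, here is a minimal sketch of a single probing query, assuming an SQLite database. The helper names (`quote_ident`, `probe_distinct_values`) and the demo `schools` table are hypothetical and not taken from the paper. It only illustrates how sampling concrete values can reveal a column's value format before the final SQL is written.

```python
import sqlite3


def quote_ident(name: str) -> str:
    """Quote an SQLite identifier so it can be interpolated safely."""
    return '"' + name.replace('"', '""') + '"'


def probe_distinct_values(conn, table, column, limit=5):
    """Hypothetical probe: return up to `limit` distinct non-null values."""
    col = quote_ident(column)
    sql = (
        f"SELECT DISTINCT {col} FROM {quote_ident(table)} "
        f"WHERE {col} IS NOT NULL ORDER BY 1 LIMIT ?"
    )
    return [row[0] for row in conn.execute(sql, (limit,))]


def build_demo_db():
    """Create a tiny in-memory database (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE schools (name TEXT, county TEXT, charter TEXT)")
    conn.executemany(
        "INSERT INTO schools VALUES (?, ?, ?)",
        [
            ("Alpha High", "Alameda", "Y"),
            ("Beta Middle", "Alameda", "N"),
            ("Gamma Prep", "Fresno", "Y"),
        ],
    )
    return conn


conn = build_demo_db()
# The probe shows that `charter` stores 'Y'/'N', not 1/0 or 'yes'/'no'.
charter_values = probe_distinct_values(conn, "schools", "charter")
print(charter_values)
```

In an agentic loop, results like these would presumably be fed back into the model's context, so that a filter such as "charter schools" is written against the stored value format instead of a guessed one.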

Abstract

Text-to-SQL systems often struggle with deep contextual understanding, particularly for complex queries with subtle requirements. We present PV-SQL, an agentic framework that addresses these failures through two complementary components: Probe and Verify. The Probe component iteratively generates probing queries to retrieve concrete records from the database, resolving ambiguities in value formats, column semantics, and inter-table relationships to build richer contextual understanding. The Verify component employs a rule-based method to extract verifiable conditions and construct an executable checklist, enabling iterative SQL refinement that effectively reduces missing constraints. Experiments on the BIRD benchmarks show that PV-SQL outperforms the best text-to-SQL baseline by 5% in execution accuracy and 20.8% in valid efficiency score while consuming fewer tokens.
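
The checklist idea behind Verify can be sketched as follows. This is an illustration under stated assumptions, not the paper's actual rule set: the two rules (quoted values must appear in the SQL; "top N" implies `ORDER BY` and `LIMIT N`) and the names `extract_checklist` and `failed_checks` are hypothetical. Each check here is a simple predicate over the SQL text.

```python
import re
from typing import Callable, List, Tuple

# A check is a human-readable description plus a predicate over the SQL text.
Check = Tuple[str, Callable[[str], bool]]


def extract_checklist(question: str) -> List[Check]:
    """Derive verifiable conditions from the question using assumed rules."""
    checks: List[Check] = []
    # Assumed rule 1: each single-quoted value in the question must appear in the SQL.
    for value in re.findall(r"'([^']+)'", question):
        checks.append((
            f"mentions value '{value}'",
            lambda sql, v=value: v.lower() in sql.lower(),
        ))
    # Assumed rule 2: "top N" requires both ORDER BY and LIMIT N.
    match = re.search(r"\btop (\d+)\b", question, re.IGNORECASE)
    if match:
        n = match.group(1)
        checks.append((
            "has ORDER BY",
            lambda sql: re.search(r"\border\s+by\b", sql, re.IGNORECASE) is not None,
        ))
        checks.append((
            f"has LIMIT {n}",
            lambda sql, n=n: re.search(rf"\blimit\s+{n}\b", sql, re.IGNORECASE) is not None,
        ))
    return checks


def failed_checks(question: str, sql: str) -> List[str]:
    """Return descriptions of checklist items the candidate SQL does not satisfy."""
    return [desc for desc, check in extract_checklist(question) if not check(sql)]


question = "List the top 3 schools in 'Alameda' by enrollment."
draft = "SELECT name FROM schools WHERE county = 'Alameda'"
revised = draft + " ORDER BY enrollment DESC LIMIT 3"

print(failed_checks(question, draft))    # items still missing from the draft
print(failed_checks(question, revised))  # empty once the constraints are added
```

The failed items could serve as targeted feedback for the next refinement round, which is one way a checklist can reduce missing constraints without regenerating the query from scratch.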