Gemini-SQL2 tops BIRD: 80% accuracy crossed

Gemini-SQL2, built on Gemini 3.1 Pro, has scored 80.04% execution accuracy on the BIRD text-to-SQL benchmark, taking the top spot in the single-model category. At that level, LLM-driven BI becomes viable for standard queries, provided a human review step stays in place.

AI Navigate Editorial·2026.06.13·6 min read

Background

The 60–70% ceiling held for years, until now

Text-to-SQL translates natural-language questions into SQL queries — a core need for LLM integration with BI dashboards and analytics tools. Despite significant interest, accuracy had been stuck in the 60–70% range for years. The error rate was high enough that human SQL review remained a mandatory step in any real analytical workflow, limiting practical adoption.

Gemini-SQL2, based on Gemini 3.1 Pro, achieved 80.04% execution accuracy on the BIRD text-to-SQL benchmark, taking the top position in the single-model category. A schema-grounding implementation pattern has also been published.
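
For readers unfamiliar with the metric, execution accuracy is generally measured by running both the predicted query and a reference query and checking whether the results match, rather than comparing the SQL text. The following is a minimal sketch of that idea, assuming SQLite and an order-insensitive comparison of result rows. The function names (`run_query`, `execution_accuracy`, `demo_accuracy`) and the sample table are illustrative assumptions, not the benchmark's official evaluation script.

```python
import sqlite3

def run_query(conn, sql):
    """Return the result rows as a set, or None if the query fails."""
    try:
        return set(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None

def execution_accuracy(conn, pairs):
    """Fraction of (predicted, reference) pairs whose result sets match."""
    if not pairs:
        return 0.0
    correct = 0
    for predicted_sql, reference_sql in pairs:
        expected = run_query(conn, reference_sql)
        actual = run_query(conn, predicted_sql)
        # A failed prediction (None) never matches a valid reference result.
        if expected is not None and actual == expected:
            correct += 1
    return correct / len(pairs)

def demo_accuracy():
    # Hypothetical sample data, used only for illustration.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
        INSERT INTO orders VALUES (1, 'EU', 10.0), (2, 'EU', 5.0), (3, 'US', 7.0);
    """)
    pairs = [
        # Different SQL text, same result: counts as correct.
        ("SELECT COUNT(*) FROM orders WHERE region = 'EU'",
         "SELECT COUNT(id) FROM orders WHERE region IN ('EU')"),
        # Wrong filter: the result differs, so it counts as incorrect.
        ("SELECT SUM(amount) FROM orders WHERE region = 'US'",
         "SELECT SUM(amount) FROM orders WHERE region = 'EU'"),
        # Invalid SQL: the query fails, so it counts as incorrect.
        ("SELECT amount FROM order", "SELECT amount FROM orders"),
    ]
    return execution_accuracy(conn, pairs)
```

In this toy set, one of the three predictions matches its reference result, so `demo_accuracy()` returns one third. The same scoring applied across a full benchmark yields figures like the one reported here.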

Schema Grounding

Feeding schema precisely is the accuracy key

Alongside the model itself, a schema-grounding pattern has been published. It sets out how to give the model exactly the schema information it needs to generate correct SQL, and teams can apply it now.

FIG. Precise schema input drives the accuracy gain — the pattern is now published for others to apply.

The published schema-grounding pattern describes how to communicate table definitions, column types, and foreign-key relationships to the model in a way that maximises SQL correctness. Even teams not using Gemini may find the pattern applicable to improving text-to-SQL accuracy with other models.
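
The article does not reproduce the published pattern, so the following is only a sketch of the general idea: read table definitions, column types, and foreign keys from the database itself and place them in the prompt verbatim. It assumes a SQLite database. The helper names (`describe_schema`, `build_prompt`, `demo_prompt`), the prompt wording, and the sample tables are hypothetical and should not be read as the published format.

```python
import sqlite3

def describe_schema(conn):
    """Render tables, column types, and foreign keys as plain text."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    lines = []
    for table in tables:
        lines.append(f"TABLE {table}")
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for _, name, col_type, _, _, pk in conn.execute(
                f'PRAGMA table_info("{table}")'):
            suffix = " PRIMARY KEY" if pk else ""
            lines.append(f"  {name} {col_type}{suffix}")
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            lines.append(f"  FOREIGN KEY {fk[3]} -> {fk[2]}.{fk[4]}")
    return "\n".join(lines)

def build_prompt(question, schema_text):
    """Combine the schema description and the user question (assumed wording)."""
    return (
        "Use only the tables and columns listed below.\n\n"
        f"{schema_text}\n\n"
        f"Question: {question}\nSQL:"
    )

def demo_prompt():
    # Hypothetical two-table schema, used only for illustration.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            amount REAL
        );
    """)
    return build_prompt("Total order amount per customer?", describe_schema(conn))
```

Reading the schema from the live database rather than from hand-written notes keeps the prompt in step with the actual tables, which is one plausible reason precise schema input helps regardless of the model used.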

Practical Impact

LLM-driven BI becomes viable for standard queries

80% accuracy is not perfect, but it covers the majority of standard BI and analytics queries. Teams that want a Google-native path to text-to-SQL, without going through a third-party platform such as Snowflake or Databricks, now have that option. Roughly one query in five is still wrong, so a human review layer remains necessary. Designing that gate is the implementation work to do before going live.
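
One way such a gate might start is with cheap static checks that decide which generated queries can run unattended and which go to a person. The sketch below assumes SQLite and a deliberately conservative policy: only a single `SELECT` statement that compiles against the real schema is auto-approved. The function name `review_gate` and the outcome labels are hypothetical. These checks catch only syntax and schema errors; a query that runs cleanly but answers the wrong question still needs other safeguards, such as sampling results for manual review.

```python
import sqlite3

def review_gate(conn, sql):
    """Return (decision, reason) for a generated SQL string."""
    statement = sql.strip().rstrip(";").strip()
    if not statement:
        return ("human_review", "empty query")
    # Conservative: any remaining semicolon is treated as multiple statements,
    # even if it sits inside a string literal.
    if ";" in statement:
        return ("human_review", "multiple statements")
    first_word = statement.split(None, 1)[0].upper()
    if first_word != "SELECT":
        return ("human_review", "not a plain SELECT")
    try:
        # EXPLAIN QUERY PLAN compiles the statement without running it,
        # so unknown tables or columns surface as errors here.
        conn.execute("EXPLAIN QUERY PLAN " + statement)
    except sqlite3.Error as exc:
        return ("human_review", f"does not compile: {exc}")
    return ("auto_run", "passed static checks")
```

For example, against a database containing an `orders` table, `review_gate(conn, "SELECT COUNT(*) FROM orders;")` is approved to run, while `DROP TABLE orders` or a query against a non-existent table is routed to human review.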
