Database world trying to build natural language query systems again – this time with LLMs
Text-to-SQL might be useful for analysts and DBAs, but be cautious with general user adoption
Over the past few years, database and analytics vendors have hopped on a bandwagon that may take us all to a destination where common data queries are free from the constraints of the specialist query language SQL.
Earlier this month, for example, AWS promised "a natural text-to-SQL solution … that transforms business questions into database queries and returns actionable answers."
However, while text-to-SQL tools might save time for database developers, who already know SQL, organizations might wish to exercise caution before letting business users loose with them, warned Nick Koudas, professor in the Department of Computer Science at the University of Toronto.
Speaking to The Register, Koudas, who researches natural language SQL systems, said the problem is that, for a business user, these systems can produce queries with the right syntax that nonetheless don't reflect the question the user meant to ask.
SQL was designed to be as close as possible to natural language, its co-creator Donald Chamberlin told The Register in 2024. However, given the limitations and ambiguities of natural English, some compromise was necessary in creating the more-or-less ubiquitous data language we see today.
Because SQL skills are in limited supply, vendors have stepped in to fill the gap, betting that LLMs would be a good way to build a natural language data system.
AWS, for example, described how users could develop a text-to-SQL proof of concept using its Bedrock platform to build generative AI applications and agents.
"Many organizations find that accessing data insights remains a significant bottleneck in business decision-making processes. The traditional approach requires either learning SQL syntax, waiting for technical resources, or settling for pre-built dashboards that might not answer your specific questions," it said.
Such a tool might help remove the SQL expertise barrier that blocks rapid analysis, it argues. "Most business users lack the technical SQL knowledge needed to access complex data. Simple questions often require multi-table joins, temporal calculations, and hierarchical aggregations. This dependency creates bottlenecks where business users wait extended periods for custom reports, while analysts spend valuable time on repetitive query requests rather than strategic analysis," it said.
AWS is not alone in this ambition. Back in 2024, cloud data platform provider Snowflake built a service it claimed would help organizations create chatbots that can serve up data both from its own analytics systems and from external sources. The approach employs the vendor-managed AI service Cortex.
The company has since introduced Cortex Analyst, which adds a semantic model that "connects business terms to database schema," according to the vendor. "It guides LLMs to build more accurate SQL queries by including the right elements, such as joins and filters," it said.
Even MongoDB — a NoSQL document database — has its own text-to-MongoDB-query API, built on the LangChain software framework.
The tech industry has produced resources to help developers decide which tools might help in building text-to-SQL systems. The benchmark BIRD-SQL, for example, claims to offer a cross-domain dataset that examines the impact of extensive database contents on text-to-SQL parsing. The current leader is a study using AskData and OpenAI's GPT-4o, with nearly 82 percent execution accuracy, rated against expert human performance of about 93 percent.
Nonetheless, Koudas warns of the over-reliance on such systems.
"SQL has been the language for databases for more than 50 years. People have envisioned, even before LLM, to try make it even more accessible than it is by essentially translating natural language to SQL. In terms of pure research, people were trying to do it, and showing how difficult it is. Now with LLMs, because of the deep semantic understanding of pre-training, there is rapid development," he said.
He explained that the approach usually happens in two steps. The first is to try to map the natural language to the schema of the database, in order to understand which tables and which attributes are involved in a query. The second phase is to generate the SQL.
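The two-phase pipeline he describes can be sketched in a few lines of Python. This is a toy illustration only: the schema, the keyword-matching "linker," and the template-based generator are all invented stand-ins for what a production system would delegate to an LLM prompted with the linked schema.

```python
# Phase 1 links words in the question to tables/columns in the schema;
# phase 2 generates SQL from the linked elements. The "LLM" is stubbed
# out with simple keyword matching purely to show the pipeline shape.

SCHEMA = {
    "orders": ["order_id", "customer_id", "order_date", "total"],
    "customers": ["customer_id", "name", "region"],
}

def link_schema(question: str, schema: dict) -> dict:
    """Phase 1: find which tables and columns the question mentions."""
    words = question.lower().split()
    hits = {}
    for table, columns in schema.items():
        matched = [c for c in columns
                   if any(c.split("_")[0] in w for w in words)]
        if matched or table.rstrip("s") in question.lower():
            hits[table] = matched
    return hits

def generate_sql(question: str, links: dict) -> str:
    """Phase 2: a real system would prompt an LLM with the linked
    schema; here we just emit a trivial SELECT over the linked tables."""
    tables = ", ".join(links)
    cols = ", ".join(c for cs in links.values() for c in cs) or "*"
    return f"SELECT {cols} FROM {tables};"

links = link_schema("total per region", SCHEMA)
print(generate_sql("total per region", links))
# -> SELECT total, region FROM orders, customers;
```

The separation matters because most failures happen in phase 1: if the wrong table or column is linked, phase 2 can still produce fluent, valid SQL over the wrong data.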
"Pretty much every database vendor has already implemented this functionality in one form or another. If you go to companies such as Snowflake or Databricks, they have some sort of a text-to-SQL interface where people can just use natural language to get an SQL query," he said.
The current out-of-the-box accuracy limit of around 80 percent comes from the fact that most organizations run proprietary datasets and proprietary schemas with company-specific terminology. Without further training, LLMs have no access to these definitions, he said.
Nonetheless, the tools can still save time for database developers and administrators, as they have the skills to check the LLM-produced SQL and verify it.
"Think of it as a utility to a database developer that knows SQL and is able to judge if the query that comes back from the LLM is a correct query, if there is a mistake, to fix it," Koudas said.
But using text-to-SQL systems without someone in the loop who understands SQL can be risky, he said. A query that produces syntactically wrong SQL is relatively harmless: it simply won't run.
"There is also always the chance for LLM, after a couple of tries, to create an SQL query that is actually perfectly syntactically correct but semantically wrong. If you trust it and you get back something that doesn't bear any connection to [the query], then you have a big problem, and I think people understand that," he said.
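The failure mode is easy to demonstrate. In the toy sqlite3 example below (table and data invented for illustration), both queries are valid SQL and both run without error, but only one actually answers "how many customers placed an order?".

```python
import sqlite3

# A toy orders table: customer 100 placed two orders, customer 200 one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 100), (2, 100), (3, 200)])

# Semantically wrong: counts order rows, not customers.
wrong = conn.execute(
    "SELECT COUNT(customer_id) FROM orders").fetchone()[0]

# Semantically right: counts distinct customers.
right = conn.execute(
    "SELECT COUNT(DISTINCT customer_id) FROM orders").fetchone()[0]

print(wrong, right)  # 3 2
```

A user who trusts the first query gets 3 where the true answer is 2, with no error message to suggest anything went wrong, which is precisely why Koudas argues an SQL-literate reviewer still matters.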
To improve accuracy, or at least mitigate the risk of using these systems, current research focuses on putting a human back in the loop. Rather than an SQL expert, though, it is the end user from whom the LLM might seek further clarification.
"Natural language is ambiguous, and it has a lot of nuances. So when the LLM takes the natural language and starts producing the SQL, you build in a mechanism inside the inference of the LLM which — while producing your SQL query — understands that the next token it generates will create extreme uncertainty. Because it captures its own uncertainty, and because of the schema mapping, when you have an uncertain token, you can take that token, it can go ahead and analyze the schema and ask a natural language question back to the user. If it's uncertain, it says, 'Do you mean this or that in your question? or is this attribute part of your answer?'
"With a synergy between the human and the LLM, the human can help the LLM complete the job correctly," he said.
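The mechanism Koudas outlines can be sketched as an entropy check on the model's next-token distribution: if the distribution is too flat, decoding pauses and the system asks the user which schema element they meant. The probabilities, column names, and threshold below are hard-coded stand-ins for real LLM logits, purely to illustrate the control flow.

```python
import math

def entropy(probs):
    """Shannon entropy (bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def next_token(candidates, threshold=0.9):
    """candidates: {token: probability} for the next SQL token.
    Returns the chosen token, or a clarification question when the
    distribution is too uncertain to commit."""
    if entropy(candidates.values()) > threshold:
        options = " or ".join(candidates)
        return f"CLARIFY: do you mean {options}?"
    return max(candidates, key=candidates.get)

# Confident step: one column dominates, so decoding proceeds.
print(next_token({"order_date": 0.95, "ship_date": 0.05}))
# -> order_date

# Uncertain step: two equally plausible columns, so ask the user.
print(next_token({"order_date": 0.5, "ship_date": 0.5}))
# -> CLARIFY: do you mean order_date or ship_date?
```

The design point is that the question is posed in natural language over schema elements, so the business user can resolve the ambiguity without knowing any SQL.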
But for the moment, most use cases for text-to-SQL are similar to using LLMs for software development: they are tools for increasing productivity rather than replacing developers, Koudas said.
Despite the benefits of text-to-SQL LLM systems, the 50-year search for a completely natural way to query and manage data is not yet at an end. ®
