Semantic Layers for Reliable LLM-Powered Data Analytics: A Paired Benchmark of Accuracy and Hallucination Across Three Frontier Models
arXiv cs.AI / 4/29/2026
Key Points
- The study argues that LLM-based natural-language analytics fail due to a shared root cause: the model must infer business semantics that the database schema does not encode, leading to both wrong answers and confident hallucinations.
- It benchmarks three frontier models (Claude Opus 4.7, Claude Sonnet 4.6, and GPT-5.4) on 100 questions using ClickHouse over the Cleaned Contoso Retail Dataset, comparing schema-only prompting versus schema plus a 4KB hand-authored “semantic layer” markdown document.
- Adding the semantic-layer document boosts accuracy by roughly +17 to +23 percentage points for every model, curbing hallucination-prone behavior by grounding the model's interpretation in explicit definitions.
- All three models cluster tightly in both conditions: 67.7–68.7% accuracy with the document and 45.5–50.5% without it, and every with-versus-without comparison is significant at p < 0.01.
- The authors conclude the key driver is the explicit business-semantics input itself: it changes the task the model is asked to perform, suppressing the dominant text-to-SQL error mode more than differences in model capability or model selection.