Structure-Grounded Knowledge Retrieval via Code Dependencies for Multi-Step Data Reasoning
arXiv cs.CL / 4/14/2026
Key Points
- The paper argues that retrieval-augmented LLM approaches often select context by lexical or embedding similarity, which can be a poor proxy for the knowledge that multi-step data reasoning actually requires.
- It proposes SGKR (Structure-Grounded Knowledge Retrieval), which builds a dependency graph of domain knowledge based on function-call relationships rather than textual similarity alone.
- For a given question, SGKR derives semantic input/output tags, finds dependency paths connecting them, and assembles a task-relevant subgraph plus corresponding function implementations as structured context for LLM code generation.
- Experiments on multi-step data analysis benchmarks show SGKR improves solution correctness compared with no-retrieval and similarity-based retrieval baselines, for both vanilla LLMs and coding agents.
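
The retrieval idea in the points above can be illustrated with a minimal sketch. It is not the paper's implementation: the toy function library, the tag names (`sales_file`, `growth`), and the helpers `build_call_graph`, `find_path`, and `retrieve_context` are all hypothetical, and the tag-to-function mapping is hard-coded here, whereas the summary says SGKR derives the tags from the question. The sketch only shows the general shape: build a graph from function-call relationships, connect the output tag to the input tag by a dependency path, and return the functions on that path with their source as context.

```python
import ast
from collections import deque

# Hypothetical domain-knowledge library; names and bodies are illustrative only.
LIBRARY_SOURCE = '''
def load_sales(path):
    return [{"month": 1, "revenue": 100.0}, {"month": 2, "revenue": 120.0}]

def monthly_revenue(path):
    rows = load_sales(path)
    return {r["month"]: r["revenue"] for r in rows}

def growth_rate(path):
    rev = monthly_revenue(path)
    months = sorted(rev)
    return {m: rev[m] / rev[p] - 1 for p, m in zip(months, months[1:])}

def format_currency(x):
    return f"${x:,.2f}"
'''

# Assumed semantic tags (hard-coded for the sketch).
INPUT_TAGS = {"sales_file": ["load_sales"]}   # functions that consume this input
OUTPUT_TAGS = {"growth": ["growth_rate"]}     # functions that produce this output


def build_call_graph(source):
    """Map each top-level function to the library functions it calls."""
    tree = ast.parse(source)
    funcs = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    graph = {name: set() for name in funcs}
    for name, node in funcs.items():
        for sub in ast.walk(node):
            if (isinstance(sub, ast.Call)
                    and isinstance(sub.func, ast.Name)
                    and sub.func.id in funcs):
                graph[name].add(sub.func.id)
    return graph, funcs


def find_path(graph, start, goal):
    """Breadth-first search for a shortest caller-to-callee path."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def retrieve_context(source, input_tag, output_tag):
    """Collect the functions on paths from output-tag to input-tag functions."""
    graph, funcs = build_call_graph(source)
    selected = []
    for producer in OUTPUT_TAGS.get(output_tag, []):
        for consumer in INPUT_TAGS.get(input_tag, []):
            for name in find_path(graph, producer, consumer) or []:
                if name not in selected:
                    selected.append(name)
    snippets = [ast.get_source_segment(source, funcs[n]) for n in selected]
    return selected, "\n\n".join(snippets)


names, context = retrieve_context(LIBRARY_SOURCE, "sales_file", "growth")
print(names)
```

In this toy library, `format_currency` is textually close to sales and revenue topics but is not on the call path, so it is left out of the assembled context. The resulting `context` string could then be placed in an LLM prompt for code generation.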