SHAPE: Unifying Safety, Helpfulness and Pedagogy for Educational LLMs
arXiv cs.CL / 4/27/2026
Key Points
- The paper highlights a vulnerability in educational LLMs called “pedagogical jailbreaks,” where students use prompts to force direct answers instead of receiving scaffolded instruction.
- It formalizes what it means for educational LLMs to be safe, helpful, and pedagogical using a knowledge-mastery graph, aiming to enable systematic evaluation and research.
- The authors introduce SHAPE, a benchmark containing 9,087 student-question pairs designed to test tutoring behavior under adversarial conditions.
- They propose a graph-augmented tutoring pipeline that infers prerequisite concepts, detects mastery gaps, and uses explicit gating to choose between instructing and problem-solving.
- Experiments across multiple LLMs show improved safety in two jailbreak settings while keeping helpfulness near the top level under the same evaluation protocol. The code and data are released publicly.
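
To make the gating idea in the key points concrete, here is a minimal sketch in Python. It is not the authors' implementation: the graph contents, the mastery scores, the 0.7 threshold, and the function names (`prerequisites`, `mastery_gaps`, `gate`) are all hypothetical and chosen only for illustration. It assumes the knowledge-mastery graph can be approximated as a mapping from each concept to its direct prerequisites, plus a per-student mastery score in [0, 1].

```python
from typing import Dict, List, Set

# Hypothetical prerequisite graph: concept -> direct prerequisites.
PREREQS: Dict[str, List[str]] = {
    "quadratic_formula": ["factoring", "square_roots"],
    "factoring": ["multiplication"],
    "square_roots": ["multiplication"],
    "multiplication": [],
}


def prerequisites(concept: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Return all direct and transitive prerequisites of a concept."""
    seen: Set[str] = set()
    stack = list(graph.get(concept, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def mastery_gaps(
    concept: str,
    graph: Dict[str, List[str]],
    mastery: Dict[str, float],
    threshold: float = 0.7,  # assumed cutoff, not taken from the paper
) -> List[str]:
    """List prerequisites whose mastery score falls below the threshold.

    Concepts with no recorded score are treated as unmastered (0.0).
    """
    return sorted(
        c for c in prerequisites(concept, graph)
        if mastery.get(c, 0.0) < threshold
    )


def gate(
    concept: str,
    graph: Dict[str, List[str]],
    mastery: Dict[str, float],
    threshold: float = 0.7,
) -> str:
    """Choose a tutoring mode: teach the gaps first, or solve the problem."""
    return "instruct" if mastery_gaps(concept, graph, mastery, threshold) else "solve"


if __name__ == "__main__":
    student = {"multiplication": 0.9, "factoring": 0.4, "square_roots": 0.8}
    print(mastery_gaps("quadratic_formula", PREREQS, student))
    print(gate("quadratic_formula", PREREQS, student))
```

Because the gate depends only on the graph and the recorded mastery scores, not on how the request is phrased, a prompt that merely insists on a direct answer would not change the outcome in this sketch. Whether the paper's pipeline works this way in detail is an assumption here.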