Pangu-ACE: Adaptive Cascaded Experts for Educational Response Generation on EduBench
arXiv cs.CL · April 17, 2026
Key Points
- Pangu-ACE is an educational response generation system that spends extra compute only when needed, using a sample-level cascade from a 1B "tutor-router" to a 7B specialist model.
- The pipeline generates a draft answer and routing signals with the 1B model, then either accepts the draft or escalates each sample to the 7B expert based on task-dependent routing decisions.
- The paper fixes a major offline evaluation bug that previously over-credited open-form outputs that merely passed superficial formatting checks, and reports improved metrics on EduBench’s full Chinese test archive (7,013 samples).
- Against the legacy rule_v2 system, the deterministic quality score rises from 0.457 to 0.538 and format validity from 0.707 to 0.866, with 19.7% of requests handled directly by the 1B model.
- The archived deployment does not yet demonstrate latency gains; the efficiency claim rests on routing selectivity rather than wall-clock speedup. Re-judging against the GPT-5.4 baseline remains pending due to invalid provider configuration credentials.
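The cascade described above — generate a draft with the 1B model, then accept it or escalate the sample to the 7B expert based on task-dependent routing signals — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the task names, thresholds, and the `format_valid`/`draft_confidence` signals are assumptions standing in for whatever routing features Pangu-ACE actually uses.

```python
from dataclasses import dataclass

@dataclass
class RoutingDecision:
    escalate: bool  # True → send sample to the 7B expert
    reason: str

# Hypothetical task-dependent thresholds (not from the paper);
# open-form answers are harder to verify, so they escalate more readily.
TASK_THRESHOLDS = {"multiple_choice": 0.55, "open_form": 0.80}

def route_sample(task_type: str,
                 draft_confidence: float,
                 format_valid: bool) -> RoutingDecision:
    """Per-sample routing: accept the 1B draft or escalate to the 7B expert."""
    # A draft that fails format validation is never accepted outright.
    if not format_valid:
        return RoutingDecision(True, "draft failed format validation")
    threshold = TASK_THRESHOLDS.get(task_type, 0.80)
    if draft_confidence < threshold:
        return RoutingDecision(
            True, f"confidence {draft_confidence:.2f} below {threshold:.2f}")
    return RoutingDecision(False, "draft accepted by 1B tutor-router")
```

With thresholds like these, a confident multiple-choice draft is served directly by the 1B model, while a low-confidence or malformed open-form draft is escalated — which is how a minority of requests (19.7% in the reported numbers) can terminate at the small model.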
