CODE-GEN: A Human-in-the-Loop RAG-Based Agentic AI System for Multiple-Choice Question Generation
arXiv cs.AI / 4/7/2026
Key Points
- CODE-GEN is introduced as a human-in-the-loop, retrieval-augmented (RAG) agentic AI system for generating context-aligned multiple-choice coding comprehension questions tied to course learning objectives.
- The system uses two cooperating agents: a Generator that drafts questions and a Validator that independently scores content quality across seven pedagogical dimensions, supported by specialized tools for computational accuracy and code verification.
- An evaluation with six subject-matter experts reviewed 288 AI-generated questions, producing 2,016 human-AI rating comparisons and additional qualitative feedback.
- Results show strong performance, with human-validated success rates of 79.9%–98.6% on the dimensions that align with explicit criteria and computational checks.
- The study finds that human expertise remains critical for harder pedagogical tasks such as crafting meaningfully plausible distractors and writing feedback that deepens understanding.
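The Generator–Validator loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimension names, scoring logic, threshold, and agent functions are all assumptions standing in for LLM calls and the paper's code-verification tools.

```python
from dataclasses import dataclass

# Hypothetical names for the seven pedagogical dimensions
# (the paper's actual dimension labels are not given here).
DIMENSIONS = [
    "clarity", "objective_alignment", "code_correctness",
    "difficulty", "distractor_plausibility", "feedback_quality",
    "answer_accuracy",
]

@dataclass
class MCQ:
    stem: str
    options: list
    answer_index: int

def generator_agent(topic: str) -> MCQ:
    """Stand-in for the LLM Generator: drafts a coding-comprehension MCQ."""
    return MCQ(
        stem=f"What does this {topic} snippet print?  x = 2; print(x * 3)",
        options=["5", "6", "23", "Error"],
        answer_index=1,
    )

def validator_agent(q: MCQ) -> dict:
    """Stand-in for the Validator: scores each dimension in [0, 1].
    A real system would call an LLM plus code-execution tools here."""
    scores = {d: 0.9 for d in DIMENSIONS}
    # Toy computational check: the keyed answer must be a valid option index.
    if not (0 <= q.answer_index < len(q.options)):
        scores["answer_accuracy"] = 0.0
    return scores

def generate_validated(topic: str, threshold: float = 0.8, max_rounds: int = 3):
    """Generate-validate loop: regenerate until every dimension passes the
    threshold, then the question would go to a human reviewer (not modeled)."""
    scores = {}
    for _ in range(max_rounds):
        q = generator_agent(topic)
        scores = validator_agent(q)
        if all(s >= threshold for s in scores.values()):
            return q, scores
    return None, scores
```

The loop mirrors the paper's division of labor: automated checks gate most dimensions, while judgments like distractor plausibility and feedback quality ultimately still need the human-in-the-loop review step.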