REFINE: Real-world Exploration of Interactive Feedback and Student Behaviour

arXiv cs.AI / 4/1/2026


Key Points

  • The paper introduces REFINE, a locally deployable multi-agent feedback system that treats formative feedback as an interactive process rather than a static one-way artifact.
  • REFINE pairs a feedback-generation agent with an LLM-as-a-judge regeneration loop, where a human-aligned judge guides revisions to improve feedback quality.
  • An interactive, tool-calling agent enables context-aware student follow-up questions and produces actionable responses comparable to a state-of-the-art closed-source model.
  • Controlled experiments and an authentic deployment in an undergraduate computer science classroom show that judge-guided regeneration boosts feedback quality, and analysis of real student interactions reveals distinct engagement patterns shaped by the system-generated feedback.

Abstract

Formative feedback is central to effective learning, yet providing timely, individualised feedback at scale remains a persistent challenge. While recent work has explored the use of large language models (LLMs) to automate feedback, most existing systems still conceptualise feedback as a static, one-way artifact, offering limited support for interpretation, clarification, or follow-up. In this work, we introduce REFINE, a locally deployable, multi-agent feedback system built on small, open-source LLMs that treats feedback as an interactive process. REFINE combines a pedagogically-grounded feedback generation agent with an LLM-as-a-judge-guided regeneration loop using a human-aligned judge, and a self-reflective tool-calling interactive agent that supports student follow-up questions with context-aware, actionable responses. We evaluate REFINE through controlled experiments and an authentic classroom deployment in an undergraduate computer science course. Automatic evaluations show that judge-guided regeneration significantly improves feedback quality, and that the interactive agent produces efficient, high-quality responses comparable to a state-of-the-art closed-source model. Analysis of real student interactions further reveals distinct engagement patterns and indicates that system-generated feedback systematically steers subsequent student inquiry. Our findings demonstrate the feasibility and effectiveness of multi-agent, tool-augmented feedback systems for scalable, interactive feedback.