Listen, Correct, and Feed Back: Spoken Pedagogical Feedback Generation

arXiv cs.CL · April 17, 2026


Key Points

  • The paper introduces SPFG (Spoken Pedagogical Feedback Generation), a new dataset aimed at producing learner-friendly, actionable, level-appropriate, and encouraging spoken pedagogical feedback alongside grammatical error corrections and explanations.
  • SPFG is built from the Speak & Improve Challenge 2025 corpus and includes fluency-oriented transcriptions paired with GEC targets and human-verified teacher-style feedback, including preferred/rejected feedback pairs for preference learning.
  • The study evaluates three instruction-tuned LLMs (Qwen2.5, Llama-3.1, GLM-4) on transcript-based Spoken Grammatical Error Correction (SGEC), comparing supervised fine-tuning (SFT) versus preference-based alignment methods (DPO and KTO) for jointly generating corrections and feedback.
  • Experimental results indicate that SFT delivers the most consistent improvements, while DPO/KTO achieve smaller or mixed gains, and that correction quality and feedback quality are only weakly correlated.
  • The authors provide an implementation and release it publicly on GitHub for reproducibility and further research.
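The preferred/rejected feedback pairs described above map naturally onto the record format that preference-alignment methods such as DPO consume. A minimal sketch in Python, assuming hypothetical field names (not the actual schema of the released dataset):

```python
# Hypothetical SPFG-style record: field names and contents are
# illustrative, not taken from the released dataset.
record = {
    "transcript": "yesterday I go to the market and buyed some fruits",
    "gec_target": "Yesterday I went to the market and bought some fruit.",
    "feedback_chosen": (
        "Nice job describing your day! Remember that 'go' and 'buy' are "
        "irregular verbs: in the past tense they become 'went' and 'bought'."
    ),
    "feedback_rejected": "Wrong tense. Fix your verbs.",
}

def to_dpo_pair(rec):
    """Convert one record into the (prompt, chosen, rejected) triple
    used by DPO-style preference training."""
    prompt = (
        "Correct the learner's sentence and give encouraging feedback:\n"
        + rec["transcript"]
    )
    chosen = rec["gec_target"] + "\n" + rec["feedback_chosen"]
    rejected = rec["gec_target"] + "\n" + rec["feedback_rejected"]
    return prompt, chosen, rejected

prompt, chosen, rejected = to_dpo_pair(record)
```

Note that both the chosen and rejected responses share the same correction and differ only in the feedback, so the preference signal targets feedback quality specifically, consistent with the dataset's goal of learner-friendly feedback.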

Abstract

Grammatical error correction (GEC) and explanation (GEE) have made rapid progress, but real teaching scenarios also require *learner-friendly pedagogical feedback* that is actionable, level-appropriate, and encouraging. We introduce **SPFG** (**S**poken **P**edagogical **F**eedback **G**eneration), a dataset built on the Speak & Improve Challenge 2025 corpus, pairing fluency-oriented transcriptions with GEC targets and *human-verified* teacher-style feedback, including preferred/rejected feedback pairs for preference learning. We study a transcript-based Spoken Grammatical Error Correction (SGEC) setting and evaluate three instruction-tuned LLMs (Qwen2.5, Llama-3.1, and GLM-4), comparing supervised fine-tuning (SFT) with preference-based alignment (using DPO and KTO) for jointly generating corrections and feedback. Results show that SFT provides the most consistent improvements, while DPO/KTO yield smaller or mixed gains, and that correction quality and feedback quality are weakly coupled. Our implementation is available at https://github.com/Skywalker-Harrison/spfg.