Scoring Edit Impact in Grammatical Error Correction via Embedded Association Graphs

arXiv cs.CL / 4/9/2026

📰 News

Key Points

  • The paper introduces a new evaluation task, “Scoring Edit Impact in Grammatical Error Correction,” aiming to automatically estimate how important each edit is when a GEC model transforms an input sentence.
  • It proposes an Embedded Association Graph scoring framework that models latent dependencies among edits and groups syntactically related edits into coherent structures.
  • The method uses perplexity-based scoring to quantify each edit’s contribution to sentence fluency, targeting scenarios where multiple corrections can be valid.
  • Experiments on 4 GEC datasets, 4 languages, and 4 GEC systems show consistent improvements over multiple baselines, indicating the approach is robust across settings.
  • Additional analysis suggests the embedded association graph captures cross-linguistic structural dependencies, supporting generalization across languages.
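The perplexity-based scoring described above can be pictured as a leave-one-out comparison: apply all edits, then withhold each edit in turn and measure how much sentence perplexity rises. The paper does not publish its exact formula, so the sketch below is illustrative only — it stands in a toy add-one-smoothed bigram model for the actual language model, and the edit representation `(start, end, replacement)` is an assumption.

```python
import math
from collections import Counter

def train(corpus):
    """Collect word-bigram counts and vocabulary size from a tiny corpus."""
    counts, vocab = Counter(), set()
    for sent in corpus:
        words = ["<s>"] + sent.split() + ["</s>"]
        vocab.update(words)
        counts.update(zip(words, words[1:]))
    return counts, len(vocab)

def bigram_ppl(sentence, counts, vocab_size):
    """Perplexity under an add-one-smoothed bigram model (toy LM stand-in)."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    log_prob = 0.0
    for prev, cur in zip(words, words[1:]):
        num = counts[(prev, cur)] + 1
        den = sum(c for (p, _), c in counts.items() if p == prev) + vocab_size
        log_prob += math.log(num / den)
    return math.exp(-log_prob / (len(words) - 1))

def apply_edits(tokens, edits):
    """Apply (start, end, replacement) span edits right-to-left so indices stay valid."""
    out = list(tokens)
    for start, end, repl in sorted(edits, reverse=True):
        out[start:end] = repl
    return " ".join(out)

def score_edits(source, edits, ppl):
    """Leave-one-out impact: how much perplexity rises when one edit is withheld."""
    tokens = source.split()
    full_ppl = ppl(apply_edits(tokens, edits))
    scores = {}
    for i, edit in enumerate(edits):
        held_out = edits[:i] + edits[i + 1:]
        # Positive score: withholding this edit hurts fluency (raises perplexity).
        scores[edit] = ppl(apply_edits(tokens, held_out)) - full_ppl
    return scores

# Hypothetical example: two edits correcting agreement and number errors.
counts, vocab_size = train(["he has a cat", "she has a dog"])
edits = [(1, 2, ("has",)), (3, 4, ("cat",))]
scores = score_edits("he have a cats", edits, lambda s: bigram_ppl(s, counts, vocab_size))
```

Both edits receive positive scores here because withholding either one reintroduces bigrams the toy model has never seen, raising perplexity above that of the fully corrected sentence.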

Abstract

A Grammatical Error Correction (GEC) system produces a sequence of edits to correct an erroneous sentence. The quality of these edits is typically evaluated against human annotations. However, a sentence may admit multiple valid corrections, and existing evaluation settings do not fully accommodate diverse application scenarios. Recent meta-evaluation approaches rely on human judgments across multiple references, but they are difficult to scale to large datasets. In this paper, we propose a new task, Scoring Edit Impact in GEC, which aims to automatically estimate the importance of edits produced by a GEC system. To address this task, we introduce a scoring framework based on an embedded association graph. The graph captures latent dependencies among edits and clusters syntactically related edits into coherent groups. We then perform perplexity-based scoring to estimate each edit's contribution to sentence fluency. Experiments across 4 GEC datasets, 4 languages, and 4 GEC systems demonstrate that our method consistently outperforms a range of baselines. Further analysis shows that the embedded association graph effectively captures cross-linguistic structural dependencies among edits.
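The abstract's grouping step — an association graph over edits whose connected components form coherent edit groups — could be sketched as follows. The paper does not specify how graph edges are constructed, so this sketch assumes each edit carries an embedding vector and that an edge links two edits whenever their cosine similarity exceeds a threshold; both the similarity measure and the threshold value are illustrative assumptions.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def group_edits(edit_embeddings, threshold=0.8):
    """Build an association graph over edits (edge = similarity >= threshold)
    and return its connected components as edit groups (indices, sorted)."""
    n = len(edit_embeddings)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if cosine(edit_embeddings[i], edit_embeddings[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    seen, groups = set(), []
    for i in range(n):
        if i in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [i], []
        seen.add(i)
        while stack:
            u = stack.pop()
            component.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        groups.append(sorted(component))
    return groups

# Hypothetical 2-d embeddings: edits 0 and 1 are associated, edit 2 stands alone.
groups = group_edits([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
```

Treating groups as connected components is one simple design choice; a learned graph with weighted edges, as the paper's framework suggests, would refine which edits end up scored together.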