ArXiv TLDR

Edit-level Majority Voting Mitigates Over-Correction in LLM-based Grammatical Error Correction

2605.13624

Takumi Goto, Yusuke Sakai, Taro Watanabe

cs.CL

TLDR

This paper introduces edit-level majority voting to reduce over-correction in LLM-based grammatical error correction, improving correction quality across nine benchmarks in seven languages.

Key contributions

  • Proposes a training-free inference method for LLM-based GEC.
  • Uses edit-level majority voting on multiple candidates from a single model.
  • Outperforms greedy decoding and minimum Bayes risk (MBR) decoding on 9 benchmarks across 7 languages.
  • Ensures stable correction quality, independent of instruction prompts.
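The core idea is simple: sample several corrections of the same sentence, decompose each into token-level edits against the source, and keep only the edits that a majority of candidates agree on. The paper's exact edit extraction and voting threshold are not specified here, so the sketch below is an illustrative assumption using `difflib` spans and a >50% vote; function names are hypothetical, not the authors' implementation.

```python
from collections import Counter
from difflib import SequenceMatcher

def extract_edits(source_tokens, candidate_tokens):
    """Represent a candidate correction as token-span edits on the source:
    (start, end, replacement) for every non-equal opcode."""
    sm = SequenceMatcher(a=source_tokens, b=candidate_tokens, autojunk=False)
    return [
        (i1, i2, tuple(candidate_tokens[j1:j2]))
        for tag, i1, i2, j1, j2 in sm.get_opcodes()
        if tag != "equal"
    ]

def majority_vote_correction(source, candidates, threshold=0.5):
    """Keep edits proposed by more than `threshold` of the candidates,
    then apply them right-to-left so earlier span indices stay valid."""
    src = source.split()
    counts = Counter()
    for cand in candidates:
        counts.update(extract_edits(src, cand.split()))
    kept = [e for e, c in counts.items() if c / len(candidates) > threshold]

    out = list(src)
    last_start = len(src) + 1
    for i1, i2, repl in sorted(kept, reverse=True):
        if i2 <= last_start:  # skip edits overlapping an already-applied span
            out[i1:i2] = repl
            last_start = i1
    return " ".join(out)

# Two of three samples agree on "go" -> "went"; the minority edits
# ("goes", inserting "the") are filtered out, mitigating over-correction.
corrected = majority_vote_correction(
    "He go to school yesterday",
    [
        "He went to school yesterday",
        "He went to school yesterday",
        "He goes to the school yesterday",
    ],
)
print(corrected)
```

Because disagreeing edits are dropped rather than averaged, the output stays close to the source unless several samples independently propose the same fix, which is why the approach counters over-correction without retraining.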

Why it matters

LLMs for GEC often over-correct, hindering their utility. This method offers a simple, training-free solution that significantly improves correction quality across diverse languages. Its stability and broad applicability make it a valuable advancement for practical GEC systems.

Original Abstract

Grammatical error correction using large language models often suffers from the over-correction issue. To mitigate this, we propose a training-free inference method that performs edit-level majority voting over multiple candidates generated by a single model, without requiring model modifications or additional training. Across nine benchmarks covering English, Czech, German, Ukrainian, Korean, Hindi, and Romanian, the proposed method outperforms both greedy and MBR decoding in most cases. Moreover, it yields stable correction quality regardless of the instruction prompts used. We release two repositories supporting GEC dataset loading and LLM inference.

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.