Edit-level Majority Voting Mitigates Over-Correction in LLM-based Grammatical Error Correction
Takumi Goto, Yusuke Sakai, Taro Watanabe
TLDR
This paper introduces edit-level majority voting to reduce over-correction in LLM-based grammatical error correction, improving correction quality over greedy and minimum Bayes risk (MBR) decoding.
Key contributions
- Proposes a training-free inference method for LLM-based GEC.
- Uses edit-level majority voting on multiple candidates from a single model.
- Outperforms both greedy and minimum Bayes risk (MBR) decoding on nine benchmarks across seven languages.
- Ensures stable correction quality, independent of instruction prompts.
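The core idea — extract the edits each candidate proposes against the source sentence, then keep only edits backed by a majority of candidates — can be sketched as below. This is a minimal illustration, not the authors' implementation: it uses `difflib` for token alignment (the paper likely uses a GEC-specific edit extractor such as ERRANT), and the helper names are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher

def extract_edits(source, candidate):
    """Extract edits as (start, end, replacement) spans over source tokens.
    Hypothetical helper; real GEC systems typically use ERRANT-style alignment."""
    sm = SequenceMatcher(a=source, b=candidate, autojunk=False)
    return [
        (i1, i2, tuple(candidate[j1:j2]))
        for tag, i1, i2, j1, j2 in sm.get_opcodes()
        if tag != "equal"  # keep only spans that change the source
    ]

def majority_vote_edits(source, candidates):
    """Apply only edits proposed by a strict majority of candidates."""
    counts = Counter(e for cand in candidates for e in extract_edits(source, cand))
    kept = [e for e, c in counts.items() if c > len(candidates) / 2]
    out = list(source)
    # apply right-to-left so earlier span indices stay valid
    for start, end, repl in sorted(kept, reverse=True):
        out[start:end] = repl
    return out

source = "She go to school yesterday and eat lunch".split()
candidates = [
    "She went to school yesterday and ate lunch".split(),
    "She went to school yesterday and eat lunch".split(),
    "She goes to the school yesterday and ate lunch".split(),
]
print(" ".join(majority_vote_edits(source, candidates)))
# → She went to school yesterday and ate lunch
```

Note how the minority edits ("goes", inserting "the") are dropped while the majority-supported edits ("went", "ate") survive — this filtering of weakly supported edits is what mitigates over-correction.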
Why it matters
LLMs for GEC often over-correct, hindering their utility. This method offers a simple, training-free solution that significantly improves correction quality across diverse languages. Its stability and broad applicability make it a valuable advancement for practical GEC systems.
Original Abstract
Grammatical error correction using large language models often suffers from the over-correction issue. To mitigate this, we propose a training-free inference method that performs edit-level majority voting over multiple candidates generated by a single model, without requiring model modifications or additional training. Across nine benchmarks covering English, Czech, German, Ukrainian, Korean, Hindi, and Romanian, the proposed method outperforms both greedy and MBR decoding in most cases. Moreover, it yields stable correction quality regardless of the instruction prompts used. We release two repositories supporting GEC dataset loading and LLM inference.