ArXiv TLDR

Parser agreement and disagreement in L2 Korean UD: Implications for human-in-the-loop annotation

2605.06625

Hakyung Sung, Gyu-Ho Shin

cs.CL

TLDR

This paper proposes a simplified human-in-the-loop workflow for L2 Korean UD annotation that uses agreement between two domain-adapted parsers as a proxy for annotation correctness.

Key contributions

  • Introduces a simplified human-in-the-loop workflow for L2 Korean morphosyntactic annotation.
  • Evaluates parser agreement as a proxy for annotation correctness against human judgments.
  • Reports strong correspondence between parser agreement and independent human judgments for L2 Korean UD.
  • Identifies predictable linguistic domains where parser disagreements cluster.
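
One way to picture the evaluation step is a cross-tabulation of parser agreement against human judgments. The sketch below is illustrative only and is not the authors' code: the function name `agreement_vs_human`, the `(head, deprel)` token representation, and the boolean `human_ok` labels are all assumptions made for the example.

```python
from collections import Counter

def agreement_vs_human(parser_a, parser_b, human_ok):
    """Cross-tabulate parser agreement against human correctness judgments.

    parser_a, parser_b: per-token analyses, here (head, deprel) tuples
                        (an assumed representation).
    human_ok:           per-token booleans, True when an annotator accepted
                        parser A's analysis (an assumed labelling scheme).
    """
    if not (len(parser_a) == len(parser_b) == len(human_ok)):
        raise ValueError("inputs must cover the same tokens")

    table = Counter()
    for a, b, ok in zip(parser_a, parser_b, human_ok):
        table[(a == b, ok)] += 1  # key: (parsers agree?, human accepts?)

    agreed = table[(True, True)] + table[(True, False)]
    # Share of agreed-upon tokens that humans also accepted.
    precision = table[(True, True)] / agreed if agreed else 0.0
    return table, precision
```

A high share of human-accepted tokens among those the parsers agree on is what would justify treating agreement as a proxy for correctness.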

Why it matters

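The results support semi-automatic L2 Korean UD annotation: where two domain-adapted parsers agree, the analysis tends to match human judgment, so annotator effort can go to the cases where they disagree. The paper also shows that disagreements cluster in predictable areas (grammatical-relation distinctions and clause-boundary ambiguity), separating cases that iterative model refinement can address from deeper representational challenges in parsing L2 Korean.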

Original Abstract

We propose a simplified human-in-the-loop workflow for second language (L2) Korean morphosyntactic annotation by leveraging agreement between two domain-adapted parsers. We first evaluate whether parser agreement can serve as a proxy for annotation correctness by comparing it with independent human judgments. The results show strong correspondence between parser and human judgments, supporting the feasibility of semi-automatic L2-Korean UD annotation. Further analysis demonstrates that parser disagreements cluster in linguistically predictable domains such as grammatical-relation distinctions and clause-boundary ambiguity. While many disagreement cases are tractable for iterative model refinement, others reflect deeper representational challenges inherent in parsing and tagging L2-Korean corpora.
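
The workflow described in the abstract can be sketched as a triage over two CoNLL-U parses of the same text: tokens on which both parsers agree are accepted, and the rest are queued for human review and tallied by the field that differs. This is a minimal sketch, not the paper's implementation. The names `triage` and `token_rows`, the choice of compared fields, and the assumption that both parses share identical tokenization are illustrative; only the CoNLL-U column order is taken from the UD format.

```python
from collections import Counter

# CoNLL-U column indices (0-based) for the fields compared here.
FIELDS = {"UPOS": 3, "XPOS": 4, "HEAD": 6, "DEPREL": 7}

def token_rows(conllu_text):
    """Yield tab-split token rows, skipping comment and blank lines."""
    for line in conllu_text.splitlines():
        if line.strip() and not line.startswith("#"):
            yield line.split("\t")

def triage(parse_a, parse_b):
    """Split tokens into auto-accepted and human-review sets.

    Assumes both parses contain the same tokens in the same order.
    Returns (accepted token IDs, review items, counts per differing field).
    """
    accepted, review, kinds = [], [], Counter()
    for row_a, row_b in zip(token_rows(parse_a), token_rows(parse_b)):
        differing = [name for name, col in FIELDS.items()
                     if row_a[col] != row_b[col]]
        if differing:
            # Keep the ID, the form, and which fields the parsers dispute.
            review.append((row_a[0], row_a[1], differing))
            kinds.update(differing)
        else:
            accepted.append(row_a[0])
    return accepted, review, kinds
```

The per-field counts give a rough view of where disagreements concentrate, for instance DEPREL mismatches for grammatical-relation distinctions or HEAD mismatches that may reflect clause-boundary ambiguity.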
