Universal NER v2: Towards a Massively Multilingual Named Entity Recognition Benchmark
Terra Blevins, Stephen Mayhew, Marek Šuppa, Hila Gonen, Shachar Mirkin, and 9 others
TLDR
Universal NER v2 introduces an expanded, massively multilingual benchmark for Named Entity Recognition, addressing the scarcity of gold-standard evaluation datasets.
Key contributions
- Addresses the critical scarcity of gold-standard multilingual NER evaluation benchmarks.
- Employs a general tagset and thorough guidelines for standardized cross-lingual annotation.
- Collects standardized, cross-lingual named entity span annotations for diverse languages (see the sketch after this list).
- Expands on the initial UNER v1 release, sustained by an active community of organizers, annotators, and collaborators.
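To make the "named entity span annotations" concrete, below is a minimal Python sketch that decodes IOB2-encoded tags into typed spans, assuming a coarse tagset like UNER's (e.g., PER, ORG, LOC). The `decode_spans` helper and the example sentence are illustrative only, not part of any UNER tooling or data format specification.

```python
from typing import List, Tuple

def decode_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse IOB2 tags (B-X begins a span, I-X continues it, O is outside)
    into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:  # close any open span before starting a new one
                spans.append((" ".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current.append(token)
        else:  # "O" or an inconsistent I- tag closes any open span
            if current:
                spans.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        spans.append((" ".join(current), current_type))
    return spans

# Hypothetical example sentence, not drawn from the UNER data.
tokens = ["Ada", "Lovelace", "worked", "in", "London", "."]
tags   = ["B-PER", "I-PER", "O", "O", "B-LOC", "O"]
print(decode_spans(tokens, tags))  # [('Ada Lovelace', 'PER'), ('London', 'LOC')]
```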
Why it matters
The paper fills a critical gap: gold-standard benchmarks for evaluating multilingual language models on Named Entity Recognition are scarce. By providing a standardized, massively multilingual dataset, it enables more robust and accurate assessment of LLMs across diverse languages and fosters progress in cross-lingual NLP.
Original Abstract
While multilingual language models promise to bring the benefits of LLMs to speakers of many languages, gold-standard evaluation benchmarks in most languages to interrogate these assumptions remain scarce. The Universal NER project, now entering its fourth year, is dedicated to building gold-standard multilingual Named Entity Recognition (NER) benchmark datasets. Inspired by existing massively multilingual efforts for other core NLP tasks (e.g., Universal Dependencies), the project uses a general tagset and thorough annotation guidelines to collect standardized, cross-lingual annotations of named entity spans. The first installment (UNER v1) was released in 2024, and the project has continued and expanded since then, with various organizers, annotators, and collaborators in an active community.