ArXiv TLDR

Translating Under Pressure: Domain-Aware LLMs for Crisis Communication

arXiv: 2604.26597

Antonio Castaldo, Maria Carmen Staiano, Johanna Monti, Sheila Castilho, Francesca Chiusaroli

cs.CL, cs.AI

TLDR

A domain-adaptive pipeline fine-tunes a small language model to translate crisis communication into CEFR A2-level simplified English, improving readability while maintaining adequacy.

Key contributions

  • Proposes a domain-adaptive pipeline to expand small crisis communication corpora.
  • Fine-tunes small language models for crisis translation using the expanded dataset.
  • Employs preference optimization to generate CEFR A2-level simplified English outputs.
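The first step, expanding a small reference corpus by retrieving and filtering similar sentences from a general corpus, can be sketched as follows. The paper does not specify the retrieval method in this summary, so token-overlap (Jaccard) similarity and the threshold value are illustrative assumptions, not the authors' implementation:

```python
# Hedged sketch of domain-adaptive corpus expansion: keep general-corpus
# sentences that are sufficiently similar to any sentence in a small
# crisis-domain reference corpus. Similarity metric and threshold are
# assumptions for illustration.

def tokens(sentence):
    """Crude whitespace tokenization, lowercased."""
    return set(sentence.lower().split())

def jaccard(a, b):
    """Token-overlap similarity between two sentences, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def expand_corpus(reference, general, threshold=0.2):
    """Retrieve general-corpus sentences close to the reference domain."""
    selected = []
    for cand in general:
        # Score each candidate against its nearest reference sentence.
        score = max(jaccard(cand, ref) for ref in reference)
        if score >= threshold:
            selected.append(cand)
    return selected

reference = [
    "Evacuate the flooded area immediately.",
    "Boil water before drinking after the storm.",
]
general = [
    "Please evacuate the area near the river immediately.",
    "The museum opens at nine in the morning.",
    "Boil tap water before drinking until further notice.",
]
print(expand_corpus(reference, general))
```

In practice a neural retriever (e.g. sentence embeddings) would replace the Jaccard heuristic, but the structure of retrieve-then-filter is the same.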

Why it matters

This paper addresses the critical challenge of multilingual communication during crises, where parallel data is scarce. It offers a practical solution by demonstrating that simplified English, combined with domain adaptation, can serve as an effective lingua franca for emergency situations.

Original Abstract

Timely and reliable multilingual communication is critical during natural and human-induced disasters, but developing effective solutions for crisis communication is limited by the scarcity of curated parallel data. We propose a domain-adaptive pipeline that expands a small reference corpus by retrieving and filtering data from general corpora. We use the resulting dataset to fine-tune a small language model for crisis-domain translation and then apply preference optimization to bias outputs toward CEFR A2-level English. Automatic and human evaluation shows that this approach improves readability, while maintaining strong adequacy. Our results indicate that simplified English, combined with domain adaptation, can function as a practical lingua franca for emergency communication when full multilingual coverage is not feasible.
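The preference-optimization step described in the abstract trains on chosen/rejected output pairs (DPO-style) to bias the model toward A2-level English. The abstract does not say how pairs are scored, so the crude readability proxy below (average word length plus a sentence-length penalty) is an illustrative assumption standing in for a proper CEFR classifier:

```python
# Hedged sketch of building preference pairs for readability-oriented
# preference optimization. The readability proxy is an assumption, not
# the authors' scoring method.

def readability_proxy(text):
    """Lower is simpler: mean word length plus a sentence-length penalty."""
    words = text.split()
    if not words:
        return float("inf")
    avg_word_len = sum(len(w) for w in words) / len(words)
    return avg_word_len + 0.1 * len(words)

def make_preference_pair(candidates):
    """Return (chosen, rejected): the simplest vs. most complex candidate."""
    ranked = sorted(candidates, key=readability_proxy)
    return ranked[0], ranked[-1]

# Two candidate translations of the same crisis message.
candidates = [
    "Leave the building now and go to the meeting point.",
    "Occupants are advised to expeditiously vacate the premises "
    "and proceed to the designated assembly location.",
]
chosen, rejected = make_preference_pair(candidates)
print(chosen)
```

Such pairs would then feed a standard preference-optimization objective; the key design choice is that "preferred" is defined by readability rather than by human ranking alone.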
