Domain-Adapted Small Language Models for Reliable Clinical Triage
Manar Aljohani, Brandon Ho, Kenneth McKinley, Dennis Ren, Xuan Wang
TLDR
Small language models (SLMs) fine-tuned on institution-specific pediatric triage data assign Emergency Severity Index (ESI) levels more reliably than baseline SLMs and proprietary LLMs such as GPT-4o.
Key contributions
- Evaluated open-source SLMs for Emergency Severity Index (ESI) assignment in clinical triage.
- Compared multiple prompting pipelines and found that clinical vignettes (concise summaries of triage narratives) yielded the most accurate predictions.
- Qwen2.5-7B showed the best balance of accuracy, stability, and computational efficiency.
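- Fine-tuned Qwen2.5-7B on expert-curated and silver-standard pediatric triage data, which substantially reduced discordance and clinically significant errors and outperformed all baseline SLMs and proprietary LLMs such as GPT-4o.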
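To make the error terms above concrete, here is a minimal sketch of how agreement between model-assigned and reference ESI levels could be scored. It is an illustration, not the paper's evaluation code: the function name `triage_agreement` and the metric definitions are assumptions, since this summary does not say how the authors define discordance or clinically significant errors. The only facts relied on are that ESI levels run from 1 (most urgent) to 5 (least urgent), so a predicted level numerically higher than the reference counts as under-triage.

```python
from collections import Counter


def triage_agreement(reference, predicted):
    """Score predicted ESI levels against reference ESI levels.

    Hypothetical helper: the metric definitions here are assumptions,
    not taken from the paper.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if not reference:
        raise ValueError("at least one case is required")
    for level in (*reference, *predicted):
        if level not in (1, 2, 3, 4, 5):
            raise ValueError(f"invalid ESI level: {level!r}")

    counts = Counter()
    for ref, pred in zip(reference, predicted):
        if pred == ref:
            counts["concordant"] += 1
        elif pred > ref:
            # Model rated the patient as less urgent than the reference.
            counts["under_triage"] += 1
        else:
            # Model rated the patient as more urgent than the reference.
            counts["over_triage"] += 1

    n = len(reference)
    return {
        "discordance_rate": (n - counts["concordant"]) / n,
        "under_triage_rate": counts["under_triage"] / n,
        "over_triage_rate": counts["over_triage"] / n,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the study.
    reference = [1, 2, 3, 3, 5]
    predicted = [1, 3, 3, 2, 5]
    print(triage_agreement(reference, predicted))
```

Under-triage and over-triage are reported separately because they carry different clinical risks; a single accuracy figure would hide that asymmetry.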
Why it matters
The paper shows that institution-specific SLMs can provide reliable, privacy-preserving ESI decision support in emergency departments, where highly variable free-text triage documentation contributes to mistriage and workflow inefficiencies. Targeted fine-tuning mattered more than more complex inference strategies, and it let a 7B open-source model outperform proprietary LLMs such as GPT-4o.
Original Abstract
Accurate and consistent Emergency Severity Index (ESI) assignment remains a persistent challenge in emergency departments, where highly variable free-text triage documentation contributes to mistriage and workflow inefficiencies. This study evaluates whether open-source small language models (SLMs) can serve as reliable, privacy-preserving decision-support tools for clinical triage. We systematically compared multiple SLMs across diverse prompting pipelines and found that clinical vignettes, concise summaries of triage narratives, yielded the most accurate predictions. The SLM, Qwen2.5-7B, demonstrated the strongest balance of accuracy, stability, and computational efficiency. Through large-scale domain adaptation using expert-curated and silver-standard pediatric triage data, fine-tuned Qwen2.5-7B models substantially reduced discordance and clinically significant errors, outperforming all baseline SLMs and advanced proprietary large language models (LLMs, e.g., GPT-4o). These findings highlight the feasibility of institution-specific SLMs for reliable, privacy-preserving ESI decision support and underscore the importance of targeted fine-tuning over more complex inference strategies.