EHR-RAGp: Retrieval-Augmented Prototype-Guided Foundation Model for Electronic Health Records
Saeed Shurrab, Mariam Al-Omari, Dana El Samad, Farah E. Shamout
TLDR
EHR-RAGp is a retrieval-augmented foundation model for EHRs that uses a prototype-guided retrieval module to dynamically integrate the most relevant patient history, improving clinical predictions.
Key contributions
- Introduces EHR-RAGp, a retrieval-augmented foundation model for Electronic Health Records.
- Employs a prototype-guided retrieval module to dynamically integrate relevant patient history.
- Estimates relevance of historical data chunks, guiding the model to the most informative clinical context.
- Outperforms state-of-the-art EHR foundation models and transformer-based baselines on multiple clinical prediction tasks.
Why it matters
This work addresses the challenge of using long, heterogeneous, and temporally irregular Electronic Health Records for predictive modeling, where fixed windows or uniform aggregation can obscure clinically important signals. EHR-RAGp's dynamic, prototype-guided retrieval of relevant patient history outperforms existing EHR foundation models and, when integrated with them, yields substantial performance gains.
Original Abstract
Electronic Health Records (EHR) contain rich longitudinal patient information and are widely used in predictive modeling applications. However, effectively leveraging historical data remains challenging due to long trajectories, heterogeneous events, temporal irregularity, and the varying relevance of past clinical context. Existing approaches often rely on fixed windows or uniform aggregation, which can obscure clinically important signals. In this work, we introduce EHR-RAGp, a retrieval-augmented foundation model that dynamically integrates the most relevant patient history across diverse clinical event types. We propose a prototype-guided retrieval module that acts as an alignment mechanism and estimates the relevance of retrieved historical chunks with respect to a given prediction task, guiding the model towards the most informative context. Across multiple clinical prediction tasks, EHR-RAGp consistently outperforms state-of-the-art EHR foundation models and transformer-based baselines. Furthermore, integrating EHR-RAGp with existing clinical foundation models yields substantial performance gains. Overall, EHR-RAGp provides a scalable and efficient framework for leveraging long-range clinical context to improve downstream performance.
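The abstract describes prototype-guided retrieval only at a high level, so the following is a rough illustrative sketch rather than the authors' implementation. It assumes that historical chunks and task prototypes are already embedded as vectors, that relevance is the maximum cosine similarity between a chunk and any prototype, and that the top-k chunks are fused into the current context by softmax-weighted averaging. All function names (`score_chunks`, `retrieve_top_k`, `fuse_context`) and the scoring and fusion rules are hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row vector to unit length (eps avoids division by zero)."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def score_chunks(chunk_emb, prototypes):
    """Assumed relevance rule: max cosine similarity to any task prototype.

    chunk_emb:  (n_chunks, d) embeddings of historical chunks
    prototypes: (n_prototypes, d) task prototype vectors
    """
    sims = l2_normalize(chunk_emb) @ l2_normalize(prototypes).T
    return sims.max(axis=1)


def retrieve_top_k(chunk_emb, prototypes, k=2):
    """Return indices and scores of the k most relevant chunks."""
    scores = score_chunks(chunk_emb, prototypes)
    k = min(k, len(scores))
    top_idx = np.argsort(-scores)[:k]  # descending by relevance
    return top_idx, scores[top_idx]


def fuse_context(query_emb, chunk_emb, prototypes, k=2):
    """Concatenate the current-visit embedding with a softmax-weighted
    average of the retrieved chunks (an assumed fusion rule)."""
    idx, scores = retrieve_top_k(chunk_emb, prototypes, k)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    retrieved = weights @ chunk_emb[idx]
    return np.concatenate([query_emb, retrieved])


# Toy example: four historical chunks, one task prototype.
chunks = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]])
protos = np.array([[1.0, 0.0]])
top_idx, top_scores = retrieve_top_k(chunks, protos, k=2)
fused = fuse_context(np.array([0.5, 0.5]), chunks, protos, k=2)
```

In this toy setup, the chunks pointing in the same direction as the prototype are selected, while the orthogonal and opposite chunks are ignored. In a trained model the prototypes and embeddings would be learned, which this sketch does not attempt.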