Foundation Models to Unlock Real-World Evidence from Nationwide Medical Claims
Fan Ma, Yuntian Liu, Xiang Lan, Weipeng Zhou, Jun Ni + 19 more
TLDR
ReClaim, a generative transformer, unlocks real-world evidence from nationwide medical claims, significantly improving disease prediction and RWE analysis.
Key contributions
- Introduces ReClaim, a 1.7B parameter generative transformer trained on 43.8B medical events from 200M patients.
- Achieves a 75.6% mean AUC across over 1,000 disease-onset prediction tasks, outperforming disease-specific LightGBM (66.3%) and the transformer-based Delphi model (69.4%).
- Improves healthcare expenditure forecasting (explained variance 0.37 vs. 0.28 for LightGBM) and reduces systematic bias in RWE analyses by 72% on average relative to Delphi.
- Demonstrates that administrative claims are a scalable substrate for healthcare foundation models, with learned representations that generalize across time periods and data sources.
Why it matters
ReClaim demonstrates that nationwide administrative claims, a population-scale and largely untapped data source, can power a healthcare foundation model. Its gains in disease prediction, expenditure forecasting, and real-world evidence generation make it a practical tool for healthcare decision-making and regulatory evaluation.
Original Abstract
Evidence derived from large-scale real-world data (RWD) is increasingly informing regulatory evaluation and healthcare decision-making. Administrative claims provide population-scale, longitudinal records of healthcare utilization, expenditure, and detailed coding of diagnoses, procedures, and medications, yet their potential as a substrate for healthcare foundation models remains largely unexplored. Here we present ReClaim, a generative transformer trained from scratch on 43.8 billion medical events from more than 200 million enrollees in the MarketScan claims data spanning 2008-2022. ReClaim models longitudinal trajectories across diagnoses, procedures, medications, and expenditure, and was scaled to 140 million, 700 million, and 1.7 billion parameters. Across over 1,000 disease-onset prediction tasks, ReClaim achieved a mean AUC of 75.6%, substantially outperforming disease-specific LightGBM (66.3%) and the transformer-based Delphi model (69.4%), with the largest gains for rare diseases. These advantages held across retrospective and prospective evaluations and in external validation on two independent datasets. Performance improved monotonically with scale, and post-training added 13.8 percentage points over pre-training alone. Beyond disease prediction, ReClaim captured financial outcomes and improved real-world evidence (RWE) analyses: for healthcare expenditure forecasting it increased explained variance from 0.28 to 0.37 relative to LightGBM, and in a target trial emulation it reduced systematic bias by 72% on average relative to Delphi. Together, these results establish administrative claims as a scalable substrate for healthcare foundation models and show that learned representations generalize across time periods and data sources, supporting disease surveillance, expenditure forecasting, and RWE generation.
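The two headline metrics in the abstract, mean AUC over many binary disease-onset tasks and explained variance for expenditure forecasting, can be computed with standard scikit-learn functions. The sketch below is purely illustrative: the synthetic labels, scores, and expenditure values are not from the paper, and the paper's actual evaluation spans over 1,000 tasks rather than the handful simulated here.

```python
# Illustrative sketch of the paper's two headline metrics.
# All data here is synthetic; nothing is drawn from the paper itself.
import numpy as np
from sklearn.metrics import roc_auc_score, explained_variance_score

rng = np.random.default_rng(0)

# Mean AUC across disease-onset prediction tasks: each task contributes
# a binary onset label per patient and a predicted risk score.
aucs = []
for _ in range(5):  # the paper averages over 1,000+ tasks; 5 for brevity
    y_true = rng.integers(0, 2, size=200)
    y_score = 0.5 * y_true + rng.random(200)  # scores correlated with labels
    aucs.append(roc_auc_score(y_true, y_score))
mean_auc = float(np.mean(aucs))

# Explained variance for expenditure forecasting (a regression target);
# claims expenditure is right-skewed, so a gamma draw stands in for it.
spend_true = rng.gamma(shape=2.0, scale=1000.0, size=200)
spend_pred = spend_true + rng.normal(0.0, 500.0, size=200)  # noisy forecast
ev = float(explained_variance_score(spend_true, spend_pred))

print(f"mean AUC over tasks: {mean_auc:.3f}")
print(f"explained variance:  {ev:.3f}")
```

Higher is better for both: a mean AUC of 0.5 is chance-level ranking of who develops each disease, and explained variance of 1.0 is a perfect expenditure forecast.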