ArXiv TLDR

Foundation Models to Unlock Real-World Evidence from Nationwide Medical Claims

arXiv:2605.02740

Fan Ma, Yuntian Liu, Xiang Lan, Weipeng Zhou, Jun Ni + 19 more

cs.AI cs.CL

TLDR

ReClaim, a generative transformer, unlocks real-world evidence from nationwide medical claims, significantly improving disease prediction and RWE analysis.

Key contributions

  • Introduces ReClaim, a 1.7B-parameter generative transformer trained on 43.8B medical events from 200M patients.
  • Achieves a mean AUC of 75.6% across disease-onset prediction tasks, outperforming LightGBM (66.3%) and Delphi (69.4%).
  • Improves healthcare expenditure forecasting (explained variance 0.37 vs. 0.28) and reduces systematic RWE bias by 72%.
  • Demonstrates that administrative claims data is a scalable substrate for healthcare foundation models with generalizable representations.
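The two headline metrics above can be made concrete. As a minimal sketch (not the paper's code, and assuming binary disease-onset labels and continuous expenditure targets), AUC can be computed via the Mann-Whitney pairwise formulation and explained variance as one minus the ratio of residual variance to target variance:

```python
def auc(scores, labels):
    """AUC via Mann-Whitney: P(score for a positive > score for a negative), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def explained_variance(y_true, y_pred):
    """Explained variance: 1 - Var(y_true - y_pred) / Var(y_true)."""
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - var(residuals) / var(y_true)

# A model that perfectly ranks cases scores AUC 1.0; random ranking gives ~0.5.
print(auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
# Explained variance of 0.37 vs. 0.28 means ReClaim's expenditure forecasts
# leave ~9 percentage points less unexplained variance than LightGBM's.
print(explained_variance([100.0, 200.0, 300.0], [110.0, 190.0, 310.0]))
```

Note that explained variance, unlike R², ignores any constant bias in the predictions, since only the variance of the residuals enters the ratio.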

Why it matters

This paper introduces ReClaim, a foundation model for healthcare trained on nationwide medical claims data. It advances disease prediction, expenditure forecasting, and real-world evidence generation, offering a practical tool for healthcare decision-making and regulatory evaluation.

Original Abstract

Evidence derived from large-scale real-world data (RWD) is increasingly informing regulatory evaluation and healthcare decision-making. Administrative claims provide population-scale, longitudinal records of healthcare utilization, expenditure, and detailed coding of diagnoses, procedures, and medications, yet their potential as a substrate for healthcare foundation models remains largely unexplored. Here we present ReClaim, a generative transformer trained from scratch on 43.8 billion medical events from more than 200 million enrollees in the MarketScan claims data spanning 2008-2022. ReClaim models longitudinal trajectories across diagnoses, procedures, medications, and expenditure, and was scaled to 140 million, 700 million, and 1.7 billion parameters. Across over 1,000 disease-onset prediction tasks, ReClaim achieved a mean AUC of 75.6%, substantially outperforming disease-specific LightGBM (66.3%) and the transformer-based Delphi model (69.4%), with the largest gains for rare diseases. These advantages held across retrospective and prospective evaluations and in external validation on two independent datasets. Performance improved monotonically with scale, and post-training added 13.8 percentage points over pre-training alone. Beyond disease prediction, ReClaim captured financial outcomes and improved real-world evidence (RWE) analyses: for healthcare expenditure forecasting it increased explained variance from 0.28 to 0.37 relative to LightGBM, and in a target trial emulation it reduced systematic bias by 72% on average relative to Delphi. Together, these results establish administrative claims as a scalable substrate for healthcare foundation models and show that learned representations generalize across time periods and data sources, supporting disease surveillance, expenditure forecasting, and RWE generation.
