ArXiv TLDR

Foundation Models to Unlock Real-World Evidence from Nationwide Medical Claims

arXiv:2605.02740

Fan Ma, Yuntian Liu, Xiang Lan, Weipeng Zhou, Jun Ni + 19 more

cs.AI cs.CL

TLDR

ReClaim, a generative transformer, unlocks real-world evidence from nationwide medical claims, significantly improving disease prediction and RWE analysis.

Key contributions

  • Introduces ReClaim, a 1.7B-parameter generative transformer trained on 43.8B medical events from 200M patients.
  • Achieves a mean AUC of 75.6% across disease-onset prediction tasks, outperforming LightGBM (66.3%) and Delphi (69.4%).
  • Improves healthcare expenditure forecasting (explained variance 0.37 vs. 0.28) and reduces systematic RWE bias by 72%.
  • Demonstrates that administrative claims data is a scalable substrate for healthcare foundation models with generalizable representations.
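The two headline metrics above can be made concrete. As a minimal sketch (not the paper's code, and assuming binary disease-onset labels and continuous expenditure targets), AUC can be computed via the Mann-Whitney pairwise formulation and explained variance as one minus the ratio of residual variance to target variance:

```python
def auc(scores, labels):
    """AUC via Mann-Whitney: P(score for a positive > score for a negative), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def explained_variance(y_true, y_pred):
    """Explained variance: 1 - Var(y_true - y_pred) / Var(y_true)."""
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - var(residuals) / var(y_true)

# A model that perfectly ranks cases scores AUC 1.0; random ranking gives ~0.5.
print(auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
# Explained variance of 0.37 vs. 0.28 means ReClaim's expenditure forecasts
# leave ~9 percentage points less unexplained variance than LightGBM's.
print(explained_variance([100.0, 200.0, 300.0], [110.0, 190.0, 310.0]))
```

Note that explained variance, unlike R², ignores any constant bias in the predictions, since only the variance of the residuals enters the ratio.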

Why it matters

This paper introduces ReClaim, a foundation model for healthcare trained on nationwide medical claims data. It advances disease prediction, expenditure forecasting, and real-world evidence generation, offering a practical tool for healthcare decision-making and regulatory evaluation.

Original Abstract

Evidence derived from large-scale real-world data (RWD) is increasingly informing regulatory evaluation and healthcare decision-making. Administrative claims provide population-scale, longitudinal records of healthcare utilization, expenditure, and detailed coding of diagnoses, procedures, and medications, yet their potential as a substrate for healthcare foundation models remains largely unexplored. Here we present ReClaim, a generative transformer trained from scratch on 43.8 billion medical events from more than 200 million enrollees in the MarketScan claims data spanning 2008-2022. ReClaim models longitudinal trajectories across diagnoses, procedures, medications, and expenditure, and was scaled to 140 million, 700 million, and 1.7 billion parameters. Across over 1,000 disease-onset prediction tasks, ReClaim achieved a mean AUC of 75.6%, substantially outperforming disease-specific LightGBM (66.3%) and the transformer-based Delphi model (69.4%), with the largest gains for rare diseases. These advantages held across retrospective and prospective evaluations and in external validation on two independent datasets. Performance improved monotonically with scale, and post-training added 13.8 percentage points over pre-training alone. Beyond disease prediction, ReClaim captured financial outcomes and improved real-world evidence (RWE) analyses: for healthcare expenditure forecasting it increased explained variance from 0.28 to 0.37 relative to LightGBM, and in a target trial emulation it reduced systematic bias by 72% on average relative to Delphi. Together, these results establish administrative claims as a scalable substrate for healthcare foundation models and show that learned representations generalize across time periods and data sources, supporting disease surveillance, expenditure forecasting, and RWE generation.
