From Trajectories to Phenotypes: Disease Progression as Structural Priors for Multi-organ Imaging Representation Learning
Zian Wang, Lizhen Lan, Guangming Wang, Haosen Zhang, Minxuan Xu + 7 more
TLDR
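A trajectory-aware distillation framework transfers disease-trajectory knowledge from a generative Transformer into imaging-phenotype encoders, consistently improving disease discrimination and time-to-onset prediction across 159 UK Biobank diseases, with the largest gains for low-prevalence diseases.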
Key contributions
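- Introduces a trajectory-aware distillation framework that transfers disease-progression structure from longitudinal health records into imaging-derived phenotype (IDP) representations.
- Uses subject-level embeddings from a generative disease trajectory Transformer to supervise an organ-wise IDP encoder via geometry-preserving alignment.
- Consistently improves disease discrimination (AUC) and time-to-onset prediction (MAE) across 159 diseases in UK Biobank, with the largest gains for low-prevalence diseases.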
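The supervision in the second contribution is described in the abstract as geometry-preserving alignment, but the exact loss is not given here. One common way to realize that idea is to match within-batch similarity structure between the two embedding spaces. The following is a minimal NumPy sketch under that assumption; the function names, the choice of cosine similarity, and the squared-error penalty are all illustrative and not taken from the paper.

```python
import numpy as np


def cosine_similarity_matrix(x: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of x (shape: n x d)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.clip(norms, 1e-12, None)  # guard against zero-length rows
    return unit @ unit.T


def geometry_alignment_loss(idp_emb: np.ndarray, traj_emb: np.ndarray) -> float:
    """Hypothetical alignment loss (an assumption, not the paper's definition).

    Compares the subject-by-subject similarity matrix of the IDP embeddings
    with that of the trajectory embeddings. The two spaces may have different
    dimensionality; only the number of subjects (rows) must match, and a
    batch needs at least two subjects.
    """
    s_idp = cosine_similarity_matrix(idp_emb)
    s_traj = cosine_similarity_matrix(traj_emb)
    n = idp_emb.shape[0]
    off_diag = ~np.eye(n, dtype=bool)  # the diagonal is always 1, so skip it
    return float(np.mean((s_idp[off_diag] - s_traj[off_diag]) ** 2))


# Toy data: 8 subjects, 16-dimensional trajectory embeddings.
rng = np.random.default_rng(0)
traj = rng.normal(size=(8, 16))

# A rotated copy has identical pairwise geometry, so its loss is near zero.
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
idp_aligned = traj @ rotation

# Unrelated embeddings (here 32-dimensional) have different geometry.
idp_random = rng.normal(size=(8, 32))

loss_aligned = geometry_alignment_loss(idp_aligned, traj)
loss_random = geometry_alignment_loss(idp_random, traj)
```

Because the loss depends only on relative similarities between subjects, the student encoder is not forced to copy the teacher's coordinates, which is one reason similarity matching is a natural fit when the two modalities share structure only partially.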
Why it matters
Imaging-derived phenotypes capture only a static snapshot of diseases that evolve over time. By distilling progression structure from population-scale diagnosis histories into the imaging encoder, the method improves disease prediction most where imaging data is scarce, as with low-prevalence diseases. The results suggest that generative disease models can serve as structural priors that make imaging models more robust under realistic cohort constraints.
Original Abstract
Imaging-derived phenotypes (IDPs) summarize multi-organ physiology but provide only static snapshots of diseases that evolve over time. In contrast, longitudinal electronic health records encode disease trajectories through temporal dependencies among past diagnosis events and comorbidity structure. We hypothesize that IDPs and disease trajectories contain partially shared disease-relevant structure. We propose a trajectory-aware distillation framework that transfers structural knowledge from a generative disease trajectory Transformer into an organ-wise IDP encoder. A population-scale trajectory model trained on longitudinal diagnosis sequences produces subject-level embeddings that supervise IDP representation learning via geometry-preserving alignment. During downstream prediction, trajectory and imaging representations can also be fused via cross-attention. Across 159 diseases in the UK Biobank cohort, trajectory-aware pretraining consistently improves both discrimination (AUC) and time-to-onset prediction (MAE), with the largest gains for low-prevalence diseases. Similarity relationships in IDP embedding space also align with those in trajectory space, providing supportive evidence for partially aligned representation geometry. These results suggest that population-scale generative disease models can serve as structural priors for data-limited imaging modalities, improving robustness under realistic cohort constraints.
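The abstract notes that trajectory and imaging representations can be fused via cross-attention at prediction time, without specifying the layout. As a rough sketch of what single-head cross-attention fusion looks like, the NumPy example below lets organ-wise imaging tokens attend over trajectory tokens. The direction of attention (imaging as queries), the token counts, the dimensions, and the randomly initialized projection matrices are all assumptions made for illustration.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    queries: (n_q, d_query_in) tokens that ask the questions
    context: (n_c, d_context_in) tokens that are attended over
    Returns the fused tokens (n_q, d_attn) and the attention weights (n_q, n_c).
    """
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each query row sums to 1
    return weights @ v, weights


# Hypothetical sizes: 5 organ tokens, 7 trajectory tokens for one subject.
rng = np.random.default_rng(0)
n_organs, n_events = 5, 7
d_img, d_traj, d_attn = 12, 16, 8

organ_tokens = rng.normal(size=(n_organs, d_img))
traj_tokens = rng.normal(size=(n_events, d_traj))

# Random projections stand in for learned parameters.
w_q = rng.normal(size=(d_img, d_attn)) / np.sqrt(d_img)
w_k = rng.normal(size=(d_traj, d_attn)) / np.sqrt(d_traj)
w_v = rng.normal(size=(d_traj, d_attn)) / np.sqrt(d_traj)

fused, attn = cross_attention(organ_tokens, traj_tokens, w_q, w_k, w_v)
```

In this layout each organ token receives a weighted mixture of trajectory information, so the downstream predictor sees one fused vector per organ. Reversing the roles, or attending in both directions, would be equally consistent with the abstract's description.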