TriFit: Trimodal Fusion with Protein Dynamics for Mutation Fitness Prediction
TLDR
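TriFit predicts mutation fitness by fusing sequence, structure, and protein dynamics through a Mixture-of-Experts module, reaching AUROC 0.897 on the ProteinGym substitution benchmark and outperforming the supervised and zero-shot baselines it is compared against.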
Key contributions
- TriFit integrates sequence, structure, and protein dynamics for mutation fitness prediction.
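- Uses a four-expert Mixture-of-Experts (MoE) fusion module with trimodal cross-modal contrastive learning; the router weights modality combinations conditioned on the input.
- Achieves AUROC 0.897 on the ProteinGym substitution benchmark (217 DMS assays, 696k SAVs), ahead of the supervised baselines Kermut (0.864) and ProteinNPT (0.844) and the best zero-shot model, ESM3 (0.769).
- Ablation studies indicate that dynamics provides the largest marginal contribution over pairwise modality combinations, and outputs are well calibrated (ECE = 0.044) without post-hoc correction.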
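The input-conditioned fusion described in the contributions above can be illustrated with a minimal softmax-gated MoE sketch. This is not the authors' implementation: the function name `moe_fuse`, the use of plain linear experts, the concatenation of the three embeddings, and all dimensions are assumptions made for illustration. Only the idea of four experts weighted by an input-conditioned router comes from the abstract.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def moe_fuse(seq, struct, dyn, router_w, expert_ws):
    """Hypothetical MoE fusion of three modality embeddings.

    seq, struct, dyn: arrays of shape (batch, d_seq), (batch, d_struct), (batch, d_dyn)
    router_w:         array of shape (d_seq + d_struct + d_dyn, n_experts)
    expert_ws:        list of n_experts arrays, each (d_in, d_out); linear experts
                      are an assumption, not the paper's architecture
    """
    x = np.concatenate([seq, struct, dyn], axis=-1)            # (batch, d_in)
    gates = softmax(x @ router_w)                              # (batch, n_experts)
    expert_out = np.stack([x @ w for w in expert_ws], axis=1)  # (batch, n_experts, d_out)
    fused = (gates[:, :, None] * expert_out).sum(axis=1)       # gate-weighted sum
    return fused, gates
```

Because the gates are computed from the input itself, each variant receives its own mixture of experts, which is the sense in which the fusion avoids fixed modality weights.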
Why it matters
TriFit adds protein dynamics (residue flexibility, correlated motions, and allosteric coupling) to the sequence and structure signals used by existing supervised variant effect predictors. The abstract describes these properties as well-established determinants of mutational tolerance that such predictors had not incorporated. The reported gains on ProteinGym suggest the approach could help in interpreting disease-associated variants and in engineering therapeutic proteins.
Original Abstract
Predicting the functional impact of single amino acid substitutions (SAVs) is central to understanding genetic disease and engineering therapeutic proteins. While protein language models and structure-based methods have achieved strong performance on this task, they systematically neglect protein dynamics; residue flexibility, correlated motions, and allosteric coupling are well-established determinants of mutational tolerance in structural biology, yet have not been incorporated into supervised variant effect predictors. We present TriFit, a multimodal framework that integrates sequence, structure, and protein dynamics through a four-expert Mixture-of-Experts (MoE) fusion module with trimodal cross-modal contrastive learning. Sequence embeddings are extracted via masked marginal scoring with ESM-2 (650M); structural embeddings from AlphaFold2-predicted C-alpha geometries; and dynamics embeddings from Gaussian Network Model (GNM) B-factors, mode shapes, and residue-residue cross-correlations. The MoE router adaptively weights modality combinations conditioned on the input, enabling protein-specific fusion without fixed modality assumptions. On the ProteinGym substitution benchmark (217 DMS assays, 696k SAVs), TriFit achieves AUROC 0.897 +/- 0.0002, outperforming all supervised baselines including Kermut (0.864) and ProteinNPT (0.844), and the best zero-shot model ESM3 (0.769). Ablation studies confirm that dynamics provides the largest marginal contribution over pairwise modality combinations, and TriFit achieves well-calibrated probabilistic outputs (ECE = 0.044) without post-hoc correction.
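To make the dynamics features in the abstract concrete, here is a minimal sketch of a standard Gaussian Network Model computed from C-alpha coordinates. It is a textbook GNM, not the authors' pipeline: the function name `gnm_features`, the 7.0 Å contact cutoff, and the number of retained modes are assumptions, and the code assumes the contact network is connected (every residue has at least one neighbor within the cutoff).

```python
import numpy as np

def gnm_features(ca_coords, cutoff=7.0, n_modes=3):
    """Textbook GNM features from C-alpha coordinates (illustrative sketch).

    Returns:
      msf        (n,)         mean-square fluctuations, proportional to theoretical B-factors
      modes      (n, n_modes) slowest non-trivial mode shapes
      cross_corr (n, n)       normalized residue-residue cross-correlations
    """
    ca = np.asarray(ca_coords, dtype=float)
    n = ca.shape[0]

    # Pairwise distances and contact map (no self-contacts).
    dists = np.linalg.norm(ca[:, None, :] - ca[None, :, :], axis=-1)
    contact = (dists <= cutoff) & ~np.eye(n, dtype=bool)

    # Kirchhoff (graph Laplacian) matrix: -1 for contacts, degree on the diagonal.
    kirchhoff = -contact.astype(float)
    np.fill_diagonal(kirchhoff, contact.sum(axis=1))

    # Eigen-decomposition; drop the zero mode(s) before inverting.
    evals, evecs = np.linalg.eigh(kirchhoff)
    keep = evals > 1e-8
    evals, evecs = evals[keep], evecs[:, keep]

    # Pseudo-inverse of the Kirchhoff matrix gives the covariance up to a constant.
    cov = (evecs / evals) @ evecs.T
    msf = np.diag(cov).copy()
    cross_corr = cov / np.sqrt(np.outer(msf, msf))

    # eigh returns ascending eigenvalues, so the first columns are the slowest modes.
    modes = evecs[:, :n_modes]
    return msf, modes, cross_corr
```

Converting `msf` to B-factors in Å² requires a scaling constant that is usually fitted to experimental data; the sketch leaves the values unscaled.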