Align Your Structures: Generating Trajectories with Structure Pretraining for Molecular Dynamics
Aniketh Iyengar, Jiaqi Han, Pengwei Sun, Mingjian Jiang, Jianwen Xie + 1 more
TLDR
A new framework pretrains a diffusion-based structure model on abundant conformer data, then adds an interpolator trained on MD trajectories, to generate realistic molecular dynamics trajectories despite scarce MD data.
Key contributions
- Leverages structure pretraining to generate molecular dynamics (MD) trajectories.
- Trains a diffusion-based structure generation model on a large-scale conformer dataset.
- Introduces an interpolator module, trained on MD trajectory data, that enforces temporal consistency among generated structures.
- Decomposes MD modeling into structural generation and temporal alignment subproblems.
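The split into structural generation and temporal alignment can be illustrated with a toy sketch. This is not the authors' model: `structure_step` and `temporal_step` are hypothetical stand-ins (a centering operation and a neighbor-averaging filter) for the pretrained diffusion denoiser and the learned interpolator, and all names and parameters here are assumptions chosen for illustration only.

```python
import numpy as np


def structure_step(x):
    """Hypothetical stand-in for a pretrained per-frame structure model.

    x has shape (frames, atoms, 3). Every frame is processed independently,
    so this step knows nothing about time. Toy behavior: center each frame.
    """
    return x - x.mean(axis=1, keepdims=True)


def temporal_step(x, alpha):
    """Hypothetical stand-in for the interpolator.

    Blends each frame with the average of its two temporal neighbors,
    so information flows across frames. alpha=0 disables the blending.
    """
    # Repeat the first and last frames so edge frames also have two neighbors.
    padded = np.concatenate([x[:1], x, x[-1:]], axis=0)
    neighbors = 0.5 * (padded[:-2] + padded[2:])
    return (1.0 - alpha) * x + alpha * neighbors


def generate_trajectory(frames, atoms, steps, alpha, seed=0):
    """Alternate the per-frame step and the cross-frame step, starting from noise."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(frames, atoms, 3))
    for _ in range(steps):
        x = structure_step(x)
        x = temporal_step(x, alpha)
    return x


def roughness(traj):
    """Mean squared displacement between consecutive frames."""
    return float(np.mean(np.diff(traj, axis=0) ** 2))


independent = generate_trajectory(frames=16, atoms=5, steps=10, alpha=0.0)
aligned = generate_trajectory(frames=16, atoms=5, steps=10, alpha=0.5)
print(roughness(independent), roughness(aligned))
```

With `alpha=0.0` the frames are generated independently and jump around from one frame to the next; with `alpha=0.5` the cross-frame step pulls neighboring frames together, which is the role the summary above assigns to the interpolator. In the paper both components are learned networks rather than fixed filters.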
Why it matters
MD trajectory data is scarce, while static conformer data is abundant. This paper splits the task accordingly: a structure model is pretrained on conformers, and an interpolator trained on MD trajectories aligns the generated structures in time. The authors report improved geometric, dynamical, and energetic accuracy of the generated trajectories, which makes deep generative models more practical for molecular dynamics.
Original Abstract
Generating molecular dynamics (MD) trajectories using deep generative models has attracted increasing attention, yet remains inherently challenging due to the limited availability of MD data and the complexities involved in modeling high-dimensional MD distributions. To overcome these challenges, we propose a novel framework that leverages structure pretraining for MD trajectory generation. Specifically, we first train a diffusion-based structure generation model on a large-scale conformer dataset, on top of which we introduce an interpolator module trained on MD trajectory data, designed to enforce temporal consistency among generated structures. Our approach effectively harnesses abundant structural data to mitigate the scarcity of MD trajectory data and effectively decomposes the intricate MD modeling task into two manageable subproblems: structural generation and temporal alignment. We comprehensively evaluate our method on the QM9 and DRUGS small-molecule datasets across unconditional generation, forward simulation, and interpolation tasks, and further extend our framework and analysis to tetrapeptide and protein monomer systems. Experimental results confirm that our approach excels in generating chemically realistic MD trajectories, as evidenced by remarkable improvements of accuracy in geometric, dynamical, and energetic measurements.