EquiformerV3: Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers
Yi-Lun Liao, Alexander J. Hoffman, Sabrina C. Shen, Alexandre Duval, Sam Walton Norwood, et al.
TLDR
EquiformerV3 enhances SE(3)-equivariant graph attention Transformers for 3D atomistic modeling, boosting efficiency, expressivity, and generality.
Key contributions
- Achieves a 1.75× speedup over EquiformerV2 through an optimized software implementation.
- Introduces equivariant merged layer normalization and smooth radius cutoff attention.
- Proposes SwiGLU-$S^2$ activations for many-body interactions and strict equivariance.
- Enables accurate modeling of smoothly varying potential energy surfaces (PES).
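To make the "smooth radius cutoff" idea concrete: a standard way to keep the PES smooth is to multiply each edge's attention weight by an envelope that decays to zero, with zero derivative, at the cutoff radius. The sketch below uses a cosine envelope (a common choice in atomistic ML); the function names and the exact envelope are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def cosine_cutoff(r, r_cut):
    """Smooth envelope: 1 at r=0, 0 for r >= r_cut, with zero slope at the
    cutoff. Illustrative choice; EquiformerV3's exact envelope may differ."""
    r = np.asarray(r, dtype=float)
    env = 0.5 * (np.cos(np.pi * r / r_cut) + 1.0)
    return np.where(r < r_cut, env, 0.0)

def smooth_cutoff_attention(scores, distances, r_cut):
    """Scale unnormalized attention weights by the envelope so an edge's
    contribution vanishes smoothly as a neighbor crosses the cutoff,
    keeping the predicted energy continuously differentiable."""
    weights = np.exp(scores - scores.max())        # softmax numerators
    weights = weights * cosine_cutoff(distances, r_cut)
    return weights / weights.sum()                 # renormalize
```

Without such an envelope, an atom entering or leaving the neighbor list causes a discontinuity in the energy, which corrupts forces and higher-order derivatives of the PES.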
Why it matters
This paper significantly advances SE(3)-equivariant GNNs, crucial for large-scale 3D atomistic modeling. By improving efficiency, expressivity, and generality, EquiformerV3 enables more accurate and faster simulations. Its ability to model complex energy surfaces and achieve state-of-the-art results makes it a powerful tool for materials science and drug discovery.
Original Abstract
As $SE(3)$-equivariant graph neural networks mature as a core tool for 3D atomistic modeling, improving their efficiency, expressivity, and physical consistency has become a central challenge for large-scale applications. In this work, we introduce EquiformerV3, the third generation of the $SE(3)$-equivariant graph attention Transformer, designed to advance all three dimensions: efficiency, expressivity, and generality. Building on EquiformerV2, we have the following three key advances. First, we optimize the software implementation, achieving $1.75\times$ speedup. Second, we introduce simple and effective modifications to EquiformerV2, including equivariant merged layer normalization, improved feedforward network hyper-parameters, and attention with smooth radius cutoff. Third, we propose SwiGLU-$S^2$ activations to incorporate many-body interactions for better theoretical expressivity and to preserve strict equivariance while reducing the complexity of sampling $S^2$ grids. Together, SwiGLU-$S^2$ activations and smooth-cutoff attention enable accurate modeling of smoothly varying potential energy surfaces (PES), generalizing EquiformerV3 to tasks requiring energy-conserving simulations and higher-order derivatives of PES. With these improvements, EquiformerV3 trained with the auxiliary task of denoising non-equilibrium structures (DeNS) achieves state-of-the-art results on OC20, OMat24, and Matbench Discovery.
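For readers unfamiliar with the SwiGLU building block referenced in the abstract, it is a gated feed-forward layer: a Swish-activated gate branch multiplies a linear value branch elementwise before the output projection. The minimal sketch below shows plain SwiGLU only; the paper's SwiGLU-$S^2$ variant additionally applies this on samples of equivariant features over the sphere $S^2$, which is not reproduced here.

```python
import numpy as np

def swish(x):
    """Swish (SiLU) activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def swiglu(x, W_gate, W_val, W_out):
    """Standard SwiGLU feed-forward block:
    out = (Swish(x @ W_gate) * (x @ W_val)) @ W_out.
    Weight names are illustrative; biases omitted for brevity."""
    return (swish(x @ W_gate) * (x @ W_val)) @ W_out
```

The elementwise gating is what gives the block its multiplicative (many-body-style) interactions between features, which the abstract credits for improved theoretical expressivity.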