Crystalite: A Lightweight Transformer for Efficient Crystal Modeling
Tin Hadži Veljković, Joshua Rosenthal, Ivor Lončarić, Jan-Willem van de Meent
TLDR
Crystalite is a lightweight diffusion Transformer for crystalline materials that achieves state-of-the-art results on crystal structure prediction benchmarks while sampling substantially faster than geometry-heavy alternatives.
Key contributions
- Introduces Crystalite, a lightweight diffusion Transformer for crystal modeling.
- Employs Subatomic Tokenization for compact, chemically structured atom representation.
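- Uses a Geometry Enhancement Module (GEM) to inject periodic minimum-image pair geometry into attention through additive biases.
- Achieves state-of-the-art results on crystal structure prediction benchmarks and the best S.U.N. discovery score among the evaluated baselines for de novo generation, with substantially faster sampling.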
Why it matters
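Generative models for crystals often rely on equivariant graph neural networks, which are costly to train and slow to sample. Crystalite keeps the simplicity and efficiency of a standard Transformer while adding inductive biases suited to crystal structures, which could make large-scale screening for new materials cheaper and faster.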
Original Abstract
Generative models for crystalline materials often rely on equivariant graph neural networks, which capture geometric structure well but are costly to train and slow to sample. We present Crystalite, a lightweight diffusion Transformer for crystal modeling built around two simple inductive biases. The first is Subatomic Tokenization, a compact chemically structured atom representation that replaces high-dimensional one-hot encodings and is better suited to continuous diffusion. The second is the Geometry Enhancement Module (GEM), which injects periodic minimum-image pair geometry directly into attention through additive geometric biases. Together, these components preserve the simplicity and efficiency of a standard Transformer while making it better matched to the structure of crystalline materials. Crystalite achieves state-of-the-art results on crystal structure prediction benchmarks, and de novo generation performance, attaining the best S.U.N. discovery score among the evaluated baselines while sampling substantially faster than geometry-heavy alternatives.
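To make the GEM idea more concrete, here is a minimal sketch of how periodic minimum-image distances could enter attention as an additive bias. It is not the authors' implementation: the function names, the linear bias form `-distance / length_scale`, and the `length_scale` parameter are all assumptions chosen for illustration, and the 27-cell image search assumes a reasonably reduced (not strongly skewed) unit cell.

```python
import itertools
import math


def min_image_distance(frac_i, frac_j, lattice):
    """Shortest distance from atom i to any periodic image of atom j.

    frac_i, frac_j: fractional coordinates (length-3 sequences).
    lattice: 3x3 matrix whose rows are the lattice vectors a, b, c.
    """
    # Wrap the fractional difference into [-0.5, 0.5] along each axis.
    delta = [fj - fi for fi, fj in zip(frac_i, frac_j)]
    delta = [d - round(d) for d in delta]
    best = math.inf
    # Check the 27 neighbouring cells (assumes a reasonably reduced cell).
    for shift in itertools.product((-1, 0, 1), repeat=3):
        frac = [d + s for d, s in zip(delta, shift)]
        # Convert the fractional offset to Cartesian coordinates.
        cart = [sum(frac[k] * lattice[k][axis] for k in range(3))
                for axis in range(3)]
        best = min(best, math.sqrt(sum(c * c for c in cart)))
    return best


def biased_attention_weights(queries, keys, frac_coords, lattice,
                             length_scale=1.0):
    """Softmax attention weights with an additive geometric bias.

    The bias -distance / length_scale is a hypothetical choice, not the
    form used in the paper.
    """
    dim = len(queries[0])
    weights = []
    for i, q in enumerate(queries):
        logits = []
        for j, k in enumerate(keys):
            content = sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            dist = min_image_distance(frac_coords[i], frac_coords[j], lattice)
            logits.append(content - dist / length_scale)
        # Numerically stable softmax over the keys.
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Toy example: a 4 Å cubic cell with three atoms along the a axis.
lattice = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
frac_coords = [[0.05, 0.0, 0.0], [0.95, 0.0, 0.0], [0.5, 0.0, 0.0]]
# Zero queries and keys isolate the effect of the geometric bias.
queries = [[0.0, 0.0] for _ in frac_coords]
keys = [[0.0, 0.0] for _ in frac_coords]
weights = biased_attention_weights(queries, keys, frac_coords, lattice)
```

In this toy cell, atoms 0 and 1 sit on opposite sides of the cell boundary, so their periodic images are much closer than their in-cell coordinates suggest. With the bias applied, atom 0 attends more strongly to atom 1 than to atom 2, which is the behavior a periodicity-aware bias is meant to produce.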