SLIM: Sparse Latent Steering for Interpretable and Property-Directed LLM-Based Molecular Editing

May 11, 2026 · arXiv:2605.10831

Mingxu Zhang, Yuhan Li, Lujundong Li, Dazhong Shen, Hui Xiong + 1 more

cs.LG, cs.AI, cs.CE, cs.CL

TLDR

SLIM enhances LLM molecular editing by using sparse latent steering to precisely control properties and improve success rates.

Key contributions

Decomposes LLM hidden states into sparse, property-aligned features using a Sparse Autoencoder.
Enables precise steering in this sparse space, activating only property-relevant dimensions for editing.
Improves molecular editing success rates by up to 42.4 points on the MolEditRL benchmark without modifying the underlying LLM parameters.
Provides interpretable analysis of LLM editing behavior through its sparse feature basis.

Why it matters

LLMs are capable molecular editors, but property-relevant information is entangled across their dense hidden states, so a substantial fraction of edits fail to improve, or even degrade, the target property. SLIM offers a plug-and-play way to steer edits toward a target property without modifying model parameters, making LLM-based molecular editing more effective and reliable for drug discovery and materials science.

Original Abstract

Large language models possess strong chemical reasoning capabilities, making them effective molecular editors. However, property-relevant information is implicitly entangled across their dense hidden states, providing no explicit handle for property control: a substantial fraction of edits fail to improve or even degrade target properties. To address these issues, we propose SLIM (Sparse Latent Interpretable Molecular editing), a plug-and-play framework that decomposes the editor's hidden states into sparse, property-aligned features via a Sparse Autoencoder with learnable importance gates. Steering in this sparse feature space precisely activates property-relevant dimensions, improving editing success rate without modifying model parameters. The same sparse basis further supports interpretable analysis of editing behavior. Experiments on the MolEditRL benchmark across four model architectures and eight molecular properties show consistent gains over baselines, with improvements of up to 42.4 points.
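
The steering idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the randomly initialised weights, the sigmoid form of the importance gates, the chosen feature indices, and the function names (`encode`, `decode`, `steer`) are all assumptions made for illustration. It shows only the general pattern of encoding a hidden state into sparse features, boosting selected features, and adding the decoded change back to the hidden state while the model weights stay fixed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a small dense hidden state and a wider sparse dictionary.
d_model, d_sparse = 16, 64

# Randomly initialised stand-ins for SAE weights (a real SAE would be trained).
W_enc = rng.normal(scale=0.1, size=(d_model, d_sparse))
b_enc = np.zeros(d_sparse)
W_dec = rng.normal(scale=0.1, size=(d_sparse, d_model))
b_dec = np.zeros(d_model)

# Assumption: importance gates modelled as one sigmoid weight per feature.
gate_logits = rng.normal(size=d_sparse)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def encode(h):
    """Map a dense hidden state to non-negative, gate-scaled sparse features."""
    pre = (h - b_dec) @ W_enc + b_enc
    return np.maximum(pre, 0.0) * sigmoid(gate_logits)


def decode(z):
    """Map sparse features back to the dense hidden-state space."""
    return z @ W_dec + b_dec


def steer(h, feature_ids, strength):
    """Boost the chosen sparse features and add only the decoded change to h."""
    z = encode(h)
    z_steered = z.copy()
    z_steered[..., feature_ids] += strength
    # Adding the decoded difference keeps the part of h the SAE does not explain.
    return h + decode(z_steered) - decode(z)


h = rng.normal(size=d_model)          # stand-in for one LLM hidden state
feature_ids = np.array([3, 17])       # hypothetical property-aligned features
h_steered = steer(h, feature_ids, strength=2.0)

# The applied shift equals the scaled sum of the selected decoder directions.
delta = h_steered - h
expected = 2.0 * W_dec[feature_ids].sum(axis=0)
```

In this sketch the edit touches only the decoder directions of the selected features, which is one way to read "activating only property-relevant dimensions". How SLIM selects features and sets the steering strength is not specified in this summary.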
