AblateCell: A Reproduce-then-Ablate Agent for Virtual Cell Repositories

April 21, 2026

arXiv:2604.19606

Xue Xia, Chengkai Yao, Mingyu Tsoi, Xinjie Mao, Wenxuan Huang + 7 more

cs.AI, cs.MA

TLDR

AblateCell is an AI agent that reproduces baselines and performs systematic ablations on virtual cell repositories to identify critical components.

Key contributions

Reproduces baselines in virtual cell repositories by auto-configuring environments and resolving dependencies.
Conducts closed-loop ablations, generating repository mutations and adaptively selecting experiments.
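Achieves 88.9% end-to-end workflow success and 93.3% accuracy in recovering ground-truth critical components across three single-cell perturbation prediction repositories (CPA, GEARS, BioLORD).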

Why it matters

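Systematic ablations are rarely performed on AI Virtual Cells because biological repositories are under-standardized and tightly coupled to domain-specific data and formats. AblateCell closes this gap with a verifier that reproduces reported baselines and then tests which components account for the performance gains. This enables scalable, repository-grounded verification and attribution directly on biological codebases.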

Original Abstract

Systematic ablations are essential to attribute performance gains in AI Virtual Cells, yet they are rarely performed because biological repositories are under-standardized and tightly coupled to domain-specific data and formats. While recent coding agents can translate ideas into implementations, they typically stop at producing code and lack a verifier that can reproduce strong baselines and rigorously test which components truly matter. We introduce AblateCell, a reproduce-then-ablate agent for virtual cell repositories that closes this verification gap. AblateCell first reproduces reported baselines end-to-end by auto-configuring environments, resolving dependency and data issues, and rerunning official evaluations while emitting verifiable artifacts. It then conducts closed-loop ablation by generating a graph of isolated repository mutations and adaptively selecting experiments under a reward that trades off performance impact and execution cost. Evaluated on three single-cell perturbation prediction repositories (CPA, GEARS, BioLORD), AblateCell achieves 88.9% (+29.9% to human expert) end-to-end workflow success and 93.3% (+53.3% to heuristic) accuracy in recovering ground-truth critical components. These results enable scalable, repository-grounded verification and attribution directly on biological codebases.
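The abstract describes selecting ablation experiments under a reward that trades off performance impact against execution cost, but it does not give the reward's form. The following is a minimal sketch of that idea under stated assumptions, not the authors' method: the linear reward, the `cost_weight` parameter, the budget, and all names (`Ablation`, `reward`, `ablation_loop`, `run_experiment`) are hypothetical. It only illustrates a loop that repeatedly picks the highest-reward affordable mutation, runs it, and records the observed result.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Ablation:
    """One isolated repository mutation (hypothetical representation)."""
    name: str               # label for the component being removed
    expected_impact: float  # estimated metric change if the component is removed
    cost: float             # estimated execution cost, e.g. GPU-hours


def reward(a: Ablation, cost_weight: float = 0.1) -> float:
    # Assumed linear trade-off; the paper's actual reward is not given here.
    return a.expected_impact - cost_weight * a.cost


def ablation_loop(
    candidates: List[Ablation],
    run_experiment: Callable[[Ablation], float],
    budget: float,
    cost_weight: float = 0.1,
) -> Dict[str, float]:
    """Run ablations one at a time, best reward first, until the budget runs out."""
    pending = list(candidates)
    results: Dict[str, float] = {}
    while pending:
        # Only consider mutations that still fit in the remaining budget.
        affordable = [a for a in pending if a.cost <= budget]
        if not affordable:
            break
        best = max(affordable, key=lambda a: reward(a, cost_weight))
        pending.remove(best)
        budget -= best.cost
        # Record the observed impact returned by the (caller-supplied) runner.
        results[best.name] = run_experiment(best)
    return results
```

In a real agent, `run_experiment` would apply the mutation to the repository and rerun the official evaluation, and the impact estimates would presumably be updated from earlier results; this sketch leaves both to the caller.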
