AblateCell: A Reproduce-then-Ablate Agent for Virtual Cell Repositories
Xue Xia, Chengkai Yao, Mingyu Tsoi, Xinjie Mao, Wenxuan Huang + 7 more
TLDR
AblateCell is an AI agent that reproduces baselines and performs systematic ablations on virtual cell repositories to identify critical components.
Key contributions
- Reproduces baselines in virtual cell repositories by auto-configuring environments and resolving dependencies.
- Conducts closed-loop ablations, generating repository mutations and adaptively selecting experiments.
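- Achieves 88.9% end-to-end workflow success (+29.9% vs. a human expert) and 93.3% accuracy in recovering ground-truth critical components (+53.3% vs. a heuristic) on three single-cell perturbation prediction repositories (CPA, GEARS, BioLORD).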
Why it matters
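Systematic ablations are rarely run on AI Virtual Cell models because biological repositories are under-standardized and tightly coupled to domain-specific data and formats. AblateCell closes this gap by first reproducing reported baselines and then testing which components actually drive the reported performance. This enables scalable, repository-grounded verification and attribution directly on biological codebases.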
Original Abstract
Systematic ablations are essential to attribute performance gains in AI Virtual Cells, yet they are rarely performed because biological repositories are under-standardized and tightly coupled to domain-specific data and formats. While recent coding agents can translate ideas into implementations, they typically stop at producing code and lack a verifier that can reproduce strong baselines and rigorously test which components truly matter. We introduce AblateCell, a reproduce-then-ablate agent for virtual cell repositories that closes this verification gap. AblateCell first reproduces reported baselines end-to-end by auto-configuring environments, resolving dependency and data issues, and rerunning official evaluations while emitting verifiable artifacts. It then conducts closed-loop ablation by generating a graph of isolated repository mutations and adaptively selecting experiments under a reward that trades off performance impact and execution cost. Evaluated on three single-cell perturbation prediction repositories (CPA, GEARS, BioLORD), AblateCell achieves 88.9% (+29.9% to human expert) end-to-end workflow success and 93.3% (+53.3% to heuristic) accuracy in recovering ground-truth critical components. These results enable scalable, repository-grounded verification and attribution directly on biological codebases.
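The abstract describes selecting ablation experiments "under a reward that trades off performance impact and execution cost" but does not give the formula. As a rough illustration only, the sketch below assumes a linear reward (estimated impact minus a weighted cost) and a greedy pick under a fixed compute budget. The names `Mutation`, `reward`, `select_experiments`, the `cost_weight` parameter, and all numbers are hypothetical and are not taken from the paper or its code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    """A candidate ablation (hypothetical structure, not the paper's)."""
    name: str
    est_impact: float  # assumed estimate of how much the metric changes if ablated
    cost: float        # assumed estimate of execution cost, e.g. GPU-hours


def reward(m: Mutation, cost_weight: float) -> float:
    # Assumed linear trade-off: higher expected impact is good, higher cost is bad.
    return m.est_impact - cost_weight * m.cost


def select_experiments(candidates, budget, cost_weight=0.1):
    """Greedily pick mutations by descending reward while they fit the budget."""
    chosen = []
    remaining = budget
    ranked = sorted(candidates, key=lambda m: reward(m, cost_weight), reverse=True)
    for m in ranked:
        if m.cost <= remaining:
            chosen.append(m)
            remaining -= m.cost
    return chosen


# Illustrative candidates with made-up values.
candidates = [
    Mutation("mutation_a", est_impact=0.5, cost=1.0),
    Mutation("mutation_b", est_impact=0.4, cost=4.0),
    Mutation("mutation_c", est_impact=0.1, cost=0.5),
]
selected = [m.name for m in select_experiments(candidates, budget=2.0)]
```

With these made-up values, `mutation_b` has a large estimated impact but is skipped because its cost exceeds the remaining budget. A closed-loop version, as the abstract describes, would presumably rerun the selection after each experiment with updated impact estimates; that step is omitted here.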