AI Co-Mathematician: Accelerating Mathematicians with Agentic AI
Daniel Zheng, Ingrid von Glehn, Yori Zwols, Iuliya Beloshapka, Lars Buesing + 13 more
TLDR
The AI co-mathematician is an interactive workbench using agentic AI to accelerate mathematicians' open-ended research and discovery.
Key contributions
- Introduces an interactive AI workbench for accelerating mathematicians' open-ended research.
- Provides holistic support for the full mathematical workflow: ideation, literature search, computational exploration, theorem proving, and theory building.
- Offers an asynchronous, stateful workspace that manages uncertainty and mirrors human collaboration.
- Scores 48% on FrontierMath Tier 4, a new high score among all AI systems evaluated.
Why it matters
This paper introduces an interactive paradigm for AI-assisted mathematical discovery. In early tests, the system helped researchers solve open problems, identify new research directions, and uncover overlooked literature references. It also scores 48% on FrontierMath Tier 4, a new high score among all AI systems evaluated.
Original Abstract
We introduce the AI co-mathematician, a workbench for mathematicians to interactively leverage AI agents to pursue open-ended research. The AI co-mathematician is optimized to provide holistic support for the exploratory and iterative reality of mathematical workflows, including ideation, literature search, computational exploration, theorem proving and theory building. By providing an asynchronous, stateful workspace that manages uncertainty, refines user intent, tracks failed hypotheses, and outputs native mathematical artifacts, the system mirrors human collaborative workflows. In early tests, the AI co-mathematician helped researchers solve open problems, identify new research directions, and uncover overlooked literature references. Besides demonstrating a highly interactive paradigm for AI-assisted mathematical discovery, the AI co-mathematician also achieves state of the art results on hard problem-solving benchmarks, including scoring 48% on FrontierMath Tier 4, a new high score among all AI systems evaluated.