MAS-Algorithm: A Workflow for Solving Algorithmic Programming Problems with a Multi-Agent System
Yuliang Xu, Xiang Xu, Yao Wan, Hu Wei, Tong Jia
TLDR
MAS-Algorithm introduces a multi-agent workflow for algorithmic problem solving, significantly boosting AI coding system performance and interpretability.
Key contributions
- Proposes MAS-Algorithm, a multi-agent workflow for algorithmic problem solving, inspired by human experts.
- Decomposes problem-solving into modular stages, enabling structured reasoning, tool integration, and agent coordination.
- Achieves significant performance gains (an average 6.48% acceptance-rate gain on a self-constructed benchmark and 4.72% on LiveCodeBench-Pro).
- Provides comprehensive analysis of reasoning, error patterns, and individual agent contributions.
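The staged decomposition described above can be pictured as a pipeline of agents that each enrich a shared context. The sketch below is a minimal, hypothetical illustration: the paper does not specify its agent roles or interfaces here, so the stage names (analyze, plan, implement, verify) and the `Context` structure are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Context:
    """Shared state passed between agents (hypothetical structure)."""
    problem: str
    notes: List[str] = field(default_factory=list)

# An agent is any function that reads the context and returns it enriched.
Agent = Callable[[Context], Context]

def analyze(ctx: Context) -> Context:
    ctx.notes.append(f"analysis of: {ctx.problem}")
    return ctx

def plan(ctx: Context) -> Context:
    ctx.notes.append("plan: choose algorithm and data structures")
    return ctx

def implement(ctx: Context) -> Context:
    ctx.notes.append("implementation: draft solution code")
    return ctx

def verify(ctx: Context) -> Context:
    ctx.notes.append("verification: run tests / external tools")
    return ctx

def run_pipeline(problem: str, stages: List[Agent]) -> Context:
    """Run each agent stage in order over a shared context."""
    ctx = Context(problem=problem)
    for stage in stages:
        ctx = stage(ctx)
    return ctx

result = run_pipeline("two-sum", [analyze, plan, implement, verify])
```

Because each stage is an independent function, stages can be replaced or ablated individually, which mirrors the replacement and ablation studies the paper reports on individual agents.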
Why it matters
MAS-Algorithm shows that a coordinated multi-agent workflow can significantly improve an AI system's ability to solve complex algorithmic problems. Compared with model-centric strategies such as architectural modification and data scaling, it is both more interpretable and more effective, delivering substantial performance gains and pointing the way toward more robust AI coding systems.
Original Abstract
Algorithmic problem solving serves as a rigorous testbed for evaluating structured reasoning in AI coding systems, as it directly reflects a model's ability to perform structured reasoning in complex scenarios. Existing approaches predominantly rely on model-centric strategies, such as architectural modifications and data scaling, which are costly and offer limited interpretability. Alternative methods leveraging external tools or prompting techniques (e.g., chain-of-thought) are often fragmented and lack a unified framework. In this paper, we propose MAS-Algorithm, a systematic multi-agent workflow for algorithmic problem solving inspired by the practices of competitive programmers and algorithm engineers. Our framework decomposes the end-to-end solving process into modular stages, enabling structured reasoning, tool integration, and flexible coordination among agents. The design emphasizes both rigor and extensibility, allowing it to generalize across diverse problem types.

Experimental results on a self-constructed benchmark demonstrate consistent improvements across multiple Qwen series models, achieving an average gain of 6.48% in acceptance rate. In contrast, parameter-efficient fine-tuning on the same data yields only a marginal improvement of 0.89%. We further observe a 4.72% gain on LiveCodeBench-Pro, along with consistent improvements across additional accuracy and efficiency metrics.

Beyond performance gains, we conduct comprehensive analyses to better understand the reasoning process within the workflow, including error patterns and cross-scenario behaviors. We further perform customized replacement and ablation studies to explore the upper bound of the framework, showing that individual agents can contribute improvements of up to 27.7%. These results highlight the strong potential of MAS-Algorithm for advancing AI-driven algorithmic reasoning.