CollabCoder: Plan-Code Co-Evolution via Collaborative Decision-Making for Efficient Code Generation
Duy Tung Doan, Quang Huy Phung, Dzung Nguyen, Khac-Hoai Nam Bui
TLDR
CollabCoder introduces a Plan-Code Co-Evolution framework using dynamic multi-agent collaboration for efficient and robust automated code generation.
Key contributions
- Introduces CollabCoder, a Plan-Code Co-Evolution framework for dynamic code generation.
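- Lets the plan module and the code module jointly decide which of the two should run at each debugging step.
- Improves performance by 11-20% over strong baselines on the more challenging LiveCodeBench and xCodeEval benchmarks, with consistent gains in code quality and robustness across tasks.
- Reduces API calls by an average of 4-10 per execution on LiveCodeBench and xCodeEval while matching or exceeding state-of-the-art performance.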
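
The decision step named in the contributions can be illustrated with a minimal sketch. The abstract does not specify how the plan and code modules reach a decision, so the confidence-weighted vote below is an assumption for illustration only, and every name here (`Vote`, `decide`, `co_evolve`, and the callback parameters) is hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Vote:
    """One module's opinion on where the fault lies (hypothetical structure)."""
    target: str        # "plan" or "code"
    confidence: float  # in [0, 1]


def decide(votes: List[Vote]) -> str:
    """Return the module to revise next.

    Assumed rule: sum confidence per target and pick the larger total.
    Ties go to "code", since a code-only fix costs fewer model calls here.
    """
    totals = {"plan": 0.0, "code": 0.0}
    for vote in votes:
        totals[vote.target] += vote.confidence
    return "plan" if totals["plan"] > totals["code"] else "code"


def co_evolve(
    plan: str,
    code: str,
    run_tests: Callable[[str], Optional[str]],
    get_votes: Callable[[str, str, str], List[Vote]],
    revise_plan: Callable[[str, str], str],
    revise_code: Callable[[str, str, str], str],
    max_rounds: int = 5,
) -> Tuple[str, str, int]:
    """Debug loop in which plan and code are revised selectively.

    run_tests returns None on success, otherwise a feedback string.
    The returned counter tracks revision calls, a stand-in for API calls.
    """
    calls = 0
    for _ in range(max_rounds):
        feedback = run_tests(code)
        if feedback is None:
            break  # all tests pass, stop early
        if decide(get_votes(plan, code, feedback)) == "plan":
            # The plan is judged faulty: revise it, then regenerate the code.
            plan = revise_plan(plan, feedback)
            code = revise_code(plan, code, feedback)
            calls += 2
        else:
            # The plan is judged sound: patch only the code.
            code = revise_code(plan, code, feedback)
            calls += 1
    return plan, code, calls
```

In this sketch, a code-only fix costs one revision call while a plan fix costs two, which shows one way that choosing the module to run could lower the call count compared with a pipeline that always replans. Whether CollabCoder's savings arise this way is not stated in the abstract.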
Why it matters
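Conventional multi-agent code generation frameworks are constrained by static planning, isolated execution, and high computational overhead. CollabCoder replaces the fixed pipeline with dynamic collaboration between its plan and code modules, and its efficiency gains grow as benchmark difficulty increases. Matching or exceeding state-of-the-art results with fewer API calls makes multi-agent code generation more practical for complex tasks.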
Original Abstract
Automated code generation remains a persistent challenge in software engineering, as conventional multi-agent frameworks are often constrained by static planning, isolated execution, high computational overhead, and limited adaptability to complex tasks. This paper introduces CollabCoder, a novel Plan-Code Co-Evolution framework that improves code generation through dynamic multi-agent collaboration. The core idea is to design a collaborative decision-making process between the plan module and the code module to decide which module should be executed for the debugging process. Extensive experiments on widely used benchmarks demonstrate that CollabCoder consistently improves code quality and robustness across tasks. Importantly, CollabCoder achieves performance comparable to or exceeding current state-of-the-art methods while reducing computational overhead, with efficiency gains becoming more pronounced as benchmark difficulty increases. On the more challenging LiveCodeBench and xCodeEval benchmarks, our approach improves performance by 11-20% over strong baselines while reducing the number of API calls by an average of 4-10 per execution.