ArXiv TLDR

SWE-AGILE: A Software Agent Framework for Efficiently Managing Dynamic Reasoning Context

arXiv:2604.11716

Shuquan Lian, Juncheng Liu, Yazhe Chen, Yuhong Chen, Hui Li

cs.AI, cs.CL

TLDR

SWE-AGILE is a software agent framework that uses dynamic context management to improve reasoning depth and efficiency in multi-turn SWE tasks.

Key contributions

  • Manages context explosion in multi-turn SWE tasks by balancing depth and efficiency.
  • Employs a Dynamic Reasoning Context strategy with a "sliding window" for continuity.
  • Compresses historical reasoning into concise "Reasoning Digests" to prevent redundancy.
  • Sets a new standard for 7B-8B models on SWE-Bench-Verified using only 2.2k trajectories and 896 tasks.

Why it matters

This paper addresses a critical challenge in autonomous software engineering: keeping enough reasoning history for deep analysis without letting the context grow out of control. SWE-AGILE keeps recent reasoning in full and compresses older reasoning into short digests, so the agent avoids both context explosion and redundant re-reasoning at every step. The authors report a new standard for 7B-8B models on SWE-Bench-Verified.

Original Abstract

Prior representative ReAct-style approaches in autonomous Software Engineering (SWE) typically lack the explicit System-2 reasoning required for deep analysis and handling complex edge cases. While recent reasoning models demonstrate the potential of extended Chain-of-Thought (CoT), applying them to the multi-turn SWE task creates a fundamental dilemma: retaining full reasoning history leads to context explosion and "Lost-in-the-Middle" degradation, while discarding it would force the agent to redundantly re-reason at every step. To address these challenges, we propose SWE-AGILE, a novel software agent framework designed to bridge the gap between reasoning depth, efficiency, and context constraints. SWE-AGILE introduces a Dynamic Reasoning Context strategy, maintaining a "sliding window" of detailed reasoning for immediate continuity to prevent redundant re-analyzing, while compressing historical reasoning content into concise Reasoning Digests. Empirically, SWE-AGILE sets a new standard for 7B-8B models on SWE-Bench-Verified using only 2.2k trajectories and 896 tasks. Code is available at https://github.com/KDEGroup/SWE-AGILE.
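
The sliding-window idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Turn` structure, the `build_context` function, the `window` parameter, and the rendered text layout are all hypothetical names chosen for illustration, and the sketch assumes each turn already carries a digest (how digests are produced is not covered here).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """One agent step. All field names are hypothetical, for illustration only."""
    reasoning: str    # detailed reasoning produced at this step
    digest: str       # short summary of that reasoning (assumed to exist already)
    action: str       # command or tool call issued by the agent
    observation: str  # environment output returned for the action


def build_context(turns: List[Turn], window: int = 2) -> List[str]:
    """Render the history for the next prompt.

    The most recent `window` turns keep their full reasoning; every older
    turn is shown with its digest instead.
    """
    cutoff = max(len(turns) - window, 0)  # turns with index < cutoff are compressed
    rendered = []
    for i, turn in enumerate(turns):
        if i < cutoff:
            thought = f"digest: {turn.digest}"
        else:
            thought = f"reasoning: {turn.reasoning}"
        rendered.append(
            f"[turn {i}] {thought}\n"
            f"action: {turn.action}\n"
            f"observation: {turn.observation}"
        )
    return rendered
```

With four turns and `window=2`, the first two entries would carry only digests while the last two keep their full reasoning, so the prompt grows slowly with the number of turns while the agent still sees its latest detailed analysis.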
