ArXiv TLDR

To What Extent Does Agent-generated Code Require Maintenance? An Empirical Study

2605.06464

Shota Sawada, Tatsuya Shirai, Yutaro Kashiwa, Ken'ichi Yamaguchi, Hiroshi Iwata + 1 more

cs.SE

TLDR

An empirical study finds that AI-generated code requires less frequent maintenance than human-authored code; when it is modified, the changes are mostly feature extensions performed by human developers, whereas human-authored code is updated mainly for bug fixes.

Key contributions

  • AI-generated code receives less frequent maintenance than human-authored code.
  • Modifications to AI code are mostly feature extensions, while human code updates are bug fixes.
  • Human developers perform the vast majority of maintenance on AI-generated code.

Why it matters

This paper sheds light on the long-term implications of using LLM-generated code in software development. Although AI-generated code may need less frequent updates, human oversight remains crucial, especially for evolving features. These findings can inform the design of future autonomous coding agents.

Original Abstract

LLM-based autonomous coding agents have reshaped software development. While these agents excel at code generation, open questions persist about the long-term maintainability of AI-generated code. This study empirically investigates the maintenance extent, human involvement, and modification types of AI-generated files versus human-authored code. Using the AIDev dataset of AI-generated pull requests and GitHub, we analyzed over 1,000 files and approximately 3,200 changes from 100 popular repositories. Our findings show that: (i) AI-generated files receive less frequent maintenance than human-authored code, with updates affecting only a small fraction of file size; (ii) the most frequent modifications to AI code are feature extensions, whereas human updates focus on bug fixes; and (iii) human developers perform the large majority of this maintenance.
