Insights into Security-Related AI-Generated Pull Requests

April 21, 2026 | arXiv:2604.19965

Md Fazle Rabbi, Asif K. Turzo, Arifa I. Champa, Minhaz F. Zibran

cs.SE

TLDR

This paper analyzes more than 33,000 AI-generated pull requests and identifies 675 security-related ones. These introduce a small set of recurring weaknesses, and many flawed contributions are merged anyway.

Key contributions

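Analyzed more than 33,000 AI-generated PRs and identified 675 security-related submissions made by agentic AIs.
Identified a small set of recurring weaknesses in security-related AI PRs, such as regex inefficiencies, injection flaws, and path traversal.
Found that many flawed AI security PRs are still merged, while rejections often arise from social or process factors such as inactivity or missing test coverage.
Found that commit message quality has a limited effect on the acceptance or latency of AI PRs, in contrast to what previous studies report for human PRs.
Extended existing rejection taxonomies with categories unique to AI-generated security contributions.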
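
To make one of the weakness classes named above concrete, here is a minimal sketch of a path traversal guard. It is not taken from the paper or its dataset; the function name `resolve_inside` and the directory names are hypothetical, and the check assumes Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it.

    Hypothetical helper for illustration only.
    """
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so the check below
    # also catches inputs such as "/etc/passwd".
    candidate = (base / user_path).resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # comparison is made on the final location, not the raw string.
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

A call such as `resolve_inside("/srv/uploads", "../etc/passwd")` raises `ValueError`, while `resolve_inside("/srv/uploads", "a/b.txt")` returns the resolved path under the base directory. The flawed pattern this guards against is joining untrusted input to a base path and opening the result without any containment check.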

Why it matters

The study shows that AI coding agents introduce a small set of recurring security weaknesses in pull requests, and that many flawed contributions are merged anyway. This underscores the need for better security practices in AI coding agents and for stronger review processes. The findings also offer insight into the strengths and shortcomings of autonomous coding systems in secure software development.

Original Abstract

Recent years have experienced growing contributions of AI coding agents that assist human developers in various software engineering tasks. However, this growing AI-assisted autonomy raises questions about security and trust. In this paper, we analyze more than 33,000 AI-generated pull requests (PRs) and identify 675 security-related submissions made by agentic AIs. Then we examine the security-related PRs with a focus on recurring security weaknesses, review outcomes and latency, commit message quality, and rejection reasons. The results show that security-related AI PRs introduce a small set of recurring weaknesses such as regex inefficiencies, injection flaws, and path traversal. Many flawed contributions are still merged, while rejections often arise from social or process factors such as inactivity or missing test coverage. The commit message quality of AI PRs has a limited effect on acceptance or latency, in contrast to human PRs reported in previous studies. We also extend existing rejection taxonomies by adding categories that are unique to AI-generated security contributions. These findings offer new insights into the strengths and shortcomings of autonomous coding systems in secure software development.


