ArXiv TLDR

Autonomous Adversary: Red-Teaming in the Age of LLMs

arXiv: 2605.06486

Mohammad Mamun, Mohamed Gaber, Scott Buffett, Sherif Saad

cs.CR

TLDR

This paper explores Language Model Agents (LMAs) for red-teaming, benchmarking their effectiveness in lateral movement scenarios and identifying key limitations.

Key contributions

  • LMAs can support red-team operations through attack planning and multi-step adversary emulation, including lateral movement.
  • Benchmarked LMAs in two lateral-movement scenarios within a controlled adversary-emulation environment.
  • Compared fully autonomous, self-scaffolded, and expert-defined action plans for LMA operations.
  • Expert-defined plans performed best, but all modalities failed frequently due to brittle command invocation, environment and deployment instability, and errors in credential and state handling.

Why it matters

This paper is crucial for understanding the current capabilities and limitations of LLM agents in cybersecurity red-teaming. It highlights that while expert guidance improves performance, significant challenges remain in deploying these autonomous adversaries effectively.

Original Abstract

Language Model Agents (LMAs) are emerging as a powerful primitive for augmenting red-team operations. They can support attack planning, adversary emulation, and the orchestration of multi-step activity such as lateral movement, a core enabling capability of advanced persistent threat (APT) campaigns. Using frameworks such as MITRE ATT&CK, we analyze where these agents intersect with core offensive functions and assess current strengths and limitations of LMAs with an emphasis on governance and realistic evaluation. We benchmark LMAs across two lateral-movement scenarios in a controlled adversary-emulation environment, where LMAs interact with instrumented cyber agents, observe execution artifacts, and iteratively adapt based on environmental feedback. Each scenario is formalized as an ordered task chain with explicit validation predicates, leveraging an LLM-as-a-Judge paradigm to ensure deterministic outcome verification. We compare three operational modalities: fully autonomous execution, self-scaffolded planning, and expert-defined action plans. Preliminary findings indicate that expert-defined action plans yield higher task-completion rates relative to other operational modes. However, failure remains frequent across all modalities, largely attributable to brittle command invocation, environmental and deployment instability, and recurring errors in credential management and state handling.
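The "ordered task chain with explicit validation predicates" described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names (`Task`, `run_chain`) and the toy lateral-movement state are hypothetical, and the validation predicates here are plain Python checks standing in for the paper's LLM-as-a-Judge verification step.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    """One step in an ordered task chain (names are illustrative)."""
    name: str
    action: Callable[[dict], dict]    # acts on the environment state
    validate: Callable[[dict], bool]  # explicit validation predicate

def run_chain(tasks: list[Task], state: dict):
    """Execute tasks in order; halt at the first failed predicate."""
    completed = []
    for task in tasks:
        state = task.action(state)
        if not task.validate(state):
            return completed, False, state
        completed.append(task.name)
    return completed, True, state

# Toy two-step chain over a dict "environment", loosely mirroring
# a credential-harvest-then-pivot lateral-movement scenario.
tasks = [
    Task("harvest_creds",
         action=lambda s: {**s, "creds": ["svc_account"]},
         validate=lambda s: bool(s.get("creds"))),
    Task("move_to_host_b",
         action=lambda s: {**s, "host": "B"} if s.get("creds") else s,
         validate=lambda s: s.get("host") == "B"),
]

completed, ok, final = run_chain(tasks, {"host": "A"})
print(completed, ok)  # ['harvest_creds', 'move_to_host_b'] True
```

Ordering matters here by construction: a chain that attempts `move_to_host_b` before `harvest_creds` fails its predicate immediately, which is what makes outcome verification deterministic regardless of how the agent reached each step.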

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.