PIArena: A Platform for Prompt Injection Evaluation
Runpeng Geng, Chenlong Yin, Yanting Wang, Ying Chen, Jinyuan Jia
TLDR
PIArena is a unified platform for evaluating prompt injection defenses, revealing their limitations in generalizability and robustness.
Key contributions
- Introduces PIArena, a unified and extensible platform for prompt injection evaluation.
- Enables easy integration and comparison of state-of-the-art attacks and defenses across benchmarks.
- Designs a novel dynamic strategy-based attack that adaptively optimizes injected prompts.
- Uncovers critical limitations of current defenses, including poor generalizability and vulnerability to adaptive attacks.
Why it matters
The absence of a standardized evaluation platform has made it difficult to assess prompt injection defenses reliably. PIArena fills this critical gap, providing a much-needed environment for rigorous testing. Its findings expose significant weaknesses in current defenses, paving the way for the development of more robust and generalizable solutions.
Original Abstract
Prompt injection attacks pose serious security risks across a wide range of real-world applications. While receiving increasing attention, the community faces a critical gap: the lack of a unified platform for prompt injection evaluation. This makes it challenging to reliably compare defenses, understand their true robustness under diverse attacks, or assess how well they generalize across tasks and benchmarks. For instance, many defenses initially reported as effective were later found to exhibit limited robustness on diverse datasets and attacks. To bridge this gap, we introduce PIArena, a unified and extensible platform for prompt injection evaluation that enables users to easily integrate state-of-the-art attacks and defenses and evaluate them across a variety of existing and new benchmarks. We also design a dynamic strategy-based attack that adaptively optimizes injected prompts based on defense feedback. Through comprehensive evaluation using PIArena, we uncover critical limitations of state-of-the-art defenses: limited generalizability across tasks, vulnerability to adaptive attacks, and fundamental challenges when an injected task aligns with the target task. The code and datasets are available at https://github.com/sleeepeer/PIArena.
📬 Weekly AI Paper Digest
Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.