Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization

April 16, 2026 | arXiv:2604.15022

Haochun Tang, Yuliang Yan, Jiahua Lu, Huaxiao Liu, Enyan Dai

cs.CR, cs.AI, cs.CL, cs.LG

TLDR

R$^2$A is a novel black-box attack that manipulates LLM routers to select expensive models, demonstrating a new security vulnerability.

Key contributions

Introduces R$^2$A, a black-box adversarial suffix optimization attack on LLM routers.
Proposes a hybrid ensemble surrogate router to effectively mimic black-box routing systems.
Adapts a suffix optimization algorithm for the ensemble surrogate to generate attack suffixes.
Demonstrates R$^2$A's effectiveness on open-source and commercial LLM routing systems.

Why it matters

This paper highlights a critical security vulnerability in cost-aware LLM routing systems. By demonstrating a practical black-box attack, it urges developers to enhance the robustness of their routing mechanisms. The findings are crucial for securing LLM deployments against malicious cost manipulation.

Original Abstract

Cost-aware routing dynamically dispatches user queries to models of varying capability to balance performance and inference cost. However, the routing strategy introduces a new security concern that adversaries may manipulate the router to consistently select expensive high-capability models. Existing routing attacks depend on either white-box access or heuristic prompts, rendering them ineffective in real-world black-box scenarios. In this work, we propose R$^2$A, which aims to mislead black-box LLM routers to expensive models via adversarial suffix optimization. Specifically, R$^2$A deploys a hybrid ensemble surrogate router to mimic the black-box router. A suffix optimization algorithm is further adapted for the ensemble-based surrogate. Extensive experiments on multiple open-source and commercial routing systems demonstrate that R$^2$A significantly increases the routing rate to expensive models on queries of different distributions. Code and examples: https://github.com/thcxiker/R2A-Attack.
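
To make the idea in the abstract concrete, the toy sketch below shows the general shape of suffix search against an ensemble of surrogate scorers. It is not the authors' algorithm: the surrogates (`keyword_surrogate`, `length_surrogate`), the averaging rule in `ensemble_score`, the random coordinate search in `optimize_suffix`, and the vocabulary are all hypothetical stand-ins chosen for illustration. The paper's actual surrogate router and optimizer are in the linked repository.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical surrogate type: maps a query to a score in [0, 1], where a
# higher score means "more likely to be routed to the expensive model".
Surrogate = Callable[[str], float]


def keyword_surrogate(query: str) -> float:
    """Toy surrogate (assumption): counts words that look 'hard'."""
    hard_words = {"prove", "derive", "rigorously"}
    hits = sum(word in hard_words for word in query.lower().split())
    return min(1.0, hits / 3)


def length_surrogate(query: str) -> float:
    """Toy surrogate (assumption): longer queries score higher."""
    return min(1.0, len(query.split()) / 20)


def ensemble_score(query: str, surrogates: Sequence[Surrogate]) -> float:
    """Average the surrogate scores (an assumed aggregation rule)."""
    return sum(s(query) for s in surrogates) / len(surrogates)


def optimize_suffix(
    query: str,
    surrogates: Sequence[Surrogate],
    vocab: List[str],
    suffix_len: int = 4,
    iters: int = 200,
    seed: int = 0,
) -> str:
    """Random coordinate search: mutate one suffix token at a time and keep
    the change only if the ensemble score improves."""
    rng = random.Random(seed)
    suffix = [rng.choice(vocab) for _ in range(suffix_len)]
    best = ensemble_score(query + " " + " ".join(suffix), surrogates)
    for _ in range(iters):
        candidate = suffix.copy()
        candidate[rng.randrange(suffix_len)] = rng.choice(vocab)
        score = ensemble_score(query + " " + " ".join(candidate), surrogates)
        if score > best:
            suffix, best = candidate, score
    return " ".join(suffix)


if __name__ == "__main__":
    toy_vocab = ["prove", "derive", "rigorously", "please", "thanks"]
    toy_surrogates = [keyword_surrogate, length_surrogate]
    toy_query = "what is 2 + 2"
    found = optimize_suffix(toy_query, toy_surrogates, toy_vocab)
    print("suffix:", found)
    print("score before:", ensemble_score(toy_query, toy_surrogates))
    print("score after:", ensemble_score(toy_query + " " + found, toy_surrogates))
```

In a black-box setting the surrogates stand in for a router that cannot be inspected directly, so the search only ever calls the surrogates and never needs gradients or internals of the target router.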
