MASPrism: Lightweight Failure Attribution for Multi-Agent Systems Using Prefill-Stage Signals
Yang Liu, Hongjiang Feng, Junsong Pu, Zhuangbin Chen
TLDR
MASPrism uses prefill-stage signals from a small language model (SLM) for lightweight, fast, and accurate failure attribution in multi-agent systems, outperforming much larger LLMs.
Key contributions
- MASPrism attributes failures in multi-agent systems using lightweight prefill-stage signals from a small language model.
- It identifies symptom-like steps and candidate sources via two prefill passes, avoiding costly decoding.
- Achieves state-of-the-art accuracy, improving Top-1 on Who&When-HC by 33.41% and outperforming Gemini-2.5-Pro.
- Processes traces 6.69x faster than baselines (2.66s/trace) with zero output tokens, making it highly efficient.
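The first prefill pass above extracts token-level negative log-likelihood (NLL) and attention weights from a single forward pass, with no decoding. A minimal sketch of that signal extraction, using random tensors as a stand-in for the SLM's logits and attentions (the real system uses Qwen3-0.6B; `step_spans` and the aggregation choices here are illustrative assumptions, not the paper's exact procedure):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for an SLM prefill pass: one forward over the full trace returns
# next-token logits and attention weights. Random tensors here; a real run
# would use model(input_ids, output_attentions=True).
vocab, seq_len, n_heads = 100, 12, 4
input_ids = torch.randint(0, vocab, (1, seq_len))
logits = torch.randn(1, seq_len, vocab)                 # [batch, seq, vocab]
attn = torch.softmax(torch.randn(1, n_heads, seq_len, seq_len), dim=-1)

# Token-level NLL: the logit at position t predicts token t+1.
shift_logits = logits[:, :-1, :]
shift_labels = input_ids[:, 1:]
nll = F.cross_entropy(
    shift_logits.reshape(-1, vocab), shift_labels.reshape(-1), reduction="none"
).reshape(1, seq_len - 1)                               # per-token surprisal

# Aggregate per execution step (step_spans: hypothetical token ranges per step).
step_spans = [(0, 4), (4, 8), (8, 11)]
step_nll = torch.tensor([nll[0, s:e].mean() for s, e in step_spans])

# Attention mass each step receives, averaged over heads and query positions:
# a rough "which earlier step does the symptom attend to" signal.
mean_attn = attn.mean(dim=1)[0]                         # [seq, seq]
step_attn = torch.tensor(
    [mean_attn[:, s:e].sum(dim=-1).mean() for s, e in step_spans]
)

symptom_step = int(step_nll.argmax())                   # most surprising step
print("per-step NLL:", step_nll.tolist())
print("per-step attention:", step_attn.tolist())
print("symptom-like step:", symptom_step)
```

Because everything comes from one forward pass, the cost is a single prefill over the trace, which is what makes the zero-output-token efficiency claim possible.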
Why it matters
Current failure attribution methods for multi-agent systems are slow and resource-intensive. MASPrism offers a novel, lightweight approach using small language models and prefill-stage signals. This significantly improves accuracy and speed, making it a practical solution for debugging complex LLM-based systems.
Original Abstract
Failure attribution in LLM-based multi-agent systems aims to identify the steps that contribute to a failed execution. This task remains difficult because a single execution can contain many agent actions and tool calls, failure evidence can appear many steps after the original mistake, and existing methods often rely on costly agent workflows, replay, or training on synthetic failure logs. To address these challenges, we propose MASPrism, a lightweight framework that performs failure attribution using prefill-stage signals from a small language model (SLM). MASPrism first extracts token-level negative log-likelihood and attention weights during a prefill pass to identify symptom-like steps and earlier candidate sources, without decoding. It then reconstructs a focused diagnostic prompt and performs a second prefill pass to rank failure-source candidates. Using Qwen3-0.6B as the SLM, MASPrism achieves the best performance on three of the four evaluated subsets across Who&When and TRAIL, improving Top-1 accuracy on Who&When-HC by 33.41% over the best baseline. On TRAIL, MASPrism outperforms strong proprietary LLMs, including Gemini-2.5-Pro, with up to 89.50% relative improvement. MASPrism processes each trace in 2.66 seconds on average, achieving a 6.69× speedup over the single-pass prompting baseline, with zero output tokens. These results show that MASPrism provides an effective and practical framework for failure attribution in long multi-agent execution logs.
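The second prefill pass ranks failure-source candidates without generating any tokens. One way to realize that (a hedged sketch, not the paper's exact scoring rule: `score_candidate`, the toy logits, and the candidate names are assumptions) is to append each candidate's identifier to the reconstructed diagnostic prompt and rank candidates by the prefill NLL of that appended span:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(1)
vocab = 100

def score_candidate(prompt_ids: torch.Tensor, cand_ids: torch.Tensor) -> float:
    """Mean NLL of the candidate tokens under a (toy) prefill forward pass.

    Lower is better: the model finds this candidate a more likely completion
    of the diagnostic prompt. Random logits stand in for model(ids).logits.
    """
    ids = torch.cat([prompt_ids, cand_ids]).unsqueeze(0)
    logits = torch.randn(1, ids.shape[1], vocab)        # stand-in for the SLM
    shift_logits = logits[:, :-1, :]
    shift_labels = ids[:, 1:]
    nll = F.cross_entropy(
        shift_logits.reshape(-1, vocab), shift_labels.reshape(-1),
        reduction="none",
    )
    # Only the NLL over the candidate's own tokens matters for ranking.
    return nll[-cand_ids.shape[0]:].mean().item()

prompt = torch.randint(0, vocab, (10,))                 # focused diagnostic prompt
candidates = {f"agent_{i}": torch.randint(0, vocab, (3,)) for i in range(3)}
ranking = sorted(candidates, key=lambda c: score_candidate(prompt, candidates[c]))
print("ranked failure-source candidates:", ranking)
```

Since each candidate is scored by a forward pass alone, the whole attribution pipeline emits zero output tokens, consistent with the efficiency figures reported above.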