Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps
Jonas Waldendorf, Bashar Awwad Shiekh Hasan, Evgenii Tsymbalov
TLDR
This paper introduces novel attention-derived metrics and lightweight classifiers to efficiently detect hallucinations in SpeechLLMs at inference time.
Key contributions
- Developed four novel attention-derived metrics for efficient SpeechLLM hallucination detection.
- Trained lightweight logistic regression classifiers on these features for inference-time use.
- Outperformed uncertainty-based and prior attention-based baselines by up to +0.23 PR-AUC.
- Achieved strong performance and improved out-of-domain generalization using only ~100 attention heads.
Why it matters
Hallucinations in SpeechLLMs pose significant risks, and current detection methods often rely on gold-standard references that are costly or impractical to obtain at inference time. This paper offers an efficient, inference-time alternative based on attention maps, which matters for real-world deployment. It improves on existing uncertainty-based and attention-based baselines and generalizes to out-of-domain ASR settings, making SpeechLLMs more reliable.
Original Abstract
Hallucinations in Speech Large Language Models (SpeechLLMs) pose significant risks, yet existing detection methods typically rely on gold-standard outputs that are costly or impractical to obtain. Moreover, hallucination detection methods developed for text-based LLMs do not directly capture audio-specific signals. We investigate four attention-derived metrics: AUDIORATIO, AUDIOCONSISTENCY, AUDIOENTROPY, and TEXTENTROPY, designed to capture pathological attention patterns associated with hallucination, and train lightweight logistic regression classifiers on these features for efficient inference-time detection. Across automatic speech recognition and speech-to-text translation tasks, evaluations on Qwen-2-Audio and Voxtral-3B show that our approach outperforms uncertainty-based and prior attention-based baselines on in-domain data, achieving improvements of up to +0.23 PR-AUC, and generalises to out-of-domain ASR settings. We further find that strong performance can be achieved with approximately 100 attention heads, improving out-of-domain generalisation compared to using all heads. While effectiveness is model-dependent and task-specific training is required, our results demonstrate that attention patterns provide a valuable tool for hallucination detection in SpeechLLMs.
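To make the approach concrete, here is a minimal sketch of how attention-derived features like those named in the abstract could be computed from a decoder attention map. The exact definitions of AUDIORATIO, AUDIOENTROPY, and TEXTENTROPY are not given in this digest, so the formulas below (attention mass on audio tokens, and entropy of the attention distribution over audio vs. text positions, averaged per head) are illustrative assumptions, and AUDIOCONSISTENCY is omitted; the tensor shapes and the `attention_metrics` helper are hypothetical, not the paper's code.

```python
import numpy as np

def attention_metrics(attn, audio_mask):
    """Compute per-head attention features for hallucination detection.

    attn:       (num_heads, num_query_tokens, num_key_tokens) attention
                weights, each row summing to 1 (post-softmax).
    audio_mask: bool array of shape (num_key_tokens,), True where the key
                position corresponds to an audio token.

    Returns a flat feature vector: per-head audio ratio, audio entropy,
    and text entropy, concatenated.
    """
    eps = 1e-12

    # Audio ratio: fraction of attention mass placed on audio positions,
    # averaged over query tokens -> one value per head.
    audio_ratio = attn[:, :, audio_mask].sum(-1).mean(-1)

    # Entropy of the renormalized attention distribution over audio positions.
    p_audio = attn[:, :, audio_mask]
    p_audio = p_audio / (p_audio.sum(-1, keepdims=True) + eps)
    audio_entropy = (-p_audio * np.log(p_audio + eps)).sum(-1).mean(-1)

    # Entropy of the renormalized attention distribution over text positions.
    p_text = attn[:, :, ~audio_mask]
    p_text = p_text / (p_text.sum(-1, keepdims=True) + eps)
    text_entropy = (-p_text * np.log(p_text + eps)).sum(-1).mean(-1)

    return np.concatenate([audio_ratio, audio_entropy, text_entropy])
```

In the paper's setup, feature vectors like this (over a subset of roughly 100 attention heads) would be stacked across examples and used to fit a lightweight logistic regression classifier, e.g. with scikit-learn, that flags likely hallucinations at inference time.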