When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs
Pegah Khayatan, Jayneel Parekh, Arnaud Dapogny, Mustafa Shukor, Alasdair Newson + 1 more
TLDR
This paper shows that LVLM hallucinations are often caused by textual prompts overriding visual evidence, and introduces HalluVL-DPO to mitigate them.
Key contributions
- Proposes HalluScope, a benchmark to analyze hallucination factors in LVLMs.
- Identifies that LVLM hallucinations largely stem from excessive reliance on textual instruction priors.
- Introduces HalluVL-DPO, a preference optimization framework to fine-tune LVLMs for visual grounding.
- HalluVL-DPO effectively mitigates prompt-induced hallucinations while preserving performance on other hallucination and visual-capability benchmarks.
Why it matters
This paper clarifies that LVLM hallucinations often arise when textual prompts override visual evidence. It introduces HalluScope and HalluVL-DPO, practical tools for diagnosing this failure mode and steering models toward more visually grounded responses, which is vital for reliable LVLM deployment.
Original Abstract
Despite impressive progress in capabilities of large vision-language models (LVLMs), these systems remain vulnerable to hallucinations, i.e., outputs that are not grounded in the visual input. Prior work has attributed hallucinations in LVLMs to factors such as limitations of the vision backbone or the dominance of the language component, yet the relative importance of these factors remains unclear. To resolve this ambiguity, we propose HalluScope, a benchmark to better understand the extent to which different factors induce hallucinations. Our analysis indicates that hallucinations largely stem from excessive reliance on textual priors and background knowledge, especially information introduced through textual instructions. To mitigate hallucinations induced by textual instruction priors, we propose HalluVL-DPO, a framework for fine-tuning off-the-shelf LVLMs towards more visually grounded responses. HalluVL-DPO leverages preference optimization using a curated training dataset that we construct, guiding the model to prefer grounded responses over hallucinated ones. We demonstrate that our optimized model effectively mitigates the targeted hallucination failure mode, while preserving or improving performance on other hallucination benchmarks and visual capability evaluations. To support reproducibility and further research, we will publicly release our evaluation benchmark, preference training dataset, and code at https://pegah-kh.github.io/projects/prompts-override-vision/ .
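The abstract describes HalluVL-DPO as preference optimization over pairs of grounded (preferred) and hallucinated (rejected) responses. The paper's exact objective and hyperparameters are not given here; below is a minimal sketch of a standard DPO-style loss applied to such pairs, assuming per-example summed log-probabilities have already been computed under the policy and a frozen reference model (all names and values are illustrative).

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Sketch of a standard DPO loss over preference pairs.

    'Chosen' = visually grounded response, 'rejected' = hallucinated one,
    following the paper's framing; the actual HalluVL-DPO objective may differ.
    Inputs are per-example summed log-probabilities of the full response
    under the trainable policy and the frozen reference LVLM.
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the policy to assign higher reference-adjusted likelihood
    # to grounded responses than to hallucinated ones.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with made-up log-probabilities for a batch of two preference pairs.
loss = dpo_loss(
    policy_chosen_logps=torch.tensor([-12.3, -9.8]),
    policy_rejected_logps=torch.tensor([-11.0, -10.5]),
    ref_chosen_logps=torch.tensor([-12.0, -10.0]),
    ref_rejected_logps=torch.tensor([-11.2, -10.1]),
)
print(loss.item())
```

In this setup the only data-specific choice is how the preference pairs are curated; the loss itself is the generic DPO formulation, which the paper adapts to target prompt-induced hallucinations.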