ViBR: Automated Bug Replay from Video-based Reports using Vision-Language Models
Sidong Feng, Dingbang Wang, Nikola Tomic, Tingting Yu, Aldeida Aleti + 1 more
TLDR
ViBR automatically reproduces software bugs from GUI video reports using vision-language models, outperforming existing methods.
Key contributions
- ViBR automates bug reproduction directly from GUI screen capture videos.
- Uses CLIP-based embedding similarity for precise action boundary segmentation.
- Leverages Vision-Language Models (VLMs) for region-aware GUI state comparison and guided replay.
- Successfully reproduces 72% of bug recordings, outperforming state-of-the-art baselines.
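The summary does not say how ViBR turns CLIP similarity into action boundaries, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes per-frame embeddings have already been computed (for example by a CLIP image encoder) and marks a boundary wherever the cosine similarity between consecutive frames drops below a threshold. The function names and the `0.9` threshold are hypothetical choices for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_action_boundaries(
    frame_embeddings: Sequence[Sequence[float]],
    threshold: float = 0.9,  # hypothetical value, not taken from the paper
) -> List[int]:
    """Return each frame index i whose embedding differs notably from frame i-1."""
    boundaries = []
    for i in range(1, len(frame_embeddings)):
        similarity = cosine_similarity(frame_embeddings[i - 1], frame_embeddings[i])
        if similarity < threshold:
            boundaries.append(i)
    return boundaries


def split_into_segments(num_frames: int, boundaries: Sequence[int]) -> List[Tuple[int, int]]:
    """Turn boundary indices into half-open (start, end) frame ranges."""
    segments = []
    start = 0
    for boundary in boundaries:
        segments.append((start, boundary))
        start = boundary
    if start < num_frames:
        segments.append((start, num_frames))
    return segments


# Toy 2-D vectors standing in for real frame embeddings.
toy_embeddings = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.0, 1.0]]
toy_boundaries = find_action_boundaries(toy_embeddings)
toy_segments = split_into_segments(len(toy_embeddings), toy_boundaries)
```

With the toy vectors, the only sharp change is between the second and third frames, so the recording splits into two segments. A real pipeline would likely need smoothing or a minimum segment length to cope with animations and transitions; those details are not covered in this summary.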
Why it matters
Video-based bug reports are popular, but reproducing bugs from them is hard to automate. ViBR offers a lightweight, fully automated approach built on modern VLMs, with no need for touch indicators or pre-constructed UI transition graphs. This could make software maintenance more efficient by streamlining bug reproduction.
Original Abstract
Bug reports play a critical role in software maintenance by helping users convey encountered issues to developers. Recently, GUI screen capture videos have gained popularity as a bug reporting artifact due to their ease of use and ability to retain rich contextual information. However, automatically reproducing bugs from such recordings remains a significant challenge. Existing methods often rely on fragile image-processing heuristics, explicit touch indicators, or pre-constructed UI transition graphs, which require non-trivial instrumentation and app-specific setup. This paper presents ViBR, a lightweight and fully automated approach that reproduces bugs directly from GUI recordings. Specifically, ViBR combines CLIP-based embedding similarity for action boundary segmentation with Vision-Language Models (VLMs) for region-aware GUI state comparison and guided bug replay. Experimental results show that ViBR successfully reproduces 72% of bug recordings, significantly outperforming state-of-the-art baselines and ablation variants.