Yihang Lou
2 papers · Latest:
Computer Vision
MMVIAD: Multi-view Multi-task Video Understanding for Industrial Anomaly Detection
Introduces MMVIAD, the first continuous multi-view video dataset for industrial anomaly detection, and VISTA, a model outperforming GPT-5.4.
2605.10833
Computer Vision
Unveiling Fine-Grained Visual Traces: Evaluating Multimodal Interleaved Reasoning Chains in Multimodal STEM Tasks
Introduces StepSTEM, a new benchmark and evaluation framework for fine-grained, cross-modal STEM reasoning in MLLMs, revealing that current models struggle.
2604.19697