Yifan Du
2 papers · Latest:
Computer Vision
Improving Vision-language Models with Perception-centric Process Reward Models
Perceval is a new process reward model that improves vision-language models by providing token-level supervision to identify and correct perceptual errors.
2604.24583
Computer Vision
Towards Long-horizon Agentic Multimodal Search
LMM-Searcher enables long-horizon multimodal search by offloading visual data to files and using UIDs, achieving SOTA performance.
2604.12890