Shuang Chen
4 papers · Latest:
Natural Language Processing
Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents
ODE enhances multimodal deep search agents via an image bank for reusable visual evidence and on-policy data evolution, improving performance significantly.
2605.10832
Computer Vision
Flow-OPD: On-Policy Distillation for Flow Matching Models
Flow-OPD introduces an on-policy distillation framework for Flow Matching text-to-image models, resolving multi-task alignment issues.
2605.08063
Computer Vision
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents
OpenSearch-VL provides an open-source recipe for training frontier multimodal deep search agents, achieving state-of-the-art performance.
2605.05185
Computer Vision
Diffusion Model as a Generalist Segmentation Learner
DiGSeg repurposes diffusion models for versatile, text-conditioned segmentation across diverse domains without custom architectures.
2604.24575