Mike Zheng Shou
4 papers · Latest:
Computer Vision
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
AnyFlow introduces an any-step video diffusion model using flow map distillation, outperforming consistency-based methods and scaling with sampling steps.
2605.13724
Robotics
World Action Models: The Next Frontier in Embodied AI
This survey introduces World Action Models (WAMs), a new embodied AI paradigm unifying predictive state modeling with action generation, providing a systematic overview.
2605.12090
Computer Vision
Sparkle: Realizing Lively Instruction-Guided Video Background Replacement via Decoupled Guidance
Sparkle introduces a new dataset and benchmark for high-quality video background replacement, significantly improving model performance.
2605.06535
Artificial Intelligence
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
This paper introduces a "levels x laws" taxonomy for agentic world models, synthesizing over 400 works and outlining a roadmap for future development.
2604.22748