Jie Wu
4 papers · Latest:
Computer Vision
AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in UMMs via Decompositional Verifiable Reward
AlphaGRPO enhances multimodal generation in UMMs using GRPO and a novel Decompositional Verifiable Reward for self-reflection and reasoning.
2605.12495
Computer Vision
LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories
LeapAlign fine-tunes flow matching models for preference alignment by using two-step trajectories, enabling efficient, stable direct gradient updates at any generation step.
2604.15311
Computer Vision
Seedance 2.0: Advancing Video Generation for World Complexity
Seedance 2.0 is a new multi-modal audio-video generation model with a unified architecture, offering advanced capabilities and improved performance.
2604.14148
Computer Vision
Towards Long-horizon Agentic Multimodal Search
LMM-Searcher enables long-horizon multimodal search by offloading visual data to files and using UIDs, achieving SOTA performance.
2604.12890