Hengshuang Zhao
3 papers · Latest:
Computer Vision
AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in UMMs via Decompositional Verifiable Reward
AlphaGRPO enhances multimodal generation in UMMs using GRPO and a novel Decompositional Verifiable Reward for self-reflection and reasoning.
2605.12495
Natural Language Processing
Continuous Latent Diffusion Language Model
Cola DLM is a hierarchical latent diffusion language model that generates text by modeling global semantics in a continuous latent space, offering a flexible non-autoregressive approach.
2605.06548
Computer Vision
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation
HERMES++ unifies 3D scene understanding and future geometry prediction in a driving world model, outperforming specialist methods.
2604.28196