Lingdong Kong
5 papers · Latest:
OmniLiDAR: A Unified Diffusion Framework for Multi-Domain 3D LiDAR Generation
OmniLiDAR is a unified diffusion framework that generates 3D LiDAR scans across eight diverse domains using text conditioning, addressing single-domain limitations.
Masked Generative Transformer Is What You Need for Image Editing
EditMGT, a novel Masked Generative Transformer, offers faster, more precise image editing by localizing changes, outperforming diffusion models.
Is Your Driving World Model an All-Around Player?
WorldLens is a new benchmark, dataset, and agent for evaluating driving world models beyond visual realism, focusing on physical and behavioral fidelity.
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
This paper introduces a "levels x laws" taxonomy for agentic world models, synthesizing over 400 works and outlining a roadmap for future development.
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation
OneVL introduces a unified VLA and World Model framework, achieving state-of-the-art latent Chain-of-Thought reasoning at real-time speed.