ArXiv TLDR

Ziwei Liu

6 papers · Latest:

Artificial Intelligence

Senses Wide Shut: A Representation-Action Gap in Omnimodal LLMs

Omnimodal LLMs struggle to reject false textual claims contradicting sensory input, revealing a "Representation-Action Gap" in grounding.

2605.13737
Computer Vision

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

SenseNova-U1 introduces a unified architecture (NEO-unify) that seamlessly integrates multimodal understanding and generation, outperforming specialized VLMs.

2605.12500
Computer Vision

Is Your Driving World Model an All-Around Player?

WorldLens is a new benchmark, dataset, and agent for evaluating driving world models beyond visual realism, focusing on physical and behavioral fidelity.

2605.10858
Computer Vision

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

This paper proposes a new five-level taxonomy for visual generation, shifting from appearance synthesis to intelligent, agentic world modeling.

2604.28185
Artificial Intelligence

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

This paper introduces a "levels x laws" taxonomy for agentic world models, synthesizing over 400 works and outlining a roadmap for future development.

2604.22748
Robotics

XRZero-G0: Pushing the Frontier of Dexterous Robotic Manipulation with Interfaces, Quality and Ratios

XRZero-G0 is a hardware-software system that enables scalable, high-quality robot-free data collection for dexterous manipulation, reducing costs significantly.

2604.13001
