Xin Zhou

5 papers · Latest: April 30, 2026

HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation

HERMES++ unifies 3D scene understanding and future geometry prediction in a driving world model, outperforming specialist methods.

2604.28196 · Apr 30, 2026

Computer Vision

Inter-Stance: A Dyadic Multimodal Corpus for Conversational Stance Analysis

Inter-Stance introduces a 20TB multimodal dyadic corpus for conversational stance analysis, enabling novel modeling of interpersonal behavior.

2604.22739 · Apr 24, 2026

Quantum limits on squeezing

This paper derives quantum limits on steady-state squeezing in bosonic networks, showing new bounds for dissipative and parametrically driven systems.

2604.22500 · Apr 24, 2026

Robotics

OVPD: A Virtual-Physical Fusion Testing Dataset of OnSite Autonomous Driving Challenge

OVPD is a new virtual-physical fusion dataset for autonomous driving, offering high-fidelity, replayable, and diagnosable testing data.

2604.20423 · Apr 22, 2026

Computer Vision

When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models

NUMINA improves numerical alignment in text-to-video diffusion models by guiding regeneration, boosting counting accuracy and CLIP alignment.

2604.08546 · Apr 9, 2026
