CityRAG: Stepping Into a City via Spatially-Grounded Video Generation

April 21, 2026

arXiv:2604.19741

Gene Chou, Charles Herrmann, Kyle Genova, Boyang Deng, Songyou Peng + 4 more

cs.CV

TLDR

CityRAG generates long, 3D-consistent, navigable videos of real cities by grounding generation in geo-registered data, a capability aimed at autonomous driving and robotics simulation.

Key contributions

Generates 3D-consistent, navigable video simulations of real-world cities.
Leverages geo-registered data and disentangles scene from transient attributes.
Produces minutes-long, physically grounded videos that maintain weather and lighting conditions and achieve loop closure.

Why it matters

Existing video generative models can produce plausible sequences from a text or image prompt, but they are not grounded in a specific real location. CityRAG targets this gap by generating physically grounded simulations of real places under varying weather and dynamic object configurations, which the authors describe as essential for autonomous driving and robotics simulation.

Original Abstract

We address the problem of generating a 3D-consistent, navigable environment that is spatially grounded: a simulation of a real location. Existing video generative models can produce a plausible sequence that is consistent with a text (T2V) or image (I2V) prompt. However, the capability to reconstruct the real world under arbitrary weather conditions and dynamic object configurations is essential for downstream applications including autonomous driving and robotics simulation. To this end, we present CityRAG, a video generative model that leverages large corpora of geo-registered data as context to ground generation to the physical scene, while maintaining learned priors for complex motion and appearance changes. CityRAG relies on temporally unaligned training data, which teaches the model to semantically disentangle the underlying scene from its transient attributes. Our experiments demonstrate that CityRAG can generate coherent minutes-long, physically grounded video sequences, maintain weather and lighting conditions over thousands of frames, achieve loop closure, and navigate complex trajectories to reconstruct real-world geography.

