arXiv TLDR

Map2World: Segment Map Conditioned Text to 3D World Generation

arXiv:2605.00781

Jaeyoung Chung, Suyoung Lee, Jianfeng Xiang, Jiaolong Yang, Kyoung Mu Lee

cs.CV

TLDR

Map2World generates consistent, detailed 3D worlds from user-defined segment maps, outperforming existing methods in control and scale.

Key contributions

  • Generates 3D worlds conditioned on user-defined segment maps of arbitrary shapes and scales.
  • Ensures global-scale consistency and flexibility across expansive environments.
  • Proposes a detail enhancer network for fine-grained details while maintaining scene coherence.
  • Leverages asset generator priors for robust generalization with limited training data.
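The conditioning input described above can be pictured as a simple labeled grid. Below is a minimal, purely illustrative sketch of a user-defined segment map; the class names, label IDs, and helper functions are assumptions for illustration, not the paper's API or data format:

```python
import numpy as np

# Purely illustrative: a "segment map" as a 2D grid of class labels.
# Label IDs and class names are assumptions, not from the paper.
CLASSES = {0: "ground", 1: "building", 2: "road"}

def make_segment_map(height, width):
    """Build a toy user-defined segment map of arbitrary size."""
    seg = np.zeros((height, width), dtype=np.int64)  # all "ground"
    seg[2:6, 3:9] = 1   # a rectangular "building" footprint
    seg[7, :] = 2       # a "road" spanning the full width
    return seg

def class_coverage(seg):
    """Fraction of the map covered by each class label."""
    labels, counts = np.unique(seg, return_counts=True)
    return {CLASSES[int(l)]: c / seg.size for l, c in zip(labels, counts)}

seg = make_segment_map(10, 12)
print(class_coverage(seg))  # e.g. {'ground': 0.7, 'building': 0.2, 'road': 0.1}
```

In this picture, "arbitrary shapes and scales" simply means the grid dimensions and region footprints are unconstrained by a fixed layout, which is the flexibility the paper claims over grid-layout-based methods.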

Why it matters

Map2World advances 3D world generation by enabling user-controlled, globally consistent, and detailed environments, addressing key limitations in applications such as immersive content creation and autonomous driving simulation. Because the pipeline leverages strong priors from asset generators, it generalizes robustly even with limited training data.

Original Abstract

3D world generation is essential for applications such as immersive content creation or autonomous driving simulation. Recent advances in 3D world generation have shown promising results; however, these methods are constrained by grid layouts and suffer from inconsistencies in object scale throughout the entire world. In this work, we introduce a novel framework, Map2World, that first enables 3D world generation conditioned on user-defined segment maps of arbitrary shapes and scales, ensuring global-scale consistency and flexibility across expansive environments. To further enhance the quality, we propose a detail enhancer network that generates fine details of the world. The detail enhancer enables the addition of fine-grained details without compromising overall scene coherence by incorporating global structure information. We design the entire pipeline to leverage strong priors from asset generators, achieving robust generalization across diverse domains, even under limited training data for scene generation. Extensive experiments demonstrate that our method significantly outperforms existing approaches in user-controllability, scale consistency, and content coherence, enabling users to generate 3D worlds under more complex conditions.
