GeoRelight: Learning Joint Geometrical Relighting and Reconstruction with Flexible Multi-Modal Diffusion Transformers

April 22, 2026 | arXiv:2604.20715

Yuxuan Xue, Ruofan Liang, Egor Zakharov, Timur Bagautdinov, Chen Cao + 4 more

cs.CV

TLDR

GeoRelight is a Multi-Modal Diffusion Transformer that jointly reconstructs 3D geometry and relights a person from a single photo, aiming for more physically consistent results than sequential or geometry-agnostic methods.

Key contributions

Introduces GeoRelight, a Multi-Modal Diffusion Transformer that jointly solves 3D geometry estimation and relighting.
Proposes isotropic NDC-Orthographic Depth (iNOD), a distortion-free 3D representation compatible with latent diffusion models.
Employs a strategic mixed-data training method that combines synthetic and auto-labeled real data.
Reports better relighting performance than both sequential models and previous systems that ignored geometry.

Why it matters

Relighting a person from a single photo is ill-posed because a 2D image ambiguously entangles 3D geometry, intrinsic appearance, and illumination. This paper offers a unified approach that estimates 3D geometry and performs relighting jointly, avoiding the error accumulation of sequential pipelines and the limited physical consistency of methods that do not explicitly use geometry. The authors report that this yields better relighting than either class of prior method.

Original Abstract

Relighting a person from a single photo is an attractive but ill-posed task, as a 2D image ambiguously entangles 3D geometry, intrinsic appearance, and illumination. Current methods either use sequential pipelines that suffer from error accumulation, or they do not explicitly leverage 3D geometry during relighting, which limits physical consistency. Since relighting and estimation of 3D geometry are mutually beneficial tasks, we propose a unified Multi-Modal Diffusion Transformer (DiT) that jointly solves for both: GeoRelight. We make this possible through two key technical contributions: isotropic NDC-Orthographic Depth (iNOD), a distortion-free 3D representation compatible with latent diffusion models; and a strategic mixed-data training method that combines synthetic and auto-labeled real data. By solving geometry and relighting jointly, GeoRelight achieves better performance than both sequential models and previous systems that ignored geometry.
