ArXiv TLDR

ETCH-X: Robustify Expressive Body Fitting to Clothed Humans with Composable Datasets

arXiv:2604.08548

Xiaoben Li, Jingyi Wu, Zeyu Cai, Yu Siyuan, Boqian Li + 1 more

cs.CV

TLDR

ETCH-X robustly fits the expressive SMPL-X body model to 3D point clouds of clothed humans by filtering out clothing dynamics and using implicit dense correspondences.

Key contributions

  • Leverages a tightness-aware "undress" paradigm to filter out clothing dynamics from point clouds.
  • Extends expressiveness with SMPL-X and uses implicit dense correspondences for robust, fine-grained fitting.
  • Modular "undress" and "dense fit" stages enable scalable training on diverse, composable datasets (sketched after this list).
  • Achieves substantial performance improvements over ETCH on both seen and unseen data, including hands.
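Since the digest only names the two stages, the following Python sketch spells out one way they could compose. UndressNet, DenseFitNet, and etchx_fit are hypothetical names, and the stubbed outputs stand in for ETCH-X's trained networks; this is a reading of the pipeline description, not the authors' code.

```python
# Minimal sketch of the two-stage ETCH-X pipeline, assuming a simple
# point-cloud interface. Both network classes are hypothetical stand-ins,
# stubbed with trivial outputs so the sketch runs end to end.
import numpy as np

class UndressNet:
    """Stage 1 ("undress"): predict, per scan point, a tightness vector
    that displaces the clothed surface onto the underlying body surface."""
    def predict_tightness(self, points: np.ndarray) -> np.ndarray:
        return np.zeros_like(points)  # stub: a real model predicts offsets

class DenseFitNet:
    """Stage 2 ("dense fit"): predict, per undressed point, the index of
    its corresponding SMPL-X body vertex (an implicit dense marker)."""
    def predict_correspondences(self, points: np.ndarray) -> np.ndarray:
        return np.zeros(len(points), dtype=np.int64)  # stub

def etchx_fit(scan_points: np.ndarray):
    undress, dense_fit = UndressNet(), DenseFitNet()
    # 1) Filter clothing dynamics: displace each point by its predicted
    #    tightness vector to approximate the naked body surface.
    inner = scan_points + undress.predict_tightness(scan_points)
    # 2) Map each inner point to a dense SMPL-X correspondence; body
    #    parameters are then solved from these matches (see the
    #    optimization sketch after the abstract below).
    corr = dense_fit.predict_correspondences(inner)
    return inner, corr

if __name__ == "__main__":
    inner, corr = etchx_fit(np.random.rand(1024, 3).astype(np.float32))
    print(inner.shape, corr.shape)  # (1024, 3) (1024,)
```

Keeping the stages disentangled like this is what lets each be trained separately on composable data sources: simulated garments (CLOTH3D) for the undress stage, and large-scale body motions (AMASS) plus hand gestures (InterHand2.6M) for pose and hand robustness.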

Why it matters

ETCH-X offers a robust and expressive solution for 3D human body fitting from clothed point clouds. It overcomes prior limitations by handling clothing dynamics, pose variations, and partial inputs within a single method. Accurate body fitting of this kind is a prerequisite for downstream tasks like animation and texturing.

Original Abstract

Human body fitting, which aligns parametric body models such as SMPL to raw 3D point clouds of clothed humans, serves as a crucial first step for downstream tasks like animation and texturing. An effective fitting method should be both locally expressive, capturing fine details such as hands and facial features, and globally robust to handle real-world challenges, including clothing dynamics, pose variations, and noisy or partial inputs. Existing approaches typically excel in only one aspect, lacking an all-in-one solution. We upgrade ETCH to ETCH-X, which leverages a tightness-aware fitting paradigm to filter out clothing dynamics ("undress"), extends expressiveness with SMPL-X, and replaces explicit sparse markers (which are highly sensitive to partial data) with implicit dense correspondences ("dense fit") for more robust and fine-grained body fitting. Our disentangled "undress" and "dense fit" modular stages enable separate and scalable training on composable data sources, including diverse simulated garments (CLOTH3D), large-scale full-body motions (AMASS), and fine-grained hand gestures (InterHand2.6M), improving outfit generalization and pose robustness of both bodies and hands. Our approach achieves robust and expressive fitting across diverse clothing, poses, and levels of input completeness, delivering a substantial performance improvement over ETCH on both: 1) seen data, such as 4D-Dress (MPJPE-All, 33.0%) and CAPE (V2V-Hands, 35.8%), and 2) unseen data, such as BEDLAM2.0 (MPJPE-All, 80.8%; V2V-All, 80.5%). Code and models will be released at https://xiaobenli00.github.io/ETCH-X/.
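The abstract's "dense fit" stage outputs per-point correspondences to the SMPL-X surface rather than sparse markers. One plausible way to turn such correspondences into body parameters is a standard gradient-based optimization; the sketch below assumes the public smplx package, a local path to the official SMPL-X model files, and a hypothetical per-point vertex-index correspondence format, none of which are specified by the digest.

```python
# Hypothetical sketch: recovering SMPL-X parameters from predicted dense
# correspondences. Uses the public `smplx` package; the correspondence
# format (one SMPL-X vertex index per undressed scan point) is an
# assumption, not the authors' interface.
import torch
import smplx

# Path to the official SMPL-X model files is an assumption.
body_model = smplx.create("models/", model_type="smplx", gender="neutral",
                          use_pca=False)

def fit_smplx(inner_points, corr_vertex_ids, steps=200, lr=0.02):
    """Optimize pose/shape so each "undressed" point lands on the SMPL-X
    vertex it was matched to; one reading of the "dense fit" idea, not
    the paper's implementation."""
    betas = torch.zeros(1, 10, requires_grad=True)          # shape
    body_pose = torch.zeros(1, 63, requires_grad=True)      # 21 body joints
    global_orient = torch.zeros(1, 3, requires_grad=True)   # root rotation
    transl = torch.zeros(1, 3, requires_grad=True)          # translation
    opt = torch.optim.Adam([betas, body_pose, global_orient, transl], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        out = body_model(betas=betas, body_pose=body_pose,
                         global_orient=global_orient, transl=transl)
        matched = out.vertices[0, corr_vertex_ids]          # (N, 3)
        loss = ((matched - inner_points) ** 2).sum(dim=-1).mean()
        loss.backward()
        opt.step()
    return betas, body_pose, global_orient, transl
```

Because the correspondences are dense rather than a handful of markers, an objective like this stays constrained even when large regions of the scan are missing, which is consistent with the abstract's claim of robustness to partial inputs.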
