FashionStylist: An Expert Knowledge-enhanced Multimodal Dataset for Fashion Understanding

April 10, 2026 | arXiv:2604.09249

Kaidong Feng, Zhuoxuan Huang, Huizhong Guo, Yuting Jin, Xinyu Chen + 5 more

cs.CV, cs.IR

TLDR

FashionStylist is a new expert-annotated multimodal dataset for holistic fashion understanding. It supports three tasks: outfit-to-item grounding, outfit completion, and outfit evaluation.

Key contributions

Introduces FashionStylist, an expert-annotated multimodal dataset for holistic fashion understanding.
Supports three key tasks: outfit-to-item grounding, outfit completion, and expert-level outfit evaluation.
Provides professionally grounded annotations at both item and outfit levels.

Why it matters

Existing fashion datasets are fragmented and task-specific, and they lack the expert-level reasoning needed for holistic outfit understanding. FashionStylist fills this gap with a unified, expert-annotated benchmark. The authors' experiments indicate that it also works as a training resource, improving grounding, completion, and outfit-level semantic evaluation in MLLM-based fashion systems.

Original Abstract

Fashion understanding requires both visual perception and expert-level reasoning about style, occasion, compatibility, and outfit rationale. However, existing fashion datasets remain fragmented and task-specific, often focusing on item attributes, outfit co-occurrence, or weak textual supervision, and thus provide limited support for holistic outfit understanding. In this paper, we introduce FashionStylist, an expert-annotated benchmark for holistic and expert-level fashion understanding. Constructed through a dedicated fashion-expert annotation pipeline, FashionStylist provides professionally grounded annotations at both the item and outfit levels. It supports three representative tasks: outfit-to-item grounding, outfit completion, and outfit evaluation. These tasks cover realistic item recovery from complex outfits with layering and accessories, compatibility-aware composition beyond co-occurrence matching, and expert-level assessment of style, season, occasion, and overall coherence. Experimental results show that FashionStylist serves not only as a unified benchmark for multiple fashion tasks, but also as an effective training resource for improving grounding, completion, and outfit-level semantic evaluation in MLLM-based fashion systems.
