ArXiv TLDR

Representation Fréchet Loss for Visual Generation

2604.28190

Jiawei Yang, Zhengyang Geng, Xuan Ju, Yonglong Tian, Yue Wang

cs.CV

TLDR

This paper introduces FD-loss, a method to optimize Fréchet Distance in representation space, significantly improving visual generation quality and efficiency.

Key contributions

  • Introduces FD-loss, decoupling FD estimation population size from gradient batch size for effective optimization.
  • Post-training with FD-loss consistently improves visual quality, achieving 0.72 FID on ImageNet 256x256.
  • Repurposes multi-step generators into strong one-step models without distillation or adversarial training.
  • Proposes FDr^k, a multi-representation metric, addressing FID's limitations in ranking visual quality.

Why it matters

This work makes Fréchet Distance practical as a training objective, offering a simpler and more effective way to improve generative models. It not only boosts visual quality and efficiency but also challenges the sole reliance on FID, proposing a more robust evaluation metric. This could lead to new advancements in generative model training and evaluation.

Original Abstract

We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population size for FD estimation (e.g., 50k) from the batch size for gradient computation (e.g., 1024). We term this approach FD-loss. Optimizing FD-loss reveals several surprising findings. First, post-training a base generator with FD-loss in different representation spaces consistently improves visual quality. Under the Inception feature space, a one-step generator achieves 0.72 FID on ImageNet 256x256. Second, the same FD-loss repurposes multi-step generators into strong one-step generators without teacher distillation, adversarial training, or per-sample targets. Third, FID can misrank visual quality: modern representations can yield better samples despite worse Inception FID. This motivates FDr$^k$, a multi-representation metric. We hope this work will encourage further exploration of distributional distances in diverse representation spaces as both training objectives and evaluation metrics for generative models.
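To make the core idea concrete, here is a minimal sketch of the Fréchet Distance between two Gaussians fitted to feature populations, plus the population/batch decoupling described in the abstract. This uses synthetic features and NumPy; the sizes (50k population, 1024 batch) follow the abstract's examples, but everything else (function names, dimensions) is illustrative and not the paper's actual implementation, which would run on deep-network features with autograd.

```python
import numpy as np

def frechet_distance(mu1, cov1, mu2, cov2):
    # FD between N(mu1, cov1) and N(mu2, cov2):
    #   ||mu1 - mu2||^2 + tr(cov1 + cov2 - 2 (cov1 cov2)^{1/2})
    diff = mu1 - mu2
    # tr((cov1 cov2)^{1/2}) equals the sum of square roots of the
    # eigenvalues of cov1 @ cov2 (real, clipped for numerical safety).
    eigvals = np.linalg.eigvals(cov1 @ cov2)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(cov1) + np.trace(cov2) - 2.0 * trace_sqrt)

rng = np.random.default_rng(0)
dim = 8
ref_feats = rng.normal(size=(50_000, dim))           # reference population features
gen_feats = rng.normal(loc=0.5, size=(50_000, dim))  # generated population (e.g., 50k)
batch = gen_feats[:1024]                             # gradient batch (e.g., 1024)

# Decoupling sketch: the mean/covariance estimates use the full 50k
# population, but during training only the current `batch` rows would
# carry gradients; the remaining rows are treated as fixed (detached),
# so the large-population FD estimate stays cheap to backpropagate.
mu_ref, cov_ref = ref_feats.mean(0), np.cov(ref_feats, rowvar=False)
mu_gen, cov_gen = gen_feats.mean(0), np.cov(gen_feats, rowvar=False)

fd = frechet_distance(mu_gen, cov_gen, mu_ref, cov_ref)
```

In an autograd framework, the detached-population trick is what lets the batch size stay small while the FD estimate remains low-variance.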
