ArXiv TLDR

Representation Fréchet Loss for Visual Generation

2604.28190

Jiawei Yang, Zhengyang Geng, Xuan Ju, Yonglong Tian, Yue Wang

cs.CV

TLDR

This paper introduces FD-loss, a method to optimize Fréchet Distance in representation space, significantly improving visual generation quality and efficiency.

Key contributions

  • Introduces FD-loss, decoupling FD estimation population size from gradient batch size for effective optimization.
  • Post-training with FD-loss consistently improves visual quality, achieving 0.72 FID on ImageNet 256x256.
  • Repurposes multi-step generators into strong one-step models without distillation or adversarial training.
  • Proposes FDr^k, a multi-representation metric, addressing FID's limitations in ranking visual quality.

Why it matters

This work makes Fréchet Distance practical as a training objective, offering a simpler and more effective way to improve generative models. It not only boosts visual quality and efficiency but also challenges the sole reliance on FID, proposing a more robust evaluation metric. This could lead to new advancements in generative model training and evaluation.

Original Abstract

We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population size for FD estimation (e.g., 50k) from the batch size for gradient computation (e.g., 1024). We term this approach FD-loss. Optimizing FD-loss reveals several surprising findings. First, post-training a base generator with FD-loss in different representation spaces consistently improves visual quality. Under the Inception feature space, a one-step generator achieves 0.72 FID on ImageNet 256x256. Second, the same FD-loss repurposes multi-step generators into strong one-step generators without teacher distillation, adversarial training, or per-sample targets. Third, FID can misrank visual quality: modern representations can yield better samples despite worse Inception FID. This motivates FDr$^k$, a multi-representation metric. We hope this work will encourage further exploration of distributional distances in diverse representation spaces as both training objectives and evaluation metrics for generative models.
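To make the core idea concrete, here is a minimal sketch of the Fréchet Distance between two Gaussians fitted to feature populations, plus the population/batch decoupling described in the abstract. This uses synthetic features and NumPy; the sizes (50k population, 1024 batch) follow the abstract's examples, but everything else (function names, dimensions) is illustrative and not the paper's actual implementation, which would run on deep-network features with autograd.

```python
import numpy as np

def frechet_distance(mu1, cov1, mu2, cov2):
    # FD between N(mu1, cov1) and N(mu2, cov2):
    #   ||mu1 - mu2||^2 + tr(cov1 + cov2 - 2 (cov1 cov2)^{1/2})
    diff = mu1 - mu2
    # tr((cov1 cov2)^{1/2}) equals the sum of square roots of the
    # eigenvalues of cov1 @ cov2 (real, clipped for numerical safety).
    eigvals = np.linalg.eigvals(cov1 @ cov2)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(cov1) + np.trace(cov2) - 2.0 * trace_sqrt)

rng = np.random.default_rng(0)
dim = 8
ref_feats = rng.normal(size=(50_000, dim))           # reference population features
gen_feats = rng.normal(loc=0.5, size=(50_000, dim))  # generated population (e.g., 50k)
batch = gen_feats[:1024]                             # gradient batch (e.g., 1024)

# Decoupling sketch: the mean/covariance estimates use the full 50k
# population, but during training only the current `batch` rows would
# carry gradients; the remaining rows are treated as fixed (detached),
# so the large-population FD estimate stays cheap to backpropagate.
mu_ref, cov_ref = ref_feats.mean(0), np.cov(ref_feats, rowvar=False)
mu_gen, cov_gen = gen_feats.mean(0), np.cov(gen_feats, rowvar=False)

fd = frechet_distance(mu_gen, cov_gen, mu_ref, cov_ref)
```

In an autograd framework, the detached-population trick is what lets the batch size stay small while the FD estimate remains low-variance.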
