ArXiv TLDR

Direct Product Flow Matching: Decoupling Radial and Angular Dynamics for Few-Shot Adaptation

arXiv:2605.05054

Hongxu Chen, Yanghao Wang, Bowei Zhu, Hongxiang Li, Zhen Wang + 4 more

cs.CV, cs.AI, cs.LG

TLDR

Introduces Direct Product Flow Matching (DP-FM), which decouples radial and angular feature dynamics and reports a new state of the art for multi-step few-shot adaptation of vision-language models across 11 benchmarks.

Key contributions

  • Analyzes existing flow matching methods through a polar (radial and angular) decomposition, identifying three limitations: distorted angular dynamics, neglected radial dynamics, and a context-agnostic unconditional flow.
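  • Proposes warped product flow matching (WP-FM), a unified Riemannian framework, and derives Direct Product Flow Matching (DP-FM) from it via a constant-warping metric, which yields a decoupled cylindrical manifold.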
  • DP-FM enables independent radial evolution and constant-speed angular geodesic transport, eliminating angular dynamics distortion while preserving radial consistency.
  • Incorporates classifier-free guidance, conditioning the flow on the pre-trained VLMs' hidden states to inject missing dataset-specific information.
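
The decoupling described in the contributions above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the radius is interpolated linearly (the summary only says the radial part evolves independently), uses the standard great-circle (slerp) formula for the angular part, and uses the standard classifier-free guidance combination. The function names are hypothetical.

```python
import math

def polar_decompose(x):
    """Split a feature vector into (radius, unit direction)."""
    r = math.sqrt(sum(v * v for v in x))
    if r == 0.0:
        raise ValueError("zero vector has no direction")
    return r, [v / r for v in x]

def decoupled_path(x0, x1, t):
    """Return (radius, direction) at time t in [0, 1] on a decoupled path.

    Radius: linear interpolation (an assumption of this sketch).
    Direction: great-circle interpolation, so the angle travelled grows
    linearly in t, independent of how the radius changes.
    Assumes x0 and x1 are not antipodal (angle strictly below pi).
    """
    r0, u0 = polar_decompose(x0)
    r1, u1 = polar_decompose(x1)
    cos_theta = max(-1.0, min(1.0, sum(a * b for a, b in zip(u0, u1))))
    theta = math.acos(cos_theta)
    r_t = (1.0 - t) * r0 + t * r1
    if theta < 1e-8:  # directions coincide, nothing to rotate
        return r_t, u0
    a = math.sin((1.0 - t) * theta) / math.sin(theta)
    b = math.sin(t * theta) / math.sin(theta)
    return r_t, [a * p + b * q for p, q in zip(u0, u1)]

def guided_velocity(v_uncond, v_cond, w):
    """Standard classifier-free guidance: blend two velocity predictions.

    w = 0 gives the unconditional prediction, w = 1 the conditional one,
    and w > 1 extrapolates past the conditional prediction.
    """
    return [vu + w * (vc - vu) for vu, vc in zip(v_uncond, v_cond)]
```

For example, `decoupled_path([3.0, 0.0], [0.0, 2.0], 0.25)` covers one quarter of the 90-degree angle between the two vectors, while the radius moves one quarter of the way from 3 to 2.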

Why it matters

Existing flow matching methods for few-shot adaptation are constrained by geometric priors that are incompatible with pre-trained cross-modal features. This paper proposes a Riemannian framework, DP-FM, that addresses this by decoupling radial and angular feature dynamics and conditioning the flow on dataset-specific context. The authors report a new state of the art for multi-step few-shot adaptation of vision-language models across 11 benchmarks.

Original Abstract

Recent flow matching (FM) methods improve the few-shot adaptation of vision-language models, by modeling cross-modal alignment as a continuous multi-step flow. In this paper, we argue that existing FM methods are inherently constrained by incompatible geometric priors on pre-trained cross-modal features, resulting in suboptimal adaptation performance. We first analyze these methods from a polar decomposition perspective (i.e., radial and angular sub-manifolds). Under this new geometric view, we identify three overlooked limitations in them: 1) Angular dynamics distortion: The radial-angular coupling induces non-uniform speed on the angular sub-manifold, leading to regression training difficulty and extra truncation errors. 2) Radial dynamics neglect: Feature normalization discards modality confidence, failing to distinguish out-of-distribution and in-distribution data, and abandoning crucial radial dynamics. 3) Context-agnostic unconditional flow: Dataset-specific information loss during pre-trained cross-modal feature extraction remains unrecovered. To resolve these issues, we propose warped product flow matching (WP-FM), a unified Riemannian framework that reformulates alignment on a warped product manifold. Within this framework, we derive direct product flow matching (DP-FM) by introducing a constant-warping metric, which yields a decoupled cylindrical manifold (i.e., direct product manifold). DP-FM enables independent radial evolution and constant-speed angular geodesic transport, effectively eliminating angular dynamics distortion while preserving radial consistency. Meanwhile, we incorporate classifier-free guidance by conditioning the flow on the pre-trained VLMs' hidden states to inject missing dataset-specific information. Extensive results across 11 benchmarks have demonstrated that DP-FM achieves a new state-of-the-art for multi-step few-shot adaptation.
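
The first limitation in the abstract (non-uniform angular speed caused by radial-angular coupling) can be seen in a toy example. The sketch below assumes a straight-line Euclidean path between a source and a target feature, which is a common flow matching choice but is not spelled out in the abstract. It measures how much angle is covered in each of several equal time steps. All names are hypothetical.

```python
import math

def linear_path(x0, x1, t):
    """Straight-line (Euclidean) interpolation between two vectors."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def angle_between(x, y):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / (nx * ny))))

def angular_increments(x0, x1, steps):
    """Angle covered in each of `steps` equal time intervals along the line."""
    angles = [angle_between(x0, linear_path(x0, x1, k / steps))
              for k in range(steps + 1)]
    return [b - a for a, b in zip(angles, angles[1:])]

# Source and target are orthogonal but have different norms (4 vs 1).
increments = angular_increments([4.0, 0.0], [0.0, 1.0], steps=4)
# The increments are unequal: the direction barely moves while the
# large-norm source dominates, then swings quickly near the end.
```

On a path whose angular part follows a constant-speed geodesic, these increments would all be equal, which is the property the abstract attributes to DP-FM.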
