Drifting Field Policy: A One-Step Generative Policy via Wasserstein Gradient Flow

May 8, 2026 · arXiv:2605.07727

Juil Koo, Mingue Park, Jiwon Choi, Yunhong Min, Minhyuk Sung

cs.LG, cs.AI, cs.RO

TLDR

DFP is a one-step generative policy whose updates are steps of a reverse-KL Wasserstein-2 gradient flow. The authors report state-of-the-art performance on several Robomimic and OGBench manipulation tasks.

Key contributions

Introduces Drifting Field Policy (DFP), a non-ODE one-step generative policy.
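Frames the policy update as a reverse-KL Wasserstein-2 gradient flow toward a soft target policy, so each update is a gradient step in probability space.
Decomposes that gradient into ascent toward higher action-value regions and score matching with an anchor policy, which acts as a trust region.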
Achieves state-of-the-art performance on Robomimic and OGBench manipulation tasks.
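
The decomposition in the third contribution can be illustrated with a short worked derivation. This is a sketch under an assumption that is not stated on this page: that the soft target policy takes the common form \pi^{*}(a \mid s) \propto \pi_{\mathrm{anc}}(a \mid s)\exp(Q(s,a)/\alpha), where \pi_{\mathrm{anc}} is the anchor policy, Q is the critic, and \alpha > 0 is a hypothetical temperature. The paper's exact definitions may differ. Under that assumption, the Wasserstein-2 gradient flow of the reverse KL moves actions with the velocity field below, and the normalizing constant of \pi^{*} vanishes under the gradient in a:

```latex
\begin{align*}
v(a \mid s)
  &= -\nabla_a \frac{\delta}{\delta \pi}\,
     \mathrm{KL}\!\left(\pi \,\|\, \pi^{*}\right)
   = \nabla_a \log \pi^{*}(a \mid s) - \nabla_a \log \pi(a \mid s) \\
  &= \underbrace{\frac{1}{\alpha}\,\nabla_a Q(s, a)}_{\text{action-value ascent}}
   + \underbrace{\nabla_a \log \pi_{\mathrm{anc}}(a \mid s)
     - \nabla_a \log \pi(a \mid s)}_{\text{score matching with the anchor}}
\end{align*}
```

The first term pushes actions toward higher critic values, and the second term vanishes exactly when the policy's score matches the anchor's, which is how the anchor plays the role of a trust region.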

Why it matters

DFP produces an action in a single step without solving an ODE at inference time, which is particularly useful for real-time robotics and control. Each policy update is a gradient step in probability space that combines ascent on the critic's action values with score matching against an anchor policy. The authors report state-of-the-art results on several Robomimic and OGBench manipulation tasks, outperforming ODE-based policies.

Original Abstract

We propose Drifting Field Policy (DFP), a non-ODE one-step generative policy built on the drifting model paradigm. We frame the policy update as a reverse-KL Wasserstein-2 gradient flow toward a soft target policy, so that each DFP update corresponds to a gradient step in probability space. By construction, this gradient is decomposed into an ascent toward higher action-value regions and a score matching with the anchor policy as a trust region. We further derive a simple, tractable surrogate of the otherwise intractable update loss, akin to behavior cloning on top-K critic-selected actions. We find empirically that this mechanism uniquely benefits the drifting backbone owing to its non-ODE parameterization. With one-step inference, DFP achieves state-of-the-art performance on several manipulation tasks across Robomimic and OGBench, outperforming ODE-based policies.
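
The abstract describes the surrogate loss only as "akin to behavior cloning on top-K critic-selected actions". The NumPy sketch below illustrates that general idea, not the authors' implementation: candidate actions are ranked by critic value, the K best are kept, and the policy's action is regressed toward them. The function names, array shapes, and the squared-error regression target are all assumptions made for illustration. The actual DFP loss is built on the drifting model and is not specified on this page.

```python
import numpy as np


def top_k_by_critic(actions, q_values, k):
    """Keep the k candidate actions with the highest critic value per state.

    actions:  (B, N, D) array of N candidate actions for each of B states.
    q_values: (B, N) array of critic scores for those candidates.
    Returns a (B, k, D) array, ordered from highest to lowest score.
    """
    # Sort candidate indices by descending critic value, keep the first k.
    idx = np.argsort(-q_values, axis=1)[:, :k]                     # (B, k)
    # Gather the selected actions; the index broadcasts over the action dim.
    return np.take_along_axis(actions, idx[:, :, None], axis=1)    # (B, k, D)


def bc_surrogate_loss(policy_actions, selected_actions):
    """Behavior-cloning-style loss toward the selected actions.

    ASSUMPTION: plain squared error is used here as a stand-in; it is not
    the loss from the paper.

    policy_actions:   (B, D) array, one policy output per state.
    selected_actions: (B, k, D) array from top_k_by_critic.
    """
    diff = policy_actions[:, None, :] - selected_actions           # (B, k, D)
    return float(np.mean(np.sum(diff ** 2, axis=-1)))
```

In a training loop, `actions` would typically be sampled from the current or anchor policy and `q_values` computed by the learned critic; both choices are likewise assumptions rather than details taken from the paper.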
