Video Analysis and Generation via a Semantic Progress Function
Gal Metzer, Sagi Polaczek, Ali Mahdavi-Amiri, Raja Giryes, Daniel Cohen-Or
TLDR
This paper introduces a Semantic Progress Function to analyze and linearize the semantic pacing of video generation models, yielding smoother, more evenly paced transitions.
Key contributions
- Introduces a Semantic Progress Function (SPF) to quantify the temporal evolution of semantic meaning in videos.
- Proposes a semantic linearization procedure that re-times videos for consistent, smooth semantic transitions.
- Provides a model-agnostic framework to identify temporal irregularities and compare pacing across video generators.
- Enables steering of video sequences (generated or real) towards desired semantic pacing.
Why it matters
Current video generation models often produce uneven semantic pacing: long stretches of near-static content punctuated by abrupt semantic jumps. This paper offers a method to both analyze and correct that behavior, leading to more coherent and natural video outputs. It also provides a versatile, model-agnostic tool for comparing and controlling semantic flow across video applications.
Original Abstract
Transformations produced by image and video generation models often evolve in a highly non-linear manner: long stretches where the content barely changes are followed by sudden, abrupt semantic jumps. To analyze and correct this behavior, we introduce a Semantic Progress Function, a one-dimensional representation that captures how the meaning of a given sequence evolves over time. For each frame, we compute distances between semantic embeddings and fit a smooth curve that reflects the cumulative semantic shift across the sequence. Departures of this curve from a straight line reveal uneven semantic pacing. Building on this insight, we propose a semantic linearization procedure that reparameterizes (or retimes) the sequence so that semantic change unfolds at a constant rate, yielding smoother and more coherent transitions. Beyond linearization, our framework provides a model-agnostic foundation for identifying temporal irregularities, comparing semantic pacing across different generators, and steering both generated and real-world video sequences toward arbitrary target pacing.
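The abstract's pipeline — per-frame semantic embeddings, cumulative distances forming a progress curve, then retiming so that curve becomes linear — can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the embedding model, the distance metric (Euclidean here), and the use of linear interpolation to invert the progress curve are all assumptions for the sketch.

```python
import numpy as np

def semantic_progress(embeddings):
    """Cumulative semantic shift across a sequence, normalized to [0, 1].

    embeddings: (num_frames, dim) array of per-frame semantic embeddings
    (e.g. from a vision-language encoder; the choice of encoder is an
    assumption of this sketch).
    """
    # Distance between each pair of consecutive frame embeddings.
    step_dists = np.linalg.norm(np.diff(embeddings, axis=0), axis=1)
    # Cumulative shift, starting at 0 for the first frame.
    progress = np.concatenate([[0.0], np.cumsum(step_dists)])
    return progress / progress[-1]

def linearize_retiming(progress, num_output_frames=None):
    """Frame times at which semantic change unfolds at a constant rate.

    Inverts the (monotone) progress curve: for equally spaced target
    progress values, find the original (fractional) frame indices that
    achieve them, via linear interpolation.
    """
    n = len(progress)
    if num_output_frames is None:
        num_output_frames = n
    targets = np.linspace(0.0, 1.0, num_output_frames)
    original_times = np.arange(n, dtype=float)
    # Interpolate the inverse mapping: progress value -> frame time.
    return np.interp(targets, progress, original_times)
```

Sampling (or interpolating) the sequence at the returned fractional frame times yields a retimed video whose semantic progress curve is, by construction, approximately a straight line.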