ArXiv TLDR

Randomized Subspace Nesterov Accelerated Gradient

arXiv:2605.00740

Gaku Omiya, Pierre-Louis Poirion, Akiko Takeda

math.OC · cs.LG · stat.ML

TLDR

Introduces randomized-subspace Nesterov accelerated gradient methods that use only projected-gradient information and can improve on full-dimensional Nesterov acceleration in oracle complexity for smooth convex optimization.

Key contributions

  • Develops randomized-subspace Nesterov accelerated gradient methods for smooth convex and smooth strongly convex optimization.
  • Introduces a novel three-sequence formulation tailored to matrix smoothness, which recovers the classical Nesterov methods in the full-dimensional case.
  • Establishes accelerated oracle-complexity guarantees that make explicit how matrix smoothness and the sketch distribution enter the complexity.
  • Provides a unified basis for comparing sketch families and identifying when randomized-subspace acceleration is beneficial.
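To make the core idea above concrete, here is a minimal illustrative sketch: a Nesterov-style loop that only ever observes a low-dimensional projected gradient through a random Gaussian sketch. All choices here (the toy quadratic, the Gaussian sketch, the conservative step size, and the classical constant-momentum update) are assumptions for illustration; this is not the paper's three-sequence formulation or its parameter schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy smooth, strongly convex problem: f(x) = 0.5 x^T A x - b^T x
d = 50
M = rng.standard_normal((d, d))
A = M @ M.T / d + np.eye(d)             # positive definite, so mu >= 1
b = rng.standard_normal(d)
x_star = np.linalg.solve(A, b)          # unique minimizer
grad = lambda x: A @ x - b

def subspace_nag(s=10, iters=2000):
    """Illustrative randomized-subspace Nesterov loop (a generic sketch,
    NOT the paper's three-sequence scheme): each iteration observes only
    the s-dimensional projected gradient P @ grad(y)."""
    eigs = np.linalg.eigvalsh(A)
    mu, L = eigs[0], eigs[-1]
    eta = 1.0 / (L * (1.0 + d / s))     # conservative step for the noisy estimate
    beta = (np.sqrt(L / mu) - 1) / (np.sqrt(L / mu) + 1)  # strongly convex momentum
    x = x_prev = np.zeros(d)
    for _ in range(iters):
        y = x + beta * (x - x_prev)                   # Nesterov extrapolation
        P = rng.standard_normal((s, d)) / np.sqrt(s)  # Gaussian sketch, E[P^T P] = I
        pg = P @ grad(y)                              # only low-dim info is "seen"
        g_hat = P.T @ pg                              # unbiased gradient estimate
        x_prev, x = x, y - eta * g_hat
    return x

x_hat = subspace_nag()
err0 = np.linalg.norm(x_star)           # error at the zero initialization
err = np.linalg.norm(x_hat - x_star)
print(err / err0)
```

Because the sketch satisfies E[PᵀP] = I, the reconstructed direction PᵀP∇f(y) is an unbiased gradient estimate, and its noise is multiplicative (it vanishes at the optimum), so the loop converges despite seeing only s-dimensional information per step. The step size is deliberately shrunk by the estimator's second-moment factor; how the sketch distribution enters such constants is exactly what the paper's complexity analysis makes explicit.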

Why it matters

This paper closes a gap in accelerated optimization for general subspace sketches by providing a new theoretical framework with improved oracle-complexity guarantees. This makes first-order optimization more efficient in settings where full gradients are costly, such as forward-mode automatic differentiation and communication-limited or large-scale problems, and it clarifies when randomized-subspace acceleration is most beneficial.

Original Abstract

Randomized-subspace methods reduce the cost of first-order optimization by using only low-dimensional projected-gradient information, a feature that is attractive in forward-mode automatic differentiation and communication-limited settings. While Nesterov acceleration is well understood for full-gradient and coordinate-based methods, obtaining accelerated methods for general subspace sketches that use only projected-gradient information and can improve over full-dimensional Nesterov acceleration in oracle complexity is technically nontrivial. We develop randomized-subspace Nesterov accelerated gradient methods for smooth convex and smooth strongly convex optimization under matrix smoothness and generic sketch moment assumptions. The key technical ingredient is a three-sequence formulation tailored to matrix smoothness, which recovers the corresponding classical Nesterov methods in the full-dimensional case. The resulting theory establishes accelerated oracle-complexity guarantees and makes explicit how matrix smoothness and the sketch distribution enter the complexity. It also provides a unified basis for comparing sketch families and identifying when randomized-subspace acceleration improves over full-dimensional Nesterov acceleration in oracle complexity.
