Randomized Subspace Nesterov Accelerated Gradient
Gaku Omiya, Pierre-Louis Poirion, Akiko Takeda
TLDR
Introduces randomized-subspace Nesterov accelerated gradient methods with accelerated oracle-complexity guarantees for smooth convex and smooth strongly convex optimization.
Key contributions
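- Develops randomized-subspace Nesterov accelerated gradient methods for smooth convex and smooth strongly convex optimization, under matrix smoothness and generic sketch moment assumptions.
- Introduces a three-sequence formulation tailored to matrix smoothness, which recovers the corresponding classical Nesterov methods in the full-dimensional case.
- Establishes accelerated oracle-complexity guarantees that make explicit how matrix smoothness and the sketch distribution enter the complexity.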
- Provides a unified basis to compare sketch families and identify when subspace acceleration is beneficial.
Why it matters
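This paper addresses a gap in accelerated optimization: Nesterov acceleration is well understood for full-gradient and coordinate-based methods, but not for general subspace sketches that use only projected-gradient information. The resulting oracle-complexity guarantees are most relevant where low-dimensional projected gradients are cheap to obtain, such as forward-mode automatic differentiation and communication-limited settings. The analysis also helps identify when randomized-subspace acceleration improves over full-dimensional Nesterov acceleration.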
Original Abstract
Randomized-subspace methods reduce the cost of first-order optimization by using only low-dimensional projected-gradient information, a feature that is attractive in forward-mode automatic differentiation and communication-limited settings. While Nesterov acceleration is well understood for full-gradient and coordinate-based methods, obtaining accelerated methods for general subspace sketches that use only projected-gradient information and can improve over full-dimensional Nesterov acceleration in oracle complexity is technically nontrivial. We develop randomized-subspace Nesterov accelerated gradient methods for smooth convex and smooth strongly convex optimization under matrix smoothness and generic sketch moment assumptions. The key technical ingredient is a three-sequence formulation tailored to matrix smoothness, which recovers the corresponding classical Nesterov methods in the full-dimensional case. The resulting theory establishes accelerated oracle-complexity guarantees and makes explicit how matrix smoothness and the sketch distribution enter the complexity. It also provides a unified basis for comparing sketch families and identifying when randomized-subspace acceleration improves over full-dimensional Nesterov acceleration in oracle complexity.
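To make "low-dimensional projected-gradient information" concrete, here is a minimal sketch of one plain (non-accelerated) randomized-subspace gradient step. It is not the paper's three-sequence accelerated method, which the abstract does not spell out. The function name `subspace_gradient_step`, the Gaussian sketch, the scalar smoothness constant `L` (in place of matrix smoothness), and the step-size rule are all assumptions chosen for illustration.

```python
import numpy as np

def subspace_gradient_step(grad_f, x, r, L, rng):
    """One plain randomized-subspace gradient step (illustrative, not accelerated).

    grad_f : callable returning the full gradient (used here only for illustration)
    x      : current iterate, shape (d,)
    r      : sketch dimension, r <= d
    L      : assumed scalar smoothness constant of f
    rng    : numpy.random.Generator
    """
    d = x.shape[0]
    # Gaussian sketch with E[P @ P.T] = I_d; an assumed sketch family.
    P = rng.standard_normal((d, r)) / np.sqrt(r)
    # Projected gradient P^T grad f(x): only r numbers. In practice these could be
    # r directional derivatives; here they are formed from the full gradient.
    g_proj = P.T @ grad_f(x)
    # Conservative step 1 / (L * ||P||_2^2), which guarantees descent for L-smooth f.
    step = 1.0 / (L * np.linalg.norm(P, 2) ** 2)
    return x - step * (P @ g_proj)

# Usage on a diagonal quadratic f(x) = 0.5 * sum(a_i * x_i^2), with L = max(a_i).
rng = np.random.default_rng(0)
a = np.linspace(1.0, 10.0, 50)
x = np.ones(50)
for _ in range(200):
    x = subspace_gradient_step(lambda z: a * z, x, r=5, L=10.0, rng=rng)
```

Each step moves only within the random `r`-dimensional subspace spanned by the columns of `P`, which is why the per-iteration oracle cost can be lower than that of a full gradient step.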