A second-order method on the Stiefel manifold via Newton–Schulz
Xinhui Xiong, Bin Gao, P.-A. Absil
TLDR
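A second-order, retraction-free method for optimization on the Stiefel manifold builds its feasibility-restoring (normal) step from the Newton–Schulz iteration, is proved to converge locally quadratically, and outperforms existing methods in the reported experiments.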
Key contributions
- Proposes a second-order, retraction-free optimization method for the Stiefel manifold.
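- Proves local quadratic convergence (superlinear for the inexact variant) for an update that sums a tangent component, which reduces the objective, and a normal component, which reduces infeasibility.
- Builds the normal component from Newton–Schulz, a fixed-point iteration for orthogonalization, and shows geometrically that Newton–Schulz moves along the normal space.
- Reports better performance than existing methods on the orthogonal Procrustes problem, PCA, and real-data ICA.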
Why it matters
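Retraction-free methods on the Stiefel manifold are cheap, but they are often first-order, which can make them inefficient when high accuracy is required. This paper proposes a second-order alternative that still avoids retractions: a tangent step obtained from a modified Newton equation reduces the objective, while a Newton–Schulz step in the normal direction reduces infeasibility. The combined update is proved to converge locally quadratically, which matters for high-accuracy applications such as PCA and ICA.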
Original Abstract
Retraction-free approaches offer attractive low-cost alternatives to Riemannian methods on the Stiefel manifold, but they are often first-order, which may limit the efficiency under high-accuracy requirements. To this end, we propose a second-order method landing on the Stiefel manifold without invoking retractions, which is proved to enjoy local quadratic (or superlinear for its inexact variant) convergence. The update consists of the sum of (i) a component tangent to the level set of the constraint-defining function that aims to reduce the objective and (ii) a component normal to the same level set that reduces the infeasibility. Specifically, we construct the normal component via Newton–Schulz, a fixed-point iteration for orthogonalization. Moreover, we establish a geometric connection between the Newton–Schulz iteration and Stiefel manifolds, in which Newton–Schulz moves along the normal space. For the tangent component, we formulate a modified Newton equation that incorporates Newton–Schulz. Numerical experiments on the orthogonal Procrustes problem, principal component analysis, and real-data independent component analysis illustrate that the proposed method performs better than the existing methods.
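For readers unfamiliar with Newton–Schulz, the sketch below illustrates only the standard textbook iteration X ← X + ½ X (I − XᵀX), not the paper's full method (the tangent component and the modified Newton equation are not reproduced here). It assumes NumPy, an illustrative problem size, and a random starting point close to the manifold; the function names `newton_schulz_step` and `infeasibility` are hypothetical. It tracks how the infeasibility ‖XᵀX − I‖ shrinks, and it numerically checks that one step is orthogonal to a tangent direction of the level set {Y : YᵀY = XᵀX}, which is consistent with the abstract's remark that Newton–Schulz moves along the normal space.

```python
import numpy as np


def newton_schulz_step(X):
    """One standard Newton-Schulz step: X + 0.5 * X (I - X^T X)."""
    p = X.shape[1]
    return X + 0.5 * X @ (np.eye(p) - X.T @ X)


def infeasibility(X):
    """Frobenius norm of X^T X - I (zero exactly on the Stiefel manifold)."""
    return np.linalg.norm(X.T @ X - np.eye(X.shape[1]))


rng = np.random.default_rng(0)
n, p = 8, 3  # illustrative sizes, not taken from the paper

# Start near the manifold: an orthonormal matrix plus a small perturbation.
Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
X0 = Q + 0.05 * rng.standard_normal((n, p))

# The step equals X0 @ S with S = 0.5 (I - X0^T X0) symmetric.
step = newton_schulz_step(X0) - X0

# Build a tangent direction Z of the level set {Y : Y^T Y = X0^T X0},
# i.e. one with X0^T Z skew-symmetric, and check <step, Z> = 0.
A = rng.standard_normal((p, p))
Omega = A - A.T  # skew-symmetric
Z = X0 @ np.linalg.solve(X0.T @ X0, Omega)  # gives X0^T Z = Omega
normal_check = abs(np.sum(step * Z))

# Iterate and record the infeasibility after each step.
X = X0.copy()
errors = [infeasibility(X)]
for _ in range(5):
    X = newton_schulz_step(X)
    errors.append(infeasibility(X))

print("infeasibility per iteration:", errors)
print("|<step, tangent direction>|:", normal_check)
```

Near the manifold, the printed infeasibility should roughly square from one iteration to the next until it reaches rounding level, which is the fixed-point behavior the method relies on for its normal component.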