mPAGE: Probabilistic Gradient Estimator With Momentum for Non-Convex Optimization

The probabilistic gradient estimator (PAGE) algorithm switches between vanilla SGD and variance-reduced updates in a flexible probabilistic manner. This motivates us to develop novel momentum-based algorithms for non-convex finite-sum problems. Specifically, we replace SGD with momentum acceleration in PAGE, integrating the momentum term into either the inner or the outer part of the gradient estimator; the resulting variants are named mPAGE-I and mPAGE-O, respectively. Furthermore, we propose a unified algorithmic framework, denoted mPAGE, that covers both mPAGE-I and mPAGE-O. For non-convex objectives, we establish a unified analysis of mPAGE and show that the mPAGE algorithms converge sublinearly to an $\epsilon$-accurate solution with high probability. By choosing an appropriate switching probability, the well-known stochastic gradient complexity bound is also achieved. Additionally, mPAGE degenerates into widely used optimization algorithms, such as SGD, the stochastic heavy ball (SHB) method, and PAGE, demonstrating the generality and effectiveness of our theoretical analysis. Finally, numerical experiments validate the benefits of the mPAGE methods and support our theoretical findings.
Source: IEEE Transactions on Signal Processing - Category: Biomedical Engineering Source Type: research