930. Monitoring the Convergence Speed of PDHG to Find Better Primal and Dual Step Sizes
Invited abstract in session MD-32: Algorithms for machine learning and inverse problems: adaptive strategies, stream: Advances in large scale nonlinear optimization.
Monday, 14:30-16:00, Room: 41 (building: 303A)
Authors (first author is the speaker)
1. Olivier Fercoq, Telecom Paris University
Abstract
Primal-dual algorithms for solving convex-concave saddle point problems usually come with one or several step size parameters. Within the range where convergence is guaranteed, choosing the step size well can make the difference between a slow and a fast algorithm. A common way to set step sizes adaptively is to enforce a fair balance between the amounts of change in the primal and dual variables.
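To make this concrete, below is a minimal sketch of PDHG with such a balancing heuristic, applied to a ridge-regularized least squares problem written in saddle point form. The problem instance, the rebalancing rule, and parameters such as mu and check_every are illustrative assumptions, not the rule studied in this work.

```python
# Minimal illustrative PDHG with a primal/dual balancing heuristic.
# The problem and the rebalancing rule are assumptions for illustration only.
import numpy as np

def pdhg_balanced(A, b, lam=0.1, iters=500, check_every=20, mu=5.0):
    """PDHG on min_x max_y <Ax - b, y> - ||y||^2/2 + lam*||x||^2/2,
    i.e. ridge-regularized least squares, with a crude balance rule."""
    m, n = A.shape
    x, y = np.zeros(n), np.zeros(m)
    L = np.linalg.norm(A, 2)          # operator norm of A
    tau = sigma = 0.99 / L            # tau * sigma * ||A||^2 < 1 ensures convergence
    dx_acc = dy_acc = 0.0
    for k in range(iters):
        x_new = (x - tau * (A.T @ y)) / (1.0 + tau * lam)      # prox of lam*||x||^2/2
        x_bar = 2.0 * x_new - x                                # extrapolation step
        y_new = (y + sigma * (A @ x_bar - b)) / (1.0 + sigma)  # prox of ||y||^2/2 + <b, y>
        dx_acc += np.linalg.norm(x_new - x)
        dy_acc += np.linalg.norm(y_new - y)
        x, y = x_new, y_new
        if (k + 1) % check_every == 0 and dx_acc > 0 and dy_acc > 0:
            # If one variable moves much more than the other, shift the step sizes
            # toward the lagging one while keeping tau * sigma (hence stability) fixed.
            if dx_acc > mu * dy_acc:
                tau *= 0.7; sigma /= 0.7
            elif dy_acc > mu * dx_acc:
                tau /= 0.7; sigma *= 0.7
            dx_acc = dy_acc = 0.0
    return x
```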
In this work, we show how to find even better step sizes for the primal-dual hybrid gradient (PDHG). Drawing inspiration from quadratic problems, we base our method on a spectral radius estimation procedure and try to minimize this spectral radius, which is directly related to the rate of convergence. Building on power iterations, we are able to produce spectral radius estimates that are always smaller than 1 and that also work in the case of conjugate principal eigenvalues.
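The idea behind power-iteration-style estimates is that, for an affine fixed-point iteration z_{k+1} = M z_k + c, the differences d_k = z_{k+1} - z_k satisfy d_{k+1} = M d_k, so their norms behave like a power iteration for the spectral radius of M. The sketch below estimates a contraction factor from residual norms with a windowed geometric mean, which damps the oscillations caused by a pair of complex conjugate principal eigenvalues; it is a crude illustration under these assumptions, not the estimator presented in the talk (in particular it does not guarantee an estimate below 1).

```python
# Rough sketch: estimate a contraction factor from residual norms alone.
# The window size and function name are illustrative assumptions.
import numpy as np

def spectral_radius_estimate(residual_norms, window=20):
    """Estimate rho from the norms ||z_{k+1} - z_k|| of an affine iteration:
    these norms behave like a power iteration for the spectral radius of the
    linear part, and a windowed geometric mean of successive ratios smooths
    the oscillations due to complex conjugate principal eigenvalues."""
    r = np.asarray(residual_norms, dtype=float)
    r = r[r > 0]
    if r.size < window + 1:
        return None
    ratios = r[-window:] / r[-window - 1:-1]
    return float(np.exp(np.mean(np.log(ratios))))  # geometric mean of ratios
```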
For strongly convex quadratics, we show that our step size rule yields an algorithm as fast as inertial gradient descent. Moreover, since our spectral radius estimates rely only on residual norms, our method can be readily adapted to more general convex-concave saddle point problems.
In a second part, we extend these results to a randomized version of PDHG called PURE-CD. We design a statistical test to compare observed convergence rates and decide whether one step size is better than another.
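The abstract does not specify the test. One simple option, sketched below under the assumption of roughly linear convergence, is to treat the per-iteration decrease of the log residual norm as a noisy sample of the log rate and compare two step size settings with Welch's t-test; for a randomized method such as PURE-CD these samples are not independent, so this is only an illustration of the idea, not the test designed in this work.

```python
# Illustrative sketch only: the statistical test used in the talk is not specified here.
# Per-iteration log-decreases of the residual are compared with Welch's t-test.
import numpy as np
from scipy import stats

def a_converges_faster(res_a, res_b, alpha=0.05):
    """Return True if step size setting A appears significantly faster than B,
    given positive residual-norm sequences res_a and res_b."""
    la = np.diff(np.log(np.asarray(res_a, dtype=float)))  # per-iteration log-decrease, run A
    lb = np.diff(np.log(np.asarray(res_b, dtype=float)))  # per-iteration log-decrease, run B
    # One-sided Welch test: is the mean log-decrease of A more negative than that of B?
    t, p_two_sided = stats.ttest_ind(la, lb, equal_var=False)
    p_one_sided = p_two_sided / 2.0 if t < 0 else 1.0 - p_two_sided / 2.0
    return p_one_sided < alpha
```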
Numerical experiments on least squares, sparse SVM, TV-L1 denoising and TV-L2 denoising problems support our findings.
Keywords
- Convex Optimization
- Non-smooth Optimization
- Large Scale Optimization
Status: accepted