Welcome to the homepage of the Mathematical Optimization Group

Department of Mathematics, University of Tübingen, Germany


TRactable quasI-Newton non-smooth Optimization Methods
for large scale Data Science

A joint project by J. Fadili and P. Ochs

Structured composite non-smooth optimization has proved extremely useful in data science, where a common trend is the deluge of large-scale multidimensional data, which drives the need for more efficient optimization schemes. So far, primarily first-order schemes have been used to solve such optimization problems, but they are quickly approaching their natural (and provable) limitations. In contrast, higher-order methods, in particular quasi-Newton ones, are hardly used due to their lack of scalability, the complexity of their mathematical analysis and the difficulty of deploying them in the non-smooth case. TRINOM-DS will unlock these bottlenecks and develop theoretical, numerical and algorithmic advances to exploit the great potential of quasi-Newton-type schemes for non-smooth large-scale optimization problems, which are ubiquitous in data science. These algorithms will be developed in a variety of challenging settings and are expected to have far-reaching applications in data science, e.g., machine learning (optimal transport, deep learning, etc.), imaging and computer vision. They will be implemented as fast optimization codes, possibly on dedicated architectures, which will be made publicly available following the philosophy of reproducible research. TRINOM-DS's members are very active players in the European school of optimization and data science, which makes this project a unique opportunity to give a scalable, computational and practical embodiment to higher-order non-smooth optimization methods.

Funded by the ANR DFG 2020 NLE.

Open Positions: We have several open PhD and PostDoc positions for this project. Applications are welcome now. If you are interested, please send an email to J. Fadili (jalal.fadili@ensicaen.fr) or P. Ochs (ochs@math.uni-tuebingen.de) and include your CV as well as a letter describing your interest in and proficiency with the field.

Our Preliminary Work:

[BF12] S. Becker, J. Fadili:
A quasi-Newton proximal splitting method. [pdf]
In Advances in Neural Information Processing Systems (NIPS), pages 2618--2626. Curran Associates Inc., 2012.

[BFO19] S. Becker, J. Fadili, P. Ochs:
On Quasi-Newton Forward--Backward Splitting: Proximal Calculus and Convergence. [pdf]
SIAM Journal on Optimization, 29(4):2445--2481, 2019.

Our joint work on proximal quasi-Newton methods laid the foundation for this project. The contribution differs significantly from related work, which either studies the theory of variable metric methods or their practical implementation. In contrast, in [BF12] and [BFO19] we developed an advanced and efficient proximal calculus for a common type of non-diagonal quasi-Newton metric that can be written as the sum of a diagonal matrix and a rank-r matrix. In particular, we studied proximal SR1- and BFGS-type quasi-Newton methods with favorable performance on practical convex optimization problems. While this contribution is a significant step towards the use of quasi-Newton methods for non-smooth optimization, establishing them as generic state-of-the-art solvers for applications in large-scale data science requires successful completion of the work program in this project proposal.
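The idea behind this proximal calculus can be illustrated in the rank-1 case: the prox in a metric V = diag(d) + uu^T reduces to a diagonal prox plus a scalar root-finding problem. Below is a minimal Python sketch for the l1 norm; the function names and the Brent root-finder are our illustrative choices, not code from the papers.

```python
import numpy as np
from scipy.optimize import brentq

def prox_l1_diag(y, d, lam):
    # Prox of lam*||.||_1 in the metric D = diag(d):
    # componentwise soft-thresholding with thresholds lam / d_i.
    return np.sign(y) * np.maximum(np.abs(y) - lam / d, 0.0)

def prox_l1_rank1(y, d, u, lam):
    # Prox of lam*||.||_1 in the metric V = diag(d) + u u^T.
    # Writing alpha = u^T(x - y) at the solution x reduces the problem
    # to the root of a strictly decreasing scalar function.
    def resid(alpha):
        p = prox_l1_diag(y - alpha * u / d, d, lam)
        return u @ (p - y) - alpha
    lo, hi = -1.0, 1.0               # expand a bracket around the unique root
    while resid(lo) <= 0.0:
        lo *= 2.0
    while resid(hi) >= 0.0:
        hi *= 2.0
    alpha = brentq(resid, lo, hi)    # Brent's method on the scalar equation
    return prox_l1_diag(y - alpha * u / d, d, lam)
```

Each evaluation of the scalar residual costs only O(n), which is what makes such non-diagonal metrics affordable at scale.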

[OP19] P. Ochs, T. Pock:
Adaptive Fista for Non-convex Optimization. [pdf]
SIAM Journal on Optimization, 29(4):2482--2503, 2019.

In [OP19], a surprising equivalence between certain (proximal) quasi-Newton methods and extrapolated (proximal) gradient-based algorithms was established: a particular choice of extrapolation leads exactly to the proximal SR1 metric used in [BF12] and [BFO19]. This equivalence yields a novel point of view on proximal quasi-Newton methods and serves as a basis for the further development of both types of algorithms.
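For illustration, here is the extrapolated proximal gradient template underlying this equivalence, instantiated as a minimal textbook FISTA iteration for the lasso. This is a standard sketch; it does not implement the specific SR1-generating extrapolation of [OP19].

```python
import numpy as np

def fista_lasso(A, b, lam, n_iter=200):
    # Minimal FISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1:
    # proximal gradient steps taken at an extrapolated point y.
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = x_prev = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(n_iter):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x + ((t - 1.0) / t_next) * (x - x_prev)      # extrapolation
        z = y - A.T @ (A @ y - b) / L                    # forward (gradient) step
        x_prev, x = x, np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # prox
        t = t_next
    return x
```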

[Ochs19] P. Ochs:
Unifying abstract inexact convergence theorems and block coordinate variable metric iPiano. [pdf]
SIAM Journal on Optimization, 29(1):541--570, 2019.
[Ochs18] P. Ochs:
Local Convergence of the Heavy-ball Method and iPiano for Non-convex Optimization. [pdf]
Journal of Optimization Theory and Applications, 177(1):153--180, 2018.
[OFB18] P. Ochs, J. Fadili, T. Brox:
Non-smooth Non-convex Bregman Minimization: Unification and new Algorithms. [pdf]
Journal of Optimization Theory and Applications, 181(1):244--278, 2018.

In [Ochs19] we developed a general and unifying view on the convergence of descent methods for non-convex non-smooth optimization problems, inspired by [ABS13] (H. Attouch, J. Bolte, and B. Svaiter: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Mathematical Programming, 137(1-2):91--129, 2013). As a special case of this generalization, the framework conveniently covers the convergence of inertial proximal gradient descent (iPiano) and its variable metric variant. In [OFB18], a unifying convergence framework for Bregman-based minimization algorithms was also developed. In [Ochs18], for a variant of the inertial methods just mentioned, we derived local convergence (including convergence rates) and an equivalence with alternating minimization schemes. All this previous work provides a good theoretical basis for the convergence analysis of quasi-Newton methods in the non-convex setting and for the practical combination of quasi-Newton methods with extrapolation methods and/or Bregman distances.
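The generic iPiano iteration mentioned above combines a proximal gradient step with heavy-ball momentum. A minimal sketch (our illustrative implementation, with a convex toy example; the papers treat the non-convex case):

```python
import numpy as np

def ipiano(grad_f, prox_g, x0, alpha, beta, n_iter=500):
    # Generic iPiano iteration (inertial proximal gradient):
    #   x_{k+1} = prox_{alpha*g}(x_k - alpha*grad_f(x_k) + beta*(x_k - x_{k-1})).
    # The term beta*(x_k - x_{k-1}) is the heavy-ball inertial term.
    x_prev = x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x_prev, x = x, prox_g(x - alpha * grad_f(x) + beta * (x - x_prev), alpha)
    return x

# Toy example: min_x 0.5*||x - c||^2 + ||x||_1 (solution: soft-threshold of c).
c = np.array([2.0, -0.3])
soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - t, 0.0)
x = ipiano(grad_f=lambda x: x - c, prox_g=soft, x0=np.zeros(2),
           alpha=0.5, beta=0.5)
```

Here alpha = 0.5 and beta = 0.5 satisfy the usual step-size restriction alpha < 2(1 - beta)/L for this problem (L = 1).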

[VPF15] S. Vaiter, G. Peyré and J. Fadili. Low Complexity Regularization of Linear Inverse Problems. Chapter in Sampling Theory, a Renaissance, 103--153. Applied and Numerical Harmonic Analysis. Birkhäuser, 2015.

[LFP17] J. Liang, J. Fadili and G. Peyré: Activity Identification and Local Linear Convergence of Forward-Backward-type methods. SIAM Journal on Optimization, 27(1):408--437, 2017.

[FMP18] J. Fadili, J. Malick and G. Peyré: Sensitivity Analysis for Mirror-Stratifiable Convex Functions. SIAM Journal on Optimization, 28(4):2975--3000, 2018.

[FGMP18] J. Fadili, G. Garrigos, J. Malick and G. Peyré: Model Consistency for Learning with Mirror-Stratifiable Regularizers. Proceedings of Machine Learning Research, PMLR 89:1236--1244, 2019.

In [VPF15] and [LFP17], we developed a unifying theory of exact finite-time identification of low-complexity models by proximal splitting methods, for a large class of regularizers called partly smooth functions. In [FMP18] and [FGMP18], a key limiting assumption of [VPF15] and [LFP17] was removed, albeit for a smaller class of regularizers coined mirror-stratifiable functions, while ensuring finite-time approximate (but not exact) model identification. These results were the first to explain empirical observations made in inverse problems and machine learning. They will provide a good basis for the design and analysis of scalable quasi-Newton methods applied to problems with low-complexity regularization.
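Finite-time identification is easy to observe numerically: running plain forward-backward splitting (ISTA) on a lasso problem and recording the active set at each iteration shows the support freezing after finitely many steps. A small illustrative sketch (ours, not code from the papers):

```python
import numpy as np

def ista_support_trace(A, b, lam, n_iter=50):
    # Plain forward-backward (ISTA) on min_x 0.5*||Ax - b||^2 + lam*||x||_1,
    # recording the support (set of nonzero entries) at every iteration.
    # Under the identification results discussed above, the support
    # stabilizes after finitely many iterations.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    supports = []
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - b) / L                        # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # prox step
        supports.append(tuple(np.flatnonzero(x)))
    return x, supports
```

Plotting the recorded supports over the iterations makes the identification point visible as the index where the active set stops changing.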
