Isotropic SGD: a practical approach to Bayesian posterior sampling

Franzese, Giulio; Candela, Rosa; Milios, Dimitrios; Filippone, Maurizio; Michiardi, Pietro
Submitted to arXiv on 9 June 2020

In this work we define a unified mathematical framework to deepen our understanding of the role of stochastic gradient (SG) noise on the behavior of Markov chain Monte Carlo sampling (SGMCMC) algorithms.

Our formulation unlocks the design of a novel, practical approach to posterior sampling, which makes the SG noise isotropic using a fixed learning rate that we determine analytically, and that requires weaker assumptions than existing algorithms. In contrast, the common trait of existing SGMCMC algorithms is to approximate the isotropy condition either by drowning the gradients in additive noise (annealing the learning rate) or by making restrictive assumptions on the SG noise covariance and the geometry of the loss landscape.
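To make the sampling scheme concrete, the sketch below shows a generic SGMCMC update of the Langevin type, where each gradient step is perturbed by isotropic Gaussian noise. This is only an illustration of the general mechanism the abstract refers to: the paper's Isotropic SGD instead derives a fixed learning rate analytically so that the stochastic-gradient noise itself satisfies the isotropy condition; the function names and the toy Gaussian target here are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def sgmcmc_step(theta, grad_fn, lr, rng):
    """One Langevin-style SGMCMC update: a gradient step on the
    negative log-posterior plus injected isotropic Gaussian noise.

    theta   -- current parameter vector (np.ndarray)
    grad_fn -- stochastic gradient of the negative log-posterior
    lr      -- learning rate (fixed here; the paper derives it analytically)
    rng     -- np.random.Generator for the injected noise
    """
    noise = rng.normal(size=theta.shape)
    return theta - lr * grad_fn(theta) + np.sqrt(2.0 * lr) * noise

# Toy usage: sample from a standard Gaussian posterior, whose
# negative log-density has gradient grad U(theta) = theta.
rng = np.random.default_rng(0)
theta = np.zeros(1)
samples = []
for step in range(20000):
    theta = sgmcmc_step(theta, lambda t: t, lr=0.01, rng=rng)
    if step >= 1000:  # discard burn-in
        samples.append(theta[0])
samples = np.array(samples)
```

With a small fixed learning rate the chain's stationary distribution approximates the target posterior: the collected samples should have mean near 0 and variance near 1.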

Extensive experimental validations indicate that our proposal is competitive with the state of the art in SGMCMC, while being much more practical to use.

Data Science
Eurecom Ref:
© EURECOM. Personal use of this material is permitted. The definitive version of this paper was submitted to arXiv on 9 June 2020 and is available at: