We revisit the theoretical properties of Hamiltonian stochastic differential equations (SDES) for Bayesian posterior sampling, and we study the two types of errors that arise from numerical SDE simulation: the discretization error and the error due to noisy gradient estimates in the context of data subsampling. Our main result is a novel analysis for the effect of mini-batches through the lens of differential operator splitting, revising previous literature results. The stochastic component of a Hamiltonian SDE is decoupled from the gradient noise, for which we make no normality assumptions. This leads to the identification of a convergence bottleneck: when considering mini-batches, the best achievable error rate is , with being the integrator step size. Our theoretical results are supported by an empirical study on a variety of regression and classification tasks for Bayesian neural networks.
Revisiting the effects of stochasticity for Hamiltonian samplers
Submitted to ArXiV, 4 November 2021
© EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in Submitted to ArXiV, 4 November 2021 and is available at :
PERMALINK : https://www.eurecom.fr/publication/6740