Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning

Ruqi Zhang, Chunyuan Li, Jianyi Zhang, Changyou Chen, Andrew Gordon Wilson

Keywords: Bayesian inference, ImageNet

Tuesday Session 4 (17:00-19:00 GMT)
Tuesday Session 5 (20:00-22:00 GMT)
Tuesday: Probabilistic Approaches

Abstract: The posteriors over neural network weights are high dimensional and multimodal. Each mode typically characterizes a meaningfully different representation of the data. We develop Cyclical Stochastic Gradient MCMC (SG-MCMC) to automatically explore such distributions. In particular, we propose a cyclical stepsize schedule, where larger steps discover new modes and smaller steps characterize each mode. We prove non-asymptotic convergence guarantees for the proposed algorithm. Moreover, we provide extensive experimental results, including results on ImageNet, to demonstrate the effectiveness of cyclical SG-MCMC in learning complex multimodal distributions, especially for fully Bayesian inference with modern deep neural networks.
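To make the cyclical stepsize schedule concrete, below is a minimal Python sketch of the cosine schedule the paper proposes, where K is the total number of iterations and M the number of cycles; the function and argument names are illustrative, not taken from any released code.

```python
import math

def cyclical_stepsize(k, K, M, alpha0):
    """Cyclical cosine stepsize schedule from the paper:
    alpha_k = (alpha0 / 2) * [cos(pi * mod(k - 1, ceil(K/M)) / ceil(K/M)) + 1].
    Each cycle starts at alpha0 (large steps to discover new modes)
    and decays toward 0 (small steps to characterize the current mode)."""
    cycle_len = math.ceil(K / M)  # iterations per cycle
    frac = ((k - 1) % cycle_len) / cycle_len  # position within the current cycle, in [0, 1)
    return (alpha0 / 2) * (math.cos(math.pi * frac) + 1)

# Example: 50,000 iterations split into 4 cycles, initial stepsize 0.5.
# The stepsize decays within a cycle and restarts at alpha0 when a new cycle begins.
for k in (1, 6250, 12500, 12501):
    print(k, cyclical_stepsize(k, K=50_000, M=4, alpha0=0.5))
```

Within each cycle the stepsize jumps back to alpha0 and then decays toward zero, a warm-restart pattern; the paper collects posterior samples during the small-stepsize phase of each cycle, when the sampler is characterizing a mode.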

Similar Papers

Stochastic AUC Maximization with Deep Neural Networks
Mingrui Liu, Zhuoning Yuan, Yiming Ying, Tianbao Yang
Transferring Optimality Across Data Distributions via Homotopy Methods
Matilde Gargiani, Andrea Zanelli, Quoc Tran Dinh, Moritz Diehl, Frank Hutter