In this talk, we discuss how to exploit the geometry of training objectives for scalable Bayesian model averaging, leading to better point predictions as well as better uncertainty estimates and calibration in deep learning. We will focus primarily on five works, which include the surprising discovery of mode connectivity and its implications.
- Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs (NeurIPS 2018): https://arxiv.org/abs/1802.10026
- Averaging Weights Leads to Wider Optima and Better Generalization (UAI 2018): https://arxiv.org/abs/1803.05407
- A Simple Baseline for Bayesian Uncertainty in Deep Learning (NeurIPS 2019): https://arxiv.org/abs/1902.02476
- Subspace Inference for Bayesian Deep Learning (UAI 2019): https://arxiv.org/abs/1907.07504
- SWALP: Stochastic Weight Averaging in Low-Precision Training (ICML 2019): https://arxiv.org/abs/1904.11943