MIA Talks

Primer: How do we build neural networks we can trust?

October 30, 2019
Courant Institute, New York University

Bayesian methods can provide full predictive distributions and well-calibrated uncertainties in modern deep learning. The Bayesian approach is especially relevant in scientific and healthcare applications, where we wish to have reliable predictive distributions for decision making and the facility to naturally incorporate domain expertise. With a Bayesian approach, we do not want merely to find a single point that optimizes a loss, but rather to integrate over a loss landscape to form a Bayesian model average. The geometric properties of the loss surface, rather than the specific locations of optima, therefore greatly influence the predictive distribution in a Bayesian procedure. By better understanding loss geometry, we can realize the significant benefits of Bayesian methods in modern deep learning, overcoming challenges of dimensionality. In this talk, we review work on Bayesian inference and loss geometry in modern deep learning, including challenges, new opportunities, and applications.
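
As a rough illustration of the contrast between optimizing a single point and forming a Bayesian model average, the two predictive distributions can be written as below. The notation is an assumption made for this sketch and is not taken from the talk: w denotes the network weights, D the training data, x a test input, y its prediction, and w_1, ..., w_S are S approximate samples from the posterior over weights.

```latex
% Point estimate: predict with a single optimized weight setting.
\hat{w} = \arg\max_{w} \, p(w \mid \mathcal{D}),
\qquad
p(y \mid x, \mathcal{D}) \approx p(y \mid x, \hat{w})

% Bayesian model average: integrate over all weight settings,
% weighting each by its posterior probability.
p(y \mid x, \mathcal{D}) = \int p(y \mid x, w) \, p(w \mid \mathcal{D}) \, dw

% Simple Monte Carlo approximation using S posterior samples
% (S and the sampling scheme are assumptions of this sketch).
p(y \mid x, \mathcal{D}) \approx \frac{1}{S} \sum_{s=1}^{S} p(y \mid x, w_s),
\qquad w_s \sim p(w \mid \mathcal{D})
```

Because the integral weights every region of weight space by its posterior mass, broad regions of low loss can contribute more to the prediction than a single sharp optimum. This is one way to read the abstract's point that the geometry of the loss surface matters more than the location of any one optimum.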

References:

  1. Weight Uncertainty in Neural Networks: https://arxiv.org/abs/1505.05424
  2. On Calibration of Modern Neural Networks: https://arxiv.org/abs/1706.04599
  3. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima: https://arxiv.org/abs/1609.04836
  4. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning: https://arxiv.org/abs/1506.02142
  5. Bayesian Learning via Stochastic Gradient Langevin Dynamics: https://www.ics.uci.edu/~welling/publications/papers/stoclangevin_v6.pdf
  6. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles: https://arxiv.org/abs/1612.0147