Dept. of Biostatistics, Harvard University; Dana-Farber Cancer Institute; Broad Institute
Bagged posteriors for robust inference
Standard Bayesian inference is known to be sensitive to mismatch between the model and the data-generating mechanism, leading to unreliable uncertainty quantification and poor predictive performance. However, finding generally applicable and computationally feasible methods for robust Bayesian inference under misspecification has proven difficult. An intriguing, easy-to-implement approach is to use bagging on the Bayesian posterior (“BayesBag”); that is, to use the average of posterior distributions conditioned on bootstrapped datasets. In this talk, I describe the statistical behavior of BayesBag, propose a model–data mismatch index for diagnosing model misspecification using BayesBag, and empirically validate our BayesBag methodology on synthetic and real-world data. We find that in the presence of significant misspecification, BayesBag yields more reproducible inferences, has better predictive accuracy, and selects correct models more often than the standard Bayesian posterior; meanwhile, when the model is correctly specified, BayesBag produces superior or equally good results for parameter inference and prediction, while being slightly more conservative for model selection. Overall, our results demonstrate that BayesBag combines the attractive modeling features of standard Bayesian inference with the distributional robustness properties of frequentist methods.
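As a rough illustration of the idea of averaging posteriors over bootstrapped datasets, the following minimal sketch applies it to a toy conjugate normal-mean model. The model, the choice of bootstrap size equal to the original sample size, and all function and parameter names (`normal_posterior`, `bayesbag_samples`, `num_bootstrap`, `draws_per_posterior`) are assumptions made for illustration, not details taken from the talk.

```python
import numpy as np

def normal_posterior(x, prior_mean=0.0, prior_var=100.0, noise_var=1.0):
    """Posterior mean and variance for theta in the assumed toy model
    x_i ~ N(theta, noise_var), theta ~ N(prior_mean, prior_var)."""
    precision = 1.0 / prior_var + len(x) / noise_var
    mean = (prior_mean / prior_var + np.sum(x) / noise_var) / precision
    return mean, 1.0 / precision

def bayesbag_samples(x, num_bootstrap=200, draws_per_posterior=50, seed=0):
    """Draw from the bagged posterior: an equal-weight mixture of the
    posteriors conditioned on bootstrap resamples of x."""
    rng = np.random.default_rng(seed)
    pooled = []
    for _ in range(num_bootstrap):
        # Resample the data with replacement (same size as the original).
        resampled = rng.choice(x, size=len(x), replace=True)
        mean, var = normal_posterior(resampled)
        pooled.append(rng.normal(mean, np.sqrt(var), size=draws_per_posterior))
    return np.concatenate(pooled)

# Synthetic data with heavier tails and larger spread than the model assumes,
# so the unit-variance normal likelihood is misspecified.
data = 2.0 * np.random.default_rng(1).standard_t(df=3, size=100)

std_mean, std_var = normal_posterior(data)
bagged = bayesbag_samples(data)

print(f"standard posterior: mean={std_mean:.3f}, var={std_var:.4f}")
print(f"bagged posterior:   mean={bagged.mean():.3f}, var={bagged.var():.4f}")
```

In this toy setting the standard posterior variance depends only on the sample size, whereas the bagged posterior also reflects how much the posterior mean moves across resamples, so it comes out wider when the data are more dispersed than the model assumes.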
Computer Science & Artificial Intelligence Lab (CSAIL), Massachusetts Institute of Technology
Bayesian inference is a popular and practical tool for statistical inference. However, practitioners must make two major modeling choices when using Bayesian inference: the prior and the likelihood. Uncertainty in these choices gives rise to the study of Bayesian robustness, which in part seeks to answer how posterior inferences would change had a practitioner made different modeling choices. I will give an overview of the field of Bayesian robustness with an emphasis on sensitivity to the specification of the prior and sensitivity to likelihood misspecification. To highlight practical implications of these issues, I will give some examples of how standard Bayesian inference for widely used models such as linear regression and Gaussian mixture models can be dangerously non-robust.