Bayesian Deep Learning
The Case For Bayesian Learning (Wilson, n.d.)
- Vague parameter prior + structured model (e.g. CNN) = structured function prior!
- The success of ensembles is encouraging for Bayesians, since an ensemble approximates the Bayesian model average.
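
The link between ensembles and the Bayesian model average can be sketched numerically. The Bayesian model average integrates the predictive distribution over the posterior on weights, p(y | x, D) = ∫ p(y | x, w) p(w | D) dw. Averaging the predictive probabilities of M independently trained networks can be read as a crude Monte Carlo estimate of that integral, under the assumption (not guaranteed) that the trained members behave roughly like posterior samples. The function names and the random logits below are hypothetical stand-ins for real trained networks, not code from Wilson's note.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def ensemble_predict(member_logits):
    """Average class probabilities over ensemble members.

    member_logits has shape (M, N, K): M members, N inputs, K classes.
    Returns an (N, K) array of averaged probabilities. The average is
    taken over probabilities, not logits, which matches the Monte Carlo
    form (1/M) * sum_m p(y | x, w_m).
    """
    probs = softmax(np.asarray(member_logits, dtype=float), axis=-1)
    return probs.mean(axis=0)


# Hypothetical stand-in for the outputs of 5 trained networks
# on 4 inputs with 3 classes.
rng = np.random.default_rng(0)
member_logits = rng.normal(size=(5, 4, 3))
bma_probs = ensemble_predict(member_logits)
```

Each row of `bma_probs` is a valid probability vector. When the members disagree, the averaged prediction is less confident than the individual members, which is the practical benefit the ensemble shares with the Bayesian model average.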
Bayesian Perspective on Generalization (Smith and Le, n.d.)
Bayesian model comparison was first applied to neural networks by
MacKay. Consider a classification model
The assumption of a Gaussian prior for
We can evaluate the normalizing constant,
In models with many parameters,
Bibliography
Smith, Sam, and Quoc V. Le. n.d. “A Bayesian Perspective on Generalization and Stochastic Gradient Descent.” https://openreview.net/pdf?id=BJij4yg0Z.
Wilson, Andrew Gordon. n.d. “The Case for Bayesian Deep Learning.”