CSC 2541 - Topics in Machine Learning: Bayesian Methods for Machine Learning (Jan-Apr 2011)

This course will explore how Bayesian statistical methods can be applied to problems in machine learning. I will talk about the theory of Bayesian inference, methods for performing Bayesian computations, including Markov chain Monte Carlo and variational approximation, and ways of constructing Bayesian models, particularly models that are appropriate for the high-dimensional problems that often arise in fields such as bioinformatics. Exercises in the course will deal both with theoretical issues and with practical aspects of applying software for Bayesian learning to real data.

Prerequisite: Some basic knowledge of probability will be assumed.

NEW PICKUP TIME: You can pick up your major assignment and your project Wednesday, May 18, from 4:30-5:30pm, in PT 290E.


Instructor:

Radford Neal, Office: PT 290E / SS 6026A, Phone: (416) 946-8482 / (416) 978-4970, Email:
Office Hours: Fridays 2:10-3:00, in Pratt 290E.


Lectures:

Mondays, 1:10 to 3:00, in BA 2139. The first lecture is January 10. The last lecture is April 4. There is no lecture on February 21 (Family Day / Reading Week).


Marking scheme:

25% Five small exercises
25% One major assignment
20% Test
30% Project

Small exercises:

Exercise 1.
Exercise 2. Here is the data.
Exercise 3. Here is the data.
Exercise 4.
Exercise 5.

Major Assignment:

Handout. Here is the data.


Project:

Projects are to be done individually, in groups of two, or in groups of three if there is some special reason for a large group.

You should aim to email me a brief (a few paragraphs) project proposal by February 28, or shortly thereafter.

Some project ideas are here.

Lecture notes:

Week 1: Course info, introduction, conjugate priors.
Week 2: Monte Carlo, importance sampling, MCMC, Metropolis algorithm.
Week 3: Gibbs sampling, slice sampling, MCMC accuracy, multiple chains.
Week 4: Bayesian mixture models, MCMC for mixtures, infinite mixtures.
Week 5: Linear basis function models, regularization.
Week 6: Inference using marginal likelihood, inference in terms of observed data, infinite basis function models.
Week 7: Gaussian process regression models.
Week 8: Gaussian process classification, hierarchical Bayesian models.
Week 9: Latent feature models, Indian Buffet Process.
Week 10: Gaussian/Laplace approximations, variational approximations.

Links to On-line References:

General reviews of Bayesian methods:

D. J. C. MacKay (2003) Information Theory, Inference, and Learning Algorithms.

R. M. Neal (1993) Probabilistic Inference Using Markov Chain Monte Carlo Methods.

References relating to lectures:

Neal, R. M. (2003) Slice sampling (with discussion), Annals of Statistics, vol. 31, pp. 705-767.

Neal, R. M. (1998) Markov chain sampling methods for Dirichlet process mixture models.

MacKay, D. J. C. (1992) Bayesian Methods for Adaptive Models, PhD Thesis, Caltech.

Rasmussen, C. E. and Williams, C. K. I. (2006) Gaussian Processes for Machine Learning.

Murray, I., Adams, R. P., and MacKay, D. J. C. (2010) Elliptical Slice Sampling.

Griffiths, T. and Ghahramani, Z. (2005) Infinite latent feature models and the Indian buffet process, NIPS*2005.

Meeds, E., Ghahramani, Z., Neal, R. M., and Roweis, S. T. (2006) Modeling Dyadic Data with Binary Latent Factors, NIPS*2006.

Teh, Y. W., Jordan, M. I., Beal, M. J., and Blei, D. M. (2005) Hierarchical Dirichlet Processes.

Neal, R. M. (2003) Density Modeling and Clustering Using Dirichlet Diffusion Trees.

Useful References Not On-line:

J. M. Bernardo and A. F. M. Smith, Bayesian Theory, Wiley.

A. Gelman, J. B. Carlin, H. S. Stern, D. B. Rubin, Bayesian Data Analysis, 2nd edition, Chapman & Hall.

R. M. Neal, Bayesian Learning for Neural Networks, Springer.

Web page for a previous version of this course:

CSC 2541 (Fall 2004)