CSC 2541 - Topics in Machine Learning: Bayesian Methods for Machine Learning (Sept-Dec 2004)

This course will explore how Bayesian statistical methods can be applied to problems in machine learning. I will talk about the theory of Bayesian inference, methods for performing Bayesian computations, including Markov chain Monte Carlo and variational approximation, and ways of constructing Bayesian models, particularly models that are appropriate for the high-dimensional problems that often arise in fields such as bioinformatics. Exercises in the course will deal both with theoretical issues and with practical aspects of applying software for Bayesian learning to real data.

Prerequisite: Some basic knowledge of probability will be assumed.


Radford Neal, Office: PT 290E / SS 6016A, Phone: (416) 946-8482 / (416) 978-4970


Wednesdays, 1:10 to 3:00, from September 15 to December 8, in BA 3012.

Office Hours:

Mondays, 1:10-2:00, in Pratt 290E.


Term test: 17%
Assignments: Three, 16% each
Project: 35%


Assignment 1: Handout in Postscript or PDF, and the data. Note: A minus sign was missing from the expression for the density for the exponential distribution in the version handed out; it's fixed in the version here.

You may want to generate random variables from a gamma distribution for this assignment. Here is one way to do this. More efficient ways may be found in Luc Devroye's book (see below).
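As an illustration only (this is not the method linked from the course page, whose contents are not reproduced here), gamma random variables can be generated by the rejection method of Marsaglia and Tsang (2000). The sketch below is in Python using only the standard library. The function name `sample_gamma` and its `shape`/`rate` parameterisation are assumptions made for this example; for shapes below 1 it uses the standard trick of sampling with shape + 1 and multiplying by U^(1/shape).

```python
import math
import random


def sample_gamma(shape, rate=1.0, rng=random):
    """Draw one Gamma(shape, rate) variate (mean = shape / rate).

    Hypothetical helper for illustration; uses Marsaglia-Tsang rejection.
    """
    if shape <= 0.0 or rate <= 0.0:
        raise ValueError("shape and rate must be positive")

    if shape < 1.0:
        # Boosting step: if G ~ Gamma(shape + 1) and U ~ Uniform(0, 1),
        # then G * U**(1/shape) ~ Gamma(shape).
        u = 1.0 - rng.random()  # lies in (0, 1]
        return sample_gamma(shape + 1.0, rate, rng) * u ** (1.0 / shape)

    d = shape - 1.0 / 3.0
    c = 1.0 / math.sqrt(9.0 * d)
    while True:
        x = rng.gauss(0.0, 1.0)
        v = (1.0 + c * x) ** 3
        if v <= 0.0:
            continue  # proposal outside the support; try again
        u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is defined
        # Accept with the Marsaglia-Tsang log-acceptance test.
        if math.log(u) < 0.5 * x * x + d - d * v + d * math.log(v):
            return d * v / rate  # d * v ~ Gamma(shape, 1); divide by rate


if __name__ == "__main__":
    draws = [sample_gamma(2.5, rate=2.0) for _ in range(10000)]
    print("sample mean:", sum(draws) / len(draws), "(theory: 1.25)")
```

Python's standard library also provides `random.gammavariate(alpha, beta)`, where `beta` is a scale rather than a rate, so the sketch above is mainly useful for seeing how such a generator works.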

Here is a solution for assignment 1: R program, plots in Postscript and PDF, and comments.

Assignment 2: Handout in Postscript or PDF. Here is the training data and the test data.

Assignment 3: Handout in Postscript or PDF. Here are the data files: xdata.trn, xdata.tst, hdata.trn, hdata.tst, sdata.trn, sdata.tst.

Note: I forgot a `...' argument in the data-spec command in the assignment handout. It's corrected in the version above.


You may do a project alone, or in a group of two people. Naturally, more will be expected of projects done by two people than by one.

Here are some suggested project ideas. Let me know if you are interested in one of these, since I'd like to avoid having more than one group work on the same idea (though perhaps two groups could work on two variations of one idea). [ Project suggestions last updated November 23. ]

The projects will be due on January 4, 2005. You should submit a report that includes exposition and discussion, any supporting experimental results, and any program code.

Links to On-line References:

General reviews of Bayesian methods:

D. J. C. MacKay (2003) Information Theory, Inference, and Learning Algorithms.

R. M. Neal (1993) Probabilistic Inference Using Markov Chain Monte Carlo Methods.

Other useful books:

Devroye, L. (1986) Non-Uniform Random Variate Generation

Papers mentioned in lectures:

L. Wasserman (1998) Asymptotic Inference for Mixture Models Using Data Dependent Priors

R. M. Neal (1998) Markov Chain Sampling Methods for Dirichlet Process Mixture Models

C. Kemp, T. L. Griffiths, J. B. Tenenbaum (2004) Discovering latent classes in relational data

Other resources:

Tom Griffiths has a good collection of links about Bayesian inference

Matthew Beal has compiled a collection of references on Dirichlet processes and other nonparametric models

Useful References Not On-line:

J. M. Bernardo and A. F. M. Smith, Bayesian Theory, Wiley.

A. Gelman, J. B. Carlin, H. S. Stern, D. B. Rubin, Bayesian Data Analysis, 2nd edition, Chapman & Hall.

R. M. Neal, Bayesian Learning for Neural Networks, Springer.

Example programs:

Gibbs sampling for a simple mixture model: matlab program, test script.
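For readers without access to the Matlab files, here is a rough Python sketch of what Gibbs sampling for a simple mixture model can look like. It is not a translation of the course program. The model is an assumption chosen for illustration: one-dimensional data, K Gaussian components with unit variance, independent N(0, prior_sd^2) priors on the component means, and a symmetric Dirichlet(alpha) prior on the mixing proportions. The function name `gibbs_mixture` and all of its parameters are hypothetical.

```python
import math
import random


def gibbs_mixture(data, k=2, n_iter=200, prior_sd=10.0, alpha=1.0, seed=0):
    """Gibbs sampler for a 1-D Gaussian mixture with unit component variance.

    Returns a list of (means, proportions) pairs, one per iteration.
    """
    rng = random.Random(seed)
    z = [rng.randrange(k) for _ in data]  # random initial assignments
    mu = [0.0] * k
    samples = []

    for _ in range(n_iter):
        # Sufficient statistics for each component.
        counts = [0] * k
        sums = [0.0] * k
        for x, zi in zip(data, z):
            counts[zi] += 1
            sums[zi] += x

        # 1. Sample each mean from its conjugate normal posterior.
        for j in range(k):
            prec = counts[j] + 1.0 / prior_sd ** 2
            mu[j] = rng.gauss(sums[j] / prec, math.sqrt(1.0 / prec))

        # 2. Sample proportions from Dirichlet(alpha + counts),
        #    built from normalised gamma variates.
        g = [rng.gammavariate(alpha + counts[j], 1.0) for j in range(k)]
        total = sum(g)
        pi = [gj / total for gj in g]

        # 3. Sample each assignment given the means and proportions.
        for i, x in enumerate(data):
            logw = [math.log(pi[j]) - 0.5 * (x - mu[j]) ** 2 for j in range(k)]
            m = max(logw)  # subtract the max to avoid underflow
            w = [math.exp(lw - m) for lw in logw]
            z[i] = rng.choices(range(k), weights=w)[0]

        samples.append((list(mu), list(pi)))

    return samples


if __name__ == "__main__":
    gen = random.Random(1)
    data = [gen.gauss(-3.0, 1.0) for _ in range(100)]
    data += [gen.gauss(3.0, 1.0) for _ in range(100)]
    means, props = gibbs_mixture(data)[-1]
    print("final means:", sorted(means), "proportions:", props)
```

Because the component labels are exchangeable under this prior, the sampled means can swap roles between runs (label switching), so summaries are best computed on sorted means or on label-invariant quantities.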