FACILITIES PROVIDED BY THIS SOFTWARE

This software implements Bayesian methods for learning multilayer
perceptron networks, as described in my thesis, "Bayesian Learning for
Neural Networks", which has now been published by Springer-Verlag
(ISBN 0-387-94724-8).  The implementation uses Markov chain Monte
Carlo methods.  Software modules that support Markov chain sampling
are included in the distribution, and may be useful in other
applications.  Note that I am distributing this software to facilitate
research in this area.  Potential users should make note of the
copyright notice at the beginning of this document (or accessible via
the first hypertext link).  You must obtain permission from me before
using this software for purposes other than research or education.
You should also note that the software may have bugs, particularly
in recently added experimental features.

The software supports Bayesian learning for regression problems,
classification problems, and survival analysis (experimental), using
models based on networks with any number of hidden layers, with a wide
variety of prior distributions for network parameters and
hyperparameters.  The advantages of Bayesian learning include the
automatic determination of "regularization" hyperparameters without
the need for a validation set, the avoidance of overfitting when using
large networks, and the quantification of uncertainty in predictions.
The software implements the Automatic Relevance Determination (ARD)
approach to handling inputs that may turn out to be irrelevant
(developed with David MacKay).  For problems and networks of moderate
size (e.g., 200 training cases, 10 inputs, 20 hidden units), full
training (to the point where one can be reasonably sure that the
correct Bayesian answer has been found) typically takes several hours
to a day on our SGI machine.  However, quite good results, competitive
with other methods, are often obtained after training for under an
hour.  (Of course, your machine may not be as fast as ours!)

To understand how to use this software, it is essential for you to
have read my thesis or the book based on it.  The neural network
models implemented are essentially as described in the Appendix of the
thesis and book.

The software consists of a number of programs and modules.  Three
major components are included in this distribution, each with its own
directory:
  
    util    Modules and programs of general utility.

    mc      Modules and programs that support sampling using Markov 
            chain Monte Carlo methods, using modules from util.

    net     Modules and programs that implement Bayesian inference
            for models based on multilayer perceptrons, using the
            modules from util and mc.

In addition, the 'bvg' directory contains modules and programs for
sampling from a bivariate Gaussian distribution, as a simple
demonstration of the capabilities of the Markov chain Monte Carlo
facilities.  Apart from this example and the detailed documentation
on the individual commands, I have not attempted to document how you
might go about using the Markov chain Monte Carlo modules for another
application.

The 'examples' directory contains the data sets that are used in the
tutorial examples of Bayesian neural network learning, along with the
shell scripts with the commands used.

It is possible to use this software to do learning and prediction
without any knowledge of how the programs are written (assuming that
the software can be installed as described below without any
problems).  However, the complete source code is included so that
researchers can modify the programs to try out their own ideas.

The software is written in ANSI C, and is meant to be run in a UNIX
environment.  Specifically, it was developed on an SGI machine running
IRIX Release 5.3.  It also seems to run OK on a SPARC machine running
SunOS 5, using the 'gcc' C compiler.  As far as I know, the software
does not depend on any peculiarities of these environments (except
perhaps for the use of the drand48 pseudo-random number generator),
but you may nevertheless have problems getting it to work in
substantially different environments, and I can offer little or no
assistance in this regard.  There is no dependence on any particular
graphics package or graphical user interface.  (The 'xxx-plt' programs
are designed to allow their output to be piped directly into the
'xgraph' plotting program, but other plotting programs can be used
instead, or the numbers can be examined directly.)
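
If the use of drand48 mentioned above is a problem on your system, one
possible approach is to confine it to a pair of small functions, so
that another generator can be substituted in one place.  The sketch
below is a suggestion only, and does not describe how the software is
actually organized: the names port_seed, port_uniform, PORT_NO_DRAND48,
and PORT_SKETCH_MAIN are hypothetical.  The fallback uses the standard
C rand function, which on some systems is of poor quality, so results
obtained with it should be treated with caution.

```c
/* Illustrative sketch only: isolating the dependence on drand48 behind
   two functions.  All names here are hypothetical. */

#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* asks <stdlib.h> to declare drand48/srand48 */
#endif

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Set the seed of the underlying generator.  Define PORT_NO_DRAND48
   when compiling on a system that lacks drand48. */

void port_seed (long seed)
{
#ifdef PORT_NO_DRAND48
  srand ((unsigned) seed);
#else
  srand48 (seed);
#endif
}

/* Return a pseudo-random value uniformly distributed on [0,1). */

double port_uniform (void)
{
#ifdef PORT_NO_DRAND48
  return rand() / (RAND_MAX + 1.0);  /* cruder: rand may supply few bits */
#else
  return drand48();
#endif
}

#ifdef PORT_SKETCH_MAIN   /* compile with -DPORT_SKETCH_MAIN for a demo */
int main (void)
{
  int i;

  port_seed (1);

  for (i = 0; i < 5; i++)
  { double u = port_uniform();
    assert (u >= 0 && u < 1);
    printf ("%.6f\n", u);
  }

  return 0;
}
#endif
```

Setting the same seed twice reproduces the same sequence of values,
which is useful when checking that a port behaves consistently from
run to run.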