NET-MC:  Do Markov chain simulation to sample networks.

The net-mc program is the specialization of xxx-mc to the task of
sampling from the posterior distribution for a neural network model,
or from the prior distribution, if no training set is specified.  See
xxx-mc.doc for the generic features of this program.

The following application-specific sampling procedures are implemented:

   sample-hyper

       Does Gibbs sampling for the hyperparameters controlling the
       distributions of parameters (weights, biases, etc.).

   sample-noise

       Does Gibbs sampling for the noise variances.

   sample-lower-hyper

       Does Gibbs sampling for all lower-level hyperparameters.

   rgrid-upper-hyper [ stepsize ]

       Does random-grid Metropolis updates (one at a time) for the
       logs of all upper-level hyperparameters, in "precision" form.
       The default stepsize is 0.1.

   sample-lower-noise

       Does Gibbs sampling for all lower-level noise variances.

   rgrid-upper-noise [ stepsize ]

       Does random-grid Metropolis updates (one at a time) for the
       logs of all upper-level hyperparameters controlling noise
       variances, in "precision" form.  The default stepsize is 0.1.

   sample-sigmas

       Does the equivalent of both sample-hyper and sample-noise.

   sample-lower-sigmas

       Does the equivalent of both sample-lower-hyper and
       sample-lower-noise.

   sample-upper-sigmas

       Does the equivalent of both sample-upper-hyper and
       sample-upper-noise.

An "upper-level" hyperparameter is one that controls the distribution
of lower-level hyperparameters or noise variances (which may be either
explicit or implicit).  The "lower-level" hyperparameters directly
control the distributions of weights.  Looked at another way, the
lower-level hyperparameters are the ones at the bottom level of the
hierarchy, or those for which all hyperparameters below them have
degenerate distributions concentrated on the value of the higher-level
hyperparameter.

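
To illustrate the kind of update that the rgrid-upper-hyper and
rgrid-upper-noise commands are described as doing, here is a minimal
Python sketch of one random-grid Metropolis update for the log of a
single hyperparameter in "precision" form.  It is not the code used by
net-mc.  It assumes that a random-grid proposal is the centre of the
cell containing the current point in a randomly positioned grid whose
cell width is twice the stepsize; the generic documentation gives the
authoritative definition.  The function names and the log_density
argument are hypothetical, introduced only for this example.

```python
import math
import random


def rgrid_propose(x, stepsize, rng):
    """Propose the centre of the randomly offset grid cell containing x.

    Assumption: cells have width 2*stepsize, so the proposal is uniform
    on the interval from x-stepsize to x+stepsize, which is symmetric.
    """
    width = 2.0 * stepsize
    offset = rng.random() * width          # random position of the grid
    k = math.floor((x - offset) / width)   # index of the cell holding x
    return offset + (k + 0.5) * width      # centre of that cell


def rgrid_update_log_precision(tau, log_density, stepsize, rng):
    """One random-grid Metropolis update of x = log(tau).

    log_density(tau) is the log of the (unnormalized) density of the
    precision tau.  Returns the new tau and whether the proposal was
    accepted.  Exactly two random numbers are used per call, so chains
    driven by the same random stream stay synchronized.
    """
    x = math.log(tau)
    x_new = rgrid_propose(x, stepsize, rng)
    # The density of x = log(tau) is p(tau)*tau, so the log Jacobian is x.
    log_ratio = (log_density(math.exp(x_new)) + x_new) \
              - (log_density(tau) + x)
    accept = rng.random() < math.exp(min(0.0, log_ratio))
    return (math.exp(x_new), True) if accept else (tau, False)


# Toy target for the demonstration: tau ~ Gamma(shape=2, rate=1).
def toy_log_density(tau):
    return math.log(tau) - tau


rng = random.Random(1)
tau, rejections, n_updates = 1.0, 0, 1000
for _ in range(n_updates):
    tau, accepted = rgrid_update_log_precision(tau, toy_log_density, 0.1, rng)
    rejections += not accepted
print("rejection rate:", rejections / n_updates)
```

Under this assumed proposal, two chains that use the same random
numbers and whose current points fall in the same grid cell propose
exactly the same value, which is what lets coupled chains coalesce.
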
The random-grid Metropolis updates done with the above commands record
information (e.g., rejection rate) for later display in the same way as
generic rgrid-met-1 updates.

When coupling is being done, upper-level hyperparameters should be
updated only with random-grid updates; Gibbs sampling for these
upper-level hyperparameters will desynchronize the random number
streams (because of the way it is implemented using ARS), preventing
coalescence.  Lower-level hyperparameters can be updated with Gibbs
sampling, and they will exactly coalesce once the parameters they
control and the upper-level hyperparameters that control them have
exactly coalesced.

Default stepsizes for updates of parameters (weights, biases, etc.) by
the generic Markov chain operations are set by a complicated heuristic
procedure that is described in Appendix A of the thesis.

Tempering methods and Annealed Importance Sampling are supported.  The
effect of running at an inverse temperature other than one is to
multiply the likelihood part of the energy by that amount.  At inverse
temperature zero, the distribution is simply the prior for the
hyperparameters and weights.  The marginal likelihood for a model can
be found using Annealed Importance Sampling, since the log likelihood
part of the energy has all the appropriate normalizing constants.

            Copyright (c) 1995-2001 by Radford M. Neal