DFT:  Models based on Dirichlet diffusion trees.

The 'dft' programs implement Bayesian models for multivariate
probability or probability density estimation that are based on
Dirichlet diffusion trees.  The results of fitting such a model to
data (a set of training cases) can be used to make predictions for
future observations (test cases), or they can be interpreted to
produce a hierarchical clustering of the training cases.

A 'dft' model consists of one or more Dirichlet diffusion trees, whose
parameters may be fixed or may be given prior distributions.  Each
tree produces a real-valued vector for each training case; these are
added together to produce real-valued "latent" vectors associated with
each training case.  The latent vector for a case is used to define a
probability distribution for the case's data.  The data vector for a
case can be real or binary, but it cannot at present be partly real
and partly binary.  The model for binary data is that the probability
of a data item being 1 is found by applying the logistic function to
the corresponding latent value.  Real data is modeled as being
Gaussian distributed with mean given by the latent vector, or as being
t-distributed with location parameter given by the latent vector.  A
t-distribution for the noise is obtained using a hierarchical prior
specification for Gaussian noise variances that includes a level
allowing for different noise variances for each variable and for each
training case, which produces a t-distribution once the case-by-case
variances are integrated over.
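
The relationship between tree vectors, latent vectors, and data described
above can be sketched in a few lines of Python.  This is only an
illustration, not the code used by the 'dft' programs: the function names
are hypothetical, and the Gamma parameterization used for the case-by-case
precisions is an assumption chosen so that the resulting noise has a
t-distribution with 'df' degrees of freedom.

```python
import math
import random

def latent_vector(tree_vectors):
    """Add the vectors produced by each tree for one training case."""
    return [sum(parts) for parts in zip(*tree_vectors)]

def binary_probs(latent):
    """Probability that each binary data item is 1 (logistic function)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in latent]

def gaussian_log_density(data, latent, noise_sd):
    """Log density of real data with Gaussian noise around the latent vector."""
    total = 0.0
    for y, mu, sd in zip(data, latent, noise_sd):
        total += -0.5 * math.log(2.0 * math.pi * sd * sd)
        total += -0.5 * ((y - mu) / sd) ** 2
    return total

def sample_t_noise(mu, sigma, df, rng):
    """Draw one real data value whose noise is t-distributed.

    Assumption: the precision multiplier for this case and variable is
    drawn from a Gamma distribution with shape df/2 and mean 1.  Drawing
    Gaussian noise with that precision and then forgetting the precision
    gives a t-distribution with df degrees of freedom.
    """
    precision = rng.gammavariate(df / 2.0, 2.0 / df)  # shape, scale
    return mu + sigma * rng.gauss(0.0, 1.0) / math.sqrt(precision)
```

For example, with two trees producing the vectors [1.0, -2.0] and
[0.5, 2.0] for some case, latent_vector gives [1.5, 0.0], and binary_probs
applied to a latent value of 0.0 gives a probability of 0.5.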

The Markov chain used for sampling from the posterior distribution
over trees has as its state the structures of the trees, the
divergence times for each node in each tree, and any variable
hyperparameters for the trees or the noise distribution.  The latent
vectors for the training cases and the locations of non-terminal
nodes may also be present (and must be in some circumstances).
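
As a rough picture of what such a state contains, the sketch below lays it
out as Python data classes.  All class and field names are hypothetical and
do not correspond to the actual storage format used by the 'dft' programs.
The consistency check assumes the usual convention for Dirichlet diffusion
trees, in which divergence times lie strictly between 0 and 1 and a node
diverges later than its parent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class TreeState:
    # Tree structure: parent[i] is the parent of node i (None for the root).
    parent: Dict[int, Optional[int]]
    # Divergence time for each non-terminal node.
    div_time: Dict[int, float]
    # Any variable hyperparameters for this tree.
    hyperparams: Dict[str, float] = field(default_factory=dict)
    # Locations of non-terminal nodes; None when not kept in the state.
    locations: Optional[Dict[int, List[float]]] = None

@dataclass
class ChainState:
    trees: List[TreeState]
    # Any variable hyperparameters for the noise distribution.
    noise_hyperparams: Dict[str, float] = field(default_factory=dict)
    # Latent vectors for the training cases; None when not kept.
    latent_vectors: Optional[List[List[float]]] = None

def times_are_consistent(tree):
    """Check that times are in (0,1) and increase from parent to child."""
    for node, t in tree.div_time.items():
        if not 0.0 < t < 1.0:
            return False
        p = tree.parent.get(node)
        if p is not None and tree.div_time[p] >= t:
            return False
    return True
```

The optional fields reflect the statement above that latent vectors and
node locations are present in the state only some of the time.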

            Copyright (c) 1995-2003 by Radford M. Neal