The Helmholtz Machine Through Time
Geoffrey E. Hinton, Peter Dayan, Ava To, and Radford M. Neal
Department of Computer Science
University of Toronto
Abstract
We describe the "wake-sleep'' algorithm that allows a
multilayer, unsupervised, stochastic neural network to build a hierarchical, top-down
generative model of an ensemble of data vectors. Because the generative model uses
distributed representations that are a non-linear function of the input, it is intractable
to compute the posterior probability distribution over hidden representations given the
generative model and the current data vector. It is therefore intractable to fit the
generative model to data using standard techniques such as gradient descent or EM. Instead
of computing the posterior distribution exactly, a "Helmholtz Machine" uses a
separate set of bottom-up "recognition" connections to produce a compact
approximation to the posterior distribution. The wake-sleep algorithm uses the top-down
generative connections to provide training data for the bottom-up recognition connections
and vice versa. In this paper, we show that the wake-sleep algorithm can be
generalized to model the temporal structure in sequences of data vectors. This yields a
very simple online algorithm that fits generative models with distributed hidden
representations, which can be exponentially more powerful than conventional Hidden Markov
Models.
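
The alternation between phases can be made concrete with a small sketch. Below is a minimal, illustrative NumPy implementation of the wake-sleep loop for a Helmholtz machine with a single layer of binary stochastic hidden units; the layer sizes, learning rate, and variable names are assumptions chosen for illustration and are not taken from the paper, which uses multiple hidden layers and, here, a temporal extension.

    # Minimal wake-sleep sketch: one visible layer, one layer of binary
    # stochastic hidden units. All sizes and the learning rate are illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    n_visible, n_hidden, lr = 8, 4, 0.05

    W_rec = np.zeros((n_hidden, n_visible))   # bottom-up recognition weights
    b_rec = np.zeros(n_hidden)                # recognition biases on hidden units
    W_gen = np.zeros((n_visible, n_hidden))   # top-down generative weights
    b_gen_v = np.zeros(n_visible)             # generative biases on visible units
    b_gen_h = np.zeros(n_hidden)              # generative biases (prior) on hidden units

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sample(p):
        return (rng.random(p.shape) < p).astype(float)

    def wake_phase(v):
        """Wake: recognize the data, then train the generative connections."""
        global W_gen, b_gen_v, b_gen_h
        h = sample(sigmoid(W_rec @ v + b_rec))      # bottom-up sample of a hidden code
        p_v = sigmoid(W_gen @ h + b_gen_v)          # top-down reconstruction of v
        W_gen += lr * np.outer(v - p_v, h)          # delta rule on generative weights
        b_gen_v += lr * (v - p_v)
        b_gen_h += lr * (h - sigmoid(b_gen_h))      # fit the generative prior over h

    def sleep_phase():
        """Sleep: dream from the generative model, then train the recognition connections."""
        global W_rec, b_rec
        h = sample(sigmoid(b_gen_h))                # fantasy hidden code from the prior
        v = sample(sigmoid(W_gen @ h + b_gen_v))    # fantasy data vector
        p_h = sigmoid(W_rec @ v + b_rec)            # recognition's prediction of h
        W_rec += lr * np.outer(h - p_h, v)          # delta rule on recognition weights
        b_rec += lr * (h - p_h)

    # Alternate wake and sleep on a toy data vector.
    v_data = rng.integers(0, 2, n_visible).astype(float)
    for _ in range(100):
        wake_phase(v_data)
        sleep_phase()

In the wake phase the recognition connections supply hidden samples that act as training targets for the generative connections; in the sleep phase the roles are swapped, which is the "vice versa" described above.
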
In F. Fogelman-Soulie and R. Gallinari (editors), ICANN-95, pp. 483-490.
Download [ps] [pdf]
Associated reference:
This conference paper discusses the wake-sleep algorithm and applies it to models of temporal sequences. The wake-sleep algorithm itself was introduced in the following paper:
Hinton, G. E., Dayan, P., Frey, B. J., and Neal, R. M. (1995). The wake-sleep algorithm for unsupervised neural networks. Science, 268, 1158-1161.
Download [abstract] [ps] [pdf]