Download
Download and extract the files into some directory, and add that directory to the matlab path.
Eventually the source will become available for download.
This is a preliminary version, so there may be bugs.
A description of convolutional neural networks
The library provides the function mul for multiplication, mult for multiplication by the transpose, and outp for outer products. In addition, makeWB is a function for creating the weight matrices.
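For quick reference, the call patterns, as they appear in the examples below, are the following (the comment on outp's output shape is our reading of the examples, not a documented guarantee):

[W I] = makeWB(I, @randn); % create the weights W and fill in extra fields of I
Y = mul(X, W, I);          % apply every local field to the input X
X1 = mult(Y, W, I);        % multiply Y by the transpose of W
O = outp(X, Y, I);         % outer product; presumably shaped like W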
Note
The library is implemented in C; however, it does not use any sophisticated methods of matrix multiplication. In particular, in the case where there is only one local field, the resulting computation takes Ω(n^3) time. Another source of inefficiency is that during the multiplication, the required array address is recomputed for every indexing operation, yielding large constant factors.

Examples
Suppose that we are given an image of size 30 x 30. We would like to use a matrix with local fields of size 6 x 6, with horizontal and vertical spacing of 2, but we do not want it to be convolutional: the local fields may differ. Two adjacent local fields then overlap in a region of size 6 x 4 or 4 x 6. Suppose that, in addition, we want 7 outputs per local field. To implement this, we type the following into matlab:
I.xSize = [30 30]; % size of input vector
I.Size = [6 6];    % size of local field
I.Step = [2 2];    % the spacing
I.Conv = [0 0];    % no convolution along any dimension
I.k = 7;           % number of neurons per local field
[W I] = makeWB(I, @randn);
We have now created the matrix W. We could have used @zeros instead to create a matrix whose entries are all zero.
After creating W, it is safer not to modify the entries of I. A particularly useful field of I that is created by makeWB is I.ySize, the size of the output matrix.
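For example, we can check that I.ySize agrees with the output of mul (a sanity check; we are assuming here that I.ySize records the full output shape, which for this example is 13 x 13 x 7, as shown below):

X = rand(30,30);
Y = mul(X, W, I);
isequal(I.ySize, size(Y)) % prints 1 if the assumption holds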
We can now look inside the matrix W. Note that W is not a special matlab object; it is just a high-dimensional array, so expressions like W+1 are valid matlab.
>> size(W)
ans =
    13    13     7     6     6

The first three dimensions index the local field. The first two represent the position of the local field, and the third represents the neuron to which the local field belongs. The last two dimensions represent the position within the local field.
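Since W is an ordinary array, ordinary matlab indexing applies. For example, to extract the 6 x 6 weight patch of neuron 2 at field position (3, 4) (plain matlab, nothing library-specific):

w = squeeze(W(3, 4, 2, :, :)); % the 6 x 6 weights of neuron 2 at position (3,4)
size(w)                        % prints 6 6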
If it is impossible to tile the entire image with the given field size and step size, a warning will be issued. Such a situation may occur if we use local fields of odd size (e.g., I.Size = [7 7]) with an even step size (e.g., I.Step = [2 2]) and even image sizes (e.g., I.xSize = [30 30]).
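The arithmetic behind this is simple. Assuming the usual tiling rule, the number of field positions along each dimension is (xSize - Size)/Step + 1, which must be an integer:

([30 30] - [6 6]) ./ [2 2] + 1 % = [13 13]: 6 x 6 fields with step 2 tile a 30 x 30 image
% whereas 7 x 7 fields with step 2 do not, since (30 - 7)/2 + 1 = 12.5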
Now we can easily multiply, multiply by the transpose, and take the outer product.
X = rand(30,30);
Y = mul(X, W, I);
X1 = mult(Y, W, I);
O = outp(X, Y, I);
The dimensions of Y are

>> size(Y)
ans =
    13    13     7

Thus, we have 7 elements for each possible placement of the local field in the image. In general, both X and the local field are n-dimensional. We specify an n-dimensional spacing vector which describes how the local fields are to be located. For each position there are several outputs, and thus several distinct local fields, which correspond to the last dimension of Y.
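The other two primitives produce shapes consistent with this picture. The following shapes are our inference (mult should map back to the input space, and an outer product should be shaped like the weights), not documented output:

size(X1) % presumably 30 30, the shape of X
size(O)  % presumably 13 13 7 6 6, the shape of W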
Suppose that we want to use a convolutional network instead, with the same spacing of the local fields. Then all we need to do is to set, in the initialization,

I.Conv = [1 1]

This means that we want convolution along all dimensions. We could have written

I.Conv = [1 0]

so that we get convolution only along the y coordinate.
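Based on the fully convolutional example below, we would expect a convolved dimension of W to collapse to a single shared position, so that with I.Conv = [1 0] (this is an inference, not verified output):

size(W) % presumably 1 13 7 6 6: shared along y, distinct along x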
For a concrete example, consider
I.xSize = [30 30];
I.Size = [6 6];
I.Step = [2 2];
I.Conv = [1 1];
[W I] = makeWB(I, @randn);

This creates a random convolutional network. It is used exactly as before, except that we have
>> size(W)
ans =
     1     1     7     6     6

So we use the same weights for all the local fields. This is the only difference between convolutional and non-convolutional networks. If written carefully, the rest of the program should not notice any difference.
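To illustrate that claim, here is a minimal sketch of one gradient-style weight update, written purely in terms of mul and outp; under the assumption above that outp returns an array shaped like W, the same code runs unchanged whether or not W is convolutional (the error signal and learning rate are purely illustrative):

X = rand(30,30);
Y = mul(X, W, I);      % forward pass
E = Y - rand(size(Y)); % an illustrative error signal on the outputs
G = outp(X, E, I);     % gradient-like array, assumed to be shaped like W
W = W - 0.01 * G;      % a small gradient step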
A further consequence is that the same convolutional W can be applied to inputs of a different size. Suppose we want to use it on a 100 x 100 image. We build a new descriptor I_new:

I_new = I;
I_new.xSize = [100 100]; % the new size of the input vector
[junk_W I_new] = makeWB(I_new, @randn);

and then use the W that we already have together with I_new. So, for example, we could now have the following:
X = rand(100,100);
Y = mul(X, W, I_new);

This code will work as expected. The reason we cannot simply set I.xSize = [100 100] on the old I is that I contains more information than just the sizes, so it is safer to rebuild it with makeWB and throw away the junk_W it produces.
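Assuming the same tiling rule as before, the 100 x 100 image gives (100 - 6)/2 + 1 = 48 field positions along each axis, so we would expect:

size(Y) % presumably 48 48 7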
Example: The Spiking Boltzmann Machine
The Spiking Boltzmann Machine (*) is a big RBM whose weight matrix is convolutional in time, allowing it to capture temporal regularities.
If the SBM has a 30 by 30 image for each time step, and each hidden unit is allowed
to observe the visible variables from 5 time steps, we write
I.xSize = [30 30 T];
I.Size = [30 30 5];
I.Step = [1 1 1];
I.Conv = [0 0 1];
[W I] = makeWB(I, @randn);

where T is the number of time frames. This means that we convolve only along the 3rd dimension, which in our case happens to be the dimension of time.
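One detail worth flagging: the earlier examples also set I.k, the number of neurons per local field, and presumably that still applies here. A minimal usage sketch under that assumption:

T = 20;              % an arbitrary number of time frames, for concreteness
I.xSize = [30 30 T];
I.Size = [30 30 5];
I.Step = [1 1 1];
I.Conv = [0 0 1];
I.k = 7;             % hidden units per 5-frame window (assumed required, as before)
[W I] = makeWB(I, @randn);
X = rand(30, 30, T); % T frames of 30 x 30 images
Y = mul(X, W, I);    % one group of 7 outputs per valid 5-frame window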
It is also possible to modify the above example so that the hidden units are "sparse" -- we throw away every second unit in time, so that when a multilayered network is created, it can capture longer-range regularities.
I.xSize = [30 30 T];
I.Size = [30 30 5];
I.Step = [1 1 2]; % <-- half as many time steps in the next layer,
                  %     i.e., each time step in the next layer takes "more time"
I.Conv = [0 0 1];
[W I] = makeWB(I, @randn);
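Under the same tiling rule as before, stepping by 2 along time roughly halves the number of hidden time steps (the exact boundary handling is our assumption):

nTimePos = floor((T - 5) / 2) + 1; % versus T - 4 with a step of 1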