
Deep learning is a relatively new area of machine learning and neural network research. It uses neural networks with several hidden layers to find hierarchical representations of data, proceeding from raw observations towards increasingly abstract representations. Traditionally, backpropagation-type algorithms have been used to learn suitable representations in such multilayer networks. However, when a feedforward network has more than two hidden layers, these algorithms suffer from several problems that often make it impossible to find satisfactory representations of the data.
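
The following is a minimal sketch (not taken from the publications below, and with illustrative layer sizes) of one such problem: in a plain feedforward network with several sigmoid hidden layers, the backpropagated error signal tends to shrink from layer to layer, which makes the lower layers learn very slowly.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_layers, width = 6, 64
weights = [rng.normal(0, 0.1, (width, width)) for _ in range(n_layers)]

# Forward pass: keep the activations of every hidden layer.
x = rng.normal(size=(1, width))
activations = [x]
for W in weights:
    activations.append(sigmoid(activations[-1] @ W))

# Backward pass with an arbitrary error signal at the output;
# the printed gradient norms typically decay towards the input.
delta = np.ones((1, width))
for i in reversed(range(n_layers)):
    a = activations[i + 1]
    delta = (delta * a * (1 - a)) @ weights[i].T
    print(f"layer {i}: gradient norm {np.linalg.norm(delta):.2e}")
```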

Hinton and Salakhutdinov found that unsupervised pretraining of the hidden layers using restricted Boltzmann machines yields much better representations of the data. The idea is to learn not only the nonlinear mapping between input and output vectors but also a good representation of the input data itself. The representations provided by restricted Boltzmann machines can then be refined and improved further with backpropagation-type supervised learning algorithms. Trained in this manner, deep networks have produced world-record results in many classification and regression benchmark problems, leading to a renaissance of neural network research.
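
As a rough illustration of the pretraining step, the sketch below shows one contrastive divergence (CD-1) update for a binary restricted Boltzmann machine. It is an assumed, simplified implementation, not the authors' code; the layer sizes, learning rate, and random batch are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_visible, n_hidden, lr = 784, 256, 0.01
W = rng.normal(0, 0.01, (n_visible, n_hidden))
b_v = np.zeros(n_visible)   # visible biases
b_h = np.zeros(n_hidden)    # hidden biases

def cd1_update(v0, W, b_v, b_h, lr):
    """One CD-1 step on a batch of binary visible vectors v0."""
    # Positive phase: hidden probabilities and a sample given the data.
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: reconstruct the visible units and recompute hidden probabilities.
    p_v1 = sigmoid(h0 @ W.T + b_v)
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # Approximate log-likelihood gradient: data statistics minus reconstruction statistics.
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / v0.shape[0]
    b_v += lr * (v0 - p_v1).mean(axis=0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_v, b_h

# Example: one update on a random binary batch standing in for real data.
batch = (rng.random((32, n_visible)) < 0.5).astype(float)
W, b_v, b_h = cd1_update(batch, W, b_v, b_h, lr)
```

After pretraining, the learned weights can be used to initialize a feedforward network that is fine-tuned with supervised backpropagation.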

It is also possible to use multilayer perceptron (MLP) networks as autoencoders for finding hierarchical representations of data. In an autoencoder MLP network, the input and output vectors are the same, and the network contains a middle bottleneck layer whose activations are used as the representation of the data. It is essential to regularize an autoencoder, for instance by using a lower dimensionality in the bottleneck, adding artificial noise to the inputs, or using a contractive regularizer.
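
The sketch below illustrates the first two of these regularization strategies with a small denoising autoencoder: a narrow sigmoid bottleneck encodes a noise-corrupted input and a linear decoder reconstructs the clean input. It is only an assumed example with placeholder sizes and data, showing one manual gradient-descent step on the mean squared reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_in, n_code, lr = 100, 20, 0.1            # bottleneck narrower than the input
W1 = rng.normal(0, 0.1, (n_in, n_code));  b1 = np.zeros(n_code)
W2 = rng.normal(0, 0.1, (n_code, n_in));  b2 = np.zeros(n_in)

x = rng.normal(size=(32, n_in))            # a batch standing in for real data
x_noisy = x + 0.1 * rng.normal(size=x.shape)

# Forward pass: encode the noisy input, decode, compare to the clean input.
h = sigmoid(x_noisy @ W1 + b1)             # bottleneck representation
x_hat = h @ W2 + b2
loss = ((x_hat - x) ** 2).mean()

# Backward pass: plain gradients of the mean squared error.
g_out = 2.0 * (x_hat - x) / x.size
g_W2 = h.T @ g_out;              g_b2 = g_out.sum(axis=0)
g_h = (g_out @ W2.T) * h * (1 - h)
g_W1 = x_noisy.T @ g_h;          g_b1 = g_h.sum(axis=0)

for p, g in ((W1, g_W1), (b1, g_b1), (W2, g_W2), (b2, g_b2)):
    p -= lr * g                            # one gradient-descent step
```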

Below are some general references on deep learning. Our results and publications are presented on other subpages on deep learning found under the 'Research' page.

References

Web portal on Deep learning.

R. Salakhutdinov, Learning Deep Generative Models. Doctoral thesis, University of Toronto, Canada, 2009.

Y. Bengio, "Learning deep architectures for AI", Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1-127, 2009.

G. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets", Neural Computation, vol. 18, pp. 1527-1554, 2006.