The restricted Boltzmann machine (RBM) is a typical building block of deep networks. It is based on undirected connections between a visible and a hidden layer, each consisting of a binary vector. Learning such models has traditionally been rather cumbersome, but we have proposed several improvements to the learning algorithm (Cho et al. 2010, Cho et al. 2011a) that make it stable and robust with respect to the choice of learning parameters and data representation.

The Gaussian-Bernoulli restricted Boltzmann machine (GRBM) is a variant of the RBM for continuous-valued data. We have improved its learning algorithm in (Cho et al. 2011b); see the figure for filters learned from small natural image patches. We have published software packages implementing our new algorithms. The improvements are also applicable to deep models (Cho et al. 2011c).
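As a rough illustration of the standard learning procedure that these improvements build on, a single contrastive-divergence (CD-1) update for a binary-binary RBM can be sketched as follows. This is a minimal NumPy sketch of the generic algorithm; the function name, learning rate, and batch handling are illustrative assumptions, not the enhanced algorithm from the papers above:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.01):
    """One CD-1 gradient step for a binary-binary RBM.

    v0   : (n_samples, n_visible) batch of binary data
    W    : (n_visible, n_hidden) weight matrix
    b, c : visible and hidden bias vectors
    """
    # Positive phase: hidden probabilities given the data.
    h0 = sigmoid(v0 @ W + c)
    # Sample hidden states, reconstruct visibles, re-infer hiddens.
    h_sample = (rng.random(h0.shape) < h0).astype(float)
    v1 = sigmoid(h_sample @ W.T + b)
    h1 = sigmoid(v1 @ W + c)
    # Approximate gradient: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    W += lr * (v0.T @ h0 - v1.T @ h1) / n
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (h0 - h1).mean(axis=0)
    return W, b, c
```

The sensitivity of this plain update to the learning rate and to the representation of the data is precisely what motivates the enhanced gradient and adaptive learning rate mentioned above.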
We have also studied connections between the two basic types of building blocks: RBMs and auto-encoders. In (Cho et al. 2012a), we propose a framework in which auto-encoders are used to initialize deep Boltzmann machines. In (Cho et al. 2012b), we borrow the contractive regularization idea from auto-encoders and apply it to RBMs.
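For reference, the contractive penalty in auto-encoders is the squared Frobenius norm of the Jacobian of the hidden representation with respect to the input; for a sigmoid encoder it has a simple closed form. Below is a minimal NumPy sketch of that standard penalty (the adaptation to RBMs in Cho et al. 2012b differs in its details):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def contractive_penalty(X, W, c):
    """Squared Frobenius norm of dh/dx, averaged over the batch,
    for the sigmoid encoder h = sigmoid(X @ W + c).

    The Jacobian row for hidden unit j is h_j * (1 - h_j) * W[:, j],
    so the penalty factorizes into per-unit terms.
    """
    H = sigmoid(X @ W + c)         # (n_samples, n_hidden)
    dh = (H * (1.0 - H)) ** 2      # squared sigmoid derivative
    w_sq = (W ** 2).sum(axis=0)    # squared column norms, (n_hidden,)
    return (dh * w_sq).sum(axis=1).mean()
```

Adding a multiple of this term to the reconstruction cost penalizes hidden representations that are sensitive to small input perturbations.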
There have also been recent advances in using traditional back-propagation for deeper networks than before. One crucial problem seems to have been that optimization algorithms have simply been too slow and prone to local minima and plateaus. In (Raiko et al. 2012), we present simple transformations that make the optimization problem much easier, allowing deep networks to be learned with plain stochastic gradient descent.
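The core idea is to transform each hidden nonlinearity so that its output and its average slope are roughly zero over the data, with the removed linear part carried by explicit shortcut connections. A rough sketch of such a transformation for a single tanh unit, using running-average estimates, is given below; the class name, momentum value, and update scheme are illustrative assumptions rather than the exact formulation of the paper:

```python
import numpy as np

class TransformedTanh:
    """A tanh unit transformed toward zero output mean and zero average
    slope over the data: f(x) = tanh(x) - alpha * x - beta.

    alpha tracks the average slope of tanh over recent batches, and
    beta tracks the average of tanh(x) - alpha * x. In a full network,
    the subtracted linear part would be absorbed into shortcut weights
    and biases so the function computed by the network is unchanged.
    """
    def __init__(self, momentum=0.9):
        self.alpha = 0.0
        self.beta = 0.0
        self.momentum = momentum

    def __call__(self, x):
        t = np.tanh(x)
        slope = 1.0 - t ** 2  # derivative of tanh
        m = self.momentum
        self.alpha = m * self.alpha + (1 - m) * slope.mean()
        self.beta = m * self.beta + (1 - m) * (t - self.alpha * x).mean()
        return t - self.alpha * x - self.beta
```

Centering the nonlinear parts in this way makes the gradient directions for different parameters less coupled, which is what lets a simple stochastic gradient make rapid progress.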
There are also two master's theses written on the theory of deep learning:
- Cho, KyungHyun. Improved Learning Methods for Restricted Boltzmann Machines. Master's Thesis, Aalto University, 2011.
- Calandra, Roberto. An Exploration of Deep Belief Networks toward Adaptive Learning. Master's Thesis, Aalto University, 2011.
References
K. Cho, T. Raiko, and A. Ilin.
Enhanced Gradient for Training Restricted Boltzmann Machines.
Neural Computation, Vol. 25, No. 3, pp. 805-831, March 2013.
K. Cho, A. Ilin, and T. Raiko.
Tikhonov-Type Regularization for Restricted Boltzmann Machines.
In Artificial Neural Networks and Machine Learning - ICANN 2012, Lecture Notes in Computer Science, volume 7552, pp. 81-88, September 2012b.
K. Cho, T. Raiko, A. Ilin, and J. Karhunen.
A Two-stage Pretraining Algorithm for Deep Boltzmann Machines.
Presented in the Deep Learning and Unsupervised Feature Learning Workshop at NIPS, 2012a.
T. Raiko, H. Valpola, and Y. LeCun.
Deep Learning Made Easier by Linear Transformations in Perceptrons.
In Proc. of the 15th Int. Conf. on Artificial Intelligence and Statistics
(AISTATS 2012), JMLR W&CP, volume 22, pp. 924-932, La Palma, Canary Islands,
April 21-23, 2012.
K. Cho, T. Raiko, and A. Ilin.
Gaussian-Bernoulli Deep Boltzmann Machine.
In the NIPS 2011 Workshop on Deep Learning and Unsupervised Feature Learning, Granada, Spain, December 16, 2011c.
K. Cho, A. Ilin, and T. Raiko.
Improved Learning of Gaussian-Bernoulli Restricted Boltzmann Machines.
In Artificial Neural Networks and Machine Learning - ICANN 2011, Lecture Notes in Computer Science, volume 6791, pp. 10-17, Espoo, Finland, June 14-17, 2011b.
K. Cho, T. Raiko, and A. Ilin.
Enhanced Gradient and Adaptive Learning Rate for Training Restricted Boltzmann Machines.
In Proceedings of the International Conference on Machine Learning (ICML 2011), Bellevue, Washington, USA, June 28-July 2, 2011a.
K. Cho, T. Raiko, and A. Ilin.
Parallel Tempering is Efficient for Learning Restricted Boltzmann Machines.
In Proceedings of the International Joint Conference on Neural Networks (IJCNN 2010), Barcelona, Spain, July 18-23, 2010.