(Aside image: filters learned by a GRBM from small natural image patches.)

The restricted Boltzmann machine (RBM) is a typical building block of deep networks. It is based on undirected connections between a visible and a hidden layer, each consisting of binary units. Learning such models has been rather cumbersome, but we have proposed several improvements to the learning algorithm in (Cho et al. 2010, Cho et al. 2011a) that make it stable and robust to the choice of learning parameters and data representation. The Gaussian-Bernoulli restricted Boltzmann machine (GRBM) is a variant of the RBM for continuous-valued data. We have improved its learning algorithm in (Cho et al. 2011b); see the image for some filters learned from small natural image patches. We have published software packages implementing our new algorithms. The improvements are also applicable to deep models (Cho et al. 2011c).
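As background, the standard way to train a binary RBM is contrastive divergence with one Gibbs step (CD-1): compare hidden-unit statistics under the data against statistics after one reconstruction step. The following is a minimal NumPy sketch of plain CD-1 on a toy pattern, not the improved algorithms of the papers above; all names and the toy data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM.

    v0   : (batch, n_visible) binary data
    W    : (n_visible, n_hidden) connection weights
    b, c : visible and hidden bias vectors
    """
    # Positive phase: hidden probabilities given the data.
    h0 = sigmoid(v0 @ W + c)
    # One Gibbs step: sample hiddens, reconstruct visibles, re-infer hiddens.
    h_sample = (rng.random(h0.shape) < h0).astype(float)
    v1 = sigmoid(h_sample @ W.T + b)   # mean-field reconstruction
    h1 = sigmoid(v1 @ W + c)
    # Gradient approximation: data statistics minus reconstruction statistics.
    W += lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (h0 - h1).mean(axis=0)
    return W, b, c

# Toy run: learn to reconstruct a repeated 6-bit pattern.
data = np.tile([1.0, 0, 1, 0, 1, 0], (32, 1))
W = 0.01 * rng.standard_normal((6, 4))
b = np.zeros(6)
c = np.zeros(4)
for _ in range(200):
    W, b, c = cd1_step(data, W, b, c)
recon = sigmoid(sigmoid(data @ W + c) @ W.T + b)
```

Plain CD-1 of this kind is exactly the setting whose sensitivity to learning rates and data representation the cited improvements address.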

We have also studied connections between the two basic types of building blocks: RBMs and auto-encoders. In (Cho et al. 2012a), we propose a framework where auto-encoders can be used for initializing deep Boltzmann machines. In (Cho et al. 2012b), we borrow the contractive regularization idea from auto-encoders for use in RBMs.

There have also been recent advances in using traditional back-propagation for deeper networks than before. It seems that one crucial problem has been that optimization algorithms have simply been too slow and too prone to local minima and plateaus. In (Raiko et al. 2012) we present simple transformations that make the optimization problem much easier, allowing deep networks to be learned with plain stochastic gradient descent.
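One such transformation augments each nonlinearity with linear terms, g(x) = tanh(x) + a·x + b, with a and b set so that the unit's output has zero mean and zero average slope over the data. The sketch below fits a and b on one sample batch as an assumed illustration of the idea; in actual training they would be adapted continuously.

```python
import numpy as np

def transformed_tanh(x, alpha, beta):
    """Nonlinearity augmented with linear transformations:
    g(x) = tanh(x) + alpha * x + beta.
    Choosing alpha and beta to cancel the average slope and mean output
    decorrelates the gradient components, so plain stochastic gradient
    descent behaves closer to a second-order method.
    """
    return np.tanh(x) + alpha * x + beta

# Fit alpha and beta on a sample batch (illustrative; adapted during training).
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
alpha = -np.mean(1.0 - np.tanh(x) ** 2)   # cancels the average slope
beta = -np.mean(np.tanh(x) + alpha * x)   # cancels the mean output
y = transformed_tanh(x, alpha, beta)
```

After the transformation the unit's output is zero-mean and, on average, flat, which removes much of the coupling between layers that makes deep gradients ill-conditioned.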

There are also two master's theses on the theory of deep learning: