Below is a short overview of the Deep Learning Summer School 2016 in Montreal, along with some of the high-impact papers mentioned there.

Neural Networks, Hugo Larochelle

Tips for training NNs:

Recurrent Neural Networks, Yoshua Bengio

Large gradients are sensitive to noise, so we want to keep the gradient norm below 1; on the other hand, the vanishing gradient phenomenon can also occur. When the gradient is large, do not trust it: use gradient norm clipping (Mikolov's thesis).
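A minimal sketch of the gradient norm clipping idea mentioned above (the threshold of 1.0 and the function name are illustrative choices, not taken from the lecture):

```python
import numpy as np

def clip_gradient_norm(grads, max_norm=1.0):
    """Rescale a list of gradient arrays so their global L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm  # shrink all gradients by the same factor
        grads = [g * scale for g in grads]
    return grads

# Example: a large, untrustworthy gradient gets rescaled before the parameter update.
grads = [np.array([3.0, 4.0])]      # norm 5.0
print(clip_gradient_norm(grads))    # [array([0.6, 0.8])], norm 1.0
```

The direction of the update is preserved; only its magnitude is capped, which is what makes the large gradient safe to apply.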

Yoshua said that forward propagation might be used instead of back-propagation in the future.

Convolutional Neural Networks, Rob Fergus

Pooling provides feature invariance and a larger receptive field. Depth of the network is key: in deeper layers, features become more invariant to both vertical and horizontal translations. Anneal the learning rate and use small batches. It is better to take a bigger model and regularize it well than to use a smaller model. A small sketch of the pooling point follows below.
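A minimal sketch of why pooling gives local translation invariance, using a toy 2x2 max-pooling function (the function and the toy feature maps are illustrative, not from the lecture):

```python
import numpy as np

def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling on a 2D feature map (height and width assumed even)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy feature map with a single strong activation.
fmap = np.zeros((4, 4))
fmap[1, 1] = 1.0

# Shift the activation by one pixel: the pooled output is unchanged,
# since both positions fall into the same pooling window.
shifted = np.zeros((4, 4))
shifted[0, 1] = 1.0

print(max_pool_2x2(fmap))     # activation appears in the top-left pooled cell
print(max_pool_2x2(shifted))  # same pooled output despite the one-pixel shift
```

Each pooled unit also summarizes a 2x2 neighborhood, so stacking conv and pooling layers makes the effective receptive field grow with depth.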

The most interesting papers mentioned:

Computer Vision, Antonio Torralba

Natural Language Processing, Kyunghyun Cho

The summary was written by Luiza Sayfullina. Please send any comments to luiza.sayfullina@aalto.fi.