This page contains a selection of our most important publications.
For complete listings, see the home pages of the members.

There is also a list of our older publications (1999-2008).

Y. Lu, Unsupervised Learning on Neural Network Outputs.

In Proc. of the 25th International Joint Conference on Artificial Intelligence (IJCAI), New York, July 2016.

K. Greff, A. Rasmus, M. Berglund, T. Hotloo Hao, J. Schmidhuber, and H. Valpola, Tagger: Deep Unsupervised Perceptual Grouping.

Extended version of the paper published in Proc. of the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, December 2016. arXiv:1606.06724v2.

C. K. Sønderby, T. Raiko, L. Maaløe, S. K. Sønderby, O. Winther.

Ladder Variational Autoencoders.

Published in Proc. of the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, December 2016. arXiv:1602.02282.

M. Berglund, Stochastic gradient estimate variance in contrastive divergence and persistent contrastive divergence.

In Proc. of the 2016 European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Bruges, Belgium, April 2016.

M. Abbas.

Understanding regularization by virtual adversarial training, ladder networks and others.

Presented in the workshop track of the International Conference on Learning Representations (ICLR), Puerto Rico, May 2016.

J. Luketina, M. Berglund, K. Greff, and T. Raiko.

Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters.

In Proc. of the 33rd International Conference on Machine Learning (ICML 2016), New York, USA, June 2016, pp. 2952-2960.

A. Rasmus, H. Valpola, M. Honkala, M. Berglund, and T. Raiko.

Semi-Supervised Learning with Ladder Networks.

Advances in Neural Information Processing Systems 28 (NIPS 2015), pages 3532-3540, December 2015.

Extended version available as arXiv:1507.02672 [cs.NE], July 2015.

A. Rasmus, T. Raiko, and H. Valpola.

Denoising autoencoder with modulated lateral connections learns invariant representations of natural images.

Preprint available as arXiv:1412.7210 [cs.NE], December 2014.

T. Raiko, M. Berglund, G. Alain, and L. Dinh.

Techniques for Learning Binary Stochastic Feedforward Neural Networks.

In the International Conference on Learning Representations (ICLR 2015), San Diego, May, 2015.

T. Raiko, L. Yao, K. Cho, and Y. Bengio.

Iterative Neural Autoregressive Distribution Estimator (NADE-k).

In Advances in Neural Information Processing Systems 27 (NIPS 2014), Montreal, December, 2014.

J. Karhunen, T. Raiko, and K. Cho.

Chapter 7: Unsupervised Deep Learning: A Short Survey.

In Advances in Independent Component Analysis and Learning Machines, Ella Bingham, Samuel Kaski, Jorma Laaksonen, and Jouko Lampinen editors, Academic Press, 2015.

H. Schulz, K. Cho, T. Raiko, and S. Behnke.

Two-Layer Contractive Encodings for Learning Stable Nonlinear Features.

In Neural Networks journal, special issue on Deep Learning of Representations, Yoshua Bengio and Honglak Lee editors, Elsevier, 2015.

T. Vatanen, T. Raiko, H. Valpola, and Y. LeCun.

Pushing Stochastic Gradient towards Second-Order Methods - Backpropagation Learning with Transformations in Nonlinearities.

In Lecture Notes in Computer Science, volume 8226, Neural Information Processing (ICONIP 2013), Special Session on Deep Learning and Related Technologies, pages 442-449, Springer, Heidelberg, November 2013.

T. Raiko, H. Valpola, and Y. LeCun.

Deep Learning Made Easier by Linear Transformations in Perceptrons.

In Proc. of the 15th Int. Conf. on Artificial Intelligence and Statistics (AISTATS 2012), JMLR W&CP, volume 22, pp. 924-932, La Palma, Canary Islands, April 21-23, 2012.

M. Berglund, T. Raiko, M. Honkala, L. Kärkkäinen, A. Vetek, J. Karhunen.

Bidirectional Recurrent Neural Networks as Generative Models.

Advances in Neural Information Processing Systems 28 (NIPS 2015), pages 856-864, December 2015.

J. Luttinen, T. Raiko, and A. Ilin.

Linear State-Space Model with Time-Varying Dynamics.

In Machine Learning and Knowledge Discovery in Databases (ECML), Lecture Notes in Computer Science, volume 8725, pp. 338-353, September 2014.

T. Raiko and M. Tornio.
Variational Bayesian learning of nonlinear hidden state-space models for model predictive control.
In Neurocomputing, volume 72, issues 16-18, pages 3704-3712, October 2009.

A. Ilin, H. Valpola, E. Oja. (2004). Nonlinear Dynamical Factor Analysis for State Change Detection. *IEEE Transactions on Neural Networks* 15(3), pp. 559-575.

doi:10.1109/TNN.2004.826129

H. Valpola, J. Karhunen. (2002). An Unsupervised Ensemble Learning Method for Nonlinear Dynamic State-Space Models. *Neural Computation* 14(11), pp. 2647-2692.

Gzipped postscript (654k), Pdf (937k)

M. Berglund, T. Raiko, K. Cho.

Measuring the Usefulness of Hidden Units in Boltzmann Machines with Mutual Information.

In Neural Networks journal, special issue on Deep Learning of Representations, Yoshua Bengio and Honglak Lee editors, Elsevier, 2015.

K. Cho, T. Raiko, A. Ilin, and J. Karhunen.

A Two-stage Pretraining Algorithm for Deep Boltzmann Machines.

In V. Mladenov et al. (Eds.), Artificial Neural Networks and Machine Learning - ICANN 2013,
Lecture Notes in Computer Science, volume 8131, Springer-Verlag, pages 106-113, September 2013.

K. Cho, T. Raiko, and A. Ilin.

Enhanced Gradient for Training Restricted Boltzmann Machines.

Neural Computation, Vol. 25, No. 3, pp. 805-831, March 2013.

K. Cho, A. Ilin, and T. Raiko.

Tikhonov-Type Regularization for Restricted Boltzmann Machines.

In Artificial Neural Networks and Machine Learning - ICANN 2012, Lecture Notes in Computer Science, volume 7552, pages 81-88, September 2012.

T. Hao, T. Raiko, A. Ilin, and J. Karhunen.

Gated Boltzmann Machine in Texture Modeling.

In Artificial Neural Networks and Machine Learning - ICANN 2012, Lecture Notes in Computer Science,
volume 7553, pages 124-131, September 2012.

K. Cho, T. Raiko, and A. Ilin.

Gaussian-Bernoulli Deep Boltzmann Machine.

In the proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN 2013), Dallas, Texas, August 2013.

K. Cho, T. Raiko, and A. Ilin.

Enhanced Gradient and Adaptive Learning Rate for Training
Restricted Boltzmann Machines.

In Proc. of the Int. Conf. on Machine Learning (ICML 2011), Bellevue, Washington,
June 2011.

K. Cho, A. Ilin, and T. Raiko.

Improved Learning of Gaussian-Bernoulli Restricted
Boltzmann Machines.

In Proc. of the Int. Conf. on Artificial Neural Networks (ICANN 2011), Espoo,
Finland, June 2011.

K. Cho, T. Raiko, and A. Ilin.

Parallel Tempering is Efficient for Learning
Restricted Boltzmann Machines.

In Proc. of the Int. Joint Conf. on Neural Networks (IJCNN 2010), pp. 3246-3253,
Barcelona, Spain, July 2010.

A. Honkela, T. Raiko, M. Kuusela, M. Tornio, and J. Karhunen. (2010). Approximate Riemannian Conjugate Gradient Learning for Fixed-Form Variational Bayes. Journal of Machine Learning Research, 11(Nov):3235-3268.

T. Raiko, H. Valpola, M. Harva, J. Karhunen. (2007).
Building blocks
for variational Bayesian learning of latent variable models.
*Journal of Machine Learning Research* 8(Jan), pp. 155-201.

A. Ilin, H. Valpola. (2005). On the Effect of the Form of the Posterior
Approximation in Variational Learning of ICA Models. *Neural Processing
Letters* 22(2), pp. 183-204.

doi:10.1007/s11063-005-5265-0

A. Honkela, H. Valpola. (2004). Variational learning and bits-back coding:
an information-theoretic view to Bayesian learning. *IEEE Transactions on
Neural Networks* 15(4), pp. 800-810.

doi:10.1109/TNN.2004.828762

Pdf (308k)

A. Honkela, H. Valpola, J. Karhunen. (2003). Accelerating Cyclic Update Algorithms for Parameter Estimation by Pattern Searches. *Neural Processing Letters* 17(2), pp. 191-203.

doi:10.1023/A:1023655202546

Pdf (220k)

H. Lappalainen, J. Miskin. (2000). Ensemble Learning. In M. Girolami, editor, *Advances in Independent Component Analysis*, pp. 75-92, Springer-Verlag.

Gzipped postscript (127k)

J. Luttinen, A. Ilin, and J. Karhunen. Bayesian Robust PCA of Incomplete Data. Neural Processing Letters, Online, 6 June 2012.

A. Ilin and T. Raiko. Practical Approaches to Principal Component Analysis in the Presence of Missing Values. In the Journal of Machine Learning Research (JMLR), volume 11, pages 1957-2000, July 2010.

H. Valpola, M. Harva, J. Karhunen. (2004). Hierarchical Models of Variance Sources. *Signal Processing* 84(2), pp. 267-282.

doi:10.1016/j.sigpro.2003.10.014

Pdf (1128k)

C. Jutten, M. Babaie-Zadeh, J. Karhunen. (2010). Nonlinear Mixtures. Chapter 14 in C. Jutten
and P. Comon (editors), *Handbook of Blind Source Separation, Independent Component Analysis
and Applications*, pp. 549-592, Academic Press.

Home page of the book

M. Harva, A. Kabán. (2007). Variational Learning for Rectified Factor Analysis. *Signal Processing* 87(3), pp. 509-527.

doi:10.1016/j.sigpro.2006.06.006

Pdf (655k)

A. Honkela, H. Valpola, A. Ilin, J. Karhunen. (2007). Blind Separation of Nonlinear Mixtures by Variational Bayesian Learning. *Digital Signal Processing* 17(5), pp. 914-934.

doi:10.1016/j.dsp.2007.02.009

Pdf (1961k)

A. Honkela, H. Valpola. (2005). Unsupervised Variational Bayesian Learning of Nonlinear Models. In L. Saul, Y. Weiss, L. Bottou, editors, *Advances in Neural Information Processing Systems 17*, pp. 593-600, MIT Press.

Pdf (118k)

H. Lappalainen, A. Honkela. (2000). Bayesian Nonlinear Independent Component Analysis by Multi-Layer Perceptrons. In M. Girolami, editor, *Advances in Independent Component Analysis*, pp. 93-121, Springer-Verlag.

Gzipped postscript (420k), Pdf (991k)

K. Kersting, L. D. Raedt, T. Raiko. (2006). Logical Hidden Markov Models. *Journal of Artificial Intelligence Research* 25, pp. 425-456.