Nonlinear blind source separation

Today, several good methods exist for standard linear independent component analysis (ICA) and blind source separation (BSS) (Hyvärinen et al., 2001; Cichocki and Amari, 2002). The extension of ICA and the closely related factor analysis (FA) (Hyvärinen et al., 2001) as well as BSS to nonlinear mixtures is much more difficult due to severe uniqueness and computational problems. In particular, there exist infinitely many nonlinear ICA solutions, while the nonlinear BSS problem can be regularized using suitable constraints (Jutten and Karhunen, 2004). Several of the approaches proposed for nonlinear BSS can be applied to small-scale problems only due to their exponentially growing computational requirements. See the recent review chapter (Jutten, Babaie-Zadeh, and Karhunen, 2010) as well as the earlier review paper (Jutten and Karhunen, 2004) and our nonlinear ICA page for more information and references on nonlinear ICA and BSS.

We have applied variational Bayesian learning to nonlinear FA (NFA) and BSS problems. The methods are based on the generative model

x(t) = f(s(t), θ_f) + n(t), (1)

where x(t) is the observed data vector, s(t) is the corresponding vector of hidden sources, n(t) is additive noise, and f is the nonlinear mapping from sources to observations parameterized by θ_f.

Most of our work, starting from (Lappalainen and Honkela, 2000), models the nonlinear generative mapping f from sources to mixtures using well-known multi-layer perceptron (MLP) neural networks with sigmoidal tanh nonlinearities. MLP networks are well-suited for nonlinear BSS because, at least in principle, they can model any type of nonlinearity, and smooth, nearly linear mappings are particularly easy for them to represent. This makes it possible to learn high-dimensional nonlinear representations in practice. Our early work on this topic is described in (Lappalainen and Honkela, 2000), where nonlinear Bayesian factor analysis is applied to a fairly high-dimensional real-world nonlinear mixture of 30 components.
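As a concrete sketch of this setup, the generative model (1) with a one-hidden-layer tanh MLP as f can be simulated in NumPy as follows. All dimensions, weights, and noise levels here are illustrative assumptions, not those of any published experiment:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_mixing(s, W1, b1, W2, b2):
    """One-hidden-layer tanh MLP f(s, theta_f) mapping sources to observations."""
    return W2 @ np.tanh(W1 @ s + b1) + b2

# Hypothetical dimensions: 30 observed signals, 4 sources, 20 hidden units.
n_obs, n_src, n_hid, n_t = 30, 4, 20, 1000
W1 = rng.normal(size=(n_hid, n_src))
b1 = rng.normal(size=(n_hid, 1))
W2 = rng.normal(size=(n_obs, n_hid))
b2 = rng.normal(size=(n_obs, 1))

s = rng.normal(size=(n_src, n_t))           # hidden sources s(t)
noise = 0.01 * rng.normal(size=(n_obs, n_t))
x = mlp_mixing(s, W1, b1, W2, b2) + noise   # observations x(t), as in Eq. (1)
```

In the variational Bayesian methods discussed here, the weights, sources, and noise variances are of course latent variables with posterior approximations rather than fixed arrays; the sketch only shows the forward generative mapping.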

Much of our more recent work on nonlinear BSS is summarized in (Honkela et al., 2007). New developments of the basic method include a more accurate linearization of the nonlinearity (Honkela and Valpola, 2005) and using kernel PCA for initialization to help avoid local minima and speed up convergence (Honkela et al., 2004). The new linearization increases stability and accuracy of the method in problems with a large number of sources. The new kernel PCA initialization can lead to significant improvement in separation results when the mixing is strongly nonlinear if the applied kernel is chosen suitably.
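The initialization idea can be sketched with scikit-learn's KernelPCA; this is not the authors' implementation, and the RBF kernel, its width, and the toy mixture below are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)

# Toy nonlinear mixture: 2 sources observed through 5 tanh-distorted channels.
s = rng.uniform(-1, 1, size=(1000, 2))
A = rng.normal(size=(2, 5))
x = np.tanh(s @ A) + 0.01 * rng.normal(size=(1000, 5))

# Kernel PCA with a suitably chosen kernel provides an initial source estimate;
# the kernel family and gamma strongly affect how useful the initialization is.
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=1.0)
s_init = kpca.fit_transform(x)  # used to initialize the variational learning
```

The variational learning then starts from s_init instead of a random or linear-PCA initialization, which is what helps avoid poor local minima when the mixing is strongly nonlinear.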

[An illustration of the HNFA model] The nonlinear mapping f used in hierarchical nonlinear factor analysis (HNFA) (Valpola, Östman, and Karhunen, 2003) is illustrated in the figure on the right. Compared to the standard MLP used before, the values of the hidden nodes are treated as latent variables to decrease the computational complexity. As this makes modeling the nonlinearity less efficient, a linear shortcut mapping was added to compensate. Still, HNFA is not as good as the MLP-based NFA at modeling highly nonlinear mappings.
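The hierarchical structure can be sketched roughly as follows. This NumPy simulation is schematic, not the published model: tanh stands in for the actual nonlinearity, and all dimensions, weights, and noise levels are made-up assumptions. The point is that the hidden-layer values h are themselves noisy latent variables, and a linear shortcut carries the linear part of the mapping:

```python
import numpy as np

rng = np.random.default_rng(1)
n_src, n_hid, n_obs, n_t = 3, 10, 8, 500

B = rng.normal(size=(n_hid, n_src))   # sources -> hidden layer
A = rng.normal(size=(n_obs, n_hid))   # hidden layer -> observations
C = rng.normal(size=(n_obs, n_src))   # linear shortcut mapping

s = rng.normal(size=(n_src, n_t))
# Hidden-node values are latent variables with their own noise term; treating
# them this way is what keeps the inference cost linear in the model size.
h = B @ s + 0.1 * rng.normal(size=(n_hid, n_t))
x = A @ np.tanh(h) + C @ s + 0.01 * rng.normal(size=(n_obs, n_t))
```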

HNFA is applicable to larger problems than the MLP-based method, as its computational complexity is linear with respect to the number of sources. The efficient pruning facilities of the Bayes Blocks framework also allow determining whether the nonlinearity is really needed, and pruning it out when the mixing is linear, as demonstrated in (Honkela et al., 2005).

In (Ilin and Honkela, 2004; Honkela et al., 2007), we have applied variational Bayesian learning to the important special case of post-nonlinear (PNL) mixtures (Hyvärinen et al., 2001; Jutten and Karhunen, 2004). The PNL model consists of a linear mixture followed by nonlinear distortions applied separately to each component of the linear mixture. In the PNL case, the uniqueness conditions become essentially the same as for linear ICA and BSS problems. In (Ilin and Honkela, 2004), we have shown that the developed Bayesian method can achieve separation of signals in a very challenging post-nonlinear BSS problem with non-invertible post-nonlinearities that is not separable using standard techniques (Jutten and Karhunen, 2004).
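The PNL structure is easy to illustrate. The following NumPy sketch builds a toy two-source PNL mixture (all numbers are illustrative assumptions), using a squaring distortion on the second channel as an example of a non-invertible post-nonlinearity of the kind considered in (Ilin and Honkela, 2004):

```python
import numpy as np

rng = np.random.default_rng(2)
n_src, n_t = 2, 1000

s = rng.uniform(-1, 1, size=(n_src, n_t))   # independent sources
A = np.array([[1.0, 0.6],
              [0.5, 1.0]])                   # linear mixing matrix

z = A @ s                                    # linear mixture
# Componentwise post-nonlinear distortions; the second one (z**2) is
# non-invertible, which defeats standard PNL separation techniques.
x = np.vstack([np.tanh(z[0]), z[1] ** 2])
```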

Free implementations of the NFA and HNFA methods are available on our software pages.

Non-negative blind source separation by rectified factor analysis

Linear BSS, ICA, and FA models with non-negativity constraints have been considered by many authors, especially in applications to digital image data, where the pixels have non-negative values. In the variational Bayesian framework, positivity of the factors or sources can be achieved by placing a non-negatively supported prior on the factors. The rectified Gaussian distribution is particularly convenient, as it is conjugate to the Gaussian likelihood arising in the FA model. Unfortunately, this solution suffers from serious technical limitations, which can be circumvented by reformulating the model using rectification nonlinearities (Harva and Kabán, 2007). That paper derives a variational learning procedure for the proposed model and shows that it indeed overcomes the problems of the related approaches. The method has been applied to the analysis of galaxy spectra; see the section on applications to astronomy.
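The rectification idea can be sketched as follows. This is an illustrative NumPy simulation of the generative side only (the paper embeds the rectification in a full variational Bayesian FA model, which is not reproduced here): Gaussian latent factors are passed through an elementwise rectification, yielding non-negative sources that are then mixed linearly:

```python
import numpy as np

rng = np.random.default_rng(3)
n_src, n_obs, n_t = 3, 12, 400

# Gaussian latent factors pushed through a rectification nonlinearity give
# non-negative sources without needing a rectified-Gaussian prior.
r = rng.normal(size=(n_src, n_t))
s = np.maximum(r, 0.0)                 # rectified factors, s >= 0

W = rng.normal(size=(n_obs, n_src))    # linear mixing
x = W @ s + 0.05 * rng.normal(size=(n_obs, n_t))
```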

References

A. Cichocki and S.-I. Amari, Adaptive Blind Signal and Image Processing. Wiley, 2002.

M. Harva and A. Kabán, "Variational learning for rectified factor analysis". Signal Processing, vol. 87, no. 3, 2007, pp. 509-527.

A. Honkela, S. Harmeling, L. Lundqvist, and H. Valpola, "Using kernel PCA for initialisation of variational Bayesian nonlinear blind source separation method". In C. Puntonet and A. Prieto (Eds.), Proc. of the 5th Int. Conf. on Independent Component Analysis and Blind Signal Separation (ICA 2004), Granada, Spain, September 2004, pp. 790-797.

A. Honkela, T. Östman, and R. Vigário, "Empirical evidence of the linear nature of magnetoencephalograms". In Proc. 13th European Symp. on Artificial Neural Networks (ESANN 2005), Bruges, Belgium, April 2005, pp. 285-290.

A. Honkela and H. Valpola, "Unsupervised variational Bayesian learning of nonlinear models". In L. Saul, Y. Weiss, and L. Bottou (Eds.), Advances in Neural Information Processing Systems 17, MIT Press, 2005, pp. 593-600.

A. Honkela, H. Valpola, A. Ilin, and J. Karhunen, "Blind separation of nonlinear mixtures by variational Bayesian learning". Digital Signal Processing, 2007. doi:10.1016/j.dsp.2007.02.009

A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis. Wiley, 2001.

A. Ilin and A. Honkela, "Postnonlinear independent component analysis by variational Bayesian learning". In C. Puntonet and A. Prieto (Eds.), Proc. of the 5th Int. Conf. on Independent Component Analysis and Blind Signal Separation (ICA 2004), Granada, Spain, September 2004, pp. 766-773.

C. Jutten, M. Babaie-Zadeh, and J. Karhunen, "Nonlinear mixtures". Chapter 14 in C. Jutten and P. Comon (Eds.), Handbook of Blind Source Separation, Independent Component Analysis and Applications, Academic Press, 2010, pp. 549-592.

C. Jutten and J. Karhunen, "Advances in blind source separation (BSS) and independent component analysis (ICA) for nonlinear mixtures". Int. J. of Neural Systems, vol. 14, no. 5, 2004, pp. 267-292.

H. Lappalainen and A. Honkela, "Bayesian nonlinear independent component analysis by multi-layer perceptrons". In M. Girolami (Ed.), Advances in Independent Component Analysis, Springer, 2000, pp. 93-121.

H. Valpola, T. Östman, and J. Karhunen, "Nonlinear independent factor analysis by hierarchical models". In Proc. 4th Int. Symp. on Independent Component Analysis and Blind Signal Separation (ICA 2003), Nara, Japan, April 2003, pp. 257-262.