In Keronen et al. (2013), we propose using a Gaussian restricted Boltzmann
machine (GRBM; Cho et al., 2011) to extract features from the cross-correlation
coefficients of the stereo channels of speech. By simply plugging the GRBM into
the existing speech recognition pipeline (Keronen et al., 2012), we were able to
improve keyword recognition performance in noisy environments.
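
The feature-extraction idea can be illustrated with a minimal sketch. It is not the pipeline from the paper. The frame length, lag range, number of hidden units, and the randomly initialised (untrained) parameters are all assumptions made for illustration, and the function names are hypothetical. The hidden-unit conditional assumes a common GRBM parameterization in which each visible unit is scaled by its variance.

```python
import numpy as np

def cross_correlation_coeffs(left, right, max_lag):
    """Normalized cross-correlation of two equal-length frames
    for lags -max_lag..max_lag (illustrative helper)."""
    left = left - left.mean()
    right = right - right.mean()
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2)) + 1e-12
    full = np.correlate(left, right, mode="full")  # lags -(N-1)..(N-1)
    mid = len(left) - 1                            # index of zero lag
    return full[mid - max_lag: mid + max_lag + 1] / norm

def grbm_hidden_probs(v, W, c, sigma):
    """p(h_j = 1 | v) = sigmoid(c_j + sum_i W_ij * v_i / sigma_i^2).
    Assumed parameterization; W, c, sigma would normally be learned."""
    activation = c + (v / sigma ** 2) @ W
    return 1.0 / (1.0 + np.exp(-activation))

# Hypothetical usage: one synthetic stereo frame, untrained GRBM parameters.
rng = np.random.default_rng(0)
left = rng.standard_normal(256)
right = np.roll(left, 3) + 0.1 * rng.standard_normal(256)  # delayed, noisy copy
v = cross_correlation_coeffs(left, right, max_lag=8)       # 17 coefficients
W = 0.1 * rng.standard_normal((v.size, 5))                 # 5 hidden units (assumed)
features = grbm_hidden_probs(v, W, np.zeros(5), np.ones(v.size))
```

In this sketch the vector of hidden-unit probabilities plays the role of the extracted features; in practice the GRBM parameters would be trained on cross-correlation vectors before use.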
References
Keronen, S., Cho, K., Raiko, T., Ilin, A., and Palomäki, K. Gaussian-Bernoulli restricted Boltzmann machines and automatic feature extraction for noise robust missing data mask estimation. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013). May 2013. (to appear)
Cho, K., Ilin, A., and Raiko, T. Improved Learning of Gaussian-Bernoulli Restricted Boltzmann Machines. In Proceedings of the International Conference on Artificial Neural Networks (ICANN 2011). Espoo, Finland. June 2011.
Keronen, S., Kallasjoki, H., Remes, U., Brown, G. J., Gemmeke, J. F., and Palomäki, K. Mask estimation and imputation methods for missing data speech recognition in a multisource reverberant environment. Computer Speech and Language, 27:3, 2013.