CONTACT AND AFFILIATION
-----------------------

* Contact: F. Mustiere (1,2); M. Bouchard (2); H. Najaf-Zadeh (1); R. Pichevar (1); L. Thibault (1)
  e-mail: fmustier@crc.gc.ca
* Affiliation: (1) Communications Research Centre, Ottawa, Canada
               (2) University of Ottawa, Ottawa, Canada

BASIC INFORMATION
-----------------

* Strategy used: 1 ("process each mixture (= 1 isolated sentence) alone")
* We have submitted three different algorithms, codenamed "TF", "LSBEAM", and "LSBEAM+TF" (described in more detail below).
* For all algorithms, except in the living room environment, the DOA estimation is done with the reference code provided (C. Blandin, A. Ozerov and E. Vincent, "Multi-source TDOA estimation in reverberant audio using angular spectra and clustering"), as it was the fastest and best-performing among the algorithms we tested.
* Algorithm "TF": a subband version of a highly simplified state-space based method (elementary Kalman filters), in which the observation variance is deliberately under- or overestimated to remove less or more noise, respectively. It is similar to the method in [1]. Except for the living room conditions, the noise PSD estimate is based on a diffuse noise assumption and is obtained by solving for the transfer function from the source signal to each channel. The noise estimation is similar in essence to the one proposed in [2]. For the living room conditions, the noise is estimated via an approximate null-beamformer, assuming that the ratio of the transfer functions from the source to each ear is unity (noise estimation is likely the weakest link in this environment).
* Algorithm "LSBEAM": a least-squares (LS) beamformer tuned for naturalness and for preserving spatial cues, with no assumptions on the noise. The basic LS beamformer is found in [3]. It assumes a free-field model for each transfer function from the source to the microphones, and thus could not be applied to the living room environment.
* Algorithm "LSBEAM+TF": a combination of the two algorithms above, both tuned to be significantly more aggressive, except at high frequencies, where the parameters are tuned for soft enhancement. Like the LSBEAM algorithm, it was not applied to the living room environment.
* Average running time: the algorithms were run on an AMD V120 processor (2.2 GHz) with 3 GB of RAM, under Linux Debian 6.0.1 and Matlab 7.11.
    TF (unoptimized code): about 20 seconds per file (assuming the filterbank has already been designed), except for the living room files, for which it takes 5-10 seconds
    LSBEAM: 2 seconds per file
    LSBEAM+TF: about 25 seconds

REFERENCES
----------

[1] F. Mustiere, M. Bolic, M. Bouchard, "Real-world particle filtering based speech enhancement", CIP 2010.
[2] H. Kamkar-Parsi, M. Bouchard, "Improved noise power spectrum density estimation for binaural hearing aids operating in a diffuse noise field environment", TASLP, 2009.
[3] J. Benesty, J. Chen, Y. Huang, Microphone Array Signal Processing, Springer, 2008.
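
The "elementary Kalman filter" idea behind the "TF" algorithm can be illustrated with a minimal sketch. This is not the submitted Matlab code. It assumes a real-valued signal in a single subband that follows a first-order model x[t] = a*x[t-1] + w[t], observed as y[t] = x[t] + v[t]. The function name kalman_denoise, the parameters a, q, r and alpha, and the initial conditions are all hypothetical choices made for this illustration. The factor alpha scales the observation (noise) variance: alpha > 1 overestimates it and removes more noise, alpha < 1 underestimates it and removes less.

```python
def kalman_denoise(y, a, q, r, alpha=1.0):
    """Scalar Kalman filter for one subband (illustrative sketch only).

    Assumed model (not taken from the submission):
        x[t] = a * x[t-1] + w[t],   var(w) = q   (clean signal)
        y[t] = x[t] + v[t],         var(v) = r   (noisy observation)

    alpha scales the observation variance used by the filter:
    alpha > 1 overestimates the noise (more smoothing),
    alpha < 1 underestimates it (less smoothing).
    """
    r_used = alpha * r      # deliberately biased observation variance
    x_est = 0.0             # assumed initial state estimate
    p = q                   # assumed initial error variance
    out = []
    for obs in y:
        # Prediction step
        x_pred = a * x_est
        p_pred = a * a * p + q
        # Update step: the gain shrinks as r_used grows
        k = p_pred / (p_pred + r_used)
        x_est = x_pred + k * (obs - x_pred)
        p = (1.0 - k) * p_pred
        out.append(x_est)
    return out
```

Calling kalman_denoise(y, a=0.9, q=0.1, r=1.0, alpha=2.0) on a list of subband samples y follows the observations less closely than the same call with alpha=0.5, which is the "remove more / remove less noise" trade-off described for "TF". In a subband implementation, one such filter would be run per band, with r taken from the noise PSD estimate for that band.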