Alexey Ozerov
Institut TELECOM; TELECOM ParisTech; CNRS LTCI - Signal and Image Processing Department
alexey.ozerov@telecom-paristech.fr

A. Ozerov & C. Févotte (EM-NMF)

ALGORITHM:

We use the Expectation-Maximization (EM) algorithm for multichannel Nonnegative Matrix Factorization (NMF) in convolutive mixtures proposed in [1], in the following setting:

1. Every stereo source image is modeled as
   - an instantaneous point source image, or
   - a convolutive point source image, or
   - a sum of several convolutive point source images (for images of non-point sources).
   More precisely:
   a) for "Tamy - Que pena tanto faz":
      1. "vocals" are modeled as an instantaneous point source image with 8 NMF components
      2. "guitar" is modeled as a sum of 3 convolutive point source images, each with 3 NMF components
   b) for "Bearlin - Roads":
      1. "bass" is modeled as an instantaneous point source image with 6 NMF components
      2. "vocals" are modeled as an instantaneous point source image with 6 NMF components
      3. "piano" is modeled as a convolutive point source image with 6 NMF components
      4. "remaining background" is modeled as a sum of 3 convolutive point source images, each with 4 NMF components
2. For each source image, a mixing system A_i and an NMF model (spectral patterns W_i and activation gains H_i) are estimated from the development data using either the EM algorithm for multichannel NMF [1] or single-channel NMF with the Itakura-Saito divergence [2].
3. The parameters (A_i, W_i, H_i) of the different sources are grouped together to form (A, W, H), and H is discarded.
4. With A and W held fixed, H is estimated from the test data using 200 iterations of the EM algorithm for multichannel NMF [1].
5. The sources are recovered via Wiener filtering, as described in [1], given the estimated model (A, W and H).

COMPUTATIONAL TIME:

Our Matlab implementation, run on a 2.2 GHz CPU, takes
   a. 3750 seconds for "Tamy - Que pena tanto faz"
   b. 7540 seconds for "Bearlin - Roads"

REFERENCES:

[1] A. Ozerov and C. Févotte, "Multichannel nonnegative matrix factorization in convolutive mixtures. With application to blind audio source separation", in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'09), 2009 (submitted).
[2] C. Févotte, N. Bertin, and J.-L. Durrieu, "Nonnegative matrix factorization with the Itakura-Saito divergence. With application to music analysis", Tech. Rep., 2008, in press. http://www.tsi.enst.fr/~fevotte/TechRep/techrep08_is-nmf.pdf
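
ILLUSTRATION (WIENER FILTERING, STEP 5):

The following is a minimal sketch of the Wiener filtering in step 5 of the algorithm, restricted to a single time-frequency bin of a stereo mixture. It is not the authors' Matlab implementation. It assumes that each point source has a rank-1 spatial covariance a_j a_j^H built from its mixing vector a_j at that frequency, and that the source variance v_j is given by the NMF model (the sum over components k of W[f,k] * H[k,n]). The function names, the example values and the optional noise_var parameter are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: multichannel Wiener filtering at ONE time-frequency bin
# of a stereo mixture of point sources. Pure Python, complex arithmetic only.

def outer(a):
    """Rank-1 spatial covariance a a^H of a 2-element mixing vector."""
    return [[complex(a[i]) * complex(a[j]).conjugate() for j in range(2)]
            for i in range(2)]

def inv2(m):
    """Inverse of a 2x2 complex matrix (assumes a nonzero determinant)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matvec(m, x):
    """Product of a 2x2 matrix with a 2-element vector."""
    return [m[0][0] * x[0] + m[0][1] * x[1],
            m[1][0] * x[0] + m[1][1] * x[1]]

def wiener_images(x, mixing, variances, noise_var=0.0):
    """Estimate the stereo image of each source at one bin.

    x         : observed mixture coefficients [left, right]
    mixing    : list of mixing vectors a_j, one per point source
    variances : list of source variances v_j from the NMF model
    noise_var : assumed isotropic noise variance (illustrative parameter)
    """
    # Covariance of each source image: v_j * a_j a_j^H.
    covs = []
    for a, v in zip(mixing, variances):
        r = outer(a)
        covs.append([[v * r[i][j] for j in range(2)] for i in range(2)])
    # Mixture covariance: sum of the image covariances plus diagonal noise.
    sigma_x = [[sum(c[i][j] for c in covs) + (noise_var if i == j else 0.0)
                for j in range(2)] for i in range(2)]
    y = matvec(inv2(sigma_x), x)           # Sigma_x^{-1} x
    return [matvec(c, y) for c in covs]    # Sigma_j Sigma_x^{-1} x per source

# Example values (made up): two sources, one real-valued and one complex mixing vector.
x = [1.0 + 0.5j, 0.2 - 0.3j]
mixing = [[1.0, 0.6], [0.5 - 0.2j, 1.0]]
variances = [2.0, 0.7]
images = wiener_images(x, mixing, variances)
```

With noise_var set to 0 and linearly independent mixing vectors, the estimated source images sum back to the observed mixture, which is a convenient sanity check. A full separation would apply the same computation at every time-frequency bin and invert the time-frequency transform afterwards.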