Alexey Ozerov
IRISA / INRIA - Rennes
alexey.ozerov@irisa.fr

A. Ozerov, S. Arberet, and E. Vincent

ALGORITHM:
1. 200 iterations of the Generalized Expectation-Maximization algorithm from [1] are run for joint estimation of the flexible model. The following particular models were used:
   a. for speech sources: harmonic NMF (for spectral power) / rank-1 spatial covariance (see [1] for details)
   b. for music sources: NMF with K = 4 (for spectral power) / rank-1 spatial covariance (see [1] for details)
2. The sources are recovered via Wiener filtering, as described in [1], given the estimated model.

INITIALIZATION:
1. An initial mixing matrix is estimated by the DEMIX algorithm [2].
2. Initial source estimates are obtained via l0 norm minimization of the source STFTs [3], given the initial mixing matrix.
3. Initial NMF source decompositions are computed from the power spectrograms of the initial source estimates by minimizing the Kullback-Leibler divergence.

COMPUTATIONAL TIME:
Our Matlab implementation on a 2.2 GHz CPU runs for up to 25 minutes, depending on the particular configuration.

REFERENCES:
[1] A. Ozerov, E. Vincent, and F. Bimbot, "A General Modular Framework for Audio Source Separation", In: 9th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA 2010), 2010, submitted.
[2] S. Arberet, R. Gribonval, and F. Bimbot, "A robust method to count and locate audio sources in a stereophonic linear instantaneous mixture", In Proc. Int. Conf. on Independent Component Analysis and Blind Source Separation (ICA), 2006, 536-543.
[3] E. Vincent, "Complex nonconvex lp norm minimization for underdetermined source separation", In Proc. Int. Conf. on Independent Component Analysis and Blind Source Separation (ICA), 2007.
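
ILLUSTRATION (INITIALIZATION STEP 3):
As a rough illustration of initialization step 3, the sketch below factors a nonnegative power spectrogram V into W and H by decreasing the generalized Kullback-Leibler divergence. It is written in Python with NumPy rather than Matlab and is not the authors' implementation. The multiplicative update rules, the reading of K as the number of NMF components, the iteration count, the random initialization, and the names kl_nmf and kl_divergence are all assumptions made for this example; the text above only states that the Kullback-Leibler divergence is minimized.

```python
import numpy as np


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence D(V | V_hat), summed over all bins."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def kl_nmf(V, K=4, n_iter=100, eps=1e-12, seed=0):
    """Approximate a nonnegative matrix V (F x N) by W @ H with K components.

    Uses multiplicative updates for the KL divergence (an assumed choice,
    not taken from the description above). n_iter and seed are arbitrary.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    # Random strictly positive initialization (hypothetical choice).
    W = rng.random((F, K)) + eps
    H = rng.random((K, N)) + eps
    for _ in range(n_iter):
        # Update activations H with the spectral patterns W held fixed.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update spectral patterns W with the activations H held fixed.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


if __name__ == "__main__":
    # Synthetic rank-4 "power spectrogram" standing in for an initial source estimate.
    rng = np.random.default_rng(1)
    V = rng.random((64, 4)) @ rng.random((4, 100))
    W, H = kl_nmf(V, K=4)
    print("KL divergence after fitting:", kl_divergence(V, W @ H))
```

In practice V would be the squared magnitude of the STFT of one initial source estimate, and the resulting W and H would serve only as a starting point for the iterations described under ALGORITHM.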