The data is processed with the algorithm described in [1].
First, the mixing parameters of all the sources are estimated from the data with an interleaved weighted ICA algorithm, followed by a method that reduces the permutations based on spatio-temporal coherence [1]. Then, the signals are separated by L0-norm minimization at each time-frequency (TF) point. As in [2], an additional post-processing step based on a Wiener filter is applied at the end of the enhancement chain; the power spectral density is evaluated from the STFT of each channel of the estimated source images [1][2].
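The separation step at a single TF point can be sketched as follows. This is only an illustrative sketch, not the implementation of [1]: it assumes that the mixing matrix A for the current frequency bin is already known, that at most k sources are active at the point, and that the best support is the one with the smallest least-squares residual. The function name l0_separate_tf_point and the parameter k are hypothetical.

```python
from itertools import combinations

import numpy as np


def l0_separate_tf_point(x, A, k):
    """Return the k-sparse source vector s that minimizes ||x - A s||.

    x : (n_mics,) complex mixture STFT coefficients at one TF point
    A : (n_mics, n_sources) complex mixing matrix for this frequency bin
    k : assumed maximum number of simultaneously active sources (hypothetical)
    """
    n_sources = A.shape[1]
    best_s = np.zeros(n_sources, dtype=complex)
    best_err = np.inf
    # Exhaustive search over every support of size k.
    for support in combinations(range(n_sources), k):
        idx = list(support)
        cols = A[:, idx]
        # Least-squares fit of the mixture using only the selected sources.
        coef, *_ = np.linalg.lstsq(cols, x, rcond=None)
        err = np.linalg.norm(x - cols @ coef)
        if err < best_err:
            best_err = err
            best_s = np.zeros(n_sources, dtype=complex)
            best_s[idx] = coef
    return best_s


# Toy example: 2 microphones, 3 sources, only source 1 active.
A = np.array([[1.0, 1.0, 1.0],
              [1.0j, -1.0, 0.5]], dtype=complex)
s_true = np.array([0.0, 2.0 + 1.0j, 0.0], dtype=complex)
s_hat = l0_separate_tf_point(A @ s_true, A, k=1)
```

Under these assumptions, the image of source n at the microphones is A[:, n] * s_hat[n]. The exhaustive search is affordable only because the number of sources per mixture is small.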
NOTE:
Algorithm "Nesta1" uses only the L0 norm minimization.
Algorithm "Nesta2" uses the L0 norm minimization and the Wiener postprocessing.
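A Wiener post-filter of the kind used by "Nesta2" can be sketched as below. This is a minimal sketch under stated assumptions, not the filter of [1] or [2]: the power spectral density of each source is taken as the squared magnitude of the STFT of its estimated image (no smoothing), and the resulting gain is applied to the mixture STFT channel by channel. The function name wiener_postfilter, the array layout and the eps regularizer are hypothetical.

```python
import numpy as np


def wiener_postfilter(images, mixture, eps=1e-12):
    """Refine estimated source images with a per-channel Wiener gain.

    images  : (n_sources, n_channels, n_freq, n_frames) complex STFTs of
              the estimated source images
    mixture : (n_channels, n_freq, n_frames) complex STFT of the mixture
    eps     : small constant that avoids division by zero (assumption)
    """
    # PSD estimate of each source in each channel (assumed: no smoothing).
    psd = np.abs(images) ** 2
    # Wiener gain: share of the total power attributed to each source.
    gain = psd / (psd.sum(axis=0, keepdims=True) + eps)
    # Apply the real-valued gain to the mixture, channel by channel.
    return gain * mixture[np.newaxis, ...]


# Toy example: 3 sources, 2 channels, 5 frequency bins, 4 frames.
rng = np.random.default_rng(0)
shape = (3, 2, 5, 4)
images = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mixture = images.sum(axis=0)
filtered = wiener_postfilter(images, mixture)
```

Because the gains of all sources sum to approximately one at every TF point, the filtered images still add up to the mixture.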
The non-optimized Matlab code takes 5-10 minutes per audio file with the current parameter settings on an Intel Xeon E5520 CPU @ 2.27GHz. With different parameters, the computation time can be reduced to about 1-2 minutes with a very small loss of performance.
[1] "Convolutive underdetermined source separation through weighted Interleaved ICA and spatio-temporal correlation", Francesco Nesta, Maurizio Omologo, submitted to LVA/ICA 2012, Tel Aviv
[2] "Robust Automatic Speech Recognition through On-line Semi-Blind Source Extraction", Francesco Nesta and Marco Matassoni, CHIME Workshop 2011, Florence (Italy)
http://spandh.dcs.shef.ac.uk/projects/chime/workshop/papers/pS22_nesta.pdf