Stereo Audio Source Separation Evaluation Campaign
Development data
Download the development data (40 MB)

Description of the data

The development data are generated from four sets of source signals with 10 s duration sampled at 16 kHz:

  • 4 male speech sources
  • 4 female speech sources
  • 3 non-percussive music sources
  • 3 music sources including drums
The music sources actually last 11 s, but only the last 10 s of the resulting mixture signals are kept, to avoid border effects.
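The trimming of the music mixtures comes down to simple sample arithmetic at 16 kHz. The following is a minimal sketch in Python, assuming the mixture is held as a plain list of samples; the function name `keep_last_seconds` and the placeholder signal are hypothetical and are not part of the distributed data or tools.

```python
SAMPLE_RATE_HZ = 16000  # sampling rate stated in the data description


def keep_last_seconds(samples, seconds, sample_rate=SAMPLE_RATE_HZ):
    """Return the last `seconds` seconds of a sequence of samples."""
    n_keep = int(round(seconds * sample_rate))
    if n_keep > len(samples):
        raise ValueError("signal is shorter than the requested duration")
    return samples[len(samples) - n_keep:]


# Placeholder 11 s signal (silence), trimmed to its last 10 s.
mixture_11s = [0.0] * (11 * SAMPLE_RATE_HZ)
mixture_10s = keep_last_seconds(mixture_11s, 10)
```

With these assumptions, an 11 s signal of 176000 samples is reduced to its final 160000 samples.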

Instantaneous mixtures are all obtained using the same mixing matrix with positive coefficients. Only the first three columns are used for music mixtures.
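As an illustration of instantaneous mixing, each mixture channel is a weighted sum of the source signals, with the weights taken from one row of the mixing matrix. The sketch below is in Python and assumes a 2 x 4 matrix (two channels, four sources); the coefficient values and the function name `mix_instantaneous` are hypothetical and are not the contents of matrix.mat.

```python
def mix_instantaneous(sources, mixing_matrix):
    """Mix J mono sources into one channel per row of `mixing_matrix`.

    sources: list of J equal-length lists of samples.
    mixing_matrix: list of rows (one per channel), each with at least J gains.
    Only the first J columns of the matrix are used.
    """
    n_src = len(sources)
    n_samples = len(sources[0])
    mixture = []
    for row in mixing_matrix:
        gains = row[:n_src]  # e.g. the first three columns for music
        channel = [
            sum(g * src[t] for g, src in zip(gains, sources))
            for t in range(n_samples)
        ]
        mixture.append(channel)
    return mixture


# Hypothetical positive mixing matrix (NOT the values stored in matrix.mat).
A = [[1.0, 0.75, 0.5, 0.25],
     [0.25, 0.5, 0.75, 1.0]]

# Three toy "music" sources of two samples each.
music_sources = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
stereo = mix_instantaneous(music_sources, A)
```

Passing three sources makes the function ignore the fourth column, which mirrors how the music mixtures use only the first three columns.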

Synthetic convolutive mixtures and live recordings are obtained in a meeting room with a reverberation time of 250 ms, using omnidirectional microphones with two different spacings: 5 cm and 1 m. The positions of the sources and the microphones are illustrated below. The three loudspeakers used for music mixtures are numbers 1, 2 and 3 for synthetic convolutive mixtures, numbers 1, 3 and 4 for live recordings of non-percussive music, and numbers 1, 2 and 4 for live recordings of music including drums.

Loudspeaker and microphone positions

File naming convention

The development data consist of Matlab MAT-files and WAV audio files, which can be imported into Matlab using the commands load and wavread respectively. These files are named as follows:

  • matrix.mat: mixing matrix for instantaneous mixtures
  • <spacing>_filt.mat: mixing filter system for synthetic convolutive mixtures
  • <srcset>_src_<j>.wav: single-channel source signal
  • <srcset>_inst_sim_<j>.wav or <srcset>_<mixtype>_<spacing>_sim_<j>.wav: contribution of a source signal to the two mixture channels
  • <srcset>_inst_mix.wav or <srcset>_<mixtype>_<spacing>_mix.wav: stereo mixture signal
where <srcset> is the set of source signals (male for male speech, female for female speech, nodrums for non-percussive music, wdrums for music including drums), <mixtype> is the mixture type (inst for instantaneous mixtures, synthconv for synthetic convolutive mixtures, liverec for live recordings), <spacing> is the microphone spacing (5cm or 1m) and <j> is the source index (1 to 4 for speech, 1 to 3 for music).
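The naming convention above can be turned into a few small helpers that build file names from their parts. This is a minimal sketch in Python; the function names (`source_filename`, `image_filename`, `mixture_filename`) are hypothetical, and the per-set source counts are taken from the data description rather than from any official tool.

```python
# Number of sources per source set, as listed in the data description.
N_SOURCES = {"male": 4, "female": 4, "nodrums": 3, "wdrums": 3}
SPACINGS = ("5cm", "1m")
CONVOLUTIVE_MIXTYPES = ("synthconv", "liverec")


def _check_source(srcset, j=None):
    """Validate the source set and, if given, the source index."""
    if srcset not in N_SOURCES:
        raise ValueError(f"unknown source set: {srcset!r}")
    if j is not None and not 1 <= j <= N_SOURCES[srcset]:
        raise ValueError(f"source index {j} out of range for {srcset!r}")


def _mixture_prefix(srcset, mixtype, spacing):
    """Build the part of the name shared by mixture and image files."""
    if mixtype == "inst":
        return f"{srcset}_inst"  # instantaneous names carry no spacing
    if mixtype not in CONVOLUTIVE_MIXTYPES:
        raise ValueError(f"unknown mixture type: {mixtype!r}")
    if spacing not in SPACINGS:
        raise ValueError(f"unknown microphone spacing: {spacing!r}")
    return f"{srcset}_{mixtype}_{spacing}"


def source_filename(srcset, j):
    """Name of a single-channel source signal."""
    _check_source(srcset, j)
    return f"{srcset}_src_{j}.wav"


def image_filename(srcset, mixtype, j, spacing=None):
    """Name of the contribution of source j to the two mixture channels."""
    _check_source(srcset, j)
    return f"{_mixture_prefix(srcset, mixtype, spacing)}_sim_{j}.wav"


def mixture_filename(srcset, mixtype, spacing=None):
    """Name of a stereo mixture signal."""
    _check_source(srcset)
    return f"{_mixture_prefix(srcset, mixtype, spacing)}_mix.wav"
```

For example, `mixture_filename("male", "inst")` yields `male_inst_mix.wav`, and `image_filename("wdrums", "liverec", 2, "1m")` yields `wdrums_liverec_1m_sim_2.wav`.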

The development data are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 2.0 license. They can be freely used for non-commercial purposes and redistributed under the same license, provided the authors are acknowledged (Another Dreamer and Alex Q for music sources, Emmanuel Vincent and Hiroshi Sawada for other data).