

MOS Calculation and Statistical Analysis

Let us denote by $N$ the number of subjects in the chosen subjective method and by $u_{is}$ the evaluation of sequence $\sigma_s$ made by subject $i$. The set of values $(u_{is})_{i = 1,\cdots,N}$ will generally show variations due to differences in judgment between subjects. Moreover, some subjects may not pay enough attention during the experiment, or may behave in some unusual way when faced with the sequences; this can lead to inconsistent data for the training phase. Some statistical filtering of the raw data is thus necessary. The most widely used reference on this topic is the ITU-R BT.500-10 recommendation [67]. The procedure it describes makes it possible to remove the ratings of those subjects who could not produce consistent scores. First, denote by $\bar{u}_s$ the mean of the evaluations of sequence $\sigma_s$ over the set of subjects, that is,
\begin{displaymath}
\bar{u}_s=\frac{1}{N} \sum_{i=1}^{N}u_{is}.
\end{displaymath} (4.1)

Denote by $[\bar{u}_s-\Delta_s,\bar{u}_s+\Delta_s]$ the 95%-confidence interval obtained from the $(u_{is})$, that is, $\Delta_s = 1.96 \delta_s / \sqrt{N}$, where

\begin{displaymath}
\delta_s = \sqrt{\sum_{i=1}^{N} \frac{(u_{is} - \bar{u}_s)^2}{N-1}}.
\end{displaymath}
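
The mean, the standard deviation $\delta_s$ and the half-width $\Delta_s$ of the confidence interval can be computed directly from the ratings of one sequence. The following Python fragment is only a minimal sketch of these formulas; the function name `mean_and_ci` and the sample ratings are hypothetical and do not come from the experiments described here.

```python
import math

def mean_and_ci(scores):
    """Return (mean, delta, half_width) for the ratings of one sequence.

    scores: list of the N ratings u_is given to sequence s.
    delta is the sample standard deviation (N - 1 in the denominator),
    half_width is Delta_s = 1.96 * delta / sqrt(N).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two ratings are needed")
    mean = sum(scores) / n
    delta = math.sqrt(sum((u - mean) ** 2 for u in scores) / (n - 1))
    half_width = 1.96 * delta / math.sqrt(n)
    return mean, delta, half_width

# Hypothetical ratings of one sequence by five subjects.
ratings = [4, 5, 3, 4, 4]
mean, delta, half_width = mean_and_ci(ratings)
print(mean - half_width, mean + half_width)  # bounds of the 95% interval
```

The factor 1.96 assumes that the normal approximation is acceptable; for a small number of subjects the resulting interval should be read with some caution.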

As stated in [67], it must be ascertained whether this distribution of scores is normal or not using the $\beta_2$ test (by calculating the ``kurtosis'' coefficient of the function, i.e. the ratio of the fourth order moment to the square of the second order moment). If $\beta_2$ is between 2 and 4, the distribution may be taken to be normal. In symbols, denoting $\beta_{2s} = m_{4s}/m^2_{2s}$ where

\begin{displaymath}m_{xs} = \frac{1}{N} \sum_{i=1}^{N} (u_{is}-\bar{u}_s)^x, \end{displaymath}

if $2 \leq \beta_{2s} \leq 4$ then the distribution $(u_{is})_{i = 1,\cdots,N}$ can be assumed to be normal. For each subject $i$, we must compute two integer values $L_i$ and $R_i$, according to the following procedure:

set $L_i = 0$ and $R_i = 0$
for each sequence $\sigma_s \in {\cal S} = \{ \sigma_1,\cdots,\sigma_S\}$
    if $2 \leq \beta_{2s} \leq 4$, then
        if $u_{is} \geq \bar{u}_s + 2 \, \delta_s$ then $R_i=R_i+1$
        if $u_{is} \leq \bar{u}_s - 2 \, \delta_s$ then $L_i=L_i+1$
    else
        if $u_{is} \geq \bar{u}_s + \sqrt{20} \, \delta_s$ then $R_i=R_i+1$
        if $u_{is} \leq \bar{u}_s - \sqrt{20} \, \delta_s$ then $L_i=L_i+1$

Finally, if $(L_i + R_i)/S > 0.05$ and $\vert(L_i - R_i)/(L_i + R_i)\vert < 0.3$ then the scores of subject $i$ must be deleted. For more details about this topic and the other methods of subjective tests see [67]. After eliminating, with the above technique, the scores of those subjects who could not give coherent ratings, the mean score should be recomputed using Eq. 4.1. The resulting values constitute the MOS database that we will use to train and test the NN.
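
The whole screening procedure, from the $\beta_2$ test to the recomputation of the mean scores, can be sketched as follows. This is only an illustration under stated assumptions, not the implementation used for the experiments: the names `screen_subjects` and `mos`, the layout `u[i][s]` of the score table and the toy data are hypothetical, and skipping sequences on which all subjects agree (zero variance, for which $\beta_2$ is undefined) is a choice made here, not something taken from [67].

```python
import math

def screen_subjects(u):
    """Return the set of subject indices whose scores should be deleted.

    u[i][s] is the score given by subject i to sequence s.
    """
    n, n_seq = len(u), len(u[0])
    left = [0] * n    # L_i: scores far below the mean
    right = [0] * n   # R_i: scores far above the mean
    for s in range(n_seq):
        col = [u[i][s] for i in range(n)]
        mean = sum(col) / n
        m2 = sum((x - mean) ** 2 for x in col) / n
        if m2 == 0:
            continue  # assumption: beta_2 undefined, sequence skipped
        m4 = sum((x - mean) ** 4 for x in col) / n
        beta2 = m4 / m2 ** 2
        delta = math.sqrt(m2 * n / (n - 1))  # N - 1 in the denominator
        k = 2.0 if 2 <= beta2 <= 4 else math.sqrt(20)
        for i in range(n):
            if col[i] >= mean + k * delta:
                right[i] += 1
            if col[i] <= mean - k * delta:
                left[i] += 1
    rejected = set()
    for i in range(n):
        total = left[i] + right[i]
        # The first test guarantees total > 0 before the division.
        if total / n_seq > 0.05 and abs((left[i] - right[i]) / total) < 0.3:
            rejected.add(i)
    return rejected

def mos(u, rejected=frozenset()):
    """Mean score of each sequence over the subjects that were kept."""
    kept = [row for i, row in enumerate(u) if i not in rejected]
    if not kept:
        raise ValueError("all subjects were rejected")
    return [sum(row[s] for row in kept) / len(kept)
            for s in range(len(u[0]))]

# Hypothetical data: 25 subjects, 4 sequences, almost everyone answers 3.
u = [[3] * 4 for _ in range(25)]
u[0][0], u[0][1] = 5, 1   # subject 0 is erratic in both directions
u[1][2], u[1][3] = 5, 5   # subject 1 is consistently higher
rejected = screen_subjects(u)
print(rejected, mos(u, rejected))
```

On this toy data subject 0 has outlying scores on both sides of the mean and is removed, whereas subject 1, whose deviations all go in the same direction, is kept: the second condition is there to preserve subjects who are merely shifted with respect to the others.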
Samir Mohamed 2003-01-08