
Algorithms' Performance Comparison

We compare here the performance of the different training algorithms (GD, LM, LM1, LM2 and AM-LM) for RNN. For the two previously defined problems, we measure the convergence time and the number of iterations needed to reach a fixed MSE goal: $5\times 10^{-5}$ for the first problem and $4\times 10^{-7}$ for the second. For the GD algorithm, we set $\eta=0.1$. For the LM algorithm, we fixed $\beta$ to 2 and the initial value of $\mu$ to 0.1. For AM-LM, the values of $\zeta$ and $dP$ are fixed to 0.90 and 0.5 respectively.

Figures 10.5 and 10.6 show the variation of the MSE against the number of iterations, as well as the total time taken to reach the MSE goal, for the first and second problems respectively. From Figures 10.5(a) and 10.5(b), we can see that LM considerably outperforms GD: GD takes 20.01 sec. and 112 iterations to reach the error goal, against 1.2 sec. and 9 iterations for LM, so LM converges roughly 17 times faster. For the second problem, as shown in Figures 10.6(a) and 10.6(b), GD takes 38 sec. and 270 iterations against 0.87 sec. and 7 iterations for LM; here LM is 43 times faster than GD.

Keeping the weights non-negative degrades the performance of the training algorithms for RNN. This can be seen from Figures 10.5(b) and 10.5(c) for the first problem, and from Figures 10.6(b), 10.6(c) and 10.6(d) for the second. LM performs better than both LM1 and LM2, and LM1 in turn performs better than LM2. For the first problem, LM1 does not converge at all: most of the weights are zeroed by the constraint, so the Hessian matrix becomes singular. Nevertheless, LM and LM2 generally still perform better than GD. It should be noted that RNNs (like ANNs) are meant to be used as black boxes once they are correctly trained: what matters is that the black box behaves correctly and efficiently, not its internal implementation. We therefore suggest that the non-negative weights constraint not be used, as it greatly degrades the performance of the training algorithms. The price to pay, of course, is that the nice probabilistic interpretation of the weights is lost.

From Figure 10.5(d), we can see that AM-LM outperforms all the other training algorithms. It is about 3 times faster than LM and 52 times faster than GD for the first problem. Moreover, in three iterations it converges to an MSE of $10^{-7}$, against $5\times 10^{-5}$ for the other training algorithms. As we will see next, for the first problem LM can converge in as few as 2 steps and 0.22 sec., while AM-LM can converge in a single step and 0.14 sec.; in some cases the two algorithms even reach exactly zero error. The XOR problem is solved by the AM-LM algorithm in 1 to 8 iterations, depending on the initialization of the weights, and in some cases it converges to zero error. Using GD, 2140 iterations are needed to reach an MSE of $5\times 10^{-5}$ [87].
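To make the comparison concrete, the following is a minimal numpy sketch of the damped Gauss-Newton update that underlies the LM results above, using the parameter values of these experiments ($\mu_0=0.1$, $\beta=2$). It is an illustration under stated assumptions, not the chapter's exact implementation: the hypothetical forward and jac callbacks stand in for the RNN forward pass and the Jacobian of the residual vector, the clamp flag mimics the non-negativity projection of the LM1/LM2 variants (whose zeroed weights can render the approximate Hessian singular, as observed for LM1), and the adaptive momentum of AM-LM (governed by $\zeta$ and $dP$) is omitted.

import numpy as np

def lm_step(w, J, e, mu, clamp=False):
    """One damped (Levenberg-Marquardt) step.
    w  : current weight vector
    J  : Jacobian of the residual vector e with respect to w
    e  : residuals (targets minus network outputs)
    mu : damping factor; large mu ~ gradient descent,
         small mu ~ Gauss-Newton
    """
    H = J.T @ J                        # Gauss-Newton approximation of the Hessian
    g = J.T @ e                        # gradient of the squared-error cost
    dw = np.linalg.solve(H + mu * np.eye(len(w)), g)
    w_new = w - dw
    if clamp:
        # Non-negativity projection, as in the LM1/LM2 variants.
        # Zeroed weights remove directions from the effective Hessian,
        # which can make it singular (the failure seen for LM1).
        w_new = np.maximum(w_new, 0.0)
    return w_new

def train_lm(w, forward, jac, targets, mu=0.1, beta=2.0,
             mse_goal=5e-5, max_iter=500, clamp=False):
    """Outer loop: accept a step only if the MSE decreases,
    dividing mu by beta on success and multiplying by beta on failure."""
    e = targets - forward(w)
    mse = np.mean(e ** 2)
    for _ in range(max_iter):
        if mse <= mse_goal:
            break
        w_try = lm_step(w, jac(w), e, mu, clamp)
        e_try = targets - forward(w_try)
        mse_try = np.mean(e_try ** 2)
        if mse_try < mse:              # step accepted: trust the quadratic model more
            w, e, mse, mu = w_try, e_try, mse_try, mu / beta
        else:                          # step rejected: retreat toward gradient descent
            mu *= beta
    return w, mse

The $\mu$ schedule is what gives LM its robustness: a rejected step multiplies $\mu$ by $\beta$ and pushes the update toward plain gradient descent, while each accepted step divides $\mu$ by $\beta$ and lets the update approach the much faster Gauss-Newton direction.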

Figure 10.5: The performance of the GD, LM, LM2 and AM-LM training algorithms on the first problem. (a) GD training algorithm; (b) LM training algorithm; (c) LM2 training algorithm; (d) AM-LM training algorithm.

Figure 10.6: The performance of the GD, LM, LM1 and LM2 training algorithms on the second problem. (a) GD training algorithm; (b) LM training algorithm; (c) LM1 training algorithm; (d) LM2 training algorithm.

