
New LM with Adaptive Momentum for RNN

Even though LM is one of the most powerful algorithms for training feedforward networks, it still has some drawbacks. One of them is the need to compute the Jacobian matrix and to invert a Hessian matrix whose dimensions equal the number of weights in the network. This cost is offset, however, by the increased rate of convergence of the algorithm, which becomes quadratic as a solution is approached. Another disadvantage is that convergence to the global minimum of the cost function is not guaranteed: once the algorithm settles into a local minimum, it cannot escape. Thus, in [8], the authors proposed an algorithm that handles local minima with increased robustness while maintaining the fast convergence rate. Their simulation tests demonstrated the ability of this method to obtain solutions in difficult problems where other techniques, including standard LM, fail to converge. The main idea for helping LM overshoot local minima is to include a momentum term, which injects second-order derivative information into the training process and yields iterations whose form is similar to that of the Conjugate Gradient (CG) method. The major difference from the CG method is that the coefficients regulating the weighting between the gradient and the momentum term are determined adaptively at each iteration rather than fixed heuristically.
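
To make the form of the update concrete, the following is a minimal sketch, not the algorithm of [8] and not an RNN: it computes a standard LM direction for a toy nonlinear least-squares problem and blends it with the previous weight change through a momentum term. The toy exponential model, the fixed damping factor mu and the fixed momentum weight gamma are illustrative assumptions; in the method described above the two coefficients multiplying the gradient and the momentum terms are determined adaptively at each iteration.

import numpy as np

# Sketch of one LM step augmented with a momentum term (illustrative only;
# the toy model and the fixed mu, gamma are assumptions, not values from [8]).

def residuals(w, x, y):
    # Toy model: y ~ w0 * exp(w1 * x); residuals r = model(x) - y
    return w[0] * np.exp(w[1] * x) - y

def jacobian(w, x):
    # Analytic Jacobian of the residuals with respect to the two weights
    J = np.empty((x.size, 2))
    J[:, 0] = np.exp(w[1] * x)              # dr/dw0
    J[:, 1] = w[0] * x * np.exp(w[1] * x)   # dr/dw1
    return J

def lm_momentum_step(w, dw_prev, x, y, mu=1e-2, gamma=0.1):
    # LM direction: solve (J^T J + mu I) d = -J^T r, then add a momentum term.
    r = residuals(w, x, y)
    J = jacobian(w, x)
    H = J.T @ J + mu * np.eye(w.size)       # damped Gauss-Newton Hessian approximation
    d = np.linalg.solve(H, -J.T @ r)        # Levenberg-Marquardt search direction
    dw = d + gamma * dw_prev                # momentum (fixed weight here, adaptive in [8])
    return w + dw, dw

# Usage on synthetic data: the true weights are (2.0, 1.5).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(1.5 * x) + 0.01 * rng.standard_normal(x.size)

w, dw = np.array([1.5, 1.2]), np.zeros(2)
for _ in range(30):
    w, dw = lm_momentum_step(w, dw, x, y)
print("estimated weights:", w)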
