1. Introduction
The traditional method for training a multilayer perceptron (MLP) is the standard backpropagation (SBP) algorithm. It suffers from slow convergence: many iterations are required to train even a small network on a simple problem. Considerable research effort has been devoted to accelerating the convergence of the algorithm, and this work falls roughly into two approaches. The first is based on first-order optimization techniques, such as varying the learning rate of the gradient method used in SBP, adding momentum, and rescaling variables [6]–[7]. The second approach applies second-order optimization methods to train the MLP. The most popular methods in this category are the conjugate gradient and quasi-Newton (secant) methods. Quasi-Newton methods are modified versions of Newton's method, which is known for its high convergence speed. These optimization techniques, such as the Levenberg-Marquardt (LM) and Davidon-Fletcher-Powell (DFP) methods, are based on approximating the Hessian matrix used in Newton's method; they are considered more efficient, but their storage and computational requirements grow with the square of the size of the network.
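For illustration, the two families of update rules can be contrasted as follows; the notation is introduced here for exposition only, with $E(w)$ the network error for weight vector $w$, $\eta$ the learning rate, $\alpha$ the momentum coefficient, and $H$ the Hessian of $E$. The first-order SBP update with momentum is

\[
\Delta w^{(k)} = -\eta\,\nabla E\big(w^{(k)}\big) + \alpha\,\Delta w^{(k-1)},
\]

whereas Newton's method uses second-order curvature information,

\[
w^{(k+1)} = w^{(k)} - \big[H\big(w^{(k)}\big)\big]^{-1}\,\nabla E\big(w^{(k)}\big),
\]

and quasi-Newton methods such as DFP replace $H^{-1}$ with an approximation that is updated at each iteration. Storing and updating this $n \times n$ matrix for a network with $n$ weights is what makes the memory and computational cost grow as the square of the network size.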