I. Introduction
The backpropagation learning algorithm [3] is often used with a momentum variation [2], in which each weight change is a combination of the new steepest descent step and the previous weight change. The purpose of momentum is to smooth the weight trajectory and speed up the convergence of the algorithm [1]. Momentum is also sometimes credited with helping the algorithm avoid local minima in the error surface. In this paper we analyze the effect of momentum when minimizing quadratic error functions.
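As a rough illustration of the momentum update described above, the following Python sketch applies it to a small quadratic error function. It assumes one common form of the update, in which the new weight change is the steepest descent step plus a fraction of the previous weight change; the exact parameterization analyzed in this paper may differ. The names momentum_descent, quad_grad, lr (learning rate), and mu (momentum coefficient), as well as the particular quadratic, are hypothetical and chosen only for illustration.

```python
def momentum_descent(grad, w0, lr, mu, steps):
    """Steepest descent with momentum (assumed form, for illustration only).

    Each weight change is dw(k) = -lr * grad(w) + mu * dw(k-1),
    i.e. the new steepest descent step combined with the previous change.
    Setting mu = 0 recovers plain steepest descent.
    """
    w = list(w0)
    dw = [0.0] * len(w)  # no previous weight change at the start
    for _ in range(steps):
        g = grad(w)
        dw = [-lr * gi + mu * di for gi, di in zip(g, dw)]
        w = [wi + di for wi, di in zip(w, dw)]
    return w


def quad_grad(w):
    """Gradient of the example quadratic F(w) = 0.5 * (w[0]**2 + 10 * w[1]**2)."""
    return [1.0 * w[0], 10.0 * w[1]]


if __name__ == "__main__":
    # Illustrative values only; the minimum of the example quadratic is at the origin.
    w_final = momentum_descent(quad_grad, [1.0, 1.0], lr=0.1, mu=0.5, steps=200)
    print(w_final)
```

With the illustrative values lr = 0.1 and mu = 0.5, the iterates of this example settle at the minimum at the origin; whether and how quickly that happens in general depends on both parameters and on the curvature of the quadratic.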