Finding the **n** parameters $\mathbf{a}$ which minimize $f(\mathbf{a})$ when **n** > 1 is a much
harder problem. The **n** = 2 case is like trying to find the point at the
bottom of the lowest valley in a range of mountains. Higher-dimensional
cases are hard to visualize at all. This author usually just pretends they
are 2-D problems and tries not to worry too much.

When presented with a new problem it is usually fruitful to get a feel for the shape of the function surface by plotting $f$ against each parameter $a_i$ in turn, in each case keeping the other parameters fixed.
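Such one-dimensional slices can be generated mechanically. The sketch below (not from the original text) uses the Rosenbrock function purely as a hypothetical stand-in for the function being minimized; the `(xs, ys)` arrays it returns can be handed straight to any plotting routine.

```python
import numpy as np

# A hypothetical 2-parameter test function (Rosenbrock), standing in for
# whatever function is actually being minimized.
def f(a):
    return (1.0 - a[0])**2 + 100.0 * (a[1] - a[0]**2)**2

a0 = np.array([0.5, 0.5])   # current parameter values

def slice_along(i, half_width=1.0, npts=201):
    """Evaluate f along parameter i, keeping the other parameters fixed at a0."""
    xs = np.linspace(a0[i] - half_width, a0[i] + half_width, npts)
    ys = np.empty(npts)
    for k, x in enumerate(xs):
        a = a0.copy()
        a[i] = x            # vary only parameter i
        ys[k] = f(a)
    return xs, ys

for i in range(len(a0)):
    xs, ys = slice_along(i)
    print(f"slice along a[{i}]: minimum near a[{i}] = {xs[np.argmin(ys)]:.3f}")
```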

As mentioned above, the choice of local minimizer will depend upon whether it is possible to generate first or second derivatives efficiently, and on how noisy the function is. All the local minimizers below assume that the function is locally relatively smooth: any noise will confuse the derivative estimates and may confound the algorithm. For such noisy functions the Simplex Method (Section 2.7.1) is recommended.

Most of the local minimizers below rely on approximating the function about the current point, $\mathbf{a}$, using a Taylor series expansion

$$ f(\mathbf{a}+\boldsymbol{\delta}) \approx f(\mathbf{a}) + \sum_i \frac{\partial f}{\partial a_i}\,\delta_i + \frac{1}{2}\sum_{i,j} \frac{\partial^2 f}{\partial a_i\,\partial a_j}\,\delta_i\,\delta_j . $$

The matrix $\mathbf{H}$, with elements $H_{ij} = \partial^2 f / \partial a_i\,\partial a_j$, is called the *Hessian matrix* of the function at
$\mathbf{a}$.
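When second derivatives cannot be generated analytically, the Hessian can be estimated numerically. A minimal central-difference sketch (my own illustration, with a fixed step size that production code would choose more carefully):

```python
import numpy as np

def hessian(f, a, h=1e-4):
    """Estimate H_ij = d^2 f / (da_i da_j) by central differences."""
    n = len(a)
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            app = a.copy(); app[i] += h; app[j] += h
            apm = a.copy(); apm[i] += h; apm[j] -= h
            amp = a.copy(); amp[i] -= h; amp[j] += h
            amm = a.copy(); amm[i] -= h; amm[j] -= h
            H[i, j] = (f(app) - f(apm) - f(amp) + f(amm)) / (4.0 * h * h)
    return H

# Check on a quadratic whose Hessian is known exactly:
# f(a) = a0^2 + 3 a0 a1 + 5 a1^2  has constant Hessian [[2, 3], [3, 10]].
f = lambda a: a[0]**2 + 3*a[0]*a[1] + 5*a[1]**2
print(hessian(f, np.array([0.3, -0.7])))
```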

At the minimum the gradient is zero (if it were not, we could move further downhill), so $\partial f/\partial a_i = 0$ for all $i$, and

$$ f(\mathbf{a}) \approx f(\mathbf{a}_{\min}) + \tfrac{1}{2}\,(\mathbf{a}-\mathbf{a}_{\min})^{\mathsf T}\,\mathbf{A}\,(\mathbf{a}-\mathbf{a}_{\min}) $$

(where $\mathbf{A}$ is the Hessian evaluated at the minimum point $\mathbf{a}_{\min}$).
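This quadratic form is what Newton-type minimizers exploit: for an exactly quadratic function the expansion is exact, so a single step $\mathbf{a} \to \mathbf{a} - \mathbf{A}^{-1}\mathbf{g}$ lands on the minimum. A small sketch (my own illustration, with an arbitrary assumed Hessian and minimum):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 4.0]])          # Hessian (assumed known for illustration)
a_min = np.array([1.0, -2.0])       # true minimum of the quadratic below

def f(a):
    d = a - a_min
    return 0.5 * d @ A @ d          # the quadratic expansion, exactly

def grad(a):
    return A @ (a - a_min)          # gradient of the quadratic

a = np.array([5.0, 5.0])            # arbitrary starting point
a = a - np.linalg.solve(A, grad(a)) # one Newton step
print(a)                            # -> recovers a_min exactly (to rounding)
```

On a non-quadratic function the same step is only approximate, which is why Newton-type minimizers iterate.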

See also Covariance Estimation (section 1.9).
