The conjugate gradients algorithm for minimising functions is described in appendix B. Throughout the lengthy derivation it is clear that the useful results obtained, and the remarkably simple final result, are due to the special properties of quadratic functions. A function may be expanded as a Taylor series about any point at which it is analytic, and around a minimum, where the first-order gradient term vanishes, a quadratic is generally a good approximation. However, the Kohn penalty functional has a branch point, arising from the square-root function, exactly at the ground-state minimum which we seek, and so it cannot be Taylor-expanded there. Local information from the gradient therefore cannot be used to infer the global shape of the functional. This is illustrated in figure 6.3 for the case of a parabolic interpolation used to find a line minimum from the gradient and a trial step, but the problem is even worse in the multi-dimensional space, since the ``conjugate'' directions constructed from the gradients will not point towards the ground-state minimum.
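The point can be made with a small numerical sketch (not taken from the thesis): the parabolic line-minimisation step of figure 6.3 is applied below to two one-dimensional models, an ordinary quadratic and a function whose square-root branch point is modelled by the modulus function |x - 0.3|, which is how the square root of a quadratically vanishing quantity behaves near its zero. The model functions, the starting point and the unit trial step are arbitrary choices made only for this illustration.

```python
import numpy as np

def parabolic_line_min(f, x0, g0, s=1.0):
    """One parabolic-interpolation line-search step: fit p(t) = f(x0) + phi0*t + c*t**2
    through the value and slope at t = 0 and the value at a trial step t = s along the
    downhill direction, then move to the minimum of the fitted parabola."""
    d = -np.sign(g0)                                  # downhill search direction
    phi0 = g0 * d                                     # slope along the search direction
    c = (f(x0 + s * d) - f(x0) - phi0 * s) / s**2     # curvature inferred from the trial step
    return x0 + d * (-phi0 / (2.0 * c))               # jump to the parabola's minimum

# Quadratic model: the parabolic fit is exact, so a single step lands on the minimum at x = 0.3.
x0 = 1.0
x1 = parabolic_line_min(lambda y: (y - 0.3)**2, x0, 2.0 * (x0 - 0.3))
print("quadratic model: one step gives x =", x1)

# Branch-point model |x - 0.3|: the gradient is +/-1 everywhere, so it carries no
# information about the distance to the minimum, and the inferred curvature depends
# only on the trial step.  The iterates overshoot and oscillate about x = 0.3,
# stalling at a finite error instead of converging.
f = lambda y: abs(y - 0.3)
x = 1.0
for i in range(8):
    x = parabolic_line_min(f, x, np.sign(x - 0.3))
    print(f"branch-point model, step {i + 1}: x = {x:+.4f}")
```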
This problem is reflected in the very poor convergence obtained when an attempt is made to minimise the functional using conjugate gradients: the steepest descents method actually performs better, because it does not assume global quadratic behaviour. Also, the penalty functional does not vanish at the minimum sufficiently quickly as the parameter is increased. The root-mean-square error in the occupation numbers is given by equation (6.14).
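For concreteness, the occupation numbers are the eigenvalues of the density matrix and should all be exactly zero or one at the idempotent ground state, so this error measures how far the penalised minimum is from idempotency. The snippet below is only an illustrative sketch of such a measure, taking the nearer of 0 and 1 as the target for each occupation number; the precise form used here is that of equation (6.14), and the sample occupation numbers are invented.

```python
import numpy as np

def rms_occupation_error(occupations):
    """Root-mean-square deviation of the occupation numbers from their idempotent
    values, taking the nearer of 0 and 1 as the target for each one.
    (Illustrative definition only; equation (6.14) gives the form used in the text.)"""
    f = np.asarray(occupations, dtype=float)
    deviation = np.minimum(np.abs(f), np.abs(1.0 - f))
    return np.sqrt(np.mean(deviation**2))

# Occupation numbers close to, but not exactly, 0 and 1, of the kind produced by a
# penalty-functional minimisation at a finite value of the parameter (made-up values):
print(rms_occupation_error([0.998, 1.004, 0.993, 0.007, -0.002, 0.005]))
```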
In figure 6.4 we present the results of tests on an 8-atom silicon cell to demonstrate the behaviour of the functional. As the penalty functional parameter is increased, both the contribution of the penalty functional to the total functional and the root-mean-square error in the occupation numbers decrease, but not rapidly enough with the parameter, since the number of iterations required to reach convergence also increases with it, making the calculations too expensive for practical applications. For example, the number of iterations required to converge the total functional to a given tolerance (in eV per atom) increases by a factor of more than ten when the parameter is increased from 100 eV to 1000 eV. Even with the smaller value of the parameter, the rate of convergence is much slower than that of traditional methods, because of the incompatibility of the functional with the conjugate gradients scheme.