Next: Bibliography Up: Paper abstract
Corrected penaltyfunctional method for linearscaling
calculations within densityfunctional theory
P. D. Haynes and M. C. Payne
Theory of Condensed Matter, Cavendish Laboratory,
Madingley Road, Cambridge CB3 0HE, United Kingdom
October 6, 1998
We present a new method for the calculation of groundstate total
energies within densityfunctional theory, based upon the
singleparticle densitymatrix formulation, which requires a
computational effort which scales only linearly with systemsize. The
difficult idempotency constraint is imposed approximately using a
penaltyfunctional constructed to allow efficient minimization. The
resulting error in the total energy due to the violation of
idempotency is removed by an analytic correction. The results for a
system comprising 216 atoms of crystalline silicon are compared with
those from a standard planewave code. Linear scaling to 512 atoms is
also demonstrated on a workstation.
PACS numbers: 71.15.Mb
Within densityfunctional theory (DFT), the complexity of the
problem of calculating the groundstate energy of the inhomogeneous
electron gas in an external potential [1] scales
linearly with systemsize (i.e. the number of
electrons). However, `traditional' methods based upon the KohnSham
(KS) formulation of DFT [2] require a computational
effort which scales asymptotically as , either because of the
cost of diagonalizing the Hamiltonian, or as a result of the
orthogonality requirement for the extended KS singleparticle
wavefunctions.
This scaling restricts the size of systems which can
currently be treated. Much interest has therefore been shown in using
the singleparticle densitymatrix (DM) to calculate the total energy
[3]. Since the DM is shortranged in realspace
[4] and free from orthogonality constraints, it
provides the basis of a linearscaling method for KSDFT
[5,6,7] (see
[8] for a review of some of these methods). However,
most linearscaling schemes have so far been applied only in the
context of tightbinding or restricted basisset calculations. In
contrast, the method presented here has been applied to fully
selfconsistent DFT calculations with an arbitrarily complete
basisset, a task so far attempted by only one other group
[6].
In terms of the KS wavefunctions and occupancies
, the groundstate DM
is

(1) 
where the occupancies equal zero or unity for wavefunctions
with KS eigenvalues above or below the chemical potential
respectively. Defining the energy functional by
where
is the sum of the Hartree and
exchangecorrelation energies which depend only on the electronic
density
(the factor of two
arising from spin degeneracy), and
is the external
potential arising from the ions. The groundstate energy is found by
minimizing this functional with respect to all Hermitian DMs
satisfying

(3) 
and the idempotency condition

(4) 
This can be achieved in operations by exploiting the
shortranged nature of the DM and truncating it beyond some spatial
cutoff
i.e. imposing
for
.
The idempotency condition is required to ensure that the
occupancies are all zero or unity. It is difficult to enforce
since it is a nonlinear constraint. One approach has been to use a
purifying transformation [9] to drive the DM
towards idempotency during minimization
[3,6]. The approach most closely related to
that introduced in this work is due to Kohn who, in a recent Letter
[10], added a penaltyfunctional to the KS energy
functional so that the minimum value of the total functional is
achieved for the idempotent groundstate DM . In terms of a
trial Hermitian DM and its eigenvalues , Kohn's
penaltyfunctional is

(5) 
The squareroot is required to force the total functional to take its
minimum value at the groundstate, because the energy is not
variational with respect to changes in the occupancies. However, the
branch point introduced means that the gradient is undefined at the
groundstate and the functional cannot be minimized using standard
techniques such as the conjugate gradients algorithm, since
information from local gradients cannot be used to build up a picture
of the global behaviour of the total functional [12].
In this Letter, we introduce a generalized functional
defined by

(6) 
The Taylor series for this functional in terms of the occupancies is
welldefined at all points, so that it may be minimized
efficiently. Although the total functional may possess multiple
minima for certain values of , there is only one minimum where
the normalization constraint (3) is also satisfied
[11], and since the search over trial DMs may be
straightforwardly constrained to normalized DMs, multiple minima do
not cause problems in practice.
Although the penaltyfunctional will prevent `runaway'
solutions (such as macroscopic occupation of the lowenergy bands at
the expense of negatively filling higher bands) so that the
minimization process is stable (which is not guaranteed by other
schemes), the DM which minimizes the functional will not
be exactly idempotent. In general, the occupancies
corresponding to occupied bands will exceed unity, and those
corresponding to unoccupied bands will be negative. We estimate the
errors in the occupancies
defined by
in the following manner: since the
functional is minimized by , it is a minimum with
respect to all changes in the occupancies which maintain the
normalization constraint (3), imposed by introducing a
Lagrange multiplier :

(7) 
Using Janak's theorem [13] for the derivative of the
energy functional in terms of the KS eigenvalues
at the minimum of the functional, we obtain:

(8) 
so assuming that the
are small,

(9) 
Application of the normalization constraint (3) requires
from which
is obtained.
The variance of the occupancy errors is related to the
variance of the energy eigenvalues, scaled by the parameter .
As is increased, the errors in the eigenvalues at the minimum
decrease as
. For small deviations
from idempotency, the penaltyfunctional
so that the penalty term also decreases as
, and the energy approaches the true groundstate
energy. This behaviour is illustrated in Fig. 1 where the
occupancy errors are plotted for four bands as a function of
.
Figure 1:
Variation of occupancy errors with . Dotted
lines show the best fit to behaviour.

This convergence is unsatisfactory for practical
applications, since it requires large values of to obtain
accurate estimates of the groundstate energy and at large values of
the penalty term dominates the total functional and hinders
efficient minimization of the energy term. To overcome this problem we
present a method for obtaining accurate values for the groundstate
energy from the results of calculations performed at much smaller
values of .
At the minimum of the total functional,
,

(10) 
This expression can be used to construct a firstorder Taylor
expansion for the total energy with respect to the occupancies. We can
provide an estimate for the groundstate energy as

(11) 
For occupied bands,
whereas for
unoccupied bands
so that
Figure 2:
Variation of the total energy, total functional and corrected
energy with . Dotted lines show the best fit to
behaviour.

The first term of the correction has been written as a sum over all
bands so that it can be expressed in terms of a trace
, which can always be
evaluated in operations. The second term only
contributes when unoccupied bands are included in the calculation,
which is not necessary for insulators. A single eigenvalue of the
(sparse) DM can always be obtained in operations and so
it is possible to evaluate the correction for a small number of
unoccupied bands without spoiling the scaling. Even so,
since the correction is only applied when the minimum of the total
functional has been found, a single step to obtain all
of the eigenvalues of the DM is still an insignificant fraction of the
total computational effort. Importantly, this analytic correction also
permits the calculation of forces consistent with the corrected
energy. However, the electronic density and KohnSham eigenvalues
calculated by the method remain in error, although the former is
qualitatively correct and thus still useful for visualization.
Fig. 2 shows the total energy, the total functional and the
corrected energy as a function of . Even at eV,
the corrected energy agrees with the
limits of both the total energy and total functional to better than
eV per atom.
The lowestorder term omitted from the Taylor expansion involves
the second derivative

(13) 
which is known as the chemical hardness matrix in the context of the
construction of transferable pseudopotentials [14]. This
matrix is neither positive nor negativedefinite, so that the
corrected energy is neither an upper nor a lower bound to the true
groundstate energy. However, this error is much smaller than those
arising from the finite DM cutoff, so that in practice the corrected
energy can be treated as a variational upper bound to the groundstate
energy.
In practical calculations, we cannot work directly
with the extended KS wavefunctions and occupancies, and
instead the trial DM is written in terms of a sparse matrix
and a set of nonorthogonal functions localized within
spherical regions of radius
:

(14) 
The sparsity of is controlled by the parameter , forcing
elements of corresponding to localized functions with centres
separated by more than to vanish.
and
together determine
. The
localized functions are expanded in some localized basis, and then the
total functional is minimized with respect to the expansion
coefficients and the matrix elements .
The minimization is performed by two nested loops, the inner with
respect to and the outer with respect to the . In
terms of the Hamiltonian in the representation of the ,
and the overlap
matrix
, the gradient with
respect to is

(15) 
In contrast to other orbitalbased methods, the condition number for
this stage of the minimization is approximately unity (since the
curvature of the functional with respect to variations of is
determined by the shape of alone) and hence the minimization is
much more efficient [15]. By making the Löwdin
transformation to a set of orthonormalized orbitals
and defining the DM and
Hamiltonian in this representation as
and
respectively, then at the constrained minimum defined by
(7);

(16) 
i.e.
so that the DM and the Hamiltonian
commute and can therefore be diagonalized simultaneously. (In terms of
this representation (14),
, although its value is never required). The resulting eigenvalues
obey (8).
The gradient with respect to the localized functions is
When transformed to the common diagonal frame of the DM
and Hamiltonian by some unitary transformation
, this gradient becomes

(18) 
At the minimum of with respect to both and we
thus have from (8):
These are simply the KohnSham equations, with eigenvalues shifted by
the Lagrange multiplier associated with the normalization
constraint. Thus, in addition to kineticenergy preconditioning
[16], occupancy preconditioning [17] can
be applied to improve the convergence.
This scheme has been implemented and applied to the test case of
216 atoms of crystalline silicon. The localized functions were
expanded in a sphericalwave basisset [18] using an
energy cutoff of 200 eV and angular momentum components up to . In Fig. 3 the convergence of the corrected energy with
respect to
and is plotted. As these spatial
cutoffs are increased, the variational freedom of the DM is
increased, so that the energy converges from above.
Figure 3:
Convergence of the corrected energy with respect to (left)
the localization region radius
and (right) the
cutoff applied to the matrix .

Table 1 shows values for the equilibrium lattice
parameter and bulk modulus obtained using values of
Å, Å and eV.
Experimental values and results of the CASTEP planewave code
[19], using the same pseudopotential
[20] and equivalent Brillouin zone sampling are also
shown. The calculations of agree to better than 1% and those of
(which is more sensitive to the data) to within 3%.
Table 1:
Comparison of calculated and experimental data for
crystalline silicon.
Calculation 
Linearscaling 
CASTEP 
Experiment 




/ Å 
5.428 
5.390 
5.430 
/ GPa 
104.3 
101.7 
100.0 
Finally, in Fig. 4 the computational effort for a
single iteration of the outer loop for the same values of
, and is plotted against systemsize
to demonstrate the linear scaling of the method. Since fewer matrix
multiplications are required, each iteration of this method is cheaper
than the purifying method, also plotted (both methods require a
similar number of iterations to converge the energy to a given
accuracy). The number of iterations required to converge these
linearscaling calculations is currently an order of magnitude larger
(about 200 for these calculations on silicon) than is the case for
traditional cubicscaling allbands methods [21], but
since the condition number of the functional depends only upon
properties of the system being modelled, the number of iterations
required does not increase with systemsize. Since the method
achieves linear scaling by partitioning realspace into overlapping
regions, excellent scaling with respect to the number of processors is
expected for implementations on massivelyparallel computers
[22].
Figure 4:
Scaling of computational effort with systemsize for this
method compared with a method based on the purifying transformation
(DEC 500au workstation).

In conclusion, we have presented a new DMbased linearscaling
method for DFT, in which the approximate imposition of idempotency,
enforced by means of a penaltyfunctional, permits the use of
efficient minimization methods, and from which accurate estimates of
the energy derived from the true idempotent groundstate DM can be
obtained by using a correction derived from the form of the
penaltyfunctional.
We acknowledge useful discussions with I. D. White, and PDH
acknowledges the support of an EPSRC studentship. This work is covered
by British Patent Application No. 9814931.3.
Next: Bibliography
Peter D. Haynes
20021025