General formalism for kinetic energy preconditioning

We introduce a positive-definite model Hamiltonian $\hat{X}$ and write the energy of the system that it describes as

$\begin{displaymath} E_{X} = \sum_{n} \int \psi^{\ast}_{n}(\mathbf{r}) \hat{X} \psi^{\ }_{n}(\mathbf{r}) \mathrm{d} \mathbf{r}. \end{displaymath}$

(15)

We proceed to derive exact expressions for preconditioning the minimization of Eq. (15). For a suitable choice of $\hat{X}$, these same expressions may be used to improve the condition number for minimizing the true energy, Eq. (3). Note that all of the occupation numbers $f_{n}$ for the model system have been set to unity. This amounts to an additional occupancy preconditioning, first introduced by Gillan [36] in the context of metallic systems and then by Marzari et al. [29] in the general framework of ensemble density-functional theory.

Following along the same lines as in Section 2, defining

$\begin{displaymath} x^{\ }_{\mu\nu} = \int D^{\ast}_{\mu}(\mathbf{r}) \hat{X} D^{\ }_{\nu}(\mathbf{r}) \mathrm{d} \mathbf{r}, \end{displaymath}$

(16)

and substituting this, Eq. (4) and Eq. (10) into Eq. (15) we obtain

$\begin{displaymath} E_{X} = \sum_{n} (M^{\dagger})_{n}^{\ \alpha} (c^{\dagger})_{\alpha}^{\ \mu} x^{\ }_{\mu\nu} c^{\nu}_{\ \beta} M^{\beta}_{\ n}. \end{displaymath}$

(17)

It is at this point that a tensorially incorrect ``diagonal approximation'' is made in Ref. [21]. In our notation, this would be given by

$\begin{displaymath} \sum_{n} M^{\beta}_{\ n} (M^{\dagger})_{n}^{\ \alpha} = (S^{-1})^{\beta\alpha} \simeq J\delta^{\ }_{\beta\alpha}, \end{displaymath}$

(18)

where $J$ is some constant, and the first equality follows from Eq. (6). We do not make this unnecessary approximation.
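To illustrate what a diagonal approximation of this kind discards, the following minimal sketch inverts a small overlap matrix. The matrix `S` is a made-up example of two normalized functions with overlap 0.5 (an assumption chosen purely for illustration, not data from this work).

```python
import numpy as np

# Hypothetical overlap matrix of two normalized, non-orthogonal functions.
# The overlap value 0.5 is an assumption chosen for illustration only.
S = np.array([[1.0, 0.5],
              [0.5, 1.0]])

S_inv = np.linalg.inv(S)

# Ratio of off-diagonal to diagonal elements of S^{-1}. A diagonal
# approximation J * delta would require this ratio to be negligible.
off_to_diag = abs(S_inv[0, 1]) / S_inv[0, 0]

print(S_inv)
print(off_to_diag)
```

For this example the off-diagonal elements of $\mathbf{S}^{-1}$ are half the size of the diagonal ones, so replacing $\mathbf{S}^{-1}$ by a multiple of the identity would discard a sizeable contribution.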

Formally, as it has been defined to be positive-definite, the matrix $\mathbf{x}$ may be expressed in terms of its unique Cholesky factor $\mathbf{G}$ [37]:

$\begin{displaymath} x_{\mu\nu} = \sum_{k} G_{\mu k}(G^{\dagger})_{k\nu}. \end{displaymath}$

(19)

Substituting this into Eq. (17) gives

$\begin{displaymath} E_{X} = \sum_{kn} \vert a_{kn}\vert^{2}, \end{displaymath}$

(20)

where the new variables $a_{kn}$ which make the energy surface spherical are given by

$\begin{displaymath} a^{\ }_{kn} = (G^{\dagger})^{\ }_{k\nu} c^{\nu}_{\ \beta} M^{\beta}_{\ n} . \end{displaymath}$

(21)
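The step from Eq. (17) to Eq. (20) can be checked numerically. The sketch below is only an illustration under stated assumptions: the matrices `x`, `c` and `M` are random stand-ins with hypothetical sizes, and `x` is made Hermitian positive-definite by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_basis, n_orb = 6, 3  # hypothetical sizes, chosen for illustration


def random_complex(shape):
    """Random complex matrix used as a stand-in for real data."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)


# Stand-in for x_{mu nu}: Hermitian and positive-definite by construction.
B = random_complex((n_basis, n_basis))
x = B @ B.conj().T + n_basis * np.eye(n_basis)

# Stand-ins for the coefficients c^{nu}_{beta} and the matrix M^{beta}_{n}.
c = random_complex((n_basis, n_orb))
M = random_complex((n_orb, n_orb))

# Eq. (19): x = G G^dagger with G lower triangular (Cholesky factor).
G = np.linalg.cholesky(x)

# Eq. (17): E_X = sum_n (M^dagger c^dagger x c M)_{nn}.
cM = c @ M
E_17 = np.trace(cM.conj().T @ x @ cM).real

# Eq. (21): a = G^dagger c M, and Eq. (20): E_X = sum_{kn} |a_kn|^2.
a = G.conj().T @ cM
E_20 = np.sum(np.abs(a) ** 2)

print(E_17, E_20)
```

The two printed values agree to rounding error, which is the statement that the energy surface is spherical in the variables $a_{kn}$.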

In a steepest descents procedure (the following generalises easily to the conjugate gradients method), a line minimization is performed along the steepest descents search direction to find the new values of the coefficients $a'_{kn}$:

$\begin{displaymath} a'_{kn} = a_{kn} - \lambda \frac{\partial E_{X}}{\partial a^{\ast}_{kn}}, \end{displaymath}$

(22)

where $\lambda$ is chosen to minimize the energy. We wish to minimize the energy with respect to the coefficients $c^{\mu}_{\ \alpha}$, yet the functional is spherical (and hence preconditioned) in the new coefficients $a_{kn}$. In order to find the new values $c'^{\mu}_{\ \alpha}$ of the coefficients $c^{\mu}_{\ \alpha}$ that minimize the energy, we use the chain rule to write

$\begin{displaymath} \frac{\partial E_{X}}{\partial a^{\ast}_{kn}} = \frac{\partial E_{X}}{\partial c^{\mu \ast}_{\ \alpha}} \left(\frac{\partial c^{\mu}_{\ \alpha}} {\partial a^{\ }_{kn}}\right)^{\ast}, \end{displaymath}$

(23)

and from this, and Eqs. (21)-(22), it may be shown that

$\begin{displaymath} c'^{\mu}_{\ \alpha} = c^{\mu}_{\ \alpha} - \lambda (x^{-1})^{\mu\nu} \frac{\partial E_{X}}{\partial c^{\nu \ast}_{\ \beta}} S^{\ }_{\beta \alpha}, \end{displaymath}$

(24)

where we have used the relations

$\begin{displaymath} \sum_{n} (M^{-\dagger})_{\alpha n}(M^{-1})_{n\beta} = S_{\alpha\beta}, \end{displaymath}$

(25)

and

$\begin{displaymath} \sum_{k} (G^{-\dagger})^{\mu}_{\ k} (G^{-1})_{k}^{\ \nu} = (x^{-1})^{\mu\nu}, \end{displaymath}$

(26)

obtained from Eqs. (6) and (19), respectively.
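As a consistency check on Eqs. (22)-(26), the sketch below takes one steepest descents step in the variables $a_{kn}$, maps the result back to the coefficients $c^{\mu}_{\ \alpha}$, and compares it with the direct update of Eq. (24). It assumes a square, invertible `M`, random stand-in matrices, and an arbitrary fixed step length `lam` in place of a line minimization; none of these values come from this work.

```python
import numpy as np

rng = np.random.default_rng(1)
n_basis, n_orb = 5, 3  # hypothetical sizes, chosen for illustration
lam = 0.3              # arbitrary step length instead of a line minimization


def random_complex(shape):
    """Random complex matrix used as a stand-in for real data."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)


B = random_complex((n_basis, n_basis))
x = B @ B.conj().T + n_basis * np.eye(n_basis)       # positive-definite x
c = random_complex((n_basis, n_orb))
M = random_complex((n_orb, n_orb)) + 3 * np.eye(n_orb)  # shifted to stay invertible

G = np.linalg.cholesky(x)            # Eq. (19)
M_inv = np.linalg.inv(M)
S = M_inv.conj().T @ M_inv           # Eq. (25)

# Route 1: step in the spherical variables, Eqs. (21)-(22), then map back.
a = G.conj().T @ c @ M
grad_a = a                           # dE_X/da* for E_X = sum |a_kn|^2
a_new = a - lam * grad_a
c_via_a = np.linalg.solve(G.conj().T, a_new) @ M_inv

# Route 2: direct preconditioned update of Eq. (24).
grad_c = x @ c @ M @ M.conj().T      # dE_X/dc* from Eq. (17)
c_direct = c - lam * np.linalg.solve(x, grad_c) @ S

print(np.max(np.abs(c_via_a - c_direct)))
```

Both routes give the same new coefficients to rounding error. For this purely quadratic model the update reduces to a uniform rescaling $(1-\lambda)\,c^{\mu}_{\ \alpha}$, as expected for a spherical energy surface.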

Choosing the model Hamiltonian $\hat{X}$ introduced in Eq. (14), and defining

$\begin{displaymath} s^{\ }_{\mu\nu} = \int D^{\ast}_{\mu}(\mathbf{r}) D^{\ }_{\nu}(\mathbf{r}) \mathrm{d} \mathbf{r}, \end{displaymath}$

(27)

$\begin{displaymath} t^{\ }_{\mu\nu} = -\int D^{\ast}_{\mu}(\mathbf{r}) \nabla^{2} D^{\ }_{\nu}(\mathbf{r}) \mathrm{d} \mathbf{r}, \end{displaymath}$

(28)

Eq. (24) becomes

$\begin{displaymath} c'^{\mu}_{\ \alpha} = c^{\mu}_{\ \alpha} - \lambda \left[\left(\mathbf{s} + \mathbf{t}/k^{2}_{0}\right)^{-1}\right]^{\mu\nu} \frac{\partial E}{\partial c^{\nu \ast}_{\ \beta}} S^{\ }_{\beta \alpha} , \end{displaymath}$

(29)

where, following the discussion in Section 3, we have replaced the model energy $E_{X}$ with the true energy $E$. We see from Eq. (29) that preconditioning is effected by premultiplying the steepest descent gradient by the matrix $(\mathbf{s} + \mathbf{t}/k^{2}_{0})^{-1}$ and postmultiplying it by $\mathbf{S}$.
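The pre- and postmultiplication described here might be sketched as follows. The function name `precondition_gradient` and the test data are hypothetical. The illustration assumes an orthonormal plane-wave-like basis, for which $\mathbf{s}$ is the identity and $\mathbf{t}$ is diagonal with entries $k^{2}$, and takes $\mathbf{S}$ to be the identity. A linear solve is used instead of forming the inverse explicitly.

```python
import numpy as np


def precondition_gradient(grad, s, t, S, k0):
    """Return (s + t / k0^2)^{-1} @ grad @ S.

    Hypothetical helper: grad holds dE/dc* with shape (n_basis, n_orb),
    s and t are the (n_basis, n_basis) matrices of Eqs. (27)-(28), and
    S is the (n_orb, n_orb) matrix that postmultiplies the gradient.
    """
    precond = s + t / k0**2
    return np.linalg.solve(precond, grad) @ S


# Illustration with assumed data: orthonormal plane-wave-like basis,
# so s is the identity and t is diagonal with entries k^2.
k = np.array([0.0, 1.0, 2.0, 4.0, 8.0])
s = np.eye(k.size)
t = np.diag(k**2)
S = np.eye(2)
k0 = 2.0

grad = np.ones((k.size, 2))
pg = precondition_gradient(grad, s, t, S, k0)

# Each component is scaled by 1 / (1 + k^2 / k0^2).
damping = pg[:, 0]
print(damping)
```

In this special case the preconditioner leaves components with $k \ll k_{0}$ almost unchanged and damps those with $k \gg k_{0}$ roughly as $k_{0}^{2}/k^{2}$. For a non-orthogonal basis the matrices $\mathbf{s}$ and $\mathbf{t}$ are not diagonal, and the linear solve does the equivalent work.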

Arash Mostofi 2003-10-28