

General formalism for kinetic energy preconditioning

We introduce a positive-definite model Hamiltonian $\hat{X}$ and write the energy of the system that it describes as

\begin{displaymath}
E_{X} = \sum_{n} \int \psi^{\ast}_{n}(\mathbf{r}) \hat{X}
\psi^{\ }_{n}(\mathbf{r}) \mathrm{d} \mathbf{r}.
\end{displaymath} (15)

We proceed to derive exact expressions for preconditioning the minimization of Eq. (15). For a suitable choice of $\hat{X}$, these same expressions may be used to improve the condition number for minimizing the true energy, Eq. (3). It is worth noting that all of the occupation numbers $f_{n}$ for the model system have been set to unity. This amounts to an additional occupancy preconditioning, first introduced by Gillan [36] in the context of metallic systems and then by Marzari et al. [29] in the general framework of ensemble density-functional theory.

Following along the same lines as in Section 2, defining

\begin{displaymath}
x^{\ }_{\mu\nu} = \int D^{\ast}_{\mu}(\mathbf{r}) \hat{X} D^{\
}_{\nu}(\mathbf{r}) \mathrm{d} \mathbf{r},
\end{displaymath} (16)

and substituting this, Eq. (4) and Eq. (10) into Eq. (15) we obtain

\begin{displaymath}
E_{X} = \sum_{n} (M^{\dagger})_{n}^{\ \alpha}
(c^{\dagger})_{\alpha}^{\ \mu} x^{\ }_{\mu\nu} c^{\nu}_{\ \beta}
M^{\beta}_{\ n}.
\end{displaymath} (17)

It is at this point that a tensorially incorrect ``diagonal approximation'' is made in Ref. [21]. In our notation, this would be given by

\begin{displaymath}
\sum_{n} M^{\beta}_{\ n} (M^{\dagger})_{n}^{\ \alpha} =
(S^{-1})^{\beta\alpha} \simeq J\delta^{\ }_{\beta\alpha},
\end{displaymath} (18)

where $J$ is some constant, and the first equality follows from Eq. (6). We do not make this unnecessary approximation.

Because $\hat{X}$ has been defined to be positive-definite, the matrix $\mathbf{x}$ is also positive-definite (provided the basis functions are linearly independent), and may formally be expressed in terms of its unique Cholesky factor $\mathbf{G}$ [37]:

\begin{displaymath}
x_{\mu\nu} = \sum_{k} G_{\mu k}(G^{\dagger})_{k\nu}.
\end{displaymath} (19)

Substituting this into Eq. (17) gives

\begin{displaymath}
E_{X} = \sum_{kn} \vert a_{kn}\vert^{2},
\end{displaymath} (20)

where the new variables $a_{kn}$, which make the energy surface spherical, are given by

\begin{displaymath}
a^{\ }_{kn} = (G^{\dagger})^{\ }_{k\nu} c^{\nu}_{\ \beta}
M^{\beta}_{\ n} .
\end{displaymath} (21)
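The change of variables in Eqs. (19)-(21) can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated stand-ins for $\mathbf{x}$, $\mathbf{c}$ and $\mathbf{M}$; the array sizes and variable names are hypothetical and are not taken from the original work.

```python
import numpy as np

rng = np.random.default_rng(0)
n_basis, n_sf = 6, 3  # hypothetical sizes, chosen only for illustration


def rand_complex(*shape):
    """Random complex array used as a stand-in for real data."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)


# Hermitian positive-definite stand-in for x_{mu nu} of Eq. (16).
A = rand_complex(n_basis, n_basis)
x = A @ A.conj().T + n_basis * np.eye(n_basis)

c = rand_complex(n_basis, n_sf)  # stand-in for c^{mu}_{alpha}
M = rand_complex(n_sf, n_sf)     # stand-in for M^{beta}_{n}

# Eq. (19): lower-triangular Cholesky factor, x = G G^dagger.
G = np.linalg.cholesky(x)

# Eq. (17): E_X as a trace over n of M^dagger c^dagger x c M.
E_17 = np.trace(M.conj().T @ c.conj().T @ x @ c @ M).real

# Eq. (21) and Eq. (20): new variables a and the spherical form of E_X.
a = G.conj().T @ c @ M
E_20 = np.sum(np.abs(a) ** 2)

print(E_17, E_20)  # the two values should agree to rounding error
```

Every coefficient $a_{kn}$ enters Eq. (20) with the same unit curvature, which is what is meant by the energy surface being spherical in these variables.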

In a steepest descents procedure (what follows generalises easily to the conjugate gradients method), a line minimization is performed along the steepest descents search direction to find the new values $a'_{kn}$ of the coefficients:

\begin{displaymath}
a'_{kn} = a_{kn} - \lambda \frac{\partial E_{X}}{\partial a^{\ast}_{kn}},
\end{displaymath} (22)

where $\lambda$ is chosen to minimize the energy. We wish to minimize the energy with respect to the coefficients $c^{\mu}_{\ \alpha}$, yet the functional is spherical (and hence preconditioned) in the new coefficients $a_{kn}$. In order to find the new values $c'^{\mu}_{\ \alpha}$ of the coefficients $c^{\mu}_{\ \alpha}$ that minimize the energy, we use the chain rule to write
\begin{displaymath}
\frac{\partial E_{X}}{\partial a^{\ast}_{kn}} =
\frac{\partial E_{X}}{\partial c^{\mu \ast}_{\ \alpha}}
\left(\frac{\partial c^{\mu}_{\ \alpha}}
{\partial a^{\ }_{kn}}\right)^{\ast},
\end{displaymath} (23)

and from this, and Eqs. (21)-(22), it may be shown that
\begin{displaymath}
c'^{\mu}_{\ \alpha} = c^{\mu}_{\ \alpha} - \lambda (x^{-1})^{\mu\nu}
\frac{\partial E_{X}}{\partial c^{\nu \ast}_{\ \beta}} S^{\ }_{\beta
\alpha},
\end{displaymath} (24)

where we have used the relations
\begin{displaymath}
\sum_{n} (M^{-\dagger})_{\alpha n}(M^{-1})_{n\beta}
= S_{\alpha\beta},
\end{displaymath} (25)

and
\begin{displaymath}
\sum_{k} (G^{-\dagger})^{\mu}_{\ k} (G^{-1})_{k}^{\ \nu}
= (x^{-1})^{\mu\nu},
\end{displaymath} (26)

obtained from Eqs. (6) and (19), respectively.
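The step from Eqs. (21)-(23) to Eq. (24) can also be verified numerically. The sketch below assumes NumPy, random stand-in matrices, and an arbitrary array `g_c` playing the role of the gradient with respect to $c^{\nu \ast}_{\ \beta}$; all names and sizes are hypothetical. It takes a steepest descents step in the variables $a_{kn}$, maps the result back to the coefficients, and compares it with the update in which the gradient is premultiplied by $\mathbf{x}^{-1}$ and postmultiplied by $\mathbf{S}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n_basis, n_sf = 5, 3  # hypothetical sizes, chosen only for illustration


def rand_complex(*shape):
    """Random complex array used as a stand-in for real data."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)


# Positive-definite x and its Cholesky factor, x = G G^dagger (Eq. 19).
A = rand_complex(n_basis, n_basis)
x = A @ A.conj().T + n_basis * np.eye(n_basis)
G = np.linalg.cholesky(x)

# Well-conditioned square M, with S defined through M M^dagger = S^{-1} (Eq. 18).
M = np.eye(n_sf) + 0.3 * rand_complex(n_sf, n_sf)
S = np.linalg.inv(M @ M.conj().T)

c = rand_complex(n_basis, n_sf)
g_c = rand_complex(n_basis, n_sf)  # stand-in for the gradient w.r.t. c*
lam = 0.1                          # arbitrary step length

G_inv = np.linalg.inv(G)
M_inv = np.linalg.inv(M)

# Chain rule, Eq. (23), using c = G^{-dagger} a M^{-1} from Eq. (21).
g_a = G_inv @ g_c @ M_inv.conj().T

# Steepest descents step in a, Eq. (22), mapped back to c.
a = G.conj().T @ c @ M
a_new = a - lam * g_a
c_via_a = G_inv.conj().T @ a_new @ M_inv

# Direct preconditioned update, Eq. (24).
c_eq24 = c - lam * np.linalg.solve(x, g_c) @ S

print(np.max(np.abs(c_via_a - c_eq24)))  # should be at rounding-error level
```

Since the comparison holds for an arbitrary `g_c`, it depends only on the relations (25) and (26) and not on the particular form of the energy.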

Choosing the model Hamiltonian $\hat{X}$ introduced in Eq. (14), and defining

\begin{displaymath}
s^{\ }_{\mu\nu} = \int D^{\ast}_{\mu}(\mathbf{r}) D^{\
}_{\nu}(\mathbf{r}) \mathrm{d} \mathbf{r},
\end{displaymath} (27)

\begin{displaymath}
t^{\ }_{\mu\nu} = -\int D^{\ast}_{\mu}(\mathbf{r}) \nabla^{2}
D^{\ }_{\nu}(\mathbf{r}) \mathrm{d} \mathbf{r},
\end{displaymath} (28)

Eq. (24) becomes
\begin{displaymath}
c'^{\mu}_{\ \alpha} = c^{\mu}_{\ \alpha} - \lambda
\left[\left(\mathbf{s} + \mathbf{t}/k^{2}_{0}\right)^{-1}\right]^{\mu\nu}
\frac{\partial E}{\partial c^{\nu \ast}_{\ \beta}}
S^{\ }_{\beta \alpha} ,
\end{displaymath} (29)

where, following the discussion in Section 3, we have replaced the model energy $E_{X}$ with the true energy $E$. We see from Eq. (29) that preconditioning is effected by premultiplying the steepest descent gradient by the matrix $(\mathbf{s} + \mathbf{t}/k^{2}_{0})^{-1}$ and postmultiplying it by $\mathbf{S}$.
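As an illustration of how the preconditioner might be applied in practice, the sketch below assumes NumPy and a hypothetical toy basis: points on a periodic one-dimensional grid, for which $\mathbf{s}$ is the identity and $\mathbf{t}$ is a second-order finite-difference approximation to $-\mathrm{d}^{2}/\mathrm{d}x^{2}$. The function name `precondition_gradient`, the grid parameters and the value of $k_{0}$ are illustrative assumptions and are not taken from the original work.

```python
import numpy as np


def precondition_gradient(grad, s, t, S, k0):
    """Return (s + t/k0^2)^{-1} @ grad @ S, the search direction of Eq. (29).

    A linear system is solved instead of forming the inverse explicitly.
    """
    x = s + t / k0**2
    return np.linalg.solve(x, grad) @ S


# Hypothetical toy basis: n points on a periodic 1D grid with spacing h.
n, h, k0 = 32, 0.2, 2.0
identity = np.eye(n)
s = identity
t = (2.0 * identity
     - np.roll(identity, 1, axis=1)
     - np.roll(identity, -1, axis=1)) / h**2

# Hypothetical overlap matrix for three orthonormal support functions.
S = np.eye(3)

rng = np.random.default_rng(2)
grad = rng.normal(size=(n, 3))  # stand-in for the gradient w.r.t. c*
direction = precondition_gradient(grad, s, t, S, k0)

# Spectrum of the kinetic matrix before and after preconditioning.
kin_eigs = np.linalg.eigvalsh(t)
pre_eigs = np.linalg.eigvals(np.linalg.solve(s + t / k0**2, t)).real

print(kin_eigs.max(), pre_eigs.max())
```

In this toy case each kinetic eigenvalue $\epsilon$ is mapped to $\epsilon / (1 + \epsilon/k_{0}^{2})$, which remains below $k_{0}^{2}$ however large $\epsilon$ becomes, so the high-kinetic-energy components that dominate the unpreconditioned spectrum are compressed.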


Arash Mostofi 2003-10-28