Next: 6.2 Corrected penalty functional Up: 6. Penalty Functionals Previous: 6. Penalty Functionals Contents

Subsections

6.1 Kohn's method

6.1.1 Variational principle

As mentioned in section 4.4.3, Kohn [135] has suggested the use of a penalty functional to impose the idempotency condition, and has proved a variational principle based upon it. We consider trial density-matrices $\rho({\bf r},{\bf r'})$ expressed in diagonal form with real orthonormal extended orbitals $\{ \varphi_i({\bf r}) \}$ and occupation numbers $\{f_i\}$ :

$\begin{displaymath} {\rho}({\bf r},{\bf r'}) = \sum_i f_i~ {\varphi}_i({\bf r}) {\varphi}_i({\bf r'}) . \end{displaymath}$

(6.1)

The functional ${\cal Q}[{\rho};\mu,\alpha]$ is then formed:

$\begin{displaymath} {\cal Q}[{\rho};\mu,\alpha] \equiv E_{\mathrm{NI}}[{\rho}^2] - \mu N[{\rho}^2] + \alpha {\cal P}[{\rho}] \end{displaymath}$

(6.2)

in which

$\displaystyle E_{\mathrm{NI}}[{\rho}^2]$	$\textstyle \equiv$	$\displaystyle 2 \int {\mathrm d}{\bf r'} \left\{ \left[ -{\textstyle{1 \over 2}... ...\bf r'}) V_{\mathrm{KS}}({\bf r'}) \right\} = 2 \sum_i f_i^2~ {\varepsilon}_i ,$	(6.3)
$\displaystyle N[{\rho}^2]$	$\textstyle \equiv$	$\displaystyle 2 \int {\mathrm d}{\bf r}~ \rho^2({\bf r},{\bf r}) = 2 \sum_i f_i^2 ,$	(6.4)
$\displaystyle {\cal P}[{\rho}]$	$\textstyle \equiv$	$\displaystyle \left[ \int {\mathrm d}{\bf r}~ \left( {\rho}^2(1 - {\rho})^2 \ri... ...r}) \right]^{1 \over 2} = \left[ \sum_i f_i^2 (1 - f_i)^2 \right]^{1 \over 2} ,$	(6.5)

and where $\mu$ is the chemical potential and $\alpha$ is a positive real parameter.

Kohn proves the following variational principle: that for some $\alpha > \alpha_{\mathrm c}$ , the minimum value of ${\cal Q}[{\rho};\mu,\alpha]$ is obtained for the idempotent ground-state density-matrix $\rho_0$ and that the minimum value is the ground-state grand potential i.e.

$\begin{displaymath} \mathop{\rm min}\limits _{\rho} {\cal Q}[{\rho};\mu,\alpha] ... ...m_{i,\varepsilon^{(0)}_i \leq \mu} (\varepsilon^{(0)}_i - \mu) \end{displaymath}$

(6.6)

in which the $\{ \varepsilon^{(0)}_i \}$ are the exact eigenvalues of the self-consistent Hamiltonian, generated by the ground-state density-matrix $\rho_0$ .

The critical value of $\alpha$ , denoted $\alpha_{\mathrm c}$ , is given by

$\begin{displaymath} \alpha_{\mathrm c} = \mathop{\rm max}\limits _{{\cal P}'} \l... ...athrm d} \Omega({\cal P}')} {{\mathrm d}{\cal P}'} \right\vert \end{displaymath}$

(6.7)

in which $\Omega({\cal P}')$ is the conditional minimum defined by

$\begin{displaymath} \Omega({\cal P}') = \mathop{\rm min}\limits _{{\cal P}[{\rho... ...}'} \left( E_{\mathrm{NI}}[{\rho}^2] - \mu N[{\rho}^2] \right) \end{displaymath}$

(6.8)

i.e. the minimum grand potential for all trial density-matrices which give a penalty functional value of ${\cal P}'$ . Clearly

$\begin{displaymath} \alpha_{\mathrm c} \geq \left\vert \frac{{\mathrm d} \Omega(... ..._i \leq \mu} (\varepsilon^{(0)}_i - \mu)^2 \right]^{1 \over 2} \end{displaymath}$

(6.9)

although this is only a lower bound on $\alpha_{\mathrm c}$ .

**Figure 6.1:** Behaviour of Kohn's penalty functional ${\cal P}[\rho]$ when a single occupation number is varied and all others are zero or unity.
$\begin{picture}(285,199) \put(28,0){\includegraphics [width=8cm]{penfun.eps}} \put(6,171){${\cal P}[\rho]$} \put(262,6){$f_i$} \end{picture}$

Kohn's variational principle is based on the non-interacting energy $E_{\mathrm{NI}}[\rho^2]$ . We now present a simple modification of this functional based upon self-consistent variation of the interacting energy. Consider the functional

$\begin{displaymath} {\tilde {\cal Q}}[{\rho};\mu,\alpha] = E[{\rho}] - \mu N[{\rho}] + \alpha {\cal P}[{\rho}] \end{displaymath}$

(6.10)

in which $E[{\rho}]$ is the interacting energy, and ${\rho}$ is a positive semi-definite trial density-matrix. A given set of occupation numbers $\{f_i\}$ fixes the value of the penalty functional ${\cal P}[{\rho}]$ and variation of ${\tilde {\cal Q}}[{\rho};\mu,\alpha]$ with respect to the orbitals $\{ { \varphi}_i({\bf r}) \}$ at fixed occupation numbers and subject to the orthonormality constraint yields Kohn-Sham-like equations. Self-consistent variation of the occupation numbers $\{f_i\}$ (i.e. allowing the orbitals to relax, as in section 4.2) yields

$\begin{displaymath} \frac{\partial {\tilde {\cal Q}}[{\rho};\mu,\alpha]}{\partia... ... + \frac{\alpha}{{\cal P}[{\rho}]} f_i (1 - f_i) (1 - 2 f_i) . \end{displaymath}$

(6.11)

In the case of idempotent density-matrices, for which ${\cal P}[{\rho}]=0$ , we obtain the special cases

$\begin{displaymath} \left.\frac{\partial {\tilde Q}[{\rho};\mu,\alpha]} {\partia... ..._i=(0^{\pm},1^{\pm})} = 2 ({\varepsilon}_i - \mu) \pm \alpha . \end{displaymath}$

(6.12)

For this functional the critical value of $\alpha$ , again denoted $\alpha_{\mathrm c}$ , is given by

$\begin{displaymath} \alpha_{\mathrm c} = \mathop{\rm max}\limits _{\rho} \left\vert 2({\varepsilon}_i - \mu) \right\vert \end{displaymath}$

(6.13)

where the maximum is strictly over those density-matrices searched during the minimisation. For $\alpha > \alpha_{\mathrm c}$ the total functional ${\tilde {\cal Q}}[{\rho};\mu,\alpha]$ takes its minimum value when

for ${\varepsilon}_i > \mu$ and ${\varepsilon}_i < \mu$ respectively. In particular, for the ground-state density-matrix $\rho_0$ , the functional is strictly increasing with respect to all variations in occupation numbers. The discontinuity in the occupation number derivative of the penalty functional at idempotency is required because of the non-variational behaviour of the total energy with respect to these variations (section 4.2). The behaviour of the penalty functional for unconstrained occupation number variation is plotted in figure 6.1, and in figure 6.2 the total functional is sketched schematically for several representative values of the parameter $\alpha$ . This demonstrates how the minimising density-matrix is idempotent only for $\alpha \geq \alpha_{\mathrm c}$ .

**Figure 6.2:** Schematic illustration of Kohn's variational principle: behaviour of the total energy (black) and total functional (red) for representative values of $\alpha$ .
$\includegraphics [width=13cm]{kohnfun.eps}$

6.1.2 Implementation problems

The conjugate gradients algorithm for minimising functions is described in appendix B. Throughout the lengthy derivation it is clear that the useful results obtained and the remarkably simple final result are due to the special properties of quadratic functions. Any function may be expanded in the form of a Taylor series about an analytic point, and around a minimum where the first order term from the gradient vanishes, a quadratic function is generally a good approximation. However, we note that the Kohn penalty functional has a branch point from the square-root function exactly at the ground-state minimum which we seek, and so the function cannot be Taylor-expanded there. Local information from the gradient cannot be used to infer the global shape of the function. This is illustrated in figure 6.3 for the case of a parabolic interpolation to find a line minimum based upon the gradient and a trial step, but the problem is even worse in the multi-dimensional space since the ``conjugate'' directions constructed from the gradients will not point in the direction of the ground-state minimum.

**Figure 6.3:** Failure of quadratic interpolation for Kohn's penalty functional.
$\includegraphics [height=8cm]{oops.eps}$

This problem is reflected in the very poor convergence when an attempt is made to minimise the functional using conjugate gradients: the steepest descents method actually performs better because it does not assume global quadratic behaviour. Also, the penalty functional does not vanish at the minimum sufficiently quickly as the parameter $\alpha$ is increased. However, the root-mean-square error in the occupation numbers $\overline{\delta f}$ , given by

$\begin{displaymath} \overline{\delta f} = \frac{{\cal P}[\rho]}{\sqrt{N/2}} \end{displaymath}$

(6.14)

in fact decays more rapidly, so that the total energy calculated at the minimum is quite accurate, although it is neither variational nor an upper bound. Also, although the total functional ${\tilde {\cal Q}}[\rho;\mu,\alpha]$ decreases monotonically, the total energy does not. Thus no advantage is gained by using the variational property, since it can only be applied to the total energy when ${\cal P}[\rho] = 0$ . The variational property of the total functional is that it is minimal at the ground-state, but this minimum is defined in terms of the functional taking its minimum value there, not in terms of a vanishing gradient (the gradient being undefined at the ground-state). Because of the non-variational behaviour of the total energy with respect to the occupation numbers at the ground-state, it is impossible to construct a penalty functional which has a continuous first derivative at the ground-state and also results in a variational principle for the total energy.

**Figure 6.4:** Convergence properties of Kohn's penalty functional: behaviour of penalty functional and occupation numbers with $\alpha$ .
$\begin{picture}(427,242) \put(71,14){\includegraphics [width=10cm]{kohn.eps}} \p... ...$\alpha {\cal P}[\rho]$\ /eV} \put(370,142){$\overline{\delta f}$} \end{picture}$

In figure 6.4 we present the results of tests on an 8-atom silicon cell to demonstrate the behaviour of the functional. As the penalty functional parameter $\alpha$ is increased, both the contribution of the penalty functional to the total functional $\alpha {\cal P}[\rho]$ , and the root mean square error in the occupation numbers $\overline{\delta f}$ decrease, but not rapidly enough with $\alpha$ since the number of iterations required to reach convergence increases with $\alpha$ making the calculations too expensive for practical applications. For example, the number of iterations required to converge the total functional to eV per atom increases by a factor of more than ten when $\alpha$ is increased from 100 eV to 1000 eV. Even with the smaller value for $\alpha$ , the rate of convergence is much slower than traditional methods, and this is due to the incompatibility of the functional with the conjugate gradients scheme.

Next: 6.2 Corrected penalty functional Up: 6. Penalty Functionals Previous: 6. Penalty Functionals Contents

Peter Haynes