next up previous
Next: Bibliography Up: Paper abstract

Comparison of variational real-space representations of the kinetic energy operator

Chris-Kriton Skylaris, Oswaldo Diéguez, Peter D. Haynes and Mike C. Payne

Theory of Condensed Matter, Cavendish Laboratory, Madingley Road, Cambridge CB3 0HE, UK


We present a comparison of real-space methods based on regular grids for electronic structure calculations that are designed to have basis set variational properties, using as a reference the conventional method of finite differences (a real-space method that is not variational) and the reciprocal-space plane-wave method which is fully variational. We find that a new definition of the finite difference method [P. Maragakis, J. Soler, and E. Kaxiras, Phys. Rev. B 64, 193101 (2001)] satisfies one of the two properties of variational behaviour at the cost of larger errors than the conventional finite difference method. On the other hand, a technique which represents functions in a number of plane-waves which is independent of system size, closely follows the plane-wave method and therefore also the criteria for variational behaviour. Its application is only limited by the requirement of having functions strictly localised in regions of real-space but this is a characteristic of an increasing number of modern real-space methods as they are designed to have a computational cost that scales linearly with system-size.

PACS: 31.15.Fx, 71.15.Nc, 71.15.Ap

Electronic structure methods usually require the solution of Schrödinger's equation

\hat{H}\psi_0 = \varepsilon_0 \psi_0
\end{displaymath} (1)

for the ground state energy $\varepsilon_0$ and wavefunction $\psi_0$ of the Hamiltonian operator $\hat{H}$ of the system. Exact solutions exist only in very simple cases. A common practical way of approximating the solution of (1) is by expanding $\psi_0$ in terms of a set of basis functions. It is straightforward to show that this basis set approach has two desirable properties:
  1. The approximations to $\varepsilon_0$ are always upper bounds to the exact $\varepsilon_0$, so any algorithm that minimises $\varepsilon_0$ with respect to $\psi_0$ will yield the optimal solution for a given basis set.
  2. The leading term of the error $\delta \varepsilon_0$ in the energy is proportional to the square of the error $\delta \psi_0$ in the wavefunction and thus the energy displays quadratic convergence with increasing basis set size.
These two properties are referred to as the variational principle and are characteristic of the basis set approximation. Basis sets in common use for calculations on molecules and solids include Gaussian functions [2,1], Slater functions [3], plane-waves [4], spherical Bessel functions [5], pseudo-atomic orbitals [6,7], wavelets [8], ``blip'' cubic splines [9], and finite elements [10]. In particular, the second property is important because it means that even a relatively poor basis set can give relatively good results for the energy, thus saving the sometimes considerable cost associated with using large basis sets.

In recent years, the ability to perform calculations using only quantities that are localised in real space, such as the density matrix or Wannier-type orbitals, has led to linear-scaling electronic structure methods that can deal with thousands of atoms [11]. As a consequence, the importance of localised basis sets has grown because they are required for the efficient representation of localised functions. Delocalised basis sets such as plane-waves are not suitable as they make the cost of the electronic structure calculations scale as the cube of the system size.

Basis set expansion is by no means the only practical approximation in electronic structure theory. A set of important alternatives are the ``real-space'' methods [12,13] which use a real-space grid to express their solutions. The representation of functions directly on a real-space grid, either regular [12] or adapted to the positions of the atoms [14], simplifies the application of localisation constraints and for this reason the importance of real-space methods in recent years has grown in parallel with that of the localised basis set methods. Real-space methods based on a regular grid bear some similarity to the plane-wave pseudopotential approach as the spacing of the real-space grid defines a plane-wave kinetic energy cutoff and is the parameter with respect to which the solutions are converged. Most real-space methods are based on the finite difference (FD) approach, so they are not variational, contrary to the plane-wave basis set method. This is due to the fact that the Laplacian operator for the kinetic energy in the Hamiltonian is represented, or rather approximated, as a FD expansion

\frac{\partial^2 \phi}{\partial x^2}(x_i, y_j, z_k)
\simeq \...
C_n^{(A)} \phi(x_i+nh_x, y_j, z_k)
\end{displaymath} (2)

on every grid point $(x_i, y_j, z_k)$, where $h_x$ is the grid spacing in the $x$-direction. Using (2) is equivalent to applying the exact Laplacian operator to an approximation of the solution by a polynomial of degree $A$ at each point of the grid [15]. Therefore, the Hamiltonian operator changes as the grid spacing is varied. One consequence of this is the lack of variational behaviour in the solution process, a well known feature of real-space methods [12].

Maragakis et al. [16] correctly recognised that the FD coefficients $C_n^{(A)}$ of equation (2) systematically underestimate the kinetic energy. They have redetermined them so that they always slightly overestimate it, provided one assumes that the solution on the grid is the real-space representation of a plane-wave expansion. In practice, this means that with their new FD coefficients, energies always converge from above as the grid is made finer. So at least one of the desirable features of the variational principle is restored with this amended FD approach.

A completely different approach which attempts to combine the merits of the plane-wave basis set method with the localisation properties of the real-space methods is the FFT box technique [17,18,19]. Here a plane-wave basis is assumed but the functions are kept localised as they are represented on a real-space grid from the outset. A set of plane-waves which is proportional to the number of grid points in the regions of the localised functions and independent of the size of the system is used whenever the functions are represented in reciprocal-space. Functions localised in real-space are smooth and delocalised in reciprocal-space, therefore coarse sampling in reciprocal-space is a very good approximation [17]. There is no modification of the Hamiltonian operator but an intermediate truncation of the basis set is performed, so the variational behaviour cannot be guaranteed, but can be expected as a consequence of the close resemblance in practice of the FFT box method to the full plane-wave approach.

To assess the variational properties of the new methods, we will first use the exactly solvable model of the harmonic oscillator that Maragakis et al. used for their tests.

Figure 1: Error in energy as a function of the error in the wavefunction for the harmonic oscillator (arbitrary units). The curves represent approximations to the ground state solution on a real-space grid as the number of points is increased from 16 to 18 to 20, etc. in the interval [-10, 10]. The conventional and Maragakis et al.[16] FD order $A=12$ methods were used, as well as the plane-wave basis set expansion method (the number of plane-waves that can be represented on the real-space grid without aliasing is equal to the number of grid points).

\scalebox{0.7}{\includegraphics*[0cm,1cm][20cm,28cm]{errors.eps} }

As they established in their paper that their modified FD method converges from above, here we examine whether the quadratic convergence criterion of the variational principle is also satisfied. We compare their method for $A=12$ with the conventional FD $A=12$ method and with the plane-wave expansion method which is variational by definition. In Figure 1 we plot the absolute error in the ground state energy as a function of the RMS error in the ground state wavefunction. The behaviour of the two FD methods is very similar, their curves in Figure 1 almost coincide and have a slope close to unity or in other words the error in the energy depends only linearly on the error in the wavefunction. On the other hand, the plane-wave curve has a slope close to 2 and becomes parallel with the parabolic line (slope 2) as the error in the wavefunction decreases, so it is truly variational as expected.

The fact that none of the FD methods exhibit the quadratic convergence of variational behaviour may not be a serious drawback for practical calculations since computational limitations always preclude high convergence with respect to grid spacing. However, the speed of convergence is important as it will determine the speed at which relative energy differences converge. To examine this aspect we need to consider an example on a real system. For this purpose, we have performed LDA density-functional calculations on a silane molecule inside a large cubic simulation cell of (40 a$_0$)$^3$. The atomic cores are represented by norm-conserving pseudopotentials [20] and the charge density is expressed in terms of pseudo-atomic orbitals [6] centred on the atoms and optimised for each pseudopotential in spherical regions of 8.0 a$_0$ radius. We compare the FD methods (both the conventional one and Maragakis et al. variant) of order $A=12$ and 28 with the FFT box technique, and with the conventional plane-wave basis set approach. In their finite difference method, Chelikowsky et al. [13] found that order 12 sufficed for their calculations. We have chosen $A = 28$ as an example of very high order of discretization. In all FD calculations we solve the Poisson equation for the Hartree potential using the conventional FD coefficients, as Maragakis et al. do, so the only difference between the two forms of FD is in the calculation of the kinetic energy. Figure 2 shows the convergence of the total energy as a function of the number of grid points in every lattice vector direction.

Figure 2: Convergence of the total energy of a silane molecule in a cubic cell with respect to the number of grid points in each dimension. The energy is calculated with conventional finite differences (FD) of order $A=12$ ( $\bigtriangleup $) and 28 ( $\bigtriangledown $), with the Maragakis et al. [16] variational FD of order $A=12$ ($\Box $) and 28 ($\Diamond $), with the FFT box technique [17] ($\ast $) and with the plane-wave basis set method ($\bigcirc $ joined by broken line). The FFT box curve almost coincides with the plane-wave curve on the scale of these graphs. Since we are using an FFT box large enough to enclose any pair of support functions [17], its size will be (32 a$_0$)$^3$, which implies using $24 \times 24 \times 24$ grid points for the smallest number of points per cell dimension (30) considered in the graph.

\scalebox{0.7}{\includegraphics*[0cm,1cm][25cm,19cm]{} }

The convergence of the energy with the conventional FD methods is not monotonic although the effect is minimised as the order increases. At coarse grid spacings, FD converges from above since there are not enough grid points to accurately represent the charge density near the ions which is the major contribution to the total energy. When the grid spacing becomes finer than some threshold that depends on the order of the FD, the pseudopotential contribution lowers the energy, but to a value which is less than the grid-converged result because of the systematic underestimation of the kinetic energy. In the case of the Maragakis et al. FD, the convergence with grid spacing is monotonic and always from above, as expected and as we have also confirmed in tests with other systems. In terms of the speed of convergence and deviation from the plane-wave curve which is variational by definition, the conventional FDs do quite well starting with errors in the order of m$E_h$ which decrease rapidly with the number of points. The convergence of the variational FD is not as good on the other hand as the size of its errors is larger and its rate of convergence is slower. The FFT box technique closely follows the plane-wave curve as it is expected to do and has errors that are at least an order of magnitude smaller than the FD $A=12$ curve throughout its range, so it apears to coincide with the plane-wave curve on the scale of this graph. Of course the agreement improves when $A = 28$, but it can be seen that the FFT box method still works better than this very high order FD method.

The computational cost of both FD methods is the same: for an order $A$ FD applied to a function localised within a sphere $D$ grid points in diameter, the number of operations to calculate the Laplacian at all grid points where it is non-zero [17] is roughly $2 (3A+1) (A+D)^3$. For the same function, the FFT box has volume $(2D)^3$ and this method therefore requires about $(120 \log_2 D + 152) D^3$ operations (assuming radix-2 FFTs for real functions are used). While the scaling of both methods is essentially the same, the FD method has a smaller prefactor if the discretization order is not very high. As an example, consider spheres of radius $8.0   {\rm a}_0$ and a kinetic energy cutoff of $20   E_h$ corresponding to a grid spacing of $0.5   {\rm a}_0$, so that $D = 32$. For $A=12$, $N_{\rm ops}^{\rm FD} = 6 \times 10^6$ whereas $N_{\rm ops}^{\rm FFT} = 25 \times 10^6$, a factor of 4 larger. The FFT box method is more competitive for smaller spheres, but it should of course be remembered that a higher order FD would need to be used in practice to achieve the accuracy of the FFT box method. For example, if we take $A = 28$ in the mentioned test case , then $N_{\rm ops}^{\rm FD} = 37 \times 10^6$, which is a 50% more than the number of operations required using the FFT box method. Both the FD and FFT box methods are suitable for implementation on parallel computers. For the FFT box method, the spherical regions can be partitioned among the nodes so that the FFTs involved in applying the kinetic energy operator can all be performed locally. Some communication would then be required in order to calculate the inner product of this result with functions in other regions, but this would also be necessary in the case of the FD method.

In summary, we have carried out a comparison of methods for electronic structure calculations directly in real-space in order to examine if and to what extent they can display the variational behaviour of basis sets. The FD representation of the Laplacian for the kinetic energy operator systematically underestimates the kinetic energy and is the source of the problems as correctly identified by Maragakis et al. Their proposed modification of FD coefficients systematically overestimates the kinetic energy and causes the energy to converge from above albeit with errors up to an order of magnitude larger than conventional FD. In practice this means that finer grids may be necessary and this could be a high price to pay in terms of computational cost. None of the FD methods display quadratic convergence of the energy with respect to the wavefunction which is the second characteristic of the variational principle. For cases where the functions are strictly localised in regions of the real-space grid, as with most real-space methods intended for linear-scaling calculations [21,7], the FFT box technique could be the method of choice as it is designed to follow closely the behaviour of the fully variational plane-wave method.

C.K.S. would like to thank the EPSRC (grant number GR/M75525) for postdoctoral research funding. O.D. would like to thank the European Commission for a Research Training Network fellowship (grant number HPRN-CT-2000-00154). P.D.H. would like to thank Magdalene College, Cambridge, for a research fellowship.

next up previous
Next: Bibliography Up: Paper abstract
Peter D. Haynes 2002-10-25