

3. Traditional parallel implementation

The traditional distribution of data in electronic structure calculations is shown schematically in Fig. 1 for the case of a $4 \times 6 \times 8$ grid and 4 nodes. When applying the potential to a trial eigenvector, the data is initially represented in momentum space (on the left-hand side of Fig. 1) and each node holds a number of ``rods'' of data running in the $z$-direction. In the first stage of the 3D-FFT, each node performs a 1D-FFT in the $z$-direction on each of its rods. The nodes then communicate to effect a transpose in which the data is redistributed from ``$z$-rods'' to ``$y$-rods'' (middle of Fig. 1). Each node then performs a second 1D-FFT, in the $y$-direction, on these rods. A second communication stage transposes the data to ``$x$-rods'' (right of Fig. 1), and the final stage is to perform a 1D-FFT on these $x$-rods. The transform from real space back to momentum space is performed similarly, by reversing the order of these operations.

Figure 1: Distribution of data for traditional implementation.
\includegraphics [height=58mm]{old.eps}
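
The three stages described above can be sketched on a single process as follows. This is only an illustration under stated assumptions, not the implementation used in any particular code: the nodes are simulated as entries of a Python list, the two communication stages are replaced by in-memory gathers and transposes, rods are assigned to nodes in contiguous blocks, and NumPy's forward transform `np.fft.fft` stands in for the 1D-FFT (the sign and normalisation conventions are not fixed by the text). The names `split_rods` and `staged_fft3d` are hypothetical.

```python
import numpy as np


def split_rods(n_rods, n_nodes):
    """Assign rod indices to nodes in contiguous blocks (assumed layout)."""
    return np.array_split(np.arange(n_rods), n_nodes)


def fft_rods_per_node(rods, n_nodes):
    """Each simulated node applies a 1D-FFT along every rod it owns."""
    owned = split_rods(rods.shape[0], n_nodes)
    local = [np.fft.fft(rods[idx], axis=1) for idx in owned]
    # Concatenating the per-node results stands in for the communication step.
    return np.concatenate(local)


def staged_fft3d(data, n_nodes=4):
    """3D-FFT built from 1D-FFTs on z-rods, then y-rods, then x-rods."""
    nx, ny, nz = data.shape

    # Stage 1: z-rods, one rod per (x, y) pair.
    rods = data.reshape(nx * ny, nz)
    full = fft_rods_per_node(rods, n_nodes).reshape(nx, ny, nz)

    # Transpose 1: redistribute as y-rods, one rod per (x, z) pair.
    rods = full.transpose(0, 2, 1).reshape(nx * nz, ny)
    full = fft_rods_per_node(rods, n_nodes).reshape(nx, nz, ny)

    # Transpose 2: redistribute as x-rods, one rod per (y, z) pair.
    rods = full.transpose(2, 1, 0).reshape(ny * nz, nx)
    full = fft_rods_per_node(rods, n_nodes).reshape(ny, nz, nx)

    # Restore the (x, y, z) axis order for comparison with a direct 3D-FFT.
    return full.transpose(2, 0, 1)


# The 4 x 6 x 8 grid and 4 nodes of Fig. 1, filled with random test data.
rng = np.random.default_rng(0)
grid = rng.standard_normal((4, 6, 8)) + 1j * rng.standard_normal((4, 6, 8))
result = staged_fft3d(grid, n_nodes=4)
assert np.allclose(result, np.fft.fftn(grid))
```

With this block assignment and the grid of Fig. 1, each of the 4 simulated nodes owns 6 of the 24 $z$-rods, then 8 of the 32 $y$-rods, then 12 of the 48 $x$-rods. In a real parallel code the two transposes are the communication stages, in which each node exchanges data with the others.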


Peter Haynes