8. Conclusions

We have presented a new method for performing FFTs on parallel computers that scales to a larger number of nodes than the traditional method because of its reduced latency cost. This is achieved by taking advantage of the data distribution inherently required by the FFT algorithm. The method is applicable to electronic structure calculations, where the FFT grids are small, and is most effective on clusters of workstations, where communication costs are high. The new method automatically satisfies the demand for load balancing and effectively blocks the Hamiltonian matrix, which may allow new iterative diagonalisation algorithms for block matrices to be applied.
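The role of latency for small grids can be illustrated with a simple latency-bandwidth cost model. The sketch below is not the method presented here: it assumes a generic all-to-all transpose in which each of P nodes sends one message to every other node, and the function names, latency, bandwidth and grid size are all illustrative assumptions. Under those assumptions, for a fixed small grid the latency term grows with the number of nodes while the data-transfer term shrinks.

```python
def transpose_cost(n_points, n_nodes, latency=50e-6, bandwidth=10e6,
                   bytes_per_point=16):
    """Estimate (latency_time, transfer_time) in seconds per node for one
    all-to-all transpose of a grid spread evenly over n_nodes.

    Assumptions (illustrative, not taken from the paper):
      - each node sends one message to each of the other nodes;
      - latency is a fixed start-up cost per message, in seconds;
      - bandwidth is in bytes per second;
      - each grid point is one double-precision complex number (16 bytes).
    """
    messages_per_node = n_nodes - 1
    # Data held by a single node.
    local_bytes = n_points * bytes_per_point / n_nodes
    # Fraction of the local data that must be sent to other nodes.
    sent_bytes = local_bytes * (n_nodes - 1) / n_nodes
    latency_time = messages_per_node * latency
    transfer_time = sent_bytes / bandwidth
    return latency_time, transfer_time


def latency_fraction(n_points, n_nodes, **model_params):
    """Fraction of the modelled communication time spent on latency."""
    latency_time, transfer_time = transpose_cost(n_points, n_nodes,
                                                 **model_params)
    total = latency_time + transfer_time
    return latency_time / total if total > 0 else 0.0


if __name__ == "__main__":
    grid_points = 32 ** 3  # a small grid, chosen only for illustration
    for nodes in (2, 4, 8, 16, 32, 64):
        lat, xfer = transpose_cost(grid_points, nodes)
        frac = latency_fraction(grid_points, nodes)
        print(f"P={nodes:3d}  latency={lat:.2e} s  "
              f"transfer={xfer:.2e} s  latency fraction={frac:.2f}")
```

In this model the ratio of latency time to transfer time grows as the square of the number of nodes for a fixed grid, which is why reducing the number of messages matters most for small grids on high-latency networks.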

Peter Haynes