Blue Gene/L is a massively parallel supercomputer organized as a three-dimensional torus of compute nodes. A fundamental challenge in harnessing the new computational capacities of Blue Gene/L is the design and implementation of numerical algorithms that scale effectively on thousands of nodes. A computational kernel of particular importance is the Fast Fourier Transform (FFT) of three-dimensional data. In this paper, we present the approach we are taking in Blue Gene/L to produce a scalable FFT implementation. We rely on a volume decomposition of the data to take advantage of the communication topology of Blue Gene/L. We present experimental results using an MPI-based implementation of our algorithm, in order to test the basic tenets behind our decomposition and to allow experimentation on existing platforms. Our preliminary results indicate that our algorithm scales well for three-dimensional FFTs of size 128 × 128 × 128 to more than 500 nodes.
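As a rough illustration of the two ideas named in the abstract, the following single-process sketch shows (a) that a three-dimensional FFT can be computed as three passes of one-dimensional FFTs, one per axis, and (b) what cutting the data volume into sub-blocks looks like. It is not the authors' MPI implementation: the function names `fft3d_by_axes` and `split_into_blocks`, the 2 × 2 × 2 grid, and the 16 × 16 × 16 test volume are all assumptions made for this example, and no inter-node communication is modeled.

```python
import numpy as np


def fft3d_by_axes(data):
    """Compute a 3D FFT as three successive passes of 1D FFTs, one per axis."""
    out = data.astype(complex)
    for axis in range(3):
        # Each pass transforms every line of the array along one axis.
        out = np.fft.fft(out, axis=axis)
    return out


def split_into_blocks(data, px, py, pz):
    """Cut the volume into px * py * pz sub-blocks (hypothetical helper).

    Each block stands in for the data one node would own under a
    volume decomposition; keys are (i, j, k) grid coordinates.
    """
    blocks = {}
    for i, x_slab in enumerate(np.array_split(data, px, axis=0)):
        for j, y_slab in enumerate(np.array_split(x_slab, py, axis=1)):
            for k, block in enumerate(np.array_split(y_slab, pz, axis=2)):
                blocks[(i, j, k)] = block
    return blocks


# Small illustrative volume (assumed size, far below the paper's 128^3).
rng = np.random.default_rng(0)
n = 16
data = rng.standard_normal((n, n, n))

blocks = split_into_blocks(data, 2, 2, 2)
result = fft3d_by_axes(data)

# The axis-by-axis result agrees with NumPy's direct n-dimensional FFT.
matches_fftn = np.allclose(result, np.fft.fftn(data))
```

In a distributed setting, no single block contains a complete line of data along any axis, so each one-dimensional pass would presumably require an exchange among the nodes that share that line. That exchange is where the interconnect topology matters, and it is omitted here.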
By: Maria Eleftheriou, Jose E. Moreira, Blake G. Fitch, Robert S. Germain
Published in: Lecture Notes in Computer Science, volume 2913, pages 194-203, 2003.