Multiplication of a symmetric banded matrix by a vector on a vector multiprocessor computer
by R. Reuter, U. Scharffenberger, J. Schüle
This paper describes how to vectorize and parallelize the multiplication of a symmetric banded matrix by a vector on a vector multiprocessor. Two packed band storage schemes are considered, and implementations for both are studied. The best of the proposed uniprocessor implementations reaches a maximum of 37.1 Mflops on an IBM Enterprise System/3090 400E with Vector Facility (VF). For one of the schemes, a parallel implementation on an IBM 3090 VF multiprocessor is presented, and its timing measurements are discussed.
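To make the operation concrete, the following is a minimal sketch of a symmetric banded matrix-vector product in Python. The abstract does not say which two packed band storage schemes the paper uses, so the layout here is an assumption: the main diagonal and the k subdiagonals are each stored as one contiguous list (a diagonal-wise lower-band layout). The function names `pack_lower_band` and `sbmv` are hypothetical and chosen only for illustration.

```python
def pack_lower_band(a, k):
    """Pack the lower band of a dense symmetric matrix, diagonal by diagonal.

    Assumed layout: band[d][j] holds a[j + d][j] for d = 0..k,
    so band[0] is the main diagonal and band[d] the d-th subdiagonal.
    """
    n = len(a)
    return [[a[j + d][j] for j in range(n - d)] for d in range(k + 1)]


def sbmv(band, x):
    """Compute y = A x for a symmetric banded A given in the layout above."""
    n = len(x)
    # Main diagonal contribution.
    y = [band[0][j] * x[j] for j in range(n)]
    # Each stored subdiagonal entry a[j + d][j] is used twice:
    # once for the lower triangle and once for its mirror in the upper triangle.
    for d in range(1, len(band)):
        diag = band[d]
        for j in range(n - d):
            y[j + d] += diag[j] * x[j]   # lower-triangle element a[j + d][j]
            y[j] += diag[j] * x[j + d]   # mirrored element a[j][j + d]
    return y


if __name__ == "__main__":
    # Tridiagonal example (k = 1) with 2 on the diagonal and -1 beside it.
    dense = [
        [2, -1, 0, 0],
        [-1, 2, -1, 0],
        [0, -1, 2, -1],
        [0, 0, -1, 2],
    ]
    band = pack_lower_band(dense, 1)
    print(sbmv(band, [1, 2, 3, 4]))  # prints [0, 0, 0, 5]
```

In this layout the inner loops run along whole diagonals, so each one is a long, stride-one update of the kind a vector unit can process. Whether this matches either of the paper's schemes is not stated in the abstract.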