Conference paper
Generalizing of a high performance parallel Strassen implementation on distributed memory MIMD architectures
Document type
Text/Conference Paper
Date
2010
Publisher
Gesellschaft für Informatik e.V.
Abstract
Strassen's algorithm for multiplying two n x n matrices reduces the asymptotic operation count from the O(n^3) of the traditional algorithm to O(n^2.81), so designing an efficient parallelization of this algorithm becomes essential. In this paper, we present our generalization of a parallel Strassen implementation that achieved very good performance on an Intel Paragon: 20% faster for n ≈ 1000 and more than 100% faster for n ≈ 5000 than the traditional parallel algorithms (such as Fox and Cannon). Our method can be applied to all matrix multiplication algorithms on distributed memory computers that use Strassen's algorithm at the system level, and hence it makes it possible to find better parallel implementations of Strassen's algorithm.
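To illustrate where the O(n^2.81) bound in the abstract comes from, the following is a minimal sequential sketch of Strassen's recursion in Python. It is not the parallel implementation described in the paper. The function names (`strassen`, `naive_mul`, `split`) and the `cutoff` parameter are hypothetical, chosen only for this example. The sketch assumes square matrices stored as lists of lists and falls back to the traditional algorithm whenever the size is odd or at most `cutoff`. Each level replaces 8 half-size products with 7, which gives the exponent log2(7) ≈ 2.81.

```python
import math

# Exponent of Strassen's algorithm: 7 half-size products per recursion level.
STRASSEN_EXPONENT = math.log2(7)


def mat_add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a, b):
    """Element-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def naive_mul(a, b):
    """Traditional O(n^3) matrix product."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def split(m):
    """Split an even-sized square matrix into four quadrants."""
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])


def strassen(a, b, cutoff=2):
    """Multiply square matrices a and b with Strassen's seven-product scheme."""
    n = len(a)
    # Fall back to the traditional algorithm for small or odd sizes.
    if n <= cutoff or n % 2:
        return naive_mul(a, b)

    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)

    # The seven recursive products.
    m1 = strassen(mat_add(a11, a22), mat_add(b11, b22), cutoff)
    m2 = strassen(mat_add(a21, a22), b11, cutoff)
    m3 = strassen(a11, mat_sub(b12, b22), cutoff)
    m4 = strassen(a22, mat_sub(b21, b11), cutoff)
    m5 = strassen(mat_add(a11, a12), b22, cutoff)
    m6 = strassen(mat_sub(a21, a11), mat_add(b11, b12), cutoff)
    m7 = strassen(mat_sub(a12, a22), mat_add(b21, b22), cutoff)

    # Combine the products into the four quadrants of the result.
    c11 = mat_add(mat_sub(mat_add(m1, m4), m5), m7)
    c12 = mat_add(m3, m5)
    c21 = mat_add(m2, m4)
    c22 = mat_add(mat_add(mat_sub(m1, m2), m3), m6)

    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom
```

In this sketch the seven products `m1` to `m7` are independent of one another, which is the property a parallel implementation can exploit. How the work is actually distributed on the Intel Paragon is not stated in the abstract.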