Nguyen, Duc Kien; Lavallee, Ivan; Bui, Marc; Eichler, Gerald; Kropf, Peter; Lechner, Ulrike; Meesad, Phayung; Unger, Herwig
2019-01-11
2019-01-11
2010
978-3-88579-259-8
https://dl.gi.de/handle/20.500.12116/19032
Strassen's algorithm for multiplying two n x n matrices reduces the asymptotic operation count from the O(n^3) of the traditional algorithm to O(n^2.81), so designing an efficient parallelization of this algorithm is essential. In this paper, we present a generalization of a parallel Strassen implementation that performed very well on an Intel Paragon: 20% faster for n ≈ 1000 and more than 100% faster for n ≈ 5000 than the parallel traditional algorithms (such as Fox and Cannon). Our method can be applied to all matrix multiplication algorithms on distributed memory computers that use Strassen's algorithm at the system level, and it therefore makes it possible to find better parallel implementations of Strassen's algorithm.
en
Generalizing of a high performance parallel Strassen implementation on distributed memory MIMD architectures
Text/Conference Paper
1617-5468
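
For readers unfamiliar with the operation count quoted in the abstract, the following is a minimal sequential sketch of Strassen's recursion in Python. It is not the parallel implementation described in the paper, and it says nothing about how work is distributed on a MIMD machine. The function names (`strassen`, `naive_mul`, `split`) and the `cutoff` parameter are illustrative assumptions, and the sketch assumes square matrices whose size halves evenly down to the cutoff. Each level replaces 8 half-size products with 7, which is where the exponent log2(7) ≈ 2.81 comes from.

```python
def mat_add(a, b):
    """Element-wise sum of two equally sized matrices (lists of lists)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a, b):
    """Element-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def naive_mul(a, b):
    """Traditional O(n^3) product of two n x n matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def split(m):
    """Split an even-sized square matrix into four quadrants."""
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])


def strassen(a, b, cutoff=2):
    """Strassen product; falls back to naive_mul at or below `cutoff`.

    Assumption for this sketch: n is even at every level above the cutoff.
    """
    n = len(a)
    if n <= cutoff:
        return naive_mul(a, b)
    if n % 2:
        raise ValueError("matrix size must halve evenly down to the cutoff")

    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)

    # Seven recursive half-size products instead of eight.
    m1 = strassen(mat_add(a11, a22), mat_add(b11, b22), cutoff)
    m2 = strassen(mat_add(a21, a22), b11, cutoff)
    m3 = strassen(a11, mat_sub(b12, b22), cutoff)
    m4 = strassen(a22, mat_sub(b21, b11), cutoff)
    m5 = strassen(mat_add(a11, a12), b22, cutoff)
    m6 = strassen(mat_sub(a21, a11), mat_add(b11, b12), cutoff)
    m7 = strassen(mat_sub(a12, a22), mat_add(b21, b22), cutoff)

    # Recombine the products into the four quadrants of the result.
    c11 = mat_add(mat_sub(mat_add(m1, m4), m5), m7)
    c12 = mat_add(m3, m5)
    c21 = mat_add(m2, m4)
    c22 = mat_add(mat_add(mat_sub(m1, m2), m3), m6)

    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom
```

As a quick check, `strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]], cutoff=1)` returns `[[19, 22], [43, 50]]`, matching the traditional product. In practice the cutoff is set well above 1, because the extra additions make the recursion slower than the traditional algorithm on small blocks.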