Performance Engineering for a Medical Imaging Application on the Intel Xeon Phi Accelerator

Hofmann, Johannes; Treibig, Jan; Hager, Georg; Wellein, Gerhard

dc.contributor.author	Hofmann, Johannes
dc.contributor.author	Treibig, Jan
dc.contributor.author	Hager, Georg
dc.contributor.author	Wellein, Gerhard
dc.date.accessioned	2017-06-29T16:28:09Z
dc.date.available	2017-06-29T16:28:09Z
dc.date.issued	2014
dc.description.abstract	We examine the Xeon Phi, which is based on Intel’s Many Integrated Core (MIC) architecture, for its suitability to run the FDK algorithm, the most commonly used algorithm for 3D image reconstruction in cone-beam computed tomography. We study the challenges of efficiently parallelizing the application and the means to enable sensible data sharing between threads despite the lack of a shared last-level cache. Apart from parallelization, SIMD vectorization is critical for good performance on the Xeon Phi; we perform various micro-benchmarks to investigate the platform’s new set of vector instructions, with special emphasis on the newly introduced vector gather capability. We refine a previous performance model for the application and adapt it to the Xeon Phi to validate the performance of our optimized hand-written assembly implementation, as well as that of several different auto-vectorization approaches.	en
dc.identifier.pissn	0177-0454
dc.language.iso	en
dc.publisher	Gesellschaft für Informatik e.V., Fachgruppe PARS
dc.relation.ispartof	PARS-Mitteilungen: Vol. 31, Nr. 1
dc.title	Performance Engineering for a Medical Imaging Application on the Intel Xeon Phi Accelerator	en
dc.type	Text/Journal Article
gi.citation.publisherPlace	Berlin

Files

Original bundle

Name: paper02.pdf
Size: 1.22 MB
Format: Adobe Portable Document Format

Collections

PARS-Mitteilungen 2014