Ching, EdwardEgi, NorbertMortazavi, MasoodCheung, VincentShi, GuangyuPlödereder, E.Grunske, L.Schneider, E.Ull, D.2017-07-262017-07-262014978-3-88579-626-8Modern high-performance server systems with their wide variety of compute resources (i.e. multi-core CPUs, GPUs, FPGAs, etc.) bring vast computing power to the fingertips of researchers and developers. To investigate modern CPU+GPU co-processing architectures and to discover their relative marginal return on resources (“bang for the buck”), we compare the different architectures with main focus on database query processing, provide performance optimization methods and analyze their performance using practical benchmarking approaches. Discrete GPU devices (d-GPUs) are connected to host memory through PCIe which is relatively slower than the internal bus which connects integrated GPU subsystems (i-GPUs) to the CPU and its caches. Our experimental analysis indicates that while massive processing capabilities have grown in d-GPUs, data locality, memory access patterns and I/O bottlenecks continue to govern and impede the speed of database processing functions. While the use of GPU systems (whether discrete or integrated) lead to significant speed-up in primitive data processing operations, the marginal performance returns on deploying teraflops d-GPU compute resources do not appear to be large enough to justify their cost or power consumption. At this point, integrated GPUs provide better cost-performance and energy-performance results in parallel database query processing.enUnleashing the hidden power of integrated-gpus for database co-processingText/Conference Paper1617-5468