Listing of PARS-Mitteilungen 2012, sorted by title
Entries 1–10 of 15
- Journal article: Aktuelle und zukünftige Aktivitäten (Bericht des Sprechers) [Current and future activities (report of the speaker)] (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware, Vol. 29, No. 1, 2012)
- Journal article: 10th Workshop on Parallel Systems and Algorithms PASA 2012 (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware, Vol. 29, No. 1, 2012)
- Journal article: Achieving scalability for job centric monitoring in a distributed infrastructure (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware, Vol. 29, No. 1, 2012)
  Authors: Hilbrich, Marcus; Müller-Pfefferkorn, Ralph
  Job-centric monitoring makes it possible to observe jobs on remote computing resources. It can visualise recorded monitoring data and helps to find faulty or misbehaving jobs. When installations such as grids or clouds are observed, monitoring data from many thousands of jobs have to be handled, so the challenge for a job-centric monitoring infrastructure is to store, search, and access data collected at that scale. We address this challenge with a distributed, layer-based architecture that provides a uniform view of all monitoring data. This paper presents the concept of this infrastructure, called SLAte, and an analysis of its scalability.
- Journal article: ARCS 2013 (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware, Vol. 29, No. 1, 2012)
- Journal article: Euro-Par 2013 Aachen (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware, Vol. 29, No. 1, 2012)
- Journal article: Flexible Scheduling and Thread Allocation for Synchronous Parallel Tasks (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware, Vol. 29, No. 1, 2012)
  Authors: Kessler, Christoph; Hansson, Erik
  We describe a task model and a dynamic scheduling and resource allocation mechanism for synchronous parallel tasks to be executed on SPMD-programmed synchronous shared-memory MIMD parallel architectures with uniform, unit-time memory access and strict memory consistency, known in the literature as PRAMs (Parallel Random Access Machines). Our task model provides a two-tier programming model for PRAMs that flexibly combines SPMD and fork-join parallelism within the same application. It offers flexibility through dynamic scheduling and late resource binding while preserving the PRAM execution properties within each task; the only restriction is that the number of threads assigned to a task cannot exceed what the underlying architecture provides. In particular, our approach opens the door to automatic performance tuning at run time by controlling the thread allocation for tasks based on run-time predictions. Through a prototype implementation of a synchronous parallel task API in the SPMD-based PRAM language Fork and an experimental evaluation with example programs on the SB-PRAM simulator, we show that realizing the task model on an SPMD-programmable PRAM machine is feasible with moderate run-time overhead per task.
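  The fork-join half of the two-tier model described above can be illustrated in a machine-independent way. The following Python sketch is purely hypothetical and reproduces none of the paper's Fork/SB-PRAM specifics; it only shows the general pattern of forking a subtask while respecting a fixed thread budget, the same kind of resource bound the abstract mentions:

  ```python
  import threading

  def fork_join_sum(data, budget):
      """Sum `data` by forking a child thread for one half while the
      current thread handles the other, until the thread budget is spent."""
      if budget <= 1 or len(data) < 2:
          return sum(data)          # budget exhausted: run sequentially
      mid = len(data) // 2
      result = {}

      def child():
          # The child task receives half of the remaining thread budget.
          result["right"] = fork_join_sum(data[mid:], budget // 2)

      t = threading.Thread(target=child)
      t.start()                     # fork
      left = fork_join_sum(data[:mid], budget - budget // 2)
      t.join()                      # join: synchronize with the child task
      return left + result["right"]

  print(fork_join_sum(list(range(100)), budget=4))  # 4950
  ```

  The budget split (`budget // 2` for the child, the rest for the parent) stands in for the late, dynamic thread binding the task model performs at run time.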
- Journal article: Fooling the masses with performance results: Old classics and some new ideas (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware, Vol. 29, No. 1, 2012)
  Authors: Wellein, G.; Hager, G. (Department for Computer Science and Erlangen Regional Computing Center, Friedrich-Alexander-Universität Erlangen-Nürnberg)
  In 1991, David H. Bailey published his insightful 'Twelve Ways to Fool the Masses When Giving Performance Results on Parallel Computers.' In that humorous article, Bailey pinpointed typical 'evade and disguise' techniques for presenting mediocre performance results in the best possible light. At that time, the supercomputing landscape was governed by the 'chicken vs. oxen' debate: could strong vector CPUs survive against the new massively parallel systems? In the past two decades, hybrid, hierarchical systems, multi-core processors, accelerator technology, and the dominating presence of commodity hardware have reshaped the landscape of High Performance Computing. It's also not so much oxen vs. chickens anymore; billions of ants have entered the battlefield. This talk gives an update of the 'Twelve Ways.' Old classics are presented alongside new 'stunts' that reflect today's technological boundary conditions. DISCLAIMER: Although these musings are certainly inspired by experience with many publications and talks in HPC, I wish to point out that (i) no offense is intended, (ii) I am not immune to the inherent temptations myself, and (iii) this is all still just meant to be fun.
- Journal article: I/O-efficient approximation of graph diameters by parallel cluster growing — a first experimental study (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware, Vol. 29, No. 1, 2012)
  Authors: Ajwani, Deepak; Beckmann, Andreas; Meyer, Ulrich; Veith, David
  A fundamental step in the analysis of a massive graph is to compute its diameter. In the RAM model, the diameter of a connected undirected unweighted graph can be efficiently 2-approximated using a Breadth-First Search (BFS) traversal from an arbitrary node. However, if the graph is stored on disk, even an external-memory BFS traversal is prohibitive, owing to the large number of I/Os it incurs. Meyer [Mey08] proposed a parametrized algorithm that computes an approximation of the graph diameter with fewer I/Os than an exact BFS traversal of the graph requires. The approach is based on growing clusters around randomly chosen vertices 'in parallel' until their fringes meet. We present an implementation of this algorithm and compare it with some simple heuristics and external-memory BFS in order to determine the trade-off between the approximation ratio and the running time achievable in practice. Our experiments show that with carefully chosen parameters, the new approach is indeed capable of producing surprisingly good diameter approximations in shorter time. We also confirm experimentally that there are graph classes on which the parametrized approach runs into bad approximation ratios, just as the theoretical analysis in [Mey08] suggests.
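  The abstract's starting point — a single BFS from an arbitrary node 2-approximates the diameter — is easy to sketch. The snippet below is a minimal in-RAM illustration of that baseline only, not the I/O-efficient cluster-growing algorithm the paper studies; the graph representation (an adjacency dict) is an assumption for the example:

  ```python
  from collections import deque

  def bfs_eccentricity(adj, source):
      """Return the eccentricity of `source` (its maximum BFS distance)
      in an unweighted, connected, undirected graph given as an
      adjacency dict mapping each node to its neighbor list."""
      dist = {source: 0}
      queue = deque([source])
      while queue:
          u = queue.popleft()
          for v in adj[u]:
              if v not in dist:
                  dist[v] = dist[u] + 1
                  queue.append(v)
      return max(dist.values())

  # A path on 5 nodes: the true diameter is 4. BFS from the middle node
  # reports eccentricity 2, and the bound ecc <= diameter <= 2 * ecc holds.
  path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
  ecc = bfs_eccentricity(path, 2)
  print(ecc, 2 * ecc)  # 2 4
  ```

  The 2-approximation follows from the triangle inequality: every node is within `ecc` hops of the source, so no two nodes are more than `2 * ecc` hops apart.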
- Journal article: Parallel coding for storage systems — An OpenMP and OpenCL capable framework (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware, Vol. 29, No. 1, 2012)
  Author: Sobe, Peter
  Parallel storage systems distribute data onto several devices. This allows the high access bandwidth that parallel computing systems need. It also improves storage reliability, provided that erasure-tolerant coding is applied and the coding is fast enough. In this paper we assume storage systems that apply data distribution and coding in a combined way. We describe how coding can be done in parallel on multicore and GPU systems in order to keep pace with the high storage access bandwidth. A framework is introduced that calculates coding equations from parameters and translates them into OpenMP- and OpenCL-based coding modules. These modules encode data that is written to the storage system and decode it in case of failures of storage devices. We report on the performance of the coding modules and identify factors that influence the coding performance.
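  The simplest instance of the erasure-tolerant coding the abstract refers to is plain XOR parity, which tolerates the loss of any single device. The sketch below shows only that textbook baseline in sequential Python; it is not the paper's framework, which derives more general coding equations and emits parallel OpenMP/OpenCL modules:

  ```python
  def encode_parity(blocks):
      """XOR-parity encoding: return one parity block for a list of
      equal-length data blocks (each block is a bytes object)."""
      parity = bytearray(len(blocks[0]))
      for block in blocks:
          for i, byte in enumerate(block):
              parity[i] ^= byte
      return bytes(parity)

  def decode_missing(surviving, parity):
      """Reconstruct a single lost data block: XOR-ing the surviving
      blocks with the parity cancels everything except the lost block."""
      return encode_parity(list(surviving) + [parity])

  data = [b"abcd", b"efgh", b"ijkl"]
  parity = encode_parity(data)
  # Lose the second block and recover it from the survivors plus parity:
  recovered = decode_missing([data[0], data[2]], parity)
  print(recovered)  # b'efgh'
  ```

  Because each output byte depends only on the bytes at the same offset, the inner loop is trivially data-parallel — which is exactly why such coding equations map well onto OpenMP loops and OpenCL kernels.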
- Journal article: Parallelization Strategies to Speed-Up Computations for Terrain Analysis on Multi-Core Processors (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware, Vol. 29, No. 1, 2012)
  Authors: Schiele, Steffen; Blaar, Holger; Thürkow, Detlef; Möller, Markus; Müller-Hannemann, Matthias
  Efficient computation of regional land-surface parameters for large-scale digital elevation models is becoming more and more important, in particular for web-based applications. This paper studies the possibilities of decreasing the computing time for such tasks by parallel processing with multiple threads on multi-core processors. As an example of the calculation of regional land-surface parameters, we investigate the computation of flow directions and propose a modified D8 algorithm using an extended neighborhood. We discuss two parallelization strategies, one based on a spatial decomposition, the other based on a two-phase approach. Three high-resolution digital elevation model datasets with different geomorphological types of landscapes are used in our evaluation. While local surface parameters allow for an almost ideal speed-up, the situation is different for the calculation of non-local parameters due to data dependencies. Nevertheless, a significant decrease in computation time was still achieved. A task-pool-based strategy turns out to be more efficient for calculations on datasets with many data dependencies.
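  The classic D8 algorithm the entry above builds on routes flow from each cell to the steepest-descent neighbor among its eight surrounding cells. The following sketch shows only that standard baseline, not the paper's extended-neighborhood modification or its parallelization; the toy DEM is an invented example:

  ```python
  # The eight neighbor offsets of the classic D8 scheme.
  D8_NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1),
                  (0, -1),           (0, 1),
                  (1, -1),  (1, 0),  (1, 1)]

  def d8_flow_direction(dem, r, c):
      """Return the (dr, dc) offset of the steepest downslope neighbor
      of cell (r, c) in a digital elevation model, or None for a sink."""
      best, best_drop = None, 0.0
      for dr, dc in D8_NEIGHBORS:
          rr, cc = r + dr, c + dc
          if 0 <= rr < len(dem) and 0 <= cc < len(dem[0]):
              # Diagonal neighbors are farther away, so normalize the
              # elevation drop by the distance to the neighbor cell.
              distance = (dr * dr + dc * dc) ** 0.5
              drop = (dem[r][c] - dem[rr][cc]) / distance
              if drop > best_drop:
                  best, best_drop = (dr, dc), drop
      return best

  dem = [[9, 8, 7],
         [8, 6, 5],
         [7, 5, 3]]
  print(d8_flow_direction(dem, 1, 1))  # (1, 1): flow heads to elevation 3
  ```

  Since each cell's direction depends only on its local neighborhood, this step parallelizes almost ideally, which matches the abstract's observation about local surface parameters; the data dependencies arise later, when flow is accumulated along the resulting directions.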