
P124 - 9th Workshop on Parallel Systems and Algorithms

Recent publications

1 - 10 of 12
  • Conference paper
    Parallel derivative computation using ADOL-C
    (9th workshop on parallel systems and algorithms – workshop of the GI/ITG special interest groups PARS and PARVA, 2008) Kowarz, Andreas; Walther, Andrea
    Derivative computation using Automatic Differentiation (AD) is often considered a purely serial task. Performing the differentiation in parallel may require the AD tool to extract parallelization information from the user function, transform it, and apply the resulting strategy in the differentiation process. Furthermore, when using the reverse mode of AD, it must be ensured that no data races are introduced by the reversed data access scheme. For an AD tool based on operator overloading, an additional challenge arises: parallelization statements are typically not recognized. In this paper, we present and discuss the parallelization approach that we have integrated into ADOL-C, an operator-overloading-based AD tool for the differentiation of C/C++ programs. The advantages of the approach are illustrated by the parallel differentiation of a function that handles the time evolution of a 1D quantum plasma.
  • Conference paper
    How efficient are creatures with time-shuffled behaviors?
    (9th workshop on parallel systems and algorithms – workshop of the GI/ITG special interest groups PARS and PARVA, 2008) Ediger, Patrick; Hoffmann, Rolf; Halbach, Mathias
    The task of the creatures in the “creatures’ exploration problem” is to visit all empty cells in an environment with a minimum number of steps. We have analyzed this multi-agent problem with time-shuffled algorithms (behaviors) in the cellular automata model. Ten different “uniform” (non-time-shuffled) algorithms that performed well in earlier investigations were used in alternation over time. We designed three time-shuffling types that differ in how the algorithms are interleaved. New metrics, such as the absolute and relative efficiency, were defined for such a multi-agent system. The efficiency relates the work of an agent system to the work of a reference system. A reference system is one that can solve the problem with the lowest number of creatures, using uniform or time-shuffled algorithms. Some time-shuffled systems reached high efficiency rates, but the most efficient system was a uniform one with 32 creatures. Among the most efficient successful systems, the uniform ones are dominant. Shuffling algorithms resulted in better success rates for one creature, but this is not always the case for more than one creature.
  • Conference paper
    Adaptive Cache Infrastructure: Supporting dynamic Program Changes following dynamic Program Behavior
    (9th workshop on parallel systems and algorithms – workshop of the GI/ITG special interest groups PARS and PARVA, 2008) Nowak, Fabian; Buchty, Rainer; Karl, Wolfgang
    Recent examinations of program behavior at run-time revealed distinct phases. This suggests the need for a framework that supports hardware adaptation to phase behavior. Because memory access behavior is most important and cache accesses form a large subset of memory accesses, we propose an infrastructure for fitting cache accesses to a program’s requirements in a distinct phase.
  • Conference paper
    A Generic Tool Supporting Cache Design and Optimisation on Shared Memory Systems
    (9th workshop on parallel systems and algorithms – workshop of the GI/ITG special interest groups PARS and PARVA, 2008) Schindewolf, Martin; Tao, Jie; Karl, Wolfgang; Cintra, Marcelo
    For multi-core architectures, improving the cache performance is crucial for the overall system performance. In contrast to the common approach of designing caches with the best trade-off between performance and cost, this work favours an application-specific cache design. To this end, an analysis tool capable of revealing the causes of cache misses has been developed. The results of the analysis can be used by system developers to improve cache architectures or can help programmers improve the data locality behaviour of their programs. The SPLASH-2 benchmark suite is used to demonstrate the abilities of the analysis model.
  • Conference paper
    SDVMR: A Scalable Firmware for FPGA-based Multi-Core Systems-on-Chip
    (9th workshop on parallel systems and algorithms – workshop of the GI/ITG special interest groups PARS and PARVA, 2008) Hofmann, Andreas; Waldschmidt, Klaus
    As the main scope of mobile embedded systems shifts from control to data processing tasks, high performance demands and limited energy budgets are often seen as conflicting design goals. Heterogeneous, adaptive multicore systems are one approach to meeting these challenges. Thus, the importance of multicore FPGAs as an implementation platform grows steadily. However, efficient exploitation of parallelism and dynamic runtime reconfiguration pose new challenges for application software development. In this paper, the implementation of a virtualization layer between applications and the multicore FPGA is described. This virtualization allows a transparent runtime reconfiguration of the underlying system for adaptation to changing system environments. The parallel application does not see the underlying, possibly heterogeneous, multicore system. Many of the requirements for an adaptive FPGA realization are met by the SDVM, the scalable dataflow-driven virtual machine. This paper describes the concept of the FPGA firmware based on a reimplementation and adaptation of the SDVM.
  • Conference paper
    High Performance Multigrid on Current Large Scale Parallel Computers
    (9th workshop on parallel systems and algorithms – workshop of the GI/ITG special interest groups PARS and PARVA, 2008) Gradl, Tobias; Rüde, Ulrich
    Making multigrid algorithms run efficiently on large parallel computers is a challenge. Without clever data structures, the communication overhead will lead to an unacceptable performance drop when using thousands of processors. We show that with a good implementation it is possible to solve a linear system with 10^11 unknowns in about 1.5 minutes on almost 10,000 processors. The data structures also allow for efficient adaptive mesh refinement, opening a wide range of applications to our solver.
  • Conference paper
    Grid Virtualization Engine: Providing Virtual Resources for Grid Infrastructure
    (9th workshop on parallel systems and algorithms – workshop of the GI/ITG special interest groups PARS and PARVA, 2008) Kwemou, Emeric; Wang, Lizhe; Tao, Jie; Kunze, Marcel; Kramer, David; Karl, Wolfgang
    Virtual machines offer many advantages, such as easy configuration and management, and can simplify the development and deployment of Grid infrastructures. Various virtualization implementations, despite having similar functions, often provide different management and access interfaces. These heterogeneous virtualization technologies make it challenging to employ virtual machines as computing resources for building Grid infrastructures. The work proposed in this paper focuses on a Web-service-based virtual machine provider for Grid infrastructures. The Grid Virtualization Engine (GVE) creates an abstraction layer between users and the underlying virtualization technologies. The GVE implements a scalable distributed architecture in which a GVE Agent represents a computing center. The Agent talks to the different virtualization products inside the computing center and provides virtual machine resources to the GVE Site Service. Users can request computing resources through GVE Site Services. The system is designed and implemented with state-of-the-art distributed computing technology: Web service and Grid standards.
  • Conference paper
    Specifying and Processing Co-Reservations in the Grid
    (9th workshop on parallel systems and algorithms – workshop of the GI/ITG special interest groups PARS and PARVA, 2008) Röblitz, Thomas
    Executing complex applications on Grid infrastructures necessitates the guaranteed allocation of multiple resources. Such guarantees are often implemented by means of advance reservations. Reserving resources in advance requires multiple steps, from their description to their actual allocation. In a Grid, a client possesses little knowledge about the future status of resources. Thus, manually specifying successful parameters of a co-reservation is a tedious task. Instead, we propose to parametrize certain reservation characteristics (e.g., the start time) and to let a client define criteria for selecting appropriate values. A Grid reservation service then processes such requests by determining the future status of resources and calculating a co-reservation candidate that satisfies the criteria. In this paper, we present the Simple Reservation Language (SRL) for describing the requests, demonstrate the transformation of an example request into an integer program using the Zuse Institute Mathematical Programming Language (ZIMPL), and experimentally evaluate the time needed to find the optimal co-reservation using CPLEX.
  • Conference paper
    Hybrid Parallel Sort on the Cell Processor
    (9th workshop on parallel systems and algorithms – workshop of the GI/ITG special interest groups PARS and PARVA, 2008) Keller, Jörg; Kessler, Christoph; König, Kalle; Heenes, Wolfgang
    Sorting large data sets has always been an important application, and hence has been one of the benchmark applications on new parallel architectures. We present a parallel sorting algorithm for the Cell processor that combines elements of bitonic sort and merge sort, and reduces the bandwidth to main memory by pipelining. We present runtime results of a partial prototype implementation and simulation results for the complete sorting algorithm that promise performance advantages over previous implementations.
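
    As a rough illustration of how bitonic sort and merge sort can be combined, the following Python sketch sorts fixed-size blocks with a bitonic sorting network and then merges the sorted blocks. It is not the authors' algorithm: it runs serially, models neither the Cell processor's SPEs nor the pipelining mentioned in the abstract, and assumes numeric input. The names `bitonic_sort`, `hybrid_sort`, and `block_size` are hypothetical.

```python
import heapq
import math


def bitonic_sort(block):
    """Sort a list whose length is a power of two with a bitonic network."""
    a = list(block)
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # Swap when the pair violates the required direction.
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


def hybrid_sort(data, block_size=8):
    """Bitonic-sort blocks of block_size items, then merge the sorted runs.

    Assumes numeric data; the last block is padded with +inf so that every
    block has a power-of-two length, and the padding is dropped at the end.
    """
    data = list(data)
    pad = (-len(data)) % block_size
    padded = data + [math.inf] * pad
    runs = [bitonic_sort(padded[i:i + block_size])
            for i in range(0, len(padded), block_size)]
    merged = list(heapq.merge(*runs))   # k-way merge of the sorted runs
    return merged[:len(data)]
```

    For example, `hybrid_sort([5, 3, 9, 1, 4, 8, 2, 7, 6, 0, 11])` returns the input in ascending order. The fixed compare-exchange pattern of the bitonic phase is data-independent, which is the property that usually makes such networks attractive on SIMD hardware; the merge phase here uses the standard library's `heapq.merge` purely for brevity.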