PARS-Mitteilungen 2015
The complete PARS-Mitteilungen 2015 issue is also available as a PDF file.
Listing of PARS-Mitteilungen 2015 by publication date
- Journal article: Parallelisierung von Embedded Realtime Systemen: Probleme und Lösungsstrategien in Migrationsprojekten [Parallelization of Embedded Real-Time Systems: Problems and Solution Strategies in Migration Projects] (PARS-Mitteilungen: Vol. 32, No. 1, 2015). Abu-Khalil, Marwan. This article distills experience from a series of successful and failed industrial parallelization projects in which embedded real-time systems were ported from single-core CPUs to multi-core SMP platforms. The core thesis of the talk is that parallelizing embedded real-time systems faces specific challenges that are of only minor relevance for other system classes, such as server or desktop software. The article analyzes and categorizes these specific challenges. As a result, it proposes generally applicable approaches that lead to successful parallelization in the embedded domain.
- Journal article: Parallel Symbolic Relationship-Counting (PARS-Mitteilungen: Vol. 32, No. 1, 2015). Doering, Andreas C. One of the most basic operations, transforming a relationship into a function that gives the number of fulfilling elements, does not seem to be widely investigated. In this article a new algorithm for this problem is proposed. This algorithm can be implemented using Binary Decision Diagrams. The algorithm transforms a relation given as a symbolic expression into a symbolic function, which can be further used, e.g. for finding maxima. The performance of an implementation based on JINC is given for a scalable example problem.
- Journal article: Novel Image Processing Architecture for 3D Integrated Circuits (PARS-Mitteilungen: Vol. 32, No. 1, 2015). Pfundt, Benjamin; Reichenbach, Marc; Söll, Christopher; Fey, Dietmar. Utilizing highly parallel processors for high-speed embedded image processing is a well-known approach. However, the question of how to provide a sufficiently fast data rate from the image sensor to the processing unit is still not solved. As Through-Silicon Vias (TSVs), a new technology for chip stacking, become available, parallel image transmission from the image sensor to the processing unit is enabled. Nevertheless, the use of a new technology requires architectural changes in the processing units. With this technology at hand, we present a novel image preprocessing architecture suitable for image processing in 3D chip stacks. The architecture was developed in parallel with a customized image sensor to make a real assembly possible. It is fully functionally verified and laid out for a 150 nm process. Our performance estimation shows a processing speed of 770 to 14,400 fps (frames per second) for 5 × 5 filters.
- Journal issue: PARS-Mitteilungen 2015 (PARS-Mitteilungen: Vol. 32, No. 1, 2015)
- Journal article: Extended Pattern-based Parallelization Approach for Hard Real-Time Systems and its Tool Support (PARS-Mitteilungen: Vol. 32, No. 1, 2015). Stegmeier, Alexander; Frieb, Martin; Ungerer, Theo. The transformation of sequential legacy code to parallel applications is hard, especially when timing requirements have to be met. There exists a systematic parallelization approach dealing with this topic. Based on practical experience, we extend it and present our modifications. Our extensions comprise an additional phase dealing with implementation details and another one for quality assurance. Its results may be used to further improve the parallel program. Moreover, we propose tool support which further facilitates the parallelization process.
- Journal article: Proximity Scheme for Instruction Caches in Tiled CMP Architectures (PARS-Mitteilungen: Vol. 32, No. 1, 2015). Alawneh, Tareq; Chi, Chi Ching; Elhossini, Ahmed; Juurlink, Ben. Recent research results show that there is a high degree of code sharing between cores in multi-core architectures. In this paper we propose a proximity scheme for instruction caches, in which code blocks shared among neighbouring L2 caches in tiled multi-core architectures are exploited to reduce the average cache miss penalty and the on-chip network traffic. We evaluate the proposed proximity scheme using a full-system simulator running an n-core tiled CMP. The experimental results reveal a significant execution time improvement of up to 91.4% for microbenchmarks whose instruction footprint does not fit in the private L2 cache. For real applications from the PARSEC benchmark suite, the proposed scheme results in speedups of up to 8%.
- Journal article: Parallel Processing for Data Deduplication (PARS-Mitteilungen: Vol. 32, No. 1, 2015). Sobe, Peter; Pazak, Denny; Stiehr, Martin. Data deduplication is a technique for detecting and eliminating duplicated data blocks in storage systems. It creates a set of unique data blocks and places references accordingly, so that the original data can be accessed from a reduced number of data blocks. For deduplication, hashes of data blocks are calculated and compared in order to detect and remove duplicates. It can be seen as an alternative to data compression that saves storage capacity in large storage systems. The capacity saving comes at the cost of additional computational effort that arises when data blocks are written and updated. This computational effort increases with the size of the storage system. On a single-processor system, deduplication degrades performance; in particular, the write and update rates drop. Exploiting parallelism is a rewarding way to compensate for this performance drop, particularly for hash value calculations and hash comparisons. In this paper we explain which parts of a deduplication system are worth parallelizing and how. As examples, we show the performance results of two deduplication algorithms and their parallel implementations, based on multithreading and on parallel GPU computations.
- Journal article: Parallelization of the Particle-in-cell-Code PATRIC with GPU-Programming (PARS-Mitteilungen: Vol. 32, No. 1, 2015). Fitzek, Jutta. The Particle-in-cell (PIC) code PATRIC (Particle Tracking Code) is used at the GSI Helmholtz Center for Heavy Ion Research to simulate particles in circular particle accelerators. Parallelization of PIC codes is an open research field and solutions depend very much on the specific problem. The possibilities and limits of GPU integration are being evaluated. General GPU aspects and problems arising from collective particle effects are put into focus with an emphasis on code maintainability and reuse of existing modules. The studies have been performed using NVIDIA®'s Tesla C2075 GPU. This contribution summarizes the findings.
- Journal article: Real-Time Vision System for License Plate Detection and Recognition on FPGA (PARS-Mitteilungen: Vol. 32, No. 1, 2015). Rosli, Faird; Elhossini, Ahmed; Juurlink, Ben. Rapid development of the Field Programmable Gate Array (FPGA) offers an alternative way to accelerate computationally intensive tasks such as digital signal and image processing. Its ability to perform parallel processing shows its potential for implementing a high-speed vision system. Out of numerous applications of computer vision, this paper focuses on the hardware implementation of one that is commercially known as Automatic Number Plate Recognition (ANPR). Morphological operations and Optical Character Recognition (OCR) algorithms have been implemented on a Xilinx Zynq-7000 All-Programmable SoC to realize the functions of an ANPR system. Test results have shown that the designed and implemented processing pipeline, which consumes 63 % of the logic resources, is capable of delivering results with a relatively low error rate. Most importantly, the computation time satisfies the real-time requirement for many ANPR applications.
- Journal article: A run-time reconfigurable NoC Monitoring System for performance analysis and debugging support (PARS-Mitteilungen: Vol. 32, No. 1, 2015). Koser, Erol; Stabernack, Benno. Network-on-Chip (NoC) based architectures have recently become more and more important because of their advantages in design flexibility and system bandwidth scalability, since today's systems typically consist of a huge number of processing elements (e.g. heterogeneous multi-processor systems). In contrast to typical shared-memory based systems, predicting and monitoring the runtime behaviour of the system, e.g. data throughput, link utilization and contention, becomes more complex and requires special architectural features. Besides the traditional approach of using simulation at design time, features usable at runtime promise a number of advantages. In this paper we present a flexible, reusable and run-time reconfigurable NoC monitoring system for performance analysis and debugging purposes. The evaluation of the monitoring data enables the system designer to achieve better resource utilization by adjusting the system architecture and the programming model.
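
As an illustration of the hash-based duplicate detection described in the data deduplication abstract (Sobe, Pazak, Stiehr) in the listing above, the following is a minimal sequential sketch. It is not the authors' implementation: the fixed block size, the choice of SHA-256, the in-memory dictionary used as block store, and the function names `deduplicate` and `restore` are all assumptions made for this example.

```python
import hashlib


def deduplicate(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep each distinct block once.

    Assumptions for illustration only: fixed-size blocks, SHA-256 digests,
    and a plain dict acting as the block store.
    """
    store = {}  # digest -> block contents (the set of unique blocks)
    refs = []   # one digest per original block, in original order
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        # Store the block only if this digest has not been seen before.
        store.setdefault(digest, block)
        refs.append(digest)
    return store, refs


def restore(store, refs) -> bytes:
    """Rebuild the original data by following the references."""
    return b"".join(store[digest] for digest in refs)
```

For example, `deduplicate(b"A" * 8 + b"B" * 8 + b"A" * 8, block_size=8)` keeps two unique blocks and three references, and `restore` returns the original bytes. The per-block hash calculations are independent of each other, which is the part the abstract names as a candidate for parallelization.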