Listing by author "Wildermann, Stefan"
1 - 7 of 7
- Journal article: Design and error analysis of accuracy-configurable sequential multipliers via segmented carry chains (it - Information Technology: Vol. 64, No. 3, 2022). Echavarria, Jorge; Wildermann, Stefan; Keszocze, Oliver; Khosravi, Faramarz; Becher, Andreas; Teich, Jürgen.
  We present the design and a closed-form error analysis of accuracy-configurable multipliers built from segmented carry chains. We model the approximate partial-product accumulations as a sequential process: for a given splitting point of the carry chains, the technique allows varying the quality of the accumulations and, consequently, of the overall product. Owing to the resulting shorter critical paths, such approximate multipliers can trade accuracy for increased performance while exploiting the inherent area savings of sequential over combinational approaches. We implemented multiple architectures targeting FPGAs and ASICs with different bit-widths and accuracy configurations to (1) estimate resources, power consumption, and delay, and (2) evaluate those error metrics that belong to the so-called #P-complete class.
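The core idea of carry-chain segmentation can be illustrated in software. The following is a minimal sketch under my own assumptions, not the paper's RTL design: an adder whose carry is discarded at a configurable splitting point, used inside a shift-and-add multiplier for the partial-product accumulations. All function names are illustrative.

```python
def approx_add(a: int, b: int, width: int, split: int) -> int:
    """Add two `width`-bit values, discarding the carry that would
    propagate across bit position `split` (the splitting point)."""
    lo_mask = (1 << split) - 1
    # Lower segment: add and truncate -- the carry-out is dropped.
    lo_trunc = ((a & lo_mask) + (b & lo_mask)) & lo_mask
    # Upper segment: added independently, never sees the lower carry.
    hi = ((a >> split) + (b >> split)) << split
    return (hi | lo_trunc) & ((1 << width) - 1)

def approx_mul(a: int, b: int, width: int, split: int) -> int:
    """Shift-and-add multiplier of two `width`-bit operands whose
    partial-product accumulations use the segmented adder."""
    acc = 0
    for i in range(width):
        if (b >> i) & 1:
            acc = approx_add(acc, a << i, 2 * width, split)
    return acc
```

Because dropped carries only ever remove value, the approximate product is a lower bound on the exact one; setting `split` to the full accumulator width recovers exact multiplication, which mirrors the accuracy-configurability described in the abstract.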
- Conference paper: An FPGA Avro Parser Generator for Accelerated Data Stream Processing (BTW 2023, 2023). Hahn, Tobias; Schüll, Daniel; Wildermann, Stefan; Teich, Jürgen.
  Big Data applications frequently involve processing data streams encoded in semi-structured data formats such as JSON, Protobuf, or Avro. A major challenge in accelerating data stream processing on FPGAs is that parsing such data formats is usually highly complex. This is especially true for JSON parsing on FPGAs, which is the focus of related work. The parsing of the binary Avro format, on the other hand, is well suited to FPGA processing and can thus serve as an enabler for data stream processing on FPGAs. In this context, we present a methodology for the parsing, projection, and selection of Avro objects that enforces an output format suitable for further processing on the FPGA. Moreover, we provide a generator that automatically creates accelerators based on this methodology. The obtained accelerators achieve significant speedups over CPU-based parsers while requiring only very few FPGA resources.
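To see why binary Avro is simpler to parse than JSON, consider its primitive encoding: integers are zigzag-encoded varints, and strings are a length varint followed by raw UTF-8 bytes, so no character-level scanning for quotes or escapes is needed. A software sketch of these two primitives (my own illustration, not the paper's generated hardware):

```python
def decode_zigzag_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one zigzag-encoded varint starting at `pos`.
    Returns (value, position after the varint)."""
    acc, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        acc |= (byte & 0x7F) << shift
        if not byte & 0x80:   # high bit clear: last byte of the varint
            break
        shift += 7
    # Zigzag mapping: even -> non-negative, odd -> negative.
    return (acc >> 1) ^ -(acc & 1), pos

def decode_string(buf: bytes, pos: int = 0) -> tuple[str, int]:
    """An Avro string is a length varint followed by UTF-8 bytes."""
    length, pos = decode_zigzag_varint(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length
```

The length prefix tells a parser exactly how far to skip, which is what makes projection (dropping unneeded fields) cheap in a streaming pipeline.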
- Journal: Integration of FPGAs in Database Management Systems: Challenges and Opportunities (Datenbank-Spektrum: Vol. 18, No. 3, 2018). Becher, Andreas; B.G., Lekshmi; Broneske, David; Drewes, Tobias; Gurumurthy, Bala; Meyer-Wegener, Klaus; Pionteck, Thilo; Saake, Gunter; Teich, Jürgen; Wildermann, Stefan.
- Text document: ReProVide: Towards Utilizing Heterogeneous Partially Reconfigurable Architectures for Near-Memory Data Processing (BTW 2019 – Workshopband, 2019). Becher, Andreas; Herrmann, Achim; Wildermann, Stefan; Teich, Jürgen.
  Reconfigurable hardware such as Field-Programmable Gate Arrays (FPGAs) is widely used for data processing in databases. Most related work focuses on accelerating one or a small set of specific operations, such as sort, join, or regular-expression matching. A drawback of such approaches is often the assumed static accelerator hardware architecture: rather than adapting the hardware to fit the query, the query plan has to be adapted to fit the hardware. Moreover, operators or data types that are not supported by the accelerator have to be processed in software. As a remedy, approaches exploiting the dynamic partial reconfigurability of FPGAs have been proposed that are able to adapt the datapath at runtime. However, on modern FPGAs, this introduces new challenges due to the heterogeneity of the available resources. In addition, not only the execution resources but also the memory resources may be heterogeneous. This work focuses on the architectural aspects of database (co-)processing on heterogeneous FPGA-based PSoC (programmable System-on-Chip) architectures comprising processors, specialized hardware components, multiple memory types, and dynamically partially reconfigurable areas. We present an approach, called ReProVide, to support such (co-)processing. In particular, we introduce a model that formalizes the challenging task of placing operators and allocating buffers on such heterogeneous hardware, and we describe the difficulties of finding good placements. Furthermore, we give a detailed insight into the different memory types and their peculiarities in order to exploit the strengths of heterogeneous memory architectures. Here, we also highlight the implications of heterogeneous memories for the problem of query placement.
- Journal article: Self-organizing Core Allocation (PARS: Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware: Vol. 30, No. 1, 2013). Ziermann, Tobias; Wildermann, Stefan; Teich, Jürgen.
  This paper deals with the problem of dynamically allocating cores to parallel applications on homogeneous many-core systems such as MultiProcessor Systems-on-Chip (MPSoCs). For a given number of thread-parallel applications, the goal is to find a core assignment that maximizes the average speedup. The difficulty is that some applications gain more additional speedup than others when assigned further cores. This paper first presents a centralized algorithm that calculates an optimal assignment for this objective. However, as the number of cores and the dynamics of applications will increase significantly in the future, decentralized concepts are necessary to scale with this development. Therefore, a decentralized (self-organizing) algorithm is developed that minimizes the amount of global information that has to be exchanged between applications. The experimental results show that this approach reaches, on average, 98.95% of the optimal result of the centralized version.
- Journal article: Self-organizing Core Allocation (PARS-Mitteilungen: Vol. 30, Nr. 1, 2013). Ziermann, Tobias; Wildermann, Stefan; Teich, Jürgen.
  This paper deals with the problem of dynamically allocating cores to parallel applications on homogeneous many-core systems such as MultiProcessor Systems-on-Chip (MPSoCs). For a given number of thread-parallel applications, the goal is to find a core assignment that maximizes the average speedup. The difficulty is that some applications gain more additional speedup than others when assigned further cores. This paper first presents a centralized algorithm that calculates an optimal assignment for this objective. However, as the number of cores and the dynamics of applications will increase significantly in the future, decentralized concepts are necessary to scale with this development. Therefore, a decentralized (self-organizing) algorithm is developed that minimizes the amount of global information that has to be exchanged between applications. The experimental results show that this approach reaches, on average, 98.95% of the optimal result of the centralized version.
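The centralized objective in the core-allocation abstracts above can be sketched as follows. This is a hypothetical illustration, not the paper's algorithm: assuming each application's speedup curve is concave (diminishing returns per extra core), greedily handing each remaining core to the application with the largest marginal speedup gain maximizes the sum, and hence the average, of speedups.

```python
import heapq

def allocate_cores(speedup_curves: list[list[float]], num_cores: int) -> list[int]:
    """speedup_curves[i][k] = speedup of app i on k cores (index 0 unused);
    curves are assumed concave and num_cores >= number of applications.
    Returns the number of cores assigned to each application."""
    alloc = [1] * len(speedup_curves)   # every app gets at least one core
    remaining = num_cores - len(speedup_curves)
    # Max-heap (via negated gains) of the marginal gain for giving app i
    # one more core on top of its single initial core.
    heap = [(-(c[2] - c[1]), i) for i, c in enumerate(speedup_curves) if len(c) > 2]
    heapq.heapify(heap)
    for _ in range(remaining):
        if not heap:
            break
        _, i = heapq.heappop(heap)
        alloc[i] += 1
        k = alloc[i]
        curve = speedup_curves[i]
        if k + 1 < len(curve):   # app i can still use another core
            heapq.heappush(heap, (-(curve[k + 1] - curve[k]), i))
    return alloc
```

With concave curves this greedy strategy is exchange-argument optimal; the decentralized variant described in the abstracts approximates the same assignment while exchanging far less global information.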
- Journal article: Speculative Dynamic Reconfiguration and Table Prefetching Using Query Look-Ahead in the ReProVide Near-Data-Processing System (Datenbank-Spektrum: Vol. 21, No. 1, 2021). Beena Gopalakrishnan, Lekshmi; Becher, Andreas; Wildermann, Stefan; Meyer-Wegener, Klaus; Teich, Jürgen.
  FPGAs are promising target architectures for the hardware acceleration of database query processing, as they combine the performance of hardware with the programmability of software. In particular, they are partially reconfigurable at runtime, which allows adapting them to a variety of queries. However, reconfiguration takes time, and a region of the FPGA is not available for computations while it is being reconfigured. Techniques that avoid, or at least hide, the reconfiguration latencies can improve the overall performance. This paper presents optimizations based on query look-ahead, i.e., exploiting knowledge about subsequent queries to schedule the reconfigurations such that their overhead is minimized. We evaluate our optimizations with a calibrated model for various parameter settings. Improvements in execution time can be obtained even when looking only one query ahead.
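The benefit of one-query look-ahead can be quantified with a toy latency model (my own simplification, not the paper's calibrated model): while query q executes, the region needed by query q+1 is speculatively reconfigured, so only the non-overlapped remainder of that reconfiguration delays q+1. This assumes the region being reconfigured is not used by the running query.

```python
def total_time(exec_times: list[float], reconf_times: list[float],
               look_ahead: bool) -> float:
    """exec_times[q]   = execution time of query q on the FPGA,
    reconf_times[q] = reconfiguration time needed before query q.
    Returns the end-to-end time for the whole query sequence."""
    total = 0.0
    for q, (e, r) in enumerate(zip(exec_times, reconf_times)):
        if look_ahead and q > 0:
            # Reconfiguration for q overlaps the execution of q-1;
            # only the part exceeding that execution remains visible.
            total += max(0.0, r - exec_times[q - 1]) + e
        else:
            total += r + e   # reconfigure first, then execute
    return total
```

Whenever a query runs at least as long as the next reconfiguration, the reconfiguration latency is hidden entirely, which is the effect the look-ahead scheduling exploits.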