Browsing by keyword "FPGA"
- Journal article: Efficient implementation of ideal lattice-based cryptography (it - Information Technology: Vol. 59, No. 6, 2017)
  Pöppelmann, Thomas
  Almost all practically relevant asymmetric cryptosystems, such as RSA or ECC, are based either on the hardness of factoring or on the hardness of the discrete logarithm problem. However, both problems could be solved efficiently on a large enough quantum computer. While quantum computers powerful enough to break currently used parameter sets are not yet available, they are heavily researched and expected to reach maturity in 15 to 20 years. As a consequence, research on alternative quantum-safe cryptosystems is required. One alternative is lattice-based cryptography, which allows the construction of asymmetric public-key encryption and signature schemes that offer a good balance between security, performance, and key as well as ciphertext sizes.
- Text document: Exploring Memory Access Patterns for Graph Processing Accelerators (BTW 2021, 2021)
  Dann, Jonas; Ritter, Daniel; Fröning, Holger
  Recent trends in business and technology (e.g., machine learning, social network analysis) benefit from storing and processing growing amounts of graph-structured data in databases and data science platforms. FPGAs as accelerators for graph processing with a customizable memory hierarchy promise to solve performance problems caused by the inherently irregular memory access patterns on traditional hardware (e.g., CPU). However, developing such hardware accelerators is still time-consuming and difficult, and benchmarking is non-standardized, which hinders comprehension of the impact of memory access pattern changes and the systematic engineering of graph processing accelerators. In this work, we propose a simulation environment for the analysis of graph processing accelerators based on simulating their memory access patterns. Further, we evaluate our approach on two state-of-the-art FPGA graph processing accelerators and show reproducibility, comparability, and the shortened development process by an example. Not implementing the cycle-accurate internal data flow on accelerator hardware like FPGAs significantly reduces the implementation time, increases the benchmark parameter transparency, and allows comparison of graph processing approaches.
- Conference paper: An FPGA Avro Parser Generator for Accelerated Data Stream Processing (BTW 2023, 2023)
  Hahn, Tobias; Schüll, Daniel; Wildermann, Stefan; Teich, Jürgen
  Big Data applications frequently involve processing data streams encoded in semi-structured data formats such as JSON, Protobuf, or Avro. A major challenge in accelerating data stream processing on FPGAs is that the parsing of such data formats is usually highly complex. This is especially true for JSON parsing on FPGAs, which lies in the focus of related work. The parsing of the binary Avro format, on the other hand, is perfectly suited for being processed on FPGAs and can thus serve as an enabler for data stream processing on FPGAs. In this realm, we present a methodology for parsing, projection, and selection of Avro objects, which enforces an output format suitable for further processing on the FPGA. Moreover, we provide a generator to automatically create accelerators based on this methodology. The obtained accelerators can achieve significant speedups compared to CPU-based parsers, and at the same time require only very few FPGA resources.
- Text document: FPGA-basierter Protein- und DNA-Sequenzvergleich zur optimierten Datenbanksuche mit dem BLAST-Algorithmus (INFORMATIK 2017, 2017)
  Starke, Thomas Fabian; Bostelmann, Timm; Sawitzki, Sergei
  In bioinformatics, and in genetic research in particular, the search for local matches in gene sequences (local alignment) is of great importance. However, the number and size of the sequences stored in gene databases are growing rapidly, which makes search speed critical. The BLAST algorithm (Basic Local Alignment Search Tool) is commonly used to find local matches between a query sequence and a database sequence. This work presents an implementation of the BLAST algorithm in a tree-like hardware structure on an FPGA. The advantages and disadvantages of this implementation compared with a conventional software implementation of the algorithm are shown and analyzed. Finally, an outlook is given on how the existing limitations of the presented solution can be lifted.
- Text document: Hardware Accelerator Framework Approach for Dynamic Partial Reconfigurable Overlays on Xilinx PYNQ (INFORMATIK 2017, 2017)
  Janßen, Benedikt; Wingender, Tim; Hübner, Michael
  Reconfigurable Systems-on-Chip (SoCs) combine processor cores with Field-Programmable Gate Array (FPGA) fabric. These systems thereby make it possible to optimize the execution of applications to some extent by means of hardware accelerators in the FPGA fabric. However, hardware accelerator development requires special skills from the developer, since hardware development differs substantially from software development. Overlays offer a way to abstract the complexity of FPGA usage through predefined programmable hardware architectures. With Dynamic Partial Reconfiguration (DPR) it becomes possible to exchange parts of an overlay's architecture without affecting the operation of the remaining parts. In this article, we present a first version of our framework to ease the integration of hardware accelerators in Python via DPR overlays. The integration into Python is based on Xilinx PYNQ. The framework approach is based on our Python package 'pynqpartial'.
- Text document: Hardwaregestütze Positionsschätzung mit Bayes’schen Filtern auf Basis 3-dimensionaler Umgebungsmodelle für den Innenbereich (INFORMATIK 2017, 2017)
  Schott, Christian; Froß, Daniel; Rößler, Marko; Heinkel, Ulrich
  The pose and position of mobile devices play an increasingly important role in applications in the consumer sector as well as in production and logistics. Outdoors, global positioning systems based on geostationary satellites provide the underlying data, which is refined by additional sensors. Indoors and at close range, time-of-flight measurements of radio or radar waves with subsequent triangulation are an essential basis for coarse positioning and represent the state of the art. Because of the inherent uncertainty of these measurement methods, robust evaluation algorithms are needed for reliable positioning, for example Bayesian filters. This contribution deals with the extension of a hardware-oriented implementation of a particle filter that makes it possible to fuse a priori knowledge from a 3-dimensional environment model with the distance measurement data in real time. It presents a hardware/software system architecture for effective access to the model data and for their evaluation within a pipeline structure of the particle filter at register-transfer level. The prototype implementation targets an FPGA device of the ZYNQ family.
- Journal article: In-Depth Analysis of OLAP Query Performance on Heterogeneous Hardware (Datenbank-Spektrum: Vol. 21, No. 2, 2021)
  Broneske, David; Drewes, Anna; Gurumurthy, Bala; Hajjar, Imad; Pionteck, Thilo; Saake, Gunter
  Classical database systems are now facing the challenge of processing high-volume data feeds at unprecedented rates as efficiently as possible while also minimizing power consumption. Since CPU-only machines hit their limits, co-processors like GPUs and FPGAs are investigated by database system designers for their distinct capabilities. As a result, database systems over heterogeneous processing architectures are on the rise. In order to better understand their potential and limitations, in-depth performance analyses are vital. This paper provides performance data by benchmarking a portable operator set for column-based systems on CPU, GPU, and FPGA, all of which are available as processing devices within the same system. We consider TPC-H query Q6 and additionally a hash join to profile the execution across the systems. We show that system memory access and/or buffer management remains the main bottleneck for device integration, and that architecture-specific execution engines and operators offer significantly higher performance.
- Text document: ReProVide: Towards Utilizing Heterogeneous Partially Reconfigurable Architectures for Near-Memory Data Processing (BTW 2019 – Workshopband, 2019)
  Becher, Andreas; Herrmann, Achim; Wildermann, Stefan; Teich, Jürgen
  Reconfigurable hardware such as Field-Programmable Gate Arrays (FPGAs) is widely used for data processing in databases. Most of the related work focuses on accelerating one or a small set of specific operations such as sort, join, or regular expression matching. A drawback of such approaches is often the assumed static accelerator hardware architecture: rather than adapting the hardware to fit the query, the query plan has to be adapted to fit the hardware. Moreover, operators or data types that are not supported by the accelerator have to be processed in software. As a remedy, approaches for exploiting the dynamic partial reconfigurability of FPGAs have been proposed that are able to adapt the datapath at runtime. However, on modern FPGAs, this introduces new challenges due to the heterogeneity of the available resources. In addition, not only the execution resources may be heterogeneous but also the memory resources. This work focuses on the architectural aspects of database (co-)processing on heterogeneous FPGA-based PSoC (programmable System-on-Chip) architectures including processors, specialized hardware components, multiple memory types, and dynamically partially reconfigurable areas. We present an approach to support such (co-)processing called ReProVide. In particular, we introduce a model to formalize the challenging task of operator placement and buffer allocation onto such heterogeneous hardware and describe the difficulties of finding good placements. Furthermore, a detailed insight into different memory types and their peculiarities is given in order to exploit the strengths of heterogeneous memory architectures. Here, we also highlight the implications of heterogeneous memories for the problem of query placement.
- Text document: Using Genode OS for Remotely Updated Embedded IoT Devices (Tagungsband des FG-BS Frühjahrstreffens 2023, 2023)
  Matthé, Maximilian; Kühne, Paul; Schlatow, Johannes
  This paper demonstrates the use of the Genode OS for wirelessly connected, embedded IoT devices. We show that the modular Genode OS makes the system robust against corrupted or erroneous updates and provides built-in rollback functionality. It enables an increased uptime of the device compared to using an embedded Linux OS. We compare the performance of such a device against an embedded Linux system. The claimed properties are presented in the form of a demonstrator, which can initiate discussions among experts from academia and industry.