Listing by author "Pinnecke, Marcus"
1 - 5 of 5
- Journal article. GridTables: A One-Size-Fits-Most H2TAP Data Store (Datenbank-Spektrum: Vol. 20, No. 1, 2020). Pinnecke, Marcus; Campero Durand, Gabriel; Broneske, David; Zoun, Roman; Saake, Gunter. Heterogeneous Hybrid Transactional Analytical Processing ($\mathrm{H}^{2}$TAP) database systems have been developed to match the requirements for low-latency analysis of real-time operational data. Due to technical challenges, these systems are hard to architect, non-trivial to engineer, and complex to administer. Current research has proposed excellent solutions to many of those challenges in isolation, but a unified engine that optimizes performance by combining these solutions is still missing. In this concept paper, we suggest a highly flexible and adaptive data structure (called gridtable) to physically organize sparse but structured records in the context of $\mathrm{H}^{2}$TAP. For this, we focus on the design of an efficient, highly flexible storage layout that is built from scratch for mixed query workloads. The key challenges we address are: (1) partial storage in different memory locations, and (2) the ability to optimize for mixed OLTP/OLAP access patterns. To guarantee safe and well-specified data definition or manipulation, as well as fast querying with no compromises on performance, we propose two dedicated access paths to the storage. In this paper, we explore the architecture and internals of gridtables, showing design goals, concepts, and trade-offs. We close this paper with open research questions and challenges that must be addressed in order to take advantage of the flexibility of our solution.
- Conference paper. Konzept und prototypische Implementierung eines föderativen Complex Event Processing Systeme mit Operatorverteilung (Datenbanksysteme für Business, Technologie und Web (BTW 2015) - Workshopband, 2015). Pinnecke, Marcus. Complex Event Processing (CEP) is an established technology for processing event streams in near real time. Nevertheless, existing CEP systems differ considerably in their feature sets and performance profiles. When different CEP systems are combined in a federated system, the resulting system can achieve a higher data throughput and a broader feature set than the individual systems. However, the lack of standardization and the incompatible interfaces of the CEP systems hinder this approach. The middleware Java Event Processing Connectivity (JEPC) remedies this shortcoming. The subject of this work is to extend the JEPC-based federated CEP system so that query operators are distributed automatically across the participating systems. To this end, this work presents the underlying concept, a query optimization, and a cost model for selecting a concrete system. An evaluation shows that using a federation increases data throughput considerably, in particular when the federation is heterogeneous and its members have different performance profiles.
- Text document. MSDataStream – Connecting a Bruker Mass Spectrometer to the Internet (BTW 2019, 2019). Zoun, Roman; Schallert, Kay; Broneske, David; Fenske, Wolfram; Pinnecke, Marcus; Heyer, Robert; Brehmer, Sven; Benndorf, Dirk; Saake, Gunter. Metaproteomics is the biological study of the proteins of whole communities comprising thousands of species, using tandem mass spectrometry. However, it still follows a sequential, non-parallelizable workflow. Hence, researchers have to wait for hours or even days until the measurement data are available. In our demo, we show a way to reduce the smallest unit of the workflow to a minimum in order to realize a near-real-time stream processing system on a fast data architecture.
- Text document. Protobase: It's About Time for Backend/Database Co-Design (BTW 2019, 2019). Pinnecke, Marcus; Campero, Gabriel; Zoun, Roman; Broneske, David; Saake, Gunter. In this interactive demonstration, we show the current state of Protobase, our main-memory analytic document store that is designed from scratch to enable rapid prototyping of efficient microservices that perform analytics and explorations on (third-party) JSON-like documents stored in a novel columnar binary-encoded format, called the Cabin file format. In contrast to other solutions, our database system exposes neither a particular query language nor a fixed REST API to its clients. Instead, the entire user-defined backend logic, whose user code is written in Python, is placed inside a sandbox that runs in the system's process. Protobase in turn exposes a user-defined REST API that the (frontend) application interacts with. Thus, our system acts as a backend server while at the same time avoiding full exposure of its database to the clients. Consequently, a Protobase instance (database + user code + REST API) serves as (the entire) microservice, potentially minimizing the number of systems running in a typical analytic software stack. In terms of execution performance, Protobase therefore takes the inter-process communication overhead between backend and database system out of the picture and heavily utilizes columnar binary document storage to scale up for analytic queries. Both features lead to a notable performance gain for non-trivial services, potentially minimizing the number of required nodes in a cloud setting, too. In our demo, we give an overview of Protobase's internals, highlight major design decisions, and show how to prototype a scholarly search engine managing the Microsoft Academic Graph, a real-world scientific paper graph of roughly 154 million documents.
- Journal article. Query Optimization in Heterogenous Event Processing Federations (Datenbank-Spektrum: Vol. 15, No. 3, 2015). Pinnecke, Marcus; Hoßbach, Bastian. Continuous processing of event streams has evolved into an important class of data management over the last years and will become even more important due to novel applications such as the Internet of Things. Because systems for data stream and event processing have been developed independently of each other, often in competition and without any standards, the Stream Processing System (SPS) landscape is extremely heterogeneous today. To overcome the problems caused by this heterogeneity, a novel event processing middleware, the Java Event Processing Connectivity (JEPC), has been presented recently. However, although SPSs can be accessed uniformly using JEPC, their different performance profiles, caused by different algorithms and implementations, remain. This creates an opportunity for query optimization, because individual system strengths can be exploited. In this paper, we present a novel query optimizer that exploits the technical heterogeneity in a federation of different unified SPSs. Taking into account the different performance profiles of SPSs, we address query plan partitioning, candidate selection, and the reduction of inter-system communication in order to improve overall query performance. We suggest a heuristic that finds a good initial mapping of sub-plans to a set of heterogeneous SPSs. An experimental evaluation clearly shows that heterogeneous federations generally outperform homogeneous federations and that our heuristic performs well in practice.