Browsing by author "Teubner, Jens"
- Conference paper: Anfrage-getriebener Wissenstransfer zur Unterstützung von Datenanalysten [Query-driven knowledge transfer to support data analysts] (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Wahl, Andreas M.; Endler, Gregor; Schwab, Peter K.; Herbst, Sebastian; Lenz, Richard. In larger organizations, different groups of data analysts work with different data sources to answer analytical questions. Formulating effective analytical queries requires that the analysts have in-depth knowledge of the existence, semantics, and usage contexts of relevant data sources. Such knowledge is shared informally within individual groups of data analysts but is usually not made available to others in formalized form, so potential synergies remain unused. We present a novel approach that extends existing data management systems with additional capabilities for this knowledge transfer. Our approach fosters collaboration among data analysts without disrupting established analysis processes. In contrast to previous research approaches, analysts are supported in transferring the knowledge contained in analytical queries. Relevant knowledge is extracted from the query log to ease the discovery of data sources and incremental data integration. Extracted knowledge is formalized and made available at query time.
- Conference paper: Application and Testing of Business Processes in the Energy Domain (Datenbanksysteme für Business, Technologie und Web (BTW 2017) - Workshopband, 2017) Böhmer, Kristof; Stertz, Florian; Hildebrandt, Tobias; Rinderle-Ma, Stefanie; Eibl, Günther; Ferner, Cornelia; Burkhart, Sebastian; Engel, Dominik. The energy domain currently struggles with radical legal and technological changes, such as smart meters. This results in new use cases that can be implemented based on business process technology. Understanding and automating business processes requires modeling and testing them. However, existing process testing approaches frequently struggle with negative testing and with the testing of process resources, such as ERP systems. Hence, this work presents a toolchain that tackles these limitations. The approach uses an open source process engine to generate event logs and applies process mining techniques in a novel way.
- Journal article: Die Arbeitsgruppe Datenbanken und Informationssysteme an der TU Dortmund [The Databases and Information Systems Group at TU Dortmund] (Datenbank-Spektrum: Vol. 19, No. 1, 2019) Teubner, Jens
- Conference paper: asprin: Answer Set Programming with Preferences (Datenbanksysteme für Business, Technologie und Web (BTW 2017) - Workshopband, 2017) Romero, Javier. Answer Set Programming (ASP) is a well-established approach to declarative problem solving, combining a rich yet simple modeling language with high-performance solving capacities. In this talk we present asprin, a general, flexible, and extensible framework for preferences in ASP. asprin is general in that it captures many of the existing approaches to preferences. It is flexible because it allows different types of preferences to be combined. It is also extensible, allowing new approaches to preferences to be implemented easily. Since it is straightforward to capture propositional theories and constraint satisfaction problems in ASP, the framework is also relevant to optimization in Satisfiability Testing and Constraint Processing.
- Conference paper: Autonomous Data Ingestion Tuning in Data Warehouse Accelerators (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Stolze, Knut; Beier, Felix; Müller, Jens. The IBM DB2 Analytics Accelerator (IDAA) is a state-of-the-art hybrid database system that seamlessly extends the strong transactional capabilities of DB2 for z/OS with very fast processing of OLAP and analytical SQL workload in Netezza. IDAA copies the data from DB2 for z/OS into its Netezza backend, and customers can tailor data maintenance according to their needs. This copy process, the data load, can be done on a whole table or just a physical table partition. IDAA also offers an incremental update feature, which employs replication technologies for low-latency data synchronization. The accelerator targets big relational databases with several TBs of data. The data load is therefore performance-critical: not only must the data transfer itself be fast, but the system must also scale to a large number of tables, i.e., tens of thousands, loaded at the same time. The administrative overhead for such a number of tables has to be minimized. In this paper, we present our work on a prototype geared towards efficiently loading data for many tables, where each table may store only a comparatively small amount of data. A new load scheduler has been introduced for handling all concurrent load requests for disjoint sets of tables. This is not only required for a multi-tenant setup, but is also a significant improvement for attaching an accelerator to a single DB2 for z/OS system. We present architecture and implementation aspects of the new and improved load mechanism and results of some initial performance evaluations.
- Conference paper: Benchmarking Univariate Time Series Classifiers (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Schäfer, Patrick; Leser, Ulf. A time series is a collection of values recorded sequentially over time. Nowadays, sensors for recording time series are omnipresent in the form of RFID chips, wearables, smart homes, or event-based systems. Time series classification aims at predicting a class label for a time series whose label is unknown. To do so, a classifier has to train a model using labeled samples. Classification time is a key challenge given new applications like event-based monitoring, real-time decision making, or streaming systems. This paper is the first benchmark that compares 12 state-of-the-art time series classifiers based on prediction and classification times. We observed that most of the state-of-the-art classifiers require extensive training and classification times and might not be applicable to these new applications.
- Conference paper: Big Data is no longer equivalent to Hadoop in the industry (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Tönne, Andreas. For a long time, industry projects solved big data problems with Hadoop. The massive scalability of MapReduce algorithms and the HBase database brought solutions to an unanticipated level of computing. But this obscures the need for change. Business goals that emerge from Industry 4.0 or IoT have long been addressed with a suboptimal architecture. New business goals require a rethinking of the big data architecture instead of being driven by the known Hadoop ecosphere. We discuss the transformation of a Hadoop-centric middleware solution to a streaming architecture from a business value perspective. The new architecture also replaces a single NoSQL database with polyglot persistence, which allows focusing on the best performance and quality for each data processing step. We also discuss alternative architecture approaches, like Lambda, that were evaluated in the course of the transformation. We show that a single technology choice likely leads to a suboptimal solution.
- Conference paper: The Big Picture: Understanding large-scale graphs using Graph Grouping with GRADOOP (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Junghanns, Martin; Petermann, André; Teichmann, Niklas; Rahm, Erhard. Graph grouping supports data analysts in decision making based on the characteristics of large-scale, heterogeneous networks containing millions or even billions of vertices and edges. We demonstrate graph grouping with GRADOOP, a scalable system supporting declarative programs composed from multiple graph operations. Using social network data, we highlight the analytical capabilities enabled by graph grouping in combination with other graph operators. The resulting graphs are visualized and visitors are invited to either modify existing or write new analytical programs. GRADOOP is implemented on top of Apache Flink, a state-of-the-art distributed dataflow framework, and thus allows us to scale graph analytical programs across multiple machines. In the demonstration, programs can either be executed locally or remotely on our research cluster.
- Conference paper: Bosch IoT Cloud – Platform for the Internet of Things (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Binz, Tobias. By 2020, all electronic products of Bosch should be IoT-enabled. This IoT and digital transformation is an enormous opportunity for Bosch, addressing the fields of connected mobility, connected industry (Industry 4.0), connected buildings, and smart home. This overarching connectivity strategy is enabled by the Bosch IoT Cloud, a cloud platform specialized in developing, testing, and running scalable IoT services and applications.
- Conference paper: Bring Your Language to Your Data with EXASOL (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Mandl, Stefan; Kozachuk, Oleksandr; Graupmann, Jens. User Defined Functions (UDFs) are an important feature of analytical SQL, as they allow data to be processed right inside relational queries. Typically, UDFs have to be written in a special language, which sometimes diminishes their practical use, especially when required libraries are not available in this language. An alternative approach is to allow users to provide functions as native low-level database extensions, an approach that can be very dangerous. EXASOL now follows a radically different approach by allowing any programming language to be integrated with the database without affecting data integrity: UDF language implementations are encapsulated within Linux containers that communicate with the database engine via a straightforward protocol. Using these technologies, users can now make their own programming language available for UDFs in the database. As an example, we show how to provide C++ as a UDF language for EXASOL.