P290 - BTW2019 - Datenbanksysteme für Business, Technologie und Web - Workshopband
- 1st Workshop on Novel Data Management Ideas on Heterogeneous (Co-)Processors (NoDMC) (BTW 2019 – Workshopband, 2019) Broneske, David; Habich, Dirk
- An Actor Database System for Akka (BTW 2019 – Workshopband, 2019) Schmidl, Sebastian; Schneider, Frederic; Papenbrock, Thorsten
  System architectures for data-centric applications commonly comprise two tiers: an application tier and a data tier. The fact that these tiers do not typically share a common format for data is referred to as object-relational impedance mismatch. To mitigate this, we develop an actor database system that enables application logic to be implemented within the data storage runtime. The actor model also allows for easy distribution of both data and computation across multiple nodes in a cluster. More specifically, we propose the concept of domain actors, which provide a type-safe, SQL-like interface for developing the actors of our database system, and the concept of Functors, which are used to build queries that retrieve data contained in multiple actor instances. Our experiments demonstrate the feasibility of encapsulating data into domain actors by evaluating their memory overhead and performance. We also discuss how our proposed actor database system framework solves some of the challenges that arise from the design of distributed databases, such as data partitioning, failure handling, and concurrent query processing.
- Angepasstes Item Set Mining zur gezielten Steuerung von Bauteilen in der Serienfertigung von Fahrzeugen (BTW 2019 – Workshopband, 2019) Spieß, Marco; Reimann, Peter
  Quality problems in vehicle manufacturing can not only damage a company's image but also incur correspondingly high costs. When a component is identified as the cause of a quality problem, its installation must be stopped. Data analysis can reveal which vehicle configurations have problems with this faulty component. Within this domain-specific problem setting, this paper examines the applicability of standard data mining algorithms. Because their analysis results point to standard equipment, they are not useful. For this business problem of vehicle manufacturers, we have developed a data mining algorithm that adapts the item set mining procedure of association analysis to the domain-specific problem. It differs from the classic Apriori algorithm in how it prunes the result space and in the subsequent preparation and use of the item sets. The algorithm is generally applicable to all vehicle manufacturers. The results were evaluated on a real use case in which applying our algorithm can prevent 87% of field failures.
- Assessing the Impact of Driving Bans with Data Analysis (BTW 2019 – Workshopband, 2019) Woltmann, Lucas; Hartmann, Claudio; Lehner, Wolfgang
- Automated Architecture-Modeling for Convolutional Neural Networks (BTW 2019 – Workshopband, 2019) Duong, Manh Khoi
  Tuning hyperparameters can be very counterintuitive and misleading, yet it plays a big (or even the biggest) part in many machine learning algorithms. For instance, the architecture of an artificial neural network (ANN) can itself be seen as a set of hyperparameters, e.g., the number of convolutional layers or the number of fully connected layers. These can be tuned manually or with techniques such as grid search or random search. Even then, finding optimal hyperparameters seems to be impossible. This paper tries to counter this problem by using Bayesian optimization, which finds optimal parameters, including the right architecture for ANNs. In our case, a histological image dataset was used to classify breast cancer into stages.
- BTW2019 - Datenbanksysteme für Business, Technologie und Web - Workshopband (BTW 2019 – Workshopband, 2019) Meyer, Holger; Ritter, Norbert; Thor, Andreas; Nicklas, Daniela; Heuer, Andreas; Klettke, Meike
- Chain-detection for DBSCAN (BTW 2019 – Workshopband, 2019) Held, Janis; Beer, Anna; Seidl, Thomas
  Chains connecting two or more different clusters are a well-known problem of DBSCAN, probably the most famous density-based clustering algorithm. Since even a small number of points resulting from, e.g., noise can form such a chain and build a bridge between different clusters, the results of DBSCAN can be distorted: several disparate clusters get merged into one. This single-link effect is well known, but to the best of our knowledge there are no satisfying solutions yet that extract those chains. We present a new algorithm that detects not only straight chains between clusters, but also bent and noisy ones. Users can choose between eliminating one-dimensional and higher-dimensional chains connecting clusters in order to obtain the underlying cluster structure from DBSCAN. The desired straightness can also be set by the user. We tested our efficient algorithm on a dataset containing traffic accidents in Great Britain and were able to detect chains emerging from streets between cities and villages, which led to clusters composed of diverse villages.
- A Comparison of Distributed Stream Processing Systems for Time Series Analysis (BTW 2019 – Workshopband, 2019) Gehring, Melissa; Charfuelan, Marcela; Markl, Volker
  Given the vast number of data processing systems available today, this paper aims to identify, select, and evaluate systems to determine the one that is better suited to time series analysis. Published performance studies are used to compare several open-source systems, and two systems are then selected for qualitative comparison and evaluation with regard to the development of a time series analytics task. The main interest of this work lies in investigating the ease of development. As a test scenario, a discrete Kalman filter is implemented to predict the closing price of stock market data in real time. Basic functionality coverage is considered, and advanced functionality is evaluated using several qualitative comparison criteria.
- Computation Offloading in JVM-based Dataflow Engines (BTW 2019 – Workshopband, 2019) Gavriilidis, Haralampos
  State-of-the-art dataflow engines, such as Apache Spark and Apache Flink, scale out on large clusters for a variety of data-processing tasks, including machine learning and data mining algorithms. However, being based on the JVM, they are unable to apply optimizations supported by modern CPUs. In contrast, specialized data processing frameworks scale up by exploiting modern CPU characteristics. The goal of this thesis is to find the sweet spot between scale-out and scale-up systems by offloading computation from dataflow engines to specialized systems. We propose two computation offloading methods, reason about their applicability, and implement a prototype based on Apache Spark. Our evaluation shows that for compute-intensive tasks, computation offloading leads to performance improvements of up to a factor of 2.5x. For certain UDF scenarios, computation offloading performs worse by up to a factor of 3x: our microbenchmarks show that 80% of the time is spent on serialization operations. By employing data exchange without serialization, computation offloading achieves performance improvements of up to 10x.
- Context Selection in a Heterogeneous Legal Ontology (BTW 2019 – Workshopband, 2019) Wehnert, Sabine; Fenske, Wolfram; Saake, Gunter
  Ontology building in the legal domain is subject to ongoing research. Taxonomic ontologies provide, for instance, concept hierarchies for term definitions, annotations, query expansion, and support for inferences. However, the context-dependent application of statutory legal texts is hard to model, often leading to a limited ontology scope and fixed terminology to avoid conflicts. In previous work, we presented a method to create a lightweight heterogeneous ontology from textbooks that offers connections between laws while avoiding an error-prone and costly ontology alignment step. In our ontology, laws are linked by common contexts. We propose a new data model so that the context can be explored and selected by a user, which is necessary for many applications, such as recommender systems. To obtain the relevant user context, we added a mechanism to retrieve linked laws from our ontology, given a scope of user interest and context information for each law.
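
For readers unfamiliar with the test scenario named in the stream processing comparison above (a discrete Kalman filter predicting closing prices), the following is a minimal illustrative sketch of a scalar discrete Kalman filter. It is not taken from that paper: the random-walk state model, the function name `kalman_1d`, and all parameter names and default values are assumptions chosen only for illustration.

```python
from typing import Iterable, List


def kalman_1d(observations: Iterable[float],
              process_var: float = 1e-3,
              measurement_var: float = 1e-1,
              initial_estimate: float = 0.0,
              initial_var: float = 1.0) -> List[float]:
    """Scalar discrete Kalman filter under an assumed random-walk model.

    Returns, for each observation, the one-step-ahead prediction that was
    made before that observation was seen.
    """
    estimate, var = initial_estimate, initial_var
    predictions: List[float] = []
    for z in observations:
        # Predict step: under a random walk the state estimate is carried
        # over unchanged, while its uncertainty grows by the process noise.
        var_pred = var + process_var
        predictions.append(estimate)

        # Update step: blend prediction and observation via the Kalman gain.
        gain = var_pred / (var_pred + measurement_var)
        estimate = estimate + gain * (z - estimate)
        var = (1.0 - gain) * var_pred
    return predictions


if __name__ == "__main__":
    # Hypothetical closing prices, made up purely for demonstration.
    prices = [100.0, 100.5, 101.2, 100.9, 101.5]
    for price, pred in zip(prices, kalman_1d(prices, initial_estimate=100.0)):
        print(f"observed={price:.2f} predicted={pred:.2f}")
```

In a streaming setting, the loop body would correspond to the per-event state update, with `estimate` and `var` held as operator state.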