P265 - BTW2017 - Datenbanksysteme für Business, Technologie und Web

Recent publications

1 - 10 of 56
  • Conference paper
    Distributed Grouping of Property Graphs with GRADOOP
    (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Junghanns, Martin; Petermann, André; Rahm, Erhard
    Property graphs are an intuitive way to model, analyze and visualize complex relationships among heterogeneous data objects, for example, as they occur in social, biological and information networks. These graphs typically contain thousands or millions of vertices and edges and their entire representation can easily overwhelm an analyst. One way to reduce complexity is the grouping of vertices and edges to summary graphs. In this paper, we present an algorithm for graph grouping with support for attribute aggregation and structural summarization by user-defined vertex and edge properties. The algorithm is part of GRADOOP, an open-source system for graph analytics. GRADOOP is implemented on top of Apache Flink, a state-of-the-art distributed dataflow framework, and thus allows us to scale graph analytical programs across multiple machines. Our evaluation demonstrates the scalability of the algorithm on real-world and synthetic social network data.
  • Conference paper
    The STARK Framework for Spatio-Temporal Data Analytics on Spark
    (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Hagedorn, Stefan; Götze, Philipp; Sattler, Kai-Uwe
    Big Data sets can contain all types of information: from server log files to tracking information of mobile users with their location at a point in time. Apache Spark has been widely accepted for Big Data analytics because of its very fast processing model. However, Spark has no native support for spatial or spatio-temporal data. Spatial filters or joins using, e.g., a contains predicate are not supported and would have to be implemented inefficiently by the users. Also, Spark cannot make use of, e.g., spatial distribution for optimal partitioning. Here we present our STARK framework that adds spatio-temporal support to Spark. It includes spatial partitioners, different modes for indexing, as well as filter, join, and clustering operators. In contrast to existing solutions, STARK integrates seamlessly into any (Scala) Spark program and provides more flexible and comprehensive operators. Furthermore, our experimental evaluation shows that our implementation outperforms existing solutions.
  • Conference paper
    Transformations on Graph Databases for Polyglot Persistence with NotaQL
    (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Schildgen, Johannes; Krück, Yannick; Deßloch, Stefan
    Polyglot-persistence applications use a combination of many different data stores. Often, one of them is a graph database to model relationships between data items. The data-transformation language NotaQL can be used to define transformations from one NoSQL database to a different one. In this paper, we present a language extension for NotaQL to allow graph transformations, graph analysis, and data migrations on graph databases. NotaQL is schema-flexible, it offers filters and aggregation functions, and it allows for graph traversal and edge creation. Our graph-transformation platform can be used for iterative graph algorithms and bulk processing.
  • Conference paper
    Reverse Engineering Top-k Join Queries
    (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Panev, Kiril; Weisenauer, Nico; Michel, Sebastian
    Ranked lists have become a fundamental tool to represent the most important items taken from a large collection of data. Search engines, sports leagues and e-commerce platforms present their results, most successful teams and most popular items in a concise and structured way by making use of ranked lists. This paper introduces the PALEO-J framework, which is able to reconstruct top-k database queries, given only the original query output in the form of a ranked list and the database itself. The query to be reverse engineered may contain a wide range of aggregation functions and an arbitrary number of equality joins, joining several database relations. The challenge of this work is to reconstruct complex queries as fast as possible while operating on large databases and given only the small amount of information provided by the top-k list of entities serving as input. The core contribution is identifying the join predicates in reverse engineering top-k OLAP queries. Furthermore, we introduce several optimizations and an advanced classification system to reduce the execution time of the algorithm. Experiments conducted on a large database show the performance of the presented approach and confirm the benefits of our optimizations.
  • Conference paper
    The Big Picture: Understanding large-scale graphs using Graph Grouping with GRADOOP
    (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Junghanns, Martin; Petermann, André; Teichmann, Niklas; Rahm, Erhard
    Graph grouping supports data analysts in decision making based on the characteristics of large-scale, heterogeneous networks containing millions or even billions of vertices and edges. We demonstrate graph grouping with GRADOOP, a scalable system supporting declarative programs composed from multiple graph operations. Using social network data, we highlight the analytical capabilities enabled by graph grouping in combination with other graph operators. The resulting graphs are visualized and visitors are invited to either modify existing or write new analytical programs. GRADOOP is implemented on top of Apache Flink, a state-of-the-art distributed dataflow framework, and thus allows us to scale graph analytical programs across multiple machines. In the demonstration, programs can either be executed locally or remotely on our research cluster.
  • Conference paper
    Secure Cryptographic Deletion in the Swift Object Store
    (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Waizenegger, Tim
    The secure deletion of data is of increasing importance to individuals, corporations as well as governments. Recent data breaches as well as advances in laws and regulations show that secure deletion is becoming a requirement in many areas. However, this requirement is rarely considered in today’s cloud storage services. The reason is that the established processes for secure deletion of on-site storage are not applicable to cloud storage services. Cryptographic deletion is a suitable candidate for these services, but a research gap still exists in applying cryptographic deletion to large cloud storage services. For these reasons, we demonstrate a working prototype for a secure-deletion enabled cloud storage service with the following two main contributions: A model for offering high value service without full plain-text access to the provider, as well as secure deletion of data through cryptography.
  • Conference paper
    Invest Once, Save a Million Times – LLVM-based Expression Compilation in PostgreSQL
    (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Butterstein, Dennis; Grust, Torsten
    We demonstrate how a surgical change to PostgreSQL’s evaluation strategy for SQL expressions can have a noticeable impact on overall query runtime performance. (This is an abridged version of [BG16], originally demonstrated at VLDB 2016.)
  • Conference paper
    InVerDa – The Liquid Database
    (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Herrmann, Kai; Voigt, Hannes; Seyschab, Thorsten; Lehner, Wolfgang
    Multiple applications that share one common database will evolve over time by their very nature. Often, former versions need to stay available, so database developers find themselves maintaining co-existing schema versions of multiple applications in multiple versions (usually with handwritten delta code), which is highly error-prone and accounts for significant costs in software projects. We showcase InVerDa, a tool using the richer semantics of a bidirectional database evolution language to generate all the delta code automatically, easily providing co-existing schema versions within one database. InVerDa automatically decides on an optimized physical database schema serving all schema versions to transparently optimize the performance for the current workload.
  • Conference paper
    Exploring Databases via Reverse Engineering Ranking Queries with PALEO
    (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Panev, Kiril; Michel, Sebastian; Milchevski, Evica; Pal, Koninika
    A novel approach to explore databases using ranked lists is demonstrated. Working with ranked lists, capturing the relative performance of entities, is a very intuitive and widely applicable concept. Users can post lists of entities for which explanatory SQL queries and full result lists are returned. By refining the input, the results, or the queries, users can interactively explore the database content. The demonstrated system was previously presented at VLDB 2016 and is centered around our PALEO framework for reverse engineering OLAP-style database queries. How is this useful for exploring data?