
Datenbank Spektrum 13(1) - March 2013

Recent publications

1 - 10 of 11
  • Journal article
    Compilation of Query Languages into MapReduce
    (Datenbank-Spektrum: Vol. 13, No. 1, 2013) Sauer, Caetano; Härder, Theo
    The introduction of MapReduce as a tool for Big Data Analytics, combined with the new requirements of emerging application scenarios such as the Web 2.0 and scientific computing, has motivated the development of data processing languages which are more flexible and widely applicable than SQL. Based on the Big Data context, we discuss the points in which SQL is considered too restrictive. Furthermore, we provide a qualitative evaluation of how recent query languages overcome these restrictions. Having established the desired characteristics of a query language, we provide an abstract description of the compilation into the MapReduce programming model, which, up to minor variations, is essentially the same in all approaches. Given the requirements of query processing, we introduce simple generalizations of the model, which allow the reuse of well-established query evaluation techniques, and discuss strategies to generate optimized MapReduce plans.
  • Journal article
    Inkrementelle Neuberechnungen in MapReduce [Incremental Recomputations in MapReduce]
    (Datenbank-Spektrum: Vol. 13, No. 1, 2013) Schildgen, Johannes; Jörg, Thomas; Deßloch, Stefan
    The MapReduce programming model enables scalable analysis and transformation of large data sets. We present Marimba, a framework built on MapReduce for the easy development of incremental, self-maintainable programs that avoid a complete rerun of the MapReduce job when source data change. Marimba is illustrated with several applications and evaluated through performance measurements.
  • Journal article
    Efficient OR Hadoop: Why Not Both?
    (Datenbank-Spektrum: Vol. 13, No. 1, 2013) Dittrich, Jens; Richter, Stefan; Schuh, Stefan
    In this article, we give an overview of research related to Big Data processing in Hadoop going on at the Information Systems Group at Saarland University. We discuss how to make Hadoop efficient. We briefly survey three of our projects in this context: Hadoop++, Trojan Layouts, and HAIL.
  • Journal article
    Datenmanagement und -exploration an der RWTH Aachen [Data Management and Data Exploration at RWTH Aachen]
    (Datenbank-Spektrum: Vol. 13, No. 1, 2013) Seidl, Thomas
    The Chair of Computer Science 9 (Data Management and Data Exploration) at RWTH Aachen works on data mining and database technologies for multimedia and spatio-temporal data in applications from engineering, the natural sciences, the life sciences, economics, and the social sciences. Both the large volume of data and the complexity of the individual objects pose distinct challenges for the analysis and exploration of real-world data, which we address by developing new effective and efficient concepts for data analysis and data management.
  • Journal article
    Dr. Dean Jacobs
    (Datenbank-Spektrum: Vol. 13, No. 1, 2013) Kemper, Alfons; Lehner, Wolfgang
  • Journal article
    Towards Integrated Data Analytics: Time Series Forecasting in DBMS
    (Datenbank-Spektrum: Vol. 13, No. 1, 2013) Fischer, Ulrike; Dannecker, Lars; Siksnys, Laurynas; Rosenthal, Frank; Boehm, Matthias; Lehner, Wolfgang
    Integrating sophisticated statistical methods into database management systems is gaining more and more attention in research and industry in order to be able to cope with increasing data volume and increasing complexity of the analytical algorithms. One important statistical method is time series forecasting, which is crucial for decision making processes in many domains. The deep integration of time series forecasting offers additional advanced functionalities within a DBMS. More importantly, however, it allows for optimizations that improve the efficiency, consistency, and transparency of the overall forecasting process. To enable efficient integrated forecasting, we propose to enhance the traditional 3-layer ANSI/SPARC architecture of a DBMS with forecasting functionalities. This article gives a general overview of our proposed enhancements and presents how forecast queries can be processed using an example from the energy data management domain. We conclude with open research topics and challenges that arise in this area.
  • Journal article
    Parallel Entity Resolution with Dedoop
    (Datenbank-Spektrum: Vol. 13, No. 1, 2013) Kolb, Lars; Rahm, Erhard
    We provide an overview of Dedoop (Deduplication with Hadoop), a new tool for parallel entity resolution (ER) on cloud infrastructures. Dedoop supports a browser-based specification of complex ER strategies and provides a large library of blocking and matching approaches. To simplify the configuration of ER strategies with several similarity metrics, training-based machine learning approaches can be employed with Dedoop. Specified ER strategies are automatically translated into MapReduce jobs for parallel execution on different Hadoop clusters. For improved performance, Dedoop supports redundancy-free multi-pass blocking as well as advanced load balancing approaches. To illustrate the usefulness of Dedoop, we present the results of a comparative evaluation of different ER strategies on a challenging real-world dataset.
  • Journal article
    Dissertationen [Dissertations]
    (Datenbank-Spektrum: Vol. 13, No. 1, 2013)
  • Journal article
    Bericht vom Herbsttreffen der GI-Fachgruppe Datenbanksysteme [Report from the Autumn Meeting of the GI Special Interest Group on Database Systems]
    (Datenbank-Spektrum: Vol. 13, No. 1, 2013) Kemper, Alfons; Mühlbauer, Tobias; Neumann, Thomas; Reiser, Angelika; Rödiger, Wolf
  • Journal article
    News
    (Datenbank-Spektrum: Vol. 13, No. 1, 2013)
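
The abstract of "Inkrementelle Neuberechnungen in MapReduce" listed above describes programs that avoid rerunning a whole MapReduce job when the source data change. The following is only a minimal in-memory sketch of that general idea for a counting job, not Marimba's actual API: the helper names `map_reduce`, `count_mapper`, `sum_reducer`, and `incremental_update` are hypothetical, and the sketch assumes the changes arrive as explicit lists of inserted and deleted records.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run a tiny in-memory map, shuffle, reduce pipeline (illustrative only)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: group values by key
    return {key: reducer(key, values) for key, values in groups.items()}


def count_mapper(word):
    # Emit one (key, 1) pair per input record.
    yield word, 1


def sum_reducer(key, values):
    # Summation is self-maintainable: old result plus delta gives new result.
    return sum(values)


def incremental_update(previous, inserted, deleted):
    """Merge a previous result with the deltas instead of recomputing it."""
    # Insertions contribute +1, deletions contribute -1.
    deltas = [(w, 1) for w in inserted] + [(w, -1) for w in deleted]
    delta_counts = map_reduce(deltas, lambda pair: [pair], sum_reducer)
    merged = dict(previous)
    for key, delta in delta_counts.items():
        new_value = merged.get(key, 0) + delta
        if new_value:
            merged[key] = new_value
        else:
            merged.pop(key, None)  # drop keys whose count fell to zero
    return merged


base = ["a", "b", "a", "c"]
full = map_reduce(base, count_mapper, sum_reducer)

# Source data change: "b" and "d" are inserted, "c" is deleted.
updated = incremental_update(full, inserted=["b", "d"], deleted=["c"])

# A full recomputation over the changed data yields the same result.
recomputed = map_reduce(["a", "b", "a", "b", "d"], count_mapper, sum_reducer)
assert updated == recomputed
```

The sketch processes only the changed records in the second run, which works here because a sum can be corrected by adding positive and negative deltas. Aggregates without that property, such as a minimum under deletions, would need additional stored state or a fallback to full recomputation.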