Browsing BTW - Datenbanksysteme für Business, Technologie und Web by author "Adamek, Jochen"
- Conference paper: Applying stratosphere for big data analytics (Datenbanksysteme für Business, Technologie und Web (BTW) 2046, 2013). Leich, Marcus; Adamek, Jochen; Schubotz, Moritz; Heise, Arvid; Rheinländer, Astrid; Markl, Volker.
  Analyzing big data sets as they occur in modern business and science applications requires query languages that allow for the specification of complex data processing tasks. Moreover, these ideally declarative query specifications have to be optimized, parallelized and scheduled for processing on massively parallel data processing platforms. This paper demonstrates the application of Stratosphere to different kinds of Big Data Analytics tasks. Using examples from different application domains, we show how to formulate analytical tasks as Meteor queries and execute them with Stratosphere. These examples include data cleansing and information extraction tasks, and a correlation analysis of microblogging and stock trade volume data that we describe in detail in this paper.
- Conference paper: Operators for analyzing and modifying probabilistic data - a question of efficiency (Datenbanksysteme für Business, Technologie und Web (BTW), 2011). Adamek, Jochen; Eisenreich, Katrin; Markl, Volker; Rösch, Philipp.
  To enable analyses and decision support over historic, forecast, and estimated data, efficient querying and modification of probabilistic data is an important aspect. In earlier work, we proposed a data model and operators for the analysis and the modification of uncertain data in support of what-if scenario analysis. Naturally, and as discussed broadly in previous research, the representation of uncertain data introduces additional complexity to queries over such data. When targeting the interactive creation and evaluation of scenarios, we must be aware of the run-time performance of the provided functionalities in order to better estimate response times and reveal optimization potential to users. The present paper builds on our previous work, addressing both a comprehensive evaluation of the complexity of selected operators and an experimental validation. Specifically, we investigate the effects of varying operator parameterizations and the underlying data characteristics. We provide examples in the context of a simple analysis process and discuss our findings and possible optimizations.