Listing by author "Mitschang, Bernhard"
1 - 10 of 224
- Conference paper: Advanced cardinality estimation in the XML query graph model (Datenbanksysteme für Business, Technologie und Web (BTW), 2011) Weiner, Andreas M.
  Reliable cardinality estimation is one of the key prerequisites for effective cost-based query optimization in database systems. The XML Query Graph Model (XQGM) is a tuple-based XQuery algebra that can be used to represent XQuery expressions in native XML database management systems. This paper enhances previous work on reliable cardinality estimation for XQuery and introduces several inference rules that deal with the unique features of XQGM, such as native support for Structural Joins, nesting, and multi-way merging. These rules make it possible to estimate the runtime cardinalities of XQGM operators. Using this approach, we can support classical join reordering with appropriate statistical information, perform cost-based query unnesting, and help to find the best evaluation strategy for value-based joins. The effectiveness of our approach for query optimization is evaluated using the query optimizer of XTC.
- Conference paper: AIMS: an SQL-based system for airspace monitoring (Datenbanksysteme für Business, Technologie und Web (BTW), 2011) Schüller, Gereon; Behrend, Andreas
  The “Airspace Monitoring System” (AIMS) is a system for monitoring and analyzing flight data streams with respect to the occurrence of arbitrary complex events. It is a general system that allows for a comprehensive analysis of aircraft movements, in contrast to already existing tools which focus on a single task like flight delay detection. For instance, the system is able to detect critical deviations from the current flight plan, abnormal approach parameters of landing flights as well as areas with an increased risk of collisions. To this end, tracks are extracted from cluttered radar data and SQL views are employed for a timely processing of these tracks. Additionally, the data is stored for later analysis.
- Conference paper: Anfrage-getriebener Wissenstransfer zur Unterstützung von Datenanalysten (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Wahl, Andreas M.; Endler, Gregor; Schwab, Peter K.; Herbst, Sebastian; Lenz, Richard
  In larger organizations, different groups of data analysts work with different data sources to answer analytical questions. Formulating effective analytical queries requires that the data analysts have in-depth knowledge of the existence, semantics, and usage contexts of relevant data sources. Such knowledge is shared informally within individual groups of data analysts, but is usually not made available to others in formalized form. Potential synergies thus remain unused. We present a novel approach that extends existing data management systems with additional capabilities for this knowledge transfer. Our approach fosters collaboration between data analysts without disrupting established analysis processes. In contrast to previous research approaches, analysts are supported in transferring the knowledge contained in analytical queries. Relevant knowledge is extracted from the query log to facilitate the discovery of data sources and incremental data integration. Extracted knowledge is formalized and made available at query time.
- Conference paper: Ein Annotationsansatz zur Unterstützung einer ganzheitlichen Geschäftsanalyse (Synergien durch Integration und Informationslogistik, 2008) Radeschütz, Sylvia; Niedermann, Florian; Mitschang, Bernhard
- Conference paper: Application and Testing of Business Processes in the Energy Domain (Datenbanksysteme für Business, Technologie und Web (BTW 2017) - Workshopband, 2017) Böhmer, Kristof; Stertz, Florian; Hildebrandt, Tobias; Rinderle-Ma, Stefanie; Eibl, Günther; Ferner, Cornelia; Burkhart, Sebastian; Engel, Dominik
  The energy domain currently struggles with radical legal and technological changes, such as smart meters. This results in new use cases which can be implemented based on business process technology. Understanding and automating business processes requires modeling and testing them. However, existing process testing approaches frequently struggle with the testing of process resources, such as ERP systems, and with negative testing. Hence, this work presents a toolchain which tackles these limitations. The approach uses an open source process engine to generate event logs and applies process mining techniques in a novel way.
- Conference paper: Applying stratosphere for big data analytics (Datenbanksysteme für Business, Technologie und Web (BTW) 2013, 2013) Leich, Marcus; Adamek, Jochen; Schubotz, Moritz; Heise, Arvid; Rheinländer, Astrid; Markl, Volker
  Analyzing big data sets as they occur in modern business and science applications requires query languages that allow for the specification of complex data processing tasks. Moreover, these ideally declarative query specifications have to be optimized, parallelized and scheduled for processing on massively parallel data processing platforms. This paper demonstrates the application of Stratosphere to different kinds of Big Data Analytics tasks. Using examples from different application domains, we show how to formulate analytical tasks as Meteor queries and execute them with Stratosphere. These examples include data cleansing and information extraction tasks, and a correlation analysis of microblogging and stock trade volume data that we describe in detail in this paper.
- Conference paper: Approach to Synthetic Data Generation for Imbalanced Multi-class Problems with Heterogeneous Groups (BTW 2023, 2023) Treder-Tschechlov, Dennis; Reimann, Peter; Schwarz, Holger; Mitschang, Bernhard
  To benchmark novel classification algorithms, these algorithms should be evaluated on data with characteristics that also appear in real-world use cases. Important data characteristics that often lead to challenges for classification approaches are multi-class imbalance and heterogeneous groups. Real-world data that comprise these characteristics are usually not publicly available, e.g., because they constitute sensitive patient information or due to privacy concerns. Further, the manifestations of the characteristics cannot be controlled specifically on real-world data. A more rigorous approach is to synthetically generate data such that different manifestations of the characteristics can be controlled. However, existing data generators are not able to generate data that feature both data characteristics, i.e., multi-class imbalance and heterogeneous groups. In this paper, we propose an approach that fills this gap, as it allows data that exhibit both characteristics to be generated synthetically. In particular, we make use of a taxonomy model that organizes real-world entities in domain-specific heterogeneous groups to generate data reflecting the characteristics of these groups. In addition, we incorporate probability distributions to reflect the imbalances of multiple classes and groups from real-world use cases. Our approach is applicable in different domains, as taxonomies are the simplest form of knowledge models and thus are available in many domains. The evaluation shows that our approach can generate data that feature the data characteristics multi-class imbalance and heterogeneous groups and that it allows different manifestations of these characteristics to be controlled.
- Conference paper: Architecture of a highly scalable data warehouse appliance integrated to mainframe database systems (Datenbanksysteme für Business, Technologie und Web (BTW), 2011) Stolze, Knut; Beier, Felix; Sattler, Kai-Uwe; Sprenger, Sebastian; Grolimund, Carlos Caballero; Czech, Marco
- Conference paper: asprin: Answer Set Programming with Preferences (Datenbanksysteme für Business, Technologie und Web (BTW 2017) - Workshopband, 2017) Romero, Javier
  Answer Set Programming (ASP) is a well-established approach to declarative problem solving, combining a rich yet simple modeling language with high-performance solving capacities. In this talk we present asprin, a general, flexible and extensible framework for preferences in ASP. asprin is general and captures many of the existing approaches to preferences. It is flexible, because it allows for the combination of different types of preferences. It is also extensible, allowing for an easy implementation of new approaches to preferences. Since it is straightforward to capture propositional theories and constraint satisfaction problems in ASP, the framework is also relevant to optimization in Satisfiability Testing and Constraint Processing.
- Conference paper: Autonomous Data Ingestion Tuning in Data Warehouse Accelerators (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Stolze, Knut; Beier, Felix; Müller, Jens
  The IBM DB2 Analytics Accelerator (IDAA) is a state-of-the-art hybrid database system that seamlessly extends the strong transactional capabilities of DB2 for z/OS with very fast processing of OLAP and analytical SQL workloads in Netezza. IDAA copies the data from DB2 for z/OS into its Netezza backend, and customers can tailor data maintenance according to their needs. This copy process, the data load, can be done on a whole table or just a physical table partition. IDAA also offers an incremental update feature, which employs replication technologies for low-latency data synchronization. The accelerator targets big relational databases with several TBs of data. Therefore, the data load is performance-critical: beyond the data transfer itself, the system has to scale to a large number of tables, i.e., tens of thousands, being loaded at the same time. The administrative overhead for such a number of tables has to be minimized. In this paper, we present our work on a prototype, which is geared towards efficiently loading data for many tables, where each table may store only a comparatively small amount of data. A new load scheduler has been introduced for handling all concurrent load requests for disjoint sets of tables. That is not only required for a multi-tenant setup, but is also a significant improvement for attaching an accelerator to a single DB2 for z/OS system. In this paper, we present architecture and implementation aspects of the new and improved load mechanism and results of some initial performance evaluations.