Browsing by author "Schwarz, Holger"
1 - 10 of 67
- Conference paper: Advanced cardinality estimation in the XML query graph model (Datenbanksysteme für Business, Technologie und Web (BTW), 2011)
  Weiner, Andreas M.
  Reliable cardinality estimation is one of the key prerequisites for effective cost-based query optimization in database systems. The XML Query Graph Model (XQGM) is a tuple-based XQuery algebra that can be used to represent XQuery expressions in native XML database management systems. This paper enhances previous work on reliable cardinality estimation for XQuery and introduces several inference rules that deal with the unique features of XQGM, such as native support for Structural Joins, nesting, and multi-way merging. These rules make it possible to estimate the runtime cardinalities of XQGM operators. Using this approach, we can support classical join reordering with appropriate statistical information, perform cost-based query unnesting, and help to find the best evaluation strategy for value-based joins. The effectiveness of our approach for query optimization is evaluated using the query optimizer of XTC.
- Conference paper: AIMS: an SQL-based system for airspace monitoring (Datenbanksysteme für Business, Technologie und Web (BTW), 2011)
  Schüller, Gereon; Behrend, Andreas
  The “Airspace Monitoring System” (AIMS) is a system for monitoring and analyzing flight data streams with respect to the occurrence of arbitrary complex events. It is a general system that allows for a comprehensive analysis of aircraft movements, in contrast to existing tools, which focus on a single task such as flight delay detection. For instance, the system is able to detect critical deviations from the current flight plan, abnormal approach parameters of landing flights, and areas with an increased risk of collisions. To this end, tracks are extracted from cluttered radar data, and SQL views are employed for timely processing of these tracks. Additionally, the data is stored for later analysis.
- Conference paper: Approach to Synthetic Data Generation for Imbalanced Multi-class Problems with Heterogeneous Groups (BTW 2023, 2023)
  Treder-Tschechlov, Dennis; Reimann, Peter; Schwarz, Holger; Mitschang, Bernhard
  To benchmark novel classification algorithms, these algorithms should be evaluated on data with characteristics that also appear in real-world use cases. Important data characteristics that often lead to challenges for classification approaches are multi-class imbalance and heterogeneous groups. Real-world data that comprise these characteristics are usually not publicly available, e.g., because they constitute sensitive patient information or due to privacy concerns. Further, the manifestations of the characteristics cannot be controlled specifically on real-world data. A more rigorous approach is to synthetically generate data such that different manifestations of the characteristics can be controlled. However, existing data generators are not able to generate data that feature both data characteristics, i.e., multi-class imbalance and heterogeneous groups. In this paper, we propose an approach that fills this gap, as it makes it possible to synthetically generate data that exhibit both characteristics. In particular, we make use of a taxonomy model that organizes real-world entities in domain-specific heterogeneous groups to generate data reflecting the characteristics of these groups. In addition, we incorporate probability distributions to reflect the imbalances of multiple classes and groups from real-world use cases. Our approach is applicable in different domains, as taxonomies are the simplest form of knowledge models and thus are available in many domains. The evaluation shows that our approach can generate data that feature the data characteristics multi-class imbalance and heterogeneous groups and that it allows different manifestations of these characteristics to be controlled.
- Conference paper: Architecture of a highly scalable data warehouse appliance integrated to mainframe database systems (Datenbanksysteme für Business, Technologie und Web (BTW), 2011)
  Stolze, Knut; Beier, Felix; Sattler, Kai-Uwe; Sprenger, Sebastian; Grolimund, Carlos Caballero; Czech, Marco
- Conference paper: Available-to-promise on an in-memory column store (Datenbanksysteme für Business, Technologie und Web (BTW), 2011)
  Tinnefeld, Christian; Müller, Stephan; Kaltegärtner, Helen; Hillig, Sebastian; Butzmann, Lars; Eickhoff, David; Klauck, Stefan; Taschik, Daniel; Wagner, Björn; Xylander, Oliver; Tosun, Cafer; Zeier, Alexander; Plattner, Hasso
  Available-To-Promise (ATP) is an application in the context of Supply Chain Management (SCM) systems and provides a checking mechanism that calculates whether the desired products of a customer order can be delivered on the requested date. Modern SCM systems store relevant data records as aggregated numbers, which entails the disadvantages of maintaining redundant data as well as inflexibility in querying the data. Our approach omits aggregates by storing all individual data records in an in-memory column store and scanning through all relevant records on the fly for each check. We contribute by describing the novel data organization and a locking-free, highly concurrent ATP checking algorithm. Additionally, we explain how new business functionality such as instant rescheduling of orders can be realized with our approach. All concepts are implemented within a prototype and benchmarked using an anonymized SCM dataset of a Fortune 500 consumer products company. The paper closes with a discussion of the results and gives an outlook on how this approach can help companies find the right balance between low inventory costs and high order fulfillment rates.
- Conference paper: Benchmarking hybrid OLTP&OLAP database systems (Datenbanksysteme für Business, Technologie und Web (BTW), 2011)
  Funke, Florian; Kemper, Alfons; Neumann, Thomas
  Recently, the case has been made for operational or real-time Business Intelligence (BI). As the traditional separation into OLTP database and OLAP data warehouse incurs severe latency disadvantages for operational BI, hybrid OLTP&OLAP database systems are being developed. The advent of the first generation of such hybrid OLTP&OLAP database systems requires means to characterize their performance. While there are standardized and widely used benchmarks addressing either OLTP or OLAP workloads, the lack of a hybrid benchmark led us to the definition of a new mixed workload benchmark, called TPC-CH. This benchmark bridges the gap between the existing single-workload suites: TPC-C for OLTP and TPC-H for OLAP. The newly proposed TPC-CH benchmark executes a mixed workload: a transactional workload based on the order entry processing of TPC-C and a corresponding TPC-H-equivalent OLAP query suite on this sales database. As it is derived from these two most widely used TPC benchmarks, our new TPC-CH benchmark produces results that are highly comparable for both hybrid systems and classic single-workload systems. Thus, we are able to compare the performance of our own (and other) hybrid database systems running both OLTP and OLAP workloads in parallel with the OLTP performance of dedicated transactional systems (e.g., VoltDB) and the OLAP performance of specialized OLAP databases (e.g., column stores such as MonetDB).
- Journal article: BTW 2017 in Stuttgart (Datenbank-Spektrum: Vol. 17, No. 2, 2017)
  Mitschang, Bernhard; Schwarz, Holger
- Conference paper: Cloud Storage: Wie viel Cloud Computing steckt dahinter? [Cloud storage: how much cloud computing is behind it?] (Datenbanksysteme für Business, Technologie und Web (BTW), 2011)
  Jaeger, Michael C.; Hohenstein, Uwe
  The rapid pace of change in cloud computing creates new potential, but also risks, for existing and new applications. An important resource when using cloud computing is certainly cloud storage. The term cloud storage subsumes data stores that cloud computing providers offer in different variants. The key question for the use of cloud storage is to what extent the essential properties of cloud computing, such as elasticity, scalability, and cost reduction, are realized at the technical level. Examples show that, as things currently stand, many of these properties are not completely fulfilled.
- Conference paper: Cloudy transactions: cooperative XML authoring on Amazon S3 (Datenbanksysteme für Business, Technologie und Web (BTW), 2011)
  Gropengießer, Francis; Baumann, Stephan; Sattler, Kai-Uwe
  Over the last few years, cloud computing has received great attention in the research community. Customers can rent infrastructure, storage, and even software in the form of services. This way, they only have to pay for the actual use of these components or services. Cloud computing also comes with great opportunities for distributed design applications, which often require multiple users to work cooperatively on shared data. In order to enable cooperation, strict consistency is necessary. However, cloud storage services often provide only eventual consistency. In this paper, we propose a system that allows for strictly consistent and cooperative XML authoring in distributed environments based on Amazon S3. Our solution makes use of local and distributed transactions, which are synchronized in an optimistic fashion, in order to ensure correctness. An important contribution of this paper is the evaluation of our system in a real deployment scenario on Amazon S3. We show the strong impact of write operations to S3 on the transaction throughput. Furthermore, we show that fragmenting data increases write performance and reduces storage costs.
- Conference paper: Conceptual views for entity-centric search: turning data into meaningful concepts (Datenbanksysteme für Business, Technologie und Web (BTW), 2011)
  Selke, Joachim; Homoceanu, Silviu; Balke, Wolf-Tilo
  The storage, management, and retrieval of entity data has always been among the core applications of database systems. However, since many people nowadays access entity collections over the Web (e.g., when searching for products, people, or events), there is a growing need for integrating unconventional types of data into these systems, most notably entity descriptions in unstructured textual form. Prime examples are product reviews, user ratings, tags, and images. While the storage of this data is well supported by modern database technology, the means for querying it in semantically meaningful ways remain very limited. Consequently, entity-centric search suffers from a growing semantic gap between the users' intended queries and the database's schema. In this paper, we introduce the notion of conceptual views, an innovative extension of traditional database views, which aim to uncover those query-relevant concepts that are primarily reflected by unstructured data. We focus on concepts that are vague in nature and cannot be easily extracted by existing technology (e.g., business phone and romantic movie). After discussing different types of concepts and conceptual queries, we present two case studies, which illustrate how meaningful conceptual information can automatically be extracted from existing data, thus enabling the effective handling of vague real-world query concepts.