P103 - BTW2007 - Datenbanksysteme in Business, Technologie und Web
Listing of P103 - BTW2007 - Datenbanksysteme in Business, Technologie und Web, sorted by title
Entries 1 - 10 of 38
- Conference paper: Algebraic Query Optimization for Distributed Top-k Queries (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007). Neumann, Thomas; Michel, Sebastian.
  Distributed top-k query processing is increasingly becoming an essential functionality in a large number of emerging application classes. This paper addresses the efficient algebraic optimization of top-k queries in wide-area distributed data repositories where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers and the computational costs include network latency, bandwidth consumption, and local peer work. We use a dynamic programming approach to find the optimal execution plan, using compact data synopses for selectivity estimation as the basis of our cost model. The optimized query is executed in a hierarchical way involving a small and fixed number of communication phases. We have performed experiments on real web data that show the benefits of distributed top-k query optimization in both network resource consumption and query response time.
- Conference paper: Algorithms for merged indexes (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007). Graefe, Goetz.
- Conference paper: Anreizmechanismen für Peer-to-Peer Web Crawling unter Berücksichtigung bösartiger Teilnehmer [Incentive Mechanisms for Peer-to-Peer Web Crawling in the Presence of Malicious Participants] (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007). Steinhaus, Holger; Böhm, Klemens; Schosser, Stephan.
  Web crawling is fundamental to a wide range of applications. Due to the high resource requirements involved, however, only few organizations are able to operate their own crawlers. To open up new application areas and achieve high scalability as well as a fair distribution of costs, web crawling must be not only distributed but also coordinator-free. Peer-to-peer techniques can meet these requirements, but they need incentive mechanisms to motivate participants to contribute in the long run. Such mechanisms must take into account several peculiarities of the web-crawling application: detecting non-contributing participants is difficult, since non-contribution cannot be observed directly. Furthermore, a participant should be allowed to refrain from crawling temporarily without being regarded as malicious by other peers. Finally, the mechanisms must be robust against targeted malicious attacks. We present incentive mechanisms for a peer-to-peer system that meet these requirements. In addition to a feedback-based reputation system, we employ mechanisms that allow the performance claims of individual participants to be verified. Extensive simulation experiments evaluate the impact of malicious participants and demonstrate the effectiveness and robustness of the approach.
- Conference paper: Armada: a Reference Model for an Evolving Database System (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007). Groffen, Fabian; Kersten, Martin; Manegold, Stefan.
  The data on the web, in digital libraries, in scientific repositories, etc. continues to grow at an increasing rate. Distribution is a key solution to overcome this data explosion. However, existing solutions are mostly based on architectures with a single point of failure. In this paper, we present Armada, a model for a database architecture to handle large data volumes. Armada assumes autonomy of sites, allowing for a decentralised setup where systems can largely work independently. Furthermore, a novel administration schema in Armada, based on lineage trails, allows for flexible adaptation to the (query) workload in highly dynamic environments. The lineage trails capture the metadata and its history. They form the basis to direct updates to the proper sites, to break queries into multi-stage plans, and to provide a reference point for site consistency. The lineage trails are managed in a purely distributed fashion; each Armada site is responsible for their persistence and long-term availability. They provide a minimal, but sufficient, basis to handle all distributed query processing tasks. The analysis of the Armada reference architecture depicts a path for innovative research at many levels of a DBMS. Challenging many conventional database assumptions and theories, it will eventually allow large databases to continue to grow and stay flexible.
- Conference paper: Autonomes Index Tuning – DBMS-integrierte Verwaltung von Soft Indexen [Autonomous Index Tuning – DBMS-Integrated Management of Soft Indexes] (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007). Lühring, Martin; Sattler, Kai-Uwe; Schallehn, Eike; Schmidt, Karsten.
  Self-management in DBMSs, and self-tuning as an important part of it, is receiving increasing attention due to the growing complexity of systems and applications and the resulting rise in operating costs. The state of the art in index tuning consists of administration tools that derive a recommended index configuration from a given workload. This configuration, however, must be repeatedly adapted whenever the system, its environment, or the access patterns change. In this paper we present a dynamic approach to index tuning that monitors system behavior and usage at runtime, makes decisions about possible improvements to the current index configuration, and realizes them integrated with query processing. For this latter aspect, we introduce IndexBuildScan and SwitchPlan, new operators for creating indexes during the processing of a query and using them immediately. A description of the implementation of the presented concepts as an extension of the PostgreSQL system, together with an evaluation based on a benchmark derived from TPC-H, concludes the paper.
- Conference paper: Change Management in Large Information Infrastructures – Representing and Analyzing Arbitrary Metadata (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007). Stumm, Boris; Dessloch, Stefan.
  With information infrastructures getting more and more complex, it becomes necessary to provide automated support for managing the evolution of the infrastructure. If changes are detected in a single system, the potential impact on other systems has to be calculated and appropriate countermeasures have to be initiated to prevent failures and data corruption that span several systems. This is the goal of Caro, our approach to change impact analysis in large information infrastructures. In this paper we present how we model the metadata of information systems to make a global analysis possible, regardless of the data models used, and how the analysis process works.
- Conference paper: A Classification of Schema Mappings and Analysis of Mapping Tools (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007). Legler, Frank; Naumann, Felix.
  Schema mapping techniques for data exchange have become popular and useful tools both in research and industry. A schema mapping relates a source schema with a target schema via correspondences, which are specified by a domain expert, possibly supported by automated schema matching algorithms. The set of correspondences, i.e., the mapping, is interpreted as a data transformation usually expressed as a query. These queries transform data from the source schema to conform to the target schema. They can be used to materialize data at the target or used as views in a virtually integrated system. We present a classification of mapping situations that can occur when mapping between two relational or nested (XML) schemata. Our classification takes into consideration 1:1 and n:m correspondences, attribute-level and higher-level mappings, and special constructs, such as choice constraints, cardinality constraints, and data types. Based on this classification, we have developed a general suite of schemata, data, and correspondences to test the ability of tools to cope with the different mapping situations. We evaluated several commercial and research tools that support the definition of schema mappings and interpret this mapping as a data transformation. We found that no tool performs well in all mapping situations and that many tools produce incorrect data transformations. The test suite can serve as a benchmark for future improvements and developments of schema mapping tools.
- Conference paper: Data Provenance: A Categorization of Existing Approaches (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007). Glavic, Boris; Dittrich, Klaus.
  In many application areas like e-science and data warehousing, detailed information about the origin of data is required. This kind of information is often referred to as data provenance or data lineage. The provenance of a data item includes information about the processes and source data items that lead to its creation and current representation. The diversity of data representation models and application domains has led to a number of more or less formal definitions of provenance. Most of them are limited to a special application domain, data representation model, or data processing facility. Not surprisingly, the associated implementations are also restricted to some application domain and depend on a special data model. In this paper we give a survey of data provenance models and prototypes, present a general categorization scheme for provenance models, and use this categorization scheme to study the properties of the existing approaches. This categorization enables us to distinguish between different kinds of provenance information and could lead to a better understanding of provenance in general. Besides the categorization of provenance types, it is important to include the storage, transformation, and query requirements for the different kinds of provenance information and application domains in our considerations. The analysis of existing approaches will assist us in revealing open research problems in the area of data provenance.
- Conference paper: Data Staging for OLAP- and OLTP-Applications on RFID Data (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007). Krompaß, Stefan; Aulbach, Stefan; Kemper, Alfons.
  The emerging trend towards seamless monitoring of all business processes via comprehensive sensor networks – in particular RFID readers – creates new data management challenges. In addition to handling the huge volume of data generated by these sensor networks, the information systems must support the efficient querying and analysis of both recent data and historic sensor readings. In this paper, we devise and evaluate a data staging architecture that consists of a distributed caching layer and a data warehouse. The caches maintain the most recent RFID events (i.e., sensor readings) to facilitate very efficient OLTP processing. Aged RFID events are removed from the caches and propagated into the data warehouse, where related events are aggregated to reduce storage space consumption. The data warehouse can be utilized for business intelligence applications that, e.g., analyze the supply chain quality.
- Edited book: Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS) (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007).