P103 - BTW2007 - Datenbanksysteme in Business, Technologie und Web

1 - 10 of 38
  • Conference Paper
    Effective and Efficient Indexing for Large Video Databases
    (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007) Böhm, Christian; Kunath, Peter; Pryakhin, Alexey; Schubert, Matthias
    Content-based multimedia retrieval is an important topic in database systems. An emerging and challenging topic in this area is content-based search in video data. A video clip can be considered as a sequence of images or frames. Since this representation is too complex to facilitate efficient video retrieval, a video clip is often summarized by a more concise feature representation. In this paper, we transform a video clip into a set of probabilistic feature vectors (pfvs). In our case, a pfv corresponds to a Gaussian in the feature space of frames. We demonstrate that this representation is well suited for accurate video retrieval. The use of pfvs allows us to calculate confidence values for frames or sets of frames for being contained within a given video in the database. These confidence values can be employed to specify two types of queries. The first type of query retrieves the videos stored in the database which contain a given set of frames with a probability larger than a given threshold value. Furthermore, we introduce a probabilistic ranking query retrieving the k database videos which contain the given query set with the highest probabilities. To efficiently process these queries, we introduce query algorithms on set-valued objects. Our solution is based on the Gauss-tree, an index structure for efficiently managing Gaussians in arbitrary vector spaces. Our experimental evaluation demonstrates that sets of probabilistic feature vectors yield a compact and descriptive representation of video clips. Additionally, we show that our new query algorithms outperform competing approaches when answering the given types of queries on a database of over 900 real-world video clips.
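To make the pfv idea concrete, here is a minimal Python sketch of the two query types under simplifying assumptions: diagonal Gaussians, a heuristic per-frame score in place of the paper's probabilistic semantics, and a linear scan instead of the Gauss-tree index. All names (containment_score, threshold_query, ranking_query) are illustrative, not the paper's API.

```python
import numpy as np

def gaussian_density(x, mean, var):
    """Density of a diagonal-covariance Gaussian at frame feature vector x."""
    norm = np.prod(1.0 / np.sqrt(2.0 * np.pi * var))
    return norm * np.exp(-0.5 * np.sum((x - mean) ** 2 / var))

def containment_score(query_frames, pfvs):
    """Heuristic confidence that the query frames occur in a video summarized
    by a set of pfvs (mean/variance pairs): each frame is scored by its
    best-matching Gaussian, and the per-frame scores are multiplied."""
    score = 1.0
    for x in query_frames:
        score *= max(gaussian_density(x, m, v) for m, v in pfvs)
    return score

def threshold_query(videos, query_frames, tau):
    """Videos whose containment score exceeds the threshold tau."""
    return [vid for vid, pfvs in videos.items()
            if containment_score(query_frames, pfvs) > tau]

def ranking_query(videos, query_frames, k):
    """The k videos with the highest containment scores."""
    ranked = sorted(videos,
                    key=lambda vid: containment_score(query_frames, videos[vid]),
                    reverse=True)
    return ranked[:k]

videos = {
    "v1": [(np.zeros(3), np.ones(3))],        # one pfv per video in this toy setup
    "v2": [(np.full(3, 5.0), np.ones(3))],
}
frames = [np.array([0.1, -0.2, 0.0])]
print(ranking_query(videos, frames, k=1))     # ['v1']
```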
  • Conference Paper
    Algorithms for merged indexes
    (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007) Graefe, Goetz
  • Conference Paper
    Ranking von Produktempfehlungen mit präferenz-annotiertem SQL
    (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007) Beck, Matthias; Radde, Sven; Freitag, Burkhard
    Web-based, database-backed advisory systems are currently seeing wide adoption because they can be maintained centrally while product ranges change constantly. The key to an optimal product recommendation is taking the customer's preferences into account, and these should be specified in a way that is as simple and comprehensible as possible. We therefore present an approach that lets the selection conditions the user has to state anyway be additionally annotated with weights, thereby influencing the ordering of the recommendations. This is enabled by an extended SQL syntax through which a theoretically well-founded ranking on the result set is defined.
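The paper's contribution is an extended SQL syntax; the Python sketch below only illustrates the underlying ranking semantics, under the assumption that each selection condition carries a user-stated weight and products are ordered by the summed weights of the conditions they satisfy. All names and data are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A weighted selection condition: a predicate over a product row plus a weight.
Condition = Tuple[Callable[[Dict], bool], float]

def ranked_recommendations(products: List[Dict],
                           conditions: List[Condition]) -> List[Dict]:
    """Order products by the summed weights of the conditions they satisfy,
    instead of discarding every product that violates one of them."""
    def score(row: Dict) -> float:
        return sum(w for pred, w in conditions if pred(row))
    return sorted(products, key=score, reverse=True)

# Usage: prefer cheap cameras, weight resolution higher than zoom.
products = [
    {"name": "A", "price": 299, "megapixels": 12, "zoom": 10},
    {"name": "B", "price": 449, "megapixels": 20, "zoom": 3},
    {"name": "C", "price": 199, "megapixels": 8,  "zoom": 12},
]
conditions = [
    (lambda r: r["price"] < 400,      1.0),
    (lambda r: r["megapixels"] >= 12, 2.0),
    (lambda r: r["zoom"] >= 5,        0.5),
]
print([p["name"] for p in ranked_recommendations(products, conditions)])  # ['A', 'B', 'C']
```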
  • Conference Paper
    Hierarchy-driven Visual Exploration of Multidimensional Data Cubes
    (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007) Mansmann, Svetlana; Mansmann, Florian; Scholl, Marc H.; Keim, Daniel A.
    Analysts interact with OLAP data in a predominantly “drill-down” fashion, i.e. gradually descending from a coarsely grained overview towards the desired level of detail. Analysis tools enable visual exploration as a sequence of navigation steps in the data cubes and their dimensional hierarchies. However, most state-of-the-art solutions are limited either in their capacity to handle complex multidimensional data or in the ability of their visual metaphors to provide an overview+detail context. This work proposes an explorative framework for OLAP data based on a simple but powerful approach to analyzing data cubes of virtually arbitrary complexity. The data is queried using an intuitive navigation in which each dimension is represented by its hierarchy schema. Any granularity level can be dragged into the visualization to serve as a disaggregation axis. The results of the iterative exploration are mapped to a specified visualization technique. We favor hierarchical layouts for their natural ability to show the step-wise decomposition of aggregate values. The power of the tool to support various application scenarios is demonstrated by presenting use cases from different domains and the visualization techniques suitable for solving specific analysis tasks.
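A minimal sketch of the navigation primitive described above, assuming a flat fact table and hierarchies given as ordered level lists; dragging a granularity level into the visualization corresponds to one call of the hypothetical disaggregate function below.

```python
from collections import defaultdict

# Toy fact table: (country, city, year, quarter, sales)
facts = [
    ("DE", "Berlin", 2006, "Q1", 120.0),
    ("DE", "Munich", 2006, "Q2",  80.0),
    ("US", "Boston", 2006, "Q1", 200.0),
    ("US", "Boston", 2006, "Q2", 150.0),
]

# Hierarchy schemas: each dimension lists its levels, coarse to fine,
# together with the index of the corresponding fact-table column.
HIERARCHIES = {
    "geography": [("country", 0), ("city", 1)],
    "time":      [("year", 2), ("quarter", 3)],
}

def disaggregate(facts, dimension, level):
    """Roll up the measure grouped by one granularity level, i.e. the level
    the analyst drags into the visualization as disaggregation axis."""
    col = dict(HIERARCHIES[dimension])[level]
    totals = defaultdict(float)
    for row in facts:
        totals[row[col]] += row[-1]
    return dict(totals)

# Drill down: overview by country, then descend to city granularity.
print(disaggregate(facts, "geography", "country"))  # {'DE': 200.0, 'US': 350.0}
print(disaggregate(facts, "geography", "city"))
```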
  • Conference Paper
    Getting Prime Cuts from Skylines over Partially Ordered Domains
    (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007) Balke, Wolf-Tilo; Güntzer, Ulrich; Siberski, Wolf
    Skyline queries have recently received a lot of attention due to their intuitive query formulation: users can state preferences with respect to several attributes. Unlike numerical preferences, preferences over discrete value domains do not show an inherent total order, but have to rely on partial orders as stated by the user. In such orders typically many object values are incomparable, increasing the size of skyline sets significantly, and making their computation expensive. In this paper we explore how to enable interactive tasks like query refinement or relevance feedback by providing 'prime cuts'. Prime cuts are interesting subsets of the full Pareto skyline, which give users a good overview over the skyline. They have to be small, efficient to compute, suitable for higher numbers of query predicates, and representative. The key to improved performance and reduced result set sizes is the relaxation of Pareto semantics to the concept of weak Pareto dominance. We argue that this relaxation yields intuitive results and show how it opens up the use of efficient and scalable query processing algorithms. Assessing the practical impact, our experiments show that our approach leads to lean result set sizes and outperforms Pareto skyline computations by up to two orders of magnitude.
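The relaxation can be made concrete in a few lines. In the sketch below (hypothetical names; partial orders given as transitively closed better-than pairs), an object is weakly dominated as soon as some other object beats it on one attribute and is beaten on none; incomparability no longer protects it, which is what shrinks the result below the full Pareto skyline.

```python
# Each attribute carries a user-stated partial order, given here as its
# transitively closed set of (better, worse) pairs.
BETTER = {
    "color": {("red", "blue"), ("red", "green")},   # red preferred; blue, green incomparable
    "brand": {("A", "B"), ("A", "C"), ("B", "C")},  # total order A > B > C
}

def better(attr, x, y):
    """True iff value x is strictly preferred to y on the attribute."""
    return (x, y) in BETTER[attr]

def weakly_dominates(p, q):
    """p weakly dominates q iff p is strictly better on some attribute and
    q is strictly better on none; incomparability counts against q."""
    return (any(better(a, p[a], q[a]) for a in p)
            and not any(better(a, q[a], p[a]) for a in p))

def prime_cut(objects):
    """Objects that no other object weakly dominates; a subset of the
    full Pareto skyline and therefore smaller to present."""
    return [o for o in objects
            if not any(weakly_dominates(p, o) for p in objects if p is not o)]

objects = [
    {"color": "blue",  "brand": "A"},
    {"color": "green", "brand": "B"},  # Pareto-incomparable to the first object,
                                       # yet weakly dominated by it: worse brand,
                                       # and strictly better on no attribute
    {"color": "red",   "brand": "C"},
]
print(prime_cut(objects))  # keeps blue/A and red/C, prunes green/B
```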
  • Conference Paper
    Efficient Time-Travel on Versioned Text Collections
    (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007) Berberich, Klaus; Bedathur, Srikanta; Weikum, Gerhard
    The availability of versioned text collections such as the Internet Archive opens up opportunities for time-aware exploration of their contents. In this paper, we propose time-travel retrieval and ranking that extends traditional keyword queries with a temporal context in which the query should be evaluated. More precisely, the query is evaluated over all states of the collection that existed during the temporal context. In order to support these queries, we make key contributions in (i) defining extensions to well-known relevance models that take into account the temporal context of the query and the version history of documents, (ii) designing an immortal index over the full versioned text collection that avoids a blowup in index size, and (iii) making the popular NRA algorithm for top-k query processing aware of the temporal context. We present preliminary experimental analysis over the English Wikipedia revision history showing that the proposed techniques are both effective and efficient.
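A minimal sketch of the temporal-context filtering, assuming each posting stores the validity interval of the document version it stems from; the index coalescing and the temporal NRA extension from the paper are omitted, and all names are illustrative.

```python
from collections import namedtuple

# One posting per (term, document version): a score plus the version's
# validity interval [begin, end) on the collection timeline.
Posting = namedtuple("Posting", "doc score begin end")

index = {
    "btw": [
        Posting("d1", 0.8, begin=10, end=50),  # version valid from t=10 to t=50
        Posting("d1", 0.5, begin=50, end=90),
        Posting("d2", 0.7, begin=20, end=60),
    ],
}

def time_travel(index, term, t_begin, t_end):
    """Evaluate a keyword term only over versions whose validity interval
    overlaps the temporal context [t_begin, t_end)."""
    return [p for p in index.get(term, [])
            if p.begin < t_end and t_begin < p.end]

# Query 'btw' in the temporal context [55, 70): only versions alive then qualify.
for p in time_travel(index, "btw", 55, 70):
    print(p.doc, p.score)
```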
  • Conference Paper
    Ein Nachrichtentransformationsmodell für komplexe Transformationsprozesse in datenzentrischen Anwendungsszenarien
    (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007) Böhm, Matthias; Wloka, Uwe; Habich, Dirk
    The horizontal integration of systems through message-based communication via middleware products is an increasingly widespread form of application integration, used to ensure a sufficiently loose coupling of the participating systems and applications. To describe such integration processes, function-oriented process description languages such as WSBPEL are increasingly employed; these, however, reveal deficits when it comes to describing data-centric application scenarios. This paper contributes to the systematic modeling of complex message transformations in data-centric processes. The feasibility of the results is demonstrated on the integration platform TransConnect® by SQL GmbH.
  • Conference Paper
    Integrating Query-Feedback Based Statistics into Informix Dynamic Server
    (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007) Behm, Alexander; Markl, Volker; Haas, Peter; Murthy, Keshava
    Statistics that accurately describe the distribution of data values in the columns of relational tables are essential for effective query optimization in a database management system. Manually maintaining such statistics in the face of changing data is difficult and can lead to suboptimal query performance and high administration costs. In this paper, we describe a method and prototype implementation for automatically maintaining high quality single-column statistics, as used by the optimizer in IBM Informix Dynamic Server (IDS). Our method both refines and extends the ISOMER algorithm of Srivastava et al. for maintaining a multidimensional histogram based on query feedback (QF). Like ISOMER, our new method is based on the maximum entropy (ME) principle, and therefore incorporates information about the data distribution in a principled and consistent manner. However, because IDS only needs to maintain one-dimensional histograms, we can simplify the ISOMER algorithm in several ways, significantly speeding up performance. First, we replace the expensive STHoles data structure used by ISOMER with a simple binning scheme, using a sweep-line algorithm to determine bin boundaries. Next, we use an efficient method for incorporating new QF into the histogram; the idea is to aggregate, prior to the ME computation, those bins that do not overlap with the new feedback records. Finally, we introduce a fast pruning method to ensure that the number of bins in the frequency distribution stays below a specified upper bound. Besides refining ISOMER to deal efficiently with one-dimensional histograms, we extend previous work by combining the reactive QF approach with a proactive sampling approach. Sampling is triggered whenever (as determined from QF records) actual and estimated selectivities diverge to an unacceptably large degree. Our combined proactive/reactive approach greatly improves the robustness of the estimation mechanism, ensuring very high quality selectivity estimates for queries falling inside the range of available feedback while guaranteeing reasonably good estimates for queries outside of the range. By automatically updating statistics, query execution is improved due to better selectivity estimates, and the total cost of ownership (TCO) is reduced since the database administrator need not update statistics manually for monitored columns.
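The following Python sketch illustrates two of the ingredients under strong simplifications: feedback-range endpoints become bin boundaries (the sweep-line flavour), and the covered bins are rescaled proportionally to match the observed count instead of solving the maximum-entropy program. The STHoles replacement details, bin pruning, and the proactive sampling trigger are omitted, and the class name and structure are invented for illustration.

```python
import bisect

class FeedbackHistogram:
    """Toy feedback-driven 1-D histogram: boundaries of observed query ranges
    become bin boundaries, counts are spread uniformly within bins, and
    feedback rescales the covered bins to match the observed cardinality."""

    def __init__(self, lo, hi, total_rows):
        self.bounds = [lo, hi]             # sorted bin boundaries
        self.counts = [float(total_rows)]  # rows per bin

    def _split(self, x):
        """Insert boundary x, splitting its bin proportionally."""
        if x <= self.bounds[0] or x >= self.bounds[-1] or x in self.bounds:
            return
        i = bisect.bisect_right(self.bounds, x) - 1
        b0, b1 = self.bounds[i], self.bounds[i + 1]
        frac = (x - b0) / (b1 - b0)
        self.bounds.insert(i + 1, x)
        c = self.counts.pop(i)
        self.counts[i:i] = [c * frac, c * (1 - frac)]

    def estimate(self, lo, hi):
        """Estimated row count for range [lo, hi), uniform within bins."""
        total = 0.0
        for i, c in enumerate(self.counts):
            b0, b1 = self.bounds[i], self.bounds[i + 1]
            overlap = max(0.0, min(hi, b1) - max(lo, b0))
            total += c * overlap / (b1 - b0)
        return total

    def feedback(self, lo, hi, actual):
        """Absorb a query-feedback record: align bins to the query range,
        then scale the covered bins so they sum to the observed count."""
        self._split(lo)
        self._split(hi)
        est = self.estimate(lo, hi)
        if est > 0:
            factor = actual / est
            for i in range(len(self.counts)):
                if lo <= self.bounds[i] and self.bounds[i + 1] <= hi:
                    self.counts[i] *= factor

h = FeedbackHistogram(0, 100, total_rows=1000)
print(round(h.estimate(10, 20)))   # 100 under the uniform prior
h.feedback(10, 20, actual=300)     # query feedback corrects the distribution
print(round(h.estimate(10, 20)))   # now 300
```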
  • Edited Book
    Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS)
    (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007)
  • Conference Paper
    Überlegungen zur Entwicklung komplexer Grid-Anwendungen mit Globus Toolkit
    (Datenbanksysteme in Business, Technologie und Web (BTW 2007) – 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 2007) Walter, Andreas; Böhm, Klemens; Schosser, Stephan
    Distributed applications with high resource demands can nowadays be built with grid techniques without resorting to large data centers. Instead, one uses free resources on computers distributed across the Internet, with grid middleware as the foundation. The main interest of this work is an analysis of the most widely used middleware for grid applications, the Globus Toolkit. We are particularly concerned with building complex applications with high resource demands. As a suitable application for our study we chose web crawling, because of its very high data and control overhead across many distributed nodes. We show which requirements this places on the Globus Toolkit and how well the platform is suited to such applications.