P214 - BTW2013 - Datenbanksysteme für Business, Technologie und Web
Listing by date of publication - showing entries 1 to 10 of 42.
- Conference paper: PythiaSearch - Interaktives, Multimodales Multimedia-Retrieval (Datenbanksysteme für Business, Technologie und Web (BTW), 2013)
  Zellhöfer, David; Böttcher, Thomas; Bertram, Maria; Schmidt, Christoph; Tillmann, Claudius; Uhlig, Markus; Zierenberg, Marcel; Schmitt, Ingo
  PythiaSearch is an interactive multimedia retrieval system. It combines different search strategies with diverse visualizations and allows retrieval results to be personalized via preference-based relevance feedback. The system uses the probabilistic query language CQQL and supports multimodal query definitions based on images, text, or metadata.
- Conference paper: ProQua: Ein Probabilistisches Datenbanksystem für die Auswertung von Ähnlichkeitsanfragen auf unsicheren Datengrundlagen (Datenbanksysteme für Business, Technologie und Web (BTW), 2013)
  Lehrack, Sebastian; Saretz, Sascha; Winkel, Christian
  ProQua is a novel probabilistic database system whose goal is the evaluation of weighted, logic-based similarity conditions over an uncertain data basis. The key features of ProQua are presented using an example scenario from the field of archaeology.
- Conference paper: FlexY: Flexible, datengetriebene Prozessmodelle mit YAWL (Datenbanksysteme für Business, Technologie und Web (BTW), 2013)
  Schick, Sebastian; Meyer, Holger; Heuer, Andreas
  With FlexY we present a workflow management system (WFMS) for the flexible, data-driven realization of processes with YAWL. The system enables data-dependent steering and monitoring of workflows in information systems at runtime. A flexible process model allows control-flow and data-flow dependencies to be modeled jointly. Change operations that describe the state of documents and their structure can thus be queried at specific points in the course of a process and tied directly to changes of the process instance.
- Conference paper: Making social media analysis more efficient through taxonomy supported concept suggestion (Datenbanksysteme für Business, Technologie und Web (BTW), 2013)
  Cardoso Coutinho, Fabio; Lang, Alexander; Mitschang, Bernhard
  Social media sites give consumers the ability to publicly create and shape opinion about products, services, and brands. Hence, timely understanding of content created in social media has become a priority for marketing departments, leading to the appearance of social media analysis applications. This article describes an approach that helps users of IBM Cognos Consumer Insight, IBM's social media analysis offering, define and refine the analysis space more efficiently. It defines a Concept Suggestion Component (CSC) that suggests relevant as well as off-topic concepts within social media and ties these concepts to taxonomies typically found in marketing around brands, products, and campaigns. The CSC employs data mining techniques such as term extraction and clustering, and combines them with a sampling approach to ensure rapid and high-quality feedback. Initial evaluations presented in this article show that these goals can be accomplished for real-life data sets, simplifying the definition of the analysis space for a more comprehensive and focused analysis.
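The sample-then-rank idea behind the concept suggestion described in the abstract above can be sketched roughly as follows; the function name and the plain term-frequency ranking are illustrative assumptions, not the actual CSC or IBM Cognos Consumer Insight implementation:

```python
from collections import Counter
import random
import re

def suggest_concepts(documents, sample_size=100, top_k=5, seed=0):
    """Toy concept suggestion: sample posts, extract terms, and rank
    frequent terms as candidate concepts. This stands in for the CSC's
    term extraction + clustering pipeline, which is far more elaborate."""
    random.seed(seed)
    # Sampling keeps feedback fast even when the corpus is large.
    sample = random.sample(documents, min(sample_size, len(documents)))
    terms = Counter()
    for doc in sample:
        terms.update(re.findall(r"[a-z]+", doc.lower()))
    return [term for term, _ in terms.most_common(top_k)]
```

In the real component, clustering would group related terms and a taxonomy would decide which suggestions are on- or off-topic; here frequency alone ranks the candidates.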
- Conference paper: Demonstrating near real-time analytics with IBM DB2 Analytics Accelerator (Datenbanksysteme für Business, Technologie und Web (BTW), 2013)
  Martin, Daniel; Ivanova, Iliyana; Mueller, Raphael; Velez Montoya, Luis Eduardo; Maruschka, Klaus
  Version 3 of the IBM DB2 Analytics Accelerator (IDAA) takes a major step towards the vision of a universal relational DBMS that transparently processes both OLTP and analytical queries in a single system. Based on heuristics in DB2 for z/OS, the DB2 optimizer decides whether a query should be executed by "mainline" DB2 or whether it is beneficial to forward it to the attached accelerator, which operates on copies of the DB2 tables. The new "incremental update" functionality keeps these copy tables in sync by employing replication technology that monitors the DB2 transaction log and asynchronously applies the changes in micro-batches to IDAA. This enables near real-time analytics over online data, effectively marrying the traditionally separate OLTP and data warehouse environments. With IDAA, reports can access data that is constantly refreshed, in contrast to traditional warehouses that are updated on a daily or even weekly basis. Without any changes to the applications and without the need to introduce cross-system ETL flows, an existing OLTP environment can be used for reporting purposes as well. In this demo, we present a near real-time reporting application modeled on an industry benchmark (TPC-DS), but with a constantly changing set of tables with over 800 million rows running on DB2 for z/OS. In a browser-based user interface, demo attendees can influence the rate of changes to the tables and observe how the reporting queries capture new data as it is modified by a separately running OLTP workload generator.
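The "incremental update" mechanism described above - log-derived changes applied asynchronously in micro-batches to accelerator-side copy tables - can be illustrated with a toy sketch. The record format and the `apply_micro_batch` helper are hypothetical stand-ins, not DB2 or IDAA interfaces:

```python
def apply_micro_batch(copy_table, log_records):
    """Apply one micro-batch of (operation, key, row) change records -
    conceptually captured from a transaction log - to a copy table,
    modeled here as a plain dict keyed by primary key."""
    for op, key, row in log_records:
        if op in ("INSERT", "UPDATE"):
            copy_table[key] = row
        elif op == "DELETE":
            copy_table.pop(key, None)
    return copy_table
```

Batching many log records into one apply step is what keeps the copy near real-time without paying per-change synchronization cost.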
- Conference paper: MR-DSJ: Distance-based self-join for large-scale vector data analysis with MapReduce (Datenbanksysteme für Business, Technologie und Web (BTW), 2013)
  Seidl, Thomas; Fries, Sergej; Boden, Brigitte
  Data analytics is faced with huge and tremendously increasing amounts of data, for which MapReduce provides a very convenient and effective distributed programming model. Various algorithms already support massive data analysis on computer clusters, but distance-based similarity self-joins in particular lack efficient solutions for large vector data sets, even though they are fundamental to many data mining tasks, including clustering, near-duplicate detection, and outlier analysis. Our novel distance-based self-join algorithm for MapReduce, MR-DSJ, is based on grid partitioning and delivers correct, complete, and inherently duplicate-free results in a single iteration. Additionally, we propose several filter techniques that reduce the runtime and communication of the MR-DSJ algorithm. Analytical and experimental evaluations demonstrate its superiority over other join algorithms for MapReduce.
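A minimal single-machine sketch of the grid-partitioning idea behind a duplicate-free distance self-join; the paper's actual MapReduce realization and filter techniques are not reproduced, and the function name is illustrative:

```python
from collections import defaultdict
from itertools import combinations, product
import math

def grid_self_join(points, eps):
    """Report each pair of points within Euclidean distance eps exactly once,
    using grid cells of side length eps (so matching points lie in the same
    or an adjacent cell)."""
    # "Map" phase: assign each point to its grid cell.
    cells = defaultdict(list)
    for p in points:
        cells[tuple(math.floor(c / eps) for c in p)].append(p)
    result = set()
    # "Reduce" phase: compare within a cell, then against neighbor cells.
    for cell, pts in cells.items():
        for a, b in combinations(pts, 2):
            if math.dist(a, b) <= eps:
                result.add(tuple(sorted((a, b))))
        # Only compare against lexicographically larger neighbor cells, so
        # each cell pair is handled exactly once -> inherently duplicate-free.
        for offset in product((-1, 0, 1), repeat=len(cell)):
            neighbor = tuple(c + o for c, o in zip(cell, offset))
            if neighbor <= cell or neighbor not in cells:
                continue
            for a in pts:
                for b in cells[neighbor]:
                    if math.dist(a, b) <= eps:
                        result.add(tuple(sorted((a, b))))
    return result
```

In the distributed setting, each cell (or cell pair) would be shipped to a reducer; the cell-ordering trick above is one way to obtain duplicate freedom without a separate deduplication pass.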
- Conference paper: Composition methods for link discovery (Datenbanksysteme für Business, Technologie und Web (BTW), 2013)
  Hartung, Michael; Groß, Anika; Rahm, Erhard
  The Linked Open Data community publishes an increasing number of data sources on the so-called Data Web and interlinks them to support data integration applications. We investigate how the composition of existing links and mappings can help discover new links and mappings between LOD sources. Often there are many alternatives for composition, so the problem arises which paths can provide the best linking results with the least computation effort. We therefore investigate different methods to select and combine the most suitable mapping paths. We also propose an approach for selecting and composing individual links instead of entire mappings. We comparatively evaluate the methods on several real-world linking problems from the LOD cloud. The results show the high value of reusing and composing existing links as well as the high effectiveness of our methods.
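The core composition step - joining links from source A to B with links from B to C to derive candidate links from A to C - can be sketched as follows. The function name is illustrative, and the paper's path selection and quality filtering are omitted:

```python
from collections import defaultdict

def compose(mapping_ab, mapping_bc):
    """Compose two mappings given as sets of (source, target) links:
    a link a->b joined with a link b->c yields a candidate link a->c."""
    targets = defaultdict(set)
    for b, c in mapping_bc:
        targets[b].add(c)
    return {(a, c) for a, b in mapping_ab for c in targets[b]}
```

With several intermediate sources available, each choice of B defines a different composition path, which is exactly where the path-selection problem studied in the paper arises.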
- Conference paper: Fully parallel inference in Markov logic networks (Datenbanksysteme für Business, Technologie und Web (BTW), 2013)
  Beedkar, Kaustubh; Corro, Luciano Del; Gemulla, Rainer
  Markov logic is a powerful tool for handling the uncertainty that arises in real-world structured data; it has been applied successfully to a number of data management problems. In practice, the resulting ground Markov logic networks can get very large, which poses challenges to scalable inference. In this paper, we present the first fully parallelized approach to inference in Markov logic networks. Inference decomposes into a grounding step and a probabilistic inference step, both of which can be cost-intensive. We propose a parallel grounding algorithm that partitions the Markov logic network based on its corresponding join graph; each partition is grounded independently and in parallel. Our partitioning scheme is based on importance sampling, which we use for parallel probabilistic inference, and is also well suited to other, more efficient parallel inference techniques. Preliminary experiments suggest that significant speedup can be gained by parallelizing both grounding and probabilistic inference.
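The grounding step and its parallelization across partitions can be illustrated with a heavily simplified sketch: clauses are reduced to (predicate, arity) pairs, and the join-graph partitioning and importance sampling from the paper are not modeled:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def ground_partition(clauses, constants):
    """Ground every clause in one partition by substituting all
    combinations of domain constants for its variables."""
    ground = []
    for predicate, arity in clauses:
        ground += [(predicate, args) for args in product(constants, repeat=arity)]
    return ground

def parallel_grounding(partitions, constants):
    """Ground the partitions independently and in parallel, then merge
    the resulting ground atoms."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda part: ground_partition(part, constants), partitions)
    return [atom for part in results for atom in part]
```

The point of partitioning is that no coordination is needed between partitions during grounding, so the per-partition work scales out directly.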
- Conference paper: Taking the edge off cardinality estimation errors using incremental execution (Datenbanksysteme für Business, Technologie und Web (BTW), 2013)
  Neumann, Thomas; Galindo-Legaria, Cesar
  Query optimization is an essential ingredient for efficient query processing, as semantically equivalent execution alternatives can have vastly different runtime behavior. The query optimizer is largely driven by cardinality estimates when selecting execution alternatives. Unfortunately, these estimates are often inaccurate, in particular for complex predicates or skewed data. We present an incremental execution framework that makes the query optimizer more resilient to cardinality estimation errors. The framework computes the sensitivity of execution plans to cardinality estimation errors and, if necessary, executes parts of the query to remove uncertainty. This technique avoids optimization decisions based on gross misestimation and makes query optimization (and thus processing) much more robust. We demonstrate the effectiveness of these techniques on large real-world and synthetic data sets.
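The incremental-execution idea - replacing a highly uncertain estimate with an exact count obtained by executing part of the query before committing to a plan - can be shown schematically. The threshold, the cardinality cutoff, and the two plan alternatives are invented for illustration:

```python
def choose_plan(estimate, uncertainty, execute_subquery, threshold=0.5):
    """Pick a join strategy from a cardinality estimate; when the plan is
    too sensitive to estimation error (high uncertainty), execute the
    subquery first and decide from the exact cardinality instead."""
    cardinality = execute_subquery() if uncertainty > threshold else estimate
    return "hash_join" if cardinality > 1000 else "nested_loop_join"
```

Paying the cost of partial execution only when the sensitivity analysis demands it is what keeps the approach cheaper than always materializing intermediate results.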
- Conference paper: Die „schlaue Stadt“ - Erzeugung virtueller Sensordaten für Smart City Anwendungen (Datenbanksysteme für Business, Technologie und Web (BTW), 2013)
  Behrendt, Marcus; Böhm, Mischa; Caylak, Mustafa; Eylert, Lena; Friedrichs, Robert; Höting, Dennis; Knefel, Kamil; Lottmann, Timo; Rehfeldt, Andreas; Runge, Jens; Schnabel, Sabrina-Cynthia; Janssen, Stephan; Nicklas, Daniela; Wurst, Michael
  In the smart city application of the ALISE project group, traffic, weather, and energy data are synthetically generated and processed in order to represent sensor data and the effects of events. The overall system consists of a distributed simulation, a data stream management system, a business intelligence solution, and a map service. An operation center and a control center form the interaction points of the system.