P241 - BTW2015 - Datenbanksysteme für Business, Technologie und Web
Listing by title, 1 - 10 of 53 entries
- Konferenzbeitrag: Admission Control für kontinuierliche Anfragen (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015) Michelsen, Timo; Grawunder, Marco; Ludmann, Cornelius; Appelrath, H.-Jürgen. Overload means that more demands are placed on a system than it can meet. In the worst case, the system becomes unresponsive or crashes. Database management systems (DBMS) have a dedicated component, called admission control (AC), which monitors the system load, checks incoming queries before they are executed, and defers them if necessary. For continuous queries, which are executed permanently, this kind of AC is not sufficient: to resolve conflicts, one can no longer wait for a query to terminate. Moreover, processing is data-driven, i.e., the volume and content of the data can vary considerably. This paper presents a concept for implementing a flexible and adaptable AC component for continuous queries based on an event-condition-action (ECA) model. For rule definition, the simple language CADL is introduced, which allows continuous AC to be flexibly adapted to concrete systems and hardware. The evaluation, using Odysseus, a framework for data stream management systems, shows that the concept works efficiently and can be used effectively to control load and to take measures for load reduction.
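The CADL rule language itself is not shown in the abstract; as a rough, hypothetical sketch of the underlying event-condition-action model (all class and rule names are invented here, not taken from the paper), an overload rule might be evaluated like this:

```java
// Minimal sketch of the event-condition-action (ECA) idea behind continuous
// admission control. All names are hypothetical; this is not CADL.
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

record LoadEvent(double cpuUsage, int queuedElements) {}

record EcaRule(String name,
               Predicate<LoadEvent> condition,
               Consumer<LoadEvent> action) {}

public class AdmissionControlSketch {
    public static void main(String[] args) {
        // Example rule: if CPU usage or the input queue grows too large,
        // shed load (the action is just a print here).
        EcaRule shedLoad = new EcaRule(
                "shed-load-on-overload",
                e -> e.cpuUsage() > 0.9 || e.queuedElements() > 10_000,
                e -> System.out.println("Pausing lowest-priority continuous query"));

        List<EcaRule> rules = List.of(shedLoad);

        // Each monitoring event is matched against all rules; matching rules fire.
        LoadEvent event = new LoadEvent(0.95, 12_000);
        for (EcaRule rule : rules) {
            if (rule.condition().test(event)) {
                rule.action().accept(event);
            }
        }
    }
}
```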
- Konferenzbeitrag: AutoShard - a Java object mapper (not only) for hot spot data objects in NoSQL data stores (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015) Scherzinger, Stefanie; Thor, Andreas. We demonstrate AutoShard, a ready-to-use object mapper for Java applications running against NoSQL data stores. AutoShard's unique feature is its capability to gracefully shard hot spot data objects that suffer under concurrent writes. By sharding data on the level of the logical schema, scalability bottlenecks due to write contention can be effectively avoided. Using AutoShard, developers can easily employ sharding in their web application by adding minimally intrusive annotations to their code. Our live experiments show the significant impact of sharding on both the write throughput and the execution time.
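AutoShard's actual annotation syntax is not given in the abstract; the sketch below only illustrates the general idea of sharding a hot-spot object at the level of the logical schema, with a hypothetical @Sharded annotation and an in-memory counter standing in for a NoSQL record:

```java
// Illustration of sharding a hot-spot counter to avoid write contention.
// The @Sharded annotation and all other names are hypothetical and are
// NOT AutoShard's actual API.
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

@Retention(RetentionPolicy.RUNTIME)
@interface Sharded {
    int shards() default 8;
}

// A page-view counter written by many concurrent requests: instead of one
// contended record, writes are spread over several shard records and reads
// sum them up.
class PageViewCounter {
    @Sharded(shards = 8)
    private final AtomicLongArray shards = new AtomicLongArray(8);

    void increment() {
        // Pick a random shard so concurrent writers rarely collide.
        int i = ThreadLocalRandom.current().nextInt(shards.length());
        shards.incrementAndGet(i);
    }

    long total() {
        long sum = 0;
        for (int i = 0; i < shards.length(); i++) {
            sum += shards.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        PageViewCounter counter = new PageViewCounter();
        for (int i = 0; i < 1_000; i++) {
            counter.increment();
        }
        System.out.println("total views: " + counter.total()); // 1000
    }
}
```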
- Konferenzbeitrag: AutoStudio: A generic web application for transforming dataflow programs into action (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015) Rane, Nikhil-Kishor; Saleh, Omran. In this paper, a user-friendly, interactive, and easy-to-use web-based application is introduced. “AutoStudio” is a generic application in which users can generate dataflow programs for different dataflow languages, such as PipeFlow and Pig, using drag-and-drop functionality. With this application, users can assemble the operators that make up a particular program without being experts in the target language and without being aware of its operators and constructs, which enables rapid development. Our application also provides additional functionality, such as compiling, saving, executing, and checking the status of the generated dataflow programs, which makes it a “One-Stop-App”.
- Konferenzbeitrag: Big Data-Zentren - Vorstellung und Panel (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015) Markl, Volker; Rahm, Erhard; Lehner, Wolfgang; Beigl, Michael; Seidl, Thomas. Three centers have recently been founded to investigate the various facets of "Big Data": the competence centers BBDC (Berlin Big Data Center, led by TU Berlin) and ScaDS (Competence Center for Scalable Data Services and Solutions, led by TU Dresden and Uni Leipzig), both funded by the Federal Ministry of Education and Research (BMBF), as well as SDIL (Smart Data Innovation Lab, led by KIT), which was set up as a collaboration between industry and research. These three centers are first introduced in short talks. A subsequent panel discussion works out commonalities, specific requirements, and opportunities for cooperation.
- Konferenzbeitrag: Big data: hype and reality (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015) Mohan, C. Big Data has become a hot topic in the last few years in both industry and the research community. For the most part, these developments were initially triggered by the requirements of Web 2.0 companies. Both technical and non-technical issues have continued to fuel the rapid pace of developments in the Big Data space. Open source and non-traditional software entities have played key roles in the latter. As always happens with any emerging technology, there is a fair amount of hype that accompanies the work being done in the name of Big Data. The set of clear-cut distinctions that were made initially between Big Data systems and traditional database management systems is being blurred as the needs of the broader set of (real world) users and developers have come into sharper focus in the last couple of years. In this talk, I will survey the developments in Big Data and try to distill reality from the hype! Biography: Dr. C. Mohan has been an IBM researcher for 33 years in the information management area, impacting numerous IBM and non-IBM products, the research and academic communities, and standards, especially with his invention of the ARIES family of database locking and recovery algorithms, and the P
- Konferenzbeitrag: The cache sketch: revisiting expiration-based caching in the age of cloud data management (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015) Gessert, Felix; Schaarschmidt, Michael; Wingerath, Wolfram; Friedrich, Steffen; Ritter, Norbert. The expiration-based caching model of the web is generally considered irreconcilable with the dynamic workloads of cloud database services, where expiration dates are not known in advance. In this paper, we present the Cache Sketch data structure which makes expiration-based caching of database records feasible with rich tunable consistency guarantees. The Cache Sketch enables database services to leverage the large existing caching infrastructure of content delivery networks, browser caches, and web caches to provide low latency and high scalability. The Cache Sketch employs Bloom filters to create compact representations of potentially stale records to transfer the task of cache coherence to clients. Furthermore, it also minimizes the number of invalidations the service has to perform on caches that support them (e.g., CDNs). With different age-control policies the Cache Sketch achieves very high cache hit ratios with arbitrarily low stale read probabilities. We present the Constrained Adaptive TTL Estimator to provide cache expiration dates that optimize the performance of the Cache Sketch and invalidations. To quantify the performance gains and to derive workload-optimal Cache Sketch parameters, we introduce the YCSB Monte-Carlo Caching Simulator (YMCA), a generic framework for simulating the performance and consistency characteristics of any caching and replication topology. We also provide empirical evidence for the efficiency of the Cache Sketch construction and the real-world latency reductions of database workloads under CDN-caching.
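As a hedged illustration of the client-side reading rule such a Bloom-filter-based sketch enables (not the paper's implementation), a filter over potentially stale keys can be consulted before a cached copy is used; the class names and hash scheme below are invented for the sketch:

```java
// Minimal sketch: a Bloom filter over potentially stale keys tells a client
// whether a cached copy may be used or must be revalidated at the server.
// Filter parameters and names are illustrative only.
import java.util.BitSet;

class StaleKeyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    StaleKeyBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void markPotentiallyStale(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    boolean mightBeStale(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false; // definitely not marked: the cached copy is safe to use
            }
        }
        return true; // possibly stale: revalidate with the database service
    }

    private int index(String key, int i) {
        // Derive several hash values from the key; good enough for a sketch.
        return Math.floorMod(key.hashCode() * 31 + i * 0x9E3779B9, size);
    }

    public static void main(String[] args) {
        StaleKeyBloomFilter sketch = new StaleKeyBloomFilter(1 << 16, 3);
        sketch.markPotentiallyStale("user:42");             // recently written record
        System.out.println(sketch.mightBeStale("user:42")); // true  -> revalidate
        System.out.println(sketch.mightBeStale("user:7"));  // false -> use cached copy
    }
}
```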
- Konferenzbeitrag: The case for small data management (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015) Dittrich, Jens. Exabytes of data; several hundred thousand TPC-C transactions per second on a single computing core; scale-up to hundreds of cores and a dozen Terabytes of main memory; scale-out to thousands of nodes with close to Petabyte-sized main memories; and massively parallel query processing are a reality in data management. But, hold on a second: for how many users exactly? How many users do you know that really have to handle these kinds of massive datasets and extreme query workloads? On the other hand: how many users do you know that are fighting to handle relatively small datasets, say in the range of a few thousand to a few million rows per table? How come some of the most popular open source DBMS have hopelessly outdated optimizers producing inefficient query plans? How come people don't care and love it anyway? Could it be that most of the world's data management problems are actually quite small? How can we increase the impact of database research in areas where datasets are small? What are the typical problems? What does this mean for database research? We discuss research challenges, directions, and a concrete technical solution coined PDbF: Portable Database Files. This is an extended version of an abstract and Gong Show talk presented at CIDR 2015 (http://www.cidrdb.org/cidr2015/Papers/11 Abstract17DJ.pdf). Biography: Jens Dittrich is a Full Professor of Computer Science in the area of Databases, Data Management, and Big Data at Saarland University, Germany. Previous affiliations include U Marburg, SAP AG, and ETH Zurich. He is also associated with CISPA (Center for IT-Security, Privacy and Accountability). He received an Outrageous Ideas and Vision Paper Award at CIDR 2011, a BMBF VIP Grant, a best paper award at VLDB 2014, two CS teaching awards in 2011 and 2013, as well as several presentation awards, including a qualification for the interdisciplinary German science slam finals in 2012 and three presentation awards at CIDR (2011, 2013, and 2015). His research focuses on fast access to big data, including in particular: data analytics on large datasets, Hadoop MapReduce, main-memory databases, and database indexing. He has been a PC member and/or area chair of prestigious international database conferences such as PVLDB, SIGMOD, and ICDE. Since 2013 he has been teaching his classes on data management as flipped classrooms. See http://datenbankenlernen.de or
- Konferenzbeitrag: Cleager: Eager Schema Evolution in NoSQL Document Stores (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015) Scherzinger, Stefanie; Klettke, Meike; Störl, Uta. Schema-less NoSQL data stores offer great flexibility in application development, particularly in the early stages of software design. Yet over time, software engineers struggle with the heavy burden of dealing with increasingly heterogeneous data. In this demo we present Cleager, a framework for eagerly managing schema evolution in schema-less NoSQL document stores. Cleager executes declarative schema modification operations as MapReduce jobs on the Google Cloud Platform. We present different scenarios that require data migration, such as adding, removing, or renaming properties of persisted objects, as well as copying and moving them between objects. Our audience can declare the required schema migration operations in the Cleager console, and then verify the results in real time.
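As a rough sketch (not Cleager's code), an eager "rename property" migration boils down to a map step applied to every persisted document; the document structure and all names below are invented:

```java
// Sketch of what an eager "rename property" migration does per document.
// Cleager runs such operations as MapReduce jobs on the Google Cloud Platform;
// this standalone sketch only mimics the per-document map step in memory.
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RenamePropertyMigration {
    // Map step: rename 'oldName' to 'newName' in a single schema-less document.
    static Map<String, Object> renameProperty(Map<String, Object> document,
                                              String oldName, String newName) {
        Map<String, Object> migrated = new HashMap<>(document);
        if (migrated.containsKey(oldName)) {
            migrated.put(newName, migrated.remove(oldName));
        }
        return migrated;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> store = List.of(
                Map.of("user", "alice", "zip", "20095"),
                Map.of("user", "bob", "postcode", "04109"));

        // Apply the migration eagerly to every document in the store.
        store.stream()
             .map(doc -> renameProperty(doc, "zip", "postalCode"))
             .forEach(System.out::println);
    }
}
```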
- Konferenzbeitrag: D2Pt: Privacy-Aware Multiparty Data Publication (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015) Nielsen, Jan Hendrik; Janusz, Daniel; Taeschner, Jochen; Freytag, Johann-Christoph. Today, publication of medical data faces high legal barriers. On the one hand, publishing medical data is important for medical research. On the other hand, it is necessary to protect people's privacy by ensuring that the relationship between individuals and their related medical data remains unknown to third parties. Various data anonymization techniques remove as little identifying information as possible to maintain high data utility while satisfying the strict demands of privacy laws. Current research in this area proposes a multitude of concepts for data anonymization. The concept of k-anonymity allows data publication by hiding identifying information without losing its semantics. Based on k-anonymity, the concept of t-closeness incorporates semantic relationships between personal data values, thereby increasing the strength of the anonymization. However, these concepts are restricted to a centralized data source. In this paper, we extend existing data privacy mechanisms to enable joint data publication among multiple participating institutions. In particular, we adapt the concept of t-closeness for distributed data anonymization. We introduce Distributed two-Party t-closeness (D2Pt), a protocol that utilizes cryptographic algorithms to avoid a central component when anonymizing data adhering to the t-closeness property. That is, without a trusted third party, we achieve data privacy based on the notion of t-closeness.
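As a small, hedged illustration of the t-closeness check itself (not D2Pt's distributed protocol), the distribution of a sensitive attribute inside an equivalence class is compared against its distribution in the whole table; the original definition uses the Earth Mover's Distance, while this sketch uses the simpler total variation distance and invented example data:

```java
// Illustrative t-closeness check: the sensitive-attribute distribution within
// an equivalence class should stay within distance t of the global distribution.
// Total variation distance (half the L1 distance) is used here for simplicity.
import java.util.Map;

public class TClosenessSketch {
    static double totalVariation(Map<String, Double> p, Map<String, Double> q) {
        double sum = 0;
        for (String value : p.keySet()) {
            sum += Math.abs(p.get(value) - q.getOrDefault(value, 0.0));
        }
        for (String value : q.keySet()) {
            if (!p.containsKey(value)) {
                sum += q.get(value);
            }
        }
        return sum / 2.0;
    }

    public static void main(String[] args) {
        // Global distribution of the sensitive attribute "diagnosis" (invented).
        Map<String, Double> global = Map.of("flu", 0.5, "cancer", 0.1, "healthy", 0.4);
        // Distribution inside one anonymized equivalence class (invented).
        Map<String, Double> group = Map.of("flu", 0.3, "cancer", 0.3, "healthy", 0.4);

        double t = 0.2;
        double distance = totalVariation(global, group);
        System.out.printf("distance = %.2f, satisfies t-closeness (t = %.1f): %b%n",
                distance, t, distance <= t);
    }
}
```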
- Konferenzbeitrag: Datenanalyse in den Digital Humanities - Eine Annäherung an Kostümmuster mittels OLAP Cubes (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015) Falkenthal, Michael; Barzen, Johanna; Dörner, Simon; Elkind, Vadym; Fauser, Jan; Leymann, Frank; Strehl, Tino. In film, the costume is one of the most prominent design elements for communicating statements about a character, its personality, mood, and transformation, as well as about place and time. Costume patterns are intended to enable costume designers to draw efficiently on proven design solutions. This demo shows how general design principles can be abstracted from a large number of film costumes, by means of OLAP cubes, for the development of such costume patterns. To identify general design principles, costumes are described via categorical feature taxonomies and evaluated at different levels of aggregation. The abstraction of general solutions for costume patterns is supported by drill-down and roll-up mechanisms.
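As a minimal illustration of roll-up along one categorical dimension (invented example data, not the project's taxonomy or toolchain), the same costume records can be aggregated at a fine and at a coarse level:

```java
// Rough illustration of roll-up over a categorical feature taxonomy: the same
// costume records are counted once per base color (fine level) and once per
// color family (coarse level), mimicking an OLAP cube along one dimension.
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class CostumeRollUpSketch {
    record Costume(String character, String baseColor, String colorFamily) {}

    public static void main(String[] args) {
        List<Costume> costumes = List.of(
                new Costume("hero", "navy", "blue"),
                new Costume("hero", "sky blue", "blue"),
                new Costume("villain", "crimson", "red"),
                new Costume("villain", "scarlet", "red"));

        // Fine-grained level (drill-down): counts per base color.
        Map<String, Long> byBaseColor = costumes.stream()
                .collect(Collectors.groupingBy(Costume::baseColor, Collectors.counting()));

        // Coarser level (roll-up): counts per color family.
        Map<String, Long> byColorFamily = costumes.stream()
                .collect(Collectors.groupingBy(Costume::colorFamily, Collectors.counting()));

        System.out.println("By base color:   " + byBaseColor);
        System.out.println("By color family: " + byColorFamily);
    }
}
```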