Browsing by keyword "NoSQL"
1 - 10 of 15
- Journal article: Big Data – Eine Einführung (HMD Praxis der Wirtschaftsinformatik: Vol. 51, No. 4, 2014). Fasel, Daniel. Following the discussions in the European business community, it becomes apparent that the term Big Data is not clearly defined in practice. Although everyone is talking about it, few have a clear answer to the question of what Big Data is and how it differs from a company's classical data. This article gives an introduction to Big Data. The fundamental characteristics of Big Data are explained in terms of Volume, Velocity, and Variety. Using Big Data to create value in a company requires new technologies and new skills for handling such data better. The article briefly explains the main groups of these new technologies and some of their representatives. Finally, it considers the opportunities and risks of Big Data in companies.
- Conference paper: ControVol Flex: Flexible Schema Evolution for NoSQL Application Development (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017). Haubold, Florian; Schildgen, Johannes; Scherzinger, Stefanie; Deßloch, Stefan. We demonstrate ControVol Flex, an Eclipse plugin for controlled schema evolution in Java applications backed by NoSQL document stores. Our tool is best suited to applications that are deployed continuously against the same production data store: each new release may bring about schema changes that conflict with legacy data already stored in production. The type system internal to the predecessor tool ControVol is able to detect common schema conflicts and enables developers to resolve them with the help of object-mapper annotations. Our new tool ControVol Flex lets developers choose their schema-migration strategy: all legacy data can be migrated eagerly by means of NotaQL transformation scripts, or lazily, as declared by object-mapper annotations. Our tool is even capable of carrying out both strategies in combination, eagerly migrating data in the background while lazily migrating data that is meanwhile accessed by the application. From the viewpoint of the application, it remains transparent how legacy data is migrated: every read access yields an entity that matches the structure that the current application code expects. Our live demo shows how ControVol Flex gracefully solves a broad range of common schema-evolution tasks.
- Conference paper: Coordinated Omission in NoSQL Database Benchmarking (Datenbanksysteme für Business, Technologie und Web (BTW 2017) - Workshopband, 2017). Friedrich, Steffen; Wingerath, Wolfram; Ritter, Norbert. Database system benchmark frameworks like the Yahoo! Cloud Serving Benchmark (YCSB) play a crucial role in the process of developing and tuning data management systems. YCSB has become the de facto standard for performance evaluation of scalable NoSQL database systems. However, its initial design is prone to skipping important latency measurements. This phenomenon is known as the coordinated omission problem and occurs in almost all load generators and monitoring tools. A recent revision of the YCSB code base addresses this particular problem, but does not actually solve it. In this paper we present the latency measurement scheme of NoSQLMark, our own YCSB-based scalable benchmark framework that completely avoids coordinated omission, and show that NoSQLMark produces more accurate results, using our validation tool SickStore and the distributed data store Cassandra.
- Journal article: Die neue Realität: Erweiterung des Data Warehouse um Hadoop, NoSQL & Co (HMD Praxis der Wirtschaftsinformatik: Vol. 51, No. 4, 2014). Müller, Stefan. As mountains of data grow ever faster, the classical data warehouse approach is reaching its limits because it can no longer keep up in terms of speed, data volume, and analysis capabilities. New Big Data technologies such as analytical databases, NoSQL databases, or Hadoop promise relief but have some drawbacks: while analytical databases integrate only poorly with other data sources, the query languages of NoSQL databases do not match the capabilities of SQL. Introducing Hadoop, in turn, requires a costly build-up of know-how within the company. A skillful combination of the data warehouse concept with modern Big Data technologies overcomes these difficulties: the data marts accessed by analytical databases can be fed from the data warehouse. The advantages of NoSQL can be exploited in the application databases, while the data for analysis are loaded into the data warehouse, where relational databases play to their strengths. Finally, the results of Hadoop transactions can readily be stored in a data warehouse or in data marts, where they can be analyzed easily via a data warehouse platform, while the raw data remain in Hadoop. In addition, Hadoop supports tools for high-performance SQL access. The article describes how the old data warehouse concept and modern technologies combine into the "new reality" and illustrates this with various usage scenarios.
- Journal article: Einflüsse auf den Implementierungserfolg von NoSQL-Systemen: Erkenntnisse einer quantitativ-empirischen Untersuchung (HMD Praxis der Wirtschaftsinformatik: Vol. 53, No. 4, 2016). Cato, Patrick; Brumm, Simon; Gölzer, Philipp; Demmelhuber, Walter. NoSQL systems are a new generation of database systems that make it possible to realize a range of new and innovative use cases. Experience reports from practitioners show, however, that the implementation of NoSQL systems often fails in practice because of its complexity. The aim of the study by the Institut für Wirtschaftsinformatik Nürnberg was therefore to identify the central factors influencing implementation success and to validate them empirically. Eight factors have a significant influence on the implementation success of NoSQL systems: sufficient project resources, a data-driven decision culture in the company, data quality of the source systems, management support, data science skills, a lab approach to exploring the use case, adequate technology selection, and consideration of data protection aspects in the design phase (privacy by design). Abstract: NoSQL systems are a new generation of database systems that enable the implementation of new and innovative use cases. However, experience reports from practitioners show that the implementation of NoSQL systems is a complex undertaking and many NoSQL projects fail. This paper summarizes the key results of a study that investigated the central factors that have an impact on the implementation success of NoSQL systems. The study has identified eight factors that have a significant impact: sufficient resources, data-driven culture, data quality of the sourcing systems, management support, data science skills, proof-of-concept phase, adequate technology selection, and privacy by design.
- Conference paper: An Experimental Analysis of Different Key-Value Stores and Relational Databases (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017). Gembalczyk, David; Schuhknecht, Felix Martin; Dittrich, Jens. Nowadays, databases serve two main workloads: Online Transaction Processing (OLTP) and Online Analytic Processing (OLAP). For decades, relational databases dominated both areas. With the hype around NoSQL databases, the picture has changed. Although key-value stores were initially designed as inter-process hash tables handling OLTP requests, some vendors have started to tackle the area of OLAP as well. Therefore, in this performance study, we compare the relational databases PostgreSQL, MonetDB, and HyPer with the key-value stores Redis and Aerospike in their write, read, and analytical capabilities. Based on the results, we investigate the reasons for the databases' respective advantages and disadvantages.
- Text document: NoSQL & Real-Time Data Management in Research & Practice (BTW 2019 – Workshopband, 2019). Wingerath, Wolfram; Gessert, Felix; Ritter, Norbert. Users have come to expect reactivity from mobile and web applications, i.e. they assume that changes made by other users become visible immediately. However, developers are challenged with building reactive applications on top of traditional pull-oriented databases, because these databases are ill-equipped to push new information to the client. Systems for data stream management and processing, on the other hand, are natively push-oriented and thus facilitate reactive behavior, but they do not follow the same collection-based semantics as traditional databases: instead of database collections, stream-oriented systems are based on a notion of potentially unbounded sequences of data items. In this tutorial, we survey and categorize the system space between pull-oriented databases and push-oriented stream management systems, using their respectively facilitated means of data retrieval as a reference point. We start with an in-depth survey of the most relevant NoSQL databases to provide a comparative classification and highlight open challenges. To this end, we analyze the approach of each system to derive its scalability, availability, consistency, data modeling, and querying characteristics. We present how each system's design is governed by a central set of trade-offs over irreconcilable system properties. We then cover recent research results in distributed data management to illustrate that some shortcomings of NoSQL systems could already be solved in practice, whereas other NoSQL data management problems pose interesting and unsolved research challenges. A particular emphasis lies on the novel system class of real-time databases, which combine the push-based access paradigm of stream-oriented systems with the collection-based query semantics of traditional databases. We explore why real-time databases deserve distinction in a separate system class and dissect their different architectures to highlight issues, derive open challenges, and discuss avenues for addressing them.
- Journal article: One DB Does Not Fit It All: Teaching the Differences in Advanced Database Systems (Datenbank-Spektrum: Vol. 21, No. 2, 2021). Wiese, Lena; Benabbas, Aboubakr; Elmamooz, Golnaz; Nicklas, Daniela. In this article we report on our experience with teaching the differences that exist between relational and non-relational data models. We present results of an evaluation of a practical course in which students are assigned 15 queries within 6 tasks that they execute on four different database systems. The pedagogical aim of this course was to show conceptual differences between data models, difficulties that can occur when trying to formulate queries in different query languages, as well as specific system behavior. We evaluate the practical course based on a questionnaire that recorded the students' performance on each task for each DB system.
- Text document: Protobase: It's About Time for Backend/Database Co-Design (BTW 2019, 2019). Pinnecke, Marcus; Campero, Gabriel; Zoun, Roman; Broneske, David; Saake, Gunter. In this interactive demonstration, we show the current state of Protobase, our main-memory analytic document store that is designed from scratch to enable rapid prototyping of efficient microservices that perform analytics and explorations on (third-party) JSON-like documents stored in a novel columnar binary-encoded format, called the Cabin file format. In contrast to other solutions, our database system exposes neither a particular query language nor a fixed REST API to its clients. Instead, the entire user-defined backend logic, whose user code is written in Python, is placed inside a sandbox that runs in the system's process. Protobase in turn exposes a user-defined REST API that the (frontend) application interacts with. Thus, our system acts as a backend server while at the same time avoiding full exposure of its database to the clients. Consequently, a Protobase instance (database + user code + REST API) serves as (the entire) microservice, potentially minimizing the number of systems running in a typical analytic software stack. In terms of execution performance, Protobase therefore takes the inter-process communication overhead between backend and database system out of the picture and heavily utilizes columnar binary document storage to scale up for analytic queries. Both features lead to a notable performance gain for non-trivial services, potentially minimizing the number of required nodes in a cloud setting, too. In our demo, we give an overview of Protobase's internals, point out major design decisions, and show how to prototype a scholarly search engine managing the Microsoft Academic Graph, a real-world scientific paper graph of roughly 154 million documents.
- Conference paper: Scalable Data Management: An In-Depth Tutorial on NoSQL Data Stores (Datenbanksysteme für Business, Technologie und Web (BTW 2017) - Workshopband, 2017). Gessert, Felix; Wingerath, Wolfram; Ritter, Norbert. The unprecedented scale at which data is consumed and generated today has shown a large demand for scalable data management and given rise to non-relational, distributed “NoSQL” database systems. Two central problems triggered this process: 1) vast amounts of user-generated content in modern applications and the resulting request loads and data volumes as well as 2) the desire of the developer community to employ problem-specific data models for storage and querying. To address these needs, various data stores have been developed by both industry and research, arguing that the era of one-size-fits-all database systems is over. The heterogeneity and sheer number of these systems, now commonly referred to as NoSQL data stores, make it increasingly difficult to select the most appropriate system for a given application. Therefore, these systems are frequently combined in polyglot persistence architectures to leverage each system in its respective sweet spot. This tutorial gives an in-depth survey of the most relevant NoSQL databases to provide a comparative classification and highlight open challenges. To this end, we analyze the approach of each system to derive its scalability, availability, consistency, data modeling, and querying characteristics. We present how each system's design is governed by a central set of trade-offs over irreconcilable system properties. We then cover recent research results in distributed data management to illustrate that some shortcomings of NoSQL systems could already be solved in practice, whereas other NoSQL data management problems pose interesting and unsolved research challenges. In addition to earlier tutorials, we explicitly address how the quickly emerging topic of processing and storing massive amounts of data in real-time can be solved by different types of real-time data management systems.