Logo des Repositoriums
 

Tuning Cassandra through Machine Learning

dc.contributor.authorEppinger, Florian
dc.contributor.authorStörl, Uta
dc.contributor.editorKönig-Ries, Birgitta
dc.contributor.editorScherzinger, Stefanie
dc.contributor.editorLehner, Wolfgang
dc.contributor.editorVossen, Gottfried
dc.date.accessioned2023-02-23T13:59:59Z
dc.date.available2023-02-23T13:59:59Z
dc.date.issued2023
dc.description.abstractNoSQL databases have become an important component of many big data and real-time web applications. Their distributed nature and scalability make them an ideal data storage repository for a variety of use cases. While NoSQL databases are delivered with a default ”off-the-shelf” configuration, they offer configuration settings to adjust a database’s behavior and performance to a specific use case and environment. The abundance and oftentimes imperceptible inter-dependencies of configuration settings make it difficult to optimize and performance-tune a NoSQL system. There is no one-size-fits-all configuration and therefore the workload, the physical design, and available resources need to be taken into account when optimizing the configuration of a NoSQL database. This work explores Machine Learning as a means to automatically tune a NoSQL database for optimal performance. Using Random Forest and Gradient Boosting Decision Tree Machine Learning algorithms, multiple Machine Learning models were fitted with a training dataset that incorporates properties of the NoSQL physical configuration (replication and sharding). The best models were then employed as surrogate models to optimize the Database Management System’s configuration settings for throughput and latency using various Black-box Optimization algorithms. Using an Apache Cassandra database, multiple experiments were carried out to demonstrate the feasibility of this approach, even across varying physical configurations. The tuned Database Management System (DBMS) configurations yielded throughput improvements of up to 4%, read latency reductions of up to 43%, and write latency reductions of up to 39% when compared to the default configuration settings.en
dc.identifier.doi10.18420/BTW2023-04
dc.identifier.isbn978-3-88579-725-8
dc.identifier.urihttps://dl.gi.de/handle/20.500.12116/40346
dc.language.isoen
dc.publisherGesellschaft für Informatik e.V.
dc.relation.ispartofBTW 2023
dc.relation.ispartofseriesLecture Notes in Informatics (LNI) - Proceedings, Volume P-331
dc.subjectAI for Database Systems
dc.subjectNoSQL
dc.subjectMachine Learning
dc.subjectPerformance Modeling
dc.subjectTuning
dc.subjectBlack-box Optimization
dc.titleTuning Cassandra through Machine Learningen
dc.typeText/Conference Paper
gi.citation.endPage104
gi.citation.publisherPlaceBonn
gi.citation.startPage93
gi.conference.date06.-10. März 2023
gi.conference.locationDresden, Germany

Dateien

Originalbündel
1 - 1 von 1
Vorschaubild nicht verfügbar
Name:
B1-4.pdf
Größe:
2.69 MB
Format:
Adobe Portable Document Format