Predicting Scaling Efficiency of Distributed Stream Processing Systems via Task Level Performance Simulation
dc.contributor.author | Rank, Johannes | |
dc.contributor.author | Barnert, Maximilian | |
dc.contributor.author | Hein, Andreas | |
dc.contributor.author | Krcmar, Helmut | |
dc.contributor.editor | Herrmann, Andrea | |
dc.date.accessioned | 2024-02-22T10:37:53Z | |
dc.date.available | 2024-02-22T10:37:53Z | |
dc.date.issued | 2023 | |
dc.description.abstract | Stream processing systems (SPS) are a special class of Big Data systems that firms employ in (near) real time business scenarios. They ensure low-latency processing through a high degree of parallelization and elasticity. However, firms often do not know which scaling direction: horizontally, vertically, or mixed, is the best strategy in terms of CPU performance to scale those systems. Especially in cloud deployments with a pay-per-use model and cluster sizes that can span dozens of cores and machines, firms would profit from more accurate measurement-based approaches. In this paper, we show how to predict the CPU consumption of Apache Flink for different scaling scenarios using the Palladio Component Model. Our approach models the individual streaming tasks that make up the application and parametrizes it with fine-grained CPU metrics obtained by combining BPF profiling and querying the CPU’s performance measurement unit. Through this “task-level model approach”, we can achieve highly accurate predictions, despite using a simple model and only requiring a few measurements for parametrization. Our experiment also shows that we achieve more accurate results than an alternative approach based on regression analysis. | en |
dc.identifier.issn | 0720-8928 | |
dc.identifier.uri | https://dl.gi.de/handle/20.500.12116/43654 | |
dc.language.iso | en | |
dc.pubPlace | Bonn | |
dc.publisher | Gesellschaft für Informatik e.V. | |
dc.relation.ispartof | Softwaretechnik-Trends Band 43, Heft 1 | |
dc.relation.ispartofseries | Softwaretechnik-Trends | |
dc.subject | Stream processing | |
dc.subject | Big Data | |
dc.subject | parallelization | |
dc.subject | scaling | |
dc.subject | prediction | |
dc.subject | performance | |
dc.subject | latency | |
dc.title | Predicting Scaling Efficiency of Distributed Stream Processing Systems via Task Level Performance Simulation | en |
dc.type | Text/Conference Paper | |
mci.conference.date | 7.-9.11.2022 | |
mci.conference.location | Stuttgart | |
mci.conference.sessiontitle | 13th Symposium on Software Performance (SSP) | |
mci.reference.pages | 14-16 |