Conference Paper

Predicting Scaling Efficiency of Distributed Stream Processing Systems via Task Level Performance Simulation

Document type

Text/Conference Paper

Date

2023

Publisher

Gesellschaft für Informatik e.V.

Abstract

Stream processing systems (SPS) are a special class of Big Data systems that firms employ in (near) real-time business scenarios. They ensure low-latency processing through a high degree of parallelization and elasticity. However, firms often do not know which scaling direction (horizontal, vertical, or mixed) is the best strategy for scaling those systems in terms of CPU performance. Especially in cloud deployments with a pay-per-use model and cluster sizes that can span dozens of cores and machines, firms would profit from more accurate measurement-based approaches. In this paper, we show how to predict the CPU consumption of Apache Flink for different scaling scenarios using the Palladio Component Model. Our approach models the individual streaming tasks that make up the application and parametrizes the model with fine-grained CPU metrics obtained by combining BPF profiling with queries to the CPU’s performance measurement unit. Through this “task-level model approach”, we can achieve highly accurate predictions, despite using a simple model and requiring only a few measurements for parametrization. Our experiment also shows that we achieve more accurate results than an alternative approach based on regression analysis.

Description

Rank, Johannes; Barnert, Maximilian; Hein, Andreas; Krcmar, Helmut (2023): Predicting Scaling Efficiency of Distributed Stream Processing Systems via Task Level Performance Simulation. Softwaretechnik-Trends, Volume 43, Issue 1. Gesellschaft für Informatik e.V. ISSN: 0720-8928
