Conference paper
A Dynamic Resource Demand Analysis Approach for Stream Processing Systems
Document type
Text/Conference Paper
Additional information
Date
2020
Publisher
Gesellschaft für Informatik e.V.
Abstract
Systems that provide real-time business insights based on live data, so-called Stream Processing Systems (SPSs), have received much attention in recent years. In many areas, such as stock markets or surveillance, it is essential to process data immediately and react accordingly. As the processing of real-time data is at the heart of SPSs, their performance in terms of latency, throughput, and resource utilization plays a crucial role. Traditional performance and benchmarking approaches for SPSs usually focus on throughput and latency, trying to answer the question of which engine processes the incoming events fastest. However, neglecting the corresponding resource utilization provides only a limited and sometimes even misleading view of their actual performance. Depending on the use case, an engine that achieves faster processing at the cost of higher memory utilization is not always the best choice, as the example of IoT edge computing devices with limited resources shows. For this reason, we developed a dynamic approach for analyzing the resource demands of an SPS. The approach yields fine-grained performance metrics based on the individual processing steps of the SPS without requiring any knowledge of the actual source code. Moreover, it takes the whole system (engine and streaming application) into account. Since we do not rely on code instrumentation or language-specific profiling techniques but instead use the dynamic tracing capabilities of the Linux kernel, we can support a broad range of different SPSs. We evaluate our approach by inspecting the CPU performance of Apache Flink while it runs the Yahoo streaming benchmark.