Browsing by keyword "Stream processing"
- Journal article: An Index-Inspired Algorithm for Anytime Classification on Evolving Data Streams (Datenbank-Spektrum: Vol. 12, No. 1, 2012)
  Kranen, Philipp; Assent, Ira; Seidl, Thomas
  Due to the ever-growing presence of data streams, there has been a considerable amount of research on stream data mining over the past years. Anytime algorithms are particularly well suited for stream mining, since they flexibly use all available time on streams of varying data rates, and are also shown to outperform traditional budget approaches on constant streams. In this article we present an index-inspired algorithm for Bayesian anytime classification on evolving data streams and show its performance on benchmark data sets.
- Text document: A Comparison of Distributed Stream Processing Systems for Time Series Analysis (BTW 2019 – Workshopband, 2019)
  Gehring, Melissa; Charfuelan, Marcela; Markl, Volker
  Given the vast number of data processing systems available today, in this paper we aim to identify, select, and evaluate systems to determine the one that is better suited for time series analysis. Published performance studies are used to compare several open-source systems, and two systems are further selected for qualitative comparison and evaluation regarding the development of a time series analytics task. The main interest of this work lies in investigating the ease of development. As a test scenario, a discrete Kalman filter is implemented to predict the closing price of stock market data in real time. Basic functionality coverage is considered, and advanced functionality is evaluated using several qualitative comparison criteria.
- Conference paper: Enactment of Adaptation in Data Stream Processing with Latency Implications (Software Engineering 2020, 2020)
  Qin, Cui; Eichelberger, Holger; Schmid, Klaus
  This summary refers to the paper "Enactment of adaptation in data stream processing with latency implications – A systematic literature review", a journal paper published in Information and Software Technology (IST) in July 2019. Runtime adaptation in stream processing plays a significant role in supporting the optimization of data processing tasks. In recent years, runtime adaptation, particularly its enactment, has received significant interest in scientific literature. However, so far no categorization of the enactment approaches for runtime adaptation in stream processing has been established. This paper presents a systematic literature review (SLR), where we identify and characterize different approaches towards the enactment of runtime adaptation in stream processing, with a main focus on latency as the quality dimension. We discovered 75 relevant papers out of 244 papers from the search. We identified 17 different enactment categories and developed a taxonomy to characterize all possible enactment approaches. We extracted the realization techniques of each identified enactment approach and classified them into categories. Furthermore, we identified 9 categories of processing problems, 6 adaptation goals, 9 evaluation metrics and 12 evaluation parameters from the identified enactment approaches. Research interest in enactment approaches has significantly increased in recent years. The most commonly applied enactment approaches are parameter adaptation to tune parameters or settings of the processing, load balancing to redistribute workloads, and processing scaling to dynamically scale the processing up and down.
- Conference paper: Predicting Scaling Efficiency of Distributed Stream Processing Systems via Task Level Performance Simulation (Softwaretechnik-Trends Band 43, Heft 1, 2023)
  Rank, Johannes; Barnert, Maximilian; Hein, Andreas; Krcmar, Helmut
  Stream processing systems (SPS) are a special class of Big Data systems that firms employ in (near) real-time business scenarios. They ensure low-latency processing through a high degree of parallelization and elasticity. However, firms often do not know which scaling direction (horizontal, vertical, or mixed) is the best strategy in terms of CPU performance to scale those systems. Especially in cloud deployments with a pay-per-use model and cluster sizes that can span dozens of cores and machines, firms would profit from more accurate measurement-based approaches. In this paper, we show how to predict the CPU consumption of Apache Flink for different scaling scenarios using the Palladio Component Model. Our approach models the individual streaming tasks that make up the application and parametrizes it with fine-grained CPU metrics obtained by combining BPF profiling and querying the CPU’s performance measurement unit. Through this “task-level model approach”, we can achieve highly accurate predictions, despite using a simple model and only requiring a few measurements for parametrization. Our experiment also shows that we achieve more accurate results than an alternative approach based on regression analysis.