Conference Paper
Architecture-Aware Online Failure Prediction for Distributed Software Systems
Document type
Text/Conference Paper
Date
2018
Publisher
Gesellschaft für Informatik
Abstract
This abstract summarizes our article appearing in the Journal of Systems and Software. Today’s software systems are complex: they comprise an immense number of distributed hardware and software components that together deliver the desired functionality. Failures during production are inevitable despite successful approaches to quality assurance during software development. A failure in one component, e.g., a memory leak or slow response time, can create a chain of failures that propagates to other components and to the users. Online failure prediction aims to foresee imminent failures by making predictions based on system parameters from monitoring data. Existing approaches employ prediction models that predict failures either for the whole system or for individual components, without considering the software architecture. We propose an architecture-aware online failure prediction approach that combines failure prediction with architectural knowledge. The failure probabilities of individual components are predicted from continuously collected monitoring data. The prediction results are forwarded to a failure propagation model, which periodically computes a system failure probability. The model uses a Bayesian network to represent architectural dependencies extracted automatically from architectural knowledge. The results can, for instance, be used for proactive maintenance. The experimental evaluation shows that prediction quality improves when software architectural knowledge is included in the prediction.
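To make the propagation step more concrete, the following is a minimal sketch of how per-component failure probabilities might be combined along architectural dependencies. It is not the model from the article: the abstract does not specify the conditional probability tables, so the sketch assumes the simplest possible one (a component fails if it fails on its own or if any component it depends on fails) and assumes that intrinsic failures are independent. The function name, the component names, and all probabilities are hypothetical.

```python
from itertools import product


def system_failure_probability(own_failure, depends_on, entry_point):
    """Exact P(entry_point fails) by enumerating intrinsic failure states.

    own_failure: dict mapping component -> predicted probability that the
                 component fails on its own (hypothetical predictor output)
    depends_on:  dict mapping component -> list of components it calls;
                 the dependency graph is assumed to be acyclic
    entry_point: component whose failure is treated as a system failure
    """
    names = sorted(own_failure)

    def fails(component, broken):
        # Assumed CPT: deterministic OR over the component's own failure
        # and the failures of all components it depends on.
        if component in broken:
            return True
        return any(fails(dep, broken) for dep in depends_on.get(component, []))

    total = 0.0
    # Enumerate every combination of intrinsic failures (2^n states), which
    # is only feasible for small illustrative architectures.
    for states in product([False, True], repeat=len(names)):
        broken = {name for name, state in zip(names, states) if state}
        weight = 1.0
        for name, state in zip(names, states):
            weight *= own_failure[name] if state else 1.0 - own_failure[name]
        if fails(entry_point, broken):
            total += weight
    return total


# Made-up three-tier architecture: frontend -> service -> database.
own_failure = {"frontend": 0.01, "service": 0.05, "database": 0.10}
depends_on = {"frontend": ["service"], "service": ["database"]}

p_system = system_failure_probability(own_failure, depends_on, "frontend")
p_database = system_failure_probability(own_failure, depends_on, "database")
print(f"P(system failure) = {p_system:.5f}")
```

Under these assumptions the system failure probability at the frontend is higher than any single component's own failure probability, because failures of the service and the database propagate upward. A realistic model would replace the deterministic OR with learned or expert-defined conditional probabilities and would use a proper Bayesian network inference method instead of brute-force enumeration.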