On the Cost and Profit of Software Defect Prediction

Herbold, Steffen; Koziolek, Anne; Schaefer, Ina; Seidl, Christoph
2020-12-17; 2020-12-17; 2021
ISBN: 978-3-88579-704-3
https://dl.gi.de/handle/20.500.12116/34509

The article "On the cost and profit of software defect prediction", published in the IEEE Transactions on Software Engineering in 2019, proposes a cost model for estimating the expected profit of using machine learning models for defect prediction. Defect prediction can be a powerful tool to guide the use of quality assurance resources. However, while much research has covered methods for defect prediction as well as methodological aspects of defect prediction research, the actual cost-saving potential of defect prediction is still unclear. We close this research gap and formulate a cost model for software defect prediction. We derive mathematically provable boundary conditions that a defect prediction model must fulfill for its use to yield a positive profit. Our cost model includes aspects like the costs for quality assurance, the costs of post-release defects, and the possibility that quality assurance fails to reveal predicted defects. Our results show that the unrealistic assumption that defects only affect a single software artifact leads to inaccurate cost estimations. Moreover, the results indicate that thresholds for machine learning metrics are not suited to define success criteria for software defect prediction.

Language: en
Keywords: Defect Prediction; Costs; Return On Investment
Type: Text/ConferencePaper
DOI: 10.18420/SE2021_15
ISSN: 1617-5468