Total Recall? How Good Are Static Call Graphs Really?

Helm, Dominik; Keidel, Sven; Kampkötter, Anemone; Düsing, Johannes; Roth, Tobias; Hermann, Ben; Mezini, Mira; Koziolek, Anne; Lamprecht, Anna-Lena; Thüm, Thomas; Burger, Erik

Dates: 2025-02-14; 2025
ISSN: 2944-7682
URI: https://dl.gi.de/handle/20.500.12116/45789
DOI: 10.18420/se2025-28
Language: en
Keywords: Call Graph; Static Analysis; Dynamic Analysis; Precision; Recall

Abstract: Static call graphs are a fundamental building block of program analysis. However, differences in call-graph construction and the use of specific language features can yield unsoundness and imprecision. Call-graph analyses are evaluated using measures of precision and recall, but this is hard because a ground truth for real-world programs is generally unobtainable. We propose to use dynamic baselines based on fixed entry points and input corpora. Creating such a dynamic baseline is posed as approximating the ground truth, which is an optimization problem. We use manual extension and coverage-guided fuzzing to create suitable input corpora. With these dynamic baselines, we study the call-graph quality of multiple algorithms and implementations using four real-world Java programs. We find that our methodology provides insights into call-graph quality and how to measure it. We provide a novel methodology to advance the field of static program analysis by assessing the computation of one of its core data structures, the call graph.