Schott, StefanPauck, FelixEngels, GregorHebig, ReginaTichy, Matthias2023-01-182023-01-182023978-3-88579-726-5https://dl.gi.de/handle/20.500.12116/40107The conventional approach of assessing the performance of Android taint analysis tools consists of applying the tool to already existing benchmarks and calculating its performance on the contained benchmark cases. Creating and maintaining a benchmark requires a lot of effort, since it needs to comprise various analysis challenges, and since each benchmark case needs a well documented ground-truth - otherwise one cannot know whether a tool’s analysis is accurate. This effort is further increased by the frequently changing Android API. All these factors lead to the same, usually manually created, benchmarks being reused over and over again. In consequence analysis tools are often over-adapted to these benchmarks. To overcome these issues we propose the concept of benchmark fuzzing , which allows the generation of previously unknown and unique benchmarks, alongside their ground-truths, at evaluation time. We implement this approach in our tool GenBenchDroid and additionally show that we are able to find analysis faults that remain uncovered when solely relying on the conventional benchmarking approach.enFuzzingBenchmarksAndroid Taint AnalysisGenBenchDroid: Fuzzing Android Taint Analysis BenchmarksText/Conference Paper1617-5468