Mustafa, Tarek AlKönig-Ries, BirgittaSamuel, SheebaKönig-Ries, BirgittaScherzinger, StefanieLehner, WolfgangVossen, Gottfried2023-02-232023-02-232023978-3-88579-725-8https://dl.gi.de/handle/20.500.12116/40382Over the last decade Machine learning (ML) has dramatically changed the application ofand research in computer science. It becomes increasingly complicated to assure the transparency and reproducibility of advanced ML systems from raw data to deployment. In this paper, we describe an approach to supply users with an interface to specify a variety of parameters that together provide complete provenance information and automatically generate executable ML code from this information. We introduce MLProvCodeGen (Machine Learning Provenance Code Generator), a JupyterLab extension to generate custom code for ML experiments from user-defined metadata. ML workflows can be generated with different data settings, model parameters, methods, and trainingparameters and reproduce results in Jupyter Notebooks. We evaluated our approach with two ML applications, image and multiclass classification, and conducted a user evaluation.enProvenance ManagementCode GenerationMachine LearningJupyterLabJupyter NotebooksReproducibilityMLProvCodeGen: A Tool for Provenance Data Input and Capture of Customizable Machine Learning ScriptsText/Conference Paper10.18420/BTW2023-72