Ongoing Automated Data Set Generation for Vulnerability Prediction from Github Data
dc.contributor.author | Hinrichs, Torge | |
dc.contributor.editor | Christian Wressnegger, Delphine Reinhardt | |
dc.date.accessioned | 2023-01-24T11:17:50Z | |
dc.date.available | 2023-01-24T11:17:50Z | |
dc.date.issued | 2022 | |
dc.description.abstract | This paper describes the development of a continuous GitHub repository analysis pipeline focused on creating a data set for vulnerability prediction in source code. Currently used data sets consist only of source code functions or methods without additional meta information. This paper assumes that the code surrounding vulnerable functions can improve the detection rate. Testing this assumption requires large data sets, which the proposed pipeline can create. Although the pipeline still requires some improvements, a first test run analyzed and evaluated 1.5 million repositories. The resulting data set will be published in the future. | en |
dc.identifier.doi | 10.18420/sicherheit2022_17 | |
dc.identifier.isbn | 978-3-88579-717-3 | |
dc.identifier.pissn | 1617-5468 | |
dc.identifier.uri | https://dl.gi.de/handle/20.500.12116/40138 | |
dc.language.iso | en | |
dc.publisher | Gesellschaft für Informatik, Bonn | |
dc.relation.ispartof | GI SICHERHEIT 2022 | |
dc.relation.ispartofseries | Lecture Notes in Informatics (LNI) - Proceedings, Volume P-323 | |
dc.subject | Vulnerability Prediction | |
dc.subject | Vulnerability Detection | |
dc.subject | Machine Learning | |
dc.subject | data set generation | |
dc.title | Ongoing Automated Data Set Generation for Vulnerability Prediction from Github Data | en |
gi.citation.endPage | 222 | |
gi.citation.startPage | 219 | |
gi.conference.date | 5.-8. April 2022 | |
gi.conference.location | Karlsruhe | |
gi.conference.sessiontitle | Practitioners Track |