Ongoing Automated Data Set Generation for Vulnerability Prediction from Github Data

dc.contributor.author: Hinrichs, Torge
dc.contributor.editor: Christian Wressnegger, Delphine Reinhardt
dc.date.accessioned: 2023-01-24T11:17:50Z
dc.date.available: 2023-01-24T11:17:50Z
dc.date.issued: 2022
dc.description.abstract: This paper describes the development of a continuous GitHub repository analysis pipeline focused on creating a data set for vulnerability prediction in source code. Currently used data sets consist only of source code functions or methods, without additional meta information. This paper assumes that the code surrounding vulnerable functions can improve the detection rate. Testing this assumption requires large data sets, which can be created using the proposed pipeline. Although the pipeline still requires some improvements, a first test run analyzed and evaluated 1.5 million repositories. The resulting data set will be published in the future. [en]
dc.identifier.doi: 10.18420/sicherheit2022_17
dc.identifier.isbn: 978-3-88579-717-3
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/40138
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik, Bonn
dc.relation.ispartof: GI SICHERHEIT 2022
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-323
dc.subject: Vulnerability Prediction
dc.subject: Vulnerability Detection
dc.subject: Machine Learning
dc.subject: data set generation
dc.title: Ongoing Automated Data Set Generation for Vulnerability Prediction from Github Data [en]
gi.citation.endPage: 222
gi.citation.startPage: 219
gi.conference.date: 5-8 April 2022
gi.conference.location: Karlsruhe
gi.conference.sessiontitle: Practitioners Track

Files

Original bundle
Name: B5-4.pdf
Size: 295.83 KB
Format: Adobe Portable Document Format