Ongoing Automated Data Set Generation for Vulnerability Prediction from Github Data

Hinrichs, Torge

dc.contributor.author	Hinrichs, Torge
dc.contributor.editor	Christian Wressnegger, Delphine Reinhardt
dc.date.accessioned	2023-01-24T11:17:50Z
dc.date.available	2023-01-24T11:17:50Z
dc.date.issued	2022
dc.description.abstract	This paper describes the development of a continuous GitHub repository analysis pipeline focused on creating a data set for vulnerability prediction in source code. Currently used data sets consist only of source code functions or methods without additional meta information. This paper assumes that the code surrounding vulnerable functions can improve the detection rate. Testing this assumption requires large data sets, which the proposed pipeline can create. Although the pipeline still requires some improvements, a first test run analyzed and evaluated 1.5 million repositories. The resulting data set will be published in the future.	en
dc.identifier.doi	10.18420/sicherheit2022_17
dc.identifier.isbn	978-3-88579-717-3
dc.identifier.pissn	1617-5468
dc.identifier.uri	https://dl.gi.de/handle/20.500.12116/40138
dc.language.iso	en
dc.publisher	Gesellschaft für Informatik, Bonn
dc.relation.ispartof	GI SICHERHEIT 2022
dc.relation.ispartofseries	Lecture Notes in Informatics (LNI) - Proceedings, Volume P-323
dc.subject	Vulnerability Prediction
dc.subject	Vulnerability Detection
dc.subject	Machine Learning
dc.subject	data set generation
dc.title	Ongoing Automated Data Set Generation for Vulnerability Prediction from Github Data	en
gi.citation.endPage	222
gi.citation.startPage	219
gi.conference.date	5-8 April 2022
gi.conference.location	Karlsruhe
gi.conference.sessiontitle	Practitioners Track

Files

Original bundle

Name: B5-4.pdf
Size: 295.83 KB
Format: Adobe Portable Document Format

Collections

P323 - Sicherheit 2022 - Sicherheit, Schutz und Zuverlässigkeit