Conference Paper

Using Pre-trained Transformers to Detect Malicious Source Code Within JavaScript Packages

Document Type

Text/Conference Paper

Date

2024

Publisher

Gesellschaft für Informatik e.V.

Abstract

The proliferation of open source software reuse has led to a significant increase in software supply chain attacks, making it increasingly challenging to identify malicious packages amidst the sheer volume of available packages. Traditional static analysis methods often fall short in detecting these threats due to the complexity and diversity of code semantics. This paper addresses these challenges by leveraging the remarkable success of transformer models in understanding code semantics. We propose a novel approach that utilizes pre-trained transformer models to embed source code, followed by training classifiers on these embeddings. This methodology enables a more nuanced understanding of code semantics, significantly improving the detection of malicious packages. Through extensive experiments, our approach achieves F1-scores as high as 0.98 and an alert rate of 0.09%, demonstrating its effectiveness in detecting malicious code within open source software supply chains.
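The embed-then-classify pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `embed` is a stand-in that hashes character trigrams where a real pipeline would call a pre-trained transformer encoder, the nearest-centroid classifier is an arbitrary choice (the abstract does not name the classifiers used), and `alert_rate` is assumed here to mean the share of scanned samples flagged as malicious. All function names are hypothetical.

```python
import math
import zlib

DIM = 256  # size of the stand-in embedding (arbitrary choice)


def embed(code):
    """Stand-in for a pre-trained transformer encoder (assumption):
    hashed character-trigram counts, L2-normalised."""
    vec = [0.0] * DIM
    for i in range(len(code) - 2):
        vec[zlib.crc32(code[i:i + 3].encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def train(samples):
    """Fit one centroid per class from (code, label) pairs.
    Label 1 = malicious, 0 = benign; each class needs at least one sample."""
    by_label = {0: [], 1: []}
    for code, label in samples:
        by_label[label].append(embed(code))
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in by_label.items()
    }


def predict(model, code):
    """Return the label whose centroid has the largest dot product
    with the embedding of `code`."""
    e = embed(code)
    scores = {
        label: sum(a * b for a, b in zip(e, centroid))
        for label, centroid in model.items()
    }
    return max(scores, key=scores.get)


def f1_score(y_true, y_pred):
    """F1 for the malicious class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def alert_rate(y_pred):
    """Share of samples flagged as malicious (assumed definition)."""
    return sum(y_pred) / len(y_pred)
```

To approximate the setup in the abstract, only `embed` would need to change: it would return the vector produced by a pre-trained code model, and any standard classifier could then be trained on those vectors.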

Description

Ohm, Marc; Götz, Anja (2024): Using Pre-trained Transformers to Detect Malicious Source Code Within JavaScript Packages. INFORMATIK 2024. DOI: 10.18420/inf2024_40. Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 978-3-88579-746-3. pp. 529-538. Safety in Bytes. Wiesbaden. 24-26 September 2024
