Text document
Bidirectional Transformer Language Models for Smart Autocompletion of Source Code
Date
2021
Publisher
Gesellschaft für Informatik, Bonn
Abstract
This paper investigates the use of transformer networks, which have recently become ubiquitous in natural language processing, for smart autocompletion of source code. Our model JavaBERT is based on a RoBERTa network, which we pretrain on 250 million lines of code and then adapt for method ranking, i.e., ranking an object's methods based on the surrounding code context. We propose two alternative approaches, namely unsupervised probabilistic reasoning and supervised fine-tuning. The supervised variant proves more accurate, reaching a top-3 accuracy of up to 98%. We also show that the model, although trained on the full context of each method call, remains robust when the available context is reduced.
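The abstract does not give code, but the unsupervised variant can be illustrated with a minimal sketch: mask the method name at a call site and rank an object's candidate methods by the masked-language-model score at that position. The sketch below assumes a HuggingFace-style RoBERTa checkpoint ("roberta-base" stands in for JavaBERT, which is not assumed to be publicly available), a hypothetical <METHOD> placeholder marking the call site, and the simplification of scoring each candidate by its first sub-token only.

# Hedged sketch: masked-LM method ranking with a generic RoBERTa model.
# "roberta-base", the <METHOD> placeholder, and the candidate list are
# illustrative assumptions, not the paper's actual setup.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")
model.eval()

def rank_methods(context: str, candidates: list[str]) -> list[tuple[str, float]]:
    """Rank candidate method names for the <METHOD> slot in `context`.

    Simplification: each candidate is scored by the log-probability of its
    first sub-token at the masked position.
    """
    text = context.replace("<METHOD>", tokenizer.mask_token)
    inputs = tokenizer(text, return_tensors="pt")
    # Locate the single mask token that replaced the method name.
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0].item()

    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    log_probs = torch.log_softmax(logits, dim=-1)

    scored = []
    for name in candidates:
        # Score the candidate by its first sub-token id at the masked slot.
        first_id = tokenizer(name, add_special_tokens=False).input_ids[0]
        scored.append((name, log_probs[first_id].item()))
    return sorted(scored, key=lambda item: item[1], reverse=True)

snippet = 'List<String> names = new ArrayList<>(); names.<METHOD>("Ada");'
print(rank_methods(snippet, ["add", "clear", "remove", "size"]))

The supervised variant described in the abstract would instead fine-tune the pretrained network on labeled call sites to predict the correct method directly; the sketch above only mirrors the unsupervised, probability-based ranking idea.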