
Bidirectional Transformer Language Models for Smart Autocompletion of Source Code

dc.contributor.author: Binder, Felix
dc.contributor.author: Villmow, Johannes
dc.contributor.author: Ulges, Adrian
dc.contributor.editor: Reussner, Ralf H.
dc.contributor.editor: Koziolek, Anne
dc.contributor.editor: Heinrich, Robert
dc.date.accessioned: 2021-01-27T13:34:30Z
dc.date.available: 2021-01-27T13:34:30Z
dc.date.issued: 2021
dc.description.abstract: This paper investigates the use of transformer networks, which have recently become ubiquitous in natural language processing, for smart autocompletion of source code. Our model JavaBERT is based on a RoBERTa network, which we pretrain on 250 million lines of code and then adapt for method ranking, i.e. ranking an object's methods based on the code context. We suggest two alternative approaches, namely unsupervised probabilistic reasoning and supervised fine-tuning. The supervised variant proves more accurate, with a top-3 accuracy of up to 98%. We also show that the model, though trained on method calls' full contexts, is quite robust with respect to reduced context.
dc.identifier.doi: 10.18420/inf2020_83
dc.identifier.isbn: 978-3-88579-701-2
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/34796
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik, Bonn
dc.relation.ispartof: INFORMATIK 2020
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-307
dc.subject: smart autocompletion
dc.subject: deep learning
dc.subject: transformer networks
dc.title: Bidirectional Transformer Language Models for Smart Autocompletion of Source Code
gi.citation.endPage: 922
gi.citation.startPage: 915
gi.conference.date: 28 September - 2 October 2020
gi.conference.location: Karlsruhe
gi.conference.sessiontitle: 3rd Workshop on Smart Systems for Better Living Environments
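
The abstract's unsupervised variant ranks an object's methods by the probability a masked language model assigns to them at the call site. Below is a minimal sketch of that idea using the HuggingFace transformers API; the generic roberta-base checkpoint, the helper rank_methods, the first-sub-token scoring shortcut, and the toy Java context are all illustrative assumptions, since the paper's JavaBERT model and data are not reproduced here.

# Illustrative sketch, not the authors' released code: rank candidate methods
# by the masked-LM probability of each candidate at the <mask> position.
# "roberta-base" is a stand-in for the JavaBERT checkpoint (hypothetical here).
import torch
from transformers import RobertaForMaskedLM, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")
model.eval()

def rank_methods(context: str, candidates: list[str]) -> list[tuple[str, float]]:
    """Score each candidate method name by the probability of its first
    sub-token at the <mask> position in the given code context."""
    inputs = tokenizer(context, return_tensors="pt")
    # Locate the <mask> token in the encoded input.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos[0]]
    probs = logits.softmax(dim=-1)
    scores = []
    for name in candidates:
        # Use the candidate's first sub-token probability as a proxy score.
        first_id = tokenizer(name, add_special_tokens=False)["input_ids"][0]
        scores.append((name, probs[first_id].item()))
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy usage: which method of the list object best fits this Java context?
context = 'List<String> names = new ArrayList<>(); names.<mask>("Ada");'
print(rank_methods(context, ["add", "remove", "clear", "size"]))

The paper's supervised variant instead fine-tunes the pretrained network directly on the ranking task, which the abstract reports as the more accurate of the two approaches.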

Files

Original bundle
Name: C20-6.pdf
Size: 291.24 KB
Format: Adobe Portable Document Format