Browsing by author "Villmow, Johannes"
- Text document: Bidirectional Transformer Language Models for Smart Autocompletion of Source Code (INFORMATIK 2020, 2021). Binder, Felix; Villmow, Johannes; Ulges, Adrian. This paper investigates the use of transformer networks, which have recently become ubiquitous in natural language processing, for smart autocompletion of source code. Our model JavaBERT is based on a RoBERTa network, which we pretrain on 250 million lines of code and then adapt for method ranking, i.e. ranking an object's methods based on the code context. We suggest two alternative approaches, namely unsupervised probabilistic reasoning and supervised fine-tuning. The supervised variant proves more accurate, with a top-3 accuracy of up to 98%. We also show that the model, though trained on method calls' full contexts, is quite robust to reduced context.
- Conference paper: Evaluating Contextualized Code Search in Practical User Studies (INFORMATIK 2024, 2024). Villmow, Johannes; Ulges, Adrian; Schwanecke, Ulrich. Contextualized Code Search (CCS) aims to retrieve relevant code snippets that complement the developer’s current editor context. In contrast to AI-based code generation, it offers the key benefit that the source of the retrieved code is made transparent, allowing for safe re-use of code within companies. Recently, self-supervised training for CCS has been shown to be effective. Evidence for this, however, focuses on ranking quality on research datasets. It remains unclear whether, and if so to what extent, CCS can help improve the efficiency of real-world users. To fill this gap, we have integrated a recent CCS model into an IDE. We describe specialized robustness-oriented enhancements to the training to improve usability. We then evaluate the model in two practical user studies. In Study A, we measure efficiency improvements of fourth-semester computer science students on simple algorithm exercises. In Study B, we allow a professional software development team to use the tool in their everyday work. Their company consists of several more or less independent teams that work on the same product and might find each other's code helpful. We demonstrate improvements by the proposed search, discuss use cases for the tool, and point out challenges and directions for future research (such as the combination with code generation in retrieval-augmented generation).