
Code Generation for Niche Programming Languages with Large Language Models

dc.contributor.author: Kogler, Philipp
dc.contributor.author: Chen, Wei
dc.contributor.author: Wallner, Stefan
dc.contributor.editor: Feichtinger, Kevin
dc.contributor.editor: Sonnleithner, Lisa
dc.contributor.editor: Hajiabadi, Hamideh
dc.date.accessioned: 2025-02-14T10:03:34Z
dc.date.available: 2025-02-14T10:03:34Z
dc.date.issued: 2025
dc.description.abstract: Code generation is a prominent use case for Large Language Models (LLMs). Specialized LLMs such as CodeLlama or Codestral are trained on a large variety of programming languages and achieve strong performance on coding tasks. However, their performance decreases when they are applied to less common programming languages that are not included in their pre-training corpus. In this work, we describe an approach to integrating an LLM into a coding copilot for specific applications where code must be generated in a niche general-purpose programming language. We study the use of an intermediate domain-specific language (DSL) to limit the scope to the application-specific needs and to enable the LLM to reliably generate code in such an application-specific scenario. We evaluate this method on two use cases: generating constraints in the context of product configuration using the MiniZinc constraint language, and generating test specifications in the context of railway infrastructure using the Balise Telegram Test Language. Our results show that defining an intermediate scope-limited DSL improves the performance of an LLM in our evaluated application-specific code generation scenarios. However, we cannot guarantee that the presented performance results generalize to all scenarios. [en]
dc.identifier.doi: 10.18420/se2025-ws-13
dc.identifier.eissn: 2944-7682
dc.identifier.issn: 2944-7682
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/45821
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik, Bonn
dc.relation.ispartof: Software Engineering 2025 – Companion Proceedings
dc.subject: Code Generation
dc.subject: Large Language Models
dc.subject: Niche Programming Languages
dc.subject: Domain-specific Languages
dc.title: Code Generation for Niche Programming Languages with Large Language Models [en]
mci.conference.date: 22-28 February 2025
mci.conference.location: Karlsruhe
mci.conference.sessiontitle: 2nd Workshop on Generative and Neurosymbolic AI in Software Engineering (GenSE’25)
mci.reference.pages: 147-165

Files

Original bundle
1 - 1 of 1
Name: B4-3.pdf
Size: 306.78 KB
Format: Adobe Portable Document Format