Code Generation for Niche Programming Languages with Large Language Models
dc.contributor.author | Kogler, Philipp | |
dc.contributor.author | Chen, Wei | |
dc.contributor.author | Wallner, Stefan | |
dc.contributor.editor | Feichtinger, Kevin | |
dc.contributor.editor | Sonnleithner, Lisa | |
dc.contributor.editor | Hajiabadi, Hamideh | |
dc.date.accessioned | 2025-02-14T10:03:34Z | |
dc.date.available | 2025-02-14T10:03:34Z | |
dc.date.issued | 2025 | |
dc.description.abstract | Code generation is a prominent use case for Large Language Models (LLMs). Specialized LLMs such as CodeLlama or Codestral are trained on a large variety of programming languages and achieve strong performance on coding tasks. However, when applied to less common programming languages that are not included in their pre-training corpus, their performance decreases. In this work, we describe an approach to integrating an LLM into a coding copilot for specific applications where code must be generated in a niche general-purpose programming language. We study the use of an intermediate domain-specific language (DSL) to limit the scope to the application-specific needs and to enable the LLM to reliably generate code in such an application-specific scenario. We evaluate this method on two use cases: generating constraints in the context of product configuration using the MiniZinc constraint language, and generating test specifications in the context of railway infrastructure using the Balise Telegram Test Language. Our results show that defining an intermediate scope-limited DSL improves the performance of an LLM in our evaluated application-specific code generation scenarios. However, we cannot guarantee that the presented performance results generalize to all scenarios. | en |
dc.identifier.doi | 10.18420/se2025-ws-13 | |
dc.identifier.eissn | 2944-7682 | |
dc.identifier.issn | 2944-7682 | |
dc.identifier.uri | https://dl.gi.de/handle/20.500.12116/45821 | |
dc.language.iso | en | |
dc.publisher | Gesellschaft für Informatik, Bonn | |
dc.relation.ispartof | Software Engineering 2025 – Companion Proceedings | |
dc.subject | Code Generation | |
dc.subject | Large Language Models | |
dc.subject | Niche Programming Languages | |
dc.subject | Domain-specific Languages | |
dc.title | Code Generation for Niche Programming Languages with Large Language Models | en |
mci.conference.date | 22-28 February 2025 | |
mci.conference.location | Karlsruhe | |
mci.conference.sessiontitle | 2nd Workshop on Generative and Neurosymbolic AI in Software Engineering (GenSE’25) | |
mci.reference.pages | 147-165 |