Combining Retrieval-Augmented Generation and Few-Shot Learning for Model Synthesis of Uncommon DSLs

Baumann, Nils; Diaz, Juan Sebastian; Michael, Judith; Netz, Lukas; Nqiri, Haron; Reimer, Jan; Rumpe, Bernhard

dc.contributor.author	Baumann, Nils
dc.contributor.author	Diaz, Juan Sebastian
dc.contributor.author	Michael, Judith
dc.contributor.author	Netz, Lukas
dc.contributor.author	Nqiri, Haron
dc.contributor.author	Reimer, Jan
dc.contributor.author	Rumpe, Bernhard
dc.contributor.editor	Giese, Holger
dc.contributor.editor	Rosenthal, Kristina
dc.date.accessioned	2024-03-12T05:30:27Z
dc.date.available	2024-03-12T05:30:27Z
dc.date.issued	2024
dc.description.abstract	We introduce a method that enables large language models (LLMs) to generate models for domain-specific languages (DSLs) on which the LLM has little to no training data. Common LLMs such as GPT-4, Llama 2, or Bard are trained on publicly available data and can therefore produce models for well-known modeling languages such as PlantUML; however, they perform worse on lesser-known or unpublished DSLs. Previous work focused on using few-shot learning (FSL) to synthesize models but did not address or evaluate the potential of retrieval-augmented generation (RAG) to provide fitting examples for the FSL-based modeling approach. In this work, we propose a toolchain and test each building block individually: We use the MontiCore Sequence Diagram Language, on which GPT-4 has minimal training data, to assess the extent to which FSL increases the likelihood of synthesizing an accurate model. Additionally, we evaluate how effectively RAG can identify suitable models for user requests and determine whether GPT-4 can distinguish between requests for a specific model and those for general information. We show that RAG and FSL can be used to enable simple model synthesis for uncommon DSLs, as long as there is a fitting knowledge base that can be accessed to provide the needed examples for the FSL approach.	en
dc.identifier.doi	10.18420/modellierung2024-ws-007
dc.identifier.uri	https://dl.gi.de/handle/20.500.12116/43781
dc.language.iso	en
dc.pubPlace	Bonn
dc.publisher	Gesellschaft für Informatik e.V.
dc.relation.ispartof	Modellierung 2024 Satellite Events
dc.subject	LLMs
dc.subject	RAG
dc.subject	DSLs
dc.subject	Few-Shot Learning
dc.subject	MDSE
dc.title	Combining Retrieval-Augmented Generation and Few-Shot Learning for Model Synthesis of Uncommon DSLs	en
dc.type	Text/Workshop Paper
gi.conference.date	12-15 March
gi.conference.location	Potsdam
gi.conference.sessiontitle	LLM4Modeling

Files

Original bundle

Name: 07_Combining Retrieval-Augmented Generation and Few-Shot Learning for Model Synthesis.pdf
Size: 561.02 KB
Format: Adobe Portable Document Format

Collections

Modellierung 2024 - Workshopband