Conference Paper

On Data Spaces for Retrieval Augmented Generation

Abstract

Large Language Models (LLMs) have revolutionized knowledge retrieval from natural language queries. However, LLMs still struggle to produce domain-specific and accurate answers. Recently, the Retrieval Augmented Generation (RAG) architecture has been proposed as one approach to addressing these challenges. While current research focuses on optimizing document retrieval and augmenting the initial query accordingly, we identify untapped potential for RAG to retrieve knowledge from heterogeneous data sources via data spaces. In this work, we investigate three conceptual integration scenarios between RAG and data spaces. Our findings indicate that a RAG architecture extended with data spaces could provide domain-specific information retrieval across diverse data sources. However, solutions to mitigate unintended information leakage require further consideration.

Description

Hermsen, Felix; Nitz, Lasse; Akbari Gurabi, Mehdi; Matzutt, Roman; Mandal, Avikarsha (2024): On Data Spaces for Retrieval Augmented Generation. INFORMATIK 2024. DOI: 10.18420/inf2024_57. Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 978-3-88579-746-3. pp. 701-707. GRANITE – EJEA: Europe meets Japan: Intercultural Workshop on Data Sovereignty and Generative AI: Applications, Design, Social, Ethical and Technological Impact. Wiesbaden. 24-26 September 2024.
