Assessing Large Language Models in the Agricultural Sector: A Comprehensive Analysis Utilizing a Novel Synthetic Benchmark Dataset

Kästing, Marvin; Hänig, Christian

dc.contributor.author	Kästing, Marvin
dc.contributor.author	Hänig, Christian
dc.contributor.editor	Klein, Maike
dc.contributor.editor	Krupka, Daniel
dc.contributor.editor	Winter, Cornelia
dc.contributor.editor	Gergeleit, Martin
dc.contributor.editor	Martin, Ludger
dc.date.accessioned	2024-10-21T18:24:12Z
dc.date.available	2024-10-21T18:24:12Z
dc.date.issued	2024
dc.description.abstract	This paper provides a comprehensive study of Large Language Models (LLMs) for question-answering and information retrieval tasks within the agricultural domain. We introduce the novel benchmark dataset BVL QA Corpus 2024 specifically designed to thoroughly evaluate both commercial and non-commercial LLMs in agricultural contexts. Using LLMs, we generate question-answer pairs from paragraphs extracted from domain-specific agricultural documents. Leveraging this newly developed benchmark dataset, we assess a selection of LLMs using standard metrics. Additionally, we develop a prototype Retrieval-Augmented Generation (RAG) system tailored to the agricultural sector. This system is then compared to baseline evaluations to determine the degree of alignment between actual performance and initial upper limit estimations. Our empirical analysis demonstrates that RAG systems outperform baseline LLMs across all metrics.	en
dc.identifier.doi	10.18420/inf2024_113
dc.identifier.isbn	978-3-88579-746-3
dc.identifier.pissn	1617-5468
dc.identifier.uri	https://dl.gi.de/handle/20.500.12116/45085
dc.language.iso	en
dc.publisher	Gesellschaft für Informatik e.V.
dc.relation.ispartof	INFORMATIK 2024
dc.relation.ispartofseries	Lecture Notes in Informatics (LNI) - Proceedings, Volume P-352
dc.subject	Large Language Models
dc.subject	Retrieval-Augmented Generation
dc.subject	Agricultural Information Retrieval
dc.subject	Benchmark Dataset
dc.title	Assessing Large Language Models in the Agricultural Sector: A Comprehensive Analysis Utilizing a Novel Synthetic Benchmark Dataset	en
dc.type	Text/Conference Paper
gi.citation.endPage	1286
gi.citation.publisherPlace	Bonn
gi.citation.startPage	1279
gi.conference.date	24.-26. September 2024
gi.conference.location	Wiesbaden
gi.conference.sessiontitle	KoLaZ-24-Kolloquium Landwirtschaft der Zukunft 2024: Digitale Souveränität in der Landwirtschaft, der Lebensmittelkette und dem ländlichen Raum: Trotz, mit oder durch KI?

Files

Original bundle

Name: Kaesting_Haenig_Assessing_Large_Language_Models.pdf
Size: 420.44 KB
Format: Adobe Portable Document Format

Collections

P352 - INFORMATIK 2024 - Lock in or log out? Wie digitale Souveränität gelingt