Assessing Large Language Models in the Agricultural Sector: A Comprehensive Analysis Utilizing a Novel Synthetic Benchmark Dataset
dc.contributor.author | Kästing, Marvin | |
dc.contributor.author | Hänig, Christian | |
dc.contributor.editor | Klein, Maike | |
dc.contributor.editor | Krupka, Daniel | |
dc.contributor.editor | Winter, Cornelia | |
dc.contributor.editor | Gergeleit, Martin | |
dc.contributor.editor | Martin, Ludger | |
dc.date.accessioned | 2024-10-21T18:24:12Z | |
dc.date.available | 2024-10-21T18:24:12Z | |
dc.date.issued | 2024 | |
dc.description.abstract | This paper provides a comprehensive study of Large Language Models (LLMs) for question-answering and information retrieval tasks within the agricultural domain. We introduce the novel benchmark dataset BVL QA Corpus 2024 specifically designed to thoroughly evaluate both commercial and non-commercial LLMs in agricultural contexts. Using LLMs, we generate question-answer pairs from paragraphs extracted from domain-specific agricultural documents. Leveraging this newly developed benchmark dataset, we assess a selection of LLMs using standard metrics. Additionally, we develop a prototype Retrieval-Augmented Generation (RAG) system tailored to the agricultural sector. This system is then compared to baseline evaluations to determine the degree of alignment between actual performance and initial upper limit estimations. Our empirical analysis demonstrates that RAG systems outperform baseline LLMs across all metrics. | en |
dc.identifier.doi | 10.18420/inf2024_113 | |
dc.identifier.isbn | 978-3-88579-746-3 | |
dc.identifier.pissn | 1617-5468 | |
dc.identifier.uri | https://dl.gi.de/handle/20.500.12116/45085 | |
dc.language.iso | en | |
dc.publisher | Gesellschaft für Informatik e.V. | |
dc.relation.ispartof | INFORMATIK 2024 | |
dc.relation.ispartofseries | Lecture Notes in Informatics (LNI) - Proceedings, Volume P-352 | |
dc.subject | Large Language Models | |
dc.subject | Retrieval-Augmented Generation | |
dc.subject | Agricultural Information Retrieval | |
dc.subject | Benchmark Dataset | |
dc.title | Assessing Large Language Models in the Agricultural Sector: A Comprehensive Analysis Utilizing a Novel Synthetic Benchmark Dataset | en |
dc.type | Text/Conference Paper | |
gi.citation.endPage | 1286 | |
gi.citation.publisherPlace | Bonn | |
gi.citation.startPage | 1279 | |
gi.conference.date | 24.-26. September 2024 | |
gi.conference.location | Wiesbaden | |
gi.conference.sessiontitle | KoLaZ-24-Kolloquium Landwirtschaft der Zukunft 2024: Digitale Souveränität in der Landwirtschaft, der Lebensmittelkette und dem ländlichen Raum: Trotz, mit oder durch KI? |
Files
Original bundle
- Name:
- Kaesting_Haenig_Assessing_Large_Language_Models.pdf
- Size:
- 420.44 KB
- Format:
- Adobe Portable Document Format