
Computer-Assisted Short Answer Grading Using Large Language Models and Rubrics

dc.contributor.author: Metzler, Tim
dc.contributor.author: Plöger, Paul G.
dc.contributor.author: Hees, Jörn
dc.contributor.editor: Klein, Maike
dc.contributor.editor: Krupka, Daniel
dc.contributor.editor: Winter, Cornelia
dc.contributor.editor: Gergeleit, Martin
dc.contributor.editor: Martin, Ludger
dc.date.accessioned: 2024-10-21T18:24:13Z
dc.date.available: 2024-10-21T18:24:13Z
dc.date.issued: 2024
dc.description.abstract: Grading student answers and providing feedback are essential yet time-consuming tasks for educators. Recent advancements in Large Language Models (LLMs), including ChatGPT, Llama, and Mistral, have paved the way for automated support in this domain. This paper investigates the efficacy of instruction-following LLMs in adhering to predefined rubrics for evaluating student answers and delivering meaningful feedback. Leveraging the Mohler dataset and a custom German dataset, we evaluate various models, from commercial ones like ChatGPT to smaller open-source options like Llama, Mistral, and Command R. Additionally, we explore the impact of temperature parameters and techniques such as few-shot prompting. Surprisingly, while few-shot prompting brings grading accuracy closer to the ground truth, it introduces model inconsistency. Furthermore, some models exhibit non-deterministic behavior even at near-zero temperature settings. Our findings highlight the importance of rubrics in enhancing the interpretability of model outputs and fostering consistency in grading practices.
dc.identifier.doi: 10.18420/inf2024_121
dc.identifier.isbn: 978-3-88579-746-3
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/45094
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: INFORMATIK 2024
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-352
dc.subject: Natural Language Processing
dc.subject: Automatic Short Answer Grading
dc.subject: Large Language Models
dc.subject: Rubrics
dc.subject: Mistral
dc.subject: ChatGPT
dc.subject: Llama
dc.title: Computer-Assisted Short Answer Grading Using Large Language Models and Rubrics
dc.type: Text/Conference Paper
gi.citation.endPage: 1393
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 1383
gi.conference.date: 24.-26. September 2024
gi.conference.location: Wiesbaden
gi.conference.sessiontitle: AI@WORK
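
Illustration: the abstract above describes rubric-guided grading with instruction-following LLMs, few-shot prompting, and near-zero temperature settings. The sketch below is a minimal, hypothetical example of that general setup, not the authors' actual pipeline; the model name, rubric text, question, and answers are placeholders invented for illustration.

    # Hypothetical sketch of rubric-based short answer grading with a chat LLM.
    # Rubric, question, answers, and scoring scale are illustrative placeholders,
    # not taken from the paper or the Mohler dataset.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    rubric = (
        "Award 5 points if the answer gives both the definition and an example; "
        "3 points if only the definition is given; 0 points otherwise."
    )
    question = "What is a hash table?"
    reference_answer = "A data structure that maps keys to values using a hash function."
    student_answer = "It stores key-value pairs and looks them up with a hash function."

    # One worked example used for few-shot prompting (illustrative).
    few_shot = [
        {"role": "user", "content": (
            "Question: What is recursion?\n"
            "Reference answer: A function calling itself.\n"
            "Student answer: When a function calls itself.\n"
            f"Rubric: {rubric}\n"
            "Return a score and one sentence of feedback.")},
        {"role": "assistant", "content": "Score: 5/5. The answer matches the reference definition."},
    ]

    response = client.chat.completions.create(
        model="gpt-4o-mini",   # placeholder model name
        temperature=0.0,       # near-zero temperature, as discussed in the abstract
        messages=[
            {"role": "system", "content": "You are a grader. Follow the rubric strictly."},
            *few_shot,
            {"role": "user", "content": (
                f"Question: {question}\n"
                f"Reference answer: {reference_answer}\n"
                f"Student answer: {student_answer}\n"
                f"Rubric: {rubric}\n"
                "Return a score and one sentence of feedback.")},
        ],
    )
    print(response.choices[0].message.content)

The same prompt structure (system instruction, few-shot example, rubric, and student answer) can be sent to open-source chat models such as Llama, Mistral, or Command R through their respective inference APIs.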

Files

Original bundle
Name: Metzler_et_al_Computer-Assisted_Short_Answer_Grading.pdf
Size: 441.38 KB
Format: Adobe Portable Document Format