A Classification Framework for Scientific Documents to Support Knowledge Graph Population

dc.contributor.author: Kaplan, Angelika
dc.contributor.author: Keim, Jan
dc.contributor.author: Greiner, Lukas
dc.contributor.author: Koziolek, Anne
dc.contributor.author: Reussner, Ralf
dc.contributor.editor: Feichtinger, Kevin
dc.contributor.editor: Sonnleithner, Lisa
dc.contributor.editor: Hajiabadi, Hamideh
dc.date.accessioned: 2025-02-14T10:03:35Z
dc.date.available: 2025-02-14T10:03:35Z
dc.date.issued: 2025
dc.description.abstract: Research papers are a central communication medium to share new scientific insights and progress and are, nowadays, stored as PDF files. However, little effort is spent on reorganizing information with effective knowledge classification and comprehensive representation during the publication process. In terms of software engineering (SE), those papers are also aligned to software artifacts and research data. Aggregating knowledge and empirical evidence is done with systematic literature studies that tend to be very time-consuming and require a manual inspection of the respective research artifacts (paper- and data-wise). Research knowledge graphs like the Open Research Knowledge Graph (ORKG) aim to contribute to and rethink scholarly communication by providing formats while easing the processing of semantic information. Therefore, ORKG offers templates to summarize and structure a research artifact’s content, providing metadata as well. Based on this, researchers can connect similar papers, reuse replication artifacts, and generate literature studies more easily. However, adding papers to ORKG is still a tedious manual process. Moreover, selecting suitable template formats is challenging and can be highly domain-specific. To support the manual process, we aim to provide an automated classification framework for scientific papers, supporting the knowledge graph population. This framework intends to be flexible, allowing various input data and schemas, so it can be applied and trained in a multitude of research fields in SE. In this paper, we present the concept of the framework’s implementation, and an excerpt of the evaluation result in one of the research subfields in SE, namely software architecture and design. [en]
dc.identifier.doi: 10.18420/se2025-ws-28
dc.identifier.eissn: 2944-7682
dc.identifier.issn: 2944-7682
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/45837
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik, Bonn
dc.relation.ispartof: Software Engineering 2025 – Companion Proceedings
dc.subject: Automated Text Classification
dc.subject: Natural Language Processing
dc.subject: Scientific Documents in Software Engineering
dc.subject: Knowledge Graph Population
dc.subject: Semantic Aspects
dc.title: A Classification Framework for Scientific Documents to Support Knowledge Graph Population [en]
mci.conference.date: 22-28 February 2025
mci.conference.location: Karlsruhe
mci.conference.sessiontitle: 1st Workshop on Research Data Management for and in Software Engineering (RDMxSE’25)
mci.reference.pages: 277-286

Files

Original bundle
Name: B6-3.pdf
Size: 467.2 KB
Format: Adobe Portable Document Format