Conference paper

Exploring Existing Tools for Managing Different Types of Research Data

Document type

Text/Conference Paper

Date

2024

Publisher

Gesellschaft für Informatik e.V.

Abstract

Data management is important for the reproducibility of scientific research. One important aspect of data management is version control. In software development, version control tools like Git are commonly used to track source code changes and releases, reproduce earlier versions, find defects, and simplify their repair. In scientific research, scientists often have to manage large amounts of data, while also trying to achieve reproducibility of results and wanting to identify and repair defects in the data. Version control software like Git is specialized for managing source code and other textual files, making it often unsuitable for managing other types of data. This creates a need for version control tools specialized for dealing with research data. This paper establishes requirements for version control tools for research data and evaluates Git Large File Storage, Neptune, Pachyderm, DVC, and Snowflake according to those requirements. We found that none of the evaluated tools fulfill all of our requirements, but we still recommend DVC, Git LFS, and Pachyderm for the use cases they do support.

Description

Freund, Adrian; Hajiabadi, Hamideh; Koziolek, Anne (2024): Exploring Existing Tools for Managing Different Types of Research Data. INFORMATIK 2024. DOI: 10.18420/inf2024_189. Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 978-3-88579-746-3. pp. 2181-2193. Hochschule 2034 / HS2034. Wiesbaden. 24-26 September 2024
