LAUDATIO-repository.org - A modular approach of an open access research data repository for a discipline

Zielke, Dennis

dc.contributor.author	Zielke, Dennis
dc.contributor.editor	Plödereder, E.
dc.contributor.editor	Grunske, L.
dc.contributor.editor	Schneider, E.
dc.contributor.editor	Ull, D.
dc.date.accessioned	2017-07-26T10:58:46Z
dc.date.available	2017-07-26T10:58:46Z
dc.date.issued	2014
dc.description.abstract	This presentation introduces the LAUDATIO repository, a repository for research data in historical corpus linguistics. Its focus is technical: the requirements considered in developing a repository infrastructure based on Fedora. Research data collections are expensive to compile in terms of time, money and knowledge. Sharing these collections with a broader scientific community for re-use and re-evaluation can reduce researchers' effort. A common approach to creating sustainable and shareable data is to define and apply existing and future open standards for data creation and maintenance. Implementing and using digital repositories enables the management and distribution of such data. However, large collections of linguistic data, such as corpora of deeply annotated text, pose unique challenges to repository design, mainly because of the complexity of the applied data models. The presentation describes a repository for historical corpus linguistic data that is based on community-specific requirements and distinct user scenarios. Its main focus is on German historical texts, including all dialects, from periods ranging from the 9th to the 19th century. The LAUDATIO repository is being developed in the course of a research data infrastructure project funded by the German Research Foundation from 2011 until 2014 at Humboldt-Universität zu Berlin, in a cooperation between the Computer and Media Service, the Departments of Historical Linguistics and Corpus Linguistics at HU Berlin, and INRIA, France (LAUDATIO Repository: http://www.laudatio-repository.org, last accessed 26.06.2014). The main objective of the technical implementation was to ensure long-term access and preservation as well as the accessibility and (re-)use of the heterogeneous research data.
The requirements specification is essentially based on criteria such as the use of a complex data model for linguistic research data [Od13], the management of the data, their structural collocation, their searchability, and their long-term availability [Zi14]. Exemplary technical requirements regarding these aspects, and the corresponding implemented features of the repository infrastructure, include: compatibility with a complex data model; data indexing and search; data versioning; registration and management of Persistent Identifiers (PID); a graphical user interface (GUI) for data presentation; and search and visualization with the help of ANNIS. Fedora Commons, an open-source repository framework that is well established in the repository community, turned out to be a well-suited solution for storing and managing linguistic research data given the defined requirements. In the digital humanities domain Fedora is a widespread repository solution and is, for instance, recommended and used by the CLARIN research infrastructure and its participating institutions. The LAUDATIO repository infrastructure is furthermore based on generalizable software modules such as the graphical user interface, the data exchange module between the research data and the Fedora REST interface, and an indexed, faceted search based on Lucene-based technology. The repository follows international guidelines on the organizational, legal and technical aspects of research data repositories, such as the Data Seal of Approval. The presentation also aims to open a discussion on the re-usability of the LAUDATIO repository infrastructure modules for other initiatives providing research data infrastructures. References [Od13] Odebrecht, C.: Modellierung von linguistischen Forschungsdaten. Korpuslinguistik Kolloquium, 13.11.2013, Berlin.	en
dc.identifier.isbn	978-3-88579-626-8
dc.identifier.pissn	1617-5468
dc.language.iso	en
dc.publisher	Gesellschaft für Informatik e.V.
dc.relation.ispartof	Informatik 2014
dc.relation.ispartofseries	Lecture Notes in Informatics (LNI) - Proceedings, Volume P-232
dc.title	LAUDATIO-repository.org - A modular approach of an open access research data repository for a discipline	en
dc.type	Text/Conference Paper
gi.citation.endPage	1668
gi.citation.publisherPlace	Bonn
gi.citation.startPage	1667
gi.conference.date	22-26 September 2014
gi.conference.location	Stuttgart

Files

Original bundle

Name: 1667.pdf
Size: 53.8 KB
Format: Adobe Portable Document Format

Collections

P232 - INFORMATIK 2014