Browsing by keyword "metadata"
- Conference paper: Adapting RDMO for the Efficient Management of Educational Research Data (21. Fachtagung Bildungstechnologien (DELFI), 2023). Kiesler, Natalie; Schiffner, Daniel; Nieder-Vahrenholz, Axel. Research data management has become a core element of research projects in educational technology research. Tools like the Research Data Management Organiser (RDMO) support researchers by gathering metadata of (planned) projects, studies and their output in a structured manner. RDMO further facilitates machine-actionable components so that metadata and information can be exchanged by connecting metadata, repositories, and institutions. In this demo, we present a use case of RDMO at a German research data center. Therein, we adapt RDMO’s export plugin so that it generates JSON in the data structure of the German Network of Educational Research Data, and thus allows the export of metadata from RDMO to research data centers. This proof of concept presents an application scenario of RDMO and shows how it can contribute to an improved data management process in educational technology research.
- Conference paper: Fast Approximate Discovery of Inclusion Dependencies (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017). Kruse, Sebastian; Papenbrock, Thorsten; Dullweber, Christian; Finke, Moritz; Hegner, Manuel; Zabel, Martin; Zöllner, Christian; Naumann, Felix. Inclusion dependencies (INDs) are relevant to several data management tasks, such as foreign key detection and data integration, and their discovery is a core concern of data profiling. However, n-ary IND discovery is computationally expensive, so existing algorithms often perform poorly on complex datasets. To this end, we present Faida, the first approximate IND discovery algorithm. Faida combines probabilistic and exact data structures to approximate the INDs in relational datasets. In fact, Faida is guaranteed to find all INDs; false positives might occur only with a low probability due to the approximation. This slight inaccuracy is traded for significantly increased performance. In our evaluation, we show that Faida scales to very large datasets and outperforms the state-of-the-art algorithm by a factor of up to six in terms of runtime without reporting any false positives. This shows that Faida strikes a good balance between efficiency and correctness.
- Conference paper: A Hybrid Approach for Efficient Unique Column Combination Discovery (Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017). Papenbrock, Thorsten; Naumann, Felix. Unique column combinations (UCCs) are groups of attributes in relational datasets that contain no value entry more than once. Hence, they indicate keys and serve data management tasks, such as schema normalization, data integration, and data cleansing. Because the unique column combinations of a particular dataset are usually unknown, UCC discovery algorithms have been proposed to find them. All previous such discovery algorithms are, however, inapplicable to datasets of typical real-world size, e.g., datasets with more than 50 attributes and a million records. We present the hybrid discovery algorithm HyUCC, which uses the same discovery techniques as the recently proposed functional dependency discovery algorithm HyFD: a hybrid combination of fast approximation techniques and efficient validation techniques. With it, the algorithm discovers all minimal unique column combinations in a given dataset. HyUCC not only outperforms all existing approaches, it also scales to much larger datasets.
- Conference paper: The InsightsNet Climate Change Corpus (ICCC) (BTW 2023, 2023). Bartsch, Sabine; Duan, Changxu; Tan, Sherry; Volkanovska, Elena; Stille, Wolfgang. The discourse on climate change has become a centerpiece of public debate, thereby creating a pressing need to analyze the multitude of messages created by the participants in this communication process. In addition to text, messages on this topic are communicated through images, videos, tables and other data objects that are embedded within a document and accompany the text. This paper presents the process of building the InsightsNet Climate Change Corpus (ICCC), a multimodal corpus on the topic of climate change, using NLP tools to enrich corpus metadata. The resulting dataset lends itself to the exploration of the interplay between the various modalities that constitute the discourse on climate change.
- Conference paper: Knowledge and Metadata Integration for Warehousing Complex Data (Information systems technology and its applications – 6th international conference – ISTA 2007, 2007). Ralaivao, Jean-Christian; Darmont, Jérôme. With the ever-growing availability of so-called complex data, especially on the Web, decision-support systems such as data warehouses must store and process data that are not only numerical or symbolic. Warehousing and analyzing such data requires the joint exploitation of metadata and domain-related knowledge, which must thereby be integrated. In this paper, we survey the types of knowledge and metadata that are needed for managing complex data, discuss the issue of knowledge and metadata integration, and propose a CWM-compliant integration solution that we incorporate into an XML complex data warehousing framework we previously designed.
- Conference paper: non-linear video (Workshop Audiovisuelle Medien WAM 2010, 2010). Seeliger, Robert; Raeck, Christian; Arbanowski, Stefan. Non-linear video is a technology developed by the Fraunhofer Institute FOKUS which makes video content an interactive experience. Non-linear video gives the viewer the opportunity to interact with objects that are part of the video and access supplemental information. On demand, multimedia content is linked with related information. Interactive, time-independent navigation opens new ways to experience video content.