Browsing by keyword "Data quality assessment"
- Text document: Approaches for Automated Data Quality Analysis: Syntactic and Semantic Assessment (INFORMATIK 2022, 2022) Ahiagble, Agbodzea Pascal; Stein, Hannah
  Data quality significantly influences data usability and plays an important role in data trading. This paper presents a data quality analysis (DQA) of data tables on two levels. The first, the syntactic level, concerns the structure of the elements within the database; the second, the semantic level, concerns the relationship between the elements in the database and the "real world". Based on a literature review, the most relevant data quality criteria and corresponding metrics were derived. Subsequently, a service for automated syntactic DQA is designed and implemented, based on heuristics, a data-centric approach, and the unsupervised machine learning clustering algorithm DBSCAN. In the next step, an automated semantic DQA service is designed and implemented as well. The approach is used to examine data tables, for example, for missing relevant columns (i.e., semantic completeness). The services' output is a data quality index, which is derived from the automated analysis of various data quality criteria. This enables the assessment of data quality as well as the detection of potential for improving quality and thus increasing the value of tradeable data.
- Text document: Prescriptive and descriptive quality metrics for the quality assessment of operational data (INFORMATIK 2022, 2022) Viedt, Isabell; Mädler, Jonathan; Khaydarov, Valentin; Urbas, Leon
  In the process industry, data-driven and hybrid modeling approaches are increasingly popular for process monitoring, optimization, and control. The major problem with process data is that the data collected in process plants during operation, although available in vast amounts, may be low in information content. The collected data usually represent certain operating points, while anomalies, ramp-up, and shut-down are rare occurrences and are therefore seldom covered. Because of its possibly low quality, using such data might lead to inadequate model coverage and low overall model performance. Data quality assessment prior to modeling is crucial because it allows model quality to be estimated before model development. Therefore, this paper discusses prescriptive and descriptive metrics for the quality assessment of process data and their potential application in the quality assurance of data-driven and hybrid models. In later application, this approach will support users in their choice of modeling approach.