Do We Need Real Data? - Testing and Training Algorithms with Artificial Geolocation Data

dc.contributor.author: Kaiser, Jan
dc.contributor.author: Bavendiek, Kai
dc.contributor.author: Schupp, Sibylle
dc.contributor.editor: David, Klaus
dc.contributor.editor: Geihs, Kurt
dc.contributor.editor: Lange, Martin
dc.contributor.editor: Stumme, Gerd
dc.date.accessioned: 2019-08-27T12:55:21Z
dc.date.available: 2019-08-27T12:55:21Z
dc.date.issued: 2019
dc.description.abstract: As big data becomes increasingly important, so do algorithms that operate on geolocation data. Privacy requirements and the cost of collecting large sets of geolocation data, however, make it difficult to test those algorithms with real data. Artificially generated data sets therefore present an appealing alternative. This paper explores the use of two types of neural networks as generators of geolocation data and introduces a method based on the Turing Test to determine whether generated geolocation data is indistinguishable from real data. In an extensive evaluation we apply the method to data generated by our own implementation of neural networks as well as the widely used BerlinMOD generator on the one hand, and to the four most prominent data sets of real geolocation data, covering a total of 65 million records, on the other hand. The experiments show that in eleven of twelve cases artificial data sets can be told from real ones. We conclude that, at present, the generators we tested provide no safe replacement for real data. [en]
dc.identifier.doi: 10.18420/inf2019_25
dc.identifier.isbn: 978-3-88579-688-6
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/24972
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: INFORMATIK 2019: 50 Jahre Gesellschaft für Informatik – Informatik für Gesellschaft
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-294
dc.subject: geolocation data
dc.subject: artificial data
dc.subject: data generation
dc.subject: neural networks generators
dc.subject: data quality
dc.title: Do We Need Real Data? - Testing and Training Algorithms with Artificial Geolocation Data [en]
dc.type: Text/Conference Paper
gi.citation.endPage: 218
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 205
gi.conference.date: 23-26 September 2019
gi.conference.location: Kassel
gi.conference.sessiontitle: Data Science

Files

Original bundle
Name: paper3_02.pdf
Size: 1.3 MB
Format: Adobe Portable Document Format