Meeting the challenges of integrating large and diverse geographic databases

dc.contributor.author: Schäfers, Michael
dc.contributor.author: Lipeck, Udo W.
dc.contributor.editor: Seidl, Thomas
dc.contributor.editor: Ritter, Norbert
dc.contributor.editor: Schöning, Harald
dc.contributor.editor: Sattler, Kai-Uwe
dc.contributor.editor: Härder, Theo
dc.contributor.editor: Friedrich, Steffen
dc.contributor.editor: Wingerath, Wolfram
dc.date.accessioned: 2017-06-30T11:40:44Z
dc.date.available: 2017-06-30T11:40:44Z
dc.date.issued: 2015
dc.description.abstract: Using data matching techniques to identify multiple representations of the same real-world entity is an essential step for all data integration tasks. While matching standard data types like strings or numbers with generic methods is well-studied, approaches for non-standard data have to deal with domain-specific challenges. For geographic databases containing spatial features we face a high degree of diversity in terms of geometric and semantic modeling between data sources. Likewise, complex geometric data types and topological relations require efficient processing. Finally, geodatabases can grow very large if they cover extensive regions or whole countries. In this paper, we present our SimMatching approach for integrating relational geodatabases that meets these challenges. In particular, we study road networks from several data sources. Our iterative algorithm matches semantically equivalent objects based on geometric and semantic attribute similarity measures. Relational similarity helps to solve difficult situations by exploiting the underlying graph structure of road networks: Already confirmed neighbouring matchings improve the similarity value of a given matching. Adaptability to diverse input data is achieved by combining and weighting subsets of similarity measures. A greedy approach and an efficient end-to-end system built upon simple and flexible components outperform previous systems in terms of runtime while showing matching results of high quality. Scalability to large geodatabases is supported by a partitioning framework together with parallel processing. We have experimentally verified our approach with large real-world datasets. [en]
dc.identifier.isbn: 978-3-88579-635-0
dc.identifier.pissn: 1617-5468
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: Datenbanksysteme für Business, Technologie und Web (BTW 2015)
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-241
dc.title: Meeting the challenges of integrating large and diverse geographic databases [en]
dc.type: Text/Conference Paper
gi.citation.endPage: 156
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 137
gi.conference.date: 2-3 March 2015
gi.conference.location: Hamburg

Files

Original bundle
Name: 137.pdf
Size: 262.99 KB
Format: Adobe Portable Document Format