Listing by author "Peukert, Eric"
1 - 10 of 10
- Journal: Big Data Competence Center ScaDS Dresden/Leipzig: Overview and selected research activities (Datenbank-Spektrum: Vol. 19, No. 1, 2019). Rahm, Erhard; Nagel, Wolfgang E.; Peukert, Eric; Jäkel, René; Gärtner, Fabian; Stadler, Peter F.; Wiegreffe, Daniel; Zeckzer, Dirk; Lehner, Wolfgang
- Text document: Big graph analysis by visually created workflows (BTW 2019, 2019). Rostami, M. Ali; Peukert, Eric; Wilke, Moritz; Rahm, Erhard. Abstract: The analysis of large graphs has received considerable attention recently, but current solutions are typically hard to use. In this demonstration paper, we report on an effort to improve the usability of the open-source system Gradoop for processing and analyzing large graphs. This is achieved by integrating Gradoop into the popular open-source software KNIME to visually create graph analysis workflows without the need for coding. We outline the integration approach and discuss what will be demonstrated.
- Journal: BIGGR: Bringing Gradoop to Applications (Datenbank-Spektrum: Vol. 19, No. 1, 2019). Rostami, M. Ali; Kricke, Matthias; Peukert, Eric; Kühne, Stefan; Wilke, Moritz; Dienst, Steffen; Rahm, Erhard
- Conference paper: Comparing similarity combination methods for schema matching (INFORMATIK 2010. Service Science – Neue Perspektiven für die Informatik. Band 1, 2010). Peukert, Eric; Maßmann, Sabine; König, Kathleen. Abstract: A recurring manual task in data integration or ontology alignment is finding mappings between complex schemas. To reduce the manual effort, many matching algorithms for semi-automatically computing mappings have been introduced. Over the last decade it has turned out that a combination of matching algorithms often improves mapping quality. Many possible combination methods can be found in the literature, each promising good result quality for a specific domain of schemas. We briefly introduce the rationale of each strategy, then evaluate the most commonly used methods on a number of mapping tasks and try to find the most robust strategy that behaves well across all given tasks. (A toy sketch of such score combination follows after this listing.)
- Text document: Graph Data Transformations in Gradoop (BTW 2019, 2019). Kricke, Matthias; Peukert, Eric; Rahm, Erhard. Abstract: The analysis of graph data using graph database and distributed graph processing systems has gained significant interest. However, relatively little effort has been devoted to preparing graph data for analysis, in particular to transforming and integrating data from different sources. To support such ETL processes for graph data, we investigate transformation operations for property graphs managed by the distributed platform Gradoop. We also provide initial results of a runtime evaluation of the proposed graph data transformations. (A toy sketch of one such transformation follows after this listing.)
- Text document: KOBRA: Praxisfähige lernbasierte Verfahren zur automatischen Konfiguration von Business-Regeln in Duplikaterkennungssystemen (INFORMATIK 2020, 2021). Braun, Simone; Alkhouri, Georges; Peukert, Eric. Abstract: Detecting, searching for, and consolidating duplicates in customer and business-partner data, so-called identity resolution, is a prerequisite for successful customer relationship management and customer experience management, but also for risk management aimed at minimizing fraud risk and ensuring regulatory compliance, and for many further use cases. These systems are highly complex, however, and must be adapted individually to customer-specific requirements. Learning-based methods offer great potential for automating this adaptation. In this paper, we present learning-based methods for automatically configuring business rules in duplicate detection systems that are practical for an SME. We developed facilities that let domain users adapt and configure the match system to individual business rules (e.g., relocation detection, blocklist matching) in an example-driven way. The developed methods were evaluated and integrated into a prototype solution. We were able to show that our machine-learning method improved the business rules created by a domain expert for the duplicate detection system "identity", while also reducing the time required to do so. (A toy sketch of example-driven rule configuration follows after this listing.)
- Journal article: ScaDS Dresden/Leipzig – A competence center for collaborative big data research (it - Information Technology: Vol. 60, No. 5-6, 2018). Jäkel, René; Peukert, Eric; Nagel, Wolfgang E.; Rahm, Erhard. Abstract: The efficient and intelligent handling of large, often distributed and heterogeneous data sets increasingly determines scientific and economic competitiveness in most application areas. Mobile applications, social networks, multimedia collections, sensor networks, data-intensive scientific experiments, and complex simulations nowadays generate a huge data deluge. Processing and analyzing these data sets with innovative methods opens up new opportunities for their exploitation and for new insights. However, the resulting resource requirements usually exceed the capabilities of state-of-the-art methods for the acquisition, integration, analysis, and visualization of data; these challenges are summarized under the term big data. ScaDS Dresden/Leipzig, a Germany-wide competence center for collaborative big data research, bundles efforts to realize data-intensive applications for a wide range of uses in science and industry. In this article, we present the basic concept of the competence center and give insights into some of its research topics.
- Conference paper: A smart link infrastructure for integrating and analyzing process data (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015). Peukert, Eric; Wartner, Christian; Rahm, Erhard. Abstract: An often forgotten asset of many companies is internal process data. From the everyday processes that run within companies, huge amounts of such data are collected in different software systems. However, the value of analyzing this data holistically is often left unexploited, mainly due to the inherent heterogeneity of the different data sources and the missing flexibility of existing approaches to integrate additional sources in an ad-hoc fashion. In this paper, the Smart Link Infrastructure is introduced. It offers tools that enable data integration and linking to support a holistic analysis of process data. The infrastructure consists of easy-to-use components for matching schemas, linking entities, storing entities as a graph, and executing pattern queries on top of the integrated data. We showcase the value of the presented integration approach with two real-world use cases, one based on knowledge management in manufacturing from the LinkedDesign project and another based on an agile software development process. (A toy sketch of a pattern query over such a graph follows after this listing.)
- Journal article: The First Data Science Challenge at BTW 2017 (Datenbank-Spektrum: Vol. 17, No. 3, 2017). Hirmer, Pascal; Waizenegger, Tim; Falazi, Ghareeb; Abdo, Majd; Volga, Yuliya; Askinadze, Alexander; Liebeck, Matthias; Conrad, Stefan; Hildebrandt, Tobias; Indiono, Conrad; Rinderle-Ma, Stefanie; Grimmer, Martin; Kricke, Matthias; Peukert, Eric. Abstract: The 17th Conference on Database Systems for Business, Technology, and Web (BTW 2017) of the German Informatics Society (GI) took place in March 2017 at the University of Stuttgart in Germany. A Data Science Challenge was organized for the first time at a BTW conference, by the University of Stuttgart and the sponsor IBM. We challenged the participants to solve a data analysis task within one month and to present their results at the BTW. In this article, we give an overview of the organizational process surrounding the challenge and introduce the task that the participants had to solve. In the subsequent sections, the final four competitor groups describe their approaches and results.
- Conference paper: Webclipse – Rich Internet Applications auf Grundlage serverseitiger Plugins (Workshop Gemeinschaften in Neuen Medien (GeNeMe) 2007, 2007). Lorz, Alexander; Peukert, Eric; Moncsek, Andy
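
The combination strategies compared in "Comparing similarity combination methods for schema matching" operate on the similarity matrices produced by individual matchers. The following is a minimal sketch of that idea with made-up matcher scores and invented helper functions (`combine`, `select_mapping`); it is not code from the paper:

```python
# Sketch: combining similarity matrices from several matchers and
# deriving a schema mapping by thresholding. Toy example only; the
# matcher scores below are made up, not taken from the paper.
import numpy as np

# Rows: source schema elements, columns: target schema elements.
name_matcher = np.array([[0.9, 0.1], [0.2, 0.3]])   # label string similarity
type_matcher = np.array([[0.8, 0.4], [0.4, 0.8]])   # data-type compatibility

def combine(matrices, strategy="average", weights=None):
    """Aggregate per-matcher similarity matrices into one matrix."""
    stacked = np.stack(matrices)
    if strategy == "average":
        return stacked.mean(axis=0)
    if strategy == "max":
        return stacked.max(axis=0)
    if strategy == "weighted":
        w = np.asarray(weights).reshape(-1, 1, 1)
        return (stacked * w).sum(axis=0) / w.sum()
    raise ValueError(strategy)

def select_mapping(sim, threshold=0.6):
    """Keep every element pair whose combined score reaches the threshold."""
    return [(i, j, sim[i, j])
            for i in range(sim.shape[0])
            for j in range(sim.shape[1])
            if sim[i, j] >= threshold]

combined = combine([name_matcher, type_matcher], "weighted", weights=[0.7, 0.3])
print(select_mapping(combined))
```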
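"Graph Data Transformations in Gradoop" investigates transformation operations for property graphs. One typical ETL-style operation is extracting a vertex property into a vertex of its own. The sketch below illustrates this in plain Python with an invented `property_to_vertex` helper; Gradoop's actual operators are implemented in Java on Apache Flink and are not reproduced here:

```python
# Sketch of a "property to vertex" transformation: for every vertex
# carrying a given property, create (or reuse) a vertex for the value
# and link the original vertex to it. Illustrative stand-in only.
from dataclasses import dataclass, field

@dataclass
class Vertex:
    vid: str
    label: str
    props: dict = field(default_factory=dict)

@dataclass
class Edge:
    src: str
    dst: str
    label: str

def property_to_vertex(vertices, edges, key, new_label, edge_label):
    """Move property `key` out of each vertex into a dedicated vertex."""
    value_vertices = {}
    for v in list(vertices):          # iterate over a copy while appending
        if key not in v.props:
            continue
        value = v.props.pop(key)
        if value not in value_vertices:
            value_vertices[value] = Vertex(f"{new_label}:{value}", new_label,
                                           {"value": value})
            vertices.append(value_vertices[value])
        edges.append(Edge(v.vid, value_vertices[value].vid, edge_label))
    return vertices, edges

people = [Vertex("p1", "Person", {"name": "Ada", "city": "Leipzig"}),
          Vertex("p2", "Person", {"name": "Max", "city": "Leipzig"})]
vs, es = property_to_vertex(people, [], "city", "City", "livesIn")
print([v.vid for v in vs], [(e.src, e.dst, e.label) for e in es])
```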
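The KOBRA entry describes configuring match rules from labeled examples. A minimal sketch of that idea follows, assuming a hypothetical rule model (a weighted sum of attribute similarities plus a learned threshold) and toy records; the business rules of the real "identity" system are considerably richer:

```python
# Sketch of example-driven rule configuration for duplicate detection:
# score record pairs with weighted attribute similarities and pick the
# decision threshold that best fits labeled examples. Toy data only.
from difflib import SequenceMatcher

def sim(a, b):
    """Simple string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def score(rec1, rec2, weights):
    """Weighted sum of per-attribute similarities."""
    return sum(w * sim(rec1[f], rec2[f]) for f, w in weights.items())

def learn_threshold(pairs, labels, weights, candidates):
    """Pick the threshold with the highest accuracy on labeled pairs."""
    def accuracy(t):
        preds = [score(a, b, weights) >= t for a, b in pairs]
        return sum(p == l for p, l in zip(preds, labels)) / len(labels)
    return max(candidates, key=accuracy)

weights = {"name": 0.6, "street": 0.4}
pairs = [({"name": "Anna Meier", "street": "Hauptstr. 1"},
          {"name": "A. Meier",   "street": "Hauptstrasse 1"}),
         ({"name": "Anna Meier", "street": "Hauptstr. 1"},
          {"name": "Jan Krause", "street": "Ringweg 9"})]
labels = [True, False]   # duplicate / non-duplicate, as a user would label
t = learn_threshold(pairs, labels, weights, [i / 10 for i in range(1, 10)])
print("learned threshold:", t)
```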
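Finally, the Smart Link Infrastructure entry mentions storing integrated entities as a graph and executing pattern queries on top of it. The sketch below conveys the flavor of such a query using an invented in-memory triple store and made-up entity names; it does not reproduce the infrastructure's actual components:

```python
# Sketch: a tiny in-memory triple store with a two-hop pattern query,
# illustrating "store entities as a graph, then run pattern queries".
# All entity and relation names are invented for illustration.
triples = [
    ("alice",         "worksOn",   "design_doc_42"),
    ("design_doc_42", "describes", "machine_7"),
    ("bob",           "worksOn",   "ticket_99"),
    ("ticket_99",     "describes", "machine_7"),
]

def match(pattern, triples):
    """Match a list of (s, p, o) patterns; strings like '?x' are variables."""
    bindings = [{}]
    for s, p, o in pattern:
        new = []
        for b in bindings:
            for ts, tp, to in triples:
                trial = dict(b)
                ok = True
                for var, val in ((s, ts), (p, tp), (o, to)):
                    if var.startswith("?"):
                        if trial.setdefault(var, val) != val:
                            ok = False   # variable already bound differently
                    elif var != val:
                        ok = False       # constant does not match
                if ok:
                    new.append(trial)
        bindings = new
    return bindings

# Which people work on artifacts that describe machine_7?
query = [("?person", "worksOn", "?artifact"),
         ("?artifact", "describes", "machine_7")]
print([b["?person"] for b in match(query, triples)])
```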