Auflistung nach Schlagwort "Graph Analytics"
1 - 4 von 4
Treffer pro Seite
Sortieroptionen
- KonferenzbeitragThe Big Picture: Understanding large-scale graphs using Graph Grouping with GRADOOP(Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Junghanns, Martin; Petermann, André; Teichmann, Niklas; Rahm, ErhardGraph grouping supports data analysts in decision making based on the characteristics of large-scale, heterogeneous networks containing millions or even billions of vertices and edges. We demonstrate graph grouping with G , a scalable system supporting declarative programs composed from multiple graph operations. Using social network data, we highlight the analytical capabilities enabled by graph grouping in combination with other graph operators. The resulting graphs are visualized and visitors are invited to either modify existing or write new analytical programs. G is implemented on top of Apache Flink, a state-of-the-art distributed dataflow framework, and thus allows us to scale graph analytical programs across multiple machines. In the demonstration, programs can either be executed locally or remotely on our research cluster.
- KonferenzbeitragDistributed Grouping of Property Graphs with GRADOOP(Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Junghanns, Martin; Petermann, André; Rahm, ErhardProperty graphs are an intuitive way to model, analyze and visualize complex relationships among heterogeneous data objects, for example, as they occur in social, biological and information networks. These graphs typically contain thousands or millions of vertices and edges and their entire representation can easily overwhelm an analyst. One way to reduce complexity is the grouping of vertices and edges to summary graphs. In this paper, we present an algorithm for graph grouping with support for attribute aggregation and structural summarization by user-defined vertex and edge properties. The algorithm is part of G , an open-source system for graph analytics. G is implemented on top of Apache Flink, a state-of-the-art distributed dataflow framework, and thus allows us to scale graph analytical programs across multiple machines. Our evaluation demonstrates the scalability of the algorithm on real-world and synthetic social network data.
- KonferenzbeitragEfficient Batched Distance and Centrality Computation in Unweighted and Weighted Graphs(Datenbanksysteme für Business, Technologie und Web (BTW 2017), 2017) Then, Manuel; Günnemann, Stephan; Kemper, Alfons; Neumann, ThomasDistance and centrality computations are important building blocks for modern graph databases as well as for dedicated graph analytics systems. Two commonly used centrality metrics are the compute-intense closeness and betweenness centralities, which require numerous expensive shortest distance calculations. We propose batched algorithm execution to run multiple distance and centrality computations at the same time and let them share common graph and data accesses. Batched execution amortizes the high cost of random memory accesses and presents new vectorization potential on modern CPUs and compute accelerators. We show how batched algorithm execution can be leveraged to significantly improve the performance of distance, closeness, and betweenness centrality calculations on unweighted and weighted graphs. Our evaluation demonstrates that batched execution can improve the runtime of these common metrics by over an order of magnitude.
- TextdokumentGraph Sampling with Distributed In-Memory Dataflow Systems(BTW 2021, 2021) Gomez, Kevin; Täschner, Matthias; Rostami, M. Ali; Rost, Christopher; Rahm, ErhardGiven a large graph, graph sampling determines a subgraph with similar characteristics for certain metrics of the original graph. The samples are much smaller thereby accelerating and simplifying the analysis and visualization of large graphs. We focus on the implementation of distributed graph sampling for Big Data frameworks and in-memory dataflow systems such as Apache Spark or Apache Flink and evaluate the scalability of the new implementations. The presented methods will be open source and be integrated into Gradoop, a system for distributed graph analytics.