
Handling Big Data in Astronomy and Astrophysics: Rich Structured Queries on Replicated Cloud Data with XtreemFS

dc.contributor.author: Enke, Harry
dc.contributor.author: Partl, Adrian
dc.contributor.author: Reinefeld, Alexander
dc.contributor.author: Schintke, Florian
dc.date.accessioned: 2018-01-10T13:18:43Z
dc.date.available: 2018-01-10T13:18:43Z
dc.date.issued: 2012
dc.description.abstract: With recent observational instruments and survey campaigns in astrophysics, efficient analysis of big structured data becomes more and more relevant. While providing good query expressiveness and data analysis capabilities through SQL, off-the-shelf RDBMS are not yet well prepared to handle high-volume scientific data distributed across several nodes, either for fast data ingest or for fast spatial queries. Our SQL query parser and job manager perform query reformulation to spread queries to data nodes, gathering outputs on a head node and providing them again to the shards for subsequent processing steps. We combine this data analysis architecture with the cloud data storage component XtreemFS for automatic data replication to improve availability and access latency. With our solution we perform rich structured data analysis expressed in SQL on large amounts of structured astrophysical data distributed across numerous storage nodes in parallel. The cloud storage virtualization with XtreemFS provides elasticity and reproducibility of scientific analysis tasks through its snapshot capability.
dc.identifier.pissn: 1610-1995
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/11654
dc.publisher: Springer
dc.relation.ispartof: Datenbank-Spektrum: Vol. 12, No. 3
dc.relation.ispartofseries: Datenbank-Spektrum
dc.title: Handling Big Data in Astronomy and Astrophysics: Rich Structured Queries on Replicated Cloud Data with XtreemFS
dc.type: Text/Journal Article
gi.citation.endPage: 181
gi.citation.startPage: 173
