Conference paper

Rightinsight: open source architecture for data science


Document type

Text/Conference Paper

Date

2015

Authors

Bulut, Ahmet

Publisher

Gesellschaft für Informatik e.V.

Abstract

We describe RightInsight, our reference architecture for enabling rapid data science. RightInsight is based purely on open source technologies. The data is stored in a standard distributed file system such as HDFS and processed in Apache Spark, which provides an enhanced MapReduce programming environment. Its rich and powerful machine learning base makes it easy to construct descriptive, predictive, and prescriptive models. In addition to providing an agile environment for making sense of the data and the data science problem at hand, its Python-based middleware, with a wide array of scientific libraries such as SciPy, NumPy, Matplotlib, and pandas, enables interactive and exploratory data analysis. The ability to ask questions, especially the right questions, and to do what-if analysis is extremely important for any serious data science project. The results of such exploratory analyses are stored in a format that the Web tier can consume easily. Using rich JavaScript libraries such as Data-Driven Documents (D3) and Bootstrap, the formatted data can be visualised within a Web browser in creative ways for rapid insight discovery.
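The data flow outlined in the abstract (map/reduce-style processing, a what-if analysis, and a Web-friendly output format) can be illustrated with a minimal sketch. This is not code from the paper: plain Python stands in for Spark so that the sketch runs without a cluster, and the records, field names, and function names (`map_phase`, `reduce_phase`, `what_if`, `to_web_json`) are all hypothetical.

```python
import json
from collections import defaultdict
from functools import reduce

# Hypothetical input records; in the described architecture the data
# would be read from a distributed file system such as HDFS.
RECORDS = [
    {"region": "north", "sales": 120.0},
    {"region": "south", "sales": 80.0},
    {"region": "north", "sales": 60.0},
    {"region": "south", "sales": 40.0},
]


def map_phase(records):
    """Emit one (key, (sum, count)) pair per record."""
    return [(r["region"], (r["sales"], 1)) for r in records]


def reduce_phase(pairs):
    """Combine the pairs per key by adding up sums and counts."""
    def combine(acc, pair):
        key, (total, count) = pair
        prev_total, prev_count = acc[key]
        acc[key] = (prev_total + total, prev_count + count)
        return acc

    return dict(reduce(combine, pairs, defaultdict(lambda: (0.0, 0))))


def what_if(records, region, factor):
    """What-if analysis: scale one region's sales, then re-aggregate."""
    adjusted = [
        {**r, "sales": r["sales"] * factor} if r["region"] == region else r
        for r in records
    ]
    return reduce_phase(map_phase(adjusted))


def to_web_json(aggregates):
    """Serialise the aggregates as a JSON array of objects, a shape
    that browser-side charting code can typically consume directly."""
    rows = [
        {"region": key, "total": total, "mean": total / count}
        for key, (total, count) in sorted(aggregates.items())
    ]
    return json.dumps(rows)


if __name__ == "__main__":
    baseline = reduce_phase(map_phase(RECORDS))
    print(to_web_json(baseline))
    print(to_web_json(what_if(RECORDS, "south", 1.5)))
```

In a Spark deployment the map and reduce steps would presumably be expressed against Spark's own API rather than Python lists; the sketch only shows the shape of the computation and of the JSON handed to the Web tier.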

Description

Bulut, Ahmet (2015): Rightinsight: open source architecture for data science. Datenbanksysteme für Business, Technologie und Web (BTW 2015) - Workshopband. Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 978-3-88579-636-7. pp. 151-160. Hamburg. March 2-3, 2015
