Title: Rightinsight: open source architecture for data science
Contributors: Bulut, Ahmet; Ritter, Norbert; Henrich, Andreas; Lehner, Wolfgang; Thor, Andreas; Friedrich, Steffen; Wingerath, Wolfram
Dates: 2017-06-30; 2017-06-30; 2015
ISBN: 978-3-88579-636-7
ISSN: 1617-5468
Language: en
Type: Text/Conference Paper

Abstract: We give the details of our reference architecture called RightInsight for enabling rapid data science. RightInsight is based purely on open source technologies. The data is stored in a standard distributed file system such as HDFS. The stored data is processed in Apache Spark, which provides an enhanced Map/Reduce programming environment. Spark's rich and powerful machine learning base makes it easy to construct descriptive, prescriptive, and predictive models. In addition to providing an agile environment for making sense of the data and the data science problem at hand, its Python-based middleware, with a wide array of scientific libraries such as SciPy, NumPy, Matplotlib, and pandas, enables interactive and exploratory data analysis. The ability to ask questions, especially the right questions, and to do what-if analysis is extremely important for any serious data science project. The results of such exploratory analyses are stored in a suitable format that is easily consumable in the Web tier. Using rich JavaScript libraries such as Data-Driven Documents (D3) and Bootstrap, the formatted data can be visualised within a Web browser in creative ways for rapid insight discovery.
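
The flow described in the abstract (Map/Reduce-style processing, a what-if variation, and export in a Web-friendly format) can be illustrated with a minimal sketch. This is not the RightInsight implementation: plain Python from the standard library stands in for Spark and pandas so the example runs without dependencies, and the sample records, field names (`region`, `sales`), and function names are all hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical sample records; in the architecture described above these
# would be read from a distributed file system such as HDFS.
RECORDS = [
    {"region": "north", "sales": 120.0},
    {"region": "south", "sales": 80.0},
    {"region": "north", "sales": 60.0},
    {"region": "south", "sales": 40.0},
    {"region": "east", "sales": 100.0},
]


def map_phase(records):
    """Emit one (key, value) pair per input record."""
    for rec in records:
        yield rec["region"], rec["sales"]


def reduce_phase(pairs):
    """Sum the values that share a key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def what_if(records, region, factor):
    """Return copies of the records with one region's sales scaled by factor."""
    return [
        {**rec, "sales": rec["sales"] * factor} if rec["region"] == region else dict(rec)
        for rec in records
    ]


def to_web_json(totals):
    """Serialise totals as a JSON array of objects, sorted by region."""
    rows = [{"region": k, "total_sales": v} for k, v in sorted(totals.items())]
    return json.dumps(rows, indent=2)


# Baseline aggregation, then a what-if scenario: south sales grow by 50%.
baseline = reduce_phase(map_phase(RECORDS))
scenario = reduce_phase(map_phase(what_if(RECORDS, "south", 1.5)))

# JSON payload that a Web tier could serve to a browser-side chart.
payload = to_web_json(baseline)
```

An array of flat objects is used for the payload because that shape is straightforward to load in a browser and bind to chart elements with a JavaScript library such as D3; the actual storage format used by RightInsight is not specified in the abstract.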