The HARPS-N archive through a Cassandra, NoSQL database suite?
Date Issued
2016
Author(s)
Abstract
The TNG-INAF is developing the science archive for the WEAVE instrument. The underlying architecture of the archive is based on a non-relational database, more precisely on an Apache Cassandra cluster, which uses NoSQL technology. To test and validate this architecture, we created a local archive populated with all the HARPS-N spectra collected at the TNG since the instrument's start of operations in mid-2012, and developed tools for the analysis of this data set. The HARPS-N data set is two orders of magnitude smaller than that of WEAVE, but we want to demonstrate the ability to walk through a complete data set and produce scientific output as valuable as that produced by an ordinary pipeline, without directly accessing the FITS files. The analytics is done with Apache Solr and Spark and on a relational PostgreSQL database. As an example, we produce observables such as metallicity indexes for the targets in the archive and compare the results with those from the regular HARPS-N data reduction software. The aim of this experiment is to explore the viability of a high-availability cluster and distributed NoSQL database as a platform for complex scientific analytics on a large data set, which will then be ported to the WEAVE Archive System (WAS) that we are developing for the WEAVE multi-object fiber spectrograph.
Coverage
Software and Cyberinfrastructure for Astronomy IV
All editors
Chiozzi, Gianluca; Guzman, Juan C.
Series
Volume
9913
Start page
99132A
Conference
Software and Cyberinfrastructure for Astronomy IV
Conference place
Edinburgh, United Kingdom
Conference date
26 June - 1 July 2016
ISSN Identifier
0277-786X
ADS Bibcode
2016SPIE.9913E..2AM
Rights
open.access
File(s)
Name
99132A.pdf
Description
Publisher's PDF
Size
515.74 KB
Format
Adobe PDF
Checksum (MD5)
81a7eb7b53375bf257cb008b875010db