The HARPS-N archive through a Cassandra, NoSQL database suite?

The TNG-INAF is developing the science archive for the WEAVE instrument. The underlying architecture of the archive is based on a non-relational database, more precisely on an Apache Cassandra cluster, which uses NoSQL technology. To test and validate this architecture, we created a local archive populated with all the HARPS-N spectra collected at the TNG since the instrument's start of operations in mid-2012, and developed tools for analyzing this data set. The HARPS-N data set is two orders of magnitude smaller than that of WEAVE, but we want to demonstrate the ability to walk through a complete data set and produce scientific output as valuable as that produced by an ordinary pipeline, without directly accessing the FITS files. The analytics is done with Apache Solr and Spark and on a relational PostgreSQL database. As an example, we produce observables such as metallicity indexes for the targets in the archive and compare the results with those from the regular HARPS-N data reduction software. The aim of this experiment is to explore the viability of a high-availability cluster and distributed NoSQL database as a platform for complex scientific analytics on a large data set, which will then be ported to the WEAVE Archive System (WAS) that we are developing for the WEAVE multi-object fiber spectrograph.

Coverage

Software and Cyberinfrastructure for Astronomy IV

All editors

Chiozzi, Gianluca; Guzman, Juan C.

Series

PROCEEDINGS OF SPIE

Volume

9913

Start page

99132A

Conference

Software and Cyberinfrastructure for Astronomy IV

Conference place

Edinburgh, United Kingdom

Conference date

26 June - 1 July 2016

Uri

http://hdl.handle.net/20.500.12386/28342

Url

https://www.spiedigitallibrary.org/conference-proceedings-of-spie/9913/1/The-HARPS-N-archive-through-a-Cassandra-NoSQL-database-suite/10.1117/12.2233137.short

Issn Identifier

0277-786X

Ads BibCode

2016SPIE.9913E..2AM

Rights

open.access

File(s)

Name

99132A.pdf

Description

Publisher's PDF

Size

515.74 KB

Format

Adobe PDF

Checksum (MD5)

81a7eb7b53375bf257cb008b875010db
