The Gaia AVU-GSR parallel solver: preliminary porting with OpenACC parallelization language of a LSQR-based application in perspective of exascale systems

Date Issued
2022
Author(s)
CESARE, VALENTINA  
•
BECCIANI, Ugo  
•
VECCHIATO, Alberto  
•
PITARI, FABIO  
•
RACITI, MARIO  
•
TUDISCO, GIUSEPPE  
•
Aldinucci, Marco
Abstract
The Gaia Astrometric Verification Unit-Global Sphere Reconstruction (AVU-GSR) Parallel Solver aims to find the positions and proper motions of ~10^8 stars in our galaxy, as well as the attitude and instrumental settings of the Gaia satellite and the global parameter 𝛾 of the post-Newtonian formalism. To find these parameters, the code solves a system of linear equations, 𝐀 × 𝒙 = 𝒃, where the coefficient matrix 𝐀 is large, containing ~10^11 × 10^8 elements, and sparse. The system is solved with a customized implementation of the iterative preconditioned (PC) LSQR algorithm and is parallelized on the CPU with MPI+OpenMP: the computation for different horizontal portions of the coefficient matrix is assigned to different MPI processes and is further parallelized across OpenMP threads. To improve performance, we explored the feasibility of porting this application to a GPU environment by replacing the OpenMP directives with their OpenACC counterparts. In this preliminary port, ~95% of the data is copied from the host (CPU) to the device (GPU) before the entire cycle of iterations, making the code compute-bound rather than data-transfer-bound. The OpenACC code achieves a speedup of ~1.5 over the OpenMP code. The OpenACC application runs on multiple GPUs and was tested on the CINECA supercomputer Marconi100, which has 4 V100 GPUs per node with 16 GB of memory each. A subsequent port, in which OpenACC is replaced with CUDA, was then carried out to optimize this preliminary OpenACC port. The CUDA code has just been put into production on Marconi100, and we plan to run it on Leonardo, the future pre-exascale platform of CINECA, with 4 next-generation A100 GPUs per node.
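The row-wise decomposition described in the abstract can be illustrated with a small sketch. Each LSQR iteration needs the two products 𝐀 × 𝒙 and 𝐀ᵀ × 𝒚; when 𝐀 is split into horizontal blocks, the first product is local to each block, while the second requires summing partial results from all blocks. The code below is not the AVU-GSR implementation: the function names, the dictionary-based sparse storage, and the toy matrix are all hypothetical, the "ranks" are simulated in a plain loop with no MPI, and the use of a sum reduction for 𝐀ᵀ × 𝒚 is an assumption about how such a scheme is typically organized.

```python
# Hypothetical sketch: horizontal (row-block) partitioning of a sparse matrix
# for the two products an LSQR iteration needs. "Ranks" are simulated in a
# loop; no MPI, OpenMP, or OpenACC is used here.

def split_rows(n_rows, n_ranks):
    """Return one contiguous (start, stop) row range per simulated rank."""
    base, extra = divmod(n_rows, n_ranks)
    ranges, start = [], 0
    for r in range(n_ranks):
        stop = start + base + (1 if r < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

def local_matvec(rows, x):
    """Compute A_block @ x; each sparse row is a {column: value} dict."""
    return [sum(v * x[j] for j, v in row.items()) for row in rows]

def local_rmatvec(rows, y_local, n_cols):
    """Compute this block's partial contribution to A^T @ y."""
    out = [0.0] * n_cols
    for row, yi in zip(rows, y_local):
        for j, v in row.items():
            out[j] += v * yi
    return out

def distributed_products(A, x, y, n_ranks):
    """Assemble A @ x and A^T @ y from per-block results."""
    n_cols = len(x)
    Ax = []
    ATy = [0.0] * n_cols
    for start, stop in split_rows(len(A), n_ranks):
        block = A[start:stop]
        # A @ x: each block yields its own slice of the result.
        Ax.extend(local_matvec(block, x))
        # A^T @ y: partial vectors are summed (stands in for a reduction).
        partial = local_rmatvec(block, y[start:stop], n_cols)
        ATy = [a + p for a, p in zip(ATy, partial)]
    return Ax, ATy

# Toy 5 x 3 sparse system, purely illustrative.
A = [{0: 2.0}, {1: 1.0, 2: 3.0}, {0: 1.0, 2: 1.0}, {1: 4.0}, {2: 5.0}]
x = [1.0, 2.0, 3.0]
y = [1.0, 1.0, 1.0, 1.0, 1.0]
Ax, ATy = distributed_products(A, x, y, n_ranks=2)
print(Ax)
print(ATy)
```

In this sketch the result is independent of the number of simulated ranks, which is the property that lets the row blocks be distributed freely across processes.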
Series
INAF Technical Reports - Rapporti Tecnici INAF  
Report number
163
Uri
http://hdl.handle.net/20.500.12386/32451
https://doi.org/10.20371/INAF/TechRep/163
Rights
open.access
File(s)
Name
Technical_report_Gaia_MPI_OpenACC_Valentina_Cesare_et_al.pdf
Size
1.78 MB
Format
Adobe PDF
Checksum (MD5)
b554f98d0cbd16e238a587b637777631
