This title appears in the Scientific Report: 2018
Please use the identifier: http://dx.doi.org/10.1142/S0129183118500109 in citations.
Please use the identifier: http://hdl.handle.net/2128/18722 in citations.
Portable multi-node LQCD Monte Carlo simulations using OpenACC
Personal Name(s): | Bonati, Claudio / Calore, Enrico / D’Elia, Massimo / Mesiti, Michele / Negro, Francesco / Sanfilippo, Francesco / Schifano, Sebastiano Fabio / Silvi, Giorgio (Corresponding author) / Tripiccione, Raffaele |
---|---|
Contributing Institute: | Jülich Supercomputing Center; JSC |
Published in: | International journal of modern physics / C, 29 (2018) 01, p. 1850010 |
Imprint: | Singapore [et al.]: World Scientific, 2018 |
DOI: | 10.1142/S0129183118500109 |
Document Type: | Journal Article |
Research Program: | Doktorand ohne besondere Förderung (doctoral researcher without dedicated funding); Computational Science and Mathematical Methods |
Link: | OpenAccess; JuSER publication portal |
This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved through the OpenACC parallel programming model, used to develop a code that can be compiled for several processor architectures. The paper focuses on parallelization across multiple computing nodes, using OpenACC to manage parallelism within a node and OpenMPI to manage parallelism among nodes. We first discuss the strategies available to maximize performance, then describe selected relevant details of the code, and finally measure the performance and scaling that we are able to achieve. The work focuses mainly on GPUs, which offer a high level of performance for this application, but also compares with results measured on other processors.