This title appears in the Scientific Report: 2023
Please use the identifier: http://dx.doi.org/10.1007/978-3-031-42785-5_8 in citations.
Please use the identifier: http://dx.doi.org/10.34734/FZJ-2023-05391 in citations.
COMPESCE: A Co-design Approach for Memory Subsystem Performance Analysis in HPC Many-Cores
| Personal Name(s): | Portero, Antonio (Corresponding author) / Falquez, Carlos / Ho, Nam / Petrakis, Polydoros / Nassyr, Stepan / Marazakis, Manolis / Dolbeau, Romain / Nocua Cifuentes, Jorge A. / Beltran, Luis / Pleiter, Dirk / Suarez, Estela |
|---|---|
| Contributing Institute: | Institute for Advanced Simulation; IAS Jülich Supercomputing Center; JSC |
| Published in: | Architecture of Computing Systems - 36th International Conference |
| Imprint: | Cham, Springer Nature Switzerland, 2023 |
| Physical Description: | 105-119 |
| ISBN: | 978-3-031-42784-8; 978-3-031-42785-5 (electronic) |
| DOI: | 10.1007/978-3-031-42785-5_8 |
| DOI: | 10.34734/FZJ-2023-05391 |
| Conference: | Architecture of Computing Systems - 36th International Conference, Athens (Greece), 2023-06-13 - 2023-06-15 |
| Document Type: | Contribution to a book; Contribution to a conference proceedings |
| Research Program: | SGA1 (Specific Grant Agreement 1) OF THE EUROPEAN PROCESSOR INITIATIVE (EPI) Future Computing & Big Data Systems |
| Series Title: | Lecture Notes in Computer Science 13949 |
| Link: | Get full text OpenAccess; Publikationsportal JuSER |
This paper explores memory subsystem design through gem5 simulations of a non-uniform memory access (NUMA) architecture with ARM cores that are equipped with vector engines and connected to a Network-on-Chip (NoC) following the Coherent Hub Interface (CHI) protocol. The study quantifies the benefits of vectorization, prefetching, and multichannel NoC configurations using a benchmark that generates memory patterns and indexed accesses. The outcomes provide insights into improving bus utilization and bandwidth and reducing stalls in the system. The paper proposes hardware/software (HW/SW) advancements that allow the HBM device to be utilized at more than 80% at the memory controllers of the simulated many-core system.