This title appears in the Scientific Report: 2022
Please use the identifier: http://dx.doi.org/10.1109/CLUSTER51413.2022.00073 in citations.
Please use the identifier: http://hdl.handle.net/2128/32799 in citations.
Assessing the State of Autovectorization Support based on SVE
Personal Name(s): | Brank, Bine (Corresponding author); Pleiter, Dirk |
---|---|
Contributing Institute: | Jülich Supercomputing Center; JSC |
Imprint: | IEEE, 2022 |
Physical Description: | 556–562 |
DOI: | 10.1109/CLUSTER51413.2022.00073 |
Conference: | 2022 IEEE International Conference on Cluster Computing (CLUSTER), Heidelberg (Germany), 2022-09-05 - 2022-09-08 |
Document Type: | Contribution to a conference proceedings |
Research Program: | Mont-Blanc 2020, European scalable, modular and power efficient HPC processor; Future Computing & Big Data Systems |
Link: | OpenAccess; Publikationsportal JuSER |
So-called SIMD instructions, which trigger operations that process a data tuple in each clock cycle, have become widespread in modern processor architectures. In particular, processors for high-performance computing (HPC) systems rely on this additional level of parallelism to reach a high throughput of arithmetic operations. Leveraging these SIMD instructions can still be challenging for application software developers. This challenge has been eased by a compiler technique called auto-vectorization. In this paper, we explore the current state of auto-vectorization capabilities of state-of-the-art compilers targeting SVE, a recent extension of the Arm instruction set architecture. We measure the performance gains on a recent processor architecture supporting SVE, namely the Fujitsu A64FX processor.