This title appears in the Scientific Report: 2014
Please use the identifier: http://hdl.handle.net/2128/8040 in citations.
Please use the identifier: http://dx.doi.org/10.1145/2642769.2642783 in citations.
Catching Idlers with Ease: A Lightweight Wait-State Profiler for MPI Programs
Personal Name(s): | Mao, Guoyong (Corresponding Author) / Böhme, David / Hermanns, Marc-André / Geimer, Markus / Lorenz, Daniel / Wolf, Felix |
---|---|
Contributing Institute: | Jülich Supercomputing Center; JSC |
Published in: | Proceedings of the 21st European MPI Users' Group Meeting, p. 103 |
Imprint: | New York, NY, USA: ACM Press, 2014 |
Physical Description: | 6 p. |
ISBN: | 978-1-4503-2875-3 |
DOI: | 10.1145/2642769.2642783 |
Conference: | EuroMPI/ASIA 21st European MPI Users' Group Meeting, Kyoto (Japan), 2014-09-09 - 2014-09-12 |
Document Type: | Contribution to a book; Contribution to a conference proceedings |
Research Program: | Computational Science and Mathematical Methods |
Link: | OpenAccess; Publikationsportal JuSER |
Load imbalance usually introduces wait states into the execution of parallel programs. Being able to identify and quantify wait states is therefore essential for the diagnosis and remediation of this phenomenon. An established method of detecting wait states is to generate event traces and compare relevant timestamps across process boundaries. However, large trace volumes usually prevent the analysis of longer execution periods. In this paper, we present an extremely lightweight wait-state profiler that does not rely on traces and can be used to estimate wait states in MPI codes with arbitrarily long runtimes. The profiler combines scalability with portability and low overhead.