This title appears in the Scientific Report: 2009
Please use the identifier:
http://dx.doi.org/10.1016/j.parco.2008.12.012 in citations.
Scalable timestamp synchronization for event traces of message-passing applications
| Personal Name(s): | Becker, D. / Rabenseifner, R. / Wolf, F. / Linford, J. |
|---|---|
| Contributing Institute: | Jülich Supercomputing Center; JSC JARA - HPC; JARA-HPC |
| Published in: | Parallel computing, 35 (2009), pp. 595 - 607 |
| Imprint: | Amsterdam [et al.]: North-Holland, Elsevier Science, 2009 |
| Physical Description: | 595 - 607 |
| DOI: | 10.1016/j.parco.2008.12.012 |
| Document Type: | Journal Article |
| Research Program: | Scientific Computing |
| Series Title: | Parallel Computing, 35 |
| Subject (ZB): | |

JuSER publication portal

Event traces are helpful in understanding the performance behavior of message-passing applications since they allow the in-depth analysis of communication and synchronization patterns. However, the absence of synchronized clocks may render the analysis ineffective because inaccurate relative event timings may misrepresent the logical event order and lead to errors when quantifying the impact of certain behaviors. Although linear offset interpolation can restore consistency to some degree, time-dependent drifts and other inaccuracies may still disarrange the original succession of events, especially during longer runs. The controlled logical clock algorithm accounts for such violations in point-to-point communication by shifting message events in time as much as needed while trying to preserve the length of local intervals. In this article, we describe how the controlled logical clock is extended to collective communication to enable the correction of realistic message-passing traces. We present a parallel version of the algorithm scaling to more than a thousand processes and evaluate its accuracy by showing that it eliminates inconsistent inter-process timings while preserving the length of local intervals. (C) 2009 Elsevier B.V. All rights reserved.
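
The two ideas named in the abstract, linear offset interpolation and forward shifting of receive events, can be illustrated with a minimal Python sketch. This is not the controlled logical clock algorithm of the article: it omits the control mechanism, backward amortization, and collective operations, and only shows a plain forward shift that restores the condition "receive no earlier than send plus minimum latency" while keeping local interval lengths. The function names, the event layout `(timestamp, matching_send_time_or_None)`, and the `min_latency` parameter are assumptions made for illustration.

```python
def interpolate_offset(t, t_start, off_start, t_end, off_end):
    """Linearly interpolate a clock offset at local time t.

    Assumes (hypothetically) that the offset to a master clock was
    measured once at t_start and once at t_end.
    """
    frac = (t - t_start) / (t_end - t_start)
    return off_start + frac * (off_end - off_start)


def repair_process(events, min_latency):
    """Forward-shift the events of one process (simplified sketch).

    events: list of (timestamp, send_timestamp_or_None) in local order.
            A non-None second entry marks a receive event and holds the
            (already corrected) timestamp of the matching send.
    Returns the list of corrected timestamps.
    """
    corrected = []
    prev_orig = prev_corr = None
    for ts, send_ts in events:
        new_ts = ts
        if prev_orig is not None:
            # Keep the original distance to the preceding local event,
            # so an earlier shift propagates forward.
            new_ts = max(new_ts, prev_corr + (ts - prev_orig))
        if send_ts is not None:
            # Clock condition: a message cannot arrive before it was
            # sent plus the assumed minimum latency.
            new_ts = max(new_ts, send_ts + min_latency)
        corrected.append(new_ts)
        prev_orig, prev_corr = ts, new_ts
    return corrected


if __name__ == "__main__":
    trace = [(1.0, None), (2.0, 2.5), (3.0, None)]
    print(repair_process(trace, min_latency=0.5))
```

In this sketch the second event is a receive stamped before its matching send, so it is moved forward, and the following local event is moved by the same amount to keep its interval length. A shift of this kind never shrinks; limiting how far corrections propagate is the part the controlled logical clock adds and that this sketch leaves out.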