This title appears in the Scientific Report: 2013
Please use the identifier http://dx.doi.org/10.1109/HPCC.2012.88 in citations.
Automatic Tuning of the Fast Multipole Method Based on Integrated Performance Prediction
Personal Name(s): | Dachsel, Holger (Corresponding author) / Hofmann, Michael / Lang, Jens / Rünger, Gudula |
---|---|
Contributing Institute: | Jülich Supercomputing Center; JSC |
Published in: | 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems |
Imprint: | IEEE, 2012 |
Physical Description: | 617-624 |
DOI: | 10.1109/HPCC.2012.88 |
Conference: | 2012 IEEE 14th Int'l Conf. on High Performance Computing and Communication (HPCC) & 2012 IEEE 9th Int'l Conf. on Embedded Software and Systems (ICESS), Liverpool (United Kingdom), 2012-06-25 - 2012-06-27 |
Document Type: | Contribution to a book / Contribution to a conference proceedings |
Research Program: | Computational Science and Mathematical Methods |

JuSER publication portal

The Fast Multipole Method (FMM) is an efficient, widely used method for the solution of N-body problems. One of the main data structures is a hierarchical tree data structure describing the separation into near-field and far-field particle interactions. This article presents a method for automatic tuning of the FMM by selecting the optimal FMM tree depth based on an integrated performance prediction of the FMM computations. The prediction method exploits benchmarking of significant parts of the FMM implementation to adapt the tuning to the specific hardware system being used. Furthermore, a separate analysis phase at runtime is used to predict the computational load caused by the specific particle system to be computed. The tuning method was integrated into an FMM implementation. Performance results show that a reliable determination of the tree depth is achieved, thus leading to minimal execution times of the FMM algorithm.
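
The general idea of picking a tree depth from a runtime prediction can be illustrated with a minimal Python sketch. It is not the prediction method of the paper: it assumes a uniform particle distribution in an octree instead of a runtime analysis of the actual particle system, and the function names (`predicted_time`, `select_depth`), the cost constants `c_near` and `c_far`, and the interaction counts (27 neighbouring boxes, 189 far-field translations per box) are assumptions chosen only for illustration. In practice, constants of this kind would come from benchmarks on the target hardware.

```python
def predicted_time(depth, num_particles, c_near, c_far):
    """Rough predicted runtime for one tree depth (hypothetical cost model).

    c_near: assumed benchmarked time per particle-particle interaction.
    c_far:  assumed benchmarked time per box-to-box far-field translation.
    """
    boxes = 8 ** depth                      # octree: 8 children per box
    per_box = num_particles / boxes         # uniform distribution assumed
    # Near field: every particle interacts with the particles in its own
    # box and in the (at most 26) directly neighbouring boxes.
    near_boxes = min(27, boxes)
    near = c_near * num_particles * near_boxes * per_box
    # Far field: only boxes at depth >= 2 have well-separated partners;
    # each box is assumed to perform up to 189 translations.
    far = c_far * boxes * 189 if depth >= 2 else 0.0
    return near + far


def select_depth(num_particles, c_near, c_far, max_depth=10):
    """Return the depth in 0..max_depth with the smallest predicted runtime."""
    return min(
        range(max_depth + 1),
        key=lambda d: predicted_time(d, num_particles, c_near, c_far),
    )


if __name__ == "__main__":
    # Made-up constants, not measurements from the paper.
    print(select_depth(1_000_000, c_near=1e-9, c_far=1e-5))
```

With these made-up constants the sketch selects depth 4 for one million particles: shallower trees are dominated by near-field work, deeper trees by far-field work. Scanning a small range of depths is cheap compared to a full FMM run, which is why evaluating a prediction for every candidate depth is a reasonable design for such a sketch.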