This title appears in the Scientific Report: 2017
Please use the identifier: http://hdl.handle.net/2128/26302 in citations.
Please use the identifier: http://dx.doi.org/10.1145/3144763.3144765 in citations.
Supporting Software Engineering Practices in the Development of Data-Intensive HPC Applications with the JuML Framework
Personal Name(s): | Götz, Markus (Corresponding author) / Book, Matthias / Bodenstein, Christian / Riedel, Morris |
---|---|
Contributing Institute: | Jülich Supercomputing Center; JSC |
Published in: | Proceedings of the 1st International Workshop on Software Engineering for High Performance Computing in Computational and Data-enabled Science & Engineering |
Imprint: | ACM Press, 2017 |
Physical Description: | 1-8 |
DOI: | 10.1145/3144763.3144765 |
Conference: | Workshop on Software Engineering for High Performance Computing in Computational and Data-enabled Science & Engineering, Denver (USA), 2017-11-12 - 2017-11-17 |
Document Type: | Contribution to a book; Contribution to a conference proceedings |
Research Program: | DEEP - Extreme Scale Technologies; Doctoral researcher without special funding; Data-Intensive Science and Federated Computing |
Link: | Restricted; Restricted; OpenAccess; Publikationsportal JuSER |
The development of high performance computing applications is considerably different from traditional software development. This distinction is due to the complex hardware systems, inherent parallelism, different software lifecycle and workflow, as well as (especially for scientific computing applications) partially unknown requirements at design time. This makes the use of software engineering practices challenging, so only a small subset of them is actually applied. In this paper, we discuss the potential for applying software engineering techniques to an emerging field in high performance computing, namely large-scale data analysis and machine learning. We argue for the employment of software engineering techniques in the development of such applications from the start, and for the design of generic, reusable components. Using the example of the Juelich Machine Learning Library (JuML), we demonstrate how such a framework can not only simplify the design of new parallel algorithms, but also increase the productivity of the actual data analysis workflow. We place particular focus on the abstraction from heterogeneous hardware, the architectural design, and aspects of parallel and distributed unit testing.