This title appears in the Scientific Report 2021.
Please use the identifier: http://dx.doi.org/10.1109/BigData50022.2020.9378050 in citations.
Please use the identifier: http://hdl.handle.net/2128/31843 in citations.
HeAT – a Distributed and GPU-accelerated Tensor Framework for Data Analytics
Personal Name(s): Götz, Markus (Corresponding author); Debus, Charlotte; Coquelin, Daniel; Krajsek, Kai; Comito, Claudia; Knechtges, Philipp; Hagemeier, Björn; Tarnawa, Michael; Hanselmann, Simon; Siggel, Martin; Basermann, Achim; Streit, Achim
Contributing Institute: Jülich Supercomputing Center (JSC)
Imprint: IEEE, 2020
Physical Description: 276-287
ISBN: 978-1-7281-6251-5
DOI: 10.1109/BigData50022.2020.9378050
Conference: 2020 IEEE International Conference on Big Data (Big Data), Atlanta (GA), 2020-12-10 - 2020-12-13
Document Type: Contribution to a book; Contribution to a conference proceedings
Research Program: SimLab Neuroscience; Helmholtz Analytics Framework; Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups
Link: OpenAccess
Abstract: To cope with the rapid growth in available data, the efficiency of data analysis and machine learning libraries has recently received increased attention. Although great advancements have been made in traditional array-based computations, most are limited by the resources available on a single computation node. Consequently, novel approaches are required that exploit distributed resources, e.g. distributed memory architectures. To this end, we introduce HeAT, an array-based numerical programming framework for large-scale parallel processing with an easy-to-use NumPy-like API. HeAT utilizes PyTorch as a node-local eager execution engine and distributes the workload on arbitrarily large high-performance computing systems via MPI. It provides both low-level array computations and assorted higher-level algorithms. With HeAT, a NumPy user can take full advantage of their available resources, significantly lowering the barrier to distributed data analysis. When compared to similar frameworks, HeAT achieves speedups of up to two orders of magnitude.
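The abstract describes arrays that are sharded along a split axis across MPI processes while exposing a NumPy-like interface. The sketch below illustrates that idea only; it is not HeAT's implementation (HeAT uses PyTorch tensors and MPI communication), and the helper names `split_array` and `distributed_sum` are invented for illustration, with the "processes" simulated inside a single Python process.

```python
# Illustrative sketch of the split-axis distribution concept described in the
# abstract; NOT HeAT's actual code. NumPy stands in for the node-local engine,
# and a plain Python sum stands in for an MPI allreduce.
import numpy as np

def split_array(arr, n_procs, split=0):
    """Partition `arr` into near-equal chunks along the `split` axis,
    one chunk per (simulated) process."""
    return np.array_split(arr, n_procs, axis=split)

def distributed_sum(chunks):
    """Each process reduces its local chunk eagerly; the partial results
    are then combined (the role MPI_Allreduce plays in a real run)."""
    partials = [chunk.sum() for chunk in chunks]  # node-local computation
    return sum(partials)                          # global reduction

a = np.arange(12.0).reshape(4, 3)
chunks = split_array(a, n_procs=2, split=0)  # two row-blocks of shape (2, 3)
total = distributed_sum(chunks)
assert total == a.sum()  # the distributed result matches the single-node one
```

In HeAT itself, the user never writes the reduction by hand: the split axis is a property of the distributed array, and operations transparently handle the necessary communication.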