This title appears in the Scientific Report: 2020
Please use the identifier http://hdl.handle.net/2128/26424 in citations.
Scalable Machine Learning with High Performance and Cloud Computing
Personal Name(s): | Cavallaro, Gabriele (Corresponding author) / Memon, Mohammad Shahbaz / Sedona, Rocco |
---|---|
Contributing Institute: | Jülich Supercomputing Center; JSC |
Imprint: | 2020 |
Conference: | IEEE International Geoscience and Remote Sensing Symposium (IGARSS) (Online event), 2020-09-26 - 2020-09-27 |
Document Type: | Lecture |
Research Program: | DEEP - Extreme Scale Technologies Data-Intensive Science and Federated Computing |
Deep Learning is emerging as the leading AI technique owing to the current convergence of scalable computing capability (i.e., HPC and Cloud computing), easy access to large volumes of data, and the emergence of new algorithms enabling robust training of large-scale deep neural networks. The tutorial aims to provide a complete overview for an audience that is not familiar with these topics.

Lecture 1: Introduction
- Jülich Supercomputing Centre - Forschungszentrum Jülich
- Machine learning and Deep Learning in remote sensing
- Deep learning and Supercomputing

Lecture 2: Levels of Parallelism and High Performance Computing
- The Free Lunch is Over
- Hardware Levels of Parallelism
- High Performance Computing (HPC)
- Jupyter-JSC

Lecture 3: Distributed Deep Learning
- Distributed training
- Horovod
- DeepSpeed

Lecture 4: Hands-on Distributed Deep Learning
- Become familiar with Horovod, a data-distributed training framework
- Understand how to modify existing code to enable parallelism
- Understand the importance of distributing data beforehand
- Understand what Horovod does by looking at the lines of code to be added
- Create a job script to execute Python code on the GPUs
- Experiment with the model architecture, optimizer, and learning rate

Lecture 5: Big Data Analytics using Apache Spark
- Apache Spark Basics
- Developing on Spark and Clouds
- Machine Learning on Spark
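
The Lecture 4 outline stresses distributing the data beforehand and understanding what a data-distributed framework adds to existing code. As a rough illustration of that idea, and not the tutorial's own material, the sketch below simulates data-parallel training on a single machine using only the Python standard library. It does not call Horovod. The names `shard`, `local_gradient`, `allreduce_mean`, and `train`, the toy dataset, and the learning rate are all assumptions made for this example. In a real setup a framework such as Horovod would perform the averaging step with an allreduce across processes.

```python
# Toy simulation of data-parallel training (standard library only, no Horovod).
# All function names and values below are hypothetical, chosen for illustration.

def shard(data, num_workers, rank):
    """Give worker `rank` every num_workers-th sample so shards do not overlap."""
    return data[rank::num_workers]

def local_gradient(w, samples):
    """Gradient of the mean squared error of y = w * x over one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in samples) / len(samples)

def allreduce_mean(values):
    """Stand-in for an allreduce that averages one value per worker."""
    return sum(values) / len(values)

def train(data, num_workers, lr=0.01, steps=200):
    w = 0.0  # every worker starts from the same weight
    # Distribute the data once, before training starts.
    shards = [shard(data, num_workers, r) for r in range(num_workers)]
    for _ in range(steps):
        grads = [local_gradient(w, s) for s in shards]  # computed per worker
        w -= lr * allreduce_mean(grads)                 # identical update everywhere
    return w

data = [(float(x), 3.0 * x) for x in range(1, 9)]  # 8 samples from y = 3x
w_single = train(data, num_workers=1)
w_parallel = train(data, num_workers=4)
```

Because the four shards are the same size, the average of the per-worker gradients equals the full-batch gradient, so `w_parallel` tracks `w_single` up to floating-point rounding. With unequal shards the plain average would weight samples unevenly, which is one reason to think about how the data is split before training.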