Description: NestMC: A new multi-compartment neuronal network simulator

This title appears in the Scientific Report: 2017

Personal Name(s):	Peyser, Alexander (Corresponding author)
Contributing Institute:	Jülich Supercomputing Center; JSC
Published in:	JUQUEEN Extreme Scaling Workshop 2017
Imprint:	Jülich: Forschungszentrum Jülich, Jülich Supercomputing Centre, 2017
Physical Description:	31-36
Conference:	Jülich (Germany), 2017-01-23 - 2017-01-25
Document Type:	Contribution to a book
Research Program:	SimLab Neuroscience; Supercomputing and Modelling for the Human Brain; Human Brain Project Specific Grant Agreement 1; Theory, modelling and simulation; Computational Science and Mathematical Methods
Series Title:	JSC Internal Report FZJ-JSC-IB-2017-01
Link:	OpenAccess
	JuSER publications portal

Please use the identifier: http://hdl.handle.net/2128/14860 in citations.

NestMC is a prototype simulator for neuronal networks composed of morphologically detailed neurons. This new code is being designed for the new generation of HPC infrastructure, composed of massively parallel and heterogeneous architectures. Planned architectures include 'normal' non-vectorized CPUs, vectorized CPUs such as KNL, GPUs, and other boosters such as FPGAs. With OpenMP, the current architecture, in which one thread per rank handles all spike communication and exchange, scales well up to 2048 nodes and continues to give performance gains up to the full JUQUEEN. Using threading pools that partially implement the functionality of TBB, we see good weak scaling up to 4096 nodes and can expect to see performance gains up to JUQUEEN scale. For more complex neuron models and morphologies, which increase the ratio of computation time to communication time, weak scaling should improve significantly; the cases tested are 'worst-case scenarios' relative to production runs. With this workshop, we identified the limits of weak scaling on the current architecture. This motivated the development of a threading backend for architectures where TBB is not available. Since the communication time is dominated by processing the global spike buffers, a dry-run mode has been developed that takes advantage of this performance profile and will allow us to estimate these results using negligible resources.
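
The weak-scaling behaviour discussed in the abstract is usually summarized as an efficiency: with the work per node held constant, the run time on the smallest configuration is divided by the run time on N nodes. The sketch below shows that calculation only. The function name `weak_scaling_efficiency` and the timing values are hypothetical placeholders chosen for illustration; they are not NestMC code and not measurements from the report.

```python
def weak_scaling_efficiency(timings):
    """Return {nodes: efficiency} relative to the smallest node count.

    `timings` maps a node count to wall-clock seconds for a run in which
    the work per node is held constant (the weak-scaling setup).
    An efficiency of 1.0 means the run time did not grow with node count.
    """
    if not timings:
        raise ValueError("timings must not be empty")
    base_nodes = min(timings)          # smallest configuration is the baseline
    base_time = timings[base_nodes]
    return {n: base_time / t for n, t in sorted(timings.items())}


# Placeholder timings for illustration only; NOT results from the workshop.
example_timings = {1: 10.0, 2: 10.0, 4: 12.5}
efficiency = weak_scaling_efficiency(example_timings)

for nodes, eff in efficiency.items():
    print(f"{nodes:>5} nodes: efficiency {eff:.2f}")
```

In a communication-bound case such as the one described above, the efficiency would be expected to fall as the node count grows, since each rank processes a larger global spike buffer while its local computation stays fixed.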