This title appears in the Scientific Report : 2014 

An optimized subspace iteration eigensolver applied to sequences of dense eigenproblems in ab initio simulations
Di Napoli, Edoardo (Corresponding Author)
Jülich Supercomputing Center; JSC
6th International Workshop on Parallel Matrix Algorithms and Applications, Lugano (Switzerland), 2014-07-01 - 2014-07-04
Conference Presentation
Computational Science and Mathematical Methods
Sequences of eigenvalue problems consistently appear in a large class of applications based on the iterative solution of a non-linear eigenvalue problem. A typical example is given by the chemistry and materials science ab initio simulations relying on computational methods developed within the framework of Density Functional Theory (DFT). DFT provides the means to solve a high-dimensional quantum mechanical problem by representing it as a non-linear generalized eigenvalue problem which is solved self-consistently through a series of successive outer-iteration cycles. As a consequence each self-consistent simulation is made of several sequences of generalized eigenproblems $P: Ax=\lambda Bx$. Each sequence, $P^{(1)}, \dots P^{(\ell)} \dots P^{(N)}$, groups together eigenproblems with increasing outer-iteration index $\ell$. In more general terms a set of eigenvalue problems $\{P^{(1)}, \dots P^{(N)}\}$ is said to be a ``sequence'' if the solution of the $\ell$-th eigenproblem determines, in an application-specific manner, the initialization of the $(\ell+1)$-th eigenproblem. For instance at the beginning of each DFT cycle an initial function $\rho^{(\ell)}({\bf r})$ determines the initialization of the $\ell$-th eigenproblem. A large fraction of $P^{(\ell)}$ eigenpairsare then use to compute a new $\rho^{(\ell+1)}({\bf r})$ which, in turn, leads to the initialization of a new eigenvalue problem $P^{(\ell+1)}$. In addition to be part of a sequence, successive eigenproblems might possess a certain degree of correlation connecting their respective eigenpairs. In DFT sequences, correlation becomes manifest in the way eigenvectors of successive eigenproblems become progressively more collinear to each other as the $\ell$-indexincreases. We developed a subspace iteration method (ChFSI) specifically tailored for sequences of eigenproblems whose correlation appears in the form of increasingly collinear eigenvectors. Our strategy is to take the maximal advantage possible from the information that the solution of the $P^{(\ell)}$ eigenproblem is providing when solving for the successive $P^{(\ell+1)}$ problem. As a consequence the subspace iteration was augmented with a Chebyshevpolynomial filter whose degree gets dynamically optimized so as to minimize the number of matrix-vector multiplications. The effectiveness of the Chebyshev filter is substantially increased when inputed the approximate eigenvectors $\{ x_1^{(\ell)}, \dots x_{\rm nev}^{(\ell)}\}$, as well as very reliable estimates, namely $[\lambda_1^{(\ell)},\ \lambda_{\rm nev}^{(\ell)}]$, for the limits of the eigenspectrum interval $[a,\ b]$ to be filtered in. In additionthe degree of the polynomial filter is adjusted so as to be minimal with respect to the required tolerance for the eigenpairs residual. This result is achieved by exploiting the dependence each eigenpair residual have with respect to its convergence ratio as determined by the rescaled Chebyshev polynomial and its degree. The solver is complemented with an efficient mechanism which locks and deflates the converged eigenpairs.The resulting eigensolver was implemented in C language and parallelized for both shared and distributed architectures. Numerical tests show that, when the eigensolver is inputed approximate solutions instead of random vectors, it achieves up to a 5X speedup. Moreover ChFSI takes great advantage of computational resources by scaling over a large range of cores commensurate with the size of the eigenproblems. Specifically, by making better use of massively parallel architectures, the distributed memory version will allow DFT users to simulate physical systems quite larger than are currently accessible.