ESSEX-II – Equipping Sparse Solvers for Exascale

ESSEX project documentation

Content

The ESSEX-II project

SPPEXA Priority Programme

The ESSEX project is funded by the German DFG priority programme 1648 “Software for Exascale Computing” (SPPEXA). In 2016 it entered its second funding phase, ESSEX-II.

ESSEX investigates programming concepts and numerical algorithms for scalable, efficient and robust iterative sparse matrix applications on exascale systems. Starting with successful blueprints and prototype solutions identified in ESSEX-I, the second phase project ESSEX-II aims at delivering a collection of broadly usable and scalable sparse eigenvalue solvers with high hardware efficiency for the computer architectures to come. Project activities are organized along the traditional software layers of low-level parallel building blocks (kernels), algorithm implementations, and applications. The classic abstraction boundaries separating these layers are broken in ESSEX by strongly integrating objectives: scalability, numerical reliability, fault tolerance, and holistic performance and power engineering.

The basic building block library supports an elaborate MPI+X approach that is able to fully exploit hardware heterogeneity while exposing functional parallelism and data parallelism to all other software layers in a flexible way. In addition, facilities for fully asynchronous checkpointing, silent data corruption detection and correction, performance assessment, performance model validation, and energy measurements will be provided transparently.

The advanced building blocks will be defined and employed by the developments at the algorithms layer. Here, ESSEX-II will provide state-of-the-art library implementations of classic linear sparse eigenvalue solvers including block Jacobi-Davidson, Kernel Polynomial Method (KPM), and Chebyshev filter diagonalization (ChebFD) that are ready to use for production on modern heterogeneous compute nodes with best performance and numerical accuracy. Research in this direction includes the development of appropriate parallel adaptive AMG software for the block Jacobi-Davidson method. Contour integral-based approaches are also covered in ESSEX-II and will be extended in two directions: The FEAST method will be further developed for improved scalability, and the Sakurai-Sugiura method (SSM) will be extended to nonlinear sparse eigenvalue problems. These developments are strongly supported by additional Japanese project partners from the University of Tokyo, Computer Science, and the University of Tsukuba, Applied Mathematics.
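KPM and ChebFD both rest on the three-term Chebyshev recurrence T_{n+1}(H)v = 2 H T_n(H)v - T_{n-1}(H)v, which reduces the work to repeated sparse matrix-vector products. The following is a minimal pure-Python sketch of that recurrence for computing KPM-style moments mu_n = <v, T_n(H) v>. It is not the ESSEX implementation: the function names, the list-of-rows sparse format, and the assumption that H is already scaled so its spectrum lies in [-1, 1] are all choices made for this illustration.

```python
def spmv(rows, x):
    """Sparse matrix-vector product.

    rows[i] is a list of (column, value) pairs for the nonzeros of row i
    (a hypothetical storage format chosen for this sketch).
    """
    return [sum(val * x[col] for col, val in row) for row in rows]


def dot(a, b):
    """Real dot product of two equally long vectors."""
    return sum(p * q for p, q in zip(a, b))


def chebyshev_moments(rows, v, num_moments):
    """Return [mu_0, ..., mu_{num_moments-1}] with mu_n = <v, T_n(H) v>.

    Assumes H is real symmetric and scaled so that its spectrum lies
    in [-1, 1]; otherwise the recurrence grows without bound.
    """
    if num_moments < 1:
        raise ValueError("num_moments must be at least 1")

    t_prev = list(v)                      # T_0(H) v = v
    moments = [dot(v, t_prev)]
    if num_moments == 1:
        return moments

    t_curr = spmv(rows, v)                # T_1(H) v = H v
    moments.append(dot(v, t_curr))

    for _ in range(2, num_moments):
        hv = spmv(rows, t_curr)
        # T_{n+1}(H) v = 2 H T_n(H) v - T_{n-1}(H) v
        t_next = [2.0 * a - b for a, b in zip(hv, t_prev)]
        moments.append(dot(v, t_next))
        t_prev, t_curr = t_curr, t_next

    return moments


# Usage: a two-site hopping matrix with eigenvalues -0.5 and +0.5.
H = [[(1, 0.5)], [(0, 0.5)]]
mu = chebyshev_moments(H, [1.0, 0.0], 5)
```

Each iteration costs one sparse matrix-vector product plus a few vector operations, which is why the efficiency of such kernels dominates the performance of these methods.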

The applications layer will deliver scalable solutions for conservative (Hermitian) and dissipative (non-Hermitian) quantum systems with strong links to optics and biology and to novel materials such as graphene and topological insulators.

Extending its predecessor project, ESSEX-II adopts an additional focus on production-grade software. Although the selection of algorithms is strictly motivated by quantum physics application scenarios, the underlying research directions of algorithmic and hardware efficiency, accuracy, and resilience will radiate into many fields of computational science. Most importantly, all developments will be accompanied by an uncompromising performance engineering process that will rigorously expose any discrepancy between expected and observed resource efficiency.

Principal investigators:

Gerhard Wellein, Computer Science, University of Erlangen-Nuremberg

Bruno Lang, Applied Computer Science, University of Wuppertal

Achim Basermann, Simulation & SW Technology, DLR

Holger Fehske, Institute for Physics, University of Greifswald

Georg Hager, Erlangen Regional Computing Center

Tetsuya Sakurai, Department of Computer Science, University of Tsukuba

Kengo Nakajima, Information Technology Center, University of Tokyo

ESSEX-II kick-off

ESSEX-II kick-off meeting in Berlin