Georg Hager's Blog

Random thoughts on High Performance Computing


SAHPC 2012

Half-day tutorial at the third Saudi Arabian High Performance Computing (SAHPC) conference, December 1-3, 2012, at King Abdullah University of Science and Technology (KAUST) in Thuwal, Saudi Arabia:

Performance Engineering on Multi- and Manycores



As shown in the tutorial: SAHPC-Tutorial-2012-small.pdf

Including skipped slides: SAHPC-Tutorial-2012-full.pdf

Since the blog system does not allow uploading Excel files, the spreadsheet is available via a link to my Dropbox: Excel sheet for the power model


Georg Hager
Erlangen Regional Computing Center
University of Erlangen-Nuremberg


The advent of multi- and manycore chips has further widened the gap between peak and application performance for many scientific codes. Paradoxically, bad node-level performance helps to “efficiently” scale to massive parallelism, but at the price of increased overall time to solution. We convey the architectural features of current processor chips, multiprocessor nodes, and accelerators, as far as they are relevant for high-performance simulation. Typical bottlenecks are identified, and the features and problems of the dominant programming models, MPI and OpenMP, are pointed out. Simple performance models on the chip and node level are introduced as powerful tools to get a grasp on what “optimal performance” is, what optimizations could be done to improve it, and what the expected benefit is.

We also comment on typical performance and scalability patterns and how they can be used to improve the energy efficiency of simulations. Finally, all these strategies are embedded into a structured “performance engineering” process, which we propose as a guiding principle in all HPC-related efforts.
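
To give a flavor of what a "simple performance model" on the node level can look like, here is a minimal sketch of a roofline-type bound in Python. It is an illustration only and not necessarily the model used in the slides. The machine numbers (`PEAK_GFLOPS`, `MEM_BW_GBS`) are made-up assumptions, and the function name `roofline_gflops` is hypothetical. The idea is that a loop can run no faster than either the peak arithmetic rate or the rate at which memory can deliver its data, whichever is lower.

```python
def roofline_gflops(peak_gflops, mem_bw_gbs, intensity_flop_per_byte):
    """Upper performance bound in GFlop/s: min(peak, bandwidth * intensity)."""
    return min(peak_gflops, mem_bw_gbs * intensity_flop_per_byte)


# Hypothetical machine parameters (assumptions, not measured values)
PEAK_GFLOPS = 100.0   # peak arithmetic performance of the chip
MEM_BW_GBS = 40.0     # attainable main memory bandwidth

# Vector triad a[i] = b[i] + c[i] * d[i] in double precision:
# 2 flops per iteration, 3 loads + 1 store = 32 bytes of memory traffic
# (write-allocate transfers are ignored in this sketch).
triad_intensity = 2.0 / 32.0

bound = roofline_gflops(PEAK_GFLOPS, MEM_BW_GBS, triad_intensity)
print(f"Triad bound: {bound} GFlop/s")
```

With these assumed numbers the triad is limited by memory bandwidth, far below the arithmetic peak, which is the kind of insight such a model is meant to provide before any optimization is attempted.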