Georg Hager's Blog

Random thoughts on High Performance Computing


Get a full dose of performance engineering with our two half-day tutorials at ISC25!

This year’s ISC High Performance conference takes place in Hamburg from June 10–13. As a refreshing counterweight to the usual enema of AI and Quantum Computing you will get there, we will conduct two half-day tutorials on good old solid performance engineering.

On the morning of June 13, Jan Laukemann and I will present the “Core-Level Performance Engineering” tutorial: 3.5 hours packed with information on how modern CPUs execute your code – pipelining, out-of-order execution, superscalarity, SIMD – plus hands-on exercises using Matt Godbolt’s Compiler Explorer and OSACA, our Open-Source Architecture Code Analyzer, which is integrated with it. If you need to model the in-core performance of code for optimization or co-design, this tutorial is for you.
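To give a flavor of the kind of loop kernel that such core-level analysis targets, here is a minimal Python/Numba sketch (the kernel, its name, and the fastmath setting are my own choices, not part of the tutorial material); the tutorial itself works on the compiled assembly via Compiler Explorer and OSACA:

import numpy as np
from numba import njit

# Minimal sketch (not from the tutorial material): a triad-style streaming
# kernel. fastmath=True allows the compiler to reorder floating-point
# operations so the loop can be vectorized (SIMD) and pipelined efficiently --
# exactly the in-core effects that a tool like OSACA models from the
# generated machine code.
@njit(fastmath=True)
def triad(a, b, c, d):
    for i in range(a.shape[0]):
        a[i] = b[i] + c[i] * d[i]

n = 10_000_000
b = np.random.rand(n)
c = np.random.rand(n)
d = np.random.rand(n)
a = np.empty_like(b)
triad(a, b, c, d)   # first call compiles; subsequent calls run the optimized kernel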

On the afternoon of June 13, Christie Alappat (still a PhD student at FAU but now working for Intel), Jonas Thies (TU Delft), Hartwig Anzt (TU München Campus Heilbronn), and I will conduct the tutorial “Performance Engineering for Sparse Linear Solvers.” It provides thorough coverage of sparse matrix-vector multiplication (SpMV), preconditioners, and even cache blocking of matrix powers via RACE, Christie’s Recursive Algebraic Coloring Engine. In the hands-on exercises, attendees will get access to an A100 GPU and can experiment with SpMV and sparse linear solvers. All code (mostly Python/Numba) is available for download.
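For a taste of the central kernel the tutorial revolves around, here is a minimal CSR SpMV sketch in Python/Numba; the function and variable names are mine, and the actual hands-on code is what you get from the tutorial’s download material:

import numpy as np
import scipy.sparse as sp
from numba import njit

# Minimal sketch (my own, not the tutorial code): sparse matrix-vector
# multiplication y = A*x with A stored in CSR format (rowptr, colidx, values).
@njit(fastmath=True)
def spmv_csr(rowptr, colidx, values, x, y):
    for row in range(rowptr.shape[0] - 1):
        tmp = 0.0
        for j in range(rowptr[row], rowptr[row + 1]):
            tmp += values[j] * x[colidx[j]]
        y[row] = tmp

# Usage example with a random sparse matrix
A = sp.random(10000, 10000, density=1e-3, format="csr")
x = np.random.rand(A.shape[1])
y = np.empty(A.shape[0])
spmv_csr(A.indptr, A.indices, A.data, x, y)
assert np.allclose(y, A @ x)

The indirect access x[colidx[j]] is what makes SpMV performance interesting: depending on the matrix structure it can turn a memory-bound streaming kernel into one dominated by irregular, cache-unfriendly loads.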

New tutorial “Performance Engineering for Linear Solvers” at ISC High Performance 2024

On Sunday, May 12, the brand-new tutorial “Performance Engineering for Linear Solvers” will be presented at ISC High Performance in Hamburg by Christie Alappat (still a PhD student at FAU but now working for Intel), Jonas Thies (TU Delft), Hartwig Anzt (TU München Campus Heilbronn), and me.

This tutorial was a long time in the making; many concepts were drafted, reworked, and updated again. We aimed at a slightly higher level of abstraction than in our popular tutorial “Node-Level Performance Engineering,” which has a strong focus on the Roofline model and the optimization of simple loops and loop nests. In contrast, the new tutorial concentrates on the performance of sparse linear solvers, which includes coverage of sparse matrix-vector multiplication (SpMV), preconditioners, and even cache blocking of matrix powers via RACE, Christie’s Recursive Algebraic Coloring Engine. Since the tutorial was accepted as a half-day event, we could only accommodate online demos instead of hands-on exercises for attendees. However, all code (mostly Python/Numba) is available for download.