Georg Hager's Blog

Random thoughts on High Performance Computing


New tutorial on “Core-Level Performance Engineering” accepted for ICPE 2023

ICPE 2023 LogoOur brand-new tutorial “Core-Level Performance Engineering” has been accepted as a full-day tutorial at ICPE 2023, the 14th ACM/SPEC International Conference on Performance Engineering. This tutorial concentrates on the in-core aspects of performance modeling and analysis on CPUs. We use Matt Godbolt’s Compiler Explorer and our Open-Source Architecture Code Analyzer (OSACA), which is now integrated with the Compiler Explorer, to teach the basics of code execution including pipelining, superscalarity, SIMD, intra-iteration and loop-carried dependencies, and more. Intel/AMD x86 and ARM Neon/SVE assembly code is introduced, and participants can get their hands dirty exploring the depths of machine code execution using only a web browser! Lead OSACA developer Jan Laukemann did most of the work for this exciting new event. Find the details at:

All slides and some of the exercises are available at:

PMBS19 Workshop Best Late-Breaking Paper Award

The authors proudly presenting the award at the Bavarian Supercomputing Alliance booth at SC19.

Our paper “Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels” has just won the “Best Late-Breaking Paper Award” at the 10th Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS19), a renowned workshop co-located with the SC19 conference. The lead author, our master student Jan Laukemann, presented his work on a new version of the OSACA tool (Open-Source Architecture Code Analyzer), which now supports throughput, critical path, and loop-carried dependency analysis for assembly loop kernels on x86 and ARM architectures. It is thus a critical component for ECM and Roofline modeling and can be used as a more capable substitute for Intel’s discontinued IACA tool.