Our brand-new tutorial “Core-Level Performance Engineering” has been accepted as a full-day tutorial at ICPE 2023, the 14th ACM/SPEC International Conference on Performance Engineering. This tutorial concentrates on the in-core aspects of performance modeling and analysis on CPUs. We use Matt Godbolt’s Compiler Explorer and our Open-Source Architecture Code Analyzer (OSACA), which is now integrated with the Compiler Explorer, to teach the basics of code execution including pipelining, superscalarity, SIMD, intra-iteration and loop-carried dependencies, and more. Intel/AMD x86 and ARM Neon/SVE assembly code is introduced, and participants can get their hands dirty exploring the depths of machine code execution using only a web browser! Lead OSACA developer Jan Laukemann did most of the work for this exciting new event. Find the details at: https://icpe2023.spec.org/tutorials/tutorial3/.
All slides and some of the exercises are available at: http://tiny.cc/CLPE.

LIKWID 5.2.1 is out! This bugfix release addresses a lot of small and not-so-small issues:

MachineState is a python3 module and CLI application for documenting and comparing settings known to affect application performance: e.g., CPU/Uncore frequencies, hardware prefetchers, memory capacity, but also OS and software settings like NUMA balancing, writeback workqueues, scheduling, or the versions of common tools and libraries (e.g., compilers and MPI). All this information can be essential for reproduction of benchmark results. The MachineState tool gathers all (known) settings and presents them as a JSON document. A state file written earlier can be compared to the current machine state to uncover deviations from the original test system.
