Georg Hager's Blog

2026

Power, energy, and TCO considerations for scientific computing and AI workloads. Invited talk at PPAM 2026, the 16th International Conference on Parallel Processing & Applied Mathematics, Poznań, Poland, August 30-September 2, 2026.
Practical Roofline Modeling by Example. Full-day tutorial at PPAM 2026, the 16th International Conference on Parallel Processing & Applied Mathematics, Poznań, Poland, August 30-September 2, 2026 (with Jan Laukemann).
Node-Level Performance Optimisation. Full-day tutorial at the Durham HPC Days 2026, Durham, UK, June 16, 2026 (with Thomas Gruber).
Do the math! Pen-and-paper HPC for fun and profit. Invited talk at Marvin’s 2nd Birthday: The Community Event at the End of the Universe, University of Bonn, March 19, 2026.
Elements of Performance Engineering and BYOC Hackathon. University of Bonn, March 17-18, 2026.
Annual course Parallel Programming for High-Performance Systems (PPHPS26). Three-day on-site course at NHR@FAU, February 24-26, 2026 (with Alireza Ghasemi, Sebastian Kuckuk, and LRZ staff).
Hybrid Programming in HPC – MPI+X. Three-day hybrid tutorial at High Performance Computing Center Stuttgart (HLRS), Stuttgart, Germany, February 10-12, 2026 (with Tobias Haas [HLRS] and Claudia Blaas-Schenner [TU Wien]).
Parallel Computing: From CPU Core to Supercomputer. Invited talk at Fakultät Mathematik und Informatik, OTH Regensburg, January 14, 2026 (with Alireza Ghasemi).

2025

Co-design in the HPC space with analytic resource models. Invited talk at the 5th International Symposium for the Quantitative Codesign of Supercomputers, SC25, St. Louis, MO, November 17, 2025.
Performance Engineering for Sparse Linear Solvers. Half-day tutorial at SC25, St. Louis, MO, November 16, 2025 (with Christie L. Alappat and Hartwig Anzt [TU München]).
Core-Level Performance Engineering. Half-day tutorial at SC25, St. Louis, MO, November 17, 2025 (with Jan Laukemann).
Patterns, measurements, guesstimates: How to work with energy metrics in HPC and stay sane. Invited talk at the Workshop on Sustainable Scientific Computing, Leiden, The Netherlands, October 27-31, 2025.
Core-Level Performance Engineering. Full-day online tutorial, October 6, 2025 (with Jan Laukemann).
Introduction to the LIKWID Tool Suite. Full-day online tutorial, August 31, 2025 (with Thomas Gruber).
Towards a Workflow for Analytic Performance, Power, and Energy Models. Invited talk at PERMAVOST 2025, the 5th Workshop on Performance EngineeRing, Modelling, Analysis, and VisualizatiOn STrategy, July 20, 2025, Notre Dame, IN, USA (in conjunction with ACM HPDC 2025).
Core-Level Performance Engineering. Half-day tutorial at ISC High Performance 2025, Hamburg, Germany, June 13, 2025 (with Jan Laukemann).
Performance Engineering for Linear Solvers. Half-day tutorial at ISC High Performance 2025, Hamburg, Germany, June 13, 2025 (with Christie L. Alappat, Jonas Thies [TU Delft], and Hartwig Anzt [TU München]).
Introduction to Parallel Programming with MPI. Two-day online course at NHR@FAU, April 9-10, 2025 (with Alireza Ghasemi).
Annual course Parallel Programming for High-Performance Systems (PPHPS25). Three-day on-site course at LRZ Garching, February 18-20, 2025 (with Alireza Ghasemi, Sebastian Kuckuk, and LRZ staff).
Hybrid Programming in HPC – MPI+X. Three-day hybrid tutorial at High Performance Computing Center Stuttgart (HLRS), Stuttgart, Germany, January 21-23, 2025 (with Rolf Rabenseifner [HLRS] and Claudia Blaas-Schenner [TU Wien]).
Node-Level Performance Engineering tutorials 2025

2024

Performance Engineering for Linear Solvers. Half-day tutorial at SC24, Atlanta, GA, November 18, 2024 (with Christie L. Alappat and Hartwig Anzt [TU München]).
Core-Level Performance Engineering. Half-day tutorial at SC24, Atlanta, GA, November 18, 2024 (with Jan Laukemann).
Analytic Performance Modeling for HPC Workloads. Invited talk at the Sino-German Workshop on Multiphysics Device Simulation and Hardware-Aware Computing, Xi’an, China, October 10-16, 2024.
Core-Level Performance Engineering. Full-day online tutorial at NHR@FAU, October 8, 2024.
Core-Level Performance Engineering. Full-day on-site tutorial at PPAM 2024, the 15th International Conference on Parallel Processing and Applied Mathematics, Ostrava, Czech Republic, September 8-11, 2024.
Hardware Evolution from an HPC Point of View. Invited talk at 20 ans du Groupe Calcul, Paris, France, June 3, 2024.
Performance Engineering for Linear Solvers. Half-day tutorial at ISC High Performance 2024, Hamburg, Germany, May 12, 2024 (with Christie L. Alappat, Jonas Thies [TU Delft], and Hartwig Anzt [TU München]).
Introduction to Parallel Programming with MPI. Two-day online course at NHR@FAU, April 11-12, 2024 (with Alireza Ghasemi).
Resources for High Performance Computing at FAU. Talk at the FAU Graduate Centre, March 19, 2024 (with Jan Eitzinger).
Annual course Parallel Programming of High-Performance Systems (PPHPS24). Three-day on-site course at NHR@FAU, February 20-22, 2024 (with Markus Wittmann, Ayesha Afzal, and LRZ staff).
Performance Engineering with Resource-Based Metrics. Invited talk at the Zentralinstitut für Technische Informatik (ZITI), University of Heidelberg, February 5, 2024.
Hybrid Programming in HPC – MPI+X. Three-day online tutorial at High Performance Computing Center Stuttgart (HLRS), Stuttgart, Germany, January 23-25, 2024 (with Rolf Rabenseifner [HLRS] and Claudia Blaas-Schenner [TU Wien]).
Node-Level Performance Engineering tutorials 2024

2023

Performance Modeling and Performance Engineering. Invited lecture series at the AQTIVATE Training Workshop on Exascale Computing and Scalable Algorithms, Stockholm, Sweden, November 27-December 14, 2023.
A. Afzal (G. Hager): Physical Oscillator Model for Supercomputing. Short paper presentation at PMBS23, the 14th Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, Denver, CO, November 13, 2023. Slides
A. Afzal (G. Hager): SPEChpc 2021 Benchmarks on Ice Lake and Sapphire Rapids Infiniband Clusters: A Performance and Energy Case Study. Paper presentation at PMBS23, the 14th Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, Denver, CO, November 13, 2023. Slides
Core-Level Performance Engineering. Full-day on-site tutorial at PACT 2023, the 32nd International Conference on Parallel Architectures and Compilation Techniques, Vienna, Austria, October 21-25, 2023.
Core-Level Performance Engineering. Full-day online tutorial at NHR@FAU, October 12, 2023.
Parallelization and Efficient Programming on High Performance Computers. Five-day block course at the University of Greifswald Computing Center, Greifswald, Germany, September 21-27, 2023.
Resources for High Performance Computing at FAU. Talk at the FAU Graduate Centre, September 14, 2023 (with Jan Eitzinger).
Computer Architecture 101 for Scientists. HPC Café talk at NHR@FAU, June 13, 2023. Slides Video
Core-Level Performance Engineering. Full-day tutorial at ICPE 2023, the 14th ACM/SPEC International Conference on Performance Engineering, Coimbra, Portugal, April 15-19, 2023 (with Jan Laukemann).
Application Knowledge Required: Performance Modeling for Fun and Profit. Keynote at ICPE 2023, the 14th ACM/SPEC International Conference on Performance Engineering, Coimbra, Portugal, April 15-19, 2023.
Annual course Parallel Programming of High-Performance Systems (PPHPS23). Three-day online course, March 7-9, 2023 (with Markus Wittmann, Ayesha Afzal, and LRZ staff).
Performance Engineering in CSE: A Bird’s-Eye View. Talk at the SIAM CSE23 Minisymposium “Performance Engineering and Applications” (MS167), Amsterdam, The Netherlands, March 1, 2023. Slides
Resources for High Performance Computing at FAU. Talk at the FAU Graduate Centre, February 16, 2023 (with Jan Eitzinger).
News from NHR@FAU – Fritz, Alex and Woody. ECAP Seminar, FAU Erlangen-Nürnberg, January 19, 2023 (with Johannes Veh).
Node-Level Performance Engineering tutorials 2023

2022

The National High-Performance Computing Alliance and NHR@FAU: New Structures and Opportunities. Physikalisches Kolloquium, Universität Regensburg, December 19, 2022 (with Gerhard Wellein).
Hybrid Programming in HPC – MPI+X. Three-day online PRACE tutorial at Vienna Scientific Cluster (VSC), TU Wien, Austria, December 12-14, 2022 (with Rolf Rabenseifner [HLRS] and Claudia Blaas-Schenner [TU Wien]).
Resources for High Performance Computing at FAU. Talk at the FAU Graduate Centre, September 22, 2022 (with Jan Eitzinger).
Spontaneous asynchronicity: parallel programs out of lockstep. Invited talk at PPAM 2022, the 14th International Conference on Parallel Processing and Applied Mathematics, Gdansk, Poland, September 11-14, 2022. Slides
Roofline Modeling and Performance Engineering. Invited talk with hands-on exercises at the 2022 CSCS-USI Summer University on Effective High-Performance Computing and Data Analytics, Serpiano, Switzerland, July 23, 2022.
Hybrid Programming in HPC – MPI+X. Three-day online PRACE tutorial at Leibniz Supercomputing Centre (LRZ), Garching, Germany, June 22-24, 2022 (with Rolf Rabenseifner [HLRS] and Claudia Blaas-Schenner [TU Wien]).
NHR Graduate School Course Week 2022. Five-day training event for NHR Graduate School students at the Zuse Institute Berlin (ZIB), June 13-17, 2022 (with Markus Wittmann and ZIB/TU Darmstadt staff).
Hybrid Programming in HPC – MPI+X. Three-day online PRACE tutorial at Vienna Scientific Cluster (VSC), TU Wien, Austria, April 5-7, 2022 (with Rolf Rabenseifner [HLRS] and Claudia Blaas-Schenner [TU Wien]).
Annual course Parallel Programming of High-Performance Systems (PPHPS22). Three-day online course, March 8-10, 2022 (with Markus Wittmann, Ayesha Afzal, and LRZ staff).
Node-Level Performance Engineering tutorials 2022

2021

From numbers to insight via performance models. Online invited talk at the IACS Seminar at Stony Brook University, Stony Brook, NY, October 14, 2021. Video recording
The surprising dynamics of non-lockstep execution. Talk at the 18th ScalPerf Workshop, Bertinoro, Italy, September 19-23, 2021.
Modeling and tuning of SpMV and a lattice QCD kernel on the A64FX. Invited online talk at the online A64FX Symposium, Stony Brook University, Stony Brook, NY, August 12, 2021. Slides
2021 Code Performance Series: From analysis to insight. Online session on “Single-Node optimization,” July 15, 2021 (with Thomas Gruber). Video recording
Introduction to Hybrid Programming in HPC. Three-day online tutorial at Vienna Scientific Cluster (VSC), TU Wien, Austria, June 15-17, 2021 (with Rolf Rabenseifner [HLRS] and Claudia Blaas-Schenner [TU Wien]).
Annual course Parallel Programming of High-Performance Systems (PPHPS21). Three-day online course, April 13-15, 2021 (together with LRZ staff).
A closer look at the Fujitsu A64FX processor. Public talk in the NHR PerfLab seminar, February 23, 2021. Video
Node-Level Performance Engineering tutorials 2021

2020

Parallel Programming with OpenMP and MPI. Online lecture and tutorial at the University of Greifswald, University Computing Center (URZ) and Institute of Physics, winter term 2020/21. YouTube playlist
Der Rechenschieber – Rechnen wie vor 100 Jahren. Night of Science, Universität Frankfurt, June 19, 2020. Video recording (in German)
Introduction to Hybrid Programming in HPC. Three-day online tutorial at Vienna Scientific Cluster (VSC), TU Wien, Austria, June 17-19, 2020 (with Rolf Rabenseifner [HLRS], Irene Reichl, and Claudia Blaas-Schenner [TU Wien]).
Annual course on “Parallel Programming of High Performance Systems”, RRZE, March 9-13, 2020 (together with LRZ staff).
Introduction to Hybrid Programming in HPC. Two-day PRACE tutorial at High Performance Computing Center Stuttgart (HLRS), Stuttgart, Germany, January 27-28, 2020 (with Rolf Rabenseifner [HLRS], Irene Reichl, and Claudia Blaas-Schenner [TU Wien]).
Node-Level Performance Engineering tutorials 2020

2019

EoCoE-II Performance Evaluation Workshop. Four-day workshop at FAU Erlangen, October 7-11, 2019 (with Judit Gimenez, BSC).
Some observations on NEC Aurora Tsubasa 10B – stencils and spMVM. Talk at the NEC Aurora community meeting, ISC 2019, June 16, 2019, Frankfurt, Germany. 2019-06-16_GHa_spMVM_Stencil_Tsubasa.pdf
Von der Wettervorhersage zur Kernwaffe: Supercomputer – was sie sind und was sie können. Night of Science, Universität Frankfurt, June 14, 2019.
Introduction to Hybrid Programming in HPC. Two-day tutorial at Vienna Scientific Cluster (VSC), TU Wien, Austria, June 12-13, 2019 (with Rolf Rabenseifner [HLRS], Irene Reichl, and Claudia Blaas-Schenner [TU Wien]).
Annual course on “Parallel Programming of High Performance Systems”, LRZ Garching, February 25 – March 1, 2019 (together with LRZ staff).
Introduction to Hybrid Programming in HPC. Two-day PRACE tutorial at Leibniz Supercomputing Centre (LRZ), Garching, January 28-29, 2019 (with Rolf Rabenseifner [HLRS], Irene Reichl, and Claudia Blaas-Schenner [TU Wien]).
Node-Level Performance Engineering tutorials 2019

2018

The Execution-Cache-Memory (ECM) Performance Model. Intel Platform Performance Brown Bag Talk, October 25, 2018. Hager_BrownBag_2018.pdf
Making sense of performance numbers. Invited talk at OpenMPCon 2018, Barcelona, Spain, September 24-26, 2018. Hager_OMPCon_2018.pdf
Thirteen modern ways to fool the masses with performance results on parallel computers. GridKa School 2018, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany, August 29, 2018. FTM-GridKa18-c.pdf
Performance Engineering – Why and How? PASC MS05, Basel, Switzerland, July 2-4, 2018. PASC18_MS05_Hager.pdf
Introduction to Hybrid Programming in HPC. One-day PTC short course at HLRS Stuttgart, June 19, 2018 (with Rolf Rabenseifner, HLRS). Details and registration: https://www.hlrs.de/training/2018-06-19-hy-s/
Von der Wettervorhersage zur Kernwaffe: Supercomputer – was sie sind und was sie können. Night of Science, Universität Frankfurt, June 8, 2018.
Introduction to Hybrid Programming in HPC. One-day tutorial at the Technical University of Vienna, Austria, June 6, 2018 (with Rolf Rabenseifner).
Annual course on “Parallel Programming of High Performance Systems”, RRZE, March 12-16, 2018 (together with LRZ staff).
“If it doesn’t work, we learn something.” Instructive case studies from performance engineering. Minisymposium MS29 at SIAM PP18, the 2018 Conference on Parallel Processing, March 8, 2018, Tokyo, Japan. PP18MS29_Hager.pdf
Introduction to Hybrid Programming in HPC. One-day PTC short course at LRZ Garching, January 18, 2018 (with Rolf Rabenseifner, HLRS). Details and registration: https://www.lrz.de/services/compute/courses/2018-01-18_hhyp1w17/
Node-Level Performance Engineering tutorials 2018

2017

Parallelization and Efficient Programming of High Performance Computers. Five-day block course at the Institute of Physics, University of Greifswald, September 25-29, 2017.
The curses and blessings of analytic performance modeling. Invited talk at PPAM 2017, the 12th International Conference on Parallel Processing and Applied Mathematics, Lublin, Poland, September 10-13, 2017. PPAM17_Hager.pdf
MPI+X – Hybrid Programming on Modern Compute Clusters with Multicore Processors and Accelerators. Half-day tutorial together with Rolf Rabenseifner at ISC High Performance 2017, June 18, 2017, Frankfurt, Germany.
MPI+X – Hybrid Programming on Modern Compute Clusters with Multicore Processors and Accelerators. One-day PATC short course at HLRS Stuttgart, June 12, 2017 (with Rolf Rabenseifner).
Supercomputer: Mächtiges Werkzeug und Forschungsobjekt. Night of Science, Universität Frankfurt, June 9, 2017. 2017-06-09_NoS.pdf (in German). Video recording
Thirteen modern ways to fool the masses with performance results on parallel computers. Evening talk at the Course on “Parallel Programming of High Performance Systems 2017”, LRZ Garching, March 6-10, 2017.
Annual course on “Parallel Programming of High Performance Systems”, LRZ Garching, March 6-10, 2017 (together with Markus Wittmann, Volker Weinberg, and others).
Making sense of temporally blocked stencil performance via analytic modeling. Invited talk at the 7th AICS International Symposium, Integrated Research Center of Kobe University, Kobe, Japan, February 23-24, 2017. AICS17_Hager.pdf
MPI+X – Hybrid Programming on Modern Compute Clusters with Multicore Processors and Accelerators. One-day PATC short course at LRZ Garching, January 12, 2017 (with Rolf Rabenseifner).
Node-Level Performance Engineering tutorials 2017

2016

Introduction to Hybrid Programming in HPC. One-day tutorial at the Technical University of Vienna, Austria, November 4, 2016 (with Rolf Rabenseifner).
MPI+X – Hybrid Programming on Modern Compute Clusters with Multicore Processors and Accelerators. Half-day tutorial together with Rolf Rabenseifner at ISC High Performance 2016, June 19, 2016, Frankfurt, Germany.
MPI+X – Hybrid Programming on Modern Compute Clusters with Multicore Processors and Accelerators. One-day short course at HLRS Stuttgart, June 13, 2016 (with Rolf Rabenseifner).
Annual course on “Parallel Programming of High Performance Systems”, RRZE, March 7-11, 2016 (together with Markus Wittmann, Volker Weinberg, and others).
Efficient multicore programming. Lecture series together with G. Wellein at the Ohm University of Applied Sciences, Nuremberg, February 29-March 3, 2016.
Parallel and Efficient Programming. Five-day block course at the Institute of Physics, University of Greifswald, February 8-12, 2016.
Performance Engineering for Algorithmic Building Blocks in GHOST. Talk at the ESSEX Minisymposium at the SPPEXA Symposium 2016, Garching, January 25-27, 2016.
MPI+X – Hybrid Programming on Modern Compute Clusters with Multicore Processors and Accelerators. One-day PATC short course at LRZ Garching, January 14, 2016 (with Rolf Rabenseifner).
Node-Level Performance Engineering tutorials 2016

2015

MPI+X – Hybrid Programming on Modern Compute Clusters with Multicore Processors and Accelerators. Half-day tutorial together with Rolf Rabenseifner at Supercomputing 2015 (SC15), November 15-20, 2015, Austin, TX.
What role does software play in energy efficiency? Panel kick-off talk at the workshop on Energy-Efficient Supercomputing (E2SC 2015) at SC15, Austin, TX, November 15, 2015. E2SC15_Panel_Hager.pdf
Holistic node-level performance engineering for maximum resource efficiency on modern multi-core CPUs. Talk at ParisTech TELECOM, Paris, France, September 7, 2015. Hager_ParisTech.pdf
Analytical Tool-Supported Modeling of Streaming and Stencil Loops. Talk at the Scalable Tools Workshop 2015, Lake Tahoe, CA, August 3-6, 2015. STW_Hager.pdf
Performance engineering via analytical models. Talk at the workshop “Performance Modeling: Methods and Applications” at ISC High Performance 2015, Frankfurt, Germany, July 16, 2015. ISC15_PM_Hager.pdf
MPI+X programming models on future systems – the search for lowest-order effects. Invited talk at the session on “Programming models on the road to exascale,” ISC High Performance 2015, Frankfurt, Germany, July 13, 2015. MPI_plus_X_Hager.pdf
MPI+X – Hybrid Programming on Modern Compute Clusters with Multicore Processors and Accelerators. Half-day tutorial together with Rolf Rabenseifner at ISC High Performance 2015, July 12, 2015, Frankfurt, Germany.
Model-guided performance engineering of numerical kernels. Invited talk at the meeting of the SFB Transregio 55 “Hadron Physics from Lattice QCD,” University of Wuppertal, Germany, July 10, 2015. Hager_Analytic_PM_BUW_15.pdf
White-box modeling for performance and energy: Useful patterns for resource optimization. Invited lecture at PACO 2015, the Workshop on Power-Aware Computing, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany, July 6-7, 2015. PACO-PE.pdf
Quantifying performance bottlenecks of stencil computations using the Execution-Cache-Memory model. Talk at ICS’15, the 29th ACM International Conference on Supercomputing, June 8-11, 2015, Newport Beach, CA. ICS15_Hager.pdf
Insight into stencil performance by analytic modeling. Talk at the Dagstuhl Seminar on Advanced Stencil Code Engineering, April 13-17, 2015, Schloss Dagstuhl, Wadern, Germany. Dagstuhl_Stencils_Hager_2015.pdf
GHOST, Performance Engineering, SpMVM. And 42. Talk at the workshop “Sparse Solvers for Exascale: From Building Blocks to Applications.” Greifswald, Germany, March 23-25, 2015. Hager_Exascale15_Greifswald.pdf
Annual course on “Parallel Programming of High Performance Systems”, LRZ Garching, March 9-13, 2015 (together with Markus Wittmann, Volker Weinberg, and Carla Guillen Carias).
Systematic Node-Level Performance Engineering. Talk at the SPEC DevOps Meeting, February 20, 2015, University of Würzburg, Germany.
Node-Level Performance Engineering tutorials 2015

2014

Node-level performance engineering. Two-day PATC short course at LRZ Garching, December 4-5, 2014 (with Gerhard Wellein).
MPI+X – Hybrid Programming on Modern Compute Clusters with Multicore Processors and Accelerators. Half-day tutorial together with Rolf Rabenseifner at Supercomputing 2014 (SC14), Nov 16-21, 2014, New Orleans, LA.
Node-Level Performance Engineering. Full-day tutorial together with Jan Treibig at Supercomputing 2014 (SC14), Nov 16-21, 2014, New Orleans, LA.
Node-Level Performance Engineering. Two-day tutorial at the second “SPPEXA Doctoral Retreat”, Sarntal, South Tyrol, September 22-26, 2014 (together with Gerhard Wellein).
Node-Level Performance Engineering. Two-day short course at the Summer School for Modern Computational Science (MCS 2014), University of Oldenburg, September 4-5, 2014.
Multicore Architectures. Invited talk at the DIMACS Workshop on Multicore and Cryptography, Stevens Institute of Technology, Hoboken, NJ, July 21-23, 2014. Multicore-Architectures.pdf
Node-Level Performance Engineering. Two-day PATC short course at the High Performance Computing Center Stuttgart (HLRS), July 14-15, 2014 (with Jan Treibig).
Node-Level Performance Engineering. Full-day tutorial at the International Supercomputing Conference (ISC14), Leipzig, Germany, June 22-26, 2014 (with Jan Treibig and Gerhard Wellein).
Node-Level Performance Engineering. Two-day short course at the Swiss National Supercomputing Centre (CSCS), Lugano, Switzerland, May 15-16, 2014.
Basic performance modeling for numerical applications: Roofline and beyond. Lecture at the SPPEXA PhD seminar, University of Erlangen-Nuremberg, April 30, 2014. Roofline_ECM_SPPEXA_PhD_2014.pdf
Node-Level Performance Engineering. Two-day tutorial with exercises at the Dortmund Center for Scientific Computing (DoWiR), TU Dortmund, April 8-9, 2014.
Performance-oriented programming on multicore-based systems, with a focus on the Cray XE6 and XC30. One-day PATC tutorial at the Cray XE6/XC30 optimization workshop, HLRS Stuttgart, March 20, 2014 (together with Jan Treibig). Cray_MC_SS_2014.pdf
Annual course on “Parallel Programming of High Performance Systems”, RRZE, March 10-14, 2014 (together with Markus Wittmann, Jan Treibig, Volker Weinberg, and Carla Guillen Carias).
Sparse Matrix-Vector Multiplication with Wide SIMD Units: Performance Models and a Unified Storage Format. Invited talk at Minisymposium MS53 on “Sparse Computations on Accelerators” at the SIAM Conference on Parallel Processing for Scientific Computing 2014 (PP14), Portland, OR, Feb 18-21, 2014. SELL-C-sigma.pdf
Efficient multicore programming. Lecture series together with G. Wellein at the Ohm University of Applied Sciences, Nuremberg, Feb 25-28, 2014.

2013

Node-level performance engineering. Two-day PATC short course (together with Gerhard Wellein) at LRZ Garching, December 3-4, 2013.
The practitioner’s cookbook for good parallel performance on multi- and manycore systems. Full-day tutorial together with Jan Treibig and Gerhard Wellein at Supercomputing 2013 (SC13), Nov 17-22, 2013, Denver, CO.
Hybrid MPI and OpenMP Parallel Programming. Half-day tutorial together with Rolf Rabenseifner and Gabriele Jost at Supercomputing 2013 (SC13), Nov 17-22, 2013, Denver, CO.
Node-Level Performance Engineering. Full-day tutorial at the “aiXcelerate 2013 HPC tuning workshop” RWTH Aachen, October 8-11, 2013.
Parallel Programming of Multi- and Manycore Systems. Block lecture at the Ohm University of Applied Sciences, Nuremberg, September 23-27, 2013 (together with Gerhard Wellein).
Node-Level Performance Engineering. Full-day tutorial at the first “SPPEXA Doctoral Retreat”, TU Darmstadt, September 16-20, 2013 (together with Gerhard Wellein).
More Science per Joule: Bottleneck Computing. Invited talk at the 10th International Conference on Parallel Processing and Applied Mathematics (PPAM 2013), Warsaw, Poland, September 8-11, 2013. PPAM13_Hager_Invited.pdf
Node-Level Performance Engineering. Half-day tutorial at the 10th International Conference on Parallel Processing and Applied Mathematics (PPAM 2013), Warsaw, Poland, September 8-11, 2013.
Node-Level Performance Engineering. Full-day tutorial at the International Supercomputing Conference (ISC13), Leipzig, Germany, June 16-20, 2013 (with Jan Treibig and Gerhard Wellein).
Performance Engineering on Multicore Platforms. Three-day tutorial at IBM Toronto Lab, Markham, ON, Canada, June 7-11, 2013 (together with Jan Treibig).
Specialist Workshop on Parallel Computing 2013: Advanced Multicore. Two-day tutorial at the University of Ghent and the University of Leuven, Belgium, April 23-24, 2013 (together with Jan Treibig).
Performance-oriented programming on multicore-based systems, with a focus on the Cray XE6. One-day PATC tutorial at the Cray XE6 optimization workshop, HLRS Stuttgart (together with Jan Treibig)
- April 19, 2013 Cray_MC_SS_2013-final.pdf
- October 31, 2013
News about LIKWID. Talk at ZKI AK Supercomputing, University of Paderborn, Parallel Computing Center, March 15, 2013, Paderborn, Germany. Hager_ZKI_Maerz13_LIKWID.pdf
Node-level performance engineering. Two-day short course (together with Gerhard Wellein and Moritz Kreutzer) at DLR Köln, March 13-14, 2013, Cologne, Germany.
Performance and Power Engineering on Multicore Systems. Invited talk at the German Research School for Simulation Sciences, RWTH Aachen University, March 11, 2013, Aachen, Germany. GRS-PE.pdf
Annual course on “Parallel Programming of High Performance Systems”, LRZ Garching, March 4-8, 2013 (together with Markus Wittmann, Jan Treibig, Volker Weinberg, and Carla Guillen Carias).
Efficient multicore programming. Lecture series together with G. Wellein at the Ohm University of Applied Sciences, Nuremberg, Feb 25-27, 2013.

2012

Node-level performance engineering. Two-day PATC short course (together with Gerhard Wellein) at LRZ Garching, December 6-7, 2012.
Performance engineering on multi- and manycores. Half-day tutorial at the 3rd Saudi-Arabian HPC Users Conference (SAHPC 2012) at King Abdullah University of Science and Technology (KAUST), December 1-3, 2012, Thuwal, Saudi Arabia.
Energy efficiency: A down-to-earth perspective. Short talk at the “Cool Supercomputing” BoF, Supercomputing 2012 (SC12), Nov 11-16, 2012, Salt Lake City, UT.
The practitioner’s cookbook for good parallel performance on multi- and manycore systems. Full-day tutorial together with Gerhard Wellein at Supercomputing 2012 (SC12), Nov 11-16, 2012, Salt Lake City, UT.
Hybrid MPI and OpenMP Parallel Programming. Half-day tutorial together with Rolf Rabenseifner and Gabriele Jost at Supercomputing 2012 (SC12), Nov 11-16, 2012, Salt Lake City, UT.
Performance Optimization and Modeling. Block lecture at the Ernst Moritz Arndt University of Greifswald, October 8-12, 2012.
Parallel Programming of Multi- and Manycore Systems. Block lecture at the Ohm University of Applied Sciences, Nuremberg, September 24-28, 2012.
Performance patterns and hardware metrics on modern multicore processors: Best practices for performance engineering. Talk at PROPER 2012, the 5th Workshop on Productivity and Performance, at Euro-Par 2012, Rhodes Island, Greece, August 28, 2012. Hager-PROPER12-paper.pdf
Performance Engineering: From Numbers to Insight. Invited talk at PROPER 2012, the 5th Workshop on Productivity and Performance, at Euro-Par 2012, Rhodes Island, Greece, August 28, 2012. Hager-PROPER12-invited.pdf
Performance Engineering for Multi- and Manycores: Unveiling the Mysteries of Application Performance. Invited session “Application Performance: Lessons Learned From Petascale Computing” at ISC12, Hamburg, Germany, June 18, 2012. Hager-ISC12.pdf
Performance-oriented programming on multicore-based clusters with MPI, OpenMP, and hybrid MPI/OpenMP. Half-day tutorial together with Jan Treibig, Rolf Rabenseifner, and Gabriele Jost at ISC12, Hamburg, Germany, June 17, 2012.
Performance-oriented programming on multicore-based systems. Tutorial at the NUG 2012 Meeting, Potsdam, Germany, June 12, 2012 (together with R. Fischer). RRZE-Multicore.pdf
Specialist Workshop on Parallel Computing 2012: Multithreading and Multiprocessing. Two-day tutorial at the University of Ghent, Belgium, April 19-20, 2012 (together with Jan Treibig).
Performance-oriented programming on multicore-based systems, with a focus on the Cray XE6.
- One-day PATC tutorial at the Cray XE6 optimization workshop, HLRS Stuttgart, April 2-5, 2012 (together with Jan Treibig).
- One-day PATC tutorial at the Cray XE6 optimization workshop, HLRS Stuttgart, November 5-8, 2012 (together with Jan Treibig). Cray_MC_WS_2012-final.pdf
Annual course on “Parallel Programming of High Performance Systems”, RRZE, March 5-9 and 19-22, 2012 (together with Jan Treibig and Reinhold Bader).
Simulating incompressible flows with the lattice-Boltzmann method: Algorithm, implementation, performance.
- Greifswalder Physikalisches Kolloquium, University of Greifswald, Germany, January 5, 2012.
- SIAM Conference on Parallel Processing for Scientific Computing 2012 (PP12) Minisymposium MS14, Savannah, GA, USA, February 15, 2012.

2011

1000 x 0 = 0. Single-node optimisation does matter. Birds-of-a-feather session organized by Bettina Krammer at SC11, Nov 17, 2011, Seattle, WA.
Hybrid MPI and OpenMP Parallel Programming. Supercomputing ’11 tutorial S-01 together with Rolf Rabenseifner and Gabriele Jost. SC11, Nov 13-18, 2011, Seattle, WA.
Teaching High Performance Computing to Scientists and Engineers: A Model-Based Approach. Award talk at the 7th European Computer Science Summit, Politecnico di Milano, Milan, Italy, November 7-9, 2011. IEAward.pdf
Multicore Technology Briefing, ZISC Erlangen, October 13, 2011:
- New chips – new software?
- Parallel code for multicore systems
Final report on KONWIHR project HQS@HPC-II. KONWIHR Results and Review Workshop, LRZ Garching, October 12, 2011.
Parallel Programming of Multi- and Manycore Systems. Block lecture together with G. Wellein at the Ohm University of Applied Sciences, Nuremberg, September 26-30, 2011.
Monitoring, Accounting und Nutzerverwaltung auf den HPC-Systemen des RRZE. Talk at the ZIH Kolloquium, TU Dresden, August 25, 2011. ZIH_110825.pdf
Performance-oriented programming on multicore-based clusters with MPI, OpenMP, and hybrid MPI/OpenMP. Full-day tutorial together with Jan Treibig, Gerhard Wellein, and Gabriele Jost at ISC11, June 19, 2011, Hamburg, Germany.
Prospects for Truly Asynchronous Communication with Pure MPI and Hybrid MPI/OpenMP on Current Supercomputing Platforms. Talk at the Cray User Group Conference 2011, May 23-26, 2011, Fairbanks, AK. Hager-Slides-CUG11.pdf
Parallel sparse matrix-vector multiplication as a test case for hybrid MPI+OpenMP programming. Talk at the 2011 Workshop on Large-Scale Parallel Processing (LSPP 2011), May 20, 2011, Anchorage, AK. Hager-Slides-LSPP11.pdf
Efficient multithreaded programming on modern CPUs and GPUs. Short course at KTH Stockholm, March 14-18, 2011 (together with Gerhard Wellein).
Thirteen modern ways to fool the masses with performance results on parallel computers. Evening talk at the Course on “Parallel Programming of High Performance Systems 2011”, LRZ Garching, March 7-11, 2011.
Annual course on “Parallel Programming of High Performance Systems”, LRZ Garching, March 7-11 and 21-23, 2011 (together with Jan Treibig and Reinhold Bader).
Common sense in high performance computing. Leogang HPC workshop, March 2nd, 2011.
Efficient multicore programming. Lecture series together with G. Wellein at the Ohm University of Applied Sciences, Nuremberg, Feb 21-23, 2011.
Ingredients for Good Parallel Performance on Multicore-Based Systems. PPoPP11 tutorial, Feb 13, 2011, San Antonio, TX.

2010

Ingredients for Good Parallel Performance on Multicore-Based Systems. Supercomputing ’10 tutorial M-16 together with Gerhard Wellein. SC10, Nov 14-19, 2010, New Orleans, LA.
Hybrid MPI and OpenMP Parallel Programming. Supercomputing ’10 tutorial M-02 together with Rolf Rabenseifner and Gabriele Jost. SC10, Nov 14-19, 2010, New Orleans, LA.
C++ für Programmierer. Workshop at LRZ Garching, October 11-15, 2010.
MPI/OpenMP hybrid computing (on modern multicore systems). Invited talk at the 39th SPEEDUP workshop on High-Performance Computing, ETH Zurich, September 6-7, 2010. Hager-Speedup-2010.pdf
Thirteen modern ways to fool the masses with performance results on parallel computers.
- Talk at the 12th Teraflop Workshop, HLRS Stuttgart, March 15-16, 2010. thirteen_ways_tfws_2010.pdf
- Talk at the 6th Erlangen International High End Computing Symposium, RRZE, June 4th, 2010. thirteen-ways-eihecs6.pdf
Annual course on “Parallel Programming of High Performance Systems”, RRZE, March 2010 (together with Jan Treibig, Markus Müller, and Reinhold Bader).
Hybrid applications on modern architectures: Things to consider. Invited talk at the SIAM Conference on Parallel Processing for Scientific Computing (PP10), February 24-26, 2010, Seattle, WA. hager-pp10.pdf
Lecture series “Efficient multi-core programming” together with G. Wellein at the Ohm University of Applied Sciences, Nuremberg, Feb 8-10, 2010.

2009

Hybrid MPI and OpenMP Parallel Programming. Supercomputing ’09 tutorial M-09 together with Rolf Rabenseifner and Gabriele Jost. SC09, Portland, OR.
Wavefront Parallel Temporal Blocking on Multi-Core Processors with Shared Caches. Los Alamos National Laboratory, Performance Architecture Lab (PAL), August 26th, 2009. lanl-pal-2009-08-26.pdf
C++ for C and Fortran programmers. Four-day tutorial at CD-Adapco, Nuremberg, March 16-19, 2009.
Annual course on “Parallel Programming of High Performance Systems”, LRZ Garching, February 2009 (together with Reinhold Bader).
Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes. Talk at the 17th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP 2009), February 18-20, 2009. mpi_openmp_pdp09.pdf
Lecture series “Parallel Computing” together with G. Wellein at Ohm University of Applied Sciences Nürnberg, March 9-13, 2009.
- Part 1 (intro, serial programming): ohn-parallelrechner2009-teil1-2proseite.pdf
- Part 2 (shared memory parallelization): ohn-parallelrechner2009-teil2-2proseite.pdf
- Part 3 (distributed memory parallelization): ohn-parallelrechner2009-teil3-2proseite.pdf

2008

Hybrid MPI and OpenMP Parallel Programming. Supercomputing ’08 tutorial M-09 together with Rolf Rabenseifner, Gabriele Jost, and Rainer Keller. SC08, Austin, TX.
Lecture series “Parallel Computing” together with G. Wellein at Ohm University of Applied Sciences Nürnberg, March 3-6, 2008.
- Part 1 (intro, serial programming): ohn-parallelrechner2008-teil1-2proseite.pdf
- Part 2 (shared memory parallelization): ohn-parallelrechner2008-teil2-2proseite.pdf
- Part 3 (distributed memory parallelization): ohn-parallelrechner2008-teil3-2proseite.pdf
Annual course on “Parallel Programming of High Performance Systems”, RRZE, March 2008 (together with Reinhold Bader).
Effiziente Nutzung von Hochleistungsrechnern in der numerischen Strömungsmechanik. NUMET short course, March 10-13, 2008, LSTM, Universität Erlangen
numet_hager_08.pdf

2007

Hybrid MPI and OpenMP Parallel Programming. Supercomputing ’07 tutorial S-10 together with Rolf Rabenseifner, Gabriele Jost, and Rainer Keller. SC07, Reno, NV.
Lecture series “Parallel Computing” together with G. Wellein at FH Nürnberg, Feb 28th – Mar 2nd, 2007.
- Day 1: fhn2007-parallelrechner-tag1-2proseite.pdf
- Day 2: fhn2007-parallelrechner-tag2-2proseite.pdf
- Day 3: fhn2007-parallelrechner-tag3-2proseite.pdf
High Performance Computing at RRZE. Talk at the Computer Chemistry Center (CCC), Apr 23rd, 2007, Erlangen. ccc_070423.pdf
Are the Killer Micros Still Attacking? Talk at the NEC User Group (NUG) XIX. General Meeting, May 24th, 2007, Cetraro (Italy). nug-07-killermicros.pdf
Cluster OpenMP. Talk at the 1st HLRS Parallel Tools Workshop, July 10th, 2007, HLRS Stuttgart
clomp_hlrs_070710.pdf
Windows Compute Cluster Server 2003 Evaluation. ZKI AK Supercomputing, Oct 25th, 2007, GWDG Göttingen
zki_winccs_07.pdf
Sun UltraSPARC T2 – First Tests. SunDay at RRZE, Nov 6th, 2007.
rrze-n2-ea.pdf
Performance Evaluation of Current HPC Architectures Using Low-Level and Application Benchmarks. HLRB2/KONWIHR Result and Review Workshop, Dec 3rd, 2007, LRZ.
hzsw-hlrb07.pdf

2006

Why is performance productivity poor on modern architectures? Talk with Jan Treibig at the Dagstuhl Seminar on Petacomputing, Feb 13-17, 2006, Dagstuhl
performance_productivity.pdf
Effiziente Nutzung von Hochleistungsrechnern in der numerischen Strömungsmechanik. NUMET short course, March 13-16, 2006, LSTM, Universität Erlangen
numet06_hager.pdf
First Experiences with Cluster OpenMP. Cluster OpenMP workshop, May 19, 2006, HLRS
rrze-clomp_190506.pdf
High Performance Computing: Sequential Code Optimization by Example. Wilhelm and Else Heraeus Summerschool on Computational Many Particle Physics, Sep 18-29, 2006, Greifswald
hgw_prog_serial.pdf
High Performance Computing: Selected Topics in Shared Memory Parallelization. Wilhelm and Else Heraeus Summerschool on Computational Many Particle Physics, Sep 18-29, 2006, Greifswald
hgw_prog_parallel.pdf

2005

Erfahrungen und Benchmarks mit Dual-Core Prozessoren. ZKI AK Supercomputing, Karlsruhe, Sep 22, 2005
zki2_05_dualcore.pdf
Betrieb eines heterogenen Clusters. ZKI AK Supercomputing, Karlsruhe, Sep 23, 2005
zki2_05_cluster.pdf
Benchmarks on Current Dual Core CPUs (and some comments on OpenMP, C++, Tools etc.). Video conference with ZIH Dresden, Oct 10, 2005
vk_201005.pdf

2004

Investigation of Stripe Formation in Hubbard Ladders using Parallel DMRG. KONWIHR result and review workshop, March 2-3, 2004, TU Munich
hqshpc_04.pdf
Application Performance: Altix vs. the Rest. SGI User Group Conference, May 24-27, 2004, Orlando, Florida
hager_sgi04.pdf
Intel VTune für Linux. Video conference with HLRS, July 14, 2004, RRZE
vtune2_04.pdf

2003

Parallelization Strategies for Density Matrix Renormalization Group Algorithms on Shared-Memory Systems. Informal DMRG workshop, May 7-9 2003, RRZE
dmrg03.pdf
Writing Efficient Programs in Fortran, C and C++: Selected Case Studies. Workshop on efficient HPC programming, July 21st 2003, LRZ
cases_03.pdf
Introduction to IA32 and IA64: Architectures, Tools and Libraries. Workshop on Parallel Programming for High Performance Computers, Oct 13-17 2003, RRZE
intel_architectures_03.pdf

2002

Paralleles Rechnen in der Physik. Kolloquium zur Physik-Didaktik, Universität Erlangen, May 7, 2002
phydid-070502.pdf

