Georg Hager's Blog


2026

A. Ujeniya, J. Eitzinger, G. Hager, and G. Wellein: Architectural Trade-offs in the Energy-Efficient Era: A Comparative Study of power-capping NVIDIA H100 and H200. Submitted.
M. Mayr, S. Wind, L. Schröder, G. Hager, H. Köstler, and G. Wellein: AI Application Benchmarking: Power-Aware Performance Analysis for Vision and Language Models. Submitted. Preprint: arXiv:2603.16164
A. Afzal, G. Hager, and G. Wellein: Wattlytics: A Web Platform for Co-Optimizing Performance, Energy, and TCO in HPC Clusters. Submitted. Preprint: arXiv:2604.08182
J. Laukemann, G. Hager, and G. Wellein: Microarchitectural comparison, in-core modeling, and memory hierarchy analysis of state-of-the-art CPUs: Grace, Sapphire Rapids, and Genoa. Parallel Computing 127 (2026), 103183, DOI: 10.1016/j.parco.2026.103183.
A. Afzal, G. Hager, and G. Wellein: Exploring metrics for analyzing dynamic behavior in MPI programs via a coupled-oscillator model. Parallel Computing 127 (2026), 103184, DOI: 10.1016/j.parco.2026.103184. Preprint: arXiv:2506.02792
R. Ravedutti, J. Eitzinger, G. Hager, and G. Wellein: On the Challenges of Energy-Efficiency Analysis in HPC Systems: Evaluating Synthetic Benchmarks and Gromacs. In Proceedings of the Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region Workshops (SCA/HPCAsiaWS ’26). Association for Computing Machinery, New York, NY, USA, 50–58. DOI: 10.1145/3784828.3785158. Preprint: arXiv:2512.03697

2025

A. Afzal, A. Kahler, G. Hager, and G. Wellein: GROMACS Unplugged: How Power Capping and Frequency Shapes Performance on GPUs. Submitted. Preprint: arXiv:2510.06902
D. C. Lacey, C. L. Alappat, F. Lange, G. Hager, H. Fehske, and G. Wellein: Cache Blocking of Distributed-Memory Parallel Matrix Power Kernels. The International Journal of High Performance Computing Applications (IJHPCA), 2025;0(0). DOI: 10.1177/10943420251319332. Available with Open Access.
E. Suarez, H. Bockelmann, N. Eicker, J. Eitzinger, S. El Sayed, T. Fieseler, M. Frank, P. Frech, P. Giesselmann, D. Hackenberg, G. Hager, A. Herten, T. Ilsche, B. Koller, E. Laure, C. Manzano, S. Oeste, M. Ott, K. Reuter, R. Schneider, K. Thust, B. v. St. Vieth: Energy-aware operation of HPC systems in Germany. Frontiers in High Performance Computing (2025). DOI: 10.3389/fhpcp.2025.1520207. Available with Open Access.
A. Afzal, G. Hager, and G. Wellein: Analytic Roofline Modeling and Energy Analysis of the LULESH Proxy Application on Multi-Core Clusters. The International Journal of High Performance Computing Applications. 2025;0(0). DOI: 10.1177/10943420251363711. Preprint: arXiv:2412.08792

2024

J. Laukemann, G. Hager, and G. Wellein: Microarchitectural comparison and in-core modeling of state-of-the-art CPUs: Grace, Sapphire Rapids, and Genoa. Proc. 15th IEEE International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS 2024), Atlanta, GA, USA, November 18, 2024, DOI: 10.1109/SCW63240.2024.00181. Preprint: arXiv:2409.08108
J. Laukemann, T. Gruber, G. Hager, D. Oryspayev, and G. Wellein: CloverLeaf on Intel Multi-Core CPUs: A Case Study in Write-Allocate Evasion. In 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS), San Francisco, CA, USA, 2024 pp. 350-360. DOI: 10.1109/IPDPS57955.2024.00038. Preprint: arXiv:2311.04797
C. Alappat, J. Thies, G. Hager, H. Fehske, and G. Wellein: Algebraic Temporal Blocking for Sparse Iterative Solvers on Multi-Core CPUs. The International Journal of High Performance Computing Applications, 2024;0(0). DOI: 10.1177/10943420241283828. Preprint: arXiv:2309.02228

2023

A. Afzal, G. Hager, and G. Wellein: Physical Oscillator Model for Supercomputing. Proc. 14th IEEE/ACM Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS23), Denver, CO, USA. PMBS23 Best Short Paper Award. Available with Open Access. DOI: 10.1145/3624062.3625535, Preprint: arXiv:2310.05701
A. Afzal, G. Hager, and G. Wellein: SPEChpc 2021 Benchmarks on Ice Lake and Sapphire Rapids Infiniband Clusters: A Performance and Energy Case Study. Proc. 14th IEEE/ACM Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS23), Denver, CO, USA. Available with Open Access. DOI: 10.1145/3624062.3624197, Preprint: arXiv:2309.05373
A. Alvermann, G. Hager, and H. Fehske: Orthogonal layers of parallelism in large-scale eigenvalue computations. ACM Transactions on Parallel Computing 10(3), Article 16 (September 2023), pp 1-31. DOI: 10.1145/3614444. Preprint: arXiv:2209.01974
R. Ravedutti Lucio Machado, J. Eitzinger, J. Laukemann, G. Hager, H. Köstler, and G. Wellein: MD-Bench: Engineering the in-core performance of short-range molecular dynamics kernels from state-of-the-art simulation packages. Future Generation Computer Systems (2023), ISSN 0167-739X, DOI: 10.1016/j.future.2023.06.023. Preprint: arXiv:2302.14660
A. Afzal, G. Hager, S. Markidis, and G. Wellein: Making Applications Faster by Asynchronous Execution: Slowing Down Processes or Relaxing MPI Collectives. Future Generation Computer Systems (2023), ISSN 0167-739X, DOI: 10.1016/j.future.2023.06.017. Preprint: arXiv:2302.12164
C. L. Alappat, G. Hager, O. Schenk, and G. Wellein: Level-based Blocking for Sparse Matrices: Sparse Matrix-Power-Vector Multiplication. IEEE Transactions on Parallel and Distributed Systems 34(2), 581-597 (2023), DOI: 10.1109/TPDS.2022.3223512. Preprint: arXiv:2205.01598
A. Afzal, G. Hager, and G. Wellein: The Role of Idle Waves, Desynchronization, and Bottleneck Evasion in the Performance of Parallel Programs. IEEE Transactions on Parallel and Distributed Systems 34(2), 623-638 (2023), DOI: 10.1109/TPDS.2022.3221085. 2023 Best Paper Runner-up in IEEE TPDS. Preprint: arXiv:2205.04190
D. Ernst, M. Holzer, G. Hager, M. Knorr, and G. Wellein: Analytical Performance Estimation during Code Generation on Modern GPUs. Journal of Parallel and Distributed Computing 173, 152-167 (2023). DOI: 10.1016/j.jpdc.2022.11.003, Preprint: arXiv:2204.14242
A. Afzal, G. Hager, G. Wellein, and S. Markidis: Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications. In: Wyrzykowski, R., Dongarra, J., Deelman, E., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2022. Lecture Notes in Computer Science, vol 13826. Springer, Cham. Available with Open Access. DOI: 10.1007/978-3-031-30442-2_12, Preprint: arXiv:2205.13963

2022

A. Afzal, G. Hager, and G. Wellein: Analytic performance model for parallel overlapping memory-bound kernels. Concurrency and Computation: Practice and Experience (January 2022). Available with Open Access. DOI: 10.1002/cpe.6816, Preprint: arXiv:2011.00243
A. Afzal, G. Wellein, and G. Hager: Addressing White-box Modeling and Simulation Challenges in Parallel Computing. In: SIGSIM-PADS ’22: Proceedings of the 2022 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, pp. 25-26, June 2022. DOI: 10.1145/3518997.3534986

2021

D. Ernst, G. Hager, M. Knorr, G. Wellein, and M. Holzer: Opening the Black Box: Performance Estimation during Code Generation for GPUs. Accepted for SBAC-PAD 2021, in 2021 IEEE 33rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), Belo Horizonte, Brazil, Oct 26-29, 2021, pp. 22-32. DOI: 10.1109/SBAC-PAD53543.2021.00014, Preprint: arXiv:2107.01143.
C. L. Alappat, N. Meyer, J. Laukemann, T. Gruber, G. Hager, G. Wellein, and T. Wettig: ECM modeling and performance tuning of SpMV and Lattice QCD on A64FX. Concurrency and Computation: Practice and Experience, e6512 (2021). Available with Open Access. DOI: 10.1002/cpe.6512, Preprint: arXiv:2103.03013
A. Afzal, G. Hager, and G. Wellein: Analytic Modeling of Idle Waves in Parallel Programs: Communication, Cluster Topology, and Noise Impact. Proc. ISC High Performance 2021 Digital, June 24 – July 2, 2021, Frankfurt, Germany. DOI: 10.1007/978-3-030-78713-4_19, Preprint: arXiv:2103.03175
C. L. Alappat, J. Seiferth, G. Hager, M. Korch, T. Rauber, and G. Wellein: YaskSite – Stencil Optimization Techniques Applied to Explicit ODE Methods on Modern Architectures. 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), Seoul, Korea (South), 2021 pp. 174-186. DOI: 10.1109/CGO51591.2021.9370316, Preprint: cgo21main-p18-p-aeebf45-49058-preprint.pdf

2020

C. L. Alappat, J. Laukemann, T. Gruber, G. Hager, G. Wellein, N. Meyer, and T. Wettig: Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX. 2020 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), GA, USA, 2020, pp. 1-7. PMBS20 Best Short Paper Award. DOI: 10.1109/PMBS51919.2020.00006, Preprint: arXiv:2009.13903
A. Pieper, G. Hager, and H. Fehske: A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials. The International Journal of High Performance Computing Applications, (September 2020). DOI: 10.1177/1094342020959423. Preprint: arXiv:1708.09689
C. L. Alappat, A. Alvermann, A. Basermann, H. Fehske, Y. Futamura, M. Galgon, G. Hager, S. Huber, A. Imakura, M. Kawai, M. Kreutzer, B. Lang, K. Nakajima, M. Röhrig-Zöllner, T. Sakurai, F. Shahzad, J. Thies, and G. Wellein: ESSEX: Equipping Sparse Solvers For Exascale. In: Bungartz HJ., Reiz S., Uekermann B., Neumann P., Nagel W. (eds) Software for Exascale Computing – SPPEXA 2016-2019. Lecture Notes in Computational Science and Engineering 136, 143-187 (2020). Springer, Cham. Available with Open Access. DOI: 10.1007/978-3-030-47956-5_7
J. Hofmann, C. L. Alappat, G. Hager, D. Fey, and G. Wellein: Bridging the Architecture Gap: Abstracting Performance-Relevant Properties of Modern Server Processors. Supercomputing Frontiers and Innovations 7(2), 54-78, July 2020. Available with Open Access. DOI: 10.14529/jsfi200204.
C. L. Alappat, G. Hager, O. Schenk, J. Thies, A. Basermann, A. R. Bishop, H. Fehske, and G. Wellein: A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication. ACM Trans. Parallel Comput. 7(3), Article 19 (June 2020), 37 pages. Available with Open Access. DOI: 10.1145/3399732.
F. Cremonesi, G. Hager, G. Wellein, and F. Schürmann: Analytic Performance Modeling and Analysis of Detailed Neuron Simulations. International Journal of High Performance Computing Applications, (April 2020). Available with Open Access. DOI: 10.1177/1094342020912528. Preprint: arXiv:1901.05344
D. Ernst, G. Hager, J. Thies, and G. Wellein: Performance Engineering for Real and Complex Tall & Skinny Matrix Multiplication Kernels on GPUs. The International Journal of High Performance Computing Applications, (October 2020). Available with Open Access. DOI: 10.1177/1094342020965661. Preprint: arXiv:1905.03136v2
A. Afzal, G. Hager, and G. Wellein: Desynchronization and Wave Pattern Formation in MPI-Parallel and Hybrid Memory-Bound Programs. In: P. Sadayappan, B. Chamberlain, G. Juckeland, H. Ltaief (eds): High Performance Computing. ISC High Performance 2020. Lecture Notes in Computer Science, vol 12151. Springer, Cham. Available with Open Access. DOI: 10.1007/978-3-030-50743-5_20
C. L. Alappat, J. Hofmann, G. Hager, H. Fehske, A. R. Bishop, and G. Wellein: Understanding HPC Benchmark Performance on Intel Broadwell and Cascade Lake Processors. In: P. Sadayappan, B. Chamberlain, G. Juckeland, H. Ltaief (eds): High Performance Computing. ISC High Performance 2020. Lecture Notes in Computer Science, vol 12151. Springer, Cham. Available with Open Access. DOI: 10.1007/978-3-030-50743-5_21
J. Thies, M. Röhrig-Zöllner, N. Overmars, A. Basermann, D. Ernst, G. Hager, and G. Wellein: PHIST: a Pipelined, Hybrid-parallel Iterative Solver Toolkit. ACM Transactions on Mathematical Software 46(4), Article 31 (October 2020). DOI: 10.1145/3402227. Preprint: https://elib.dlr.de/123323/

2019

J. Laukemann, J. Hammer, G. Hager, and G. Wellein: Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels. 2019 IEEE/ACM Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), Denver, CO, USA, 2019, pp. 1-6, DOI: 10.1109/PMBS49563.2019.00006. PMBS19 Best Late-Breaking Paper Award. Preprint: arXiv:1910.00214
A. Afzal, G. Hager, and G. Wellein: Propagation and Decay of Injected One-Off Delays on Clusters: A Case Study. Proc. 2019 IEEE International Conference on Cluster Computing (CLUSTER), Albuquerque, NM, September 23-26, 2019. DOI: 10.1109/CLUSTER.2019.8890995, Preprint: arXiv:1905.10603
J. Hornich, J. Hammer, G. Hager, T. Gruber, and G. Wellein: Collecting and Presenting Reproducible Intranode Stencil Performance: INSPECT. Supercomputing Frontiers and Innovations 6(3), 4-25 (2019). ISSN 2313-8734, DOI: 10.14529/jsfi190301
D. Ernst, G. Hager, J. Thies, and G. Wellein: Performance Engineering for a Tall & Skinny Matrix Multiplication Kernel on GPUs. In: Wyrzykowski R., Deelman E., Dongarra J., Karczewski K. (eds) Parallel Processing and Applied Mathematics. PPAM 2019. Lecture Notes in Computer Science, vol 12043. Springer, Cham. PPAM 2019 Best Paper Award. DOI: 10.1007/978-3-030-43229-4_43, Preprint: arXiv:1905.03136v1
A. Alvermann, A. Basermann, H.-J. Bungartz, C. Carbogno, D. Ernst, H. Fehske, Y. Futamura, M. Galgon, G. Hager, S. Huber, T. Huckle, A. Ida, A. Imakura, M. Kawai, S. Köcher, M. Kreutzer, P. Kus, B. Lang, H. Lederer, V. Manin, A. Marek, K. Nakajima, L. Nemec, K. Reuter, M. Rippl, M. Röhrig-Zöllner, T. Sakurai, M. Scheffler, C. Scheurer, F. Shahzad, D. Simoes Brambila, J. Thies, and G. Wellein: Benefits from using mixed precision computations in the ELPA-AEO and ESSEX-II eigensolver projects. Proc. EPASA 2018, Japan Journal of Industrial and Applied Mathematics, 36(2), 699-717, DOI: 10.1007/s13160-019-00360-8. Preprint: arXiv:1806.01036.
F. Shahzad, J. Thies, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: CRAFT: A library for easier application-level checkpoint/restart and automatic fault tolerance. IEEE Transactions on Parallel and Distributed Systems 30(3), 501-514 (2019). DOI: 10.1109/TPDS.2018.2866794, Preprint: arXiv:1708.02030

2018

G. Hager and G. Wellein: Performance Engineering. Informatik Spektrum, ISSN 1432-122X, Online first, DOI: 10.1007/s00287-018-1122-1. (in German)
J. Laukemann, J. Hammer, J. Hofmann, G. Hager, and G. Wellein: Automated Instruction Stream Throughput Prediction for Intel and AMD Microarchitectures. 2018 IEEE/ACM Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), Dallas, TX, USA, 2018, pp. 121-131. DOI: 10.1109/PMBS.2018.8641578. Preprint: arXiv:1809.00912
M. Wittmann, G. Hager, R. Janalík, M. Lanser, A. Klawonn, O. Rheinbach, O. Schenk, and G. Wellein: Multicore Performance Engineering of Sparse Triangular Solves Using a Modified Roofline Model. Proc. 2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), September 24-27, 2018, Lyon, France, 233-241. DOI: 10.1109/CAHPC.2018.8645938
J. Hofmann, G. Hager, and D. Fey: On the accuracy and usefulness of analytic energy models for contemporary multicore processors. In: R. Yokota, M. Weiland, D. Keyes, and C. Trinitis (eds.): High Performance Computing: 33rd International Conference, ISC High Performance 2018, Frankfurt, Germany, June 24-28, 2018, Proceedings, Springer, Cham, LNCS 10876, ISBN 978-3-319-92040-5 (2018), 22-43. DOI: 10.1007/978-3-319-92040-5_2, Preprint: arXiv:1803.01618. Winner of the ISC 2018 Gauss Award.
M. Kreutzer, G. Hager, D. Ernst, H. Fehske, A.R. Bishop, and G. Wellein: Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs. In: R. Yokota, M. Weiland, D. Keyes, and C. Trinitis (eds.): High Performance Computing: 33rd International Conference, ISC High Performance 2018, Frankfurt, Germany, June 24-28, 2018, Proceedings, Springer, Cham, LNCS 10876, ISBN 978-3-319-92040-5 (2018), 329-349. DOI: 10.1007/978-3-319-92040-5_17. ISC 2018 Hans Meuer Award Finalist.
J. Hornich, G. Hager, and C. Pflaum: Efficient optical simulation of nano structures in thin-film solar cells. Proc. SPIE 10694, Computational Optics II, 106940R (28 May 2018); DOI: 10.1117/12.2312545

2017

M. Galgon, L. Krämer, B. Lang, A. Alvermann, H. Fehske, A. Pieper, G. Hager, M. Kreutzer, F. Shahzad, G. Wellein, A. Basermann, M. Röhrig-Zöllner, and J. Thies: Improved coefficients for polynomial filtering in ESSEX. In T. Sakurai, S.-L. Zhang, T. Imamura, Y. Yamamoto, Y. Kuramashi, and T. Hoshi (eds.), Eigenvalue Problems: Algorithms, Software and Applications, in Petascale Computing. Proc. EPASA 2015, Tsukuba, Japan, September 2015, volume 117 of LNCSE, pages 63-79. Springer International Publishing, 2017. DOI: 10.1007/978-3-319-62426-6_5
T. M. Malas, G. Hager, H. Ltaief, and D. E. Keyes: Multi-dimensional intra-tile parallelization for memory-starved stencil computations. ACM Transactions on Parallel Computing 4(3), 12:1-12:32 (2017). DOI: 10.1145/3155290, Preprint: arXiv:1510.04995
T. Röhl, J. Eitzinger, G. Hager, and G. Wellein: LIKWID Monitoring Stack: A flexible framework enabling job specific performance monitoring for the masses. Accepted for the HPCMASPA 2017, the Workshop on Monitoring and Analysis for High Performance Computing Systems Plus Applications, held in conjunction with IEEE Cluster 2017, Honolulu, HI, September 5, 2017. DOI: 10.1109/CLUSTER.2017.115. Preprint: arXiv:1708.01476
J. Hofmann, G. Hager, G. Wellein, and D. Fey: An analysis of core- and chip-level architectural features in four generations of Intel server processors. In: J. Kunkel et al. (eds.), High Performance Computing: 32nd International Conference, ISC High Performance 2017, Frankfurt, Germany, June 18-22, 2017, Proceedings, Springer, Cham, LNCS 10266, ISBN 978-3-319-58667-0 (2017), 294-314. DOI: 10.1007/978-3-319-58667-0_16. Preprint: arXiv:1702.07554
J. Hammer, J. Eitzinger, G. Hager, and G. Wellein: Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels. In: Niethammer C., Gracia J., Hilbrich T., Knüpfer A., Resch M., Nagel W. (eds), Tools for High Performance Computing 2016, ISBN 978-3-319-56702-0, 1-22 (2017). Proceedings of IPTW 2016, the 10th International Parallel Tools Workshop, October 4-5, 2016, Stuttgart, Germany. Springer, Cham. DOI: 10.1007/978-3-319-56702-0_1, Preprint: arXiv:1702.04653

2016

T. Röhl, J. Eitzinger, G. Hager, and G. Wellein: Validation of Hardware Events for Successful Performance Pattern Identification in High Performance Computing. In: A. Knüpfer et al. (eds.), Tools for High Performance Computing 2015, Springer International Publishing, ISBN 978-3-319-39589-0 (2016), 17-28. DOI: 10.1007/978-3-319-39589-0_2. Preprint: arXiv:1710.04094
F. Shahzad, M. Kreutzer, T. Zeiser, R. Machado, A. Pieper, G. Hager, and G. Wellein: Building and utilizing fault tolerance support tools for the GASPI applications. International Journal of High Performance Computing Applications (2016). First published date: November-28-2016, DOI: 10.1177/1094342016677085. Preprint (post-review): ft-gaspi-ijhpca.pdf
M. Kreutzer, J. Thies, M. Röhrig-Zöllner, A. Pieper, F. Shahzad, M. Galgon, A. Basermann, H. Fehske, G. Hager, and G. Wellein: GHOST: Building blocks for high performance sparse linear algebra on heterogeneous systems. International Journal of Parallel Programming (2016). DOI: 10.1007/s10766-016-0464-z. Preprint: arXiv:1507.08101
A. Pieper, M. Kreutzer, A. Alvermann, M. Galgon, H. Fehske, G. Hager, B. Lang, and G. Wellein: High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations. Journal of Computational Physics 325, 226-243 (2016). DOI: 10.1016/j.jcp.2016.08.027, Preprint: arXiv:1510.04895
J. Hofmann, D. Fey, J. Eitzinger, G. Hager, and G. Wellein: Analysis of Intel’s Haswell Microarchitecture Using the ECM Model and Microbenchmarks. Proc. Architecture of Computing Systems — ARCS 2016, Volume 9637 of the series Lecture Notes in Computer Science, 210-222 (2016). DOI: 10.1007/978-3-319-30695-7_16
J. Hofmann, D. Fey, M. Riedmann, J. Eitzinger, G. Hager, and G. Wellein: Performance analysis of the Kahan-enhanced scalar product on current multi- and manycore processors. Concurrency & Computation: Practice & Experience 29(9), e3921 (2016). Available online, DOI: 10.1002/cpe.3921. Preprint: arXiv:1604.01890
M. Wittmann, T. Zeiser, G. Hager, and G. Wellein: Modeling and analyzing performance for highly optimized propagation steps of the lattice Boltzmann method on sparse lattices. Submitted. Preprint: arXiv:1410.0412
T. M. Malas, J. Hornich, G. Hager, H. Ltaief, C. Pflaum, and D. E. Keyes: Optimization of an electromagnetics code with multicore wavefront diamond blocking and multi-dimensional intra-tile parallelization. Proc. IPDPS16, the 30th IEEE International Parallel & Distributed Processing Symposium, May 23-27, 2016, Chicago, IL. DOI: 10.1109/IPDPS.2016.87. Preprint: arXiv:1510.05218
J. Thies, M. Galgon, F. Shahzad, A. Alvermann, M. Kreutzer, A. Pieper, M. Röhrig-Zöllner, A. Basermann, H. Fehske, G. Hager, B. Lang, and G. Wellein: Towards an Exascale Enabled Sparse Solver Repository. In: Software for Exascale Computing – SPPEXA 2013-2015, Volume 113 of the series Lecture Notes in Computational Science and Engineering, 295-316 (2016). DOI: 10.1007/978-3-319-40528-5_13. Preprint: lncs_CWPs-4.pdf
M. Kreutzer, J. Thies, A. Pieper, A. Alvermann, M. Galgon, M. Röhrig-Zöllner, F. Shahzad, A. Basermann, A. R. Bishop, H. Fehske, G. Hager, B. Lang, and G. Wellein: Performance Engineering and Energy Efficiency of Building Blocks for Large, Sparse Eigenvalue Computations on Heterogeneous Supercomputers. In: Software for Exascale Computing – SPPEXA 2013-2015, Volume 113 of the series Lecture Notes in Computational Science and Engineering, 317-338 (2016). DOI: 10.1007/978-3-319-40528-5_14

2015

J. Hammer, G. Hager, J. Eitzinger, and G. Wellein: Automatic Loop Kernel Analysis and Performance Modeling With Kerncraft. Proc. PMBS15, the 6th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, in conjunction with ACM/IEEE Supercomputing 2015 (SC15), November 16, 2015, Austin, TX. DOI: 10.1145/2832087.2832092, Preprint: arXiv:1509.03778
J. Hofmann, D. Fey, J. Eitzinger, G. Hager, and G. Wellein: Performance analysis of the Kahan-enhanced scalar product on current multicore processors. In: R. Wyrzykowski et al. (eds.), Parallel Processing and Applied Mathematics: 11th International Conference, PPAM 2015, Krakow, Poland, September 6-9, 2015. Revised Selected Papers, Part I. LNCS vol. 9573, 63-73 (2016). DOI: 10.1007/978-3-319-32149-3_7, Preprint: arXiv:1505.02586
F. Shahzad, M. Kreutzer, T. Zeiser, R. Machado, A. Pieper, G. Hager, G. Wellein: Building a fault tolerant application using the GASPI communication layer. Proc. FTS 2015, the 1st International Workshop on Fault-Tolerant Systems, in conjunction with IEEE Cluster 2015, September 8, 2015, Chicago, IL. DOI: 10.1109/CLUSTER.2015.106, Preprint: arXiv:1505.04628
T. M. Malas, G. Hager, H. Ltaief, H. Stengel, G. Wellein, and D. E. Keyes: Multicore-optimized wavefront diamond blocking for optimizing stencil updates. SIAM Journal on Scientific Computing 37(4), C439-C464 (2015). DOI: 10.1137/140991133, Preprint: arXiv:1410.3060
M. Röhrig-Zöllner, J. Thies, M. Kreutzer, A. Alvermann, A. Pieper, A. Basermann, G. Hager, G. Wellein, and H. Fehske: Increasing the performance of the Jacobi-Davidson method by blocking. SIAM Journal on Scientific Computing, 37(6), C697–C722 (2015). DOI: 10.1137/140976017, Preprint: http://elib.dlr.de/89980/
H. Stengel, J. Treibig, G. Hager, and G. Wellein: Quantifying performance bottlenecks of stencil computations using the Execution-Cache-Memory model. Proc. ICS15, the 29th International Conference on Supercomputing, June 8-11, 2015, Newport Beach, CA. DOI: 10.1145/2751205.2751240. Preprint: arXiv:1410.5010
H. Fehske, G. Hager, and A. Pieper: Electron confinement in graphene with gate-defined quantum dots. Phys. Status Solidi B, 252: 1868–1871 (2015). DOI: 10.1002/pssb.201552119. Preprint: arXiv:1503.05815
M. Wittmann, G. Hager, T. Zeiser, J. Treibig, and G. Wellein: Chip-level and multi-node analysis of energy-optimized lattice-Boltzmann CFD simulations. Concurrency and Computation: Practice and Experience 28(7), 2295-2315 (2015). DOI: 10.1002/cpe.3489, Preprint: arXiv:1304.7664
M. Kreutzer, G. Hager, G. Wellein, A. Pieper, A. Alvermann, and H. Fehske: Performance Engineering of the Kernel Polynomial Method on Large-Scale CPU-GPU Systems. Proc. IPDPS15, the 29th IEEE International Parallel & Distributed Processing Symposium, May 25-29, 2015, Hyderabad, India. DOI: 10.1109/IPDPS.2015.76, Preprint: arXiv:1410.5242

2014

T. Röhl, J. Treibig, G. Hager, and G. Wellein: Overhead Analysis of Performance Counter Measurements. In: Proc. PSTI 2014, the Fifth International Workshop on Parallel Software Tools and Tool Infrastructures, Sept 11, 2014, Minneapolis, MN. DOI: 10.1109/ICPPW.2014.34
T. M. Malas, G. Hager, H. Ltaief, and D. E. Keyes: Towards energy efficiency and maximum computational intensity for stencil algorithms using wavefront diamond temporal blocking. Preprint: arXiv:1410.5561
A. Alvermann, A. Basermann, H. Fehske, M. Galgon, G. Hager, M. Kreutzer, L. Krämer, B. Lang, A. Pieper, M. Röhrig-Zöllner, F. Shahzad, J. Thies, and G. Wellein: ESSEX: Equipping Sparse Solvers for Exascale. In: L. Lopes et al. (Eds.): Euro-Par 2014 Workshops, Part II, LNCS 8806, 577-588 (2014). DOI: 10.1007/978-3-319-14313-2_49. Preprint
M. Kreutzer, G. Hager, G. Wellein, H. Fehske, and A. R. Bishop: A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units. SIAM Journal on Scientific Computing 36(5), C401–C423 (2014). DOI: 10.1137/130930352, Preprint: arXiv:1307.6209
J. Hofmann, J. Treibig, G. Hager, and G. Wellein: Comparing the Performance of Different x86 SIMD Instruction Sets for a Medical Imaging Application on Modern Multi- and Manycore Chips. Accepted for WPMVP 2014, the Workshop on Programming Models for SIMD/Vector Processing at PPoPP 2014, Orlando, FL, Feb 16, 2014. DOI: 10.1145/2568058.2568068, Preprint: arXiv:1401.7494
J. Hofmann, J. Treibig, G. Hager, and G. Wellein: Performance Engineering for a Medical Imaging Application on the Intel Xeon Phi Accelerator. Accepted for PASA 2014, the 11th Workshop on Parallel Systems and Algorithms, Lübeck, Germany, Feb 25-26, 2014. IEEE Archive, Preprint: arXiv:1401.3615
S. Kronawitter, H. Stengel, G. Hager, and C. Lengauer: Domain-Specific Optimization of Two Jacobi Smoother Kernels and Their Evaluation in the ECM Performance Model. Parallel Processing Letters 24, 1441004 (2014). DOI: 10.1142/S0129626414410047
G. Hager, J. Treibig, J. Habich, and G. Wellein: Exploring performance and power properties of modern multicore chips via simple machine models. Concurrency and Computation: Practice and Experience 28(2), 189-210 (2016). First published online December 2013, DOI: 10.1002/cpe.3180, Preprint: arXiv:1208.2908

2013

M. Wittmann, T. Zeiser, G. Hager, and G. Wellein: Domain decomposition and locality optimization for large-scale lattice Boltzmann simulations. Computers & Fluids 80 (2013), 283-289. DOI: 10.1016/j.compfluid.2012.02.007. Preprint: arXiv:1111.1129
M. Wittmann, G. Hager, G. Wellein, T. Zeiser, and B. Krammer: MPC and Coarray Fortran: Alternatives to Classic MPI Implementations on the Examples of Scalable Lattice Boltzmann Flow Solvers. In: W. E. Nagel et al. (eds.), High Performance Computing in Science and Engineering ’12, Springer, ISBN 978-3-642-33373-6 (2013) 367-372. DOI: 10.1007/978-3-642-33374-3_27
C. Scheit, G. Hager, J. Treibig, S. Becker, and G. Wellein: Optimization of FASTEST-3D for Modern Multicore Systems. Submitted. Preprint: arXiv:1303.4538
T. Scharpff, K. Iglberger, G. Hager, and U. Rüde: Model-guided Performance Analysis of the Sparse Matrix-Matrix Multiplication. Proc. 2013 International Conference on High Performance Computing & Simulation (HPCS 2013), July 1-5, 2013, Helsinki, Finland. DOI: 10.1109/HPCSim.2013.6641452, Preprint: arXiv:1303.1651
M. Wittmann, G. Hager, T. Zeiser, and G. Wellein: Asynchronous MPI for the Masses. Submitted. Preprint: arXiv:1302.4280
F. Shahzad, M. Wittmann, T. Zeiser, G. Hager, and G. Wellein: An Evaluation of Different IO Techniques for Checkpoint/Restart. Workshop on Large-Scale Parallel Processing 2013 (LSPP13). DOI: 10.1109/IPDPSW.2013.145, Preprint: asyn_ckpt_130115.pdf
F. Shahzad, M. Wittmann, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: A survey of checkpoint/restart techniques on distributed memory systems. Parallel Processing Letters 23(04), 1340011-1340030 (2013). DOI: 10.1142/S0129626413400112
F. Shahzad, M. Wittmann, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: PGAS implementation of SpMVM and LBM with GPI. Proceedings of the 7th International Conference on PGAS Programming Models, Oct. 3-4, 2013, Edinburgh, Scotland, 172-184 (2013).

2012

J. Treibig, G. Hager, and G. Wellein: likwid-bench: An Extensible Microbenchmarking Platform for x86 Multicore Compute Nodes. In: H. Brunst et al. (eds.), Tools for High Performance Computing 2011. Springer, ISBN 978-3-642-31475-9, (2012) 27-36. DOI: 978-3-642-31475-9.
K. Sembritzki, G. Hager, B. Krammer, J. Treibig, and G. Wellein: Evaluation of the Coarray Fortran Programming Model on the Example of a Lattice Boltzmann Code. Proceedings of PGAS ’12, The 6th Conference on Partitioned Global Address Space Programming Models, Oct 10-12, 2012, Santa Barbara, CA, USA.
G. Hager: Performance engineering: From numbers to insight. Proc. 5th Workshop on Productivity and Performance (PROPER 2012) at Euro-Par 2012, August 28, 2012, Rhodes Island, Greece. Euro-Par 2012: Parallel Processing Workshops, Lecture Notes in Computer Science 7640, 393-394 (2013), Springer, ISBN 978-3-642-36948-3. DOI: 10.1007/978-3-642-36949-0_44
J. Treibig, G. Hager, and G. Wellein: Performance patterns and hardware metrics on modern multicore processors: Best practices for performance engineering. Proc. 5th Workshop on Productivity and Performance (PROPER 2012) at Euro-Par 2012, August 28, 2012, Rhodes Island, Greece. Euro-Par 2012: Parallel Processing Workshops, Lecture Notes in Computer Science 7640, 451-460 (2013), Springer, ISBN 978-3-642-36948-3. DOI: 10.1007/978-3-642-36949-0_50. Preprint: arXiv:1206.3738
K. Iglberger, G. Hager, J. Treibig, and U. Rüde: High Performance Smart Expression Template Math Libraries. Accepted for the 2nd International Workshop on New Algorithms and Programming Models for the Manycore Era (APMM 2012) at HPCS 2012, July 2-6, 2012, Madrid, Spain. DOI: 10.1109/HPCSim.2012.6266939
M. Wittmann, T. Zeiser, G. Hager, and G. Wellein: Comparison of Different Propagation Steps for Lattice Boltzmann Methods. Computers & Mathematics with Applications (Proc. ICMMES 2011). Available online, DOI: 10.1016/j.camwa.2012.05.002. Preprint: arXiv:1111.0922
M. Kreutzer, G. Hager, G. Wellein, H. Fehske, A. Basermann, and A.R. Bishop: Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation. Accepted for the Workshop on Large-Scale Parallel Processing 2012 (LSPP12). DOI: 10.1109/IPDPSW.2012.211. Preprint: arXiv:1112.5588
J. Habich, C. Feichtinger, H. Köstler, G. Hager, and G. Wellein: Performance engineering for the Lattice Boltzmann method on GPGPUs: Architectural requirements and performance results. Computers & Fluids, DOI: 10.1016/j.compfluid.2012.02.013. Preprint: arXiv:1112.0850
J. Treibig, G. Hager, H. G. Hofmann, J. Hornegger, and G. Wellein: Pushing the limits for medical image reconstruction on recent standard multicore processors. International Journal of High Performance Computing Applications 27(2), 162–177 (2013). DOI: 10.1177/1094342012442424, Preprint: arXiv:1104.5243
K. Iglberger, G. Hager, J. Treibig, and U. Rüde: Expression Templates Revisited: A Performance Analysis of Current ET Methodologies. SIAM Journal on Scientific Computing 34(2), C42-C69 (2012). DOI: 10.1137/110830125, Preprint: arXiv:1104.1729

2011

G. Schubert, H. Fehske, G. Hager, and G. Wellein: Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems. Parallel Processing Letters 21(3), 339-358 (2011). DOI: 10.1142/S0129626411000254, Preprint: arXiv:1106.5908
G. Hager, G. Schubert, T. Schoenemeyer, and G. Wellein: Prospects for Truly Asynchronous Communication with Pure MPI and Hybrid MPI/OpenMP on Current Supercomputing Platforms. Proc. Cray Users Group Conference 2011 (CUG 2011), May 23-26, 2011, Fairbanks, AK. Hager-Paper-CUG11.pdf
J. Treibig, G. Hager, and G. Wellein: LIKWID performance tools. In: C. Bischof et al. (eds.), Competence in High Performance Computing 2010. Springer, ISBN 978-3-642-24025-6 (2012), 165-175. DOI: 10.1007/978-3-642-24025-6_14, Preprint: arXiv:1104.4874
G. Schubert, G. Hager, H. Fehske and G. Wellein: Parallel sparse matrix-vector multiplication as a test case for hybrid MPI+OpenMP programming. Workshop on Large-Scale Parallel Processing (LSPP 2011), May 20th, 2011, Anchorage, AK. DOI:10.1109/IPDPS.2011.332, Preprint: arXiv:1101.0091
J. Treibig, G. Wellein and G. Hager: Efficient multicore-aware parallelization strategies for iterative stencil computations. Journal of Computational Science 2, 130-137 (2011). DOI: 10.1016/j.jocs.2011.01.010, Preprint: arXiv:1004.1741

2010

M. Wittmann and G. Hager: Optimizing ccNUMA locality for task-parallel execution under OpenMP and TBB on multicore-based systems. Preprint: arXiv:1101.0093
G. Hager and G. Wellein: Introduction to High Performance Computing for Scientists and Engineers. CRC Press, ISBN 978-1439811924, 356 pages, July 2010. Available as eBook.
C. Feichtinger, J. Habich, H. Köstler, G. Hager, U. Rüde and G. Wellein: A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters. Parallel Computing 37(9), 536-549 (2011). DOI: 10.1016/j.parco.2011.03.005. Preprint: arXiv:1007.1388
M. Wittmann, G. Hager, J. Treibig and G. Wellein: Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters. Parallel Processing Letters 20 (4), 359-376 (2010). DOI: 10.1142/S0129626410000296 Preprint: arXiv:1006.3148
H. Fehske and G. Hager: Luttinger, Peierls or Mott? Quantum Phase Transitions in Strongly Correlated 1D Electron-Phonon Systems. In: F. Hensel, P. Edwards and R. Redmer (Eds.), Metal-to-Nonmetal Transitions. Springer Series in Material Sciences, Vol. 132, (Springer) 1-22, 2010. DOI: 10.1007/978-3-642-03953-9_1
J. Treibig, G. Hager, M. Meier and G. Wellein: LIKWID performance tools. InSiDE 8(1), 50-53 (Spring 2010).
J. Treibig, G. Hager and G. Wellein: LIKWID: A lightweight performance-oriented tool suite for x86 multicore environments. Proceedings of PSTI2010, the First International Workshop on Parallel Software Tools and Tool Infrastructures, San Diego CA, September 13, 2010. DOI: 10.1109/ICPPW.2010.38 Preprint: arXiv:1004.4431
J. Treibig, G. Hager and G. Wellein: Complexities of Performance Prediction for Bandwidth-Limited Loop Kernels on Multi-Core Architectures. In: S. Wagner et al., High Performance Computing in Science and Engineering, Garching/Munich 2009. Springer, ISBN 978-3642138713 (2010), 3-12. DOI: 10.1007/978-3-642-13872-0_1, Preprint (Multi-core architectures: Complexities of performance prediction and the impact of cache topology): arXiv:0910.4865.
G. Schubert, G. Hager and H. Fehske: Performance limitations for sparse matrix-vector multiplications on current multicore environments. In: S. Wagner et al., High Performance Computing in Science and Engineering, Garching/Munich 2009. Springer, ISBN 978-3642138713 (2010), 13-26. DOI: 10.1007/978-3-642-13872-0_2, Preprint: arXiv:0910.4836.
M. Wittmann, G. Hager and G. Wellein: Multicore-aware parallel temporal blocking of stencil codes for shared and distributed memory. Accepted for LSPP10, the Workshop on Large-Scale Parallel Processing at IPDPS 2010, April 23rd, 2010, Atlanta, GA. Preprint: arXiv:0912.4506, DOI: 10.1109/IPDPSW.2010.5470813
J. Habich, T. Zeiser, G. Hager, G. Wellein: Performance analysis and optimization strategies for a D3Q19 Lattice Boltzmann Kernel on nVIDIA GPUs using CUDA. Advances in Engineering Software 42 (5), 266-272 (2011). DOI: 10.1016/j.advengsoft.2010.10.007

2009

T. Zeiser, G. Hager and G. Wellein: Benchmark analysis and application results for lattice Boltzmann simulations on NEC SX vector and Intel Nehalem systems. Parallel Processing Letters 19 (4), 491-511 (2009) DOI:10.1142/S0129626409000389
J. Treibig and G. Hager: Introducing a Performance Model for Bandwidth-Limited Loop Kernels. Proceedings of the Workshop “Memory issues on Multi- and Manycore Platforms” at PPAM 2009, the 8th International Conference on Parallel Processing and Applied Mathematics, Wroclaw, Poland, September 13-16, 2009. Lecture Notes in Computer Science Volume 6067, 2010, pp 615-624. DOI: 10.1007/978-3-642-14390-8_64. arXiv:0905.0792
T. Zeiser, G. Hager and G. Wellein: The world’s fastest CPU and SMP node: Some performance results from the NEC SX-9. Proceedings of LSPP 2009 at IPDPS09, Rome, Italy, May 25-29, 2009. DOI:10.1109/IPDPS.2009.5161089
G. Hager, G. Jost, and R. Rabenseifner: Communication Characteristics and Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-core SMP Nodes. In: Proceedings of the Cray Users Group Conference 2009 (CUG 2009), Atlanta, GA, USA, May 4-7, 2009. cug09_hager_jost_rabenseifner.pdf
G. Wellein, G. Hager, T. Zeiser, M. Wittmann, and H. Fehske: Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization. Proceedings of COMPSAC 2009, the 33rd Annual IEEE International Computer Software and Applications Conference, Seattle, WA, July 20-24, 2009. DOI:10.1109/COMPSAC.2009.82
J. Habich, T. Zeiser, G. Hager, and G. Wellein: Speeding up a Lattice Boltzmann Kernel on nVIDIA GPUs. Proceedings of PARENG09-S01, the First International Conference on Parallel, Distributed and Grid Computing for Engineering, Pecs, Hungary, April 2009. DOI:10.4203/ccp.90.17
M. Wittmann and G. Hager: A Proof of Concept for Optimizing Task Parallelism by Locality Queues. arXiv:0902.1884
R. Rabenseifner, G. Hager, and G. Jost: Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes. In Didier El Baz et al. (Eds.), Proceedings of the 17th Euromicro International Conference on Parallel, Distributed, and network-based Processing PDP 2009, Feb 18-20, 2009, Weimar, Germany. Computer Society Press, pp. 427-436. DOI:10.1109/PDP.2009.43 hjr.pdf
S. Ejima, G. Hager, and H. Fehske: Quantum phase transition in a 1D transport model with boson affected hopping: Luttinger liquid versus charge-density-wave behavior. Phys. Rev. Lett. 102, 106404 (2009), DOI: 10.1103/PhysRevLett.102.106404, arXiv:0811.0742
T. Zeiser, G. Hager, and G. Wellein: Vector computers in a world of commodity clusters, massively parallel systems and many-core many-threaded CPUs: recent experience based on advanced lattice Boltzmann flow solvers. In: W. E. Nagel, D. B. Kröner, M. Resch (eds.), High Performance Computing in Science and Engineering ’08, Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2008, Springer, ISBN 978-3-540-88301-2, (2009) 333-347. DOI:10.1007/978-3-540-88303-6.

2008

N. Schindzielorz, J. Erler, P. Klüpfel, P.-G. Reinhard, and G. Hager: Fission of super-heavy nuclei explored with Skyrme forces. Int. J. Mod. Phys. E 18(4), 773-781 (2009). DOI:10.1142/S0218301309012860
M. Breuer, P. Lammers, T. Zeiser, G. Hager, and G. Wellein: Towards the simulation of the turbulent flow over dimples – Code evaluation and optimization for the NEC SX-8. In: W.E. Nagel, D. Kröner, M. Resch (eds.), High Performance Computing in Science and Engineering ’07, Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2007, Springer, ISBN 978-3-540-74739-0 / 978-3-540-74738-3, (2008) 303-318. DOI:10.1007/978-3-540-74739-0_21.
H. Fehske, G. Hager and E. Jeckelmann: Metallicity in the half-filled Holstein-Hubbard model. Europhys. Lett. 84, 57001 (2008), DOI:10.1209/0295-5075/84/57001, arXiv:0808.1675
G. Hager, T. Zeiser and G. Wellein: Data access optimizations for highly threaded multi-core CPUs with multiple memory controllers. Accepted for Workshop on Large-Scale Parallel Processing 2008, DOI:10.1109/IPDPS.2008.4536341, arXiv:0712.2302
G. Hager, T. Zeiser and G. Wellein: Data access characteristics and optimizations for Sun UltraSPARC T2 and T2+ systems. Parallel Processing Letters, Vol. 18, No. 4 (2008) 471-490. DOI:10.1142/S0129626408003521 Preprint: ppl-hzw.pdf

2007

G. Hager, A. Weiße, G. Wellein, E. Jeckelmann and H. Fehske: The spin-Peierls chain revisited. J. Magn. Magn. Mater. 310, 1380-1382 (2007). Erratum: J. Magn. Magn. Mater. 316, 43 (2007). Proceedings of the 17th International Conference on Magnetism (ICM 2006), Aug 20-25 2006, Kyoto, Japan. arXiv:cond-mat/0606360
M. Hohenadler, G. Hager, G. Wellein and H. Fehske: Carrier-density effects in many-polaron systems. J. Phys.: Condens. Matter 19 (2007) 255202. arXiv:cond-mat/0609296
T. Zeiser, G. Wellein, A. Nitsure, K. Iglberger, U. Rüde and G. Hager: Introducing a parallel cache oblivious blocking approach for the lattice Boltzmann method. Progress in Computational Fluid Dynamics, An Int. J. Vol. 8, No.1/2/3/4 (2008) 179-188. Proceedings of ICMMES 2006. DOI:10.1504/PCFD.2008.018088
G. Hager and G. Wellein: Architectures and Performance Characteristics of Modern High Performance Computers. In Fehske et al., Lect. Notes Phys. 739, 681-730 (2008), ISBN: 978-3-540-74685-0
G. Hager and G. Wellein: Optimization Techniques for Modern High Performance Computers. In Fehske et al., Lect. Notes Phys. 739, 731-767 (2008), ISBN: 978-3-540-74685-0
G. Hager, H. Stengel, T. Zeiser and G. Wellein: RZBENCH: Performance evaluation of current HPC architectures using low-level and application benchmarks. In: S. Wagner et al. (Eds.), High Performance Computing in Science and Engineering, Garching/Munich 2007. Transactions of the Third Joint HLRB and KONWIHR Status and Result Workshop, Dec 3-4, 2007, LRZ Garching, Springer, ISBN 978-3-540-69181-5 (2009) 485-501. arXiv:0712.3389
M. Stürmer, G. Wellein, G. Hager, H. Köstler and U. Rüde: Challenges and potentials of emerging multicore architectures. In: S. Wagner et al. (Eds.), High Performance Computing in Science and Engineering, Garching/Munich 2007. Transactions of the Third Joint HLRB and KONWIHR Status and Result Workshop, Dec 3-4, 2007, LRZ Garching, Springer, ISBN 978-3-540-69181-5 (2009) 551-566.

2006

G. Wellein, P. Lammers, G. Hager, S. Donath and T. Zeiser: Towards optimal performance for lattice Boltzmann applications on terascale computers. In: A. Deane et al. (eds), Parallel Computational Fluid Dynamics – Theory and Applications. Proceedings of the Parallel CFD 2005 Conference, College Park, MD, USA, May 24-27, 2005. Elsevier, ISBN 0-444-52206-9 (2006) 31-40.
H. Fehske, G. Hager, G. Wellein and E. Jeckelmann: Hole-doped Hubbard ladders. Physica B 378-380, 319-320 (2006). arXiv:cond-mat/0505666
G. Schubert, A. Alvermann, A. Weiße, G. Hager, G. Wellein and H. Fehske: Spectral Properties of Strongly Correlated Electron Phonon Systems. NIC Symposium 2006, G. Münster, D. Wolf, M. Kremer (Editors), John von Neumann Institute for Computing, Jülich, NIC Series, Vol. 32, ISBN 3-00-017351-X, pp. 201-210, 2006.
A. Weiße, G. Hager, A. R. Bishop and H. Fehske: Phase diagram of the spin-Peierls chain with local coupling. Phys. Rev. B 74, 214426 (2006). arXiv:cond-mat/0607209
A. Nitsure, K. Iglberger, U. Rüde, C. Feichtinger, G. Wellein, G. Hager: Optimization of Cache Oblivious Lattice Boltzmann Method in 2D and 3D. In: Becker, Matthias; Szczerbicka, Helena (Eds.): Simulationstechnique – 19th Symposium in Hannover, September 2006 (ASIM 2006 – 19. Symposium Simulationstechnik, Hannover, 12. – 14. 09. 2006). Erlangen, SCS Publishing House, 2006, pp. 265-270 (Frontiers in Simulation, Vol. 16)
P. Lammers, G. Wellein, T. Zeiser, G. Hager, M. Breuer: Have the vectors the continuing ability to parry the attack of the killer micros? In: M. Resch, T. Bönisch, K. Benkert, T. Furui, Y. Seo, W. Bez (editors): High Performance Computing on Vector Systems. Proceedings of the High Performance Computing Center Stuttgart, March 2005, Springer, ISBN 3-540-29124-5, (2006) 25-39. DOI:10.1007/3-540-35074-8_2.

2005

G. Hager: A parallelized density matrix renormalization group algorithm and its application to strongly correlated quantum systems. Dissertation, Ernst-Moritz-Arndt-Universität Greifswald, 2005. URN: urn:nbn:de:gbv:9-000024-1
G. Hager, T. Zeiser and H. Heller: Setting up ByGRID – First Steps Towards an e-Science Infrastructure in Bavaria. In: A. Bode, F. Durst (Eds.): High Performance Computing in Science and Engineering, Garching 2005. Transactions of the KONWIHR Result Workshop, October 14-15, 2004, Technical University of Munich, Garching, Springer, ISBN 3-540-26145-1 (2005) 97-102.
G. Hager, G. Wellein, E. Jeckelmann and H. Fehske: Stripe formation in doped Hubbard ladders. Phys. Rev. B 71, 075108 (2005). arXiv:cond-mat/0409321
H. Fehske, G. Wellein, G. Hager, A. Weiße, K.W. Becker and A.R. Bishop: Luttinger liquid versus charge density wave behaviour in the one-dimensional spinless fermion Holstein model. Physica B 359-361, 699-701 (2005). arXiv:cond-mat/0406023
G. Hager, T. Zeiser, J. Treibig and G. Wellein: Optimizing performance on modern HPC systems: learning from simple kernel benchmarks. In: Proceedings of the 2nd Russian-German Advanced Research Workshop on Computational Science and High Performance Computing, HLRS, Stuttgart, March 14 – 16, 2005.
G. Wellein, T. Zeiser, S. Donath and G. Hager: On the Single Processor Performance of Simple Lattice Boltzmann Kernels. Proc. ICMMES, 2004. Computers & Fluids 35, 910-919 (2006). DOI:10.1016/j.compfluid.2005.02.008
S. Donath, T. Zeiser, G. Hager, J. Habich and G. Wellein: Optimizing Performance of the Lattice Boltzmann Method for Complex Structures on Cache-based Architectures. In: F. Huelsemann, M. Kowarschik, U. Ruede (Eds.): Frontiers in Simulation: Simulation Techniques – 18th Symposium in Erlangen, September 2005 (ASIM), pp. 728-735, SCS Publishing House, Erlangen, 2005.
G. Hager, B. Bergen, P. Lammers and G. Wellein: Taming the Bandwidth Behemoth – First Experiences on a Large SGI Altix System. InSiDE 3, No. 2, Autumn 2005, pp. 24-25 (2005).

2004

G. Hager, E. Jeckelmann, H. Fehske and G. Wellein: Parallelization Strategies for Density Matrix Renormalization Group Algorithms on Shared-Memory Systems. J. Comput. Phys. 194(2), 795 (2004). arXiv:cond-mat/0305463
H. Fehske, G. Wellein, G. Hager, A. Weiße and A. R. Bishop: Quantum Lattice Dynamical Effects on Single-Particle Excitations in One-dimensional Mott and Peierls Insulators. Phys. Rev. B 69, 165115 (2004). arXiv:cond-mat/0312426
G. Hager, G. Wellein, E. Jeckelmann and H. Fehske: DMRG Investigation of Stripe Formation in Doped Hubbard Ladders. In: A. Bode (Ed.): High Performance Computing in Science and Engineering 2004 – Transactions of the Second Joint HLRB and KONWIHR Result and Reviewing Workshop (Second Joint HLRB and KONWIHR Result and Reviewing Workshop Munich – Germany 2-3 March 2004). Berlin: Springer, 2004.
G. Hager, E. Jeckelmann, H. Fehske and G. Wellein: Exact Numerical Treatment of Finite Quantum Systems using Leading-Edge Supercomputers. In: Modelling, Simulation and Optimization of Complex Processes, Eds. H. G. Bock, E. Kostina, H.-X. Phu, R. Rannacher, Springer-Verlag Berlin Heidelberg (2005), pp 165-175.
G. Wellein, T. Zeiser, G. Hager and P. Lammers: Application Performance of Modern Number Crunchers. CSAR Focus, Ed. 12, Summer-Autumn 2004, pp. 17-19 (2004).

2003

G. Wellein, G. Hager, A. Basermann and H. Fehske: Fast sparse matrix-vector multiplication for TFlop/s computers. In: J.M.L.M. Palma; J. Dongarra (Eds.): High Performance Computing for Computational Science – VECPAR2002 (High Performance Computing for Computational Science – VECPAR2002 Porto – Portugal 26-28 June 2002). Berlin: Springer, 2003.
H. Fehske, G. Wellein, A. P. Kampf, M. Sekania, G. Hager, A. Weiße, H. Büttner and A. R. Bishop: One-dimensional electron-phonon systems: Mott- versus Peierls-insulators. In: A. Bode (Ed.): High Performance Computing in Science and Engineering 2002 – Transactions of the First Joint HLRB and KONWIHR Result and Reviewing Workshop (First Joint HLRB and KONWIHR Result and Reviewing Workshop Garching – Germany 10-11 October 2002). Berlin: Springer, 2003.
G. Hager, F. Deserno and G. Wellein: Pseudo-Vectorization and RISC Optimization Techniques for the Hitachi SR8000 architecture. In: A. Bode (Ed.): High Performance Computing in Science and Engineering 2002 – Transactions of the First Joint HLRB and KONWIHR Result and Reviewing Workshop (First Joint HLRB and KONWIHR Result and Reviewing Workshop Garching – Germany 10-11 October 2002). Berlin: Springer, 2003.
G. Hager, F. Brechtefeld, P. Lammers and G. Wellein: Processor Architecture and Application Performance in Modern Supercomputers. InSiDE 1, No. 1, Spring 2003, pp. 8-13 (2003).

2001

G. Wellein, G. Hager, A. Basermann and H. Fehske: Exact Diagonalization of Large Sparse Matrices: A Challenge for Modern Supercomputers. In: Proceedings of CRAY Users Group (CUG) Summit 2001 (CUG Summit 2001 Indian Wells – USA May 2001). 2001, CD-ROM.
