Georg Hager's Blog

Random thoughts on High Performance Computing

Publications

Google scholar page

2017

  • A. Pieper, G. Hager, and H. Fehske: PVSC-DTM: A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials. Submitted. Preprint: arXiv:1708.09689
  • F. Shahzad, J. Thies, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: CRAFT: A library for easier application-level checkpoint/restart and automatic fault tolerance. Submitted. Preprint: arXiv:1708.02030
  • T. Röhl, J. Eitzinger, G. Hager, and G. Wellein: LIKWID Monitoring Stack: A flexible framework enabling job specific performance monitoring for the masses. Accepted for the HPCMASPA 2017, the Workshop on Monitoring and Analysis for High Performance Computing Systems Plus Applications, held in conjunction with IEEE Cluster 2017, Honolulu, HI, September 5, 2017. Preprint: arXiv:1708.01476
  • T. M. Malas, G. Hager, H. Ltaief, and D. E. Keyes: Multi-dimensional intra-tile parallelization for memory-starved stencil computations. Accepted for publication in ACM Transactions on Parallel Computing. Preprint: arXiv:1510.04995
  • J. Hofmann, G. Hager, G. Wellein, and D. Fey: An analysis of core- and chip-level architectural features in four generations of Intel server processors. In: J. Kunkel et al. (eds.), High Performance Computing: 32nd International Conference, ISC High Performance 2017, Frankfurt, Germany, June 18-22, 2017, Proceedings, Springer, Cham, LNCS 10266, ISBN 978-3-319-58667-0 (2017), 294-314. DOI: 10.1007/978-3-319-58667-0_16. Preprint: arXiv:1702.07554
  • J. Hammer, J. Eitzinger, G. Hager, and G. Wellein: Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels. In: Niethammer C., Gracia J., Hilbrich T., Knüpfer A., Resch M., Nagel W. (eds), Tools for High Performance Computing 2016, ISBN 978-3-319-56702-0, 1-22 (2017). Proceedings of IPTW 2016, the 10th International Parallel Tools Workshop, October 4-5, 2016, Stuttgart, Germany. Springer, Cham. DOI: 10.1007/978-3-319-56702-0_1, Preprint: arXiv:1702.04653

2016

  • T. Röhl, J. Eitzinger, G. Hager, and G. Wellein: Validation of Hardware Events for Successful Performance Pattern Identification in High Performance Computing. In: A. Knüpfer et al. (eds.), Tools for High Performance Computing 2015, Springer International Publishing, ISBN 978-3-319-39589-0 (2016), 17-28. DOI: 10.1007/978-3-319-39589-0_2
  • F. Shahzad, M. Kreutzer, T. Zeiser, R. Machado, A. Pieper, G. Hager, and G. Wellein: Building and utilizing fault tolerance support tools for the GASPI applications. International Journal of High Performance Computing Applications (2016). First published November 28, 2016. DOI: 10.1177/1094342016677085. Preprint (post-review): ft-gaspi-ijhpca.pdf
  • M. Kreutzer, J. Thies, M. Röhrig-Zöllner, A. Pieper, F. Shahzad, M. Galgon, A. Basermann, H. Fehske, G. Hager, and G. Wellein: GHOST: Building blocks for high performance sparse linear algebra on heterogeneous systems. International Journal of Parallel Programming (2016). DOI: 10.1007/s10766-016-0464-z. Preprint: arXiv:1507.08101
  • A. Pieper, M. Kreutzer, A. Alvermann, M. Galgon, H. Fehske, G. Hager, B. Lang, and G. Wellein: High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations. Journal of Computational Physics 325, 226-243 (2016). DOI: 10.1016/j.jcp.2016.08.027, Preprint: arXiv:1510.04895
  • J. Hofmann, D. Fey, J. Eitzinger, G. Hager, and G. Wellein: Analysis of Intel’s Haswell Microarchitecture Using the ECM Model and Microbenchmarks. Proc. Architecture of Computing Systems — ARCS 2016, Volume 9637 of the series Lecture Notes in Computer Science, 210-222 (2016). DOI: 10.1007/978-3-319-30695-7_16
  • J. Hofmann, D. Fey, M. Riedmann, J. Eitzinger, G. Hager, and G. Wellein: Performance analysis of the Kahan-enhanced scalar product on current multi- and manycore processors. Concurrency & Computation: Practice & Experience (2016). Available online, DOI: 10.1002/cpe.3921. Preprint: arXiv:1604.01890
  • M. Wittmann, T. Zeiser, G. Hager, and G. Wellein: Modeling and analyzing performance for highly optimized propagation steps of the lattice Boltzmann method on sparse lattices. Submitted. Preprint: arXiv:1410.0412
  • T. M. Malas, J. Hornich, G. Hager, H. Ltaief, C. Pflaum, and D. E. Keyes: Optimization of an electromagnetics code with multicore wavefront diamond blocking and multi-dimensional intra-tile parallelization. Proc. IPDPS16, the 30th IEEE International Parallel & Distributed Processing Symposium, May 23-27, 2016, Chicago, IL. DOI: 10.1109/IPDPS.2016.87. Preprint: arXiv:1510.05218
  • J. Thies, M. Galgon, F. Shahzad, A. Alvermann, M. Kreutzer, A. Pieper, M. Röhrig-Zöllner, A. Basermann, H. Fehske, G. Hager, B. Lang, and G. Wellein: Towards an Exascale Enabled Sparse Solver Repository. In: Software for Exascale Computing – SPPEXA 2013-2015, Volume 113 of the series Lecture Notes in Computational Science and Engineering, 295-316 (2016). DOI: 10.1007/978-3-319-40528-5_13. Preprint: lncs_CWPs-4.pdf
  • M. Kreutzer, J. Thies, A. Pieper, A. Alvermann, M. Galgon, M. Röhrig-Zöllner, F. Shahzad, A. Basermann, A. R. Bishop, H. Fehske, G. Hager, B. Lang, and G. Wellein: Performance Engineering and Energy Efficiency of Building Blocks for Large, Sparse Eigenvalue Computations on Heterogeneous Supercomputers. In: Software for Exascale Computing – SPPEXA 2013-2015, Volume 113 of the series Lecture Notes in Computational Science and Engineering, 317-338 (2016). DOI: 10.1007/978-3-319-40528-5_14

2015

  • J. Hammer, G. Hager, J. Eitzinger, and G. Wellein: Automatic Loop Kernel Analysis and Performance Modeling With Kerncraft. Proc. PMBS15, the 6th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, in conjunction with ACM/IEEE Supercomputing 2015 (SC15), November 16, 2015, Austin, TX. DOI: 10.1145/2832087.2832092, Preprint: arXiv:1509.03778
  • J. Hofmann, D. Fey, J. Eitzinger, G. Hager, and G. Wellein: Performance analysis of the Kahan-enhanced scalar product on current multicore processors. In: R. Wyrzykowski et al. (eds.), Parallel Processing and Applied Mathematics: 11th International Conference, PPAM 2015, Krakow, Poland, September 6-9, 2015. Revised Selected Papers, Part I. LNCS vol. 9573, 63-73 (2016). DOI: 10.1007/978-3-319-32149-3_7. Preprint: arXiv:1505.02586
  • F. Shahzad, M. Kreutzer, T. Zeiser, R. Machado, A. Pieper, G. Hager, and G. Wellein: Building a fault tolerant application using the GASPI communication layer. Proc. FTS 2015, the 1st International Workshop on Fault-Tolerant Systems, in conjunction with IEEE Cluster 2015, September 8, 2015, Chicago, IL. DOI: 10.1109/CLUSTER.2015.106, Preprint: arXiv:1505.04628
  • T. M. Malas, G. Hager, H. Ltaief, H. Stengel, G. Wellein, and D. E. Keyes: Multicore-optimized wavefront diamond blocking for optimizing stencil updates. SIAM Journal on Scientific Computing 37(4), C439-C464 (2015). DOI: 10.1137/140991133, Preprint: arXiv:1410.3060
  • M. Röhrig-Zöllner, J. Thies, M. Kreutzer, A. Alvermann, A. Pieper, A. Basermann, G. Hager, G. Wellein, and H. Fehske: Increasing the performance of the Jacobi-Davidson method by blocking. SIAM Journal on Scientific Computing, 37(6), C697–C722 (2015). DOI: 10.1137/140976017, Preprint: http://elib.dlr.de/89980/
  • H. Stengel, J. Treibig, G. Hager, and G. Wellein: Quantifying performance bottlenecks of stencil computations using the Execution-Cache-Memory model. Proc. ICS15, the 29th International Conference on Supercomputing, June 8-11, 2015, Newport Beach, CA. DOI: 10.1145/2751205.2751240. Preprint: arXiv:1410.5010
  • H. Fehske, G. Hager, and A. Pieper: Electron confinement in graphene with gate-defined quantum dots. Phys. Status Solidi B, 252: 1868–1871 (2015). DOI: 10.1002/pssb.201552119. Preprint: arXiv:1503.05815
  • M. Wittmann, G. Hager, T. Zeiser, J. Treibig, and G. Wellein: Chip-level and multi-node analysis of energy-optimized lattice-Boltzmann CFD simulations. Concurrency and Computation: Practice and Experience 28(7), 2295-2315 (2015). DOI: 10.1002/cpe.3489. Preprint: arXiv:1304.7664
  • M. Kreutzer, G. Hager, G. Wellein, A. Pieper, A. Alvermann, and H. Fehske: Performance Engineering of the Kernel Polynomial Method on Large-Scale CPU-GPU Systems. Proc. IPDPS15, the 29th IEEE International Parallel & Distributed Processing Symposium, May 25-29, 2015, Hyderabad, India. DOI: 10.1109/IPDPS.2015.76, Preprint: arXiv:1410.5242

2014

  • T. Röhl, J. Treibig, G. Hager, and G. Wellein: Overhead Analysis of Performance Counter Measurements. In: Proc. PSTI 2014, the Fifth International Workshop on Parallel Software Tools and Tool Infrastructures, Sept 11, 2014, Minneapolis, MN. DOI: 10.1109/ICPPW.2014.34
  • T. M. Malas, G. Hager, H. Ltaief, and D. E. Keyes: Towards energy efficiency and maximum computational intensity for stencil algorithms using wavefront diamond temporal blocking. Preprint: arXiv:1410.5561
  • A. Alvermann, A. Basermann, H. Fehske, M. Galgon, G. Hager, M. Kreutzer, L. Krämer, B. Lang, A. Pieper, M. Röhrig-Zöllner, F. Shahzad, J. Thies, and G. Wellein: ESSEX: Equipping Sparse Solvers for Exascale. In: L. Lopes et al. (Eds.): Euro-Par 2014 Workshops, Part II, LNCS 8806, 577-588 (2014). DOI: 10.1007/978-3-319-14313-2_49. Preprint
  • M. Kreutzer, G. Hager, G. Wellein, H. Fehske, and A. R. Bishop: A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units. SIAM Journal on Scientific Computing 36(5), C401–C423 (2014). DOI: 10.1137/130930352, Preprint: arXiv:1307.6209, BibTeX
  • J. Hofmann, J. Treibig, G. Hager, and G. Wellein: Comparing the Performance of Different x86 SIMD Instruction Sets for a Medical Imaging Application on Modern Multi- and Manycore Chips. Accepted for WPMVP 2014, the Workshop on Programming Models for SIMD/Vector Processing at PPoPP 2014, Orlando, FL, Feb 16, 2014. DOI: 10.1145/2568058.2568068, Preprint: arXiv:1401.7494
  • J. Hofmann, J. Treibig, G. Hager, and G. Wellein: Performance Engineering for a Medical Imaging Application on the Intel Xeon Phi Accelerator. Accepted for PASA 2014, the 11th Workshop on Parallel Systems and Algorithms, Lübeck, Germany, Feb 25-26, 2014. IEEE Archive, Preprint: arXiv:1401.3615
  • S. Kronawitter, H. Stengel, G. Hager, and C. Lengauer: Domain-Specific Optimization of Two Jacobi Smoother Kernels and Their Evaluation in the ECM Performance Model. Parallel Processing Letters 24, 1441004 (2014). DOI: 10.1142/S0129626414410047
  • G. Hager, J. Treibig, J. Habich, and G. Wellein: Exploring performance and power properties of modern multicore chips via simple machine models. Concurrency and Computation: Practice and Experience 28(2), 189-210 (2016). First published online December 2013, DOI: 10.1002/cpe.3180, Preprint: arXiv:1208.2908

2013

  • M. Wittmann, T. Zeiser, G. Hager, and G. Wellein: Domain decomposition and locality optimization for large-scale lattice Boltzmann simulations. Computers & Fluids 80 (2013), 283-289. DOI: 10.1016/j.compfluid.2012.02.007. Preprint: arXiv:1111.1129 (2011).
  • M. Wittmann, G. Hager, G. Wellein, T. Zeiser, and B. Krammer: MPC and Coarray Fortran: Alternatives to Classic MPI Implementations on the Examples of Scalable Lattice Boltzmann Flow Solvers. In: W. E. Nagel et al. (eds.), High Performance Computing in Science and Engineering ‘12, Springer, ISBN 978-3-642-33373-6 (2013) 367-372. DOI: 10.1007/978-3-642-33374-3_27
  • C. Scheit, G. Hager, J. Treibig, S. Becker, and G. Wellein: Optimization of FASTEST-3D for Modern Multicore Systems. Submitted. Preprint: arXiv:1303.4538
  • T. Scharpff, K. Iglberger, G. Hager, and U. Rüde: Model-guided Performance Analysis of the Sparse Matrix-Matrix Multiplication. Proc. 2013 International Conference on High Performance Computing & Simulation (HPCS 2013), July 1-5, 2013, Helsinki, Finland. DOI: 10.1109/HPCSim.2013.6641452, Preprint: arXiv:1303.1651
  • M. Wittmann, G. Hager, T. Zeiser, and G. Wellein: Asynchronous MPI for the Masses. Submitted. Preprint: arXiv:1302.4280
  • F. Shahzad, M. Wittmann, T. Zeiser, G. Hager, and G. Wellein: An Evaluation of Different IO Techniques for Checkpoint/Restart. Workshop on Large-Scale Parallel Processing 2013 (LSPP13). DOI: 10.1109/IPDPSW.2013.145, Preprint: asyn_ckpt_130115.pdf
  • F. Shahzad, M. Wittmann, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: A survey of checkpoint/restart techniques on distributed memory systems. Parallel Processing Letters 23(04), 1340011-1340030 (2013). DOI: 10.1142/S0129626413400112
  • F. Shahzad, M. Wittmann, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: PGAS implementation of SpMVM and LBM with GPI. Proceedings of the 7th International Conference on PGAS Programming Models, Oct. 3-4, 2013, Edinburgh, Scotland, 172-184 (2013).

2012

2011

  • G. Schubert, H. Fehske, G. Hager, and G. Wellein: Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems. Parallel Processing Letters 21(3), 339-358 (2011). DOI: 10.1142/S0129626411000254, Preprint: arXiv:1106.5908
  • G. Hager, G. Schubert, T. Schoenemeyer, and G. Wellein: Prospects for Truly Asynchronous Communication with Pure MPI and Hybrid MPI/OpenMP on Current Supercomputing Platforms. Proc. Cray Users Group Conference 2011 (CUG 2011), May 23-26, 2011, Fairbanks, AK. Hager-Paper-CUG11.pdf
  • J. Treibig, G. Hager, and G. Wellein: LIKWID performance tools. In: C. Bischof et al. (eds.), Competence in High Performance Computing 2010. Springer, ISBN 978-3-642-24025-6 (2012), 165-175. DOI: 10.1007/978-3-642-24025-6_14, Preprint: arXiv:1104.4874
  • G. Schubert, G. Hager, H. Fehske and G. Wellein: Parallel sparse matrix-vector multiplication as a test case for hybrid MPI+OpenMP programming. Workshop on Large-Scale Parallel Processing (LSPP 2011), May 20th, 2011, Anchorage, AK. DOI:10.1109/IPDPS.2011.332, Preprint: arXiv:1101.0091
  • J. Treibig, G. Wellein and G. Hager: Efficient multicore-aware parallelization strategies for iterative stencil computations. Journal of Computational Science 2, 130-137 (2011). DOI: 10.1016/j.jocs.2011.01.010, Preprint: arXiv:1004.1741

2010

  • M. Wittmann and G. Hager: Optimizing ccNUMA locality for task-parallel execution under OpenMP and TBB on multicore-based systems. Preprint: arXiv:1101.0093
  • G. Hager and G. Wellein: Introduction to High Performance Computing for Scientists and Engineers. CRC Press, ISBN 978-1439811924, 356 pages, July 2010. Available as eBook.
  • C. Feichtinger, J. Habich, H. Köstler, G. Hager, U. Rüde and G. Wellein: A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters. Parallel Computing 37(9), 536-549 (2011). DOI: 10.1016/j.parco.2011.03.005. Preprint: arXiv:1007.1388
  • M. Wittmann, G. Hager, J. Treibig and G. Wellein: Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters. Parallel Processing Letters 20 (4), 359-376 (2010). DOI: 10.1142/S0129626410000296. Preprint: arXiv:1006.3148
  • H. Fehske and G. Hager: Luttinger, Peierls or Mott? Quantum Phase Transitions in Strongly Correlated 1D Electron-Phonon Systems. In: F. Hensel, P. Edwards and R. Redmer (Eds.), Metal-to-Nonmetal Transitions. Springer Series in Material Sciences, Vol. 132, (Springer) 1-22, 2010. DOI: 10.1007/978-3-642-03953-9_1
  • J. Treibig, G. Hager, M. Meier and G. Wellein: LIKWID performance tools. InSiDE 8(1), 50-53 (Spring 2010).
  • J. Treibig, G. Hager and G. Wellein: LIKWID: A lightweight performance-oriented tool suite for x86 multicore environments. Proceedings of PSTI2010, the First International Workshop on Parallel Software Tools and Tool Infrastructures, San Diego, CA, September 13, 2010. DOI: 10.1109/ICPPW.2010.38. Preprint: arXiv:1004.4431
  • J. Treibig, G. Hager and G. Wellein: Complexities of Performance Prediction for Bandwidth-Limited Loop Kernels on Multi-Core Architectures. In: S. Wagner et al., High Performance Computing in Science and Engineering, Garching/Munich 2009. Springer, ISBN 978-3642138713 (2010), 3-12. DOI: 10.1007/978-3-642-13872-0_1, Preprint (Multi-core architectures: Complexities of performance prediction and the impact of cache topology): arXiv:0910.4865.
  • G. Schubert, G. Hager and H. Fehske: Performance limitations for sparse matrix-vector multiplications on current multicore environments. In: S. Wagner et al., High Performance Computing in Science and Engineering, Garching/Munich 2009. Springer, ISBN 978-3642138713 (2010), 13-26. DOI: 10.1007/978-3-642-13872-0_2, Preprint: arXiv:0910.4836.
  • M. Wittmann, G. Hager and G. Wellein: Multicore-aware parallel temporal blocking of stencil codes for shared and distributed memory. Accepted for LSPP10, the Workshop on Large-Scale Parallel Processing at IPDPS 2010, April 23rd, 2010, Atlanta, GA. Preprint: arXiv:0912.4506, DOI: 10.1109/IPDPSW.2010.5470813
  • J. Habich, T. Zeiser, G. Hager, G. Wellein: Performance analysis and optimization strategies for a D3Q19 Lattice Boltzmann Kernel on nVIDIA GPUs using CUDA. Advances in Engineering Software 42 (5), 266-272 (2011). DOI: 10.1016/j.advengsoft.2010.10.007

2009

  • T. Zeiser, G. Hager and G. Wellein: Benchmark analysis and application results for lattice Boltzmann simulations on NEC SX vector and Intel Nehalem systems. Parallel Processing Letters 19 (4), 491-511 (2009). DOI:10.1142/S0129626409000389
  • J. Treibig and G. Hager: Introducing a Performance Model for Bandwidth-Limited Loop Kernels. Proceedings of the Workshop “Memory issues on Multi- and Manycore Platforms” at PPAM 2009, the 8th International Conference on Parallel Processing and Applied Mathematics, Wroclaw, Poland, September 13-16, 2009. Lecture Notes in Computer Science Volume 6067, 2010, pp 615-624. DOI: 10.1007/978-3-642-14390-8_64. arXiv:0905.0792
  • T. Zeiser, G. Hager and G. Wellein: The world’s fastest CPU and SMP node: Some performance results from the NEC SX-9. Proceedings of LSPP 2009 at IPDPS09, Rome, Italy, May 25-29, 2009. DOI:10.1109/IPDPS.2009.5161089
  • G. Hager, G. Jost, and R. Rabenseifner: Communication Characteristics and Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-core SMP Nodes. In: Proceedings of the Cray Users Group Conference 2009 (CUG 2009), Atlanta, GA, USA, May 4-7, 2009. cug09_hager_jost_rabenseifner.pdf
  • G. Wellein, G. Hager, T. Zeiser, M. Wittmann, and H. Fehske: Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization. Proceedings of COMPSAC 2009, the 33rd Annual IEEE International Computer Software and Applications Conference, Seattle, WA, July 20-24, 2009. DOI:10.1109/COMPSAC.2009.82
  • J. Habich, T. Zeiser, G. Hager, and G. Wellein: Speeding up a Lattice Boltzmann Kernel on nVIDIA GPUs. Proceedings of PARENG09-S01, the First International Conference on Parallel, Distributed and Grid Computing for Engineering, Pecs, Hungary, April 2009. DOI:10.4203/ccp.90.17
  • M. Wittmann and G. Hager: A Proof of Concept for Optimizing Task Parallelism by Locality Queues. arXiv:0902.1884
  • R. Rabenseifner, G. Hager, and G. Jost: Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes. In Didier El Baz et al. (Eds.), Proceedings of the 17th Euromicro International Conference on Parallel, Distributed, and network-based Processing PDP 2009, Feb 18-20, 2009, Weimar, Germany. Computer Society Press, pp. 427-436. DOI:10.1109/PDP.2009.43. hjr.pdf
  • S. Ejima, G. Hager, and H. Fehske: Quantum phase transition in a 1D transport model with boson affected hopping: Luttinger liquid versus charge-density-wave behavior. Phys. Rev. Lett. 102, 106404 (2009), DOI: 10.1103/PhysRevLett.102.106404, arXiv:0811.0742
  • T. Zeiser, G. Hager, and G. Wellein: Vector computers in a world of commodity clusters, massively parallel systems and many-core many-threaded CPUs: recent experience based on advanced lattice Boltzmann flow solvers. In: W. E. Nagel, D. B. Kröner, M. Resch (eds.), High Performance Computing in Science and Engineering ’08, Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2008, Springer, ISBN 978-3-540-88301-2, (2009) 333-347. DOI:10.1007/978-3-540-88303-6.

2008

  • N. Schindzielorz, J. Erler, P. Klüpfel, P.-G. Reinhard, and G. Hager: Fission of super-heavy nuclei explored with Skyrme forces. Int. J. Mod. Phys. E 18(4), 773-781 (2009). DOI:10.1142/S0218301309012860
  • M. Breuer, P. Lammers, T. Zeiser, G. Hager, and G. Wellein: Towards the simulation of the turbulent flow over dimples – Code evaluation and optimization for the NEC SX-8. In: W. E. Nagel, D. B. Kröner, M. Resch (eds.), High Performance Computing in Science and Engineering ’07, Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2007, Springer, ISBN 978-3-540-74739-0 / 978-3-540-74738-3, (2008) 303-318. DOI:10.1007/978-3-540-74739-0_21.
  • H. Fehske, G. Hager and E. Jeckelmann: Metallicity in the half-filled Holstein-Hubbard model. Europhys. Lett. 84, 57001 (2008), DOI:10.1209/0295-5075/84/57001, arXiv:0808.1675
  • G. Hager, T. Zeiser and G. Wellein: Data access optimizations for highly threaded multi-core CPUs with multiple memory controllers. Accepted for Workshop on Large-Scale Parallel Processing 2008, DOI:10.1109/IPDPS.2008.4536341, arXiv:0712.2302
  • G. Hager, T. Zeiser and G. Wellein: Data access characteristics and optimizations for Sun UltraSPARC T2 and T2+ systems. Parallel Processing Letters, Vol. 18, No. 4 (2008) 471-490. DOI:10.1142/S0129626408003521. Preprint: ppl-hzw.pdf

2007

  • G. Hager, A. Weiße, G. Wellein, E. Jeckelmann and H. Fehske: The spin-Peierls chain revisited. J. Magn. Magn. Mater. 310, 1380-1382 (2007). Erratum: J. Magn. Magn. Mater. 316, 43 (2007). Proceedings of the 17th International Conference on Magnetism (ICM 2006), Aug 20-25 2006, Kyoto, Japan. arXiv:cond-mat/0606360
  • M. Hohenadler, G. Hager, G. Wellein and H. Fehske: Carrier-density effects in many-polaron systems. J. Phys.: Condens. Matter 19 (2007) 255202. arXiv:cond-mat/0609296
  • T. Zeiser, G. Wellein, A. Nitsure, K. Iglberger, U. Rüde and G. Hager: Introducing a parallel cache oblivious blocking approach for the lattice Boltzmann method. Progress in Computational Fluid Dynamics, An Int. J. Vol. 8, No.1/2/3/4 (2008) 179-188. Proceedings of ICMMES 2006. DOI:10.1504/PCFD.2008.018088
  • G. Hager and G. Wellein: Architectures and Performance Characteristics of Modern High Performance Computers. In Fehske et al., Lect. Notes Phys. 739, 681-730 (2008), ISBN: 978-3-540-74685-0
  • G. Hager and G. Wellein: Optimization Techniques for Modern High Performance Computers. In Fehske et al., Lect. Notes Phys. 739, 731-767 (2008), ISBN: 978-3-540-74685-0
  • G. Hager, H. Stengel, T. Zeiser and G. Wellein: RZBENCH: Performance evaluation of current HPC architectures using low-level and application benchmarks. In: S. Wagner et al. (Eds.), High Performance Computing in Science and Engineering, Garching/Munich 2007. Transactions of the Third Joint HLRB and KONWIHR Status and Result Workshop, Dec 3-4, 2007, LRZ Garching, Springer, ISBN 978-3-540-69181-5 (2009) 485-501. arXiv:0712.3389
  • M. Stürmer, G. Wellein, G. Hager, H. Köstler and U. Rüde: Challenges and potentials of emerging multicore architectures. In: S. Wagner et al. (Eds.), High Performance Computing in Science and Engineering, Garching/Munich 2007. Transactions of the Third Joint HLRB and KONWIHR Status and Result Workshop, Dec 3-4, 2007, LRZ Garching, Springer, ISBN 978-3-540-69181-5 (2009) 551-566.

2006

  • G. Wellein, P. Lammers, G. Hager, S. Donath and T. Zeiser: Towards optimal performance for lattice Boltzmann applications on terascale computers. In: A. Deane et al. (eds), Parallel Computational Fluid Dynamics – Theory and Applications. Proceedings of the Parallel CFD 2005 Conference, College Park, MD, USA, May 24-27, 2005. Elsevier, ISBN 0-444-52206-9 (2006) 31-40.
  • H. Fehske, G. Hager, G. Wellein and E. Jeckelmann: Hole-doped Hubbard ladders. Physica B 378-380, 319-320 (2006). arXiv:cond-mat/0505666
  • G. Schubert, A. Alvermann, A. Weiße, G. Hager, G. Wellein and H. Fehske: Spectral Properties of Strongly Correlated Electron Phonon Systems. NIC Symposium 2006, G. Münster, D. Wolf, M. Kremer (Editors), John von Neumann Institute for Computing, Jülich, NIC Series, Vol. 32, ISBN 3-00-017351-X, pp. 201-210, 2006.
  • A. Weiße, G. Hager, A. R. Bishop and H. Fehske: Phase diagram of the spin-Peierls chain with local coupling. Phys. Rev. B 74, 214426 (2006). arXiv:cond-mat/0607209
  • A. Nitsure, K. Iglberger, U. Rüde, C. Feichtinger, G. Wellein, G. Hager: Optimization of Cache Oblivious Lattice Boltzmann Method in 2D and 3D. In: Becker, Matthias; Szczerbicka, Helena (Eds.): Simulationstechnique – 19th Symposium in Hannover, September 2006 (ASIM 2006 – 19. Symposium Simulationstechnik, Hannover, 12. – 14. 09. 2006). Erlangen, SCS Publishing House, 2006, pp. 265-270 (Frontiers in Simulation, Vol. 16)
  • P. Lammers, G. Wellein, T. Zeiser, G. Hager, M. Breuer: Have the vectors the continuing ability to parry the attack of the killer micros? In: M. Resch, T. Bönisch, K. Benkert, T. Furui, Y. Seo, W. Bez (editors): High Performance Computing on Vector Systems. Proceedings of the High Performance Computing Center Stuttgart, March 2005, Springer, ISBN 3-540-29124-5, (2006) 25-39. DOI:10.1007/3-540-35074-8_2.

2005

  • G. Hager: A parallelized density matrix renormalization group algorithm and its application to strongly correlated quantum systems. Dissertation, Ernst-Moritz-Arndt-Universität Greifswald, 2005. URN: urn:nbn:de:gbv:9-000024-1
  • G. Hager, T. Zeiser and H. Heller: Setting up ByGRID – First Steps Towards an e-Science Infrastructure in Bavaria. In: A. Bode, F. Durst (Eds.): High Performance Computing in Science and Engineering, Garching 2005. Transactions of the KONWIHR Result Workshop, October 14-15, 2004, Technical University of Munich, Garching, Springer, ISBN 3-540-26145-1 (2005) 97-102.
  • G. Hager, G. Wellein, E. Jeckelmann and H. Fehske: Stripe formation in doped Hubbard ladders. Phys. Rev. B 71, 075108 (2005). arXiv:cond-mat/0409321
  • H. Fehske, G. Wellein, G. Hager, A. Weiße, K.W. Becker and A.R. Bishop: Luttinger liquid versus charge density wave behaviour in the one-dimensional spinless fermion Holstein model. Physica B 359-361, 699-701 (2005). arXiv:cond-mat/0406023
  • G. Hager, T. Zeiser, J. Treibig and G. Wellein: Optimizing performance on modern HPC systems: learning from simple kernel benchmarks. In: Proceedings of the 2nd Russian-German Advanced Research Workshop on Computational Science and High Performance Computing, HLRS, Stuttgart, March 14 – 16, 2005.
  • G. Wellein, T. Zeiser, S. Donath and G. Hager: On the Single Processor Performance of Simple Lattice Boltzmann Kernels. Proc. ICMMES, 2004. Computers & Fluids 35, 910-919 (2006). DOI:10.1016/j.compfluid.2005.02.008
  • S. Donath, T. Zeiser, G. Hager, J. Habich and G. Wellein: Optimizing Performance of the Lattice Boltzmann Method for Complex Structures on Cache-based Architectures. In: F. Huelsemann, M. Kowarschik, U. Ruede (Eds.): Frontiers in Simulation: Simulation Techniques – 18th Symposium in Erlangen, September 2005 (ASIM), pp. 728-735, SCS Publishing House, Erlangen, 2005.
  • G. Hager, B. Bergen, P. Lammers and G. Wellein: Taming the Bandwidth Behemoth – First Experiences on a Large SGI Altix System. InSiDE 3, No. 2, Autumn 2005, pp. 24-25 (2005).

2004

  • G. Hager, E. Jeckelmann, H. Fehske and G. Wellein: Parallelization Strategies for Density Matrix Renormalization Group Algorithms on Shared-Memory Systems. J. Comput. Phys. 194(2), 795 (2004). arXiv:cond-mat/0305463
  • H. Fehske, G. Wellein, G. Hager, A. Weiße and A. R. Bishop: Quantum Lattice Dynamical Effects on Single-Particle Excitations in One-dimensional Mott and Peierls Insulators. Phys. Rev. B 69, 165115 (2004). arXiv:cond-mat/0312426
  • G. Hager, G. Wellein, E. Jeckelmann and H. Fehske: DMRG Investigation of Stripe Formation in Doped Hubbard Ladders. In: A. Bode (Ed.): High Performance Computing in Science and Engineering 2004 – Transactions of the Second Joint HLRB and KONWIHR Result and Reviewing Workshop (Second Joint HLRB and KONWIHR Result and Reviewing Workshop Munich – Germany 2-3 March 2004). Berlin: Springer, 2004.
  • G. Hager, E. Jeckelmann, H. Fehske and G. Wellein: Exact Numerical Treatment of Finite Quantum Systems using Leading-Edge Supercomputers. In: Modelling, Simulation and Optimization of Complex Processes, Eds. H. G. Bock, E. Kostina, H.-X. Phu, R. Rannacher, Springer-Verlag Berlin Heidelberg (2005), pp 165-175.
  • G. Wellein, T. Zeiser, G. Hager and P. Lammers: Application Performance of Modern Number Crunchers. CSAR Focus, Ed. 12, Summer-Autumn 2004, pp. 17-19 (2004).

2003

  • G. Wellein, G. Hager, A. Basermann and H. Fehske: Fast sparse matrix-vector multiplication for TFlop/s computers. In: J.M.L.M. Palma; J. Dongarra (Eds.): High Performance Computing for Computational Science – VECPAR2002 (High Performance Computing for Computational Science – VECPAR2002 Porto – Portugal 26-28 June 2002). Berlin: Springer, 2003.
  • H. Fehske, G. Wellein, A. P. Kampf, M. Sekania, G. Hager, A. Weiße, H. Büttner and A. R. Bishop: One-dimensional electron-phonon systems: Mott- versus Peierls-insulators. In: A. Bode (Ed.): High Performance Computing in Science and Engineering 2002 – Transactions of the First Joint HLRB and KONWIHR Result and Reviewing Workshop (First Joint HLRB and KONWIHR Result and Reviewing Workshop Garching – Germany 10-11 October 2002). Berlin: Springer, 2003.
  • G. Hager, F. Deserno and G. Wellein: Pseudo-Vectorization and RISC Optimization Techniques for the Hitachi SR8000 architecture. In: A. Bode (Ed.): High Performance Computing in Science and Engineering 2002 – Transactions of the First Joint HLRB and KONWIHR Result and Reviewing Workshop (First Joint HLRB and KONWIHR Result and Reviewing Workshop Garching – Germany 10-11 October 2002). Berlin: Springer, 2003.
  • G. Hager, F. Brechtefeld, P. Lammers and G. Wellein: Processor Architecture and Application Performance in Modern Supercomputers. InSiDE 1, No. 1, Spring 2003, pp. 8-13 (2003).

2001

  • G. Wellein, G. Hager, A. Basermann and H. Fehske: Exact Diagonalization of Large Sparse Matrices: A Challenge for Modern Supercomputers. In: Proceedings of CRAY Users Group (CUG) Summit 2001 (CUG Summit 2001 Indian Wells – USA May 2001). 2001, CD-ROM.