Publications

Papers (published or accepted)

  • C. L. Alappat, A. Alvermann, A. Basermann, H. Fehske, Y. Futamura, M. Galgon, G. Hager, S. Huber, A. Imakura, M. Kawai, M. Kreutzer, B. Lang, K. Nakajima, M. Röhrig-Zöllner, T. Sakurai, F. Shahzad, J. Thies, and G. Wellein: ESSEX: Equipping Sparse Solvers For Exascale. In: Bungartz HJ., Reiz S., Uekermann B., Neumann P., Nagel W. (eds) Software for Exascale Computing – SPPEXA 2016-2019. Lecture Notes in Computational Science and Engineering 136, 143-187 (2020). Springer, Cham. Available with Open Access. DOI: 10.1007/978-3-030-47956-5_7
  • C. L. Alappat, G. Hager, O. Schenk, J. Thies, A. Basermann, A. R. Bishop, H. Fehske, and G. Wellein: A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication. ACM Trans. Parallel Comput. 7(3), Article 19 (June 2020), 37 pages. Available with Open Access. DOI: 10.1145/3399732.
  • D. Ernst, G. Hager, J. Thies, and G. Wellein: Performance Engineering for a Tall & Skinny Matrix Multiplication Kernel on GPUs. Accepted for PPAM’2019, the 13th International Conference on Parallel Processing and Applied Mathematics, September 8-11, 2019, Białystok, Poland. Preprint: arXiv:1905.03136
  • A. Alvermann, A. Basermann, H.-J. Bungartz, C. Carbogno, D. Ernst, H. Fehske, Y. Futamura, M. Galgon, G. Hager, S. Huber, T. Huckle, A. Ida, A. Imakura, M. Kawai, S. Köcher, M. Kreutzer, P. Kus, B. Lang, H. Lederer, V. Manin, A. Marek, K. Nakajima, L. Nemec, K. Reuter, M. Rippl, M. Röhrig-Zöllner, T. Sakurai, M. Scheffler, C. Scheurer, F. Shahzad, D. Simoes Brambila, J. Thies, and G. Wellein: Benefits from using mixed precision computations in the ELPA-AEO and ESSEX-II eigensolver projects. Proc. EPASA 2018, Japan Journal of Industrial and Applied Mathematics, 36(2), 699-717, DOI: 10.1007/s13160-019-00360-8. Preprint: arXiv:1806.01036.
  • F. Shahzad, J. Thies, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: CRAFT: A library for easier application-level checkpoint/restart and automatic fault tolerance. Accepted for publication in IEEE Transactions on Parallel and Distributed Systems. DOI: 10.1109/TPDS.2018.2866794, Preprint: arXiv:1708.02030
  • M. Kreutzer, D. Ernst, A. R. Bishop, H. Fehske, G. Hager, K. Nakajima, and G. Wellein: Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs. In: R. Yokota, M. Weiland, D. Keyes, and C. Trinitis (eds.): High Performance Computing: 33rd International Conference, ISC High Performance 2018, Frankfurt, Germany, June 24-28, 2018, Proceedings, Springer, Cham, LNCS 10876, ISBN 978-3-319-92040-5 (2018), 329-349. DOI: 10.1007/978-3-319-92040-5_17. ISC 2018 Hans Meuer Award Finalist.
  • K. Nakajima and T. Hanawa. Communication-computation overlapping with dynamic loop scheduling for preconditioned parallel iterative solvers on multicore/manycore clusters. IEEE Proceedings of 10th International Workshop on Parallel Programming Models & Systems Software for High-End Computing (P2S2 2017) in conjunction with the 46th International Conference on Parallel Processing (ICPP 2017).
  • T. Katagiri, S. Ohshima and M. Matsumoto. Auto-tuning on NUMA and many-core environments with an FDM code. IEEE Proceedings of the 12th International Workshop on Automatic Performance Tuning (iWAPT2017) (in conjunction with the IEEE IPDPS 2017).
  • T. Iwashita, A. Ida, T. Mifune and Y. Takahashi. Software framework for parallel BEM analyses with H-matrices using MPI and OpenMP. Procedia Computer Science 108, (2017) 2200–2209.
  • N. Nomura, A. Fujii, T. Tanaka, K. Nakajima and O. Marques. Performance analysis of SA-AMG method by setting extracted near-kernel vectors. Lecture Notes in Computer Science (LNCS) 10150, (2017) 52–63.
  • N. Tominaga, T. Mifune, A. Ida, Y. Sogabe, T. Iwashita and N. Amemiya. Application of hierarchical matrices to large-scale electromagnetic field analyses of coils wound with coated conductors. Proceedings of the 25th International Conference on Magnet Technology (MT25).
  • M. Kawai, A. Ida and K. Nakajima. Hierarchical parallelization of multicoloring algorithms for block ICC preconditioners. IEEE 19th International Conference on High Performance Computing and Communications (HPCC) 19, (2017) 138–145.
  • A. Ida, T. Ataka, T. Mifune, Y. Takahashi, T. Iwashita and A. Furuya. Application of improved H-matrices in micromagnetic simulations. IEEE Transactions on Magnetics 54(3).
  • A. Ida. Lattice H-matrices on distributed-memory systems. 32nd IEEE International Parallel & Distributed Processing Symposium (IPDPS 2018). Accepted.
  • I. Yamazaki, A. Abdelfattah, A. Ida, S. Ohshima, S. Tomov, R. Yokota and J. Dongarra. Analyzing performance of BiCGStab with hierarchical matrix on GPU clusters. 32nd IEEE International Parallel & Distributed Processing Symposium (IPDPS 2018). Accepted.
  • N. Tominaga, T. Mifune, A. Ida, Y. Sogabe, T. Iwashita and N. Amemiya. Application of hierarchical matrices to large-scale electromagnetic field analyses of coils wound with coated conductors. IEEE Transactions on Applied Superconductivity 28(3), (2018) 1–5.
  • S. Ohshima, I. Yamazaki, A. Ida and R. Yokota. Optimization of hierarchical matrix computation on GPU. Supercomputing Asia 2018. Accepted.
  • A. Ida, T. Ataka, T. Mifune, Y. Takahashi, T. Iwashita and A. Furuya. Application of improved H-matrices in micromagnetic simulations. 21st International Conference on the Computation of Electromagnetic Fields (Compumag 2017).
  • W. Song, F. W. Wubs, J. Thies, and S. Baars: Numerical Bifurcation Analysis of a 3D Turing-Type Reaction-Diffusion Model. Communications in Nonlinear Science and Numerical Simulation 60, 145-164 (2018). DOI: 10.1016/j.cnsns.2018.01.003
  • M. Galgon, L. Krämer, and B. Lang: Improving projection-based eigensolvers via adaptive techniques. Numer. Linear Algebra Appl., e2124:1–15, 2017. DOI: 10.1002/nla.2124
  • M. Galgon, L. Krämer, B. Lang, A. Alvermann, H. Fehske, A. Pieper, G. Hager, M. Kreutzer, F. Shahzad, G. Wellein, A. Basermann, M. Röhrig-Zöllner, and J. Thies: Improved coefficients for polynomial filtering in ESSEX. In T. Sakurai, S.-L. Zhang, T. Imamura, Y. Yamamoto, Y. Kuramashi, and T. Hoshi (eds.), Eigenvalue Problems: Algorithms, Software and Applications, in Petascale Computing. Proc. EPASA 2015, Tsukuba, Japan, September 2015, volume 117 of LNCSE, pages 63–79. Springer International Publishing, 2017. DOI: 10.1007/978-3-319-62426-6_5
  • M. Kreutzer, J. Thies, M. Röhrig-Zöllner, A. Pieper, F. Shahzad, M. Galgon, A. Basermann, H. Fehske, G. Hager, and G. Wellein: GHOST: Building blocks for high performance sparse linear algebra on heterogeneous systems. International Journal of Parallel Programming 45, 1046 (2016). DOI: 10.1007/s10766-016-0464-z. Preprint: arXiv:1507.08101
  • H. Anzt, J. Dongarra, M. Kreutzer, G. Wellein, and M. Köhler: Efficiency of General Krylov Methods on GPUs – An Experimental Study. In: 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Chicago, IL, 683-691 (2016). DOI: 10.1109/IPDPSW.2016.45
  • H. Anzt, M. Kreutzer, E. Ponce, G. D. Peterson, G. Wellein, and J. Dongarra: Optimization and performance evaluation of the IDR iterative Krylov solver on GPUs. International Journal of High Performance Computing Applications, first published on May 5, 2016. DOI: 10.1177/1094342016646844
  • J. Thies, M. Galgon, F. Shahzad, A. Alvermann, M. Kreutzer, A. Pieper, M. Röhrig-Zöllner, A. Basermann, H. Fehske, G. Hager, B. Lang, and G. Wellein: Towards an Exascale Enabled Sparse Solver Repository. In: Software for Exascale Computing – SPPEXA 2013-2015, Volume 113 of the series Lecture Notes in Computational Science and Engineering, 295-316 (2016). DOI: 10.1007/978-3-319-40528-5_13. Preprint: lncs_CWPs-4.pdf
  • M. Kreutzer, J. Thies, A. Pieper, A. Alvermann, M. Galgon, M. Röhrig-Zöllner, F. Shahzad, A. Basermann, A. R. Bishop, H. Fehske, G. Hager, B. Lang, and G. Wellein: Performance Engineering and Energy Efficiency of Building Blocks for Large, Sparse Eigenvalue Computations on Heterogeneous Supercomputers. In: Software for Exascale Computing – SPPEXA 2013-2015, Volume 113 of the series Lecture Notes in Computational Science and Engineering, 317-338 (2016). DOI: 10.1007/978-3-319-40528-5_14
  • F. Shahzad, M. Kreutzer, T. Zeiser, R. Machado, A. Pieper, G. Hager, and G. Wellein: Building and utilizing fault tolerance support tools for the GASPI applications. International Journal of High Performance Computing Applications (2016). First published on November 28, 2016. DOI: 10.1177/1094342016677085. Preprint (post-review)
  • A. Pieper, M. Kreutzer, A. Alvermann, M. Galgon, H. Fehske, G. Hager, B. Lang, and G. Wellein: High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations. Journal of Computational Physics 325, 226-243 (2016). DOI: 10.1016/j.jcp.2016.08.027, Preprint: arXiv:1510.04895
  • M. Röhrig-Zöllner, J. Thies, M. Kreutzer, A. Alvermann, A. Pieper, A. Basermann, G. Hager, G. Wellein, and H. Fehske: Increasing the performance of the Jacobi-Davidson method by blocking. SIAM Journal on Scientific Computing, 37(6), C697–C722 (2015). DOI: 10.1137/140976017, Preprint: http://elib.dlr.de/89980/
  • F. Shahzad, M. Kreutzer, T. Zeiser, R. Machado, A. Pieper, G. Hager, and G. Wellein: Building a fault tolerant application using the GASPI communication layer. Accepted for FTS 2015, the 1st International Workshop on Fault-Tolerant Systems, in conjunction with IEEE Cluster 2015, September 8, 2015, Chicago, IL. Preprint: arXiv:1505.04628
  • L. Bakemeier, A. Alvermann, and H. Fehske: Route to chaos in optomechanics. Phys. Rev. Lett. 114, 013601 (2015). DOI: 10.1103/PhysRevLett.114.013601
  • M. Galgon, L. Krämer, B. Lang, A. Alvermann, H. Fehske, and A. Pieper: Improving robustness of the FEAST algorithm and solving eigenvalue problems from graphene nanoribbons. Proc. Appl. Math. Mech. 14(1), 821-822 (2014). DOI: 10.1002/pamm.201410391. Preprint
  • M. Kreutzer, G. Hager, G. Wellein, A. Pieper, A. Alvermann, and H. Fehske: Performance Engineering of the Kernel Polynomial Method on Large-Scale CPU-GPU Systems. Proc. IPDPS15, the 29th IEEE International Parallel & Distributed Processing Symposium, May 25-29, 2015, Hyderabad, India. DOI: 10.1109/IPDPS.2015.76, Preprint: arXiv:1410.5242
  • A. Alvermann, A. Basermann, H. Fehske, M. Galgon, G. Hager, M. Kreutzer, L. Krämer, B. Lang, A. Pieper, M. Röhrig-Zöllner, F. Shahzad, J. Thies, and G. Wellein: ESSEX: Equipping Sparse Solvers for Exascale. In: L. Lopes et al. (eds.), Proc. Euro-Par 2014 Workshops Part II, LNCS vol. 8806, Springer (2014), 577-588. DOI: 10.1007/978-3-319-14313-2_49, Preprint
  • A. Pieper, R. L. Heinisch, G. Wellein, and H. Fehske: Dot-bound and dispersive states in graphene quantum dot superlattices. Physical Review B 89, 165121 (2014). DOI: 10.1103/PhysRevB.89.165121. Preprint: arXiv:1404.2097
  • A. Pieper, G. Schubert, G. Wellein, and H. Fehske: Effects of disorder and contacts on transport through graphene nanoribbons. Physical Review B 88, 195409 (2013). DOI: 10.1103/PhysRevB.88.195409, Preprint: arXiv:1308.6079
  • M. Kreutzer, G. Hager, G. Wellein, H. Fehske, and A. R. Bishop: A unified sparse matrix data format for modern processors with wide SIMD units. SIAM Journal on Scientific Computing 36(5), C401–C423 (2014). DOI: 10.1137/130930352. Preprint: arXiv:1307.6209
  • F. Shahzad, M. Wittmann, T. Zeiser, G. Hager, and G. Wellein: An Evaluation of Different IO Techniques for Checkpoint/Restart. Workshop on Large-Scale Parallel Processing 2013 (LSPP13). DOI: 10.1109/IPDPSW.2013.145, Preprint: asyn_ckpt_130115.pdf
  • F. Shahzad, M. Wittmann, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: A survey of checkpoint/restart techniques on distributed memory systems. Parallel Processing Letters 23(04), 1340011-1340030 (2013). DOI: 10.1142/S0129626413400112
  • F. Shahzad, M. Wittmann, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: PGAS implementation of SpMVM and LBM with GPI. Proceedings of the 7th International Conference on PGAS Programming Models, 172-184 (2013).
  • M. Galgon, L. Krämer, J. Thies, A. Basermann, and B. Lang: On the parallel iterative solution of linear systems arising in the FEAST algorithm for computing inner eigenvalues. Parallel Computing, Available online 25 June 2015, ISSN 0167-8191, DOI: 10.1016/j.parco.2015.06.005, Preprint BUW-IMACM 14/35

Papers (submitted)

  • C. L. Alappat, G. Hager, O. Schenk, J. Thies, A. Basermann, A. R. Bishop, H. Fehske, and G. Wellein: A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication. Submitted. Preprint: arXiv:1907.06487
  • M. Galgon, L. Krämer, and B. Lang. Adaptive choice of projectors in projection based eigensolvers. Submitted. Preprint BUW-IMACM 15/07
  • L. Krämer: Convergence of integration based methods for the solution of standard and generalized Hermitian eigenvalue problems. Submitted. Preprint BUW-IMACM 14/30

Posters