Georg Hager's Blog

Random thoughts on High Performance Computing

Restricting member function calls by numeric template parameters

Thanks to Johannes for this interesting problem.

Let’s say you have a class template B where T is an arbitrary type and C is an integer template argument:

template <class T, int C> class B;

B has two member functions with the same name:

template <class T, int C> void B<T,C>::member(T _t);
template <class T, int C> void B<T,C>::member(T _t, int _i);

How do you make sure that the first member (the one with just a single argument) can only be called for instances of the class template with C==1? This is supposed to happen at compile time (runtime would be easy, of course).

One could (partially) specialize the whole class for C=1, but that generates a whole lot of code bloat because every other member has to be repeated in the specialization. Another solution would be to have a base class B with only the two-argument member and a derived class (inheriting from B) that implements the single-argument member. This is also unsatisfactory because the derived class must have a name different from that of the base class.
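To make the drawback of the second alternative concrete, here is a minimal sketch of the derived-class approach. The name B1 and the call counter are my own assumptions for illustration only; they are not part of the original problem.

```cpp
#include <cassert>

// Base template: offers only the two-argument member.
template <class T, int C>
class B {
protected:
    int calls_;   // hypothetical counter, only here to make calls observable
public:
    B() : calls_(0) {}
    void member(T, int) { ++calls_; }
    int calls() const { return calls_; }
};

// Derived class for the C==1 case. It needs a name of its own
// (B1 is a hypothetical choice), which is exactly the drawback.
template <class T>
class B1 : public B<T, 1> {
public:
    using B<T, 1>::member;              // keep the inherited overload visible
    void member(T) { ++this->calls_; }  // the single-argument member
};

// Hypothetical driver that exercises the permitted calls.
int demo_derived() {
    B1<int> one;
    one.member(5);       // OK: provided by B1
    one.member(5, 2);    // OK: inherited from B<int,1>
    B<int, 3> three;
    three.member(5, 2);  // OK
    // three.member(5);  // would not compile: B<int,3> has no such overload
    assert(one.calls() == 2);
    return one.calls() + three.calls();
}
```

Note that B<int,1> itself still lacks the single-argument member, so users have to remember to write B1<int> instead.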

A more elegant solution is to declare a private nested class template inside B and instantiate it when the single-argument member is called:

template <class T, int C> class B {
  template <int U> class BX {
  public:
    BX(B<T,U>* p) {}
  };
public:
  B();
  // no problem here
  void member(T _x, int _i);
  // only valid if C==1
  void member(T _x) {
    BX<2-C> bx(this);
    ...
  }
  ...
};

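Calling the single-argument member compiles only if C==1. The constructor of BX<2-C> expects a pointer to B<T,2-C>, whereas this is a pointer to B<T,C>; the two types coincide only when 2-C equals C, i.e., when C==1. If C!=1, the compiler spits out an error message saying that it can’t find a constructor with the appropriate argument. Since member functions of a class template are only instantiated when they are used, B<T,C> with C!=1 remains perfectly usable as long as nobody calls the single-argument member. In this special case we could just have used BX<1>, but I wanted to show that any simple integer expression will do (see below).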
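For reference, here is a complete version of the snippet above that should compile as it stands. The data members, the accessors and the driver function are assumptions I added to make the calls observable; the member bodies are placeholders, not the real ones.

```cpp
#include <cassert>

template <class T, int C>
class B {
    // Helper whose constructor only accepts a pointer to B<T,U>.
    template <int U>
    class BX {
    public:
        BX(B<T, U>*) {}
    };

    T   last_;    // hypothetical state, for illustration only
    int calls_;

public:
    B() : last_(), calls_(0) {}

    // No problem here: available for every C.
    void member(T x, int i) { last_ = x + i; ++calls_; }

    // Only valid if C==1: `this` is a B<T,C>*, but BX<2-C>
    // wants a B<T,2-C>* in its constructor.
    void member(T x) {
        BX<2 - C> bx(this);
        (void)bx;             // silence "unused variable" warnings
        last_ = x;
        ++calls_;
    }

    T   last()  const { return last_; }
    int calls() const { return calls_; }
};

// Hypothetical driver that exercises the permitted calls.
int demo_restricted_member() {
    B<int, 1> one;
    one.member(5);        // OK: C==1
    one.member(5, 2);     // OK
    B<int, 3> three;
    three.member(5, 2);   // OK: the two-argument member is unrestricted
    // three.member(5);   // would not compile: no BX<-1> constructor takes a B<int,3>*
    assert(one.calls() == 2);
    return one.last() + three.last();
}
```

Uncommenting the marked line should produce the constructor error described above; the exact wording depends on the compiler.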

Admittedly, the error message is a little clumsy, but it does the job. This example can easily be generalized – remember that the ternary operator will also be evaluated at compile time. All you have to do is provide a function of C that is equal to C for all permitted values of C, and different from C otherwise.
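As a sketch of such a generalization, suppose the single-argument member should be allowed for C==1 and C==3 only (an arbitrary choice of mine). The ternary expression below yields C for the permitted values and C+1, which can never equal C, for all others. The counter and the driver function are again just assumptions for illustration.

```cpp
#include <cassert>

template <class T, int C>
class B {
    template <int U>
    class BX {
    public:
        BX(B<T, U>*) {}
    };

    int calls_;   // hypothetical counter, only here to make calls observable

public:
    B() : calls_(0) {}

    void member(T, int) { ++calls_; }

    // Permitted only for C==1 or C==3: the template argument equals C
    // exactly for those values, so only then does `this` match.
    void member(T) {
        BX<((C == 1 || C == 3) ? C : C + 1)> bx(this);
        (void)bx;
        ++calls_;
    }

    int calls() const { return calls_; }
};

// Hypothetical driver that exercises the permitted calls.
int demo_generalized() {
    B<double, 1> a;
    a.member(1.0);        // OK: C==1
    B<double, 3> b;
    b.member(1.0);        // OK: C==3
    b.member(1.0, 0);     // OK
    B<double, 2> c;
    c.member(1.0, 0);     // OK
    // c.member(1.0);     // would not compile: BX<3> expects a B<double,3>*
    assert(b.calls() == 2);
    return a.calls() + b.calls() + c.calls();
}
```

The extra parentheses around the ternary expression are not strictly needed here, but they keep the template argument list unambiguous if the condition ever contains a > comparison.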

I’ve checked that this trick works using the Intel 10.1 and GNU 4.1.3 compilers.

OpenMP, ccNUMA and C++

If you are interested in programming with C++ and OpenMP, the just-finished diploma thesis of Holger Stengel might be of interest to you (in German – available on request). It studies ccNUMA effects in C++ and ways to circumvent them. To whet your appetite, there is a nice English poster with most of the results: poster_cppnuma.pdf

This whole work was kicked off by some of the problems I had encountered during my PhD thesis, where I had parallelized a C++ code from condensed matter physics. At that time, nobody had even thought about what would happen if standard C++ elements (arrays of objects, std::vector<> etc.) were used on a ccNUMA machine with OpenMP. Another inspiration came from Matt Austern’s article about Segmented Iterators and Hierarchical Algorithms. The segmented iterator described in that article could be useful for many purposes, of which NUMA placement is only one. In the thesis we implemented a version in which you could exactly control data placement by configurable padding.

I would be glad to continue on this topic with another diploma/bachelor/masters student. If you are hooked, feel free to contact me.