The Marker API of likwid-perfctr lets you count hardware events on your CPU core(s) separately for different execution regions. For example, to count events for a loop, you would use it like this:
    #include <likwid.h>

    int main(...) {
        // always required once
        LIKWID_MARKER_INIT;
        // ...
        LIKWID_MARKER_START("loop");
        for(int i=0; i<n; ++i) {
            do_some_work();
        }
        LIKWID_MARKER_STOP("loop");
        // ...
        LIKWID_MARKER_CLOSE;
        return 0;
    }
An arbitrary number of regions is allowed, and you can use the LIKWID_MARKER_START and LIKWID_MARKER_STOP macros in parallel regions to get per-core readings. The events to be counted are configured on the likwid-perfctr command line.
As with anything that is not part of the actual work in a code, one may ask about the cost of the marker API calls. Do they impact the runtime of the code? Does the number of cores play a role?