Aligning heap data in C is simple: just use posix_memalign() (or the older, non-standard memalign()) instead of malloc() and you’re done. Intel compilers also provide special library calls to achieve the same thing, but you don’t really need them (you do need compiler support, though, for stack data, structures etc.). Anyone familiar with current x86 architectures knows what properly aligned data can do for you: packed aligned loads and non-temporal stores become possible. The compiler can still use aligned data movement on its own in some cases by loop peeling, but you may want to align all references properly so that you can use the #pragma vector aligned sledgehammer (why don’t they provide an argument list for this directive?).
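As a minimal sketch of the C side, assuming a POSIX system: the helper names alloc_aligned_doubles() and is_aligned() below are made up for illustration, and only posix_memalign() and free() are real library calls. The second helper does the same sanity check I describe for Fortran further down, i.e. it looks at the address itself.

```c
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign() in <stdlib.h> */
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if p lies on a multiple of
 * `alignment` bytes, 0 otherwise. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical helper: allocate n doubles on an `alignment`-byte boundary.
 * posix_memalign() requires the alignment to be a power of two and a
 * multiple of sizeof(void *); otherwise it fails and we return NULL.
 * The returned block is released with plain free(). */
double *alloc_aligned_doubles(size_t n, size_t alignment)
{
    void *p = NULL;
    if (posix_memalign(&p, alignment, n * sizeof(double)) != 0)
        return NULL;
    return p;
}
```

With this, alloc_aligned_doubles(100000, 16) should hand back a pointer for which is_aligned(p, 16) is true. Note that #pragma vector aligned is a promise about every reference in the loop, so each array touched there would have to come from an allocation like this one.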
In Fortran there is no standard way to have allocatable data aligned automatically to a given address boundary (the default alignment is typically 8 bytes). The Intel compiler, however, provides a special directive to do just that. To enforce 16-byte alignment you can write:
    double precision, allocatable, dimension(:) :: array
    !DEC$ ATTRIBUTES ALIGN: 16 :: array
    ! ...
    allocate(array(100000))
Although the compiler docs say at some point that this doesn’t work for allocatables, it does, at least for versions 9.1 and 10.1 (I’ve checked by printing out the address explicitly).
This should enable the same vectorization stunts as in C without too much hassle.