A.3 Use of FHI-aims with multithreaded BLAS (e.g., Intel’s MKL)

The performance of FHI-aims depends critically on the Basic Linear Algebra Subprograms (BLAS) library used to perform matrix operations. Such libraries are highly CPU-specific and should be provided and optimized for your particular computer, either by you or by your computer vendor.

Unfortunately, with the advent of multi-core CPUs for PCs, some computer vendors (Intel, IBM) have decided that their proprietary BLAS implementations will, by default, use all available CPUs by way of threads, since they do not expect a user to know how to create parallel code.

In contrast, FHI-aims makes great efforts to distribute its workload evenly itself. This is much more efficient than leaving the task to the BLAS library, which is used for some, but by no means all, operations in the code. FHI-aims starts the correct number of processes via the message-passing interface (MPI), and each process then carries out its share of any further basic numerical operations (e.g., matrix multiplications) through its own BLAS calls.

If the vendor's default settings use all CPUs for every single call of the BLAS operations, then on a system with n CPUs this leads to n×n threads running in parallel, since each of the n MPI tasks starts n threads of its own. This is not good at all for efficiency.
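The arithmetic can be made concrete with a small bash sketch. The helper name oversubscription and the core count of 8 are illustrative assumptions, not part of FHI-aims:

```shell
# Hypothetical helper: total number of threads when every MPI task
# lets its BLAS library start the given number of threads.
oversubscription() {
    local mpi_tasks=$1
    local threads_per_task=$2
    echo $(( mpi_tasks * threads_per_task ))
}

n=8   # assumed number of CPUs on the node
echo "BLAS default: $(oversubscription "$n" "$n") threads on $n CPUs"
echo "One thread per task: $(oversubscription "$n" 1) threads on $n CPUs"
```

With one BLAS thread per MPI task, the number of running threads matches the number of CPUs again.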

The problem is easily fixed by setting the environment variable OMP_NUM_THREADS (the number of threads started by OpenMP-parallelized libraries, e.g., BLAS) to 1:

      export OMP_NUM_THREADS=1
    

(This syntax is correct for the bash shell.) When using Intel’s MKL, you may likewise wish to set MKL_NUM_THREADS to 1. On top of this, MKL may still override your choice unless you also set the less well documented variable MKL_DYNAMIC to FALSE.
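Taken together, the settings discussed here might appear at the top of a job script as in the following bash sketch. The launch line is only a commented-out illustration: the launcher (mpirun), the task count, and the binary and output file names are assumptions that depend on your installation.

```shell
# Restrict every threaded BLAS layer to one thread per MPI task.
export OMP_NUM_THREADS=1     # OpenMP-parallelized libraries
export MKL_NUM_THREADS=1     # Intel MKL specifically
export MKL_DYNAMIC=FALSE     # keep MKL from adjusting the thread count itself

# Show the values that child processes will inherit.
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS MKL_NUM_THREADS=$MKL_NUM_THREADS MKL_DYNAMIC=$MKL_DYNAMIC"

# Hypothetical launch line (names and task count are assumptions):
# mpirun -np 8 aims.x > aims.out
```

The variables must be exported before the MPI launcher is called so that every FHI-aims process inherits them. On some clusters the launcher needs an extra option to forward environment variables to remote nodes.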

Another, much simpler and equally well performing option is to use the freely available Goto BLAS subroutines that can be downloaded and compiled on standard architectures.

Some versions of Intel’s MKL are known to have an error in a function called “pdtran”. Thus, at startup, FHI-aims tests the version of pdtran it is currently using for correctness. Should FHI-aims abort with the error message “pdtran test failed! Aborting…”, you will have to use a different version of Intel’s MKL or a replacement such as Goto BLAS.