Highlights

Efficient all-electron hybrid density functionals for atomistic simulations beyond 10 000 atoms

Case Study
15 Jul 2024

A highly optimized implementation of hybrid density functional approximations (DFAs) in the all-electron code FHI-aims has been developed, dramatically improving performance and scalability for both non-periodic and periodic systems. This collaborative effort extends the reach of hybrid DFAs to simulations of over ten thousand atoms, opening new possibilities for large-scale electronic structure calculations across diverse chemical systems, exemplified for perovskites, organic crystals, and complex ice structures.

Benchmark results for the largest periodic structures considered: average runtimes per self-consistent field iteration for evaluating the HSE06 exchange operator (blue bars) and for the ELPA two-stage eigenvalue solver (red bars).

- Taken from Kokott, S., et al., J. Chem. Phys. 161, 024112 (2024), DOI: 10.1063/5.0208103

Exchange-correlation approximations that include a fraction of exact exchange, so-called hybrid exchange-correlation functionals, offer compelling accuracy for ab initio electronic-structure simulations of molecules, nanosystems, and bulk materials. They address some deficiencies of the computationally cheaper, frequently used semilocal DFAs. However, evaluating the non-local exact exchange contribution is the computational bottleneck of hybrid DFAs and has so far limited their application to large-scale simulations.

A team of researchers in Berlin, Hamburg, Garching, Stuttgart (Germany), Espoo (Finland), and Durham (USA) has developed a highly optimized implementation of exact exchange calculations in the all-electron code FHI-aims. This advance applies to both periodic and non-periodic systems and is designed for high-performance CPU clusters. By leveraging refined MPI-3 parallelization techniques, shared memory arrays, and parallelization over basis functions, the team achieved substantial improvements in memory efficiency, computational performance, and workload distribution. The optimized implementation achieves nearly perfect linear scaling with system size and ideal speedup with increasing node count, reducing runtimes for large systems by more than a factor of 100 compared to the previous implementation. Excellent scaling and accuracy are maintained even for simulations of systems exceeding 10,000 atoms.

Scaling of the HSE06 exchange evaluation time per SCF iteration with the new implementation, for different GaAs supercells in FHI-aims. Grey lines indicate ideal strong scaling (left plot) and linear scaling (right plot). All calculations are all-electron without any shape approximations to the underlying electron-nuclear potential, performed with intermediate species defaults, and without employing any symmetry or system-specific simplifications. The k-grid for the largest, 1024-atom supercell is chosen as 1x1x1 and scaled accordingly for smaller cells.

- Adapted from Kokott, S., et al., J. Chem. Phys. 161, 024112 (2024), DOI: 10.1063/5.0208103
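
The two grey reference lines in the scaling figure correspond to standard definitions, which the following minimal Python sketch illustrates. It is not taken from FHI-aims or from the paper; the function names are hypothetical, and any numbers passed in are placeholders rather than measured timings. Ideal strong scaling means that, for a fixed system, the runtime drops in inverse proportion to the node count; linear scaling means that, on fixed resources, the runtime grows in proportion to the number of atoms.

```python
def ideal_strong_scaling_time(t_ref, nodes_ref, nodes):
    """Ideal runtime on `nodes` nodes for a fixed system size,
    given a reference runtime `t_ref` measured on `nodes_ref` nodes."""
    return t_ref * nodes_ref / nodes


def linear_scaling_time(t_ref, atoms_ref, atoms):
    """Runtime expected for `atoms` atoms under perfect linear scaling,
    given a reference runtime `t_ref` for `atoms_ref` atoms on the same resources."""
    return t_ref * atoms / atoms_ref


def parallel_efficiency(t_ref, nodes_ref, t, nodes):
    """Fraction of the ideal strong-scaling speedup actually achieved
    (1.0 means the measured time `t` lies on the ideal line)."""
    return (t_ref * nodes_ref) / (t * nodes)


if __name__ == "__main__":
    # Placeholder values only, NOT results from the paper.
    t_ref, nodes_ref = 100.0, 4
    for nodes in (4, 8, 16):
        print(nodes, ideal_strong_scaling_time(t_ref, nodes_ref, nodes))
```

Comparing a measured timing against these reference values shows how far a given run deviates from the ideal lines in the plot.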

The implementation's efficiency was demonstrated under production settings for a diverse range of chemical systems, including complex materials such as hybrid perovskites, organic crystals, and ice structures with up to 30,576 atoms (101,920 electrons described by 244,608 basis functions). These advances enable hybrid DFA calculations with FHI-aims for complex nanoscale systems across chemistry and materials science without compromising accuracy, opening new possibilities for accurate simulations of defects, interfaces, and other phenomena that require large structural models.

The full reference for the published article is:

Kokott, S., et al., J. Chem. Phys. 161, 024112 (2024), DOI: 10.1063/5.0208103
