3.26 Hartree-Fock and hybrid functionals, including periodic systems
Periodic versions of the Hartree-Fock method and of hybrid density functionals are implemented in FHI-aims. Generally, our experiences are very good. The implementation is stable and seems to scale well towards large systems. However, we still ask you to exercise some care. If you encounter any unexpected difficulties, consult the developers.
Periodic versions of MP2, RPA and GW are not yet part of the main FHI-aims distribution, as they are in various stages of development. Please feel free to ask on our Slack channel about these methods for periodic systems. They are very important to us, and we will be happy to share them as they become ready and more usable.
Complete descriptions of the material described below, as well as extensive benchmarks, are summarized in Ref. [197], as well as Ref. [151] (for the non-periodic implementation of the “LVL” approach). Forces and the stress tensor are also implemented, with a clear description of the stress tensor given in Ref. [171].
There are two specific issues from a usability point of view which should be considered:
• The RI_method “LVL” described below is implemented in a linear-scaling fashion and is therefore the only reasonable pathway for large and/or periodic systems. It is, however, slightly less accurate (for formal reasons) than RI-V, the non-linear-scaling version for non-periodic systems. Please bear this in mind. For standard solids and hybrid functionals, the effects seem very small. See Ref. [151] for quantitative tests. In general, RI-LVL has been extremely reliable for us.
• For band structure output, the exx_band_structure_version keyword toggles between a faster real-space version, which works only when a relatively dense k-space grid has been used during the regular s.c.f. cycle, and a slow(!) fallback version that is calculated in reciprocal space. The underlying reason is that the real-space Born-von Karman cell of the regular s.c.f. cycle may become too small to accommodate some k-vectors that are not exact reciprocal lattice vectors of the Born-von Karman cell. The slow fallback version should not simply be used by default, since it can easily become the computational bottleneck, regarding both time and memory.
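In periodic Hartree-Fock (and hybrid functional) implementations, the key quantity that needs to be evaluated is the exact-exchange matrix,
$$ K_{ij}(\mathbf{k}) = \sum_{m,n}\sum_{\mathbf{q}} \left( \varphi_{i}^{\mathbf{k}} \varphi_{m}^{\mathbf{q}} \,\middle|\, \varphi_{n}^{\mathbf{q}} \varphi_{j}^{\mathbf{k}} \right) D_{mn}(\mathbf{q}) \qquad (3.92) $$
where $\mathbf{k}$ and $\mathbf{q}$ are the Bloch vectors, $\varphi_{i}^{\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}} \varphi_{i}^{\mathbf{R}}(\mathbf{r})$ is the Bloch summation of the $i$-th atomic orbital $\varphi_{i}^{\mathbf{R}}$ living in the unit cell $\mathbf{R}$, and $D_{mn}(\mathbf{q})$ is the density matrix. $K_{ij}(\mathbf{k})$ can be obtained from its Fourier transform,
$$ K_{ij}(\mathbf{k}) = \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}} X_{ij}(\mathbf{R}) \qquad (3.93) $$
where
$$ X_{ij}(\mathbf{R}) = \sum_{m,n}\sum_{\mathbf{R}',\mathbf{R}''} \left( \varphi_{i}^{\mathbf{0}} \varphi_{m}^{\mathbf{R}'} \,\middle|\, \varphi_{n}^{\mathbf{R}''} \varphi_{j}^{\mathbf{R}} \right) D_{mn}(\mathbf{R}''-\mathbf{R}') \qquad (3.94) $$
and $D_{mn}(\mathbf{R})$ is the real-space representation of the density matrix.
In FHI-aims, periodic Hartree-Fock and hybrid density functionals are implemented in two different ways. One implementation is based on the “k-space” formulation, where one computes $K_{ij}(\mathbf{k})$ directly from Eq. (3.92). An alternative, more efficient implementation is based on the “real-space” formulation, where one first computes $X_{ij}(\mathbf{R})$ from Eq. (3.94) and then Fourier transforms it to $K_{ij}(\mathbf{k})$. The “real-space” implementation is used in the code by default, and the “k-space” implementation is only used for cross-checking purposes.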
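The final step of the “real-space” route, the Fourier transform of Eq. (3.93), can be illustrated with a minimal NumPy sketch. This is not the FHI-aims implementation: the function name exchange_matrix_k, the use of fractional coordinates for cells and k-points, and the random toy matrices are all assumptions made for illustration.

```python
import numpy as np

def exchange_matrix_k(x_real, cells_frac, k_frac):
    """Sum real-space blocks X(R) with Bloch phases exp(2*pi*i k.R).

    x_real     : array (n_cells, n_basis, n_basis), one block per cell R
    cells_frac : array (n_cells, 3), integer cell indices R
    k_frac     : array (3,), k-point in fractional reciprocal coordinates
    """
    phases = np.exp(2j * np.pi * (cells_frac @ k_frac))
    # Contract the cell index: K(k) = sum_R phase(R) * X(R)
    return np.tensordot(phases, x_real, axes=(0, 0))

# Hypothetical toy data: three cells along one direction, two basis functions.
rng = np.random.default_rng(0)
cells = np.array([[0, 0, 0], [1, 0, 0], [-1, 0, 0]])
x0 = rng.normal(size=(2, 2))
x0 = x0 + x0.T                       # X(0) is symmetric
x1 = rng.normal(size=(2, 2))
x_real = np.stack([x0, x1, x1.T])    # X(-R) = X(R)^T keeps K(k) Hermitian

k_gamma = exchange_matrix_k(x_real, cells, np.array([0.0, 0.0, 0.0]))
k_quarter = exchange_matrix_k(x_real, cells, np.array([0.25, 0.0, 0.0]))
```

At the Gamma point all phases equal one, so the result reduces to the plain sum over the real-space blocks. Because only blocks for a finite set of cells enter the sum, sparsity in real space directly limits the cost of this step.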
Both implementations are based on a localized resolution-of-identity approximation, which we term “RI-LVL”, in analogy to “RI-SVS” and “RI-V” introduced in Sec. 3.25. Under “RI-LVL”, the products of two regular basis functions centered on atoms $I$ and $J$ are expanded only in terms of auxiliary functions centered on these two atoms. Possible contributions of auxiliary functions from a third center are excluded in this approximation, in contrast to “RI-V”. Specifically, one has
$$ \varphi_{i}(\mathbf{r}-\mathbf{R}_{I})\,\varphi_{j}(\mathbf{r}-\mathbf{R}_{J}) \approx \sum_{\mu\in\mathcal{P}(I)} C_{ij}^{\mu} P_{\mu}(\mathbf{r}-\mathbf{R}_{I}) + \sum_{\nu\in\mathcal{P}(J)} C_{ij}^{\nu} P_{\nu}(\mathbf{r}-\mathbf{R}_{J}) \qquad (3.95) $$
where $\mu$ and $\nu$ enumerate the auxiliary basis functions centered on atoms $I$ and $J$, respectively. This approximation has been extensively benchmarked with respect to the more accurate “RI-V” approximation for finite systems, and with respect to other independent implementations for molecular systems. The achieved accuracy is remarkable and should be sufficiently good for production calculations.
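As a sketch of why this expansion is useful, consider the four-center Coulomb integrals needed for exact exchange. The notation here is an assumption for illustration: $P_{\mu}$ denotes auxiliary basis functions, $C_{ij}^{\mu}$ the expansion coefficients of the product of basis functions $i$ and $j$ (on atoms $I$ and $J$), $\mathcal{P}(IJ)$ the set of auxiliary functions centered on either of the two atoms, and $V_{\mu\nu}$ the Coulomb matrix of the auxiliary basis. Inserting the two-center expansion for both orbital pairs gives the standard resolution-of-identity form:

```latex
\begin{align}
(\varphi_i \varphi_j \,|\, \varphi_k \varphi_l)
  &\approx \sum_{\mu \in \mathcal{P}(IJ)} \sum_{\nu \in \mathcal{P}(KL)}
     C_{ij}^{\mu} \, V_{\mu\nu} \, C_{kl}^{\nu}, \\
V_{\mu\nu}
  &= \iint \frac{P_{\mu}(\mathbf{r}) \, P_{\nu}(\mathbf{r}')}
                {|\mathbf{r} - \mathbf{r}'|}
     \, d\mathbf{r} \, d\mathbf{r}'.
\end{align}
```

Because each sum runs only over auxiliary functions on the two atoms carrying the orbital pair, the number of nonzero coefficients per pair stays bounded as the system grows, which is what makes a linear-scaling evaluation possible.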
Periodic Hartree-Fock and hybrid-functional calculations can be run in the same manner as the periodic LDA and GGA cases, by setting the keyword xc to hf or to the desired hybrid functional, and setting the k_grid mesh to appropriate values. As mentioned above, the “real-space” periodic Hartree-Fock implementation is invoked by default. Two thresholding parameters (detailed below) control the balance between computational load and accuracy. One may also switch to the “k-space” implementation of periodic Hartree-Fock and hybrid functionals for testing or comparison purposes by setting the keyword use_hf_kspace to .true. (see below). The thresholding parameters do not apply to the “k-space” implementation, however.
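As a minimal sketch, the settings named in this paragraph can be assembled into control.in lines with a small helper. The helper periodic_hybrid_settings is hypothetical and not part of FHI-aims, and the 8 8 8 grid is an arbitrary illustrative value, not a recommendation.

```python
def periodic_hybrid_settings(xc, k_grid, use_kspace=False):
    """Return control.in lines for a periodic HF/hybrid run (illustrative only)."""
    if len(k_grid) != 3 or any(n < 1 for n in k_grid):
        raise ValueError("k_grid needs three positive integers")
    lines = [
        f"xc {xc}",
        "k_grid " + " ".join(str(n) for n in k_grid),
    ]
    if use_kspace:
        # Slow k-space implementation, intended for cross-checks only.
        lines.append("use_hf_kspace .true.")
    return "\n".join(lines)

default_run = periodic_hybrid_settings("hf", (8, 8, 8))
crosscheck_run = periodic_hybrid_settings("hf", (8, 8, 8), use_kspace=True)
print(default_run)
```

Keeping the k-space switch as an explicit opt-in mirrors the text: the real-space implementation is the default and the k-space one is reserved for testing.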
Koopmans-compliant screened exchange hybrid functional
The functional is defined through the
• background dielectric constant, eps_sx
• effective number of electrons, N_sx
• volume of the cell, V_sx
• the parameter, Z_sx
as described in Eqs. (2-4) of the paper [204]. If you want to do a calculation on a system contained in the paper, the values of Z_sx, N_sx and eps_sx are provided there, and V_sx is calculated with the experimentally observed lattice parameters. If your material is not listed in Table IV of the paper, follow these steps:
1. To obtain the background dielectric constant, perform a GGA calculation of the optical properties. Select the number of bands and k-points in the range indicated by Table IV of the paper.
2. To determine N_sx, investigate which states primarily form the highest valence band, and choose the corresponding number of electrons from the constituting atomic states. As an example, in GaN the top of the valence band is made up of nitrogen 2p orbitals. In a nitrogen atom, these hold 3 electrons; hence, for the primitive cell containing 2 nitrogen atoms, N_sx = 6. Consider your actual supercell (even for materials listed in Table IV), and increase N_sx according to the number of atoms involved.
3. To determine the optimal Z_sx value, perform band structure calculations for the primitive unit cell, using the experimental lattice parameters and a good k-point set. Tune Z_sx to match the calculated gap to the 0 K quasi-particle band gap, obtained either from a GW0 calculation or from renormalized 0 K optical data [M. Cardona and M. L. W. Thewalt, Rev. Mod. Phys. 77, 1173 (2005)]. Z = 0.6 is usually a meaningful starting point.
4. The default value of V_sx is the cell volume calculated by the code. However, the functional is meant to be optimal at the experimental geometry and to yield equilibrium lattice parameters close to it. During a geometry optimization the cell volume changes, which would change the functional along the way and prevent convergence. To keep the functional fixed, the actual volume has to be replaced with the experimental one, so the experimental volume should be provided. Alloys are different: the volume changes with the composition and the functional must change with it, so in that case the actual volume is used to define the functional.
5. To perform a calculation with the new functional, set the parameters in control.in (here, e.g., for bulk GaN):
   xc kc_sx Z_sx 0.72 N_sx 6 eps_sx 5.82
   Optional: if the calculation should not be done at the volume of the input structure, provide it as V_sx=....
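The bookkeeping in steps 2 and 4 can be sketched as follows. The helper names n_sx and cell_volume are hypothetical, and the lattice vectors are placeholder numbers rather than experimental GaN values; the unit expected for V_sx is not stated in this section, so the volume simply comes out in the cube of whatever length unit the lattice vectors use.

```python
import numpy as np

def n_sx(electrons_per_atom, n_atoms):
    """Effective electron count: electrons in the relevant atomic states
    times the number of atoms contributing them to the top valence band."""
    return electrons_per_atom * n_atoms

def cell_volume(lattice_vectors):
    """Volume spanned by three lattice vectors given as rows of a 3x3 array."""
    return abs(np.linalg.det(np.asarray(lattice_vectors, dtype=float)))

# GaN example from the text: nitrogen 2p holds 3 electrons, 2 N atoms per cell.
n_primitive = n_sx(3, 2)
# A 2x2x2 supercell contains 8 primitive cells, hence 16 nitrogen atoms.
n_supercell = n_sx(3, 2 * 8)

# Hexagonal cell with placeholder parameters a = 1 and c = 2 (NOT real GaN data).
a, c = 1.0, 2.0
hexagonal = [[a, 0.0, 0.0],
             [-a / 2, a * np.sqrt(3) / 2, 0.0],
             [0.0, 0.0, c]]
volume = cell_volume(hexagonal)
```

For a supercell, N_sx scales with the number of contributing atoms, whereas the volume passed for a fixed functional should come from the experimental lattice parameters of the same cell.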
Note that the screened exchange functional is similarly sensitive to the choice of the basis as the Hartree-Fock method (in contrast to other screened hybrid functionals such as HSE06). Good results were obtained using "intermediate" settings. Using "light" settings gave significant deviations from the converged results.
Tags for general section of control.in:
Tag: calculate_fock_matrix_version (control.in)
Usage: calculate_fock_matrix_version value
Purpose: Sets the internal code version to be used for the linear-scaling
evaluation of the exchange matrix.
value is a number between 0 (zero) and 5 (five). Default: 5.
A significant optimization effort has targeted the large-scale hybrid density-functional theory part of FHI-aims since the original implementation in Ref. [197]. The original version, calculate_fock_matrix_version 0, is still available and reached seriously large system sizes (up to 1,024 atoms were demonstrated in the original paper), but the updated versions (now the default) are significantly more memory-efficient and time-efficient, and scale to considerably larger systems.
The current default (at the time of writing) is calculate_fock_matrix_version 5. It offers a fairly sophisticated, load-balanced infrastructure that scales nearly ideally with the number of processors for large systems and reduces memory use to the level deemed appropriate for the available compute nodes. It is based on shared memory usage within the MPI-3 standard and notably avoids OpenMP (no OpenMP is used and no OpenMP commands should ever be needed with FHI-aims). This allows us to control the near-optimal layout of arrays etc. entirely within the code, for a given number of MPI tasks and compute nodes.
A possible, still rather efficient fallback in case of issues is calculate_fock_matrix_version 4, which is less sophisticated (but also less efficient) in its efforts to control load balancing. However, note that the correct way to address a failure of the hybrid DFT in FHI-aims (e.g., due to lack of memory for very large systems) using calculate_fock_matrix_version 5 is NOT just to fall back to calculate_fock_matrix_version 4. Rather (in our experience) one may simply need some more nodes, towards very large systems; alternatively, some underlying MPI libraries (not part of FHI-aims itself) may have unresolved bugs that can be addressed by switching to a different MPI library. Our recommendation is to just stick with calculate_fock_matrix_version 5 and address the underlying reasons for an observed problem, if helpful by contacting us via the Slack channel.
Tag: fock_matrix_nodes_per_instance (control.in)
Usage: fock_matrix_nodes_per_instance value
Purpose: Manually sets the number of nodes to be used for one instance
of a Fock matrix calculation.
value is an integer number that is a divisor of the total number of nodes.
The number of instances created is the total number of nodes divided by value.
An instance is a collection of arrays and communicators that are needed to compute
one row of the Fock matrix. Specify either fock_matrix_nodes_per_instance
or fock_matrix_instances_per_node, not both at the same time.
Default: the parameter is determined at runtime based on the largest arrays in a
Fock matrix calculation.
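The divisor rule above can be checked before submitting a job. The following is a minimal sketch; the helper names fock_instances and valid_nodes_per_instance are hypothetical and only restate the arithmetic given in the keyword description.

```python
def fock_instances(total_nodes, nodes_per_instance):
    """Number of instances = total nodes / nodes per instance (must divide evenly)."""
    if nodes_per_instance < 1 or total_nodes % nodes_per_instance != 0:
        raise ValueError(
            "fock_matrix_nodes_per_instance must be a divisor of the node count"
        )
    return total_nodes // nodes_per_instance

def valid_nodes_per_instance(total_nodes):
    """All admissible values for a job running on total_nodes nodes."""
    return [n for n in range(1, total_nodes + 1) if total_nodes % n == 0]

instances = fock_instances(12, 4)
choices = valid_nodes_per_instance(12)
```

For a 12-node job, a value of 4 yields 3 instances, while a value of 5 would be rejected because it does not divide the node count.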
Tag: fock_matrix_instances_per_node (control.in)
Usage: fock_matrix_instances_per_node value
Purpose: Manually sets the number of instances to be created per node.
value is an integer number. An instance is a collection of arrays and
communicators that are needed to compute one row of the Fock matrix. value
should not be set to a value larger than 8. This keyword is usually only needed for small
calculations, when there is enough memory per node available.
Default: the parameter is determined at runtime based on the largest arrays in a
Fock matrix calculation.
Tag: fock_matrix_blocking (control.in)
Usage: fock_matrix_blocking value
Purpose: Divides the Fock matrix into blocks with value rows that are
computed simultaneously.
value is an integer number, optimally a divisor of the number of basis functions.
Default: the parameter is determined at runtime based on the available memory per node.
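A short sketch shows why a divisor of the number of basis functions is the natural choice: only then do all row blocks have the same size. The helper block_layout is hypothetical and assumes that a non-divisor value simply leaves a smaller final block; how FHI-aims actually handles the remainder is not described here.

```python
def block_layout(n_basis, blocking):
    """Row counts of consecutive blocks when n_basis rows are split into
    blocks of at most `blocking` rows (assumed remainder handling)."""
    if blocking < 1:
        raise ValueError("blocking must be a positive integer")
    full_blocks, remainder = divmod(n_basis, blocking)
    return [blocking] * full_blocks + ([remainder] if remainder else [])

even_layout = block_layout(12, 4)     # divisor: three equal blocks
uneven_layout = block_layout(10, 4)   # non-divisor: last block is smaller
```

Equal block sizes keep the work per block uniform, which is presumably why a divisor is described as optimal.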
Tag: coulomb_threshold (control.in)
Usage: periodic_hf coulomb_threshold value
Purpose: This sets a threshold value for a key ingredient in the construction
of the exact-exchange matrix – the Coulomb matrix.
The Coulomb matrix elements below the specified threshold
value are discarded in the calculation. Suggested values
are between and 0. The default value is .
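The effect of such a threshold can be sketched with NumPy. This is an illustration only: the function apply_threshold is hypothetical, the comparison on the absolute value of each element is an assumption, and the threshold 1e-10 and the matrix entries are arbitrary numbers, not the code's defaults.

```python
import numpy as np

def apply_threshold(matrix, threshold):
    """Zero out elements whose magnitude is below `threshold`.

    Returns the screened matrix and the fraction of elements kept.
    """
    m = np.asarray(matrix, dtype=float)
    kept = np.abs(m) >= threshold
    return np.where(kept, m, 0.0), kept.mean()

# Hypothetical Coulomb matrix elements spanning many orders of magnitude.
coulomb = [[1.0, 1e-8],
           [-1e-12, 0.5]]
screened, fraction_kept = apply_threshold(coulomb, 1e-10)
```

A threshold of 0 keeps every element, so smaller thresholds trade sparsity, and hence speed, for accuracy.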
Tag: exx_band_structure_version (control.in)
Usage: exx_band_structure_version value
Purpose: A periodic band structure calculation can be performed
either using a real-space version (value=1) or a
reciprocal-space version (value=2).
value is an integer, either 1 or 2. Default: 1.
The distinction between real-space and reciprocal-space pertains to the method used to calculate the Fock matrix; in both cases, the coordinate system used when specifying the k-path via the output band keyword is expressed in terms of reciprocal coordinates.
If output band is requested for a periodic Hartree-Fock or hybrid functional calculation, adhere to the following rules:
• Do not use an excessive number of points in each band segment; for instance, use no more than 11. We also note that 21 is a reasonable value to sample the fine features of a band structure.
• The real-space band structure version (value=1) has low overhead and is accurate IF a reasonably dense k_grid is used during the preceding s.c.f. calculation. For very sparse s.c.f. k_grid settings it can, however, fail. In that case, the failure is so obvious that one cannot miss it. We have implemented a criterion that checks whether the k_grid used is dense enough to produce a reasonable band structure, and the code stops if the check fails. The criterion is deliberately tight to be safe. As a result, for some (not all!) systems you can obtain a reasonable band structure even with a k_grid that has one k-point fewer in every direction than the grid returned by the criterion. If you want to bypass the check, you can use the override_warning_loose_kgrid keyword in control.in. If you do, we advise using it with discretion and always testing the results.
• exx_band_structure_version 2 is a fallback method that will always work but comes with significant time and memory overhead. If the plotted band structure from the real-space version, exx_band_structure_version 1, has obvious numerical problems, please switch to a denser k_grid during s.c.f. Only if this approach is not successful or not possible, consider exx_band_structure_version 2. The latter will always work, as the critical part of the work is handled in reciprocal space. As a consequence, though, sparsity in real space can no longer be exploited, and the band structure calculation becomes much slower than the real-space version.
• In case of doubt, the band structure ONLY at k-points used during the s.c.f. cycle itself can also be printed along certain directions by using the output band_during_scf keyword, which ensures that only the information that went into the s.c.f. cycle is actually used. This is mainly useful for debugging purposes.
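To see which band-path points coincide with k-points of the s.c.f. grid, and are therefore exact reciprocal lattice vectors of the Born-von Karman cell, a small check can help. This sketch assumes an unshifted, Gamma-centered k_grid and k-points given in fractional reciprocal coordinates; the helper on_scf_grid is hypothetical and is not the criterion implemented in FHI-aims.

```python
from fractions import Fraction

def on_scf_grid(k_frac, k_grid):
    """True if every fractional coordinate times the grid dimension is an
    integer, i.e. the k-point lies on an unshifted n1 x n2 x n3 grid."""
    return all(
        (Fraction(str(k)) * n).denominator == 1
        for k, n in zip(k_frac, k_grid)
    )

x_point_ok = on_scf_grid((0.5, 0.0, 0.5), (4, 4, 4))        # on the 4x4x4 grid
generic_ok = on_scf_grid((0.3, 0.0, 0.0), (4, 4, 4))        # between grid points
third_ok = on_scf_grid((Fraction(1, 3), 0, 0), (6, 6, 6))   # exact 1/3 on 6x6x6
```

Points that fail this check are the ones that rely on Fourier interpolation in the real-space version, which is why a sufficiently dense s.c.f. k_grid matters there.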
Tag: override_warning_loose_kgrid (control.in)
Usage: override_warning_loose_kgrid
Purpose: If present, this keyword overrides the stop in the code that
prevents the band structure calculation from running when a hybrid or Hartree-Fock
exchange-correlation functional is requested and the given s.c.f.
k-grid is insufficient for Fourier interpolation. As the criterion
is rather tight, it can in some cases demand a slightly denser k-grid than necessary.
Thus, you have the option to bypass the check with this keyword. However, we advise
using it with discretion.
Tag: screening_threshold (control.in)
Usage: periodic_hf screening_threshold value
Purpose: This sets a screening parameter in a periodic Hartree-Fock (or hybrid
functional) calculation. The real-space exact-exchange matrix elements below the specified
threshold value are neglected in the calculation. Suggested values are between
and 0. Smaller values mean better accuracy but heavier computational loads.
The default value is .
Tag: use_hf_kspace (control.in)
Usage: use_hf_kspace flag
Purpose: The “k-space” periodic HF implementation can be invoked
by setting flag to .true. (this is, however,
very expensive).
Tag: split_atoms (control.in)
Usage: split_atoms flag
Purpose: The “split_atoms” periodic HF implementation can be switched off
by setting flag to .false.
This keyword is no longer needed since calculate_fock_matrix_version 5 (the current default of the linear-scaling exact exchange implementation for Hartree-Fock and hybrid DFT) effectively supersedes it.
We no longer document this keyword here for this reason. Note that, if you use it with calculate_fock_matrix_version 5, split_atoms may actually increase the memory usage substantially (potentially preventing a calculation from working) while not offering any other gains.