3.25 Hartree-Fock, hybrid functionals, GW, et al.: All the details

The basic keywords to invoke different exchange-correlation methods (ground state and excited states) are given described in Sec. 3.3. Usually, invoking the relevant keyword together with the normal infrastructure required to run FHI-aims should be sufficient to produce a correct, converged result.

For Hartree-Fock and hybrid functionals, particularly for their periodic implementations, see also the dedicated next section, Sec. 3.26.

For methods that rely explicitly on a two-electron Coulomb operator (Hartree-Fock, hybrid functionals, GW, MP2, RPA, etc.) and/or a frequency-dependent response function (GW, RPA, …), some considerable numerical trickery enters the computation in order to keep it efficient yet manageable for practical purposes. We hope to provide resilient, system-independent default settings, but there is a lot of freedom beyond those defaults to either tighten up things or speed up calculations (at the price of reduced accuracy).

The present section describes all numerical settings for the aforementioned exchange-correlation treatments.

Specifically, we describe:

  • All settings that relate to the setup of the (over-)complete auxiliary basis that expands the products of pairs of basis functions into a separate basis to represent the Coulomb operator

  • All settings that rely to the frequency grid, analytic continuation from the imaginary to the real axis and contour deformation in GW related methods.

Even if you do not know what this is all about, you should know that the “auxiliary basis” is determined as an overcomplete basis, and superfluous basis functions are then reduced out by a threshold criterion, using singular value decomposition. This threshold is a value to be tested in case something unexpected happens.

There are four key references that provide the technical background for these sections:

  1. 1.

    Principle of how we calculate the two-electron Coulomb operator by so-called resolution of identity: Xinguo Ren et al. (2012), New J. Phys. 14, 053020, Ref. [257]. The approach summarized below and implemented in Keyword RI_method V is still the default for any non-periodic many-body perturbation calculations beyond DFT (i.e., for MP2, RPA, GW, etc.)

  2. 2.

    Localized resolution of identity (Keyword RI_method LVL), which is used by default for all Hartree-Fock and hybrid functional calculations, and which is the only option for any periodic calculations including the two-electron Coulomb operator: Arvid Ihrig et al. (2015), New J. Phys. 17, 093020, Ref. [151].

  3. 3.

    Linear-scaling and periodic implementation of periodic Hartree-Fock and hybrid functionals based on RI_method LVL, described in Sergey Levchenko et al. (2015), Comput. Phys. Commun. 192, 60-69, Ref. [197].

  4. 4.

    Technical details of self-energy evaluation in GW related methods, described in Ref. [257] and more detailed in Golze et al. (2018), J. Chem. Theory Comput., 14, 4856  [109]. The latter contains also a comparison of different techniques.

Mathematical background:

Any feature beyond standard DFT (e.g., HF, hybrid functional, MP2, GW, etc) requires the two-electron Coulomb repulsion integrals, and in FHI-aims an additional auxiliary basis set is introduced to deal with them. By utilizing the auxiliary basis functions the Nbasis4 many 4-center integrals are reduced to Nbasis2Naux many 3-center integrals and Naux2 many 2-center integrals ( where Nbasis and Naux are the numbers of regular basis functions and auxiliary basis functions, respectively). There are different ways to do so, and here we describe two versions of these, namely, the “V" and “SVS" [302] versions which have been implemented in this code. In the “V" version, the 4-center integrals are approximated by

(ij|ij)μν(ij|μ)Vμν1(ν|ij), (3.76)

and in the “SVS" one,

(ij|ij)μνμν(ijμ)Sμμ1VμνSνν1(νij), (3.77)

where i,j,i,j, denote the regular basis functions and μ,ν, denote the auxiliary basis functions. Here Vμν is the Coulomb repulsion integral between two auxiliary basis functions, and Sμν is the corresponding overlap integral. (ijμ) and (ij|μ) are the overlap and Coulomb repulsion between the regular basis orbital product ϕiϕj and the auxiliary basis function Pμ respectively. Eq. (3.76) and (3.77) are often refered to as resolution of identity (Refs. [41, 6, 302, 86] and others). In practice satisfactory accuracy can be gained with an auxiliary basis size Naux of 4-5 times of Nbasis. In addition, we implement a modified localized version of RI-V known as “RI-LVL”, described in Ref. [151] and also in Sec. 3.26. For the forthcoming low-scalign RPA and GW approaches, we have implemented the separable RI [78] where the 4-center integrals are approximated by

(ij|ij)=μνkkϕi(𝐫k)ϕj(𝐫k)MkμMνkϕi(𝐫k)ϕj(𝐫k) (3.78)

In Eq. (3.78) the orbital basis functions ϕi(𝐫k) are calculated in a set of real-space (RS) 𝐫k grid points and the Mμk fitting coefficients are obtained by the least square estimator. This version of RI is known as “RI-RS”.

How is the auxiliary basis functions constructed? In FHI-aims, it is built up as the “on-site” pair products of the regular basis functions (hence the auxiliary basis is also called the “product basis” in this context). These products are then orthonormalized at each atom using the gram-Schmidt method. These auxiliary basis functions are hence atom-centered numeric functions with a radial function times spherical harmonics Pμ(𝐫)=ξ(r)at,n,lrYlm(ϑ,φ). The radial part of the auxiliary basis function is formally linked to that of the regular basis functions by

{ξat,n,l(r)}={uat,n1,l1(r)uat,n2,l2,|l1l2|l|l1+l2|}. (3.79)

In Eq. (3.79) we make it clear that the set of auxiliary basis functions centered on certain atom originates from the pair products of regular basis functions centered on the the same atom. The angular momentum of the auxiliary basis and those of the two constituent regular basis satisfy the triangular true. The number of auxiliary basis function for a give l (enumerated by n) is controlled by the allowed pairs of regular basis functions, and the accuracy threshold in the Gram-Schmidt orthormalization. The process is described in Ref. [257] and in more detail with an illustrating figure in Ref. [151] (open access). The parameters that controls the construction of the auxiliary basis functions can be found below.

𝑮𝑾: Self-energy evaluation

Practical guidelines how to conduct GW calculations and a summary of numerical techniques are given a recent, comprehensive review article [107]. In GW, we have to calculate the self-energy Σ, which is given by

Σ(𝐫,𝐫,ω)=i2π𝑑ωeiωηG0(𝐫,𝐫,ω+ω)W0(𝐫,𝐫,ω). (3.80)

The frequency integration in Eq. (3.80) presents one of the major challenges in a G0W0 calculation because G0 and W0 have poles close to the real frequncy axis. Different numerical techniques were developed, which are summarized in Ref. [107]. In FHI-aims, the self-energy can be calculated with the analytic continuation and the contour deformation. The contour deformation is computationally more expensive, but more accurate. The implementation of the analytic continuation in FHI-aims is described in [257] and the implementation of the contour deformation in [109, 239].

Analytic continuation

When using the analytic continuation, the self-energy is first calculated on the imaginary frequency axis

Σ(𝐫,𝐫;iω)=i2π𝑑ωG(𝐫,𝐫;iω+iω)W(𝐫,𝐫;iω) (3.81)

and then analytically continued to the real frequency axis. A proper frequency grid is needed for the analytic continuation, i.e., a set of imaginary frequencies {iω} for which Σ(iω) is computed. One popular way of performing the analytical continuation is to model the self-energy with a multi-pole expression [264], namely,

Σ(iω)A0+nAniωBn, (3.82)

where n is the number of poles, and An and Bn are complex numbers. Eq. (3.82) is used to fitted the calculated self-energy on imagninary axis using the non-linear least square fitting algorithm. Once the the the parameters An and Bn are obtained that give the best fitting, the self-energy on the real frequency axis can be obtained by

Σ(ω)A0+nAnωBn. (3.83)

In practice n=2 (the so-called two-pole fitting) is often found to give good performance.

A Pade approximation based variant with more poles is also implemented in FHI-aims. In the Pade approximation, the self-energy is parameterized as

Σ(iω)=a11+a2(iωiω1)1+a3(iωiω2). (3.84)

For a given, chosen set of calculated self-energy data points {iωn,Σ(iωn)} with n=1,,N, the N complex parameters a1,,aN can be uniquely determined. The self-energy on the real frequency axis is then obtained by replacing iω by ω in Eq. (3.84). We note that the Pade approximation given by (3.84) can be interpreted as a multipole expression, with the number of poles Npole=N1.

The type of the analytical continuation used in GW (Eqs. (3.83) or (3.84)) is determined using the anacon_type and n_anacon_par keywords. The number of frequency points used on the imaginary axis (this determines the accuracy of the input used for fitting the expressions Eqs. (3.83) or (3.84)) can be set using the frequency_points keyword. The frequency grid used is a modified Gauss-Legendre grid that ranges from zero to infinity. Our experience suggests that highly accurate results for molecules (few meV accuracy for electronic excitations, compared to exact expressions for the self-energy on the real axis) can be obtained using a 16-parameter Pade approximation with 200 frequency points [304]. However, the numerical accuracy is then much higher (and much more costly) than the accuracy of the underlying GW approximation itself, and thus somewhat reduced defaults are set in the code (see keyword descriptions below).

More importantly, the Pade approximation is also numerically less stable than the two-pole approximation. This means that, for some systems with a complicated pole structure of the self-energy, the Pade fit might not converge for certain eigenvalues. The results must therefore be inspected carefully, even if (for normal light-element molecules and valence-like states) its accuracy can be much higher than the two-pole approximation. This is, in fact, not a simple implementation issue but rather one that goes back to the mathematical structure of the true self-energy.

Contour deformation

The analytical continuation becomes increasingly inaccurate for deeper states since the structure of the self-energy is typically more complicated, i.e., has more poles. The fitted pole models fail to represent these more complicated structures, see Ref. [109]. In these cases, the self-energy should be calculated on the real-frequency axis using the contour deformation technique, where the correlation part of the self-energy is then expressed as

Σc(𝒓,𝒓,ω)=12π𝑑ωG0(𝒓,𝒓,ω+iω)W0c(𝐫,𝐫,iω)iϕi(𝒓)ϕi(𝒓)W0c(𝒓,𝒓,|ϵiω|+iη)θ(ϵiω)+aϕa(𝒓)ϕa(𝒓)W0c(𝒓,𝒓,|ϵaω|+iη)θ(ωϵa) (3.85)

θ is the Heaviside step function, {ϕn} molecular orbitals and W0c(𝐫,𝐫,ω)=W0(𝐫,𝐫,ω)v(𝐫,𝐫) with the bare Coloumb interaction v(𝐫,𝐫). The index i refers to occupied and a to unoccupied orbitals. The evaluation of the self-energy with the contour deformation technique is controlled by the keyword contour_def_gw. An example input file for GW with the contour deformation can be found in [110].

The contour deformation calculations can be substantially optimized by employing the analytic continuation of the W matrices. This method leverages efficient Padé interpolation algorithms (details in Ref. [239]) to approximate the matrices, leading to a notable acceleration in computation times due to a scaling reduction from O(N5) to O(N4). This optimization, referred to as CD–WAC (Contour Deformation with W Analytic Continuation), is activated through the contour_def_wac command.

Fully analytical formulation

Instead of solving the integral in Eq. (3.80) numerically, the convolution can be carried out analytically, yielding a sum-over-states formula for the correlation part of the self-energy

Σnc(ω)=m,νρmnνρnmνωϵm+(Ωνiη)sgn(ϵFϵm) (3.86)

where Ων and ρmnν denote excitations energies and transition moments in the random phase approximation (RPA). The excitation energies are obtained from a Casida equation

[ABBA][XνYν]=Ων[XνYν] (3.87)

with matrices

Aia,jb=δijδab(ϵaϵi)+(ia|jb)Bia,jb=(ia|bj) (3.88)

where in the RPA approximation electron-hole pairs interact only through the direct Coulomb interaction (ia|jb). The transition moments ρmnν are obtained by contracting the eigenvectors with the direct Coulomb integrals as

ρmnν=ia(mn|ia)(Xiaν+Yiaν) (3.89)

The solution of Eq. (3.87) scales as N6, limiting this approach to systems with 60-80 atoms, depending on the basis set size. The computation of the fully analytical G0W0 self-energy is activated through the keyword fully_analytical_gw.

Tags for general section of control.in:

 

Tag: hf_version(control.in)

Usage: hf_version version
Purpose: Allows to switch between the standard density-matrix based setup of the Fock operator, and an orbital based exchange operator.
version is a number, either 0 (alias density_matrix) or 1 (aliases eigencoefficients and overlap). Default: 0

The exchange operator can be constructed either by summing over all states first to construct the density matrix (version 0), of by a straightforward setup of orbitals and computation of the exchange operator only then (version 1).

Both versions are pure matrix algebra. For small systems (few states), version 1 is vastly more efficient, but towards large systems (many states) an efficiency crossover clearly favors the density matrx based update.

By default, FHI-aims relies on the density-matrix based update always, because this is more memory efficient, but for small systems (atoms) with lots of basis functions, this choice should be reconsidered. Please note that both versions should work seamlessly with Pulay mixing.

 

Tag: anacon_type(control.in)

Usage: anacon_type string
Purpose: Specifies type of analytical continuation for the self-energy (we calculate the self-energy on the imaginary frequency axis, and hence need to continue it to the real axis)
string is a string that indicates the self-energy type, either ’two-pole’ or ’pade’. Default: No default (must be set by the user if the self-energy is required).


  • string = ’two-pole’ or ’0’ : The normal two-pole fitting (Eq (3.83)).

  • string = ’pade’ or ’1’ : Pade approximation (Eq. (3.84)).

The number of parameters in either approximation can be set using the keyword n_anacon_par.

Note that the anacon_type only makes sense if qpe_calc or sc_self_energy is set, i.e., the post-processing-type self-energy calculation is required.

This keyword must be set in control.in if the self-energy on the real axis is needed (usually, for GW). In past versions of FHI-aims, the code would set a silent default for anacon_type if a self-energy calculation for the real axis was required. This is no longer the case in present versions. Users must make this choice explicitly in ’control.in’; if that is not the case, the code will stop with a (hopefully gentle and instructive) warning message.

The reason is that the choice of the analytical continuation used can have a noticeable effect on the accuracy of GW-calculated eigenvalues. The two-pole approximation is well established, but less accurate than the Pade approximation when the latter works.[304] On the other hand, systems with a complicated self-energy structure can lead to numerical problems with the Pade approximation that can result in seemingly random values for certain predicted quasiparticle eigenvalues (this can be tested, for instance, by modifying the frequency_points keyword and tracking the results).

 

Tag: freq_grid_type(control.in)

Usage: freq_grid_type value
Purpose: If set, specifies the type of the grid for the imaginary frequency.

  • value = 0 : Normal Gauss-Legendre grid ranging from 0 to a maximum frequency, specified by the keyword maximum_frequency.

  • value = 1 : Modified Gauss-Legendre grid ranging from 0 to positive infinity.

  • value = 2 : Logarithmic grid ranging from 0.01 a.u. to a maximum value specified by maximum_frequency.

  • value = minimax : The minimax frequency grid is used. The minimax frequency grid was originally developed for the GW/RPA space–time method, and it has been shown that it can also be used for the canonical RPA implementation [18]. The minimax time–frequency grids are provided by the greenX library [19] (also available on GitHub).

Default: value=1


Note that the freq_grid_type only makes sense if qpe_calc or sc_self_energy is set, i.e., the post-processing-type self-energy calculation is required.

 

Tag: n_anacon_par(control.in)

Usage: n_anacon_par value
Purpose: If set, specifies the number of parameters used in the two-pole fitting (Eq. (3.82)) or Pade approximation (Eq (3.84)). The default value for n_anacon_par is 5 if anacon_type is set to ’two-pole’ (two-pole fitting), and 16 if anacon_type is set to ’pade’ (Pade approximation).


Note that the n_anacon_par only makes sense if qpe_calc or sc_self_energy is set, i.e., the post-processing-type self-energy calculation is required.

 

Tag: contour_def_gw(control.in)

Usage: contour_def_gw statestart,α stateend,α statestart,β stateend,β
Purpose: If set, specifies the range of states for which the GW quasiparticle energies are computed with the contour deformation, see Ref. [109] for a description of the implementation. The range can be specified for the α and β spin channel separately. Giving the range for the β channel is optional. If not specified, the range set for α will be also used for the β channel. For spin-unpolarized calculations specify only the α channel.

The quasiparticle energies for the other states are computed with the analytic continuation, i.e., the parameters anacon_type and n_anacon_par should be set as well. Setting frequency_points is mandatory (200 grid points is a solid choice).

 

Tag: contour_spin_channel(control.in)

Usage: contour_spin_channel integer (1 or 2)
Purpose: If specified, restricts the contour deformation to a certain spin channel. If not given, QP energies for both channels will be calculated.

 

Tag: contour_eta(control.in)

Usage: contour_eta real
Purpose: Specifies the broadening parameter η originally used for the contour deformation. It can be used with fully_analytical_gw to set the broadening in the fully analytical approach. It might be useful to set this parameter to higher values (e.g 0.002 a.u.) when printing the self-energy or spectral function. Otherwise the default setting ensures numerical accuracy and stability.
Default: 0.001 a.u.

 

Tag: contour_restart(control.in)

Usage: contour_restart task
Purpose: The iteration of the QP equation can become expensive for large systems. A restart is possible.
task is a string, specifying the desired restart task.
Available options for task are:

  • write : Writes restart info to file contour_gw_qp_energies.dat.

  • read : Reads restart info from contour_gw_qp_energies.dat and continues the QP iteration cycle.

  • read_and_write : Performs what the write and read options do. If the restart files do not exist, the code will still proceed normally.

 

Tag: contour_def_wac(control.in)

Usage: contour_def_wac Ncore Nvalence
Purpose: Activates the CD-WAC (Contour Deformation with W Analytic Continuation) mode, enhancing the contour deformation technique by utilizing analytic continuation of the W matrices, as detailed in Ref. [239]. This mode implements sensitive defaults derived from comprehensive studies on community benchmark sets like GW100 and CORE65. However, for specific scenarios requiring refined accuracy, the user can adjust the number of additional reference frequencies for the core (Ncore) and valence (Nvalence) states residues. The methodology for determining these regions and their heuristic significance is elaborated in Ref. [239].
Default: N=core20, N=valence20

Note that this keyword requires that the contour deformation is also activated with contour_def_gw.

 

Tag: range_cdwac_cor(control.in)

Usage: range_cdwac_cor ωmin ωmax
Purpose: This keyword specifies the energy range, defined by minimum (ωmin) and maximum (ωmax) values in eV, which will be employed for generating additional frequency points in the core residues region of CD-WAC. The selection of appropriate values is guided by a heuristic criterion described in Ref. [239]. Adjusting these parameters allows for fine-tuning the accuracy and efficiency of the CD-WAC process, especially in systems with complex core state interactions.
Default: ωmin=0.0eV ωmax=40.8eV

 

Tag: range_cdwac_val(control.in)

Usage: range_cdwac_val Δlower Δupper
Purpose: This keyword defines the neighborhood radii around valence residues, specified as a percentage of deviation from the actual residue value. The lower (Δlower) and upper (Δupper) radii deviation percentages determine the range within which additional frequency points are generated for the analytic continuation of the W matrices in the CD-WAC algorithm. The selection of appropriate values is guided by a heuristic criterion described in Ref. [239].
Default: Δlower=8.0% Δupper=1.0%

 

Tag: fully_analytical_gw(control.in)

Usage: fully_analytical_gw statestart stateend
Purpose: This keyword activates the computation of the G0W0 self-energy based on Eq. (3.86) for states statestart to stateend. The quasiparticle energies for the other states are computed with the analytic continuation, i.e., the parameters anacon_type and n_anacon_par should be set as well. It can be combined with standard GW keywords gw_hedin_shift, contour_eta and print_self_energy.

Note that this keyword is only implemented for closed-shell calculations.

Note that this keyword can possibly create a large amount of files, so use it with caution.

 

Tag: full_cmplx_sigma(control.in)

Usage: full_cmplx_sigma boolean
Purpose: Technical keyword for the contour deformation. If set to .true., the complex broadening term iη enters also the integral term of the self-energy. This requires more grid points in the frequency integration, i.e., frequency_points should be set to 2000. Including iη is not necessary when calculating the quasiparticle energies, but gets rid of (the very small) unphysical steps in the spectral function at the KS/HF energies.
Default: .false.

 

Tag: gw_zshot(control.in)

Usage: gw_zshot boolean
Purpose: If set to .true., the Z-factor is calculated and the quasiparticle equation is not calculated iteratively, but linearized using a Taylor expansion. Less exact than the iterative solution. Works with analytic continuation an contour deformation.
Default: .false.

 

Tag: contour_zshot_offset(control.in)

Usage: contour_zshot_offset real
Purpose: When using gw_zshot, the Z-factor is calculated, which contains the dervative of the self-energy with respect to the frequency. For the contour deformation, this derivative is calcuated numerically. This keyword defines the offset (delta value) to calculate the derivate numerically and should be a small number.
Default: 0.002 a.u.

 

Tag: gw_hedin_shift(control.in)

Usage: gw_hedin_shift boolean/state
Purpose: If set, the poor-man’s self-consistency proposed by Hedin [131] is enabled. The Hedin shift is referenced to a particular state, typically the HOMO. This procedure can be also considered as fixing the zero of the energy scale in a G0W0 calculation employing an overall energy shift ΔE. When including this shift, the starting point dependence is significantly reduced, similar to an ev-scGW0 calculation. The computational overhead is negligible. The Hedin shift can be calculated individually for each state. In this case, set ’.true.’ The shift can be also calculated for a particular state, which is then applied to all other states. For the latter, give an integer for the state instead of a boolean. Note that this is the common way to apply the Hedin shift with the HOMO as reference level.
Default: .false.

 

Tag: cumulant_expansion(control.in)

Usage: cumulant_expansion statestart stateend freqstart freqend freqRes
Purpose: If set, the spectral function An(ω) of states statestart to stateend is calculated using a GW+cumulant expansion (GW+C) approach. The spectral function is printed in the frequency range from freqstart to freqend with a resolution of freqRes. The frequency range should be given in eV. By setting cumulant_relative_output, the spectral function can be printed relative to the quasiparticle peak. The quasiparticle peak is calculated at the GW level specified by qpe_calc. Recommended use is together with gw_hedin_shift, which equals the GW+C quasiparticle shift. Only works for states calculated with CD or CD-WAC and only if the keyword contour_def_gw is set, and the usage is tested for core levels only so far. It requires the compilation with the USE_FFT_MKL flag.
Default: freq=Res0.01eV

 

Tag: use_decoupling(control.in)

Usage: use_decoupling boolean
Purpose: If set, the cumulant_expansion spectral function is calculated using only the diagonal elements ImWnn(ω) and neglecting all off-diagonal contributions ImWnm(ω), if the levels n and m are non-degenerate.
Default: .true.

 

Tag: cumulant_eta(control.in)

Usage: cumulant_eta real
Purpose: Specifies the broadening parameter η used for the cumulant_expansion spectral function.
Default: 0.1 eV

 

Tag: cumulant_relative_output(control.in)

Usage: cumulant_relative_output boolean
Purpose: Specifies the output format of the spectral function. If .false., the frequency grid of the spectral function is placed relative to the vacuum, i.e. the output is An(ω). If .true., the frequency grid of the spectral function is placed relative to the quasiparticle energy ϵnQP, i.e. the output is An(ωϵnQP). Note that setting this keyword changes the definition of the frequency limits specified in cumulant_expansion. If set, frequency limits are defined relative to ϵnQP.
Default: .false.

 

Tag: post_adjust_qp_relativistic(control.in)

Usage: post_adjust_qp_relativistic level1 adjustment1, ... , leveln adjustmentn
Purpose: At the conclusion of the quasiparticle calculation, the nth quasiparticle level is adjusted by the given amount (in eV). Relativistic effects are very pronounced for the deep core states, but these effects are rather independent of the valence configuration, so for the 1s core of light elements a post-quasiparticle correction dependent only on the atomic species of the orbital provides sufficient accuracy. Only works in conjunction with contour deformation. See reference [163] for further details.

 

Tag: pre_adjust_qp_relativistic(control.in)

Usage: post_adjust_qp_relativistic level1 adjustment1, ... , leveln adjustmentn
Purpose: As for post_adjust_qp_relativistic, but the adjustment is made to the KS-orbital energies before the quasiparticle calculation is initiated.

 

Tag: calc_spectral_func(control.in)

Usage: calc_spectral_func freqstart freqend resolution
Purpose: If set, the total spectral function A(ω) is printed in the given frequency (ω) range that should be given in eV. The spectral function is defined as

A(ω)=1πm|Σm(ω)|[ωεm(Σm(ω)vmxc)]2+[Σm(ω)]2 (3.90)

where m runs over all occupied and virtual states and where we include also the imaginary part of the complex self-energy Σ. See reference [108] for further details. Implemented for G0W0, evGW0 and evGW in combination with contour deformation. Unlike for fully self-consistent GW, we include only the diagonal matrix elements of Σ. Note that the calculation of the spectral is computationally much more expensive than solving the QP equation and is not recommended for prodcution runs. However, it gives access to the underlying physics, e.g., to investigate multisolution behavior and peak intensities.
Setting the resolution is optional. If not given, the resolution is 0.001 eV.

 

Tag: spectral_func_state(control.in)

Usage: spectral_func_state state
Purpose: If set, the spectral function is only calculated for state n

An(ω)=1π|Σn(ω)|[ωεn(Σn(ω)vnxc)]2+[Σn(ω)]2 (3.91)

where the total spectral function defined in Eq. (3.90) is A(ω)=mAm(ω).
Keyword is only active if calc_spectral_func is set.

 

Tag: iterations_sc_cd(control.in)

Usage: iterations_sc_cd integer
Purpose: Sets the maximum number of iterations in the eigenvalue self-consistent loop in an evGW (ev_scgw) or evGW0 (ev_scgw0) calcuation using the contour deformation (contour_def_gw). Note, this keyword does not set the number of iterations to converge the QP equation, but the number of iterations in the outer loop (iteration of eigenvalues in G and W). Convergence of the outer loop is usually reached within 10-20 steps.
Default: 20

 

Tag: nocc_sc_cd(control.in)

Usage: nocc_sc_cd integer
Purpose: Sets the number of occupied states that enter in the eigenvalue self-consistent loop in an evGW (ev_scgw) or evGW0 (ev_scgw0) calcuation using the contour deformation (contour_def_gw). Ideally all states should enter, but such a calculation is expensive with the contour deformation. If set to, e.g, 3, the first three occupied states (HOMO, HOMO-1 and HOMO-2) will enter in addition to the ones specified in contour_def_gw. A scissor shift will be applied to the rest of the occupied orbitals. It is recommend to include all occupied states if possible.
Default: 5

 

Tag: nvirt_sc_cd(control.in)

Usage: nvirt_sc_cd integer
Purpose: Sets the number of virtual states that enter in the eigenvalue self-consistent loop in an evGW (ev_scgw) or evGW0 (ev_scgw0) calcuation using the contour deformation (contour_def_gw). Ideally, all states should enter, but such a calculation is expensive with the contour deformation. If set to, e.g, 3, the first three unoccupied states (LUMO, LUMO+1 and LUMO+2) will enter in addition to the ones specified in contour_def_gw. A scissor shift will be applied to the rest of the virtual orbitals. It is recommended to include only 5 to 10 virtual states explicitly, in particular if the states of interest are, e.g., core states.
Default: 5

 

Tag: sc_reiterate(control.in)

Usage: sc_reiterate boolean
Purpose: Keyword for evGW (ev_scgw) or evGW0 (ev_scgw0) calcuations using the contour deformation (contour_def_gw). States that have converged in the eigenvalue self-consistent (outer) loop exit the eigenvalue iteration in G and W. If this keyword is set and if the states given in contour_def_gw converge before convergence of the evGW or evGW0 calculation is reached, they are re-iterated at the end. The changes upon re-iteration is typically smaller than 0.1 eV. Re-iteration is recommended for calculations that require benchmark accuracy.
Default: .false.

 

Tag: try_zshot(control.in)

Usage: try_zshot boolean
Purpose: Keyword for evGW (ev_scgw) or evGW0 (ev_scgw0) calcuations using the contour deformation (contour_def_gw). The QP solution might not converge for core states or high-energy virtual states when using a GGA starting point for the G0W0 calculation due to a lack of a distinct QP peak, which has been explained in [108]. For the first few iterations of the evGW or evGW0 calculation, the QP equation for some states might thus not converge. If this keyword is set to true, an approximation of the non-converged QP energies is obtained by linearizing the QP equation (Z-shot). Check that the Z-shot solution is not used in the last iterations of the evGW and evGW0 calculation since it is numerically not very exact, in particular for deep states.
Default: .true.

 

Tag: auxil_basis(control.in)

Usage: auxil_basis type
Purpose: Specifies the type of auxiliary basis used in the “beyond-DFT” calculation.
type is a string, which can be set either as full or opt. Default: full. Here is a brief explanation.

  • full : The auxiliary basis is constructed as the “on-site" pair products of the regular basis functions. The allowed pair products are controlled by the parameters max_n_prodbas and max_l_prodbas (see later). These pair products are then orthonormalized using Gram-Schmidt procedure for each atom.

  • opt : The auxiliary basis is obtained from an optimization procedure, and must be specified by hand in control.in – in the same spirit as the basis sets used in standard Gaussian-bases RI-MP2 calculations.

 

Tag: default_prodbas_acc(control.in)

Usage: default_prodbas_acc threshold
Purpose: Specifies the default for prodbas_acc
threshold is a real value, defining the onsite threshold for the auxiliary basis construction.
Default: 10-4 for RI_method lvl, depends on species (species_z) otherwise.

See prodbas_acc for more details. Default settings are:

  • 10-2 for Z10 (light elements)

  • 10-3 for 10<Z18

  • 10-4 for Z>18 (all heavier elements)

The old default (version 042811 and earlier) was simply 10-2 for all elements. For light elements, this setting produces accurate total energies and energy differences to the sub-meV level in all our tests. For heavier elements, significant inaccuracies could happen in atomic total energies. These inaccuracies would cancel out in energy differences; to guarantee total energy accuracy as well, we now set significantly tighter defaults for prodbas_acc in this range (alas, also more expensive, both in time and memory use).

 

Tag: default_max_l_prodbas(control.in)

Usage: default_max_l_prodbas value
Purpose: Specifies the default for max_l_prodbas
Default: 20 for RI_method lvl, 5 for RI_method V and nuclear charge Z<=54, and 6 for RI_method V and Z>54.

 

Tag: default_max_n_prodbas(control.in)

Usage: default_max_n_prodbas value
Purpose: Specifies the default for max_n_prodbas
Default: 20 for RI_method lvl, 6 otherwise.

 

Tag: frequency_points(control.in)

Usage: frequency_points value
Purpose: If set, specifies the number of (imaginary) frequency points for the self-energy calculation.
The default value for the frequency points depends on the choice of the analytical continutation type. For two-pole fitting (anacon_type=’two-pole’), the default value for frequency_points value is 40; for the Pade approximation (anacon_type=’pade’) the default value for frequency_points is 100.
For the freq_grid_type minimax, the default frequency_points is 6 gives the good accuracy.

Note that the frequency_points only makes sense if qpe_calc or sc_self_energy is set, i.e., the post-processing-type self-energy calculation is required.

 

Tag: maximum_frequency(control.in)

Usage: maximum_frequency value
Purpose: If set, specifies the maximal (imaginary) frequency value for the self-energy self-consistent calculation. The unit for value here is Hartree.

Note that the maximum_frequency only makes sense when the freq_grid_type is set to be 0 or 2, i.e., when the stdandard Gauss-Legendre grid or logrithmic grid is used. For freq_grid_type=0, the default value for maximum_frequency is 10 Hartree; for freq_grid_type=2, the default value is 5000 Hartree. However, when self-consistent GW is involked (both scGW and scGW0), the default value for maximum_frequency is 7000 Hartree.

 

Tag: maximum_time(control.in)

Usage: maximum_time value
Purpose: If set, specifies the maximal (imaginary) time value for the self-consistent self-energy calculation. The unit for value here is Hartree-1. The default value is 1000 a.u..

Note that the maximum_time only makes sense if sc_self_energy is set, i.e., the self-consistent self-energy calculation is required.

 

Tag: n_poles(control.in)

Usage: n_poles value
Purpose: If set, specifies the number of poles (i.e. the number of functions of the form fi(ω)=1/(bi+iω) ) adopted in the pole-based computation of the Fourier transform in self-consistent GW-type calculations.

Note that the n_poles only makes sense if sc_self_energy is set, i.e., the post-processing-type self-consistent self-energy calculation is required.

 

Tag: pole_max(control.in)

Usage: pole_max value
Purpose: If set, specifies the position in the (imaginary) frequency axis of the largest poles (i.e. the largest bi coefficient in fi(ω)=1/(bi+iω) ) used in computation of the Fourier transform in self-consistent GW-type calculations. The unit for value here is Hartree.

Note that the pole_max only makes sense if sc_self_energy is set, i.e., the post-processing-type self-consistent self-energy calculation is required.

 

Tag: pole_min(control.in)

Usage: pole_min value
Purpose: If set, specifies the position in the (imaginary) frequency axis of the smallest poles (i.e. the smallest bi coefficient in fi(ω)=1/(bi+iω) ) used in computation of the Fourier transform in self-consistent GW-type calculations. The unit for value here is Hartree.

Note that the pole_min only makes sense if sc_self_energy is set, i.e., the post-processing-type self-consistent self-energy calculation is required.

 

Tag: prodbas_nb(control.in)

Usage: prodbas_nb nb
Purpose: For very large scale beyond-GGA calculations, the distribution of auxiliary basis functions among the CPUs becomes problematic because each CPU only gets very few. The default Scalapack distribution is more taylored to efficient calculations and distributes these functions in chunks of finite size. In massively parallel runs, often each CPU gets either one or two of these chunks, leading to bad memory distribution. This can be circumvented, possibly sacrificing some performance, by setting the chunk size nb to “1”.
nb is the chunk size of auxiliary basis functions.
Default: min(16,Naux/Nproc).

 

Tag: prodbas_threshold(control.in)

Usage: prodbas_threshold threshold
Purpose: Prevent the possible ill-conditioning of the auxiliary basis, similar to basis_threshold for the regular basis.
threshold is a small positive real number for the eigenvalue of the Coulomb matrix for the auxiliary basis. Default: 10-5.

The the auxiliary basis functions centered on different atoms are nonorthogonal, and the possible linear dependence (with certain accuracy) between them and the resultant behavior has to be carefully eliminated. This is achieved by setting the cutoff threshold for the auxiliary basis prodbas_threshold. From many test calculations, it is found that the reliable value for threshold here are between 10-5 to 10-3. Within this window, the total energy may still have some noticable change, but the energy difference is usually negligible. It is suggested that the user should play with this value if he/she is not sure about his/her result.

 

Tag: RI_method(control.in)

Usage: RI_method type
Purpose: Specifies the version of the resolution of identity used in the beyond-DFT calculations. Here type is a string, with possible options listed below.
Default: Non-periodic: type=lvl for Hartree-Fock and hybrid functional calculations, which can be used for geometry relaxations. type=v for MP2, RPA and GW etc. calculations that require unoccupied states. This gives better accuracy for a given auxiliary basis, but is usually more expensive and provides no geometry relaxation at present. Periodic: type=LVL_fast, which provides the necessary reduction to what is essentially an O(N) framework.

Different options for the type option include:

  • svs (for the “SVS” version, i.e., Eq. (3.77))

  • v (for the “V” version, i.e. Eq. (3.76))

  • lvl (for the “LVL” version)

  • LVL, LVL_fast, or lvl_fast are all synonymous with option lvl

  • lvl_full implements a non-linear scaling version of the “LVL” approach for total energy evaluations only

  • lvl_2nd implements the so-called “robust Dunlap correction” for total energies only, which essentially follows up an s.c.f. calculation with an additional RI-V-like step. In our experience, similar results are better accomplished without this correction (see also Ref. [151]).

  • rs (for the real-space “RS” separable RI (RI-RS) version, i.e., Eq. (3.78))

Rules of thumb:

  • type=V is the preferred method for non-periodic calculations. For formal reasons, this is clearly the most accurate version. However, neither a periodic version nor gradients (forces and relaxations) are implemented at present.

  • type=LVl_fast is the preferred version for periodic calculations, as well as for non-periodic Hartree-Fock and hybrid functional calculations with forces and relaxation. It scales as O(N) and greatly limits the memory use compared to RI-V. RI_method LVL_fast localizes the expansion of the Coulomb potential of basis function products to two centers, as described in Sec. 3.26 and in Refs. [151, 197]. It also relies extensively on screening near-zero elements of the density matrix and of the Coulomb operator. On the other hand, the localized version has slightly larger errors than RI-V for hybrid functionals, and significantly larger errors for correlated methods like MP2 or RPA. These can be remedied by adding a few extra functions to the construction of the auxiliary basis set using the for_aux keyword, as described in detail in Ref. [151].

  • At present, do not use RI-LVL for MP2 and RPA unless you know what you are doing. It is possible to repair their accuracy in RI-LVL, by increasing the size of the auxiliary basis set using the for_aux keyword, as described in detail in Ref. [151], but please see the figures and benchmarks in that reference before trying.

  • type=LVL_full is a slower, non-screened version of LVL for non-periodic systems. Very useful as a reference to make sure.

  • type=SVS is here only for testing purposes. This is the naive, purely overlap based version of RI and should not be used in production.

  • type=RS is here only for testing purposes. The real-space separable RI (RI-RS) has been tested for the non-periodic HF, MP2, CCSD, RPA, rPT2 and GW calculations. To achieve similar accuracy to the RI-V method the cutoff threshold for the auxiliary basis prodbas_threshold should be set to 10-10.

There is an additional option for non-periodic systems, lvl_2nd, which adds a correction term to the nonlocal exchange energy to make it “robust” in the Dunlap sense [80], that is, the error in the energy is quadratic in the error in the product expansion.

 

Tag: ri_multipole_threshold(control.in)

Usage: ri_multipole_threshold float
Purpose: Specifies the tolerance for vanishing multipole moments of auxiliary functions.
Default: float default to 1e-8.

 

Tag: sbtgrid_lnrange(control.in)

Usage: sbtgrid_lnrange lnrange
Purpose: for use_logsbt, set the range of the logarithmic grid (in logarithmic units). Default is 45, which corresponds to nearly twenty orders of magnitude. Please note that the range should be larger than intuitively guessed because the range is the same both in real and reciprocal space (for algorithmic reasons) and the tails in reciprocal space have to be captured.

 

Tag: sbtgrid_lnr0(control.in)

Usage: sbtgrid_lnr0 lnr0
Purpose: for use_logsbt,set the onset of the logarithmic grid in real space. The default is 38.

 

Tag: sbtgrid_lnk0(control.in)

Usage: sbtgrid_lnk0 lnk0
Purpose: for use_logsbt, set the onset of the logarithmic grid in Fourier space. The default is 25.

 

Tag: sbtgrid_N(control.in)

Usage: sbtgrid_N N
Purpose: for use_logsbt, set the number of logarithmic grid points both in real and Fourier space. The default is 4096.

For the accuracy, the density of points N/𝚕𝚗𝚛𝚊𝚗𝚐𝚎 is relevant. Additionally, one has to make sure that the tails in real and Fourier space are properly included.

 

Tag: sort_product_basis(control.in)

Usage: sort_product_basis bool
Purpose: If set, the product/auxiliary basis functions of each species will be sorted by increasing field radius extent. The ordering is printed in the standard output underneath the heading:

Species   l  charge radius    field radius

As of version 240813, by default this sorting does not happen. Previously the default was to sort.
Default: .false.

 

Tag: state_lower_limit(control.in)

Usage: state_lower_limit value
Purpose: If set, specifies the lowest single-particle eigenstate to be included for the quasiparticle calculation.

Note that the state_lower_limit only makes sense if qpe_calc is set, i.e., the post-processing-type self-energy calculation is required.

 

Tag: state_upper_limit(control.in)

Usage: state_upper_limit value
Purpose: If set, specifies the highest single-particle eigenstate to be included for the quasiparticle calculation.

Note that the state_upper_limit only makes sense if qpe_calc is set, i.e., the post-processing-type self-energy calculation is required.

 

Tag: time_points(control.in)

Usage: time_points value
Purpose: If set, specifies the number (imaginary) time points for the self-energy calculation.
Default: value=80.

Note that the frequency_points only makes sense if qpe_calc or sc_self_energy is set, i.e., the post-processing-type self-energy calculation is required.

 

Tag: use_logsbt(control.in)

Usage: use_logsbt bool
Purpose: If set, the two-center integrals are calculated by one-dimensional integrations in Fourier-space. In the case of RI_method LVL, also the three-center integrals are computed by this method.
Default: .true.

use_logsbt .true. is faster and more accurate than the alternative.

The algorithm for the overlap and Coulomb integrals is described by Talman in [292]. It uses an efficient spherical Bessel transform on a logarithmic radial grid [291, 123, 124] to obtain the Fourier transform of the auxiliary basis functions and calculates the integrals in Fourier space.

 

Tag: use_ovlp_swap(control.in)

Usage: use_ovlp_swap
Purpose: if set, the atomic orbital (AO) based 3-index overlap matrix (“ovlp_3fn" in the source code) is written to the disk before transforming to the molecular orbital (MO) based 3-index overlap matrix (“O_2bs1HF" for HF calculations and “ovlp_3KS" for GW calculations in the source code). This avoids the double allocation of both the AO-based 3-index integral matrix and MO-based 3-index integral matrix at the same time and thus reduces the memory cost by about a factor of two.

Subtags for species tag in control.in:

 

species sub-tag: aux_gaussian(control.in)

Usage: aux_gaussian L N [alpha]
              [ alpha_1 coeff_1 ]
              [ alpha_2 coeff_2 ]
              [ … ]
              [ alpha_N coeff_N ]
Purpose: For auxil_basis opt, adds a Gaussian basis function to the auxiliary basis set for the Coulomb operator.
L is an integer number, specifying the angular momentum
N is an integer number, specifying how many primitive Gaussians comprise the present radial function
alpha : If N=1, this is the exponent defining a primitive Gaussian function [in bohr-2].
alpha_i coeff_i : If N>1, i=1,,N additional lines specify exponents αi and expansion coefficients gi for a non-primitive linear combination of Gaussians.

See the description of gaussian basis functions; Gaussian basis functions in the auxiliary basis use essentially the same infrastructure.

 

species sub-tag: for_aux(control.in)

Usage: for_aux basis options
Purpose: Add a extra basis function to constructor of auxiliary basis function.
Basis is a either hydro or ionic basis function keyword and options are options for that basis function.

Adds extra basis function ONLY to the construction of the auxiliary basis set used to expand the Coulomb operator (resolution of identity, see keyword RI_method). In particular, the accuracy of RI_method LVL can be increased by adding extra high-angular momentum radial functions to the auxiliary basis set. The improvement becomes less and less relevant as the orbital basis set itself increases. For instance, there may be a noticeable change for tier 1, but much less or not at all for tier 2. Currently supports only hydro and ionic basis functions.

For a detailed description with benchmarks, please see Ref. [151].

Here is an example for the use of the for_aux, which was obtained by altering the “light” settings of the C atom:

[...]
#  "First tier" - improvements: -1214.57 meV to -155.61 meV
     hydro 2 p 1.7
     hydro 3 d 6
     hydro 2 s 4.9
#  "Second tier" - improvements: -67.75 meV to -5.23 meV
  for_aux     hydro 4 f 9.8
#     hydro 3 p 5.2
#     hydro 3 s 4.3
  for_aux     hydro 5 g 14.4
#     hydro 3 d 6.2
[...]

Adding those two high-angular momentum functions does improve the quality of the RI_method LVL noticeably, at the price of more CPU time. On the other hand … “light” settings are specifically chosen because not everything is completely converged. In fact, the orbital basis set error itself may be larger than the accuracy gained by amending the two-electron Coulomb operator expansion.

 

species sub-tag: prodbas_acc(control.in)

Usage: prodbas_acc threshold
Purpose: Technical cutoff criterion for on-site orthonormalization of auxiliary radial functions. Here threshold is a small positive real value. Default: See below.

To construct the set of auxiliary basis functions, the radial functions for a single species are “on-site" Gram-Schmidt orthonormalized. If the norm of the function after orthonormalization is smaller than threshold, that function is omitted.

The present default values are:

  • 10-2 for Z10 (light elements)

  • 10-3 for 10<Z18

  • 10-4 for Z>18 (all heavier elements)

The old default (version 042811 and earlier) was simply 10-2 for all elements. For light elements, this setting produces accurate total energies and energy differences to the sub-meV level in all our tests. For heavier elements, significant inaccuracies could happen in atomic total energies. These inaccuracies would cancel out in energy differences; to guarantee total energy accuracy as well, we now set significantly tighter defaults for prodbas_acc in this range (alas, also more expensive, both in time and memory use).

For simplicity, it is also possible to use default_prodbas_acc to set a global value of prodbas_acc across all elements.

Note that the prodbas_acc should not be confused with the prodbas_threshold. The former is used when constructing the auxiliary basis functions for each species, whereas the latter is used to deal with the ill-conditioning behavior of the Coulomb repulsion and/or the overlap matrix between the set of auxiliary basis functions for the whole systems.

 

species sub-tag: max_n_prodbas(control.in)

Usage: max_n_prodbas value
Purpose: Specifies the maximal principal quantum number for the regular basis function to be included in the auxiliary (product) basis construction.
value is a positive integer number here.

Note that max_n_prodbas has an effect only when auxil_basis is setted to full.

 

species sub-tag: max_l_prodbas(control.in)

Usage: max_l_prodbas value
Purpose: Specifies the maximal angular quantum number for the auxiliary (product) basis function. Any possible auxiliary basis with an angular momentum higher than max_l_prodbas is excluded.
value is a positive integer number here.

Note that max_l_prodbas controls the auxiliary basis whereas max_n_prodbas controls only directly the regular basis. max_l_prodbas has an effect regardless whether auxil_basis is setted to full or opt.