Request: Better convergence of HSE in magnetic system

Open QuantumMisaka opened this issue 2 years ago • 15 comments

Details

I have tested HSE SCF calculations on a magnetic system. The example is the Fe-bcc conventional cell:

ATOMIC_SPECIES
Fe 55.845 Fe_ONCV_PBE-1.0.upf upf201

NUMERICAL_ORBITAL
Fe_gga_8au_100Ry_4s2p2d1f.orb

LATTICE_CONSTANT
1.889726

LATTICE_VECTORS
    2.8301511117     0.0000000000     0.0000000000 #latvec1
    0.0000000000     2.8301511117     0.0000000000 #latvec2
    0.0000000000    -0.0000000000     2.8301511117 #latvec3

ATOMIC_POSITIONS
Direct

Fe #label
2 #magnetism
2 #number of atoms
    0.0000000000     0.0000000000     0.0000000000 m  1  1  1
    0.5000000000     0.5000000000     0.5000000000 m  1  1  1

The k-point mesh (KPT) is 9 9 9.

Fe-HSE.tar.gz

Information:
ABACUS version: 3.4.4
Commit: 5f9d472 (Mon Dec 4 14:10:21 2023 +0800)
Dependence: Intel-OneAPI and Intel-toolchain
LibRI and LibComm: latest version before Nov 18

My initial INPUT file is:

#Parameters (1.General)
suffix                  Fe  # suffix of OUTPUT DIR
nspin                   2   #  1/2/4 4 for SOC
symmetry                0   #  0/1  1 for open, default
esolver_type            ksdft  # ksdft, ofdft, sdft, tddft, lj, dp
dft_functional          hse  # same as upf file, can be lda/pbe/scan/hf/pbe0/hse
ks_solver             genelpa  # default for ksdft-lcao
vdw_method              none  # none, d3, d3_bj

#Parameters (2.Iteration)
calculation             scf # scf relax cell-relax md
ecutwfc                 100
scf_thr                 1e-7
scf_nmax                300

#Parameters (3.Basis)
basis_type              lcao  # lcao or pw

#Parameters (4.Smearing)
smearing_method         mp    # mp/gaussian/fixed
smearing_sigma          0.002  # Rydberg

#Parameters (5.Mixing)
mixing_type             broyden  # pulay/broyden

#Parameters (6.Calculation)
cal_force          1
cal_stress         1
out_stru           1  # print STRU in OUT
out_chg            1  # print CHG or not
out_bandgap        1
out_mul            1  

With this INPUT it is very hard to converge to scf_thr 1e-7. The calculation cannot even reach scf_thr 1e-6 within 5 days when run as OMP_NUM_THREADS=16 mpirun -np 4 abacus on Intel-8358:

# After more than 700 lines of output and 4 days of calculation
 Updating EXX and rerun SCF
 GE1    5.32e+00  5.81e+00  -6.437418e+03  0.000000e+00   1.291e-06  9.196e+00  
 GE2    5.32e+00  5.81e+00  -6.437418e+03  1.364268e-09   7.188e-07  8.863e+00  
 GE3    5.32e+00  5.81e+00  -6.437418e+03  1.468676e-09   3.518e-07  8.800e+00  
 GE4    5.32e+00  5.81e+00  -6.437418e+03  5.839128e-10   2.326e-07  8.802e+00  
 GE5    5.32e+00  5.81e+00  -6.437418e+03  -2.100539e-09  3.236e-08  8.843e+00  
 Updating EXX and rerun SCF
 GE1    5.32e+00  5.81e+00  -6.437418e+03  0.000000e+00   1.058e-06  9.111e+00  
 GE2    5.32e+00  5.81e+00  -6.437418e+03  -1.546015e-09  5.929e-07  8.805e+00  
 GE3    5.32e+00  5.81e+00  -6.437418e+03  -1.840679e-10  2.948e-07  8.871e+00  
 GE4    5.32e+00  5.81e+00  -6.437418e+03  1.423819e-09   4.995e-08  8.820e+00  

After seeing #3103, I added one parameter to my INPUT:

mixing_gg0   0.0

After that, convergence improved. In a 2-day calculation run as OMP_NUM_THREADS=24 mpirun -np 2 abacus on Intel-8162, the SCF converged to scf_thr 1e-6, but not to scf_thr 1e-7:

 START CHARGE      : atomic
 DONE(177.792    SEC) : INIT SCF
 ITER   TMAG      AMAG      ETOT(eV)       EDIFF(eV)      DRHO       TIME(s)    
 GE1    4.01e+00  4.01e+00  -6.440073e+03  0.000000e+00   4.826e-02  4.429e+00  
 GE2    4.31e+00  4.41e+00  -6.440405e+03  -3.311553e-01  1.996e-02  3.688e+00  
 GE3    4.33e+00  4.43e+00  -6.440409e+03  -4.691903e-03  5.726e-03  3.677e+00  
 GE4    4.33e+00  4.43e+00  -6.440409e+03  2.332581e-04   3.079e-03  3.684e+00  
 GE5    4.33e+00  4.43e+00  -6.440409e+03  -5.472160e-05  1.219e-03  3.626e+00  
 GE6    4.33e+00  4.43e+00  -6.440409e+03  -1.579811e-05  1.703e-04  3.681e+00  
 GE7    4.33e+00  4.43e+00  -6.440409e+03  -2.383246e-07  6.439e-05  3.724e+00  
 GE8    4.33e+00  4.43e+00  -6.440409e+03  -6.277874e-08  2.805e-05  3.635e+00  
 GE9    4.33e+00  4.43e+00  -6.440409e+03  -2.755682e-08  9.261e-06  3.668e+00  
 GE10   4.33e+00  4.43e+00  -6.440409e+03  1.987624e-10   9.984e-07  3.717e+00  
 GE11   4.33e+00  4.43e+00  -6.440409e+03  1.256766e-09   1.477e-07  3.667e+00  
 GE12   4.33e+00  4.43e+00  -6.440409e+03  -2.078884e-09  8.750e-08  3.641e+00  
 Updating EXX and rerun SCF
 GE1    5.07e+00  5.25e+00  -6.432274e+03  0.000000e+00   6.975e-02  1.732e+01  
 GE2    5.12e+00  5.38e+00  -6.437178e+03  -4.903432e+00  5.335e-02  1.714e+01  
 GE3    5.08e+00  5.37e+00  -6.437337e+03  -1.595823e-01  2.761e-02  1.717e+01  
 GE4    5.08e+00  5.36e+00  -6.436762e+03  5.755460e-01   2.955e-02  1.724e+01  
 GE5    5.18e+00  5.45e+00  -6.437070e+03  -3.075961e-01  1.282e-02  1.730e+01  
 GE6    5.20e+00  5.46e+00  -6.437078e+03  -8.548606e-03  8.137e-03  1.715e+01  
 GE7    5.19e+00  5.45e+00  -6.437053e+03  2.523551e-02   9.021e-03  1.717e+01  
 GE8    5.22e+00  5.47e+00  -6.437049e+03  4.194422e-03   4.162e-03  1.725e+01  
 GE9    5.25e+00  5.49e+00  -6.437052e+03  -2.974158e-03  3.035e-04  1.720e+01  
 GE10   5.25e+00  5.49e+00  -6.437052e+03  1.164049e-05   3.154e-04  1.713e+01  
 GE11   5.25e+00  5.49e+00  -6.437052e+03  -1.927004e-05  9.251e-05  1.714e+01  
 GE12   5.25e+00  5.49e+00  -6.437052e+03  4.742927e-06   1.342e-04  1.723e+01  
 GE13   5.25e+00  5.49e+00  -6.437052e+03  -3.654831e-06  1.064e-04  1.724e+01  
 GE14   5.25e+00  5.49e+00  -6.437052e+03  -1.292602e-06  2.761e-06  1.720e+01  
 GE15   5.25e+00  5.49e+00  -6.437052e+03  -4.918788e-10  1.088e-06  1.726e+01  
 GE16   5.25e+00  5.49e+00  -6.437052e+03  -1.480277e-09  5.592e-07  1.722e+01  
 GE17   5.25e+00  5.49e+00  -6.437052e+03  3.408349e-09   1.277e-07  1.720e+01  
 GE18   5.25e+00  5.49e+00  -6.437052e+03  3.209587e-10   1.536e-08  1.722e+01  
 Updating EXX and rerun SCF
 GE1    5.30e+00  5.66e+00  -6.437386e+03  0.000000e+00   7.783e-03  1.756e+01  
 GE2    5.30e+00  5.70e+00  -6.437389e+03  -2.905916e-03  3.097e-03  1.776e+01  
 GE3    5.30e+00  5.69e+00  -6.437389e+03  -1.426553e-04  3.709e-04  1.768e+01  
 GE4    5.30e+00  5.69e+00  -6.437389e+03  -3.933669e-07  1.830e-04  1.763e+01  
 GE5    5.30e+00  5.69e+00  -6.437389e+03  -1.916378e-07  6.337e-05  1.758e+01  
 GE6    5.30e+00  5.69e+00  -6.437389e+03  -5.438509e-08  6.068e-06  1.773e+01  
 GE7    5.30e+00  5.69e+00  -6.437389e+03  6.844540e-10   4.172e-06  1.765e+01  
 GE8    5.30e+00  5.69e+00  -6.437389e+03  -2.401390e-09  2.932e-06  1.760e+01  
 GE9    5.30e+00  5.69e+00  -6.437389e+03  -1.980663e-09  3.465e-07  1.768e+01  
 GE10   5.30e+00  5.69e+00  -6.437389e+03  1.095900e-09   4.516e-08  1.761e+01  
 Updating EXX and rerun SCF
 GE1    5.30e+00  5.75e+00  -6.437412e+03  0.000000e+00   2.970e-03  1.772e+01  
 GE2    5.30e+00  5.77e+00  -6.437412e+03  -5.071874e-04  1.115e-03  1.761e+01  
 GE3    5.30e+00  5.76e+00  -6.437412e+03  -3.600643e-05  3.660e-04  1.766e+01  
 GE4    5.30e+00  5.76e+00  -6.437412e+03  1.002332e-06   1.333e-04  1.767e+01  
 GE5    5.30e+00  5.76e+00  -6.437412e+03  -3.536508e-07  3.344e-05  1.765e+01  
 GE6    5.30e+00  5.76e+00  -6.437412e+03  -1.065892e-08  3.677e-06  1.761e+01  
 GE7    5.30e+00  5.76e+00  -6.437412e+03  1.508119e-10   2.340e-06  1.777e+01  
 GE8    5.30e+00  5.76e+00  -6.437412e+03  -2.848412e-09  1.372e-06  1.762e+01  
 GE9    5.30e+00  5.76e+00  -6.437412e+03  -6.697595e-10  5.157e-07  1.766e+01  
 GE10   5.30e+00  5.76e+00  -6.437412e+03  2.343385e-10   2.126e-07  1.772e+01  
 GE11   5.30e+00  5.76e+00  -6.437412e+03  2.143849e-09   3.190e-08  1.777e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   8.249e-04  1.792e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -3.188180e-05  3.782e-04  1.772e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  -7.317703e-08  1.303e-04  1.774e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  -3.181451e-07  5.785e-05  1.770e+01  
 GE5    5.29e+00  5.78e+00  -6.437418e+03  1.346944e-08   1.282e-05  1.783e+01  
 GE6    5.29e+00  5.78e+00  -6.437418e+03  -2.061869e-09  2.488e-06  1.767e+01  
 GE7    5.29e+00  5.78e+00  -6.437418e+03  2.597832e-09   4.422e-07  1.771e+01  
 GE8    5.29e+00  5.78e+00  -6.437418e+03  3.727761e-10   1.378e-07  1.783e+01  
 GE9    5.29e+00  5.78e+00  -6.437418e+03  -5.916467e-10  5.191e-08  1.774e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   2.048e-04  1.776e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -1.439515e-06  9.945e-05  1.772e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  2.207036e-08   4.256e-05  1.779e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  1.388243e-09   1.148e-05  1.771e+01  
 GE5    5.29e+00  5.78e+00  -6.437418e+03  -1.028305e-08  4.965e-06  1.766e+01  
 GE6    5.29e+00  5.78e+00  -6.437418e+03  -4.977566e-09  4.509e-07  1.777e+01  
 GE7    5.29e+00  5.78e+00  -6.437418e+03  8.770292e-10   1.506e-07  1.783e+01  
 GE8    5.29e+00  5.78e+00  -6.437418e+03  -9.288467e-10  6.491e-08  1.770e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   6.077e-05  1.769e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -1.134252e-07  3.019e-05  1.777e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  8.373541e-09   1.607e-05  1.777e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  2.111367e-09   2.498e-06  1.768e+01  
 GE5    5.29e+00  5.78e+00  -6.437418e+03  -3.221961e-09  4.133e-07  1.770e+01  
 GE6    5.29e+00  5.78e+00  -6.437418e+03  4.555293e-10   1.491e-07  1.771e+01  
 GE7    5.29e+00  5.78e+00  -6.437418e+03  2.135342e-09   4.883e-08  1.783e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   2.286e-05  1.784e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -2.029078e-08  1.168e-05  1.772e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  1.047176e-09   6.660e-06  1.773e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  2.583137e-10   1.001e-06  1.789e+01  
 GE5    5.29e+00  5.78e+00  -6.437418e+03  -1.795822e-09  4.420e-07  1.777e+01  
 GE6    5.29e+00  5.78e+00  -6.437418e+03  1.625675e-09   7.378e-08  1.779e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   1.176e-05  1.768e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -6.389011e-09  5.673e-06  1.767e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  1.296982e-09   3.038e-06  1.780e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  3.175557e-09   3.255e-06  1.771e+01  
 GE5    5.29e+00  5.78e+00  -6.437418e+03  -3.668210e-09  2.879e-07  1.769e+01  
 GE6    5.29e+00  5.78e+00  -6.437418e+03  7.262173e-10   4.905e-08  1.772e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   6.956e-06  1.776e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -2.089712e-09  3.181e-06  1.795e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  -1.856147e-11  1.484e-06  1.772e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  4.439284e-10   7.081e-07  1.771e+01  
 GE5    5.29e+00  5.78e+00  -6.437418e+03  -5.777256e-10  2.457e-07  1.781e+01  
 GE6    5.29e+00  5.78e+00  -6.437418e+03  6.627990e-10   3.775e-08  1.774e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   3.987e-06  1.776e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  5.614843e-10   1.749e-06  1.779e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  -1.237431e-11  7.742e-07  1.771e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  -1.139210e-09  9.115e-07  1.785e+01  
 GE5    5.29e+00  5.78e+00  -6.437418e+03  2.412990e-10   6.300e-08  1.778e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   2.462e-06  1.779e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  6.380504e-10   1.072e-06  1.781e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  1.345706e-09   5.480e-07  1.785e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  2.142302e-10   4.647e-07  1.776e+01  
 GE5    5.29e+00  5.78e+00  -6.437418e+03  -2.590871e-10  4.279e-08  1.778e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   1.403e-06  1.777e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  1.235111e-09   6.003e-07  1.775e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  8.615614e-10   2.236e-07  1.787e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  -1.662025e-09  1.244e-07  1.777e+01  
 GE5    5.29e+00  5.78e+00  -6.437418e+03  8.360393e-10   4.124e-08  1.779e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   9.645e-07  1.775e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -1.137663e-09  6.704e-07  1.771e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  9.613292e-10   4.905e-07  1.784e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  6.372770e-10   1.410e-07  1.780e+01  
 GE5    5.29e+00  5.78e+00  -6.437418e+03  -5.181742e-10  2.714e-08  1.778e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   5.150e-07  1.781e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -2.575403e-10  3.110e-07  1.782e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  1.438514e-09   2.384e-07  1.790e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  -8.399063e-10  7.898e-08  1.780e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   3.857e-07  1.780e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -1.633409e-09  5.688e-07  1.778e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  -1.924205e-09  1.518e-07  1.777e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  3.443925e-09   5.881e-08  1.782e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   3.686e-07  1.778e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -1.023974e-09  1.722e-07  1.784e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  2.084298e-09   7.166e-08  1.777e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   2.508e-07  1.841e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -7.285375e-10  5.146e-07  1.832e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  1.717709e-09   9.339e-08  1.835e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   2.401e-07  1.783e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  2.266046e-10   2.545e-07  1.782e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  -5.599375e-10  1.674e-07  1.791e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  -1.832945e-10  5.861e-08  1.780e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   2.153e-07  1.786e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -2.714614e-10  2.968e-07  1.779e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  -6.473311e-10  1.489e-07  1.779e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  2.733949e-09   4.245e-08  1.788e+01  
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   2.514e-07  1.780e+01  
 GE2    5.29e+00  5.78e+00  -6.437418e+03  1.046403e-09   2.553e-07  1.787e+01  
 GE3    5.29e+00  5.78e+00  -6.437418e+03  1.063417e-09   1.525e-07  1.774e+01  
 GE4    5.29e+00  5.78e+00  -6.437418e+03  -5.251348e-10  4.837e-08  1.787e+01  

Memory consumption is about 50G during the calculation. Is this performance normal for this system? Can it be improved?
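To make the stalled convergence in the log above easier to see, the inner SCF steps can be grouped by EXX update. The following is only a minimal sketch: the function name `drho_by_outer_loop` is hypothetical, and the column layout (TMAG, AMAG, ETOT, EDIFF, DRHO, TIME after the iteration label) is assumed from the nspin=2 output shown above, so it may not hold for other settings or ABACUS versions.

```python
import re

# One inner SCF line, e.g. " GE3  5.29e+00  5.78e+00  -6.437418e+03  ...".
# The header line ("ITER TMAG ...") does not match because "IT" is not
# followed by a digit.
SCF_LINE = re.compile(r"^\s*[A-Z]{2}\d+\s+(.*)$")
EXX_MARK = "Updating EXX and rerun SCF"


def drho_by_outer_loop(log_text):
    """Return DRHO of every inner SCF step, grouped by outer EXX loop."""
    loops = [[]]
    for line in log_text.splitlines():
        if EXX_MARK in line:
            loops.append([])  # an EXX update starts a new outer loop
            continue
        match = SCF_LINE.match(line)
        if match:
            fields = match.group(1).split()
            # Assumed nspin=2 columns: TMAG AMAG ETOT EDIFF DRHO TIME
            if len(fields) == 6:
                loops[-1].append(float(fields[4]))
    return [loop for loop in loops if loop]


# Short excerpt copied from the log above.
sample_log = """\
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   2.508e-07  1.841e+01
 GE2    5.29e+00  5.78e+00  -6.437418e+03  -7.285375e-10  5.146e-07  1.832e+01
 GE3    5.29e+00  5.78e+00  -6.437418e+03  1.717709e-09   9.339e-08  1.835e+01
 Updating EXX and rerun SCF
 GE1    5.29e+00  5.78e+00  -6.437418e+03  0.000000e+00   2.401e-07  1.783e+01
 GE2    5.29e+00  5.78e+00  -6.437418e+03  2.266046e-10   2.545e-07  1.782e+01
 GE3    5.29e+00  5.78e+00  -6.437418e+03  -5.599375e-10  1.674e-07  1.791e+01
 GE4    5.29e+00  5.78e+00  -6.437418e+03  -1.832945e-10  5.861e-08  1.780e+01
"""

loops = drho_by_outer_loop(sample_log)
for index, drho in enumerate(loops, start=1):
    print(f"outer loop {index}: {len(drho)} inner steps, "
          f"first DRHO {drho[0]:.3e}, last DRHO {drho[-1]:.3e}")
```

Applied to the full log, the first DRHO of each outer loop drops from 6.975e-02 to roughly 2e-07 and then stops decreasing, which matches the observation that scf_thr 1e-7 is not reached.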

There are also some usability problems with HSE:

  1. Nothing is printed to stdout or running*.log during the EXX step (apart from the "Updating EXX and rerun SCF" notice), which gives the user the impression that the calculation is stuck. Could more information be printed, such as the time consumed in the EXX step and its key stages?
  2. How can I properly restart an HSE SCF calculation if a complete SCF has not finished? Because the total SCF is not finished, the charge file is not written, so I am trying to use the wavefunction file and the restart file. However, the HSE run performs a PBE SCF first whether exx_separate_loop is 0 or 1. If I start directly from the wfc or restart file of a half-finished HSE run, is that initialization made useless by the first PBE step?
  3. How should I set the numbers of MPI processes and OMP threads for the best performance (assuming memory allows and the number of physical cores is fixed)? Using more OMP threads reduces memory cost, but from my observation of the CPU status on the HPC server, the EXX step sometimes seems to be parallelized mainly by MPI.
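For the third question, one practical approach is to benchmark every MPI/OMP split of the fixed core count. The sketch below only enumerates the candidate launch lines; it does not predict which split is fastest. The helper name `mpi_omp_splits` is hypothetical, and the value 64 is taken from the 16 threads x 4 ranks run reported above.

```python
def mpi_omp_splits(total_cores):
    """Return every (mpi_ranks, omp_threads) pair whose product is total_cores."""
    return [
        (ranks, total_cores // ranks)
        for ranks in range(1, total_cores + 1)
        if total_cores % ranks == 0
    ]


# 64 cores, as in "OMP_NUM_THREADS=16 mpirun -np 4 abacus" above.
splits = mpi_omp_splits(64)
for ranks, threads in splits:
    print(f"OMP_NUM_THREADS={threads} mpirun -np {ranks} abacus")
```

Running a short, fixed number of SCF steps with each line and comparing the time and peak memory gives a measured answer for a given machine.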

Task list for Issue attackers (only for developers)

  • [ ] Reproduce the performance issue on a similar system or environment.
  • [ ] Identify the specific section of the code causing the performance issue.
  • [ ] Investigate the issue and determine the root cause.
  • [ ] Research best practices and potential solutions for the identified performance issue.
  • [ ] Implement the chosen solution to address the performance issue.
  • [ ] Test the implemented solution to ensure it improves performance without introducing new issues.
  • [ ] Optimize the solution if necessary, considering trade-offs between performance and other factors (e.g., code complexity, readability, maintainability).
  • [ ] Review and incorporate any relevant feedback from users or developers.
  • [ ] Merge the improved solution into the main codebase and notify the issue reporter.

QuantumMisaka avatar Dec 14 '23 12:12 QuantumMisaka

  2. How can I properly restart an HSE SCF calculation if a complete SCF has not finished? Because the total SCF is not finished, the charge file is not written, so I am trying to use the wavefunction file and the restart file. However, the HSE run performs a PBE SCF first whether exx_separate_loop is 0 or 1. If I start directly from the wfc or restart file of a half-finished HSE run, is that initialization made useless by the first PBE step?

In my experience so far, an HSE calculation CANNOT be restarted from the wfc file or the restart file; they can only restart the PBE part.

QuantumMisaka avatar Dec 14 '23 16:12 QuantumMisaka

@dyzheng @PeizeLin Could you look into this problem together?

QuantumMisaka avatar Dec 17 '23 05:12 QuantumMisaka

I completed an HSE SCF calculation using scf_thr 1e-6 and mixing_gg0 0 on the Fe-bcc system above. The time cost is shown below:

TIME STATISTICS
--------------------------------------------------------------------------------------
     CLASS_NAME                  NAME            TIME(Sec)  CALLS   AVG(Sec)  PER(%)
--------------------------------------------------------------------------------------
                      total                      137589.57        9 15287.73  100.00
Driver                reading                      0.02           1   0.02      0.00
Input                 Init                         0.01           1   0.01      0.00
Input_Conv            Convert                      0.00           1   0.00      0.00
Driver                driver_line                137589.55        1 137589.55 100.00
UnitCell              check_tau                    0.00           1   0.00      0.00
PW_Basis_Sup          setuptransform               0.00           1   0.00      0.00
PW_Basis_Sup          distributeg                  0.00           1   0.00      0.00
mymath                heapsort                     0.00           4   0.00      0.00
PW_Basis_K            setuptransform               0.02           1   0.02      0.00
PW_Basis_K            distributeg                  0.00           1   0.00      0.00
PW_Basis              setup_struc_factor           0.02           1   0.02      0.00
ORB_control           read_orb_first               0.08           1   0.08      0.00
LCAO_Orbitals         Read_Orbitals                0.08           1   0.08      0.00
NOrbital_Lm           extra_uniform               43.58       16798   0.00      0.03
Mathzone_Add1         SplineD2                     0.20       16798   0.00      0.00
Mathzone_Add1         Cubic_Spline_Interpolation   0.80       16798   0.00      0.00
Mathzone_Add1         Uni_Deriv_Phi               40.54       16798   0.00      0.03
Sphbes                Spherical_Bessel             0.02        6030   0.00      0.00
Exx_LRI               init                       118.34           1 118.34      0.09
Matrix_Orbs21         init                        11.41           2   5.70      0.01
ORB_gaunt_table       init_Gaunt_CH                0.94           3   0.31      0.00
ORB_gaunt_table       Calc_Gaunt_CH                0.47      408058   0.00      0.00
ORB_gaunt_table       init_Gaunt                   9.68           3   3.23      0.01
ORB_gaunt_table       Get_Gaunt_SH                16.71    28208149   0.00      0.01
Matrix_Orbs21         init_radial                  0.00           2   0.00      0.00
Matrix_Orbs21         init_radial_table           93.67           2  46.83      0.07
ORB_table_phi         cal_ST_Phi12_R              49.66       31655   0.00      0.04
LRI_CV                set_orbitals                61.76           1  61.76      0.04
Matrix_Orbs11         init                         0.28           1   0.28      0.00
Matrix_Orbs11         init_radial                  0.00           1   0.00      0.00
Matrix_Orbs11         init_radial_table           11.38           1  11.38      0.01
ppcell_vl             init_vloc                    0.00           1   0.00      0.00
Ions                  opt_ions                   137470.74        1 137470.74  99.91
ESolver_KS_LCAO       Run                        82117.20         1 82117.20   59.68
ESolver_KS_LCAO       beforescf                   91.86           1  91.86      0.07
ESolver_KS_LCAO       beforesolver                17.00           1  17.00      0.01
ESolver_KS_LCAO       set_matrix_grid             16.98           1  16.98      0.01
atom_arrange          search                       0.00           1   0.00      0.00
Grid_Technique        init                        16.97           1  16.97      0.01
Grid_BigCell          grid_expansion_index         0.01           2   0.00      0.00
Record_adj            for_2d                       0.01           1   0.01      0.00
Grid_Driver           Find_atom                    0.01         402   0.00      0.00
LCAO_Hamilt           grid_prepare                 0.00           1   0.00      0.00
Veff                  initialize_HR                0.00           1   0.00      0.00
OverlapNew            initialize_SR                0.00           1   0.00      0.00
EkineticNew           initialize_HR                0.00           1   0.00      0.00
NonlocalNew           initialize_HR                0.00           1   0.00      0.00
Charge                set_rho_core                 0.00           1   0.00      0.00
Charge                atomic_rho                   0.04           1   0.04      0.00
PW_Basis_Sup          recip2real                   0.66         882   0.00      0.00
PW_Basis_Sup          gathers_scatterp             0.42         882   0.00      0.00
Potential             init_pot                     0.05           1   0.05      0.00
Potential             update_from_charge          17.10          97   0.18      0.01
Potential             cal_fixed_v                  0.00           1   0.00      0.00
PotLocal              cal_fixed_v                  0.00           1   0.00      0.00
Potential             cal_v_eff                   17.09          97   0.18      0.01
H_Hartree_pw          v_hartree                    0.23          97   0.00      0.00
PW_Basis_Sup          real2recip                   1.22         891   0.00      0.00
PW_Basis_Sup          gatherp_scatters             1.00         891   0.00      0.00
PotXC                 cal_v_eff                   16.85          97   0.17      0.01
XC_Functional         v_xc                       58072.22        54 1075.41    42.21
Potential             interpolate_vrs              0.00          97   0.00      0.00
Exx_LRI               cal_exx_ions                74.76           1  74.76      0.05
LRI_CV                cal_datas                   43.30           3  14.43      0.03
H_Ewald_pw            compute_ewald                0.01           1   0.01      0.00
HSolverLCAO           solve                      2210.18         96  23.02      1.61
HamiltLCAO            updateHk                   1006.91      70080   0.01      0.73
OperatorLCAO          init                        22.59      273020   0.00      0.02
Veff                  contributeHR                16.24         192   0.08      0.01
Gint_interface        cal_gint                    19.10         290   0.07      0.01
Gint_interface        cal_gint_vlocal             11.71         192   0.06      0.01
Gint_Tools            cal_psir_ylm                 0.68        7776   0.00      0.00
Gint_k                transfer_pvpR                4.53         192   0.02      0.00
OverlapNew            calculate_SR                 0.09           1   0.09      0.00
OverlapNew            contributeHk                24.81       70080   0.00      0.02
EkineticNew           contributeHR                 0.16         192   0.00      0.00
EkineticNew           calculate_HR                 0.16           1   0.16      0.00
NonlocalNew           contributeHR                 0.17         192   0.00      0.00
NonlocalNew           calculate_HR                 0.14           1   0.14      0.00
OperatorLCAO          contributeHk                44.18       70080   0.00      0.03
HSolverLCAO           hamiltSolvePsiK            1092.18      70080   0.02      0.79
DiagoElpa             elpa_solve                 941.99       70080   0.01      0.68
ElecStateLCAO         psiToRho                   110.81          96   1.15      0.08
elecstate             cal_dm                      73.74          97   0.76      0.05
psiMulPsiMpi          pdgemm                      72.28       70810   0.00      0.05
DensityMatrix         cal_DMR                      9.99          97   0.10      0.01
 Local_Orbital_wfc    wfc_2d_to_grid              46.65       81030   0.00      0.03
Gint                  transfer_DMR                 1.07          96   0.01      0.00
Gint_interface        cal_gint_rho                 7.13          96   0.07      0.01
Charge_Mixing         get_drho                     0.12          96   0.00      0.00
Charge                mix_rho                      0.15          81   0.00      0.00
Charge                Broyden_mixing               0.10          81   0.00      0.00
ModuleIO              write_wfc_nao_complex       28.07       10950   0.00      0.02
Exx_LRI               cal_exx_elec               79763.06        14 5697.36    57.97
RI_2D_Comm            split_m2D_ktoR              54.60          14   3.90      0.04
RI_2D_Comm            add_Hexx                   918.97       62780   0.01      0.67
XC_Functional         v_xc_libxc                  16.72          86   0.19      0.01
Exx_LRI               write_Hexxs                  0.19           1   0.19      0.00
ESolver_KS_LCAO       out_deepks_labels            0.00           1   0.00      0.00
LCAO_Deepks_Interface out_deepks_labels            0.00           1   0.00      0.00
HamiltLCAO            updateSk                     0.27         730   0.00      0.00
Force_Stress_LCAO     getForceStress             55353.54         1 55353.54   40.23
Forces                cal_force_loc                0.00           1   0.00      0.00
Forces                cal_force_ew                 0.00           1   0.00      0.00
Forces                cal_force_cc                 0.00           1   0.00      0.00
Forces                cal_force_scc                0.01           1   0.01      0.00
Stress_Func           stress_loc                   0.00           1   0.00      0.00
Stress_Func           stress_har                   0.00           1   0.00      0.00
Stress_Func           stress_ewa                   0.00           1   0.00      0.00
Stress_Func           stress_cc                    0.00           1   0.00      0.00
Stress_Func           stress_gga                   0.03           1   0.03      0.00
Force_LCAO_k          ftable_k                     3.31           1   3.31      0.00
Force_LCAO_k          allocate_k                   0.79           1   0.79      0.00
LCAO_gen_fixedH       b_NL_mu_new                  0.40           1   0.40      0.00
Force_LCAO_k          cal_foverlap_k               1.12           1   1.12      0.00
Force_LCAO_k          cal_edm_2d                   1.12           1   1.12      0.00
DensityMatrix         sum_DMR_spin                 0.00           1   0.00      0.00
Force_LCAO_k          cal_ftvnl_dphi_k             0.00           1   0.00      0.00
Force_LCAO_k          cal_fvl_dphi_k               0.26           1   0.26      0.00
Gint_interface        cal_gint_force               0.26           2   0.13      0.00
Gint_Tools            cal_dpsir_ylm                0.08          54   0.00      0.00
Gint_Tools            cal_dpsirr_ylm               0.01          54   0.00      0.00
Force_LCAO_k          cal_fvnl_dbeta_k_new         0.94           1   0.94      0.00
Exx_LRI               cal_exx_force              38787.02         1 38787.02   28.19
Exx_LRI               cal_exx_stress             16563.17         1 16563.17   12.04
ModuleIO              write_istate_info            0.05           1   0.05      0.00
--------------------------------------------------------------------------------------

 ----------------------------------------------------------

 START  Time  : Sat Dec 16 07:12:39 2023
 FINISH Time  : Sun Dec 17 21:25:49 2023
 TOTAL  Time  : 137590

Most of the time is spent in EXX, including the EXX force and stress calculations.

I think there is still much room for performance improvement.
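The share of EXX in the total run time can be read off the TIME STATISTICS table with a few lines of arithmetic. This is a small sketch using only numbers copied from that table; the variable names are illustrative.

```python
# Values (seconds) copied from the TIME STATISTICS table above.
total_time = 137589.57
exx_times = {
    "cal_exx_elec": 79763.06,    # 14 calls
    "cal_exx_force": 38787.02,   # 1 call
    "cal_exx_stress": 16563.17,  # 1 call
}

# Percentage of the total wall time for each EXX routine.
exx_shares = {name: 100.0 * t / total_time for name, t in exx_times.items()}
exx_total_share = sum(exx_shares.values())

# Average cost of one EXX update during the SCF.
avg_exx_elec = exx_times["cal_exx_elec"] / 14

for name, share in exx_shares.items():
    print(f"{name:15s} {share:6.2f} %")
print(f"EXX total       {exx_total_share:6.2f} %")
print(f"average cal_exx_elec call: {avg_exx_elec:.2f} s")
```

The three EXX routines together account for about 98.2% of the wall time, and a single force plus stress evaluation costs about as much as ten EXX updates in the SCF.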

Fe-bcc-hse.tar.gz

QuantumMisaka avatar Dec 17 '23 13:12 QuantumMisaka

Here are some tests of convergence steps.

gg0   loop0,broyden   loop0,pulay   loop1,broyden   loop1,pulay
0.0   36              35            22              22
0.2   30              36            19              21
0.4   31              46            19              19
0.6   27              28            21              23
0.8   29              34            19              19
1.0   27              27            20              20

It seems that in this system, gg0 does not affect the convergence speed.
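The table can be summarized per column to check that statement. The sketch below uses only the step counts posted above and keeps the column labels exactly as given; the variable names are illustrative.

```python
columns = ("loop0,broyden", "loop0,pulay", "loop1,broyden", "loop1,pulay")

# Step counts copied from the table above, keyed by mixing_gg0.
steps = {
    0.0: (36, 35, 22, 22),
    0.2: (30, 36, 19, 21),
    0.4: (31, 46, 19, 19),
    0.6: (27, 28, 21, 23),
    0.8: (29, 34, 19, 19),
    1.0: (27, 27, 20, 20),
}

summary = {}
for index, name in enumerate(columns):
    values = [row[index] for row in steps.values()]
    summary[name] = {
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }

for name, stats in summary.items():
    print(f"{name:14s} mean {stats['mean']:5.1f}  "
          f"min {stats['min']:2d}  max {stats['max']:2d}")
```

In these numbers the variation across gg0 within a column is small (except loop0,pulay at gg0 0.4), while the loop1 columns need about 20 steps against 30 or more for loop0.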

PeizeLin avatar Jan 09 '24 04:01 PeizeLin

Here are some tests of convergence steps.

gg0   loop0,broyden   loop0,pulay   loop1,broyden   loop1,pulay
0.0   36              35            22              22
0.2   30              36            19              21
0.4   31              46            19              19
0.6   27              28            21              23
0.8   29              34            19              19
1.0   27              27            20              20

It seems that in this system, gg0 does not affect the convergence speed.

However, two other things are more important:

  1. The number of loops needed for final convergence: if mixing_gg0 is set to 1.0 (the default), the HSE calculation CANNOT converge to DRHO=1e-6.
  2. The time cost of the EXX step may be affected by this setting.

QuantumMisaka avatar Jan 09 '24 04:01 QuantumMisaka

@QuantumMisaka, would you like to discuss this further, or can we close this issue?

WHUweiqingzhou avatar Jan 30 '24 07:01 WHUweiqingzhou

@QuantumMisaka, would you like to discuss this further, or can we close this issue?

I have some updates and will follow up with more discussion later.

QuantumMisaka avatar Feb 08 '24 14:02 QuantumMisaka

I am testing HSE computation performance on the FeCx systems below:

  1. Fe2C, which has 16 Fe atoms and 8 C atoms in a 6.4 * 6.4 * 5.6 tetragonal cell.
ATOMIC_SPECIES
C 12.011 C_ONCV_PBE-1.0.upf upf201
Fe 55.845 Fe_ONCV_PBE-1.0.upf upf201

NUMERICAL_ORBITAL
C_gga_7au_100Ry_2s2p1d.orb
Fe_gga_8au_100Ry_4s2p2d1f.orb

LATTICE_CONSTANT
1.889726

LATTICE_VECTORS
    6.3769109328     0.0343260832    -0.0001405731 #latvec1
    0.5967808069     6.3487148203     0.0001877534 #latvec2
    0.0004958326    -0.0005937998     5.6475745284 #latvec3

ATOMIC_POSITIONS
Direct

C #label
-1 #magnetism
8 #number of atoms
    0.3333347121     0.0000000460     0.0416681325 m  1  1  1
    0.8333318617     0.5000000561     0.0416692141 m  1  1  1
    0.3333319326     0.4999999864     0.2916627638 m  1  1  1
    0.8333347316     0.9999998433     0.2916668251 m  1  1  1
    0.3333347550     0.0000000537     0.5416683381 m  1  1  1
    0.8333320725     0.5000000845     0.5416689581 m  1  1  1
    0.3333319742     0.5000000254     0.7916628430 m  1  1  1
    0.8333347420     0.9999999572     0.7916668352 m  1  1  1

Fe #label
2 #magnetism
16 #number of atoms
    0.5356480922     0.4526941341     0.0416327237 m  1  1  1
    0.1310164298     0.5473060909     0.0416969262 m  1  1  1
    0.0356503367     0.9526938603     0.0416364215 m  1  1  1
    0.6310186916     0.0473059161     0.0417006360 m  1  1  1
    0.8806444504     0.2976880874     0.2916325414 m  1  1  1
    0.3806449276     0.7976871739     0.2916315420 m  1  1  1
    0.2860222877     0.2023122773     0.2917006324 m  1  1  1
    0.7860215681     0.7023128061     0.2917016191 m  1  1  1
    0.5356481536     0.4526940290     0.5416326165 m  1  1  1
    0.1310164033     0.5473060037     0.5416968098 m  1  1  1
    0.0356502134     0.9526938868     0.5416363924 m  1  1  1
    0.6310185668     0.0473059275     0.5417005647 m  1  1  1
    0.8806444140     0.2976878707     0.7916326007 m  1  1  1
    0.3806448806     0.7976870789     0.7916316216 m  1  1  1
    0.2860223248     0.2023119874     0.7917007357 m  1  1  1
    0.7860214815     0.7023128197     0.7917017073 m  1  1  1
  2. Fe3C, which has 12 Fe atoms and 4 C atoms in a 5.0 * 4.5 * 6.7 Å orthorhombic cell.
ATOMIC_SPECIES
C 12.011 C_ONCV_PBE-1.0.upf upf201
Fe 55.845 Fe_ONCV_PBE-1.0.upf upf201

NUMERICAL_ORBITAL
C_gga_7au_100Ry_2s2p1d.orb
Fe_gga_8au_100Ry_4s2p2d1f.orb

LATTICE_CONSTANT
1.8897

LATTICE_VECTORS
    5.0336918943    -0.0000153613     0.0001148504 #latvec1
    0.0000242702     4.5205688988    -0.0004021423 #latvec2
    0.0000295086    -0.0042714172     6.7265819577 #latvec3

ATOMIC_POSITIONS
Direct

C #label
-1 #magnetism
4 #number of atoms
    0.9999002510     0.7477581421     0.2496296076 m  1  1  1
    0.4999042747     0.1269749211     0.2505363048 m  1  1  1
    0.2501001489     0.6268918528     0.7497891752 m  1  1  1
    0.7501027998     0.2478196639     0.7499201588 m  1  1  1

Fe #label
2 #magnetism
12 #number of atoms
    0.3002934154     0.8583856824     0.0682195983 m  1  1  1
    0.7999770331     0.0167677619     0.0682404297 m  1  1  1
    0.1611385679     0.3533551730     0.2499013338 m  1  1  1
    0.6611406049     0.5213978215     0.2499994318 m  1  1  1
    0.8003304262     0.0164804414     0.4318831297 m  1  1  1
    0.2999663836     0.8579983835     0.4318564945 m  1  1  1
    0.4499017658     0.3579948191     0.5679338394 m  1  1  1
    0.9498731410     0.5164006883     0.5681836656 m  1  1  1
    0.5888838255     0.8532989502     0.7499398028 m  1  1  1
    0.0888926917     0.0214473387     0.7500547648 m  1  1  1
    0.4498977733     0.3584613447     0.9315988971 m  1  1  1
    0.9499203204     0.5167191345     0.9318523371 m  1  1  1

Calculation settings: for KPT, the 25 Å^-1 setting is used; for Fe2C and for Fe3C it is 9 9 9. INPUT is set as below, following the advice from this issue:

INPUT_PARAMETERS RUNNING ABACUS-DFT

#Parameters (1.General)
suffix                  Fe2C-HSE  # suffix of OUTPUT DIR
#ntype                   4   #  number of element
nspin                   2   #  1/2/4 4 for SOC
symmetry                0   #  0/1  1 for open, default
esolver_type            ksdft  # ksdft, ofdft, sdft, tddft, lj, dp
dft_functional          hse  # same as upf file, can be lda/pbe/scan/hf/pbe0/hse
ks_solver             genelpa  # default for ksdft-lcao
vdw_method              none  # none, d3, d3_bj
pseudo_dir              /lustre/home/2201110432/example/abacus/PP
orbital_dir             /lustre/home/2201110432/example/abacus/ORB

# SCF if HSE
exx_separate_loop     1   # default, optimized HSE method using LibRI
exx_cauchy_threshold            0  #default 1e-7, 0 to turn off
exx_cauchy_force_threshold      0
exx_cauchy_stress_threshold     0
exx_ccp_rmesh_times             1   # default 1.5
exx_dm_threshold                1e-3  # default 1e-4
mixing_gg0                      0   # for HSE this is needed

#Parameters (2.Iteration)
calculation             scf # scf relax cell-relax md
ecutwfc                 100
scf_thr                 1e-6
scf_nmax                300

#Parameters (3.Basis)
basis_type              lcao  # lcao or pw

#Parameters (4.Smearing)
smearing_method         mp    # mp/gaussian/fixed
smearing_sigma          0.002  # Rydberg

#Parameters (5.Mixing)
mixing_type             broyden  # pulay/broyden
mixing_ndim             8    # mixing dimension, for low-d can set to 20

#Parameters (6.Calculation)
cal_force          1
cal_stress         1
out_stru           1  # print STRU in OUT
out_chg            1  # print CHG or not
out_bandgap        1
out_mul            1  # print Mulliken charge and mag of atom in mulliken.txt
out_wfc_lcao           1  ## I forgot to close it sometimes

And:

  • All calculations were performed on an Intel-8358 server using 4 nodes with 64 cores each.
  • Parallelism scheme: MPI=8, OMP=32

The test results so far are:

  1. With the original LibRI from GitHub, the EXX step takes a very long time (on the order of 1E+4 s), which makes the whole HSE calculation very expensive
GE22   2.26e+01  2.44e+01  -3.926133e+04  -3.829849e-09  1.132e-06  3.898e+01  
GE23   2.26e+01  2.44e+01  -3.926133e+04  1.033255e-09   3.032e-07  3.861e+01  
Updating EXX and rerun SCF     1.370e+04 (s)
GE1    2.54e+01  2.87e+01  -3.920527e+04  0.000000e+00   6.442e-02  1.991e+02  
GE2    2.77e+01  3.04e+01  -3.923653e+04  -3.125964e+01  6.639e-02  1.976e+02 
  2. With the loop3 version from gitee, which is recommended by @PeizeLin, the EXX time cost is much better (on the order of 8E+2 - 1E+3 s) with the same parameters
 GE26   2.26e+01  2.44e+01  -3.926134e+04  -4.201078e-09  1.981e-06  3.689e+01  
 GE27   2.26e+01  2.44e+01  -3.926134e+04  1.806031e-08   8.870e-07  3.725e+01  
 Updating EXX and rerun SCF     8.483e+02 (s)
 GE1    2.53e+01  2.87e+01  -3.920525e+04  0.000000e+00   6.444e-02  1.967e+02  
 GE2    2.77e+01  3.04e+01  -3.923653e+04  -3.128343e+01  6.641e-02  1.946e+02
  3. The Fe2C system can be converged in 120800 s (33.6 h)

abacus.log

running_scf.log
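The EXX update times in the two log excerpts above can be compared directly. The following is a minimal Python sketch that pulls the time out of each `Updating EXX and rerun SCF` line; the regular expression and the function name `exx_update_times` are assumptions of this sketch, written against the line format shown in this thread only.

```python
import re

# Matches lines such as "Updating EXX and rerun SCF     1.370e+04 (s)".
EXX_LINE = re.compile(r"Updating EXX and rerun SCF\s+([0-9.eE+-]+)\s*\(s\)")


def exx_update_times(log_text):
    """Return the time (s) reported by every EXX update line in log_text."""
    return [float(m.group(1)) for m in EXX_LINE.finditer(log_text)]


# One line from each excerpt above (GitHub LibRI vs. gitee loop3 version).
github_log = "Updating EXX and rerun SCF     1.370e+04 (s)"
gitee_log = " Updating EXX and rerun SCF     8.483e+02 (s)"

t_old = exx_update_times(github_log)[0]
t_new = exx_update_times(gitee_log)[0]
print(f"speed-up of one EXX update: {t_old / t_new:.1f}x")
```

For the two lines quoted here the ratio is roughly 16; feeding a whole log file to `exx_update_times` would give the time of every outer-loop update.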

But the Fe3C system, even though it has fewer atoms, is hard to converge

 Updating EXX and rerun SCF     1.022e+03 (s)
 GE1    2.97e+01  3.33e+01  -3.923971e+04  0.000000e+00   2.960e-05  2.873e+02  
 GE2    2.97e+01  3.33e+01  -3.923971e+04  5.290983e-06   2.980e-05  2.835e+02  
 GE3    2.97e+01  3.33e+01  -3.923971e+04  5.187453e-06   4.958e-05  2.785e+02  
 GE4    2.97e+01  3.33e+01  -3.923971e+04  -7.941523e-06  2.576e-05  2.875e+02  
 GE5    2.97e+01  3.33e+01  -3.923971e+04  1.500385e-07   2.448e-05  2.881e+02  
 GE6    2.97e+01  3.33e+01  -3.923971e+04  -1.990061e-06  7.003e-06  2.765e+02  
 GE7    2.97e+01  3.33e+01  -3.923971e+04  -1.512264e-07  1.223e-06  2.865e+02  
 GE8    2.97e+01  3.33e+01  -3.923971e+04  3.427684e-09   1.540e-06  2.925e+02  
 GE9    2.97e+01  3.33e+01  -3.923971e+04  -5.395199e-09  3.099e-07  3.310e+02  
 Updating EXX and rerun SCF     1.014e+03 (s)
 GE1    2.97e+01  3.33e+01  -3.923971e+04  0.000000e+00   3.050e-05  2.861e+02  
 GE2    2.97e+01  3.33e+01  -3.923971e+04  9.559464e-07   2.721e-05  2.842e+02  
 GE3    2.97e+01  3.33e+01  -3.923971e+04  7.419216e-06   3.428e-05  2.825e+02  
 GE4    2.97e+01  3.33e+01  -3.923971e+04  -2.696152e-06  3.636e-05  2.839e+02  
 GE5    2.97e+01  3.33e+01  -3.923971e+04  -4.620017e-06  1.934e-05  2.835e+02  
 GE6    2.97e+01  3.33e+01  -3.923971e+04  -9.654870e-07  2.373e-06  2.853e+02  
 GE7    2.97e+01  3.33e+01  -3.923971e+04  -2.641915e-09  2.520e-06  2.793e+02  
 GE8    2.97e+01  3.33e+01  -3.923971e+04  -1.140911e-08  1.989e-06  2.794e+02  
 GE9    2.97e+01  3.33e+01  -3.923971e+04  -1.233719e-08  3.303e-07  3.120e+02  
 Updating EXX and rerun SCF     1.011e+03 (s)
 GE1    2.97e+01  3.33e+01  -3.923971e+04  0.000000e+00   3.178e-05  2.780e+02  
 GE2    2.97e+01  3.33e+01  -3.923971e+04  2.795153e-06   2.902e-05  2.777e+02  
 GE3    2.97e+01  3.33e+01  -3.923971e+04  6.991417e-06   4.116e-05  2.757e+02  
 GE4    2.97e+01  3.33e+01  -3.923971e+04  -5.639035e-06  3.640e-05  2.759e+02  
 GE5    2.97e+01  3.33e+01  -3.923971e+04  -1.501697e-06  2.542e-05  2.735e+02  
 GE6    2.97e+01  3.33e+01  -3.923971e+04  -1.930423e-06  7.286e-06  2.976e+02  
 GE7    2.97e+01  3.33e+01  -3.923971e+04  -1.525629e-07  1.379e-06  3.123e+02  
 GE8    2.97e+01  3.33e+01  -3.923971e+04  2.623354e-09   1.713e-06  3.190e+02  
 GE9    2.97e+01  3.33e+01  -3.923971e+04  -6.830619e-09  3.337e-07  3.508e+02  
 Updating EXX and rerun SCF     1.025e+03 (s)
 GE1    2.97e+01  3.33e+01  -3.923971e+04  0.000000e+00   3.292e-05  2.773e+02  
 GE2    2.97e+01  3.33e+01  -3.923971e+04  1.766482e-06   2.947e-05  2.772e+02  
 GE3    2.97e+01  3.33e+01  -3.923971e+04  7.101357e-06   3.737e-05  2.806e+02  
 GE4    2.97e+01  3.33e+01  -3.923971e+04  -2.647007e-06  3.911e-05  2.818e+02  
 GE5    2.97e+01  3.33e+01  -3.923971e+04  -5.002272e-06  2.147e-05  2.842e+02  
 GE6    2.97e+01  3.33e+01  -3.923971e+04  -3.785116e-07  3.718e-06  2.861e+02  
 GE7    2.97e+01  3.33e+01  -3.923971e+04  -3.899145e-08  2.582e-06  2.852e+02  
 GE8    2.97e+01  3.33e+01  -3.923971e+04  -1.095126e-08  1.974e-06  2.816e+02  
 GE9    2.97e+01  3.33e+01  -3.923971e+04  -9.119867e-09  3.413e-07  3.264e+02

abacus.log
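The pattern in the Fe3C log (each outer EXX loop starts again at a DRHO of about 3e-5) can be extracted automatically. The following is a minimal Python sketch that groups the GE lines by outer loop and reports the first and last DRHO of each. It assumes the sixth column of a GE line is DRHO and that the outer loop only counts as converged once the first inner step already meets scf_thr; both are assumptions of this sketch, and `first_last_drho` is a hypothetical helper name.

```python
def first_last_drho(log_text):
    """Group GE lines by outer EXX loop; return [(first_drho, last_drho), ...]."""
    loops, current = [], []
    for line in log_text.splitlines():
        fields = line.split()
        if line.strip().startswith("Updating EXX"):
            # A new outer loop starts: store the inner loop collected so far.
            if current:
                loops.append(current)
            current = []
        elif len(fields) == 7 and fields[0].startswith("GE"):
            current.append(float(fields[5]))  # assumed DRHO column
    if current:
        loops.append(current)
    return [(d[0], d[-1]) for d in loops]


# First and last GE line of two outer loops, copied from the log above.
sample = """\
 Updating EXX and rerun SCF     1.022e+03 (s)
 GE1    2.97e+01  3.33e+01  -3.923971e+04  0.000000e+00   2.960e-05  2.873e+02
 GE9    2.97e+01  3.33e+01  -3.923971e+04  -5.395199e-09  3.099e-07  3.310e+02
 Updating EXX and rerun SCF     1.014e+03 (s)
 GE1    2.97e+01  3.33e+01  -3.923971e+04  0.000000e+00   3.050e-05  2.861e+02
 GE9    2.97e+01  3.33e+01  -3.923971e+04  -1.233719e-08  3.303e-07  3.120e+02
"""

scf_thr = 1e-6
result = first_last_drho(sample)
for i, (first, last) in enumerate(result, 1):
    print(f"outer loop {i}: first DRHO={first:.3e}, last DRHO={last:.3e}, "
          f"first step below scf_thr: {first < scf_thr}")
```

If the first DRHO does not shrink from one outer loop to the next, as in the excerpt, the inner loops converge but the outer loop is not making progress.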

  4. Some parameters have an effect:
     4.1. If I make smearing_sigma larger, from 0.002 to 0.010, the SCF convergence gets worse.
     4.2. A larger OMP thread count (32->64, with the total core count kept at 256) leads to more time in each EXX and GE step, but a smaller one (32->16) leads to an OOM error.
     4.3. A larger mixing_ndim has little effect on performance but leads to an OOM error.
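For reference, the MPI/OMP combinations mentioned above all keep the product of MPI ranks and OMP threads at 256 cores. The following is a minimal Python sketch of that bookkeeping; `hybrid_layouts` is a hypothetical helper, not part of ABACUS.

```python
def hybrid_layouts(total_cores, omp_choices=(16, 32, 64)):
    """Return (mpi_ranks, omp_threads) pairs that use exactly total_cores."""
    return [(total_cores // omp, omp)
            for omp in omp_choices
            if total_cores % omp == 0]


layouts = hybrid_layouts(256)
for mpi, omp in layouts:
    print(f"MPI={mpi:3d}  OMP={omp:3d}  total={mpi * omp}")
```

With 256 cores, OMP=16 corresponds to 16 MPI ranks, OMP=32 to 8 ranks, and OMP=64 to 4 ranks.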

My next step is to test HSE performance in

  1. Fe-O bulk systems
  2. FeCx surface systems
  3. FeCx surface systems with some C1 molecules adsorbed.

Now I'm wondering:

  1. What are the best parameters for HSE in spin-polarized systems (or just in Fe-Cx and Fe-C-H-O systems)?
  2. Does LibRI have more room for improvement in these spin-polarized and magnetic systems?
  3. What is the size limit for an HSE calculation (within 5 days of calculation time)? I've heard that VASP HSE can be done on surfaces containing 50-60 atoms.

I may need more discussion and cooperation on HSE usage in these Fe-containing magnetic systems @PeizeLin @WHUweiqingzhou @mohanchen

QuantumMisaka avatar Mar 19 '24 03:03 QuantumMisaka

I noticed that the KPT is much larger than required by the selected criterion ka > 25 Å, so I'm running tests with a proper KPT
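To make the criterion concrete, the smallest mesh satisfying ka > 25 Å can be computed from the lattice vectors in the STRU files above. The following is a minimal Python sketch, reading "ka" as (number of k-points along an axis) × (length of that lattice vector in Å), which is an assumption of this sketch; the helper names are hypothetical.

```python
import math

TARGET = 25.0  # Angstrom, the ka criterion quoted in this thread

# Lattice vectors in Angstrom, copied from the STRU blocks above.
FE2C = [
    (6.3769109328, 0.0343260832, -0.0001405731),
    (0.5967808069, 6.3487148203, 0.0001877534),
    (0.0004958326, -0.0005937998, 5.6475745284),
]
FE3C = [
    (5.0336918943, -0.0000153613, 0.0001148504),
    (0.0000242702, 4.5205688988, -0.0004021423),
    (0.0000295086, -0.0042714172, 6.7265819577),
]


def length(vec):
    """Euclidean norm of a 3-vector."""
    return math.sqrt(sum(x * x for x in vec))


def minimal_mesh(lattice, target=TARGET):
    """Smallest k-point count per axis with n * |a_i| strictly above target."""
    return tuple(math.floor(target / length(v)) + 1 for v in lattice)


def ka_products(lattice, mesh):
    """n_i * |a_i| in Angstrom for a given mesh."""
    return tuple(n * length(v) for n, v in zip(mesh, lattice))


fe2c_mesh = minimal_mesh(FE2C)
fe3c_mesh = minimal_mesh(FE3C)
print("Fe2C minimal mesh:", fe2c_mesh)
print("Fe3C minimal mesh:", fe3c_mesh)
print("ka for 9 9 9 on Fe2C:",
      [round(v, 1) for v in ka_products(FE2C, (9, 9, 9))])
```

Under this reading, a 9 9 9 mesh gives ka above 50 Å on every axis of the Fe2C cell, about twice what the criterion asks for.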

QuantumMisaka avatar Mar 19 '24 10:03 QuantumMisaka

After normalizing KPT to KSPACING 0.14 in the INPUT file, the calculation behaves more reasonably. For Fe2C above, using OMP_NUM_THREADS=16 mpirun -np 16 abacus, the time cost for SCF is totally acceptable

 GE20   2.63e+01  2.91e+01  -5.276221e+04  -7.251346e-08  1.316e-06  3.808e+00  
 GE21   2.63e+01  2.91e+01  -5.276221e+04  1.201546e-08   7.316e-07  3.935e+00  
 Updating EXX and rerun SCF	5.924e+02 (s)
 GE1    2.76e+01  3.32e+01  -5.267478e+04  0.000000e+00   7.036e-02  7.883e+00 
...
TIME STATISTICS
-------------------------------------------------------------------------------------
     CLASS_NAME                  NAME            TIME(Sec)  CALLS   AVG(Sec) PER(%)
-------------------------------------------------------------------------------------
                      total                      75159.80         9 8351.09  100.00
Driver                reading                      0.07           1   0.07     0.00
Input                 Init                         0.06           1   0.06     0.00
Input_Conv            Convert                      0.00           1   0.00     0.00
Driver                driver_line                75159.73         1 75159.73 100.00
UnitCell              check_tau                    0.00           1   0.00     0.00
PW_Basis_Sup          setuptransform               0.06           1   0.06     0.00
PW_Basis_Sup          distributeg                  0.00           1   0.00     0.00
mymath                heapsort                     0.00           3   0.00     0.00
PW_Basis_K            setuptransform               0.01           1   0.01     0.00
PW_Basis_K            distributeg                  0.00           1   0.00     0.00
PW_Basis              setup_struc_factor           0.01           1   0.01     0.00
NOrbital_Lm           extra_uniform               29.94       22261   0.00     0.04
Mathzone_Add1         SplineD2                     0.22       22261   0.00     0.00
Mathzone_Add1         Cubic_Spline_Interpolation   0.40       22261   0.00     0.00
Mathzone_Add1         Uni_Deriv_Phi               27.95       22261   0.00     0.04
Exx_LRI               init                        92.02           1  92.02     0.12
Matrix_Orbs21         init                         9.36           2   4.68     0.01
ORB_gaunt_table       init_Gaunt_CH                0.88           3   0.29     0.00
ORB_gaunt_table       Calc_Gaunt_CH                0.44      408058   0.00     0.00
ORB_gaunt_table       init_Gaunt                   7.63           3   2.54     0.01
ORB_gaunt_table       Get_Gaunt_SH                14.57    28208149   0.00     0.02
Matrix_Orbs21         init_radial                  0.00           2   0.00     0.00
Matrix_Orbs21         init_radial_table           71.08           2  35.54     0.09
ORB_table_phi         cal_ST_Phi12_R              44.64       43034   0.00     0.06
LRI_CV                set_orbitals                61.00           1  61.00     0.08
Matrix_Orbs11         init                         0.06           1   0.06     0.00
Matrix_Orbs11         init_radial                  0.00           1   0.00     0.00
Matrix_Orbs11         init_radial_table           10.36           1  10.36     0.01
ppcell_vl             init_vloc                    0.01           1   0.01     0.00
Ions                  opt_ions                   75066.34         1 75066.34  99.88
ESolver_KS_LCAO       Run                        28425.97         1 28425.97  37.82
ESolver_KS_LCAO       beforescf                  178.49           1 178.49     0.24
ESolver_KS_LCAO       beforesolver                 0.74           1   0.74     0.00
ESolver_KS_LCAO       set_matrix_grid              0.64           1   0.64     0.00
atom_arrange          search                       0.00           1   0.00     0.00
Grid_Technique        init                         0.60           1   0.60     0.00
Grid_BigCell          grid_expansion_index         0.00           2   0.00     0.00
Record_adj            for_2d                       0.03           1   0.03     0.00
Grid_Driver           Find_atom                    0.24       17400   0.00     0.00
LCAO_Hamilt           grid_prepare                 0.00           1   0.00     0.00
Veff                  initialize_HR                0.00           1   0.00     0.00
OverlapNew            initialize_SR                0.00           1   0.00     0.00
EkineticNew           initialize_HR                0.00           1   0.00     0.00
NonlocalNew           initialize_HR                0.00           1   0.00     0.00
Charge                set_rho_core                 0.00           1   0.00     0.00
Charge                atomic_rho                   0.03           1   0.03     0.00
PW_Basis_Sup          recip2real                   1.92        3240   0.00     0.00
PW_Basis_Sup          gathers_scatterp             0.69        3240   0.00     0.00
Potential             init_pot                     0.02           1   0.02     0.00
Potential             update_from_charge          32.32         359   0.09     0.04
Potential             cal_fixed_v                  0.00           1   0.00     0.00
PotLocal              cal_fixed_v                  0.00           1   0.00     0.00
Potential             cal_v_eff                   32.29         359   0.09     0.04
H_Hartree_pw          v_hartree                    1.28         359   0.00     0.00
PW_Basis_Sup          real2recip                   2.88        3260   0.00     0.00
PW_Basis_Sup          gatherp_scatters             1.35        3260   0.00     0.00
PotXC                 cal_v_eff                   30.95         359   0.09     0.04
XC_Functional         v_xc                       14663.85       191  76.77    19.51
Potential             interpolate_vrs              0.03         359   0.00     0.00
Exx_LRI               cal_exx_ions               176.98           1 176.98     0.24
LRI_CV                cal_datas                   12.17           3   4.06     0.02
H_Ewald_pw            compute_ewald                0.72           1   0.72     0.00
Charge_Mixing         init_mixing                  0.03          45   0.00     0.00
HSolverLCAO           solve                      2546.72        358   7.11     3.39
HamiltLCAO            updateHk                   1319.18      30072   0.04     1.76
OperatorLCAO          init                       134.75      118524   0.00     0.18
Veff                  contributeHR               133.19         716   0.19     0.18
Gint_interface        cal_gint                   154.59        1076   0.14     0.21
Gint_interface        cal_gint_vlocal            104.96         716   0.15     0.14
Gint_Tools            cal_psir_ylm                 8.99       34368   0.00     0.01
Gint_k                transfer_pvpR               28.22         716   0.04     0.04
OverlapNew            calculate_SR                 0.05           1   0.05     0.00
OverlapNew            contributeHk                 3.21       30072   0.00     0.00
EkineticNew           contributeHR                 0.08         716   0.00     0.00
EkineticNew           calculate_HR                 0.08           1   0.08     0.00
NonlocalNew           contributeHR                 0.69         716   0.00     0.00
NonlocalNew           calculate_HR                 0.29           1   0.29     0.00
OperatorLCAO          contributeHk                 6.12       30072   0.00     0.01
HSolverLCAO           hamiltSolvePsiK            1074.60      30072   0.04     1.43
DiagoElpa             elpa_solve                 1059.26      30072   0.04     1.41
ElecStateLCAO         psiToRho                   152.86         358   0.43     0.20
elecstate             cal_dm                      35.04         359   0.10     0.05
psiMulPsiMpi          pdgemm                      34.28       30156   0.00     0.05
DensityMatrix         cal_DMR                      4.31         359   0.01     0.01
Local_Orbital_wfc     wfc_2d_to_grid              44.45       33852   0.00     0.06
Gint                  transfer_DMR                14.18         358   0.04     0.02
Gint_interface        cal_gint_rho                48.87         358   0.14     0.07
Charge_Mixing         get_drho                     0.04         358   0.00     0.00
Charge                mix_rho                      0.89         313   0.00     0.00
Charge                Broyden_mixing               0.57         313   0.00     0.00
RI_2D_Comm            split_m2D_ktoR              95.74          44   2.18     0.13
Exx_LRI               cal_exx_elec               25560.41        44 580.92    34.01
RI_2D_Comm            add_Hexx                   1175.08      28308   0.04     1.56
XC_Functional         v_xc_libxc                  30.63         337   0.09     0.04
Exx_LRI               write_Hexxs                  0.57           1   0.57     0.00
ESolver_KS_LCAO       out_deepks_labels            0.00           1   0.00     0.00
LCAO_Deepks_Interface out_deepks_labels            0.00           1   0.00     0.00
HamiltLCAO            updateSk                     0.01          84   0.00     0.00
Force_Stress_LCAO     getForceStress             46640.37         1 46640.37  62.05
Forces                cal_force_loc                0.01           1   0.01     0.00
Forces                cal_force_ew                 0.02           1   0.02     0.00
Forces                cal_force_cc                 0.00           1   0.00     0.00
Forces                cal_force_scc                0.01           1   0.01     0.00
Stress_Func           stress_loc                   0.03           1   0.03     0.00
Stress_Func           stress_har                   0.00           1   0.00     0.00
Stress_Func           stress_ewa                   0.01           1   0.01     0.00
Stress_Func           stress_cc                    0.00           1   0.00     0.00
Stress_Func           stress_gga                   0.02           1   0.02     0.00
Force_LCAO_k          ftable_k                     1.75           1   1.75     0.00
Force_LCAO_k          allocate_k                   0.44           1   0.44     0.00
LCAO_gen_fixedH       b_NL_mu_new                  0.19           1   0.19     0.00
Force_LCAO_k          cal_foverlap_k               0.15           1   0.15     0.00
Force_LCAO_k          cal_edm_2d                   0.14           1   0.14     0.00
DensityMatrix         sum_DMR_spin                 0.00           1   0.00     0.00
Force_LCAO_k          cal_ftvnl_dphi_k             0.00           1   0.00     0.00
Force_LCAO_k          cal_fvl_dphi_k               0.77           1   0.77     0.00
Gint_interface        cal_gint_force               0.77           2   0.38     0.00
Gint_Tools            cal_dpsir_ylm                0.23          64   0.00     0.00
Gint_Tools            cal_dpsirr_ylm               0.10          64   0.00     0.00
Force_LCAO_k          cal_fvnl_dbeta_k_new         0.40           1   0.40     0.00
Exx_LRI               cal_exx_force              4521.58          1 4521.58    6.02
Exx_LRI               cal_exx_stress             42116.88         1 42116.88  56.04
ModuleIO              write_istate_info            0.05           1   0.05     0.00
-------------------------------------------------------------------------------------

 START  Time  : Tue Mar 19 20:15:41 2024
 FINISH Time  : Wed Mar 20 17:08:22 2024
 TOTAL  Time  : 75161
 SEE INFORMATION IN : OUT.Fe2C-HSE/

But the time cost for Force_Stress_LCAO (getForceStress) and cal_exx_stress is too large and needs to be optimized.
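The most expensive timers can be picked out of a TIME STATISTICS table automatically. The following is a minimal Python sketch, assuming data rows have exactly six whitespace-separated fields as in the table above; `top_timers` is a hypothetical helper, and the sample rows are copied from that table.

```python
def top_timers(table_text, n=3):
    """Return the n most expensive (class, name, seconds) rows of a timing table."""
    rows = []
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows: CLASS_NAME NAME TIME CALLS AVG PER (six fields).
        if len(fields) != 6:
            continue
        try:
            rows.append((fields[0], fields[1], float(fields[2])))
        except ValueError:
            continue  # header row, where the third field is not a number
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]


sample = """\
     CLASS_NAME                  NAME            TIME(Sec)  CALLS   AVG(Sec) PER(%)
HSolverLCAO           solve                      2546.72        358   7.11     3.39
Exx_LRI               cal_exx_elec               25560.41        44 580.92    34.01
Force_Stress_LCAO     getForceStress             46640.37         1 46640.37  62.05
Exx_LRI               cal_exx_force              4521.58          1 4521.58    6.02
Exx_LRI               cal_exx_stress             42116.88         1 42116.88  56.04
"""

top = top_timers(sample)
for cls, name, seconds in top:
    print(f"{cls:18s} {name:16s} {seconds:10.2f} s")
```

The percentages in the table show that the timers are nested (getForceStress contains cal_exx_force and cal_exx_stress), so the rows should not be summed.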

abacus.log

QuantumMisaka avatar Mar 20 '24 11:03 QuantumMisaka

The force and stress calculations in HSE can be separated; if I turn off the stress calculation, the time cost is much lower.

TIME STATISTICS
-------------------------------------------------------------------------------------
     CLASS_NAME                  NAME            TIME(Sec)  CALLS   AVG(Sec) PER(%)
-------------------------------------------------------------------------------------
                      total                      33647.82         9 3738.65  100.00
Driver                reading                      0.11           1   0.11     0.00
Input                 Init                         0.07           1   0.07     0.00
Input_Conv            Convert                      0.03           1   0.03     0.00
Driver                driver_line                33647.71         1 33647.71 100.00
UnitCell              check_tau                    0.00           1   0.00     0.00
PW_Basis_Sup          setuptransform               0.01           1   0.01     0.00
PW_Basis_Sup          distributeg                  0.00           1   0.00     0.00
mymath                heapsort                     0.00           3   0.00     0.00
PW_Basis_K            setuptransform               0.02           1   0.02     0.00
PW_Basis_K            distributeg                  0.01           1   0.01     0.00
PW_Basis              setup_struc_factor           0.01           1   0.01     0.00
NOrbital_Lm           extra_uniform               30.20       22261   0.00     0.09
Mathzone_Add1         SplineD2                     0.22       22261   0.00     0.00
Mathzone_Add1         Cubic_Spline_Interpolation   0.47       22261   0.00     0.00
Mathzone_Add1         Uni_Deriv_Phi               28.16       22261   0.00     0.08
Exx_LRI               init                        94.32           1  94.32     0.28
Matrix_Orbs21         init                        10.95           2   5.48     0.03
ORB_gaunt_table       init_Gaunt_CH                0.96           3   0.32     0.00
ORB_gaunt_table       Calc_Gaunt_CH                0.48      408058   0.00     0.00
ORB_gaunt_table       init_Gaunt                   9.03           3   3.01     0.03
ORB_gaunt_table       Get_Gaunt_SH                15.40    28208149   0.00     0.05
Matrix_Orbs21         init_radial                  0.00           2   0.00     0.00
Matrix_Orbs21         init_radial_table           71.43           2  35.71     0.21
ORB_table_phi         cal_ST_Phi12_R              44.92       43034   0.00     0.13
LRI_CV                set_orbitals                62.43           1  62.43     0.19
Matrix_Orbs11         init                         0.07           1   0.07     0.00
Matrix_Orbs11         init_radial                  0.00           1   0.00     0.00
Matrix_Orbs11         init_radial_table           10.46           1  10.46     0.03
ppcell_vl             init_vloc                    0.01           1   0.01     0.00
Ions                  opt_ions                   33551.43         1 33551.43  99.71
ESolver_KS_LCAO       Run                        28994.91         1 28994.91  86.17
ESolver_KS_LCAO       beforescf                  186.88           1 186.88     0.56
ESolver_KS_LCAO       beforesolver                 0.20           1   0.20     0.00
ESolver_KS_LCAO       set_matrix_grid              0.10           1   0.10     0.00
atom_arrange          search                       0.00           1   0.00     0.00
Grid_Technique        init                         0.07           1   0.07     0.00
Grid_BigCell          grid_expansion_index         0.00           2   0.00     0.00
Record_adj            for_2d                       0.03           1   0.03     0.00
Grid_Driver           Find_atom                    0.22       17400   0.00     0.00
LCAO_Hamilt           grid_prepare                 0.00           1   0.00     0.00
Veff                  initialize_HR                0.00           1   0.00     0.00
OverlapNew            initialize_SR                0.00           1   0.00     0.00
EkineticNew           initialize_HR                0.00           1   0.00     0.00
NonlocalNew           initialize_HR                0.00           1   0.00     0.00
Exx_LRI               cal_exx_ions               185.47           1 185.47     0.55
LRI_CV                cal_datas                   12.49           3   4.16     0.04
Charge                set_rho_core                 0.00           1   0.00     0.00
Charge                atomic_rho                   1.18           1   1.18     0.00
PW_Basis_Sup          recip2real                   3.29        3234   0.00     0.01
PW_Basis_Sup          gathers_scatterp             2.14        3234   0.00     0.01
Potential             init_pot                     0.03           1   0.03     0.00
Potential             update_from_charge          34.05         359   0.09     0.10
Potential             cal_fixed_v                  0.00           1   0.00     0.00
PotLocal              cal_fixed_v                  0.00           1   0.00     0.00
Potential             cal_v_eff                   34.03         359   0.09     0.10
H_Hartree_pw          v_hartree                    1.72         359   0.00     0.01
PW_Basis_Sup          real2recip                   3.80        3255   0.00     0.01
PW_Basis_Sup          gatherp_scatters             2.56        3255   0.00     0.01
PotXC                 cal_v_eff                   32.25         359   0.09     0.10
XC_Functional         v_xc                       14926.25       191  78.15    44.36
Potential             interpolate_vrs              0.02         359   0.00     0.00
H_Ewald_pw            compute_ewald                0.00           1   0.00     0.00
Charge_Mixing         init_mixing                  0.04          45   0.00     0.00
HSolverLCAO           solve                      2960.28        358   8.27     8.80
HamiltLCAO            updateHk                   1395.76      30072   0.05     4.15
OperatorLCAO          init                       140.79      120288   0.00     0.42
Veff                  contributeHR               139.42         716   0.19     0.41
Gint_interface        cal_gint                   158.20        1076   0.15     0.47
Gint_interface        cal_gint_vlocal            106.82         716   0.15     0.32
Gint_Tools            cal_psir_ylm                 8.88       34368   0.00     0.03
Gint_k                transfer_pvpR               32.60         716   0.05     0.10
OverlapNew            calculate_SR                 0.05           1   0.05     0.00
OverlapNew            contributeHk                 3.45       30072   0.00     0.01
EkineticNew           contributeHR                 0.08         716   0.00     0.00
EkineticNew           calculate_HR                 0.08           1   0.08     0.00
NonlocalNew           contributeHR                 0.69         716   0.00     0.00
NonlocalNew           calculate_HR                 0.29           1   0.29     0.00
OperatorLCAO          contributeHk                 8.64       30072   0.00     0.03
HSolverLCAO           hamiltSolvePsiK            1403.75      30072   0.05     4.17
DiagoElpa             elpa_solve                 1382.46      30072   0.05     4.11
ElecStateLCAO         psiToRho                   160.68         358   0.45     0.48
elecstate             cal_dm                      34.69         359   0.10     0.10
psiMulPsiMpi          pdgemm                      33.93       30156   0.00     0.10
DensityMatrix         cal_DMR                      4.31         359   0.01     0.01
Local_Orbital_wfc     wfc_2d_to_grid              45.35       33852   0.00     0.13
Gint                  transfer_DMR                15.27         358   0.04     0.05
Gint_interface        cal_gint_rho                50.83         358   0.14     0.15
Charge_Mixing         get_drho                     0.07         358   0.00     0.00
Charge                mix_rho                      2.18         313   0.01     0.01
Charge                Broyden_mixing               1.86         313   0.01     0.01
RI_2D_Comm            split_m2D_ktoR             102.40          44   2.33     0.30
Exx_LRI               cal_exx_elec               25445.02        44 578.30    75.62
RI_2D_Comm            add_Hexx                   1429.29      32004   0.04     4.25
XC_Functional         v_xc_libxc                  31.90         337   0.09     0.09
Exx_LRI               write_Hexxs                  0.71           1   0.71     0.00
ESolver_KS_LCAO       out_deepks_labels            0.00           1   0.00     0.00
LCAO_Deepks_Interface out_deepks_labels            0.00           1   0.00     0.00
HamiltLCAO            updateSk                     0.01          84   0.00     0.00
Force_Stress_LCAO     getForceStress             4556.51          1 4556.51   13.54
Forces                cal_force_loc                0.00           1   0.00     0.00
Forces                cal_force_ew                 0.00           1   0.00     0.00
Forces                cal_force_cc                 0.00           1   0.00     0.00
Forces                cal_force_scc                0.01           1   0.01     0.00
Force_LCAO_k          ftable_k                     1.46           1   1.46     0.00
Force_LCAO_k          allocate_k                   0.42           1   0.42     0.00
LCAO_gen_fixedH       b_NL_mu_new                  0.20           1   0.20     0.00
Force_LCAO_k          cal_foverlap_k               0.17           1   0.17     0.00
Force_LCAO_k          cal_edm_2d                   0.16           1   0.16     0.00
DensityMatrix         sum_DMR_spin                 0.00           1   0.00     0.00
Force_LCAO_k          cal_ftvnl_dphi_k             0.00           1   0.00     0.00
Force_LCAO_k          cal_fvl_dphi_k               0.55           1   0.55     0.00
Gint_interface        cal_gint_force               0.55           2   0.28     0.00
Gint_Tools            cal_dpsir_ylm                0.22          64   0.00     0.00
Force_LCAO_k          cal_fvnl_dbeta_k_new         0.20           1   0.20     0.00
Exx_LRI               cal_exx_force              4554.97          1 4554.97   13.54
ModuleIO              write_istate_info            0.22           1   0.22     0.00
-------------------------------------------------------------------------------------

 START  Time  : Wed Mar 20 19:23:23 2024
 FINISH Time  : Thu Mar 21 04:44:11 2024
 TOTAL  Time  : 33648
 SEE INFORMATION IN : OUT.Fe2C-HSE

With this, the time cost of the HSE SCF is acceptable.

abacus.log

running_scf.log

QuantumMisaka avatar Mar 21 '24 03:03 QuantumMisaka

But I still wonder: can we use wavefunction extrapolation in HSE AIMD/Opt calculations? The separate-loop algorithm will always run PBE first.

QuantumMisaka avatar Mar 21 '24 03:03 QuantumMisaka

But I still wonder: can we use wavefunction extrapolation in HSE AIMD/Opt calculations? The separate-loop algorithm will always run PBE first.

ABACUS only has chg_extrap, which cannot be used with HSE. I wonder about another method via ase-abacus that uses restart_save and restart_load.

QuantumMisaka avatar Mar 21 '24 03:03 QuantumMisaka

@PeizeLin For the Fe3C system above, the calculation can be completed, but the SCF performance is poor: the first DRHO stays in the range (6-8)e-6 and CANNOT reach the 1e-6 threshold, and with the default EXX parameters convergence is even harder. Also, it seems the calculation did NOT finish by converging the separate loop to scf_thr 1e-6.

Updating EXX and rerun SCF     6.049e+02 (s)
 GE1    2.97e+01  3.33e+01  -3.923971e+04  0.000000e+00   6.828e-06  1.784e+01  
 GE2    2.97e+01  3.33e+01  -3.923971e+04  9.753741e-07   6.848e-06  1.774e+01  
 GE3    2.97e+01  3.33e+01  -3.923971e+04  -7.646272e-07  8.249e-06  1.780e+01  
 GE4    2.97e+01  3.33e+01  -3.923971e+04  -1.425273e-07  3.995e-06  1.775e+01  
 GE5    2.97e+01  3.33e+01  -3.923971e+04  -1.786850e-08  5.548e-06  1.774e+01  
 GE6    2.97e+01  3.33e+01  -3.923971e+04  -8.728838e-08  1.483e-06  1.777e+01  
 GE7    2.97e+01  3.33e+01  -3.923971e+04  5.073467e-09   6.515e-07  1.792e+01  
----------------------------------------------------------------
TOTAL-STRESS (KBAR)                                           
----------------------------------------------------------------
      148.9112096763         0.0072703256        -0.1701243744
        0.0070003380        91.9771779461        -0.8375153913
       -0.1703347058        -0.8376343605        19.3338642150
----------------------------------------------------------------
 TOTAL-PRESSURE: 86.740751 KBAR

TIME STATISTICS
--------------------------------------------------------------------------------------
     CLASS_NAME                  NAME            TIME(Sec)  CALLS   AVG(Sec)  PER(%)
--------------------------------------------------------------------------------------
                      total                      104645.52        9 11627.28  100.00
Driver                reading                      0.03           1   0.03      0.00
Input                 Init                         0.03           1   0.03      0.00
Input_Conv            Convert                      0.00           1   0.00      0.00
Driver                driver_line                104645.49        1 104645.49 100.00
UnitCell              check_tau                    0.00           1   0.00      0.00
PW_Basis_Sup          setuptransform               0.01           1   0.01      0.00
PW_Basis_Sup          distributeg                  0.00           1   0.00      0.00
mymath                heapsort                     0.00           3   0.00      0.00
PW_Basis_K            setuptransform               0.01           1   0.01      0.00
PW_Basis_K            distributeg                  0.00           1   0.00      0.00
PW_Basis              setup_struc_factor           0.01           1   0.01      0.00
NOrbital_Lm           extra_uniform               27.40       22261   0.00      0.03
Mathzone_Add1         SplineD2                     0.22       22261   0.00      0.00
Mathzone_Add1         Cubic_Spline_Interpolation   0.43       22261   0.00      0.00
Mathzone_Add1         Uni_Deriv_Phi               25.60       22261   0.00      0.02
Exx_LRI               init                        67.52           1  67.52      0.06
Matrix_Orbs21         init                         9.46           2   4.73      0.01
ORB_gaunt_table       init_Gaunt_CH                0.88           3   0.29      0.00
ORB_gaunt_table       Calc_Gaunt_CH                0.44      408058   0.00      0.00
ORB_gaunt_table       init_Gaunt                   7.75           3   2.58      0.01
ORB_gaunt_table       Get_Gaunt_SH                 9.56    28208149   0.00      0.01
Matrix_Orbs21         init_radial                  0.00           2   0.00      0.00
Matrix_Orbs21         init_radial_table           51.59           2  25.79      0.05
ORB_table_phi         cal_ST_Phi12_R              24.79       43034   0.00      0.02
LRI_CV                set_orbitals                39.42           1  39.42      0.04
Matrix_Orbs11         init                         0.06           1   0.06      0.00
Matrix_Orbs11         init_radial                  0.00           1   0.00      0.00
Matrix_Orbs11         init_radial_table            5.29           1   5.29      0.01
ppcell_vl             init_vloc                    0.01           1   0.01      0.00
Ions                  opt_ions                   104576.84        1 104576.84  99.93
ESolver_KS_LCAO       Run                        76530.48         1 76530.48   73.13
ESolver_KS_LCAO       beforescf                  132.43           1 132.43      0.13
ESolver_KS_LCAO       beforesolver                 0.19           1   0.19      0.00
ESolver_KS_LCAO       set_matrix_grid              0.11           1   0.11      0.00
atom_arrange          search                       0.00           1   0.00      0.00
Grid_Technique        init                         0.09           1   0.09      0.00
Grid_BigCell          grid_expansion_index         0.00           2   0.00      0.00
Record_adj            for_2d                       0.02           1   0.02      0.00
Grid_Driver           Find_atom                    0.52       29264   0.00      0.00
LCAO_Hamilt           grid_prepare                 0.00           1   0.00      0.00
Veff                  initialize_HR                0.00           1   0.00      0.00
OverlapNew            initialize_SR                0.00           1   0.00      0.00
EkineticNew           initialize_HR                0.00           1   0.00      0.00
NonlocalNew           initialize_HR                0.01           1   0.01      0.00
Charge                set_rho_core                 0.00           1   0.00      0.00
Charge                atomic_rho                   0.05           1   0.05      0.00
PW_Basis_Sup          recip2real                   4.08        8208   0.00      0.00
PW_Basis_Sup          gathers_scatterp             1.60        8208   0.00      0.00
Potential             init_pot                     0.04           1   0.04      0.00
Potential             update_from_charge          95.42         911   0.10      0.09
Potential             cal_fixed_v                  0.00           1   0.00      0.00
PotLocal              cal_fixed_v                  0.00           1   0.00      0.00
Potential             cal_v_eff                   95.35         911   0.10      0.09
H_Hartree_pw          v_hartree                    2.24         911   0.00      0.00
PW_Basis_Sup          real2recip                   4.76        8235   0.00      0.00
PW_Basis_Sup          gatherp_scatters             2.52        8235   0.00      0.00
PotXC                 cal_v_eff                   92.93         911   0.10      0.09
XC_Functional         v_xc                       39320.76       470  83.66     37.58
Potential             interpolate_vrs              0.07         911   0.00      0.00
Exx_LRI               cal_exx_ions               131.73           1 131.73      0.13
LRI_CV                cal_datas                    8.26           3   2.75      0.01
H_Ewald_pw            compute_ewald                0.42           1   0.42      0.00
Charge_Mixing         init_mixing                  0.10         101   0.00      0.00
HSolverLCAO           solve                      15672.73       910  17.22     14.98
HamiltLCAO            updateHk                   9583.95     112840   0.08      9.16
OperatorLCAO          init                       275.57      447888   0.00      0.26
Veff                  contributeHR               272.48        1820   0.15      0.26
Gint_interface        cal_gint                   320.82        2732   0.12      0.31
Gint_interface        cal_gint_vlocal            239.64        1820   0.13      0.23
Gint_Tools            cal_psir_ylm                12.80       65520   0.00      0.01
Gint_k                transfer_pvpR               32.83        1820   0.02      0.03
OverlapNew            calculate_SR                 0.02           1   0.02      0.00
OverlapNew            contributeHk                10.87      112840   0.00      0.01
EkineticNew           contributeHR                 0.04        1820   0.00      0.00
EkineticNew           calculate_HR                 0.04           1   0.04      0.00
NonlocalNew           contributeHR                 1.43        1820   0.00      0.00
NonlocalNew           calculate_HR                 0.20           1   0.20      0.00
OperatorLCAO          contributeHk                26.87      112840   0.00      0.03
HSolverLCAO           hamiltSolvePsiK            5435.97     112840   0.05      5.19
DiagoElpa             elpa_solve                 5276.76     112840   0.05      5.04
ElecStateLCAO         psiToRho                   652.46         910   0.72      0.62
elecstate             cal_dm                     456.91         911   0.50      0.44
psiMulPsiMpi          pdgemm                     454.76      112964   0.00      0.43
DensityMatrix         cal_DMR                     10.47         911   0.01      0.01
Local_Orbital_wfc     wfc_2d_to_grid              66.30      125364   0.00      0.06
Gint                  transfer_DMR                18.41         910   0.02      0.02
Gint_interface        cal_gint_rho                80.75         910   0.09      0.08
Charge_Mixing         get_drho                     0.05         910   0.00      0.00
Charge                mix_rho                      2.00         809   0.00      0.00
Charge                Broyden_mixing               1.22         809   0.00      0.00
RI_2D_Comm            split_m2D_ktoR             369.52         100   3.70      0.35
Exx_LRI               cal_exx_elec               60234.83       100 602.35     57.56
RI_2D_Comm            add_Hexx                   9270.09     109368   0.08      8.86
XC_Functional         v_xc_libxc                  92.52         882   0.10      0.09
Exx_LRI               write_Hexxs                  1.62           1   1.62      0.00
ESolver_KS_LCAO       out_deepks_labels            0.00           1   0.00      0.00
LCAO_Deepks_Interface out_deepks_labels            0.00           1   0.00      0.00
HamiltLCAO            updateSk                     0.01         124   0.00      0.00
Force_Stress_LCAO     getForceStress             28046.31         1 28046.31   26.80
Forces                cal_force_loc                0.00           1   0.00      0.00
Forces                cal_force_ew                 0.00           1   0.00      0.00
Forces                cal_force_cc                 0.00           1   0.00      0.00
Forces                cal_force_scc                0.01           1   0.01      0.00
Stress_Func           stress_loc                   0.02           1   0.02      0.00
Stress_Func           stress_har                   0.00           1   0.00      0.00
Stress_Func           stress_ewa                   0.00           1   0.00      0.00
Stress_Func           stress_cc                    0.00           1   0.00      0.00
Stress_Func           stress_gga                   0.01           1   0.01      0.00
Force_LCAO_k          ftable_k                     1.49           1   1.49      0.00
Force_LCAO_k          allocate_k                   0.25           1   0.25      0.00
LCAO_gen_fixedH       b_NL_mu_new                  0.12           1   0.12      0.00
Force_LCAO_k          cal_foverlap_k               0.55           1   0.55      0.00
Force_LCAO_k          cal_edm_2d                   0.54           1   0.54      0.00
DensityMatrix         sum_DMR_spin                 0.00           1   0.00      0.00
Force_LCAO_k          cal_ftvnl_dphi_k             0.00           1   0.00      0.00
Force_LCAO_k          cal_fvl_dphi_k               0.42           1   0.42      0.00
Gint_interface        cal_gint_force               0.42           2   0.21      0.00
Gint_Tools            cal_dpsir_ylm                0.15          48   0.00      0.00
Gint_Tools            cal_dpsirr_ylm               0.07          48   0.00      0.00
Force_LCAO_k          cal_fvnl_dbeta_k_new         0.25           1   0.25      0.00
Exx_LRI               cal_exx_force              4381.26          1 4381.26     4.19
Exx_LRI               cal_exx_stress             23663.47         1 23663.47   22.61
ModuleIO              write_istate_info            0.07           1   0.07      0.00
--------------------------------------------------------------------------------------

 START  Time  : Tue Mar 19 19:09:13 2024
 FINISH Time  : Thu Mar 21 00:13:18 2024
 TOTAL  Time  : 104645
 SEE INFORMATION IN : OUT.Fe3C-HSE/
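
To see where the wall time goes in a table like the one above, the rows can be ranked by their PER(%) column. The following is a minimal sketch, assuming each row is whitespace-separated as CLASS_NAME, NAME, TIME(Sec), CALLS, AVG(Sec), PER(%). The `parse_timing` helper is hypothetical (it is not part of ABACUS), and the sample rows are copied from the Fe3C table.

```python
# Sample rows copied from the Fe3C TIME STATISTICS table above.
TIMING = """\
XC_Functional         v_xc                       39320.76       470  83.66     37.58
HSolverLCAO           solve                      15672.73       910  17.22     14.98
Exx_LRI               cal_exx_elec               60234.83       100 602.35     57.56
RI_2D_Comm            add_Hexx                   9270.09     109368   0.08      8.86
Exx_LRI               cal_exx_force              4381.26          1 4381.26     4.19
Exx_LRI               cal_exx_stress             23663.47         1 23663.47   22.61
"""


def parse_timing(text):
    """Parse rows of 'CLASS NAME TIME CALLS AVG PER' into a list of dicts.

    Hypothetical helper: lines that do not have six fields, or whose
    numeric fields do not parse (rulers, the column header), are skipped.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue
        try:
            rows.append({
                "cls": parts[0],
                "name": parts[1],
                "time": float(parts[2]),
                "calls": int(parts[3]),
                "avg": float(parts[4]),
                "per": float(parts[5]),
            })
        except ValueError:
            continue
    return rows


rows = parse_timing(TIMING)

# Rank timers by their share of the total run time.
top = sorted(rows, key=lambda r: r["per"], reverse=True)
for r in top[:3]:
    print(f'{r["cls"]:<16}{r["name"]:<18}{r["time"]:>12.2f} s {r["per"]:>7.2f} %')

# Time spent in the three Exx_LRI timers of this sample.
exx_time = sum(r["time"] for r in rows if r["cls"] == "Exx_LRI")
print(f"Exx_LRI rows together: {exx_time:.2f} s")
```

The percentages in the table sum to more than 100, so some timers are nested inside others and a ranking like this should not be read as a partition of the run time. If the three Exx_LRI timers sampled here do not overlap each other, they account for roughly 84% of the 104645.52 s total.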

I wonder:

  1. Why does this happen? Are there some limits in the EXX calculation?
  2. Can the user set a stopping point for an HSE calculation? For example, in the HSE run above, one might consider EDIFF=1e-6 in the second SCF (outside of EXX) a good convergence point and want the HSE SCF to end there.
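
The stopping point asked about in item 2 can be illustrated on the GE lines quoted above. This is a minimal sketch, not an ABACUS feature: it assumes the columns are ITER, TMAG, AMAG, ETOT, EDIFF, DRHO, TIME (the column header is not part of the excerpt), it assumes the 0.0 EDIFF of the first step is a placeholder, and `parse_ge` and `first_converged` are hypothetical helpers.

```python
# GE lines copied from the Fe3C log above.
GE_LOG = """\
 GE1    2.97e+01  3.33e+01  -3.923971e+04  0.000000e+00   6.828e-06  1.784e+01
 GE2    2.97e+01  3.33e+01  -3.923971e+04  9.753741e-07   6.848e-06  1.774e+01
 GE3    2.97e+01  3.33e+01  -3.923971e+04  -7.646272e-07  8.249e-06  1.780e+01
 GE4    2.97e+01  3.33e+01  -3.923971e+04  -1.425273e-07  3.995e-06  1.775e+01
 GE5    2.97e+01  3.33e+01  -3.923971e+04  -1.786850e-08  5.548e-06  1.774e+01
 GE6    2.97e+01  3.33e+01  -3.923971e+04  -8.728838e-08  1.483e-06  1.777e+01
 GE7    2.97e+01  3.33e+01  -3.923971e+04  5.073467e-09   6.515e-07  1.792e+01
"""


def parse_ge(text):
    """Return (label, EDIFF, DRHO) for each GE line (assumed column order)."""
    steps = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 7 and parts[0].startswith("GE"):
            steps.append((parts[0], float(parts[4]), float(parts[5])))
    return steps


def first_converged(steps, ediff_thr=None, drho_thr=None):
    """Label of the first step that meets every threshold given, else None."""
    for i, (label, ediff, drho) in enumerate(steps):
        if ediff_thr is not None:
            # Assumption: the first step has no previous energy, so its
            # EDIFF of 0.0 is a placeholder and must not count as converged.
            if i == 0 or abs(ediff) >= ediff_thr:
                continue
        if drho_thr is not None and drho >= drho_thr:
            continue
        return label
    return None


steps = parse_ge(GE_LOG)
print("energy criterion  :", first_converged(steps, ediff_thr=1e-6))  # GE2
print("density criterion :", first_converged(steps, drho_thr=1e-6))   # GE7
```

On these seven steps an energy criterion of 1e-6 is already met at GE2, while a DRHO criterion of 1e-6 is only met at GE7, which is the gap the question is pointing at.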

QuantumMisaka avatar Mar 21 '24 10:03 QuantumMisaka

Now that scf_ene_thr has been added, there seems to be a way to stop the EXX calculation early.

But HSE convergence in magnetic systems is still a problem.

I'll keep an eye on it.

QuantumMisaka avatar Aug 30 '24 07:08 QuantumMisaka