R2SCAN convergence problem
Details
For an SCF calculation on the same structure (CO and H2 adsorption on the Fe5C2(510) surface), the latest v2.1(dev) NAO does not converge within 300 steps, while the official v2.0 NAO does converge:
INPUT:
INPUT_PARAMETERS RUNNING ABACUS-DFT
#Parameters (1.General)
suffix clean # suffix of OUTPUT DIR
#ntype 4 # number of element
nspin 2 # 1/2/4 4 for SOC
symmetry 0 # 0/1 1 for open, default
esolver_type ksdft
dft_functional MGGA_X_R2SCAN+MGGA_C_R2SCAN # same as upf file, can be lda/pbe/scan/hf/pbe0/hse
ks_solver genelpa # default for ksdft-lcao
#Parameters (2.Iteration)
calculation relax
ecutwfc 200
scf_thr 1e-7
scf_nmax 300
relax_nmax 400
relax_method bfgs
force_thr_ev 0.05 # ev
# stress_thr 5
#Parameters (3.Basis)
basis_type lcao # lcao or pw
kspacing 0.14 0.50 0.14 # replace KPT
# gamma_only 1 # 0/1, replace KPT
#Parameters (4.Smearing)
smearing_method mp # mp/gaussian/fixed
smearing_sigma 0.002 # Rydberg
#Parameters (5.Mixing)
mixing_type broyden # pulay/broyden
mixing_ndim 20
#Parameters (6.Calculation)
cal_force 1
cal_stress 1
out_stru 1 # print STRU in OUT
out_chg 0 # print CHG or not
out_bandgap 0
out_mul 1
#Parameters (7. Dipole Correction)
efield_flag 1 # open added potential, if 0, all below useless
dip_cor_flag 1 # open dipole correction
efield_dir 1 # direction of dipole correction, 0,1,2 for x,y,z
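The INPUT above follows a simple `key value # comment` layout. A minimal sketch for loading such a file into a dictionary, for example to diff the settings of two runs, could look like the following. The function name `parse_abacus_input` is hypothetical and not part of ABACUS, and the sample text is a trimmed copy of the INPUT above.

```python
def parse_abacus_input(text):
    """Parse 'key value # comment' lines into a dict of strings.

    Hypothetical helper, not an ABACUS API. Values are kept as raw strings
    because some keys (e.g. kspacing) hold several numbers.
    """
    params = {}
    for raw in text.splitlines():
        # Drop trailing comments; fully commented lines become empty.
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("INPUT_PARAMETERS"):
            continue
        parts = line.split(None, 1)
        params[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return params


sample = """INPUT_PARAMETERS RUNNING ABACUS-DFT
#Parameters (2.Iteration)
calculation relax
scf_thr 1e-7
scf_nmax 300
# stress_thr 5
kspacing 0.14 0.50 0.14 # replace KPT
mixing_type broyden # pulay/broyden
mixing_ndim 20
"""

params = parse_abacus_input(sample)
for key in ("scf_thr", "scf_nmax", "kspacing", "mixing_type", "mixing_ndim"):
    print(f"{key:<12} {params[key]}")
```

Commented-out keys such as `stress_thr` are skipped, so the dictionary holds only the settings that are actually active in the run.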
The first SCF of the relax process does not converge with the v2.1(dev) NAO:
GE297 1.81e+02 2.12e+02 -2.63782113e+05 -7.03217024e-03 1.6101e-04 1.0818e-03 60.76
GE298 1.81e+02 2.12e+02 -2.63782115e+05 -1.96780108e-03 1.5566e-04 1.1212e-03 60.71
GE299 1.81e+02 2.12e+02 -2.63782118e+05 -3.54125372e-03 1.0910e-04 1.8743e-03 60.62
GE300 1.80e+02 2.12e+02 -2.63782137e+05 -1.90117122e-02 1.1790e-04 2.9315e-03 60.62
>> Leave SCF iteration.
* * * * * *
!! CONVERGENCE HAS NOT BEEN ACHIEVED !!
With the official v2.0 NAO, the same SCF converges:
GE184 1.75e+02 2.01e+02 -2.63754454e+05 -2.48189066e-06 1.5504e-07 4.6288e-07 69.56
GE185 1.75e+02 2.01e+02 -2.63754454e+05 5.37807322e-06 1.4567e-07 5.2853e-07 69.81
GE186 1.75e+02 2.01e+02 -2.63754454e+05 -1.73654140e-05 1.3080e-07 4.4374e-07 69.84
GE187 1.75e+02 2.01e+02 -2.63754454e+05 7.34717246e-06 1.2054e-07 5.5272e-07 69.71
GE188 1.75e+02 2.01e+02 -2.63754454e+05 9.64171606e-06 1.0511e-07 5.2776e-07 69.58
GE189 1.75e+02 2.01e+02 -2.63754454e+05 -5.70485401e-06 9.4869e-08 3.5031e-07 69.49
>> Leave SCF iteration.
* * * * * *
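To compare the two logs programmatically, the `GE` lines can be parsed and the last density residual checked against `scf_thr`. The sketch below assumes the numeric columns are TMAG, AMAG, ETOT, EDIFF, DRHO, DKIN, TIME; this layout is not stated in the thread, but it is consistent with the v2.0 run stopping once the fifth value drops below 1e-7. The helper names are hypothetical.

```python
import re

# Assumed column layout (not stated in the thread):
# GE<iter> TMAG AMAG ETOT EDIFF DRHO DKIN TIME
GE_LINE = re.compile(r"^\s*GE(\d+)\s+(.*)$")


def parse_scf_steps(log_text):
    """Return one dict per GE line; lines that do not fit are skipped."""
    steps = []
    for line in log_text.splitlines():
        match = GE_LINE.match(line)
        if not match:
            continue
        try:
            values = [float(tok) for tok in match.group(2).split()]
        except ValueError:
            continue
        if len(values) != 7:
            continue
        steps.append({
            "iter": int(match.group(1)),
            "etot": values[2],
            "ediff": values[3],
            "drho": values[4],
        })
    return steps


def is_converged(steps, scf_thr):
    """True if the last recorded density residual is below scf_thr."""
    return bool(steps) and steps[-1]["drho"] < scf_thr


dev_log = """GE299 1.81e+02 2.12e+02 -2.63782118e+05 -3.54125372e-03 1.0910e-04 1.8743e-03 60.62
GE300 1.80e+02 2.12e+02 -2.63782137e+05 -1.90117122e-02 1.1790e-04 2.9315e-03 60.62
!! CONVERGENCE HAS NOT BEEN ACHIEVED !!"""

v20_log = """GE188 1.75e+02 2.01e+02 -2.63754454e+05 9.64171606e-06 1.0511e-07 5.2776e-07 69.58
GE189 1.75e+02 2.01e+02 -2.63754454e+05 -5.70485401e-06 9.4869e-08 3.5031e-07 69.49"""

for name, log in (("v2.1(dev)", dev_log), ("v2.0", v20_log)):
    steps = parse_scf_steps(log)
    print(name, "last iter:", steps[-1]["iter"],
          "drho:", steps[-1]["drho"],
          "converged:", is_converged(steps, 1e-7))
```

On the excerpts above, the v2.0 run ends with a residual just under 1e-7, while the v2.1(dev) run is still near 1e-4 at step 300.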
Attachment: Fe5C2-CO-3fold-SCAN.tar.gz
Task list for Issue attackers (only for developers)
- [ ] Reproduce the performance issue on a similar system or environment.
- [ ] Identify the specific section of the code causing the performance issue.
- [ ] Investigate the issue and determine the root cause.
- [ ] Research best practices and potential solutions for the identified performance issue.
- [ ] Implement the chosen solution to address the performance issue.
- [ ] Test the implemented solution to ensure it improves performance without introducing new issues.
- [ ] Optimize the solution if necessary, considering trade-offs between performance and other factors (e.g., code complexity, readability, maintainability).
- [ ] Review and incorporate any relevant feedback from users or developers.
- [ ] Merge the improved solution into the main codebase and notify the issue reporter.
This seems related to #4802 and #4058; waiting for the mixing part to be refined.
In version 3.7.3, this convergence problem does not seem to be solved; it is even worse.
I'm running and checking more tests.
@QuantumMisaka Could you try PR #4859 and run some tests with different mixing_eps values, such as 1e-10, 1e-12, 1e-14, and 1e-15?
@WHUweiqingzhou I've tried them; none of them makes the SCF converge:
(base) [2201110432@wm2-login01 test-mixeps]$ ls
abacus.err abacus.slurm eps14 INPUT KPT slurm.hosts time.json
abacus.out eps12 eps15 JobProcessing.state OUT.ABACUS STRU
(base) [2201110432@wm2-login01 test-mixeps]$ grep GE abacus.out | tail -n 1
!! CONVERGENCE HAS NOT BEEN ACHIEVED !!
(base) [2201110432@wm2-login01 test-mixeps]$ grep GE abacus.out | tail -n 2
GE300 1.83e+02 2.13e+02 -2.63782924e+05 1.91234615e-03 7.2299e-05 2.1935e-04 38.53
!! CONVERGENCE HAS NOT BEEN ACHIEVED !!
(base) [2201110432@wm2-login01 test-mixeps]$ grep GE eps12/abacus.out | tail -n 2
GE300 1.83e+02 2.13e+02 -2.63782926e+05 3.92081852e-04 5.9905e-05 3.6945e-04 19.12
!! CONVERGENCE HAS NOT BEEN ACHIEVED !!
(base) [2201110432@wm2-login01 test-mixeps]$ grep GE eps14/abacus.out | tail -n 2
GE300 1.83e+02 2.13e+02 -2.63782931e+05 8.10737831e-04 5.3708e-05 6.9621e-05 19.33
!! CONVERGENCE HAS NOT BEEN ACHIEVED !!
(base) [2201110432@wm2-login01 test-mixeps]$ grep GE eps15/abacus.out | tail -n 2
GE300 1.82e+02 2.13e+02 -2.63782936e+05 1.67324681e-04 5.2416e-05 5.3002e-04 19.18
!! CONVERGENCE HAS NOT BEEN ACHIEVED !!
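The four final residuals can be put side by side to see how far each run is from the threshold. The sketch below copies the fifth numeric value of each `GE300` line from the grep output above and assumes that column is the density residual compared against `scf_thr` (an assumption, not confirmed in the thread). The top-level `abacus.out` is labelled as such because the listing does not say which mixing_eps it used; `residual_report` is a hypothetical helper.

```python
SCF_THR = 1e-7  # scf_thr from the INPUT in this issue

# Final GE300 residuals copied from the grep output above.
final_drho = {
    "abacus.out (top level)": 7.2299e-05,
    "eps12": 5.9905e-05,
    "eps14": 5.3708e-05,
    "eps15": 5.2416e-05,
}


def residual_report(residuals, scf_thr):
    """Return (label, drho, drho / scf_thr, converged) for each run."""
    return [
        (label, drho, drho / scf_thr, drho < scf_thr)
        for label, drho in residuals.items()
    ]


for label, drho, ratio, ok in residual_report(final_drho, SCF_THR):
    print(f"{label:<24} drho={drho:.4e}  ratio={ratio:8.1f}  converged={ok}")
```

All four runs end several hundred times above `scf_thr`, and the residual changes only slightly between the different runs.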
It seems to be related to charge mixing or smearing; it is not a diagonalization problem.
Convergence issues exist in many examples, so let's move this issue to the discussion panel.