HSE stdout bug in relax process

Open QuantumMisaka opened this issue 1 year ago • 2 comments

Describe the bug

When using HSE for a relax process with exx_separate_loop 1, the stdout printing in the 1st inner loop is broken starting from the 2nd ION step:

TOTAL-PRESSURE: -22.134299 KBAR

 ETOT DIFF (eV)       : -0.690660
 LARGEST GRAD (eV/A)  : 0.418169
 DONE(15236.752240 SEC) : SETUP UNITCELL
 -------------------------------------------
 STEP OF RELAXATION : 3
 -------------------------------------------
 DONE(15236.821788 SEC) : LOCAL POTENTIAL
 DONE(15236.822789 SEC) : INIT K-POINTS
 DONE(15282.524033 SEC) : INIT SCF
 * * * * * *
 << Start SCF iteration.
 ITER      TMAG       AMAG        ETOT/eV          EDIFF/eV         DRHO     TIME/s
 GE1      2.85e+01   3.22e+01  -3.92634289e+04   0.00000000e+00   3.2089e-02   7.49
 GE2      2.55e+01   2.83e+01  -3.92626842e+04   7.44742791e-01   1.8050e-02   7.32
 GE3      2.45e+01   2.70e+01  -3.92611591e+04   1.52505841e+00   4.5524e-02   7.19
 GE4      2.53e+01   2.79e+01  -3.92643996e+04  -3.24051411e+00   8.1621e-03   7.32
 GE5      2.55e+01   2.81e+01  -3.92645071e+04  -1.07476616e-01   4.2116e-03   7.27
 GE6      2.57e+01   2.83e+01  -3.92644568e+04   5.02833142e-02   5.0461e-03   7.18
 GE7      2.57e+01   2.84e+01  -3.92644786e+04  -2.17796188e-02   3.5864e-03   7.29
 GE8      2.57e+01   2.83e+01  -3.92645041e+04  -2.55047874e-02   1.0675e-03   7.37
 GE9      2.56e+01   2.83e+01  -3.92645043e+04  -2.12941920e-04   1.1540e-03   7.28
 GE10     2.56e+01   2.82e+01  -3.92645057e+04  -1.34526067e-03   3.6637e-04   7.19
 GE11     2.56e+01   2.82e+01  -3.92645058e+04  -1.25170671e-04   1.5479e-04   7.22
 GE12     2.56e+01   2.82e+01  -3.92645058e+04  -2.22409232e-05   2.0060e-04   7.20
 GE13     2.56e+01   2.82e+01  -3.92645059e+04  -6.63754970e-05   4.4398e-05   7.26
 GE14     2.56e+01   2.82e+01  -3.92645059e+04   1.24830804e-06   5.8425e-05   7.44
 GE15     2.56e+01   2.82e+01  -3.92645059e+04  -5.66183472e-06   1.5115e-05   7.18
 Updating EXX and rerun SCF     0x1.aa193d9663843p+6 (s)
 GE0      2.56e+01   2.82e+01  -3.92645059e+04   2.87974948e-07   1.4544e-05 114.03
 ITER      TMAG       AMAG        ETOT/eV          EDIFF/eV         DRHO     TIME/s
 GE1      2.74e+01   3.24e+01  -3.91939224e+04   0.00000000e+00   8.3791e-02   7.37
 GE2      3.11e+01   3.45e+01  -3.92427893e+04  -4.88669139e+01   9.5294e-02   7.63
 GE3      3.05e+01   3.43e+01  -3.92452452e+04  -2.45583307e+00   4.2979e-02   7.38
 GE4      3.03e+01   3.43e+01  -3.92439128e+04   1.33235149e+00   3.1752e-02   7.55
 GE5      3.00e+01   3.41e+01  -3.92435859e+04   3.26953111e-01   1.7840e-02   7.46
 GE6      2.97e+01   3.40e+01  -3.92435317e+04   5.42001162e-02   7.5518e-03   7.35
 GE7      2.95e+01   3.39e+01  -3.92436060e+04  -7.43102898e-02   5.0135e-04   7.50
 GE8      2.95e+01   3.39e+01  -3.92436057e+04   2.65337855e-04   4.7433e-04   7.44
 GE9      2.95e+01   3.39e+01  -3.92436063e+04  -5.57115713e-04   2.4872e-04   7.45
 GE10     2.95e+01   3.39e+01  -3.92436059e+04   3.10569442e-04   4.0696e-04   7.58
 GE11     2.95e+01   3.39e+01  -3.92436062e+04  -2.06215808e-04   6.8668e-05   7.34
 GE12     2.95e+01   3.39e+01  -3.92436062e+04  -7.14678288e-06   1.9158e-05   7.45
EDIFF/eV (outer loop): 2.08997417e+01 
 Updating EXX and rerun SCF     1.055e+02 (s)
 GE0      2.95e+01   3.39e+01  -3.92436062e+04   9.94677993e-07   9.6405e-06 113.02

Expected behavior

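The stdout should be printed in the usual format at every ION step, e.g. the elapsed time after "Updating EXX and rerun SCF" in decimal form such as 1.055e+02 (s).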

To Reproduce

No response

Environment

ABACUS version: Commit: 2037ae38e (Sat Aug 31 17:24:35 2024 +0800) LibRi version : 0.2.0 LibComm version : 0.1.1

Additional Context

No response

Task list for Issue attackers (only for developers)

  • [ ] Verify the issue is not a duplicate.
  • [ ] Describe the bug.
  • [ ] Steps to reproduce.
  • [ ] Expected behavior.
  • [ ] Error message.
  • [ ] Environment details.
  • [ ] Additional context.
  • [ ] Assign a priority level (low, medium, high, urgent).
  • [ ] Assign the issue to a team member.
  • [ ] Label the issue with relevant tags.
  • [ ] Identify possible related issues.
  • [ ] Create a unit test or automated test to reproduce the bug (if applicable).
  • [ ] Fix the bug.
  • [ ] Test the fix.
  • [ ] Update documentation (if necessary).
  • [ ] Close the issue and inform the reporter (if applicable).

QuantumMisaka avatar Sep 09 '24 04:09 QuantumMisaka

This problem seems to have emerged recently. @pxlxingliang Can we find out from the daily tests which commit introduced this strange bug?

PeizeLin avatar Sep 11 '24 08:09 PeizeLin

@QuantumMisaka Could you help check which commit introduced the bug?

WHUweiqingzhou avatar Sep 19 '24 07:09 WHUweiqingzhou

It seems that this problem has been solved, according to tests by me and @xuan112358.

maki49 avatar Nov 25 '24 12:11 maki49

@maki49 @xuan112358 Which versions of ABACUS and LibRI/LibComm did you test with? With my ABACUS 3.8.3 (8b048f4), LibRI 0.2.1.0, and LibComm 0.1.1, the problem still exists. Dependencies: Intel-OneAPI 2023.0, LibXC 6.2.2, elpa 2024.05.001

QuantumMisaka avatar Nov 26 '24 12:11 QuantumMisaka

@QuantumMisaka My newest test is with ABACUS v3.8.3 (93ac7e34d). LibRI/LibComm are downloaded automatically; they are v0.2.1.1 and v0.1.1 respectively. By the way, my test result with ABACUS v3.8.2 (c03c8796f) still seems right.

xuan112358 avatar Nov 27 '24 07:11 xuan112358

@QuantumMisaka Is this problem specific to particular examples? Can you upload your test files?

xuan112358 avatar Nov 27 '24 07:11 xuan112358

@xuan112358 Sure Fe-HSE-relax.tar.gz

I guess the problem may be specific to the nspin=2 case.

QuantumMisaka avatar Nov 27 '24 10:11 QuantumMisaka

@QuantumMisaka I got the same output problem with this example. Maybe you're right.

xuan112358 avatar Nov 28 '24 14:11 xuan112358

@QuantumMisaka It seems the buggy output is in hexadecimal floating-point form, which is not expected. As far as I know, when the std::fixed and std::scientific flags are both set on a stream at the same time, floating-point values are printed in hexadecimal form. I haven't found the specific parameter in this case that leads to std::fixed being set, but since std::scientific is already set when the output is written, simply adding std::defaultfloat fixes the problem, as I have tested. I will propose a PR.
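
For readers who want to see the effect in isolation, here is a minimal sketch in standard C++ (not ABACUS code; the helper names are hypothetical). It assumes the two flags ended up combined through something like std::setiosflags, which ORs bits into the stream without clearing the other floatfield bit. The std::fixed and std::scientific manipulators themselves do clear the other bit, so chaining those alone would not reproduce the problem. With both bits set, the stream behaves as if std::hexfloat were active, which matches the 0x1.aa193d9663843p+6 (roughly 106.5) printed in the log above.

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: format `value` with BOTH floatfield bits set.
// std::setiosflags only adds bits, so after these two insertions the
// stream has fixed | scientific, which the standard treats as hexfloat.
std::string format_with_both_flags(double value) {
    std::ostringstream out;
    out << std::setiosflags(std::ios_base::scientific);
    out << std::setiosflags(std::ios_base::fixed);
    out << value;
    return out.str();
}

// Hypothetical helper: same polluted stream state, but reset with
// std::defaultfloat (clears both bits) before asking for scientific.
std::string format_after_reset(double value) {
    std::ostringstream out;
    out << std::setiosflags(std::ios_base::scientific);
    out << std::setiosflags(std::ios_base::fixed);
    out << std::defaultfloat << std::scientific << value;
    return out.str();
}
```

Calling format_with_both_flags(106.5) yields a hexadecimal string beginning with 0x, while format_after_reset(106.5) yields ordinary scientific notation. The exact digits of the hexadecimal form depend on the standard library implementation.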

xuan112358 avatar Apr 12 '25 10:04 xuan112358