Microphysics update self-consistent NSE SDC update with Tabular NSE

address issue #1543 This depends on pynucastro/pynucastro#739

May 27 '24 21:05 zhichen3

Detonation with this pr: but it was a lot slower. Took ~400s runtime.

For comparison, I could run the same problem about ~100s with NSE before, and its about the same without NSE

May 28 '24 02:05 zhichen3

Detonation with this pr also gets statements like: burn entered NSE during integration (after 117 steps), zone = (227, 0, 0) recovering burn failure in NSE, zone = (227, 0, 0) throughout all zones behind the shock.

Probably due to higher energy release behind the shock compared to the version currently on development. This causes a steeper change in composition and puts it out of NSE at the beginning of the next step?

Previously, we would only get these for zones near the shock front and for other zones behind the shock its simply in NSE. It appears that it made it drastically slower.

May 28 '24 16:05 zhichen3

I realized that the current version on development is likely missing a factor of 1/dt in the definition of enuc_dot, but it seems to produce consistent results when I don't use NSE, which is weird.

Jun 05 '24 22:06 zhichen3

Update with some progress: With current version, I have:

 density                         3.575417e+19   1.942      9.303788e+18   1.980       2.35768e+18    1.601      7.773908e+17
 xmom                            1.804038e+28   2.065      4.311366e+27   1.983      1.090598e+27    1.902      2.917623e+26
 ymom                            1.803993e+28   2.065      4.311497e+27   1.983      1.090619e+27    1.902      2.917184e+26
 rho_E                           5.395202e+37   1.953      1.393596e+37   2.006      3.470112e+36    1.864      9.535919e+35
 rho_e                           5.323479e+37   1.953      1.375263e+37   2.006      3.424472e+36    1.864      9.404656e+35
 Temp                            5.891681e+20   1.996      1.477084e+20   2.013      3.658885e+19    1.169      1.627243e+19
 rho_He4                         2.839664e+18   1.852      7.863418e+17   1.921      2.076376e+17    1.393      7.903603e+16
 rho_Cr48                        1.238844e+18   2.007      3.081691e+17   1.740      9.226988e+16    0.532      6.379937e+16
 rho_Fe52                         1.43196e+19   1.892      3.857925e+18   1.750      1.146709e+18    0.820      6.496755e+17
 rho_Ni56                        3.878334e+20   2.038      9.443133e+19   2.077       2.23866e+19    2.139      5.083817e+18
 rho_enuc                         5.75844e+38   1.087      2.711178e+38   1.501      9.576835e+37    0.598      6.329164e+37

Not sure why convergence looks so bad when I included 512 resolution run.

And detonation looks like:

which is really fuzzy. Also encountered burn failures so I can't even run til the end.

Jun 12 '24 03:06 zhichen3

switched to evolving Ye using dyedt instead of updating Xn separately. Also added nse_T_abar_from_e to get consistent abar from nse_state. I was able to get reasonable results for detonation:

But its really slow now since I'm calling get_actual_nse_state iteratively for nse_T_abar_from_e. It takes forever to run 256 resolution and above for nse_test so I just gave up on that. Here is what convergence looks like using res 32, 64, and 128.

| density  | 3.571011e+19 | 1.939343 | 9.310881e+18 |
| xmom     | 1.806454e+28 | 2.067241 | 4.310478e+27 |
| ymom     | 1.806451e+28 | 2.067204 | 4.310582e+27 |
| rho_E    | 5.401591e+37 | 1.951982 | 1.396100e+37 |
| rho_e    | 5.329586e+37 | 1.951633 | 1.377823e+37 |
| Temp     | 5.864914e+20 | 1.992819 | 1.473545e+20 |
| rho_He4  | 2.824860e+18 | 1.849176 | 7.840419e+17 |
| rho_Cr48 | 1.227592e+18 | 2.011193 | 3.045263e+17 |
| rho_Fe52 | 1.437322e+19 | 1.875196 | 3.917995e+18 |
| rho_Ni56 | 3.875479e+20 | 2.037524 | 9.439947e+19 |
| enuc     | 5.533897e+29 | 1.088820 | 2.601738e+29 |

Jun 14 '24 15:06 zhichen3

GPROF from detonation:

|  time | seconds | seconds |      calls | name                             |
|-------+---------+---------+------------+----------------------------------|
| 56.74 |  400.61 |  400.61 |   92644127 | void nse_hybrid_solver           |
| 24.48 |  573.41 |  172.80 |   89336830 | burn_t get_actual_nse_state      |
|  3.28 |  596.56 |   23.15 | 1175465768 | Castro::fill_geom_source         |
|  2.13 |  611.60 |   15.04 |     120478 | void actual_integrator           |
|  1.84 |  624.62 |   13.02 |  103879485 | std::_Function_handler           |
|  1.74 |  636.88 |   12.26 |     485795 | nse_T_abar_from_e                |
|  1.67 |  648.69 |   11.81 |    7419933 | void evaluate_rates              |
|  1.34 |  658.15 |    9.46 |          _ | init                             |
|  1.27 |  667.09 |    8.94 |    3808349 | in_nse(burn_t&, bool)            |
|  0.88 |  673.32 |    6.23 |   32347252 | void apply_electrons             |
|  0.68 |  678.15 |    4.83 |  184819446 | ca_generic_fill                  |
|  0.62 |  682.50 |    4.35 |    9013649 | Castro::valid_zones_to_burn      |
|  0.42 |  685.45 |    2.95 |   12458993 | Castro::do_new_reactions(double, |
|  0.37 |  688.07 |    2.62 |    7466778 | void sneut5                      |
|  0.34 |  690.50 |    2.43 |    7419933 | void fill_reaclib_rates          |
|  0.27 |  692.44 |    1.94 |     541677 | void actual_jac                  |
|  0.20 |  693.87 |    1.43 |    7814137 | rhs_nuc                          |

Jun 14 '24 19:06 zhichen3

the convergence looks excellent

Jun 14 '24 23:06 zingale

updated using state.T as initial guess decreases runtime from ~1200s to ~550s for detonation

Jun 17 '24 16:06 zhichen3

awesome!

we should check throughout the code to make sure we are using accurate guesses.

Jun 17 '24 16:06 zingale

seems like I can just use a central difference instead of the 5-stencil for dabardT and still get second order convergence, might just stick with that

| density  | 3.574572e+19 | 1.953256 | 9.230715e+18 |
| xmom     | 1.802391e+28 | 2.059622 | 4.323554e+27 |
| ymom     | 1.802378e+28 | 2.059636 | 4.323481e+27 |
| rho_E    | 5.405299e+37 | 1.963402 | 1.386043e+37 |
| rho_e    | 5.333390e+37 | 1.963301 | 1.367700e+37 |
| Temp     | 5.856458e+20 | 1.977031 | 1.487611e+20 |
| rho_He4  | 2.829280e+18 | 1.841342 | 7.895450e+17 |
| rho_Cr48 | 1.222621e+18 | 1.990356 | 3.077052e+17 |
| rho_Fe52 | 1.426642e+19 | 1.881819 | 3.871069e+18 |
| rho_Ni56 | 3.876091e+20 | 2.036248 | 9.449790e+19 |
| enuc     | 5.467707e+29 | 1.063684 | 2.615799e+29 |

Jun 17 '24 19:06 zhichen3

Current state of this pr:

Its takes ~400s to run detonation with NSE and ~250s without NSE. (just run with USE_NSE_NET=TRUE and USE_SIMPLIFIED_SDC=TRUE with inputs-det-x.nse_net)

Profile for detonation, without NSE:

With NSE:

Jul 10 '24 20:07 zhichen3

previously only ran up to 128 res because its slow, but I decided to run 256 res to check the convergence, and for whatever reason, its not good...

 density                          3.57458e+19   1.954      9.227014e+18   1.783      2.681491e+18
 xmom                            1.802359e+28   2.060      4.323031e+27   2.001       1.07969e+27
 ymom                            1.802364e+28   2.060      4.322971e+27   2.001      1.079702e+27
 rho_E                           5.404931e+37   1.964      1.385427e+37   2.017      3.423357e+36
 rho_e                           5.333054e+37   1.964      1.367084e+37   2.018      3.375406e+36
 Temp                            5.856433e+20   1.990       1.47429e+20   1.623       4.78547e+19
 rho_He4                         2.828525e+18   1.844      7.876808e+17   1.470      2.843232e+17
 rho_Cr48                        1.222316e+18   2.013      3.027591e+17   0.713       1.84743e+17
 rho_Fe52                        1.426529e+19   1.904      3.812699e+18   1.250      1.602976e+18
 rho_Ni56                        3.876178e+20   2.037       9.44657e+19   2.083      2.229842e+19
 rho_enuc                        5.431598e+38   1.071      2.585564e+38   0.988      1.303703e+38

By decreasing the tolerance for temperature convergence from 1.e-6 to 1.e-7, I get:

 density                         3.576225e+19   1.947      9.278317e+18   2.028      2.274303e+18    1.999      5.690442e+17
 xmom                             1.80319e+28   2.045      4.370978e+27   2.013      1.083208e+27    2.011       2.68692e+26
 ymom                            1.803184e+28   2.045      4.371014e+27   2.013       1.08321e+27    2.011      2.686647e+26
 rho_E                           5.406496e+37   1.954      1.395561e+37   2.025      3.429176e+36    2.012      8.499385e+35
 rho_e                           5.334729e+37   1.954      1.377077e+37   2.026      3.381836e+36    2.012      8.384062e+35
 Temp                             5.85942e+20   2.000      1.464391e+20   2.109      3.395133e+19    2.024       8.34525e+18
 rho_He4                         2.828242e+18   1.860       7.78868e+17   1.991      1.958958e+17    2.083      4.623912e+16
 rho_Cr48                        1.225026e+18   2.039      2.981315e+17   2.145      6.738324e+16    1.655      2.139777e+16
 rho_Fe52                        1.429159e+19   1.901      3.826192e+18   1.997      9.587731e+17    1.754      2.843321e+17
 rho_Ni56                        3.876275e+20   2.038       9.44107e+19   2.084      2.226414e+19    2.208      4.820328e+18
 rho_enuc                        5.419401e+38   1.082      2.559245e+38   1.708      7.832314e+37    0.817      4.446305e+37

With ttol=1.e-8, I get

 density                         3.576018e+19   1.948      9.268791e+18   2.033      2.265095e+18    2.042      5.498526e+17
 xmom                            1.803014e+28   2.046      4.365948e+27   2.002      1.090134e+27    1.988       2.74821e+26
 ymom                            1.803014e+28   2.046      4.365947e+27   2.002      1.090141e+27    1.988      2.748503e+26
 rho_E                           5.406225e+37   1.955      1.394656e+37   2.021      3.436645e+36    1.999        8.5966e+35
 rho_e                           5.334454e+37   1.955      1.376181e+37   2.022      3.389202e+36    1.999      8.479188e+35
 Temp                            5.859102e+20   2.000      1.464346e+20   2.131      3.344017e+19    2.278      6.896348e+18
 rho_He4                         2.828401e+18   1.859      7.794693e+17   2.014      1.930269e+17    2.227      4.121872e+16
 rho_Cr48                        1.224792e+18   2.041       2.97577e+17   2.214      6.413942e+16    2.242       1.35617e+16
 rho_Fe52                         1.42888e+19   1.903      3.820818e+18   1.999      9.559052e+17    2.076      2.267084e+17
 rho_Ni56                        3.876303e+20   2.038      9.440292e+19   2.086      2.223161e+19    2.228       4.74419e+18
 rho_enuc                        5.418569e+38   1.083       2.55734e+38   1.692      7.914355e+37    0.929      4.156051e+37

After using ttol=1.e-8, I would get abs(H)=HMIN failure for detonation, then I tried switching to 5-point stencil for constructing the derivative dabardT, then it goes away.

Then the new detonation profile is:

Jul 13 '24 21:07 zhichen3

I did a comparison to the tabular case and this seems to follow the same flow.

Jul 14 '24 14:07 zingale