Make the usage of feature scaling + large x a bit clearer
Closes #2377
This PR deals with the clarity issues pointed out by #2377
- Allows the user to drop completely large-x from the runcard
- If feature scaling is enabled, the iterated runcard will have nothing for large x (instead of having something that is then ignored completely by the fit)
- Changes
interpolation_pointstofeature_scaling_pointsso that it is clear when it is set toFalseor not set at all.
@kamillaurent @juanrojochacon let me know whether this is what you had in mind. This changes nothing on how the fit is run, right now the fit will just drop silently the large-x if feature scaling is set, now instead if you have large-x set it will refuse to run until you remove it.
Looking at the PR description yes, this is what we need. @kamillaurent can take a look and then try to run a fit without and with FS, everything else kept the same, to check that things work properly.
I think it is useful that we can keep the two options for PDF parametrisation: for NNPDF4.1, probably we want to show in the paper the fit variants where everything is kept the same except for FS on or off
@kamillaurent have you looked into this? Otherwise I'll rather close it and keep the current conventions, because the more this drags the most things are going to be broken and the more work it is for me to keep it. This should be something quick and trivial to review.
Hi @scarlehoff , I am looking at it now. Sorry for the late reply, I will get back to you as soon as I finish
@scarlehoff I tried to run the runcard in the branch make_feature_scaling_clearer. I then want to compare the fit to our baseline, 251124-jcm-nnpdf41-001, but I obtain this error by validphys:
File "/data/theorie/klaurent/nnpdf/validphys2/src/validphys/eff_exponents.py", line 335, in previous_effective_exponents_table
prev_b_bounds = [runcard_fl['largex'] for runcard_fl in fitting["basis"]]
~~~~~~~~~~^^^^^^^^^^
KeyError: 'largex'
Hi @kamillaurent, by design I wanted to remove large-x from as many places as possible
now instead if you have large-x set it will refuse to run until you remove it.
and I would prefer if we do not mix both types of fits but I guess that ship has sailed anyway so the commit below puts the beta to NaN for the report if the fit has feature scaling on.
Thanks for clarifying, the same error message appears also if I try to compare the fit done with feature scaling and the same fit iterated, I try now to do with the new commit:
- a comparison between the feature scaling fit and
251124-jcm-nnpdf41-001 - a comparison between the feature scaling fit and his iterated version\
Let me know if you have other checks to suggest
the same error message appears also if I try to compare the fit done with feature scaling and the same fit iterated,
That should not have happened instead!
Let me know if you have other checks to suggest
The important one is actually that you iterated the fit (so largex was dropped) and were able to run (so it was, indeed, dropped).
But this PR was also about making the feature scaling part clearer so that's the important part. If you think it is indeed clearer, then good, otherwise (if the previous version seem clearer) better to roll it back.
The important one is actually that you iterated the fit (so largex was dropped) and were able to run (so it was, indeed, dropped).
Yes, the fit (and comparisons) now work using runcards without largex (both iterated fit and normal one), here are the reports:
- Test of a fit with this branch compared to baseline: https://vp.nnpdf.science/GiPUMdXHSwu8gjCeUFSjPQ==/
- Test of iterated fit: https://vp.nnpdf.science/ECbdZZvTSaKnnHF_D5LSWg==/
But this PR was also about making the feature scaling part clearer
I think it is clearer, because you specify in the runcard feature_scaling_points instead of interpolation_points
Thanks! Then on Monda/Tuesday I'll merge both this and @jekoorn's PR so that the Christmas hyperopt is run already with this change.
Thank you very much for your work!