
Re-implementation of some of the ATLAS Collider DY Datasets

Open Radonirinaunimi opened this issue 2 years ago • 23 comments

The following PR implements some of the ATLAS collider DY datasets in the new commondata format. The status is summarized in the table below.

☑️ in "Comparison vs. Old" means that the results are fully identical, while 🔴 means that comparisons are available but noticeable differences remain.

| Dataset Name | Comparison vs. Old | General Comments | Status |
| --- | --- | --- | --- |
| ATLAS_DY_7TEV_EMUON_Y | ☑️ | The old implementation used a luminosity uncertainty of 3.5% while in HepData it is 3.4% | ☑️ |
| ATLAS_DY_7TEV_DILEPTON_Y-CRAP | ☑️ | The new implementation (from HepData) is missing one source of uncorrelated systematics | ☑️ |
| ATLAS_DY_7TEV_DILEPTON_Y-FRAP | ☑️ | The new implementation (from HepData) is missing one source of uncorrelated systematics | ☑️ |
| ATLAS_WPWM_8TEV_MUON_Y | | FK tables are missing but old commondata exists | ☑️ |
| ATLAS_Z0_8TEV_LOWMASS_2D | ☑️ | | ☑️ |
| ATLAS_Z0_8TEV_HIGHMASS_2D | 🔴 | Slight differences in the treatment of asymmetric correlated systematic uncertainties | ☑️ |
| ATLAS_Z0_8TEV_3D_CRAP | | Could not find dataset and FK tables to compare the implementation to | ☑️ |
| ATLAS_Z0_8TEV_3D_FRAP | | Could not find dataset and FK tables to compare the implementation to | ☑️ |
| ATLAS_DY_13TEV_FID | ☑️ | Just needs the plotting fixed to be as a function of the gauge bosons | ☑️ |
| ATLAS_Z0_8TEV_20FB_PT-INVDIST | | | ☑️ |
| ATLAS_Z0_8TEV_20FB_PT-RAPDIST | | | ☑️ |

Remains TODO:

  • [ ] Define the plotting entries to be exactly the same as before

Radonirinaunimi avatar Nov 26 '23 21:11 Radonirinaunimi

@scarlehoff, is something maybe wrong with the loading when the treatment of the systematics is set to MULT?

Comparing the old:

In [25]: ds = API.dataset(dataset_input={"dataset": "ATLASWZRAP11CC", "cfac": ["QCD"]}, theoryid=600, use_cuts="internal")

In [26]: ds.load_commondata().systematics_table
Out[26]:
               ADD      MULT         ADD   MULT         ADD   MULT         ADD   MULT        ADD   MULT  ...         ADD   MULT         ADD   MULT         ADD   MULT          ADD  MULT          ADD      MULT
entry                                                                                                    ...
1       760.914560  0.131840 -126.973000 -0.022    5.771500  0.001    5.771500  0.001   0.000000  0.000  ...   23.086000  0.004   23.086000  0.004  409.776500  0.071  10388.70000   1.8   715.666000  0.124000
2       845.576046  0.146580  -92.299200 -0.016   11.537400  0.002   11.537400  0.002   5.768700  0.001  ...   40.380900  0.007  -63.455700 -0.011  184.598400  0.032  10383.66000   1.8   663.400500  0.115000
3       719.275700  0.123640 -215.247500 -0.037   23.270000  0.004   17.452500  0.003  -5.817500 -0.001  ...   -5.817500 -0.001  319.962500  0.055  459.582500  0.079  10471.50000   1.8   605.020000  0.104000
4       673.101395  0.114850  -41.024900 -0.007   52.746300  0.009   23.442800  0.004   5.860700  0.001  ...   35.164200  0.006   46.885600  0.008  298.895700  0.051  10549.26000   1.8   750.169600  0.128000
5       847.481382  0.144540   23.453200  0.004   76.222900  0.013   29.316500  0.005   5.863300  0.001  ...  205.215500  0.035 -134.855900 -0.023 -222.805400 -0.038  10553.94000   1.8   738.775800  0.126000
6       766.929414  0.128020  113.823300  0.019  137.786100  0.023   53.916300  0.009   5.990700  0.001  ...  -41.934900 -0.007  281.562900  0.047 -299.535000 -0.050  10783.26000   1.8   623.032800  0.104000
7      1951.014450  0.326940 -143.220000 -0.024  310.310000  0.052  113.382500  0.019 -53.707500 -0.009  ...   83.545000  0.014   95.480000  0.016  -41.772500 -0.007  10741.50000   1.8   853.352500  0.143000
8       784.152243  0.129790  -36.250200 -0.006   84.583800  0.014   24.166800  0.004  -6.041700 -0.001  ...  132.917400  0.022   36.250200  0.006  -48.333600 -0.008  10875.06000   1.8   815.629500  0.135000
9      1071.656301  0.176570   -6.069300 -0.001   66.762300  0.011   18.207900  0.003   6.069300  0.001  ...  121.386000  0.020  182.079000  0.030   78.900900  0.013  10924.74000   1.8  1019.642400  0.168000
10      854.792700  0.144050  -23.736000 -0.004   35.604000  0.006    0.000000  0.000  -5.934000 -0.001  ...  124.614000  0.021  -29.670000 -0.005  183.954000  0.031  10681.20000   1.8   884.166000  0.149000
...

with the new implementation:

In [13]: ds_new = API.dataset(dataset_input={"dataset": "ATLAS_DY_7TEV_DILEPTON_Y", "cfac": ["QCD"]}, theoryid=600, use_cuts="internal")

In [14]: ds_new.load_commondata().systematics_table
Out[14]:
               MULT          MULT          MULT          MULT          MULT          MULT          MULT      MULT  ...          MULT          MULT      MULT          MULT          MULT      MULT      MULT      MULT
entry                                                                                                              ...
1     -3.811834e-06  1.732652e-07  1.732652e-07  0.000000e+00 -1.732652e-07  8.663259e-07 -2.079182e-06  0.000025  ...  6.930607e-07 -3.465304e-07 -0.000003  6.930607e-07  6.930607e-07  0.000012  0.000023  0.000312
2     -2.773589e-06  3.466986e-07  3.466986e-07  1.733493e-07 -0.000000e+00  8.667464e-07 -2.080191e-06  0.000020  ...  1.386794e-06  0.000000e+00 -0.000005  1.213445e-06 -1.906842e-06  0.000006  0.000026  0.000312
3     -6.360120e-06  6.875806e-07  5.156854e-07 -1.718951e-07  3.437903e-07  2.234637e-06 -3.781693e-06  0.000023  ...  5.328749e-06  1.718951e-07  0.000005 -1.718951e-07  9.454233e-06  0.000014  0.000021  0.000309
4     -1.194397e-06  1.535653e-06  6.825123e-07  1.706281e-07  1.706281e-07  1.706281e-06 -2.559421e-06  0.000022  ...  4.095074e-06  3.412562e-07  0.000010  1.023768e-06  1.365025e-06  0.000009  0.000019  0.000307
5      6.822097e-07  2.217181e-06  8.527621e-07  1.705524e-07  1.705524e-07  2.046629e-06 -2.728839e-06  0.000026  ...  1.364419e-05  3.411048e-07 -0.000016  5.969335e-06 -3.922706e-06 -0.000006  0.000024  0.000307
6      3.171583e-06  3.839284e-06  1.502329e-06  1.669254e-07 -1.669254e-07  3.839284e-06 -3.505433e-06  0.000020  ...  4.506986e-06  8.346270e-07  0.000018 -1.168478e-06  7.845494e-06 -0.000008  0.000022  0.000300
7     -4.021785e-06  8.713867e-06  3.183913e-06 -1.508169e-06 -8.378718e-07  1.139506e-05 -9.719313e-06  0.000021  ...  5.194805e-06  1.005446e-06  0.000009  2.346041e-06  2.681190e-06 -0.000001  0.000055  0.000302
8     -9.930980e-07  2.317229e-06  6.620653e-07 -1.655163e-07 -3.310327e-07  2.979294e-06 -2.648261e-06  0.000021  ...  3.475843e-06  9.930980e-07  0.000005  3.641359e-06  9.930980e-07 -0.000001  0.000022  0.000298
9     -1.647636e-07  1.812400e-06  4.942909e-07  1.647636e-07  3.295273e-07  1.647636e-06 -1.482873e-06  0.000021  ...  6.425782e-06  1.153346e-06  0.000010  3.295273e-06  4.942909e-06  0.000002  0.000030  0.000297
10    -6.740816e-07  1.011122e-06  0.000000e+00 -1.685204e-07 -5.055612e-07  8.426020e-07 -8.426020e-07  0.000022  ...  1.196495e-05  1.516684e-06  0.000013  3.538928e-06 -8.426020e-07  0.000005  0.000024  0.000303

while you can see that the dumped values are exactly the same (modulo the first ADD and MULT column in the old):

In [15]: ds_new.load_commondata().systematic_errors()
Out[15]:
       ATLASWZRAP11_1001  ATLASWZRAP11_1002  ATLASWZRAP11_1003  ATLASWZRAP11_1004  ATLASWZRAP11_1005  ...  ATLASWZRAP11_1128  ATLASWZRAP11_1129  ATLASWZRAP11_1130  UNCORR  ATLASLUMI11
entry                                                                                                 ...
1                 -0.022              0.001              0.001              0.000             -0.001  ...              0.004              0.004              0.071    0.13          1.8
2                 -0.016              0.002              0.002              0.001             -0.000  ...              0.007             -0.011              0.032    0.15          1.8
3                 -0.037              0.004              0.003             -0.001              0.002  ...             -0.001              0.055              0.079    0.12          1.8
4                 -0.007              0.009              0.004              0.001              0.001  ...              0.006              0.008              0.051    0.11          1.8
5                  0.004              0.013              0.005              0.001              0.001  ...              0.035             -0.023             -0.038    0.14          1.8
6                  0.019              0.023              0.009              0.001             -0.001  ...             -0.007              0.047             -0.050    0.13          1.8
7                 -0.024              0.052              0.019             -0.009             -0.005  ...              0.014              0.016             -0.007    0.33          1.8
8                 -0.006              0.014              0.004             -0.001             -0.002  ...              0.022              0.006             -0.008    0.13          1.8
9                 -0.001              0.011              0.003              0.001              0.002  ...              0.020              0.030              0.013    0.18          1.8
10                -0.004              0.006              0.000             -0.001             -0.003  ...              0.021             -0.005              0.031    0.14          1.8
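A quick way to quantify a suspected scale mismatch between two such tables is a row-wise ratio of magnitudes. This is only a sketch (not part of the original comparison), with `old_df` and `new_df` standing for the two `systematics_table` dataframes shown above:

```python
import numpy as np
import pandas as pd

def scale_ratio(old_df: pd.DataFrame, new_df: pd.DataFrame) -> pd.Series:
    """Row-wise median of |old| / |new|, exposing any overall scale
    factor between two implementations of the same systematics table.
    Entries with a zero denominator are ignored."""
    old = old_df.abs().to_numpy(dtype=float)
    new = new_df.abs().to_numpy(dtype=float)
    # divide only where the denominator is non-zero; NaN elsewhere
    ratio = np.divide(old, new, out=np.full_like(old, np.nan), where=new > 0)
    return pd.Series(np.nanmedian(ratio, axis=1), index=old_df.index)
```

A ratio that is roughly constant across rows (rather than ~1) would point to a units/representation problem rather than a genuinely different implementation.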

Radonirinaunimi avatar Nov 26 '23 21:11 Radonirinaunimi

It does look wrong, especially since there's nothing that would justify a 10^-7, is there? The data is all > 1 so it cannot be an ADD vs MULT problem, let me have a look.

scarlehoff avatar Nov 27 '23 07:11 scarlehoff

> It does look wrong, especially since there's nothing that would justify a 10^-7, is there? The data is all > 1 so it cannot be an ADD vs MULT problem, let me have a look.

By just converting the percentage (MULT) into the absolute value (ADD), that is, representing the systematics as additive instead, the entries are exactly the same (omitting the first column of ds).

In [3]: ds_new = API.dataset(dataset_input={"dataset": "ATLAS_DY_7TEV_DILEPTON_Y", "cfac": ["QCD"]}, theoryid=600, use_cuts="internal")

In [4]: ds_new.load_commondata().systematics_table
Out[4]:
              ADD         ADD         ADD        ADD        ADD         ADD         ADD  ...        ADD         ADD         ADD        ADD         ADD         ADD         ADD
entry                                                                                    ...
1     -126.973000    5.771500    5.771500   0.000000  -5.771500   28.857500  -69.258000  ... -11.543000  -86.572500   23.086000   23.08600  409.776500   750.29500  10388.7000
2      -92.299200   11.537400   11.537400   5.768700  -0.000000   28.843500  -69.224400  ...   0.000000 -155.754900   40.380900  -63.45570  184.598400   865.30500  10383.6600
3     -215.247500   23.270000   17.452500  -5.817500  11.635000   75.627500 -127.985000  ...   5.817500  168.707500   -5.817500  319.96250  459.582500   698.10000  10471.5000
4      -41.024900   52.746300   23.442800   5.860700   5.860700   58.607000  -87.910500  ...  11.721400  357.502700   35.164200   46.88560  298.895700   644.67700  10549.2600
5       23.453200   76.222900   29.316500   5.863300   5.863300   70.359600  -93.812800  ...  11.726600 -551.150200  205.215500 -134.85590 -222.805400   820.86200  10553.9400
6      113.823300  137.786100   53.916300   5.990700  -5.990700  137.786100 -125.804700  ...  29.953500  641.004900  -41.934900  281.56290 -299.535000   778.79100  10783.2600
7     -143.220000  310.310000  113.382500 -53.707500 -29.837500  405.790000 -346.115000  ...  35.805000  316.277500   83.545000   95.48000  -41.772500  1969.27500  10741.5000
8      -36.250200   84.583800   24.166800  -6.041700 -12.083400  108.750600  -96.667200  ...  36.250200  181.251000  132.917400   36.25020  -48.333600   785.42100  10875.0600
9       -6.069300   66.762300   18.207900   6.069300  12.138600   60.693000  -54.623700  ...  42.485100  382.365900  121.386000  182.07900   78.900900  1092.47400  10924.7400
10     -23.736000   35.604000    0.000000  -5.934000 -17.802000   29.670000  -29.670000  ...  53.406000  445.050000  124.614000  -29.67000  183.954000   830.76000  10681.2000

So I think it really is a difference in how the systematics are represented (unless I am doing something stupid here).
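For reference, the conversion applied here is nothing more than scaling the percentage by the central value. A minimal sketch (the function name is illustrative, not the actual parser code):

```python
def mult_to_add(percent_error: float, central_value: float) -> float:
    """Convert a multiplicative (percentage) systematic into the
    equivalent absolute (additive) one:
        sigma_add = sigma_mult[%] * central_value / 100
    """
    return percent_error * central_value / 100.0

# e.g. the 1.8% luminosity entry paired with ADD 10388.7 in row 1 of
# the old table implies a central value of 10388.7 / 0.018 = 577150
mult_to_add(1.8, 577150.0)  # 10388.7
```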

Radonirinaunimi avatar Nov 27 '23 09:11 Radonirinaunimi

The values returned by the systematic_errors method are all absolute.

I think the difference might be that you are implementing the multiplicative uncertainties as % or relative, while in the new commondata format they should always be implemented as absolute.

https://github.com/NNPDF/nnpdf/pull/1679#issuecomment-1490394694

I thought that we had added this to the documentation but it seems we didn't. Let me update it!
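To make the convention concrete: the numbers stored in the uncertainty file are absolute, and the `treatment` key only records how the source enters the covariance matrix. The layout below is purely schematic (field names from memory and may differ from the actual format):

```yaml
definitions:
  sys_corr_1:
    description: Correlated systematic taken from HepData
    treatment: MULT   # enters the covmat multiplicatively...
    type: CORR
bins:
- sys_corr_1: 23.086  # ...but the stored value is absolute, not a %
```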

scarlehoff avatar Nov 27 '23 14:11 scarlehoff

> The values returned by the systematic_errors method are all absolute.

This definitely explains why using absolute $\oplus$ ADD works.

> I think the difference might be that you are implementing the multiplicative uncertainties as % or relative, while in the new commondata format they should always be implemented as absolute.
>
> #1679 (comment)
>
> I thought that we had added this to the documentation but it seems we didn't. Let me update it!

Is this actually correct (I don't think so!)? If the values are quoted as absolute then their treatment has to be ADD, and conversely, if the values are quoted as percentages then their treatment has to be MULT. I don't think one can have absolute values treated as MULT, or percentages treated as ADD.

Radonirinaunimi avatar Nov 27 '23 15:11 Radonirinaunimi

Regardless of how they are given in HepData (they could tell you it's a relative value but give you a table with the absolute values), you can convert them to absolute.

I honestly don't remember why we went for everything absolute, I guess it is more consistent this way.

scarlehoff avatar Nov 27 '23 15:11 scarlehoff

> Regardless of how they are given in HepData (they could tell you it's a relative value but give you a table with the absolute values), you can convert them to absolute.
>
> I honestly don't remember why we went for everything absolute, I guess it is more consistent this way.

Right. I just want to emphasize that if everything is now given as absolute, then only the treatment ADD is allowed (not MULT).

Re everything absolute, we might want to keep in mind the following sentence from the docs:

> While it may seem at first that the multiplicative error is spurious given the presence of the additive error and data central value, this may not be the case. For example, in a closure test scenario, the data central values may have been replaced in the CommonData file by theoretical predictions. Therefore if you wish to use a covariance matrix generated with the original multiplicative uncertainties via the method, you must also store the original multiplicative (percentage) error. For flexibility and ease of I/O this is therefore done in the CommonData file itself.

Radonirinaunimi avatar Nov 27 '23 15:11 Radonirinaunimi

> I just want to emphasize that if everything is now given as absolute, then only the treatment ADD is allowed (not MULT)

Why? The first thing the parser does is make it relative to the central values. The way it is written in the actual file doesn't really matter that much.

(that said... it makes it unreliable in closure tests? we need @enocera here!)

scarlehoff avatar Nov 27 '23 15:11 scarlehoff

> Why? The first thing the parser does is make it relative to the central values. The way it is written in the actual file doesn't really matter that much.

But such an extra operation is not needed at all if everything is defined as Absolute $\oplus$ ADD. At the end of the day (modulo the CT business), the ADD and MULT treatments (and their representations) carry exactly the same information.

Radonirinaunimi avatar Nov 28 '23 09:11 Radonirinaunimi

Not for the t0 covmat.
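The point here is that for the t0 covariance matrix, multiplicative uncertainties are rescaled by the t0 predictions rather than by the experimental central values, so ADD and MULT stop being interchangeable. A toy sketch of the diagonal under that assumption (one additive and one multiplicative source; this is not the validphys implementation):

```python
import numpy as np

def diag_covmat(add_err, mult_percent, data, t0_pred=None):
    """Diagonal covariance from one additive and one multiplicative
    (percentage) uncorrelated source. The MULT source is rescaled by
    the data, or by the t0 predictions when these are given."""
    ref = np.asarray(data, float) if t0_pred is None else np.asarray(t0_pred, float)
    mult_abs = np.asarray(mult_percent, float) / 100.0 * ref
    return np.diag(np.asarray(add_err, float) ** 2 + mult_abs**2)

# With t0 predictions that differ from the data, the same numbers
# give different covmats depending on the treatment:
data, t0 = np.array([100.0]), np.array([90.0])
diag_covmat(np.array([5.0]), np.array([5.0]), data)      # [[50.]]
diag_covmat(np.array([5.0]), np.array([5.0]), data, t0)  # [[45.25]]
```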

scarlehoff avatar Nov 28 '23 09:11 scarlehoff

@scarlehoff, @enocera, this is also ready for review. Here is the report: https://vp.nnpdf.science/kuWT56KBSlai3_-XqLZxeA==/

For some datasets, I couldn't find the commondata and/or FK tables to compare to.

Radonirinaunimi avatar Dec 04 '23 08:12 Radonirinaunimi

The ones you didn't find the commondata for are the ones that have no corresponding old dataset, right?

scarlehoff avatar Dec 04 '23 08:12 scarlehoff

I understand that these are the 3D ATLAS distributions, of which we implemented only the 2D version.

enocera avatar Dec 04 '23 10:12 enocera

So let's forget about the 3D distributions, for the moment.

enocera avatar Dec 04 '23 10:12 enocera

Ok! Thanks. First comments, then I'll start going through all the old-new datasets one by one:

What about these ones? Did you forget about them, are they part of another set, or maybe they have a different name in your list?

  • ATLASZHIGHMASS49FB
  • ATLASLOMASSDY11EXT
  • ATLASWZRAP11CF (I see you do have the CC version so this might actually be forgotten!)
  • ATLAS_WZ_TOT_13TEV (maybe this one is the one you call ATLASWZTOT13TEV81PB??)

And these four I think I already asked you about, so I know you were not taking care of them, but just for completeness:

  • ATLAS_WP_JET_8TEV_PT
  • ATLAS_WM_JET_8TEV_PT
  • ATLASZPT8TEVMDIST
  • ATLASZPT8TEVYDIST

scarlehoff avatar Dec 04 '23 11:12 scarlehoff

So the status is then the following:

  • ATLASZHIGHMASS49FB, ATLASLOMASSDY11EXT: these datasets I haven't touched on purpose because, as far as I understood, @cschwan was/has been looking into them (?).
  • ATLASWZRAP11CF, ATLASZPT8TEVMDIST, ATLASZPT8TEVYDIST: I genuinely missed these datasets. I will implement them in this PR.
  • ~~ATLAS_WZ_TOT_13TEV: this is indeed an updated version of ATLASWZTOT13TEV81PB (I implemented the outdated one), in that the correct one should include the experimental correlation coefficients. I will fix the currently implemented one.~~ This is now done.
  • As for the _JET_ ones: if no one is looking into them yet, I can also implement them in this PR.

All in all, still a few to be done before this PR is complete :sweat_smile:

Radonirinaunimi avatar Dec 04 '23 15:12 Radonirinaunimi

I've updated the parser so that it automatically repeats a column if one is missing.

Here's the report for the one with the weird plot_x option: https://vp.nnpdf.science/fkjMDKmrSC6XpBXv7-mhHA==
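I couldn't check the exact parser change, but "repeat a column if one is missing" might look roughly like the following toy pandas sketch (function and column names are hypothetical):

```python
import pandas as pd

def repeat_missing_columns(df: pd.DataFrame, expected, fallback: str) -> pd.DataFrame:
    """Toy sketch: for every expected column absent from df, duplicate
    the values of an existing fallback column under the missing name."""
    out = df.copy()
    for name in expected:
        if name not in out.columns:
            out[name] = out[fallback]
    return out
```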

I've repeated the tests and now I have:

old: ATLASWZRAP36PB vs new: ATLAS_DY_7TEV_EMUON_Y
 > Differences in the computation of chi2: 32.119931215298024 vs 32.21855155561812
    The covmats are different
    even the diagonal

old: ATLAS_DY_2D_8TEV_LOWMASS vs new: ATLAS_Z0_8TEV_LOWMASS_2D
 > Everything ok

old: ATLAS_WZ_TOT_13TEV vs new: ATLAS_DY_13TEV_FID
 > The t0 chi2 is different: 10934.218358793676 vs 81138.4387008005

old: ATLASDY2D8TEV vs new: ATLAS_Z0_8TEV_HIGHMASS_2D
 > % difference in the data
 > Differences in the computation of chi2: 80.2445870631963 vs 76.30021056788038
    The covmats are different
    even the diagonal

In the last one I've noticed that the data itself differs at the level of a few per mille, which could be driving the difference (since a difference in the data will also modify the covmat through the multiplicative uncertainties).
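This propagation is easy to see for a single fully correlated multiplicative source, whose covariance elements are proportional to the product of central values; a toy illustration (not the actual covmat construction):

```python
import numpy as np

def mult_covmat(percent, data):
    """Covariance from a single fully correlated multiplicative source:
    C_ij = (s% * d_i / 100) * (s% * d_j / 100)."""
    sigma = np.asarray(percent, float) / 100.0 * np.asarray(data, float)
    return np.outer(sigma, sigma)

c_old = mult_covmat(2.0, [100.0, 200.0])
c_new = mult_covmat(2.0, [100.2, 200.4])  # data shifted by 2 per mille
# each covmat element moves by ~4 per mille: 1.002**2 - 1 ≈ 0.004
```

So a few-per-mille shift in the data shows up at roughly twice that level in the covmat, enough to move the chi2 as seen above.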

For the one where only the t0 is very different (but nothing else), I guess the MULT and ADD uncertainties are wrong? Or I've done something else wrong...

As usual, thanks a lot for the detailed checks! For the one with the different t0, I am a bit surprised that this is the case; I thought I had checked that the treatment of the systematics was the same as before. I will check again.

As for the rest, the differences are well understood. But before implementing the legacy versions, maybe I am just missing something from the new HepData (?), for which we'd need @enocera.

PS: I will also check the boson plotting now.

Radonirinaunimi avatar Dec 08 '23 09:12 Radonirinaunimi

Thanks @Radonirinaunimi, your last commit fixes the t0 issue.

scarlehoff avatar Dec 08 '23 12:12 scarlehoff

This is also now ready for review.

All of the datasets (except one) have been implemented in the same way as in the old commondata (for legacy purposes), and comments are left in the table above describing what I've found to be different wrt HepData. Nevertheless, the numerical values of the correlated systematics (and sometimes even the central values) are not always exactly equal, because the values quoted in the HepData tables can be slightly different from the rawdata used in the old commondata.

PS: only the ATLAS_Z0_8TEV_20FB_PT-* datasets raise some weird indexing errors when computing data vs theory comparisons, even though the data can be loaded properly and the entries of the tables are exactly the same.

Radonirinaunimi avatar Jan 09 '24 12:01 Radonirinaunimi

When the results are different you can implement the HepData one and then a legacy variant with the different version (that is compatible with the old one). This is preferred.

Btw, did you check that when loading the entire set of datasets the associated covariance matrix is the same as the old (same for the datasets in the other PRs)?

scarlehoff avatar Jan 09 '24 12:01 scarlehoff

> When the results are different you can implement the HepData one and then a legacy variant with the different version (that is compatible with the old one). This is preferred.

The issue I am struggling with at the moment is that I am not sure whether it makes sense to have legacy versions for some particular datasets or not. And this is really one of the things we should discuss (cc @enocera). Let me provide two explicit examples:

  • Take CMS_WP_7TEV_MUON_ASY for example: when one downloads the full thing from HepData there are two different types of files, the usual HepData table (as shown on the HepData interface) and the rawdata (usually in txt or dat format, not following any convention/structure). In most of the old implementations, the rawdata were used. However, the numerical values in the two are not always the same, and thus the covariance matrices slightly differ. If we resort to always using the rawdata, then some of the entries in the metadata (such as tables) will be deprecated.
  • Then, there are cases in which some conscious decisions may have been made (?), such as the example of ATLAS_DY_7TEV_EMUON_Y. In the paper, it is mentioned that the luminosity uncertainty is about $3.5$% (and this was the value used in the old implementation), but in the HepData entries the value is $3.4$%.

> Btw, did you check that when loading the entire set of datasets the associated covariance matrix is the same as the old (same for the datasets in the other PRs)?

Yes, for the datasets listed here that have a checkmark in the "Comparison vs. Old" column. For some of the CMS datasets in https://github.com/NNPDF/nnpdf/pull/1869, it is a bit trickier because of the numerical differences mentioned in the first point, as I tried to use the HepData files as much as possible instead.
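For reference, that kind of check can be as simple as an element-wise comparison of the two matrices (a sketch; `covmat_old` and `covmat_new` stand for whatever arrays one builds from the old and new implementations):

```python
import numpy as np

def covmats_agree(c_old, c_new, rtol=1e-5):
    """Return (agree, worst) where worst is the largest element-wise
    relative deviation between two covariance matrices."""
    c_old = np.asarray(c_old, float)
    c_new = np.asarray(c_new, float)
    denom = np.maximum(np.abs(c_old), 1e-30)  # guard exact zeros
    worst = float(np.max(np.abs(c_new - c_old) / denom))
    return worst <= rtol, worst
```

Reporting the worst relative deviation (rather than a bare pass/fail) makes it easy to tell a per-mille rawdata-vs-HepData difference from an actual bug.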

Radonirinaunimi avatar Jan 09 '24 14:01 Radonirinaunimi

I'm going to rebase these datasets on top of the ones currently in master.

@Radonirinaunimi I'll leave this as a PR and not merge immediately in case you want to roll back the changes that you made for legacy purposes. We now have the legacy version for reproduction as a copy of the old one, but I think it is better in general to have the proper HepData one as well.

scarlehoff avatar Feb 20 '24 10:02 scarlehoff

> I'm going to rebase these datasets on top of the ones currently in master.
>
> @Radonirinaunimi I'll leave this as a PR and not merge immediately in case you want to roll back the changes that you made for legacy purposes. We now have the legacy version for reproduction as a copy of the old one, but I think it is better in general to have the proper HepData one as well.

That sounds good! I will revert to before I produced the legacy versions. I guess in doing so I will need to call the uncertainty files something else?

Thanks for the comments on the plotting metadata; I will have a second look at them and make sure they are fully correct.

Radonirinaunimi avatar Feb 20 '24 13:02 Radonirinaunimi

As in other cases, this implementation is now obsolete. Some of the information here might still be relevant, though, for @comane @ecole41 regarding the missing datasets.

scarlehoff avatar Dec 06 '24 20:12 scarlehoff