[WIP] Implement ATLAS WPWM 13TEV DIF (Future test)
This branch includes an implementation of the ATLAS 13TEV WPWM differential measurements for future test data. Another version of this implementation has been added in PR #2380.
Still To Do:
- [ ] Check uncertainty definitions
- [ ] Complete `metadata.yaml` file
- [ ] Cross-check with #2380
@enocera I am not certain about the treatment of the uncertainties here. Do you know if any more should be added to the treatment and correlations dictionaries so that they are not treated as additive correlated uncertainties?
@enocera: Jelle and I have discussed the two implementations of this dataset. We have each followed a different structure when constructing the observables, so we are unsure which is the preferred one. Jelle has implemented them as:
- $W^+$ single differential
- $W^-$ single differential
- $W^+$ double differential
- $W^-$ double differential
Whereas I have implemented these as:
- 1D combined $W^+$ and $W^-$ in sequence (Tabs. 38-39);
- 1D muons $W^+$ and $W^-$ in sequence (Tabs. 20-21);
- 2D combined $W^+$ and $W^-$ in sequence (Tabs. 44-53);
- 2D muons $W^+$ and $W^-$ in sequence (Tabs. 23-32).
I have also added some changes to the uncertainty treatments after discussing this with an ATLAS experimentalist. She suggested that all unfolding systematics should be treated as uncorrelated and all normalisation systematics as multiplicative and uncorrelated - I have set this in this branch.
@ecole41 is this (de)correlation prescription approved by ATLAS in some manner? They always get quite nervous if we start to play with their correlation model, so having some kind of official endorsement always helps
Hi @ecole41 @jekoorn thanks for the work. Maybe @enocera has other ideas but my two cents are the following:
- We would never want to fit the W+ and W- data separately. So it is clear to me that a "W production" dataset should always consist of W+ and W- cross-sections.
- We don't want to fit the muon data separately but always the combined datasets (electrons + muons). So I would forget about the muon-only measurements and implement only the combined ones.
- One cannot fit at the same time 1D and 2D distributions (same underlying dataset) so I would keep them separated.
so to me the preferred structure would be what @ecole41 has done but removing the muon datasets, if this is clear.
In any case it should be easy for @jekoorn to adopt Ella's implementation, and then you can cross-check each other concerning the implementation of systematic errors.
In any case as I mentioned above it is important to document our choice of correlation model, and make sure we can back it up with some official ATLAS recommendation
Once @enocera signs off the dataset implementation, we will move to the generation of NNLO grids using NNLOJET, which will also be a non-trivial amount of work, especially the first time it is done.
hi @juanrojochacon thanks a lot for the comments!
I agree with everything, also in terms of fool-proofing, so that there cannot be any confusion about which dataset is to be fitted and which is not. With your proposed structure one would enter one dataset into the runcard at a time.
Just to be sure:
> never want to fit separately W+ and W- data

So that means we should put them sequentially in the same file, as Ella already did?
In any case I will make these changes to my implementation. Clear, thanks!
yes indeed, we put one after the other. It is the exact same analysis, so there will never be a reason why we choose to fit W+ but not W-. This is the same as what is done for similar datasets.
So yes, follow Ella's implementation and then you can compare the two and check that they are the same
For your reference, I paste here what I recommended to @ecole41 in a private conversation.
I would implement the cross section single differential in $m_T^W$ separately for positive and negative leptons (so only Tabs. 38 and 39). I would also implement the cross section double differential in $m_T^W$ and $\eta$, again separately for positive and negative leptons (Tabs. 44-53).
Two remarks.
- The combination of electron and muon channels leads to smaller uncertainties (and, as I have said many times, this is something we like). However, because of the way in which the combination is performed, the correspondence between each systematic uncertainty and its physical source may get lost. I understand that this is why, in an attempt to recover this correspondence, the combination is performed by rotating to the orthogonal basis (which makes the combination easier) and then rotating back to the physical basis. This procedure may be inaccurate and may alter the correspondence between a given systematic uncertainty and its physical source. This means that there is an additional ambiguity in interpreting each of these uncertainties as corr or uncorr, add or mult.
- Because of what is said above, it may be good to also implement the pure muon channel, again separately for positive and negative muons, for the 1D and the 2D distributions. Although the measurement is less precise than the combined one, in that case we do not break the correspondence between systematic uncertainties and their physical sources, which makes the interpretation of whether a systematic uncertainty is corr/uncorr cleaner.
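The basis rotation described in the first remark can be sketched numerically. This is an illustrative toy (the shift values and shapes are invented, not ATLAS's actual procedure): build a covariance matrix from per-source systematic shifts, diagonalise it, and rotate back. The round trip preserves the covariance exactly, but the "rotated-back" columns are admixtures of the original sources, which is precisely why the corr/uncorr interpretation becomes ambiguous.

```python
import numpy as np

# Toy systematic shifts: rows = data bins, columns = physical sources
# (values are purely illustrative).
sys_shifts = np.array([[1.0, 0.2],
                       [0.5, 0.8],
                       [0.1, 0.4]])

# Covariance built from the physical-basis shifts.
cov = sys_shifts @ sys_shifts.T

# Rotate to the orthogonal (eigen) basis: columns of `ortho` are the
# uncertainties in that basis, scaled by the square-rooted eigenvalues.
eigvals, eigvecs = np.linalg.eigh(cov)
ortho = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))

# Rotating back reproduces the covariance, but each column of `ortho`
# now mixes the two original physical sources.
cov_back = ortho @ ortho.T
```

The covariance is unchanged by the round trip, yet no column of `ortho` corresponds to a single physical source any more.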
I assume these are the correct tables to be implemented. Do you have a preference for how the dataset should be formed? E.g. WPWM (l+ and l- in sequence), or WP and WM as separate observables? Also, should the l+/- channel be considered?
I would implement four different observables in the same data set, as follows:
- 1D combined $W^+$ and $W^-$ in sequence (Tabs. 38-39);
- 1D muons $W^+$ and $W^-$ in sequence (Tabs. 20-21);
- 2D combined $W^+$ and $W^-$ in sequence (Tabs. 44-53);
- 2D muons $W^+$ and $W^-$ in sequence (Tabs. 23-32).
That being said, I think that @jekoorn and @ecole41 would like some input on their choice of treatment of the various sources of uncertainties, which I will give them asap.
Good point @enocera I agree. We can check that results based on the muon dataset are consistent with those of the combined dataset. In any case for this measurement I expect that we are limited by systematics, so actually it may be better to stick to the muon dataset to have a better grasp of the systematics.
So we have a plan
* We don't want to fit separately the muon data but always the combined datasets (electron + muons). So I would forget about the muon only and implement only the combined measurements
This is one point on which I don't agree completely, for reasons related to the interpretation of systematic uncertainties, which can become more ambiguous (especially w.r.t. correlations) in the combined case, as I explained above. The theoretical predictions will remain the same, therefore I recommend implementing both the muon cross sections and the combined cross sections in the commondata framework.
* One cannot fit at the same time 1D and 2D distributions (same underlying dataset) so I would keep them separated.
This is another point on which I (partly) disagree. Our commondata implementation is flexible enough to have multiple observables for the same data set. In other words: the data set is one, that incorporates both the 1D and the 2D distributions. But they are two mutually exclusive observables (in the same data set), of course, because we don't know correlations. We can elegantly implement them in a single data set, and call only a subset of observables (1D or 2D) in our fit runcard. I have listed above the preferred clustering.
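A hypothetical sketch of what this could look like in `metadata.yaml` (observable names and table assignments are illustrative, not the final implementation): one data set, with the 1D and 2D distributions as separate, mutually exclusive observables that are selected individually in the fit runcard.

```yaml
# Illustrative only: names and structure are assumptions, not the actual file.
implemented_observables:
  - observable_name: WM-WP-MTW          # 1D combined, Tabs. 38-39
  - observable_name: WM-WP-MTW-MUON     # 1D muons, Tabs. 20-21
  - observable_name: WM-WP-ETA-MTW      # 2D combined, Tabs. 44-53
  - observable_name: WM-WP-ETA-MTW-MUON # 2D muons, Tabs. 23-32
```

A runcard would then reference only one of the 1D or 2D observables at a time, since their cross-correlations are unknown.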
sure @enocera I meant separated as in a different file, but we can keep them as subsets of the same dataset, as we do for many other datasets. So I agree with your remarks
OK, we are on the same page, then.
Dear @ecole41, I (finally!) had the chance to look at your implementation of the data set. I would say that most of it is very nicely done. I suggest using your implementation as the baseline w.r.t. that of @jekoorn. I have some suggestions about the treatment of uncertainties, though.
- Muon channel (1D and 2D distributions). The label of the luminosity uncertainty should be changed from `ATLASLUMI15` to `ATLASLUMIRUNII` (sorry, my bad). I think that the uncertainties with labels `Data stat. unc.`, `Sig. stat. unc.`, `Bkg. stat. unc.` and `Alternative MC unf. unc.` have to be treated as `ADD UNCORR`. The reason I say this is that, reading Sects. 7.2-7.3 of the paper, I seem to understand that `unc.` in the uncertainty label stands for "uncorrelated" (and not for "uncertainty"). They indeed say that there are statistical uncorrelated components in the systematic uncertainties related to the muon trigger, identification, vertex association and isolation efficiency, and that the MC uncertainties are uncorrelated. I would treat all the other uncertainties as `MULT CORR`, including the normalisation uncertainties that are currently defined as `UNCORR` (why did you choose this? shouldn't a normalisation uncertainty be correlated across bins by definition?).
- Combined lepton channel (1D and 2D distributions). The label of the luminosity uncertainty should be changed from `ATLASLUMI15` to `ATLASLUMIRUNII` (sorry, my bad). I would treat the `Alternative MC unf. unc.` and the `Basic unf. unc.` as `ADD UNCORR`. I would treat all the other uncertainties as `ADD CORR` (`ADD` because they are obtained by rotating back to the physical basis a set of uncertainties determined in the orthogonal basis). I see that you treat some of them as `UNCORR`, but I'm not able to understand whether this is correct or not just by reading the paper. Do you have any other source of information? Can you please clarify your choice? Thanks!
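The prescription above can be summarised in a small lookup. This is a hypothetical sketch: the dictionary structure, the `treatment` helper and the fallback labels are illustrative (the actual `filter.py` may organise this differently); only the label strings and `ADD`/`MULT`, `CORR`/`UNCORR` assignments come from the discussion above.

```python
# Hypothetical summary of the proposed (de)correlation prescription.
# Keys are uncertainty labels from the HEPData tables; values are
# (treatment, correlation) pairs. Anything not listed falls back to the
# channel default.
MUON_EXCEPTIONS = {
    "Data stat. unc.": ("ADD", "UNCORR"),
    "Sig. stat. unc.": ("ADD", "UNCORR"),
    "Bkg. stat. unc.": ("ADD", "UNCORR"),
    "Alternative MC unf. unc.": ("ADD", "UNCORR"),
}
COMBINED_EXCEPTIONS = {
    "Alternative MC unf. unc.": ("ADD", "UNCORR"),
    "Basic unf. unc.": ("ADD", "UNCORR"),
}

def treatment(label: str, channel: str) -> tuple[str, str]:
    """Return the (treatment, correlation) pair for an uncertainty label."""
    if channel == "muon":
        # Muon channel: everything not listed is multiplicative, correlated.
        return MUON_EXCEPTIONS.get(label, ("MULT", "CORR"))
    # Combined channel: everything not listed is additive, correlated
    # (ADD because of the rotation back from the orthogonal basis).
    return COMBINED_EXCEPTIONS.get(label, ("ADD", "CORR"))
```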
Dear @ecole41, @enocera, and @juanrojochacon, I have updated my implementation following Emanuele's request in #2380, and cross-checked with Ella's numbers, which should all add up perfectly. I suppose we can now move to FK-table generation.
Great! I understand that your numbers and those from Ella are identical?
If so yes, while @enocera completes his review i would start with the grid generation.
For the NNLO grid implementation, as we agreed I would suggest that @ecole41 and @jekoorn proceed in parallel with the implementation of the PineFarm cards etc., produce a low-stats grid with NNLOJET, and check that they get consistent numbers. Then, for the final high-stats grids, we only need to do it once.
at least this is the plan we made with @enocera and @scarlehoff at Morimondo, and I still think it is a good idea which saves time in the long run
> Great! I understand that your numbers and those from Ella are identical?
Whereas I initially thought yes, it seems there is some deviation in the numbers for the muon-only, double-differential set. Interestingly, the other double-differential set, which is generated using the same function, does seem to be correct. I will investigate whether something funny is going on in my code, and compare with the HEPData tables.
ok, this is precisely why benchmarks are useful ;)
With the help of the benchmark, it should be possible to understand where the problem is
Then we move to the NNLO grid generation
Hi all, I have looked a bit closer at the difference between my numbers and Ella's (why some were swapped around).
To be more precise, it seems that in my implementation and your implementation of the DDIF sets (I checked the data and kinematic tables), we have the following structure in the data file in terms of the HEPData tables:
```
lep_physical_plus_absetamtw_mtw0
lep_physical_plus_absetamtw_mtw1
lep_physical_plus_absetamtw_mtw2
lep_physical_plus_absetamtw_mtw3
lep_physical_plus_absetamtw_mtw4
lep_physical_minus_absetamtw_mtw0
lep_physical_minus_absetamtw_mtw1
lep_physical_minus_absetamtw_mtw2
lep_physical_minus_absetamtw_mtw3
lep_physical_minus_absetamtw_mtw4
```
But for the muon data, from what I understand, your filter swaps them around in the following way:
```
muo_plus_absetamtw_mtw0
muo_minus_absetamtw_mtw0
muo_plus_absetamtw_mtw1
muo_minus_absetamtw_mtw1
muo_plus_absetamtw_mtw2
muo_minus_absetamtw_mtw2
muo_plus_absetamtw_mtw3
muo_minus_absetamtw_mtw3
muo_plus_absetamtw_mtw4
muo_minus_absetamtw_mtw4
```
which makes sense given these lines in your code
```python
elif observable == "WPWM_DDIF_LEP":
    tables = []
    for i in range(5):
        tables.append(f"lep_physical_plus_absetamtw_mtw{i}")
    for i in range(5):
        tables.append(f"lep_physical_minus_absetamtw_mtw{i}")
elif observable == "WPWM_DDIF_MUON":
    tables = []
    for i in range(5):
        tables.append(f"muo_plus_absetamtw_mtw{i}")
        tables.append(f"muo_minus_absetamtw_mtw{i}")
```
where you append the data to your tables either "first all plus, then all minus" or "alternating plus/minus".
But in the kinematics file for DDIF MUON they are not swapped around and instead follow the structure of "first all plus, then all minus". I assume that we would like to do the former and have first all plus tables, and then all minus.
So the numbers are correct in the end, just misaligned. I guess this is an easy fix ;-)
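For concreteness, the fix amounts to building the muon table list with the same "first all plus, then all minus" ordering as the lepton branch. A minimal sketch (the helper function and its name are illustrative, not the actual `filter.py`):

```python
# Hypothetical helper: build the DDIF HEPData table list with all
# plus-charge m_T^W bins first, then all minus-charge bins, so the data
# and kinematics files share one ordering convention.
def build_tables(prefix: str, n_mtw_bins: int = 5) -> list[str]:
    tables = [f"{prefix}_plus_absetamtw_mtw{i}" for i in range(n_mtw_bins)]
    tables += [f"{prefix}_minus_absetamtw_mtw{i}" for i in range(n_mtw_bins)]
    return tables

lep_tables = build_tables("lep_physical")  # combined-lepton DDIF tables
muo_tables = build_tables("muo")           # muon DDIF tables, same ordering
```

Using one helper for both channels removes the possibility of the two branches diverging again.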
Nice! Then we are set.
@jekoorn Thanks for figuring that out. I have now adjusted my filter.py script so that this structure should now match the structure in your PR. Let me know if there are still any inconsistencies.
I have also changed the uncertainty treatments to match @enocera's suggestions. I just wanted to check that `Stat. unc.` and `Uncor. syst. unc.` should also be treated as `ADD UNCORR`?
> I have also changed the uncertainty treatments to match @enocera's suggestions. I just wanted to check that `Stat. unc.` and `Uncor. syst. unc.` should also be treated as `ADD UNCORR`?
Yes, thanks.
Thanks for the work @ecole41 @jekoorn great that we are converging here.
Question for @enocera: the next step is to generate NNLO grids and compare them with the implemented data. Will work related to the NNLO grid calculation be discussed in this PR or should Ella and Jelle open a separate one?
> Question for @enocera: the next step is to generate NNLO grids and compare them with the implemented data. Will work related to the NNLO grid calculation be discussed in this PR or should Ella and Jelle open a separate one?
@juanrojochacon Grids will be generated with NNLOjet. I understand that production has been automatised as much as possible, relying on the information contained in the commondata. According to our established workflow I expect:
- that a data/theory comparison (including the computation of the chi2) be discussed as part of this PR, see e.g. #2360;
- that a PR be opened with the relevant grid(s) in the appropriate repository https://github.com/NNPDF/theories_slim, see e.g. https://github.com/NNPDF/theories_slim/pull/67.
Discussion can occur in either PR, though the two should be cross-referenced where possible.
ok clear @enocera . Yes, indeed, grid production should be automated with pinefarm, but as usual the proof is in the pudding.
I suggest that @ecole41 and @jekoorn independently try to generate the NNLO grid and then cross-check each other's results. Once the low-stats grid is produced, we can proceed to the high-stats grid generation and then produce the FK tables etc.
@enocera looking closely at this dataset, this is a W+jet computation at leading order (which is a few orders of magnitude more expensive to compute than just W).
So perhaps we want to do a first NLO check before moving to NNLO?
> So perhaps we want to do a first NLO check before moving to NNLO?
Of course. The idea is to first make a cheap run (perhaps you can even limit statistics a little?)