Saving pseudodata of replicas with multiple fits
When using parallel models, the pseudodata are not saved for each replica, which results in the following error when running multiple fits with n3fit:
Cannot request that multiple replicas are fitted and that pseudodata is saved. Either set `fitting::savepseudodata` to `false` or fit replicas one at a time.
This branch ensures that pseudodata for each replica are saved, even when parallel_models=true is set.
Do you actually need this? We disabled this for multiple replicas due to issues with reproducibility.
And the (main) issue is not that easy to solve, namely that in parallel replicas, datasets with a single point must all enter either training or validation. So if you generate the data in parallel, you cannot reproduce it afterwards with the vp functions that do so.
If you need it we need to find a way to tag the data as having been generated in parallel.
Thanks for starting this. I actually meant doing something along the lines of
replicas_training_pseudodata = collect("training_pseudodata", ("replicas",)) in n3fit_data.py.
I suspect this will save the pseudodata all in the same folder, since the output folder is probably the nnfit/replica_1 folder (using the table_folder from N3FitEnvironment), but perhaps there is a way to solve this.
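A minimal sketch of what saving per-replica pseudodata could look like, assuming the collected list can be iterated together with the replica numbers. The function name, the CSV filename, and the folder layout below are illustrative, not the actual n3fit implementation:

```python
from pathlib import Path

import pandas as pd


def save_replica_pseudodata(replica_pseudodata, output_path):
    """Write each replica's training pseudodata to its own
    nnfit/replica_<n>/ folder, so parallel fits do not dump
    everything into replica_1 (names here are illustrative)."""
    for nrep, df in enumerate(replica_pseudodata, start=1):
        folder = Path(output_path) / "nnfit" / f"replica_{nrep}"
        folder.mkdir(parents=True, exist_ok=True)
        # one file per replica, tab-separated like other n3fit tables
        df.to_csv(folder / "training_pseudodata.csv", sep="\t")
```

The key point is only that the replica index, not the environment's single table folder, decides where each table ends up.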
And the (main) issue is not that easy to solve, namely that in parallel replicas, datasets with a single point must all enter either training or validation. So if you generate the data in parallel, you cannot reproduce it afterwards with the vp functions that do so.
This is easy to solve by always including datasets with a single point in the training data.
But indeed, we do need it because otherwise doing parameter determinations with CRM will be sketchy and with TCM will be impossible using GPU.
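The proposal of always putting 1-point datasets in training could be sketched as follows; the function and its arguments are hypothetical, not n3fit's actual masking code:

```python
import numpy as np


def tr_vl_masks(dataset_sizes, frac, rng):
    """Generate a boolean training mask per dataset.
    Single-point datasets always go to training, so the split no
    longer depends on whether replicas are generated sequentially
    or in parallel (sketch of the proposal above)."""
    masks = {}
    for name, ndata in dataset_sizes.items():
        if ndata == 1:
            # deterministic choice: the lone point is always trained on
            masks[name] = np.array([True])
            continue
        ntr = int(frac * ndata)
        mask = np.zeros(ndata, dtype=bool)
        # randomly assign ntr points to training, the rest to validation
        mask[rng.choice(ndata, ntr, replace=False)] = True
        masks[name] = mask
    return masks
```

With this, the only randomness left in the trvl split comes from the multi-point datasets, which both modes can seed identically.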
This is easy to solve by always including datasets with a single point in the training data.
I would not do that. I'd rather go for the tag.
In any case, before continuing, could you check that the sequential and parallel fits (leaving aside the 1-point datasets) produce exactly the same pseudodata and trvl masks? (If that works we can think about the rest.)
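The requested check could be done with a small helper along these lines; the dict-of-arrays inputs are an assumed layout for the per-dataset pseudodata, not the format n3fit actually writes:

```python
import numpy as np


def same_pseudodata(seq, par, dataset_sizes):
    """Compare per-dataset pseudodata from a sequential and a
    parallel fit, skipping single-point datasets whose trvl
    assignment is known to differ between the two modes."""
    for name, ndata in dataset_sizes.items():
        if ndata == 1:
            continue  # 1-point datasets are handled differently, skip
        if not np.allclose(seq[name], par[name]):
            return False
    return True
```

If this returns True for all replicas, only the 1-point dataset treatment remains to be reconciled.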
I would not do that. I'd rather go for the tag.
Why? If we're serious about using GPU as well as CPU fits, then I'd say they should be as close as possible in behaviour, no? The choice of what to do with those datasets was a bit arbitrary and has been changed over time anyway.
Then we should go for an actual solution instead of changing the behaviour every time it becomes an inconvenience.
But first let's make sure that the rest works the same; then I'll take care of masking the 1-point datasets in the same way in both modes.
I've moved the commits from this branch to #2276