Aron Jansen comments

Results: 24 comments of Aron Jansen

Avoiding duplicated computations by having a single observable model

Seems to work fine, and gives the same results as trvl-mask-layers.

Avoiding duplicated computations by having a single observable model

Ok perfect, thanks :)

Avoiding duplicated computations by having a single observable model

@scarlehoff @Radonirinaunimi Positivity is included in the validation model. I remember we discussed this before and, if I remember correctly, there was some disagreement on whether this was necessary or...

Avoiding duplicated computations by having a single observable model

What I mean is we would have one model of the form (say we only have the DEUTERON observable) `x -> pdf -> DEUTERON -> (DEUTERON_tr, DEUTERON_val)`, where the tuple...
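To make the shape `x -> pdf -> DEUTERON -> (DEUTERON_tr, DEUTERON_val)` concrete, here is a minimal sketch in plain Python. It is not the NNPDF implementation: `toy_pdf`, `toy_observable`, the masks, and `make_single_observable_model` are hypothetical stand-ins, and lists replace tensors. The only point it illustrates is that the observable is computed once and both splits are masked from that one result.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical counter, used only to show how often the observable is evaluated.
calls = {"observable": 0}


def toy_pdf(xs: Sequence[float]) -> List[float]:
    """Stand-in for the PDF network: maps grid points x to PDF values."""
    return [x * (1.0 - x) for x in xs]


def toy_observable(pdf_values: Sequence[float]) -> List[float]:
    """Stand-in for the (expensive) DEUTERON observable computation."""
    calls["observable"] += 1
    return [2.0 * f for f in pdf_values]


def make_single_observable_model(
    pdf: Callable[[Sequence[float]], List[float]],
    observable: Callable[[Sequence[float]], List[float]],
    tr_mask: Sequence[bool],
    val_mask: Sequence[bool],
) -> Callable[[Sequence[float]], Tuple[List[float], List[float]]]:
    """Build one model returning the tuple (observable_tr, observable_val)."""

    def model(xs: Sequence[float]) -> Tuple[List[float], List[float]]:
        obs = observable(pdf(xs))  # computed once, shared by both splits
        obs_tr = [o for o, keep in zip(obs, tr_mask) if keep]
        obs_val = [o for o, keep in zip(obs, val_mask) if keep]
        return obs_tr, obs_val

    return model


model = make_single_observable_model(
    toy_pdf,
    toy_observable,
    tr_mask=[True, False, True],
    val_mask=[False, True, False],
)
deuteron_tr, deuteron_val = model([0.25, 0.5, 0.75])
```

With two separate models, the observable would be evaluated twice per step for the same input; here `calls["observable"]` is 1 after one forward pass.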

Avoiding duplicated computations by having a single observable model

In this PR I've already decoupled the computation of the observable from the masking+loss; that was quite simple and gives identical results. The tricky part is how to use that...

Avoiding duplicated computations by having a single observable model

Ah, I hadn't thought about that. You're right that conventionally the validation at step t is computed after training for t steps. My proposal would have a shift by one...
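The shift by one can be seen in a toy example. This is only a sketch under stated assumptions: a single scalar weight, a quadratic loss, and hypothetical helper names (`train_step`, `val_loss`, `conventional`, `combined`). In the combined scheme the validation loss comes from the same forward pass as the training step, so it is evaluated on the weights before the update.

```python
from typing import List


def train_step(w: float, target: float = 1.0, lr: float = 0.25) -> float:
    """One gradient-descent step on the toy training loss (w - target)^2."""
    grad = 2.0 * (w - target)
    return w - lr * grad


def val_loss(w: float, target: float = 0.5) -> float:
    """Toy validation loss, evaluated on whatever weights it is given."""
    return (w - target) ** 2


def conventional(w: float, steps: int) -> List[float]:
    """Validation at step t is computed after training for t steps."""
    history = []
    for _ in range(steps):
        w = train_step(w)
        history.append(val_loss(w))
    return history


def combined(w: float, steps: int) -> List[float]:
    """Validation shares the forward pass with training: pre-update weights."""
    history = []
    for _ in range(steps):
        history.append(val_loss(w))
        w = train_step(w)
    return history


conv = conventional(0.0, 4)
comb = combined(0.0, 4)
# The combined history is the conventional one delayed by one step.
shifted_by_one = comb[1:] == conv[:-1]
```

The two histories contain the same values; the combined one simply reports each value one step later and adds the loss of the initial weights at the front.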

Avoiding duplicated computations by having a single observable model

True, but that should be easy to solve. Just save the weights at every step and when the stopping condition hits, instead of just stopping, revert to the previous epoch....
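A minimal sketch of that idea, again on a toy scalar weight and with hypothetical names (`fit_with_revert`, `should_stop`): keep the weights from before each update, and when the stopping condition fires, return those instead of the freshly updated ones. In a real Keras model the saved object would be a copy of the weight arrays rather than a float; that part is an assumption and is not shown here.

```python
from typing import Callable, Tuple


def train_step(w: float, target: float = 1.0, lr: float = 0.25) -> float:
    """One gradient-descent step on the toy training loss (w - target)^2."""
    return w - lr * 2.0 * (w - target)


def val_loss(w: float, target: float = 0.5) -> float:
    """Toy validation loss."""
    return (w - target) ** 2


def fit_with_revert(
    w: float,
    max_steps: int,
    should_stop: Callable[[float], bool],
) -> Tuple[float, int]:
    """Train with the combined step; on stopping, revert to the saved weights."""
    for step in range(max_steps):
        previous = w          # weights that the validation loss below refers to
        loss = val_loss(w)    # computed in the same pass as the training update
        w = train_step(w)     # weights are now one step ahead of `loss`
        if should_stop(loss):
            return previous, step  # revert instead of just stopping
    return w, max_steps


# Hypothetical stopping condition: stop once the validation loss reaches zero.
final_w, stopped_at = fit_with_revert(0.0, 10, should_stop=lambda loss: loss <= 0.0)
```

Without the revert, the loop would stop holding the weights 0.75, which the stopping condition never evaluated; with it, the returned weights are the ones the validation loss was computed on.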

Avoiding duplicated computations by having a single observable model

This old tensorboard profile illustrates the speedup. The gaps will be mostly removed by epochs-to-batches. Of the rest, the validation step takes more than 50% of the time of the training step. ![image](https://github.com/NNPDF/nnpdf/assets/19712066/7610bb2e-d25c-47b5-9208-37b810b0ad73)...

Avoiding duplicated computations by having a single observable model

Yes, I agree the experimental one should stay separate; it also doesn't play any role in performance. And I agree the approach I proposed is definitely not the cleanest. What...

Avoiding duplicated computations by having a single observable model

## Update

I made some progress, and I am hopeful that a much nicer implementation than the one I suggested above is possible. (Only hopeful for now, because I haven't looked at reverting...