Results 10 comments of Niket Kumar

@sw32-seo Can you please check the Orbax version in all regions?

Hi David, can you please try [0.4.7](https://github.com/google/orbax/blob/main/checkpoint/CHANGELOG.md#fixed)? It might help.

Thanks for debugging the issue and associating it with the `SaveArgs.aggregate` option! While we recreate the issue in our dev setup, please switch to `aggregate=False` if that works for your use...

Thank you for reporting this issue. We are working on a fix. As a workaround, would renaming the prefix to something like `pponetworks` work for you?

How are you planning to construct the Generator object back from the restored *state*?

Thanks for sharing the details. Will using `numpy.random.get_state(legacy=False)` meet your requirements? In that case, Orbax already supports it. Please take a look at this unit test: https://github.com/google/orbax/blob/53e2f22234717d29eca59282b496d3a6ba897b84/checkpoint/orbax/checkpoint/random_key_checkpoint_handler_test.py#L118 Alternatively, using Json...
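
For reference, here is a minimal sketch of what `numpy.random.get_state(legacy=False)` returns and how it round-trips through `numpy.random.set_state`. It uses plain NumPy only and makes no claim about how Orbax stores this state internally; the variable names are illustrative.

```python
import numpy as np

# Seed the legacy global RandomState (backed by MT19937).
np.random.seed(0)

# With legacy=False the state comes back as a dict instead of a tuple.
state = np.random.get_state(legacy=False)

# Draw some numbers, which advances the generator.
first_draw = np.random.rand(3)

# Restoring the captured dict rewinds the generator to the saved point.
np.random.set_state(state)
second_draw = np.random.rand(3)

# Both draws start from the same state, so they match exactly.
assert np.array_equal(first_draw, second_draw)
```

Note that this covers only the global MT19937-based state, not a `numpy.random.Generator` instance.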

Thanks for clarifying the difference between MT19937 and PCG64! A JSON based solution is ideal for this scenario. I will look into it.
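
To make the JSON idea concrete, here is a rough sketch, assuming a PCG64-backed `Generator` whose `bit_generator.state` dict holds only strings and integers. It uses only NumPy and the standard `json` module; it is not an Orbax API, and the variable names are hypothetical.

```python
import json

import numpy as np

# default_rng returns a Generator backed by PCG64.
rng = np.random.default_rng(42)
rng.random(3)  # advance the generator so the state is not the initial one

# Serialize the bit generator state dict to a JSON string.
state_json = json.dumps(rng.bit_generator.state)

# Rebuild a Generator and load the saved state into its bit generator.
restored = np.random.Generator(np.random.PCG64())
restored.bit_generator.state = json.loads(state_json)

# Both generators now produce the same stream.
original_values = rng.random(5)
restored_values = restored.random(5)
assert np.array_equal(original_values, restored_values)
```

The MT19937 state contains a NumPy array (`key`), so it would need converting to a list before `json.dumps`; the sketch above does not handle that case.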

Based on the above error stack, it is not likely that `checkpoint_metadata_store` was called from a non-primary host. The checkpoint_metadata_store write is called right after the `tmp` dir creation, so...

Can you please attach additional details such as your environment and the error stack? Generic debugging tips for JupyterLab may also be helpful: https://stackoverflow.com/questions/74154123/how-to-debug-jupyter-kernel-crashes

> repro code in the single cell does not crash (at least I've never seen)

Just to be clear, by `cell` you mean the notebook cell, correct?