[WIP] Async checkpointing
Very much WIP. This overrides a bunch of stuff that I'm not sure is stable to override. TODO: discuss whether we want a somewhat different (and more easily maintainable) approach.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
bump
Hi @S1ro1, I am eagerly waiting for this to be merged. Any idea of how much time it might take? Is there any help that I can provide (tests/dev)?
Also, a side question: looking at the PR, it seems it doesn't support safetensors serialization. Are there plans to support it, or are you open to contributions for that?
Thanks
@S1ro1 Just a gentle reminder. Let me know if you'd like me to take a stab at any pending changes, which you could then review. Thanks
cc: @SunMarc
@S1ro1 is not working at HF anymore, so feel free to take over his PR if you want @romitjain !
Sure @SunMarc. Let me get back to this next week.
What is the status of this?