
Checkpointer support for Ray-Mode

Open yxdyc opened this issue 1 year ago • 1 comments

Search before continuing

  • [X] I have searched the Data-Juicer issues and found no similar feature requests.

Description

Currently, the dj_ckpt_manager and executor only support HF datasets. They essentially perform three actions:

  1. Tracks and saves the executed operation list from OP_1 to OP_i.
  2. Saves the processed dataset ( D_{op_i} ).
  3. Checks and loads ( D_{op_i} ) when the feature is enabled during re-processing.

It would be straightforward to extend this feature to ray_executor. For steps 2 and 3, we can implement a few new interfaces for snapshotting Ray Data state and writing it to persistent storage.
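To make the three actions concrete, here is a minimal, storage-agnostic sketch in plain Python. It is not Data-Juicer's actual checkpoint code: `SketchCheckpointManager`, `run_pipeline`, and the file names `ops.json` and `dataset.json` are hypothetical names chosen for illustration, and JSON files stand in for the dataset snapshot. In a Ray executor, the dataset save and load would presumably be replaced by Ray Data calls such as `Dataset.write_parquet` and `ray.data.read_parquet` against persistent storage.

```python
import json
from pathlib import Path


class SketchCheckpointManager:
    """Hypothetical checkpoint manager (illustration only, not Data-Juicer's class)."""

    def __init__(self, ckpt_dir, op_names):
        self.ckpt_dir = Path(ckpt_dir)
        self.op_names = list(op_names)  # full planned list: OP_1 ... OP_n
        self.op_record = self.ckpt_dir / "ops.json"
        self.data_file = self.ckpt_dir / "dataset.json"

    def save(self, done_ops, rows):
        """Actions 1 and 2: record the executed ops and save D_{op_i}."""
        self.ckpt_dir.mkdir(parents=True, exist_ok=True)
        # No atomicity is attempted here; a real implementation would need it.
        self.data_file.write_text(json.dumps(rows))
        self.op_record.write_text(json.dumps(list(done_ops)))

    def load(self):
        """Action 3: return (rows, n_done), or None if no usable checkpoint exists."""
        if not (self.op_record.exists() and self.data_file.exists()):
            return None
        done = json.loads(self.op_record.read_text())
        # The checkpoint is only valid if its ops are a prefix of the current plan.
        if done != self.op_names[: len(done)]:
            return None
        return json.loads(self.data_file.read_text()), len(done)


def run_pipeline(rows, ops, manager):
    """Run ops (a list of (name, fn) pairs), resuming from a checkpoint if possible."""
    restored = manager.load()
    n_done = 0
    if restored is not None:
        rows, n_done = restored
    done = [name for name, _ in ops[:n_done]]
    for name, fn in ops[n_done:]:
        rows = fn(rows)
        done.append(name)
        manager.save(done, rows)  # checkpoint after every op
    return rows


if __name__ == "__main__":
    import tempfile

    ops = [
        ("strip", lambda rows: [{"text": r["text"].strip()} for r in rows]),
        ("lower", lambda rows: [{"text": r["text"].lower()} for r in rows]),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        manager = SketchCheckpointManager(tmp, [name for name, _ in ops])
        print(run_pipeline([{"text": "  Hello Ray  "}], ops, manager))
        # A second run finds the checkpoint and skips both ops.
        print(run_pipeline([{"text": "  Hello Ray  "}], ops, manager))
```

The prefix check in `load` is the design choice that matters most here: a saved dataset is reused only when the recorded operations match the beginning of the current operation list, so changing an earlier operation invalidates the checkpoint.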

Use case

No response

Additional

No response

Are you willing to submit a PR for this feature?

  • [X] Yes, I'd like to help by submitting a PR!

yxdyc avatar Nov 12 '24 11:11 yxdyc

@yxdyc I'm new to Ray and don't understand how Ray writes data locally. With the local:// scheme, is the data written to the disk of the host server, the client node, or a worker node? Thanks.

vincent-pli avatar Nov 14 '24 06:11 vincent-pli