[Data] Specify memory resource for each operator

Open raulchen opened this issue 2 years ago • 0 comments

Right now, we only specify CPU/GPU resources for Dataset operators, not memory.

When an OOM happens, the only workaround is to increase num_cpus/num_gpus per task, which reduces the number of concurrent tasks and therefore the peak memory usage. This approach is not intuitive, and the tuned configuration is not portable to other clusters, since they may have a different CPU/GPU-to-memory ratio.
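To illustrate the portability problem, here is a rough sketch of the arithmetic. The function names and the cluster sizes are made up for the example, and it assumes each task of the operator needs a fixed amount of memory:

```python
def concurrent_tasks(node_cpus: float, num_cpus: float) -> int:
    """How many tasks fit on a node when each task reserves `num_cpus` CPUs."""
    return int(node_cpus // num_cpus)


def num_cpus_to_fit_memory(node_cpus: float, node_memory_gb: float,
                           task_memory_gb: float) -> float:
    """Per-task num_cpus needed so that concurrent tasks fit in node memory."""
    # At least one task must be allowed to run.
    max_tasks = max(1, int(node_memory_gb // task_memory_gb))
    return node_cpus / max_tasks


# Hypothetical operator whose tasks each need about 8 GB.
TASK_MEMORY_GB = 8

# Cluster A: 16 CPUs and 64 GB per node. Cluster B: 16 CPUs and 32 GB per node.
tuned_for_a = num_cpus_to_fit_memory(16, 64, TASK_MEMORY_GB)
tuned_for_b = num_cpus_to_fit_memory(16, 32, TASK_MEMORY_GB)

# Reusing cluster A's tuning on cluster B oversubscribes memory.
peak_on_b_gb = concurrent_tasks(16, tuned_for_a) * TASK_MEMORY_GB
```

With these numbers, cluster A needs num_cpus=2 and cluster B needs num_cpus=4. Reusing A's setting on B would schedule 8 tasks and ask for 64 GB on a 32 GB node.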

We should directly specify memory resources for the operators. For non-UDF-based operators, this configuration should be transparent to users: we should be able to determine it from the input data size. For UDF-based operators, we need to figure out a proper API for users to specify memory.
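A minimal sketch of what this could look like, purely for discussion. `OpMemorySpec`, `expansion_factor`, and both helper functions are hypothetical names, not an existing Ray Data API, and the 2x multiplier is an assumption rather than a measured value:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpMemorySpec:
    """Hypothetical per-operator memory spec (not a Ray API)."""
    # Set by the user for UDF-based operators; None for built-in operators.
    user_memory_bytes: Optional[int] = None
    # Assumed multiplier over the input block size for built-in operators
    # (roughly input buffer plus output buffer).
    expansion_factor: float = 2.0


def task_memory_bytes(spec: OpMemorySpec, input_block_bytes: int) -> int:
    """Memory to reserve for one task of the operator."""
    if spec.user_memory_bytes is not None:
        return spec.user_memory_bytes
    # Non-UDF operator: derive the estimate from the input data size.
    return max(1, int(input_block_bytes * spec.expansion_factor))


def max_concurrent_tasks(spec: OpMemorySpec, input_block_bytes: int,
                         available_memory_bytes: int) -> int:
    """Number of tasks that fit in the available memory."""
    return available_memory_bytes // task_memory_bytes(spec, input_block_bytes)


MiB = 1024 * 1024
builtin_op = OpMemorySpec()
udf_op = OpMemorySpec(user_memory_bytes=512 * MiB)
```

Under these assumptions, concurrency is derived from memory directly, so the same spec would behave sensibly on clusters with different CPU-to-memory ratios.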

raulchen, Jun 15 '23 04:06