dask-jobqueue icon indicating copy to clipboard operation
dask-jobqueue copied to clipboard

LSF scheduler seems to have redundant memory keyword

Open josephhardinee opened this issue 7 years ago • 5 comments

It looks like specifying the lsf scheduler requires two memory keywords (mem, and memory). Should mem on the LSF scheduler be turned into memory so that it lines up with other schedulers and the core object?

josephhardinee avatar Nov 27 '18 14:11 josephhardinee

Also it appears that memory should be of form '24 GB', but mem need to be passed as an integer number of bytes.

josephhardinee avatar Nov 27 '18 15:11 josephhardinee

Historically, the memory keyword was for the dask-worker command --memory-limit option. Now it is the overall memory that should be used by all the dask worker processes. In default cases in most dask-jobqueue systems implementations, it is also used as the memory reservation for the scheduler. But in every implementations, you have either a resource_spec, job_mem, mem kwarg that can be used to override this value if the user wants to, but you can leave it as None.

We discussed leaving or not the duplicated ncpus or mem kwarg in #82 or more recently in https://github.com/dask/dask-jobqueue/issues/172#issuecomment-430750396, and maybe in #181 too.

I'm still undecided on this, especially for the memory kwarg. It seems unlikely that a user wants to have a different reservation that what it uses on dask_side. And in this case, maybe he could implement it with job_extra kwarg.

guillaumeeb avatar Nov 27 '18 15:11 guillaumeeb

Okay, I don't have a strong preference either way except that I got a little confused.

Also I assume we would want the input form for the two to match (Using unit in the name as a string, vs passing an int for memory). To update this I assume we'd want to test the type for an int or str on mem keyword and do the conversion that way so as to not break previously existing scripts? Or would you rather it is just updated for both to take a str?

josephhardinee avatar Nov 27 '18 15:11 josephhardinee

Hm, I think it would be a minor breaking change and not impact a lot of people. I'd be more in favor of just update mem to be a string, and update the code using it.

guillaumeeb avatar Nov 27 '18 15:11 guillaumeeb

I think we can do two things to clarify this:

  • Update the documentation and examples to explain that for each job queuing system, there are workers kwargs, and job related kwargs.
  • For LSF, modify the ncpus and mem kwarg, prefixing them using job_.

guillaumeeb avatar Aug 30 '22 06:08 guillaumeeb