Provide custom header
Currently we take Pythonic keyword arguments and use them to generate a job_header. While often convenient, this transformation can also get in the way. Sometimes things would be simpler if we just provided our own job header directly. This can be particularly convenient for picky clusters, or for IT users that are more comfortable with job scripts than with Python arguments.
Should we accept a job_header= keyword or configuration value to override the job_header construction?
No strong feeling on this one. My only small concern is that it adds a bit of complexity with a non-trivial interaction with the other parameters.
If we do that, it'd be good to be very clear in the docstring about which parameters are overridden by job_header. For some parameters this is not 100% clear cut: should job_extra be ignored if job_header is specified? What about config_name, where some parameters from the named config would still be used while others would be silently ignored? Another possibility is a job_header_template, so that you can choose to reuse some parameters, e.g. cores and memory, rather than building the header entirely by hand.
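For illustration, here is a minimal sketch of the difference between the two ideas; neither job_header nor job_header_template exists in dask-jobqueue, so the names and placeholders are purely hypothetical:

```python
# Hypothetical sketch only: neither keyword exists in dask-jobqueue.

# job_header: taken verbatim; every header-related Python argument
# (queue, project, resource_spec, job_extra, ...) would presumably be ignored.
job_header = (
    "#PBS -q regular\n"
    "#PBS -l select=1:ncpus=8:mem=16GB\n"
    "#PBS -l walltime=01:00:00"
)

# job_header_template: the user controls the shape of the directives but
# still reuses values such as cores/memory/walltime from the Python arguments.
job_header_template = (
    "#PBS -q regular\n"
    "#PBS -l select=1:ncpus={cores}:mem={memory}\n"
    "#PBS -l walltime={walltime}"
)
print(job_header_template.format(cores=8, memory="16GB", walltime="01:00:00"))
```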
Taking a step back: at the moment the job_script is built up from small pieces in each cluster class (with some logic in the base class as well), and it is sometimes quite hard to get an overall view of what the job_script is going to look like. Having a Jinja template (or something simpler if possible) was discussed at one point; maybe that's a good idea to investigate. It might give an overall picture of the job_script (including the job_header) built from the Python arguments, and reduce the need for an additional job_header argument.
Side-comment: IIRC for picky clusters you can use job_extra (by the way, job_extra and env_extra are not very good names ...). job_extra lines end up at the end of the header, and we have had reports where this was overriding the earlier header values. Note: this is the kind of edge case that will not work uniformly across cluster configurations and job queue schedulers.
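For example, a minimal sketch with PBSCluster (using the job_extra keyword as it exists at the time of writing; the values are only illustrative):

```python
from dask_jobqueue import PBSCluster

# job_extra lines are appended after the directives generated from the
# Python arguments, so a scheduler that honours the last occurrence of a
# directive effectively treats them as overrides.
cluster = PBSCluster(
    cores=8,
    memory="16GB",
    walltime="01:00:00",
    job_extra=["-l walltime=12:00:00"],  # lands at the bottom of the header
)
print(cluster.job_script())
```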
I think that implementing #133 and giving the ability to override the template would be a good solution.
I ran into problems with LSFCluster where it was injecting lines into the header that LSF would not accept. I wanted something like a job_remove= keyword. But this can probably be fixed by improving the LSFCluster class and removing some options that are, perhaps, too specific to the clusters of the people that maintain that Cluster class.
Additionally, the terms that the group used were very different from project and queue. They had other terms like partition. We started from a job submission script that we knew worked, and had to reverse engineer which Python parameters affected which job submission script outputs. It felt incorrect. But this can probably be fixed with better documentation.
One approach would be to allow users to explicitly skip lines that their cluster manager finds troublesome. I put this into a commit in the spec-rewrite branch for now (I need it for testing) here: https://github.com/dask/dask-jobqueue/pull/307/commits/172ecd43917f2f35a6d51d1fcb825267521425f1
What do people think of this approach?
My feeling is that currently there are (implicitly) two tiers of support for the job schedulers:
- support: SGE, PBS, SLURM
- best-effort support: all the others including LSF
In my mind, moving a scheduler to the "support" tier needs at least a docker-compose CI set-up. Another great thing to have is one (ideally more) active person with access to this job scheduler so that modifications can be tested on a real cluster.
Out of curiosity, what kind of lines did you want to be removed from the header?
To get things to work on Summit (Oak Ridge National Laboratory machine, currently #1 in the Top 500) I had to remove -R, -M and -n.
> One approach would be to allow users to explicitly skip lines that their cluster manager finds troublesome. I put this into a commit in the spec-rewrite branch for now (I need it for testing) here: 172ecd4 What do people think of this approach?
I have to say I don't like it. I really think we should provide a templated header or a templated job script, and give the possibility to replace it.
So we would have something like:
```python
header = """
#PBS -l walltime={walltime}
#PBS -l select=1:ncpus={nthreads}:mem={memory}
...
"""
```
And then rather than build up the job_header we just plug it in?
```python
def job_script(self):
    header = self.job_header.format(**self.header_options)
    ...
```
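For concreteness, filling that template in with illustrative values would look roughly like:

```python
# Purely illustrative values plugged into the template sketched above.
header_options = {"walltime": "01:00:00", "nthreads": 8, "memory": "16GB"}
print(header.format(**header_options))
```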
If so, I'm in favor
It might be tricky though. I think that there are some lines that are added only conditionally.
Yes, that's the idea.
Using Jinja2 should allow conditionals!
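For example, a minimal Jinja2 sketch (not the actual dask-jobqueue code; the names are only illustrative) where an explicit resource_spec replaces the default select line:

```python
from jinja2 import Template

# Illustrative only: if resource_spec is provided it replaces the default
# select line, otherwise that line is built from cores/memory.
header_template = Template(
    "#PBS -N {{ name }}\n"
    "{% if resource_spec %}"
    "#PBS -l {{ resource_spec }}\n"
    "{% else %}"
    "#PBS -l select=1:ncpus={{ cores }}:mem={{ memory }}\n"
    "{% endif %}"
    "#PBS -l walltime={{ walltime }}\n"
)

print(header_template.render(
    name="dask-worker", cores=8, memory="16GB", walltime="01:00:00"
))
```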
My general feeling is that LSFCluster is a bit rough around the edges. I would be inclined towards improving LSFCluster in the short term to do what you need, and leave the jinja2/templating option for later.
> To get things to work on Summit (Oak Ridge National Laboratory machine, currently #1 in the Top 500) I had to remove -R, -M and -n.
@mrocklin could you let us know what you replaced -R, -M and -n with. I am guessing you still need to specify somehow:
- the memory used by your job
- the number of cores used by your job
- that all of the allocated cores should be on the same node
> It might be tricky though. I think that there are some lines that are added only conditionally.
yes, for example setting resource_spec would replace your select=1... line I think.
I think at the moment there are a few different ways to get the job header to make the job scheduler happy, and it is not very consistent across job schedulers. For example, resource_spec does not exist in LSFCluster and SLURMCluster but exists in SGECluster and PBSCluster.
> Using Jinja2 should allow conditionals!
It would, but IMO the readability of the jinja2 template is also something important to keep in mind.
> @mrocklin could you let us know what you replaced -R, -M and -n with. I am guessing you still need to specify somehow:
Nothing. I suspect that it is defaulting to giving me one full node per job
> Nothing. I suspect that it is defaulting to giving me one full node per job
😄 Yeah, with Summit, there's no point using less than a full node!!
Closing this: header_skip kwarg is implemented, for the rest we should rely on a Jinja template.
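For reference, a rough usage sketch of that header_skip keyword (the LSF arguments here are purely illustrative):

```python
from dask_jobqueue import LSFCluster

# header_skip drops any generated header line matching these fragments,
# e.g. the -R, -M and -n directives that Summit's LSF rejects.
cluster = LSFCluster(
    cores=42,
    memory="512GB",
    header_skip=["-R", "-M", "-n"],
)
print(cluster.job_script())
```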