lsf-drmaa forces `SUB_NOTIFY_END` when `output_path` is set unless the undocumented `prepand_report` configuration option is set
We have spent the past few days tracking down an issue in which LSF-DRMAA wasn't working on our cluster. In our cluster, bsub -N and bsub -B are disabled, and so trying to submit a job with SUB_NOTIFY_BEGIN or SUB_NOTIFY_END options set results in a failure.
We have finally tracked the issue down to lsf-drmaa. The hint was that we are able to submit jobs as long as output_path is not set. If, however, output_path is set (i.e. by calling setOutputPath()), then the job submissions fail.
The code at fault appears to be: https://github.com/PlatformLSF/lsf-drmaa/blob/master/lsf_drmaa/job.c#L833-841
I'm not sure what the intention of this code block is, but the end result is that, if the conditions are met, it sets the SUB_NOTIFY_END option, even if DRMAA_BLOCK_EMAIL is set and the preceding block (https://github.com/PlatformLSF/lsf-drmaa/blob/master/lsf_drmaa/job.c#L803-831) has already made sure that SUB_NOTIFY_END is not set!
The if condition basically says if the session's prepand_report_to_output boolean is false, and the SUB_NOTIFY_END option is not set, and the output_path is not null:
if( !((lsfdrmaa_session_t*)session)->prepand_report_to_output
&& (req->options & SUB_NOTIFY_END) == 0
&& output_path != NULL )
The session variable prepand_report_to_output is set from the configuration file in session.c and it defaults to false. It is set from lsf_drmaa.conf by the (undocumented) prepand_report configuration directive.
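To make the condition easier to reason about, here is a minimal standalone sketch of it as a truth function. This is not code from lsf-drmaa: `forces_notify_end` is a hypothetical helper, and `DEMO_SUB_NOTIFY_END` is a placeholder bit standing in for the real `SUB_NOTIFY_END` from the LSF headers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit for illustration only; the real SUB_NOTIFY_END
 * value is defined by the LSF headers and is not reproduced here. */
#define DEMO_SUB_NOTIFY_END 0x1

/* Hypothetical helper mirroring the quoted if-condition: returns true
 * when the block would go on to force the notify-at-end option. */
static bool forces_notify_end(bool prepand_report_to_output,
                              int options,
                              const char *output_path)
{
    return !prepand_report_to_output
        && (options & DEMO_SUB_NOTIFY_END) == 0
        && output_path != NULL;
}
```

With the default configuration (`prepand_report_to_output` false) and an output path set, the function returns true, which matches the failure we see. Setting `prepand_report_to_output` to true makes the first clause false, so the option is never forced.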
We are able to work around our issue by setting:
prepand_report: 1 in lsf_drmaa.conf
I don't know what prepand_report is meant to do (is that a misspelling of "prepend" or is it meant to be "prep_and_report"?) and cannot find any documentation on it, but the default behaviour of lsf-drmaa in which SUB_NOTIFY_END is forced to be set even when email is specifically blocked doesn't seem right.
Were you able to resolve this issue internally?
We continue to employ the workaround, which is to set prepand_report: 1 in lsf_drmaa.conf
Possible fixes for this issue:
- document what `prepand_report` does and note that it is required for clusters with notifications disabled
- fix `job.c` so that `SUB_NOTIFY_END` is never set when `DRMAA_BLOCK_EMAIL` is set (N.B. I believe this can be accomplished simply by moving the block at https://github.com/PlatformLSF/lsf-drmaa/blob/master/lsf_drmaa/job.c#L833-841 to https://github.com/IBMSpectrumComputing/lsf-drmaa/blob/master/lsf_drmaa/job.c#L802).
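As a rough illustration of why reordering should be enough, the sketch below models the two blocks as small functions and compares the two orderings. It is a simplified model, not the actual `job.c` code: all function names and `MODEL_SUB_NOTIFY_END` are hypothetical, and the email-block step is assumed to do nothing more than clear the notify bit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit for illustration; not the real LSF constant. */
#define MODEL_SUB_NOTIFY_END 0x1

/* Assumed model of the email-block step: clears the notify bit
 * when email is blocked, otherwise leaves the options alone. */
static int apply_email_block(int options, bool block_email)
{
    return block_email ? (options & ~MODEL_SUB_NOTIFY_END) : options;
}

/* Model of the output_path step: forces the notify bit when the
 * quoted if-condition holds. */
static int apply_output_path_rule(int options, bool prepand_report_to_output,
                                  const char *output_path)
{
    if (!prepand_report_to_output
        && (options & MODEL_SUB_NOTIFY_END) == 0
        && output_path != NULL)
        options |= MODEL_SUB_NOTIFY_END;
    return options;
}

/* Current order: email block first, then the output_path rule. */
static int current_order(int options, bool block_email,
                         bool prepand_report_to_output, const char *output_path)
{
    options = apply_email_block(options, block_email);
    return apply_output_path_rule(options, prepand_report_to_output, output_path);
}

/* Proposed order: output_path rule first, then the email block. */
static int proposed_order(int options, bool block_email,
                          bool prepand_report_to_output, const char *output_path)
{
    options = apply_output_path_rule(options, prepand_report_to_output, output_path);
    return apply_email_block(options, block_email);
}
```

In this model, with email blocked and an output path set, `current_order` leaves the notify bit set while `proposed_order` leaves it clear. When email is not blocked, both orderings behave the same.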
Still running into this issue 7 years later (later version of cluster with the same no-emails policy). Would you accept a pull request to implement one of Josh's fixes?
I would say yes. That's why, before I left IBM, we placed these APIs on GitHub. Somebody has to create the pull request though.