
lsf-drmaa forces `SUB_NOTIFY_END` when `output_path` is set unless the undocumented `prepand_report` configuration option is set

Open jrandall opened this issue 10 years ago • 5 comments

We have spent the past few days tracking down an issue in which LSF-DRMAA wasn't working on our cluster. In our cluster, bsub -N and bsub -B are disabled, and so trying to submit a job with SUB_NOTIFY_BEGIN or SUB_NOTIFY_END options set results in a failure.

We have finally tracked the issue down to lsf-drmaa. The hint was that we are able to submit jobs as long as output_path is not set. If, however, output_path is set (i.e. by calling setOutputPath()), then the job submissions fail.

The code at fault appears to be: https://github.com/PlatformLSF/lsf-drmaa/blob/master/lsf_drmaa/job.c#L833-841

I'm not sure what the intention of this code block is, but the end result is that, if the conditions are met, it sets the SUB_NOTIFY_END option, even if DRMAA_BLOCK_EMAIL is set and the preceding block (https://github.com/PlatformLSF/lsf-drmaa/blob/master/lsf_drmaa/job.c#L803-831) has just made sure that SUB_NOTIFY_END is not set!

The if condition is true when the session's prepand_report_to_output boolean is false, the SUB_NOTIFY_END option is not set, and output_path is not null:

    if( !((lsfdrmaa_session_t*)session)->prepand_report_to_output
            &&  (req->options & SUB_NOTIFY_END) == 0
            &&  output_path != NULL )

The session variable prepand_report_to_output is set from the configuration file in session.c and it defaults to false. It is set from lsf_drmaa.conf by the (undocumented) prepand_report configuration directive.

We are able to work around our issue by setting: prepand_report: 1 in lsf_drmaa.conf
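To make the interaction easier to see, here is a minimal, self-contained sketch of the two blocks in their current order. It is a model of the behaviour described above, not the actual job.c code: `mock_session_t`, `mock_build_options`, and the `MOCK_SUB_NOTIFY_*` bit values are hypothetical stand-ins, and the assumption that the email-blocking code simply clears the notification bits is mine.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the LSF submit option bits. The real
 * definitions live in the LSF headers and are not reproduced here. */
#define MOCK_SUB_NOTIFY_BEGIN 0x1
#define MOCK_SUB_NOTIFY_END   0x2

/* Simplified model of the session: only the flag discussed above. */
typedef struct {
    int prepand_report_to_output; /* 0 by default, 1 with "prepand_report: 1" */
} mock_session_t;

/* Returns the option bits after both blocks have run in their current order. */
int mock_build_options(const mock_session_t *session,
                       int block_email,
                       const char *output_path)
{
    int options = 0;

    /* First block (assumed behaviour): when email is blocked, make sure
     * neither notification bit is set. */
    if (block_email)
        options &= ~(MOCK_SUB_NOTIFY_BEGIN | MOCK_SUB_NOTIFY_END);

    /* Second block: same condition as quoted in this issue. Because it runs
     * last, it can set the END bit again after it was cleared. */
    if (!session->prepand_report_to_output
            && (options & MOCK_SUB_NOTIFY_END) == 0
            && output_path != NULL)
        options |= MOCK_SUB_NOTIFY_END;

    return options;
}
```

In this model, calling `mock_build_options` with the default session (flag 0), email blocked, and a non-NULL output path returns a value with the END bit set, which matches the failure we see. With the flag set to 1 (the workaround), the second block is skipped and the END bit stays clear.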

I don't know what prepand_report is meant to do (is that a misspelling of "prepend" or is it meant to be "prep_and_report"?) and cannot find any documentation on it, but the default behaviour of lsf-drmaa in which SUB_NOTIFY_END is forced to be set even when email is specifically blocked doesn't seem right.

jrandall, Mar 22 '15 18:03

Were you able to resolve this issue internally?

adamsla, May 07 '18 16:05

We continue to employ the workaround, which is to set prepand_report: 1 in lsf_drmaa.conf

jrandall, May 30 '18 16:05

Possible fixes for this issue:

  • document what prepand_report does and note that it is required for clusters with notifications disabled
  • fix job.c so that SUB_NOTIFY_END is never set when DRMAA_BLOCK_EMAIL is set (N.B. I believe this can be accomplished simply by moving the block at https://github.com/PlatformLSF/lsf-drmaa/blob/master/lsf_drmaa/job.c#L833-841 to https://github.com/IBMSpectrumComputing/lsf-drmaa/blob/master/lsf_drmaa/job.c#L802).
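
The reordering in the second fix can be sketched as follows. This is a simplified model, not a patch against job.c: `sketch_session_t`, `sketch_build_options_reordered`, and the `SKETCH_NOTIFY_*` bit values are hypothetical, and it assumes the email-blocking code clears the notification bits.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the LSF submit option bits. */
#define SKETCH_NOTIFY_BEGIN 0x1
#define SKETCH_NOTIFY_END   0x2

/* Simplified model of the session: only the prepand_report flag. */
typedef struct {
    int prepand_report_to_output; /* 0 by default */
} sketch_session_t;

/* Returns the option bits with the output_path block moved ahead of the
 * email-blocking block, as proposed above. */
int sketch_build_options_reordered(const sketch_session_t *session,
                                   int block_email,
                                   const char *output_path)
{
    int options = 0;

    /* Moved block: may still force the END bit when output_path is set. */
    if (!session->prepand_report_to_output
            && (options & SKETCH_NOTIFY_END) == 0
            && output_path != NULL)
        options |= SKETCH_NOTIFY_END;

    /* Email-blocking block (assumed behaviour) now runs last, so its
     * clearing of the notification bits is final. */
    if (block_email)
        options &= ~(SKETCH_NOTIFY_BEGIN | SKETCH_NOTIFY_END);

    return options;
}
```

Under these assumptions, a job with email blocked ends up with no notification bits regardless of output_path. A job that does not block email would still get the END bit when output_path is set, so on a cluster with notifications disabled it would still need either DRMAA_BLOCK_EMAIL or the prepand_report workaround.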

jrandall, May 30 '18 16:05

Still running into this issue 7 years later (a later version of the cluster with the same no-emails policy). Would you accept a pull request to implement one of Josh's fixes?

mp15, Jun 02 '22 18:06

I would say yes. That's why, before I left IBM, we placed these APIs on GitHub. Somebody has to create the pull request though.

TheWitness, Jun 04 '22 13:06