drmaa-python

drmaa communication exception/errors while running on SLURM

Open radaniba opened this issue 6 years ago • 0 comments

Hello everyone, I am trying to run some jobs on a SLURM cluster and I am running into this error:

drmaa.errors.DrmCommunicationException: code 2: unable to send message to qmaster using port 6444 on host xxxx 

The code I am using as a test before running any real jobs is:

from __future__ import print_function
import os
import drmaa

LOGS = "logs/"
if not os.path.isdir(LOGS):
    os.mkdir(LOGS)

s = drmaa.Session()
s.initialize()
print("Supported contact strings:", s.contact)
print("Supported DRM systems:", s.drmsInfo)
print("Supported DRMAA implementations:", s.drmaaImplementation)
print("Version", s.version)

jt = s.createJobTemplate()
jt.remoteCommand = "/usr/bin/echo"
jt.args = ["Hello", "world"]
jt.jobName = "testdrmaa"
jt.jobEnvironment = os.environ.copy()
jt.workingDirectory = os.getcwd()

jt.outputPath = ":" + os.path.join(LOGS, "job-%A_%a.out")
jt.errorPath = ":" + os.path.join(LOGS, "job-%A_%a.err")

jt.nativeSpecification = "--ntasks=2 --mem-per-cpu=50 --partition=1day"

print("Submitting", jt.remoteCommand, "with", jt.args, "and logs to", jt.outputPath)
ids = s.runBulkJobs(jt, beginIndex=1, endIndex=10, step=1)
print("Job submitted with ids", ids)

# Release the job template and close the DRMAA session.
s.deleteJobTemplate(jt)
s.exit()

I found this script useful; it was posted by a member of the community here.

Can anyone tell me whether we need to set up any environment variables before running jobs on SLURM?
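
For context, drmaa-python looks up the native library through the DRMAA_LIBRARY_PATH environment variable, and "qmaster" on port 6444 is Grid Engine terminology, so it may be worth checking which libdrmaa is actually being picked up. Below is a minimal sketch of that check. It assumes nothing about where a SLURM DRMAA library lives on this cluster, and the helper name describe_drmaa_library is made up for illustration; it is not part of drmaa-python.

```python
from __future__ import print_function
import os


def describe_drmaa_library(environ=None):
    """Describe what DRMAA_LIBRARY_PATH currently points to.

    Hypothetical helper for illustration only; it inspects the
    environment and the filesystem and never loads the library.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("DRMAA_LIBRARY_PATH")
    if not path:
        return "DRMAA_LIBRARY_PATH is not set"
    if not os.path.isfile(path):
        return "DRMAA_LIBRARY_PATH points to a missing file: " + path
    # Resolve symlinks so the real library file is visible.
    return "DRMAA_LIBRARY_PATH points to: " + os.path.realpath(path)


print(describe_drmaa_library())
```

If the reported path belongs to a Grid Engine installation rather than a SLURM DRMAA build, that would be consistent with the qmaster message above, though this is only a guess from the error text.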

Thanks

radaniba, Feb 19 '19 19:02