looper .sub file environment variables do not have a value

I saw that divvy is being used to generate the .sub files for a looper job submission. However, I could not easily find anything in the vignettes describing how to set the environment variables such as {MEM} and {CORES}. It would be nice if these variables were set to a default value without any configuration, or if there were extra description in the vignettes on how to set them.

May 27 '20 17:05 aaron-gu

How are you trying to use them? With divvy, you set them with -c mem=8000 cores=1, for example.

http://divvy.databio.org/en/latest/cli/

May 27 '20 17:05 nsheff

I am just using looper run project_config.yaml

May 27 '20 17:05 aaron-gu

OK. looper should default to using the localhost template, which doesn't have those variables, so that doesn't make sense to me. Can you be more specific about what you're trying to do? Also, try the above.

May 27 '20 17:05 nsheff

I set up a PEP project for my bedshift code to generate the 100 samples for every parameter combination. I followed the PEP and looper tutorials pretty smoothly until it came to running the looper job, where I got the error sbatch: error: invalid memory constraint {MEM}

Also, I'm not sure how to run the divvy command with looper, since there are many .sub files generated.

May 27 '20 18:05 aaron-gu

Here's an example of a .sub file:

#!/bin/bash
#SBATCH --job-name='bedshift_run_add1'
#SBATCH --output='looper_output/submission/bedshift_run_add1.log'
#SBATCH --mem='{MEM}'
#SBATCH --cpus-per-task='{CORES}'
#SBATCH --time='{TIME}'
#SBATCH --partition='standard'
#SBATCH -m block
#SBATCH --ntasks=1
#SBATCH --open-mode=append

echo 'Compute node:' `hostname`
echo 'Start time:' `date +'%Y-%m-%d %T'`

cmd="/sfs/qumulo/qhome/ag5ym/databio/bedshift_paper/pep_project/bedshift.sh /project/shefflab/resources/regions/LOLACore/hg19/encode_tfbs/regions/wgEncodeAwgTfbsUwHek293CtcfUniPk.narrowPeak 0.1 0.0 0.0 100 "

y=`echo "$cmd" | sed -e 's/^/srun /'`
eval "$y"

May 27 '20 18:05 aaron-gu
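
A quick way to catch this before submitting is to search the generated .sub files for placeholders that were never filled in. This is only a sketch: the function name find_unfilled is made up for illustration, and the pattern assumes placeholders are uppercase names in braces, as in the file above.

```shell
# Hypothetical helper (not part of looper or divvy): print every line
# that still contains an unfilled {UPPERCASE} placeholder.
find_unfilled() {
  # -n prints line numbers; -E enables extended regex, where \{ is a literal brace
  grep -nE '\{[A-Z_]+\}' "$@"
}
```

For example, find_unfilled looper_output/submission/*.sub, assuming the .sub files are written next to the logs. Any output means at least one template variable has no value yet.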

Ah, I see. You're on rivanna, where we set the looper default to submit jobs to slurm.

There are several things you can do:

To test, try using looper --package to run with a local template. divvy list shows the available templates.
If you want to use the slurm template, then you must provide all the variables for that template. You can do it as I mentioned above: looper run -c cores=1 mem=4000
Ideally, your pipeline interface should provide these variables. You do this in the compute section: http://looper.databio.org/en/latest/pipeline-interface-specification/#compute

you could just add this to your interface:

compute:
  mem: 4000
  cores: 1

May 27 '20 18:05 nsheff
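
To see why every variable in the template needs a value, here is a minimal sketch of placeholder filling. It is not how divvy is implemented. The function name fill_template is hypothetical, and the only thing taken from this thread is the convention that a lowercase key such as mem fills the uppercase placeholder {MEM}.

```shell
# Hypothetical illustration only: fill {NAME} placeholders in a template
# file from name=value arguments, leaving unknown placeholders untouched.
fill_template() {
  template=$1
  shift
  content=$(cat "$template")
  for pair in "$@"; do
    # mem=4000 -> name MEM, value 4000
    name=$(printf '%s' "${pair%%=*}" | tr '[:lower:]' '[:upper:]')
    value=${pair#*=}
    # In a basic regex, an unescaped { is a literal brace
    content=$(printf '%s' "$content" | sed "s|{$name}|$value|g")
  done
  printf '%s\n' "$content"
}
```

Calling fill_template on a slurm-style template with mem=4000 cores=1 fills {MEM} and {CORES} but leaves {TIME} as literal text. The .sub file earlier in this thread also contains {TIME}, so it may need a value as well, depending on the template.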

Got it, thanks! Is there a way to make it easier to find that section of documentation? The order I went through the docs was Introduction > Defining a Project > Running on a cluster, and then I followed the links to divvy to try to solve the issue.

May 27 '20 18:05 aaron-gu

I added a bit of clarification on this to the docs for the upcoming release.

Jun 06 '24 13:06 donaldcampbelljr

Solved with v1.8.1 Release.

Jun 06 '24 14:06 donaldcampbelljr