DRAM icon indicating copy to clipboard operation
DRAM copied to clipboard

Submit tasks as job arrays and fix RNAs in distill summaries

Open tpall opened this issue 2 months ago • 1 comments

Changes

  • Added array_size = 10 parameter to nextflow.config and array to conf/base.config for more efficient cluster execution.
  • Fix inclusion of rRNA, tRNA, and quast summaries to genome_stats.tsv and metabolism_summary.xlsx in bin/distill.py script.
  • Refactor channel usage (Channel to channel) for consistency across workflows and improve readability + usage of implicit variable within closures (e.g. it.name to it -> it.name)

Computing environment and command

  • nextflow version 25.10.0.10289
  • openjdk 22.0.1-internal 2024-04-16
  • singularity 3.8.5
  • slurm
  • x86_64 GNU/Linux
nextflow run tpall/DRAM -r dev --input_fasta ./DRAM/input_fasta --outdir ./DRAM/call-annotate-distill --threads 8 --summarize --qc --use_kofam --use_dbcan --use_merops --use_viral --use_methyl --use_sulfur -profile singularity --slurm --partition main -with-report -with-trace -with-timeline --array_size 10 --queue_size 10 -resume --annotate 

tpall avatar Nov 27 '25 10:11 tpall

Hey @tpall Thanks for this. The job array addition it nice. There is a larger planned update to batch a lot of the inputs into singular jobs to reduce the burden on the queue, since running DRAM with lots of inputs can overwhelm a SLURM scheduler, but adding in job arrays, which weren't supported on the version of Nextflow we initially developed DRAM2 on, but we recently moved to >=24 (we should lock in >=24.04.0 since there are early 24.* prereleases out there if we add in job arrays). I will have to do some testing on utilizing job arrays and their implication. Because from my initial testing it seems like it stops the next stop from proceeding until their are enough inputs to fill an array. Which might be ok. But if we are going to be doing batching anyway, it might not be that important and not worth it.

Also thanks for some of the other QoL updates like updating some of the syntax to DSL2 (Channel -> channel, etc.).

I will have to more fully review the code, which I can get to in a couple weeks. I have deadline for next week, and probably won't be able to review much before then.

But I will leave just a couple of quick thoughts.

Thanks again

madeline-scyphers avatar Dec 01 '25 22:12 madeline-scyphers