[question] Errors running bactopia singularity container in slurm cluster (ComputeCanada)
Hello
I have been trying to run the Bactopia pipeline from the Singularity container pulled from quay.io (Singularity is supported on my cluster).
I managed to download the datasets, but only by running the container with a clean environment (singularity exec -e ...).
My first attempt at the main Bactopia pipeline failed because it could not execute singularity from inside the container to pull the necessary tool images from the registry. After failing to work around that, I pulled the images outside the pipeline. The file-of-filenames (FOFN) check did not report any issues.
Now I am trying to get the main pipeline to run again, and there seems to be an error when submitting jobs.
#!/bin/bash
#SBATCH --account=XXXXXXXXX
#SBATCH --mem-per-cpu=4G # GB of memory per cpu core
#SBATCH --time=01:00:00
#SBATCH --ntasks=4 # tasks in parallel / number of cores
#SBATCH --cpus-per-task=1 # number of cores per task
#SBATCH --job-name="main_workflow_bactopia"
#SBATCH --chdir=/scratch/mdprieto/
#SBATCH --output=./temp_results/%j_main_bactopia_nov26.out
################################## preparation #########################################
# load singularity
module load singularity
module load nextflow
# mount my filesystem inside container, localscratch allows job to use compute node temp folder
BIND_MOUNT="-B /home,/project,/scratch,/localscratch,/localscratch:/temp,/opt,/cvmfs"
# git directory with input variables
kleb_git="/home/mdprieto/git/klebsiella/input"
# make output directory if necessary
mkdir -p /scratch/mdprieto/temp_results/bactopia_output/
# define new temp folders for singularity
mkdir -p /scratch/$USER/singularity/tmp
export SINGULARITY_CACHEDIR="/project/6007413/cidgoh_share/singularity_imgs"
export SINGULARITY_TMPDIR="/scratch/$USER/singularity/tmp"
# append the host PATH to the container's PATH so singularity can be found inside it
export SINGULARITYENV_APPEND_PATH=$PATH
################################## BACTOPIA #########################################
singularity exec -e $BIND_MOUNT bactopia_2.1.1.sif bactopia \
--samples $kleb_git/kleb_qatar_fofn.txt \
--datasets /scratch/mdprieto/datasets \
--outdir /scratch/mdprieto/temp_results/bactopia_output/ \
--species "Klebsiella pneumoniae" \
--genome_size median \
--singularity_cache $SINGULARITY_CACHEDIR \
--max_cpus 2 \
--verbose \
-profile slurm,singularity \
-resume
I have also tried changing -profile to slurm or singularity alone, with the same result, and running with more memory just in case.
The error is that submission of BACTOPIA:GATHER_SAMPLES fails repeatedly. I am not that knowledgeable in Nextflow yet, so I am struggling to troubleshoot. Any suggestions are welcome.
Thanks
2022-11-27 13:32:48:root:STDERR -
2022-11-27 13:32:48:root:INFO - Checking if environment pre-builds are needed
2022-11-27 13:32:48:root:DEBUG - Working on bactopia
2022-11-27 13:32:48:root:INFO - Found Singularity images in /project/6007413/cidgoh_share/singularity_imgs, if a complete rebuild is needed please use --force_rebuild
2022-11-27 13:32:48:root:DEBUG - Existing image (/project/6007413/cidgoh_share/singularity_imgs/quay.io-bactopia-annotate_genome-2.1.1.img) found, skipping unless --force is used
2022-11-27 13:32:48:root:DEBUG - Existing image (/project/6007413/cidgoh_share/singularity_imgs/quay.io-bactopia-assemble_genome-2.1.1.img) found, skipping unless --force is used
2022-11-27 13:32:48:root:DEBUG - Existing image (/project/6007413/cidgoh_share/singularity_imgs/quay.io-bactopia-assembly_qc-2.1.1.img) found, skipping unless --force is used
2022-11-27 13:32:49:root:DEBUG - Existing image (/project/6007413/cidgoh_share/singularity_imgs/quay.io-bactopia-call_variants-2.1.1.img) found, skipping unless --force is used
2022-11-27 13:32:49:root:DEBUG - Existing image (/project/6007413/cidgoh_share/singularity_imgs/quay.io-bactopia-gather_samples-2.1.1.img) found, skipping unless --force is used
2022-11-27 13:32:49:root:DEBUG - Existing image (/project/6007413/cidgoh_share/singularity_imgs/quay.io-bactopia-minmers-2.1.1.img) found, skipping unless --force is used
2022-11-27 13:32:49:executor.process:DEBUG - Executing external command: bash -c 'date > /project/6007413/cidgoh_share/singularity_imgs/quay.io-images-built-2.1.1.txt'
2022-11-27 13:32:49:executor.process:DEBUG - Constructing subprocess.Popen object ..
2022-11-27 13:32:49:executor.process:DEBUG - Joining synchronous process using subprocess.Popen.communicate() ..
2022-11-27 13:32:50:executor.process:DEBUG - Got return code 0 from synchronous process (bash -c 'date > /project/6007413/cidgoh_share/singularity_imgs/quay.io-images-built-2.1.1.txt').
2022-11-27 13:32:50:root:STDOUT -
2022-11-27 13:32:50:root:STDERR -
2022-11-27 13:32:50:root:DEBUG - Working on bactopia
2022-11-27 13:32:50:root:DEBUG - Found Singularity image /project/6007413/cidgoh_share/singularity_imgs/depot.galaxyproject.org-singularity-multiqc-1.11--pyhdfd78af_0.img, if a complete rebuild is needed please use --force_rebuild
2022-11-27 13:32:50:root:DEBUG - Working on bactopia
2022-11-27 13:32:50:root:DEBUG - Found Singularity image /project/6007413/cidgoh_share/singularity_imgs/depot.galaxyproject.org-singularity-csvtk-0.23.0--h9ee0642_0.img, if a complete rebuild is needed please use --force_rebuild
N E X T F L O W ~ version 22.04.0
Launching `/usr/local/share/bactopia-2.1.x/main.nf` [insane_wescoff] DSL2 - revision: 145bb11899
---------------------------------------------
_ _ _
| |__ __ _ ___| |_ ___ _ __ (_) __ _
| '_ \ / _` |/ __| __/ _ \| '_ \| |/ _` |
| |_) | (_| | (__| || (_) | |_) | | (_| |
|_.__/ \__,_|\___|\__\___/| .__/|_|\__,_|
|_|
bactopia v2.1.1
Bactopia is a flexible pipeline for complete analysis of bacterial genomes.
---------------------------------------------
Core Nextflow options
runName : insane_wescoff
containerEngine : singularity
container : quay.io/bactopia/bactopia:2.1.1
launchDir : /scratch/mdprieto
workDir : /scratch/mdprieto/work
projectDir : /usr/local/share/bactopia-2.1.x
userName : mdprieto
profile : slurm,singularity
configFiles : /usr/local/share/bactopia-2.1.x/nextflow.config
Required Parameters
samples : /home/mdprieto/git/klebsiella_Qatar_2022/input/kleb_qatar_fofn.txt
Dataset Parameters
datasets : /scratch/mdprieto/datasets
species : Klebsiella pneumoniae
genome_size : median
Optional Parameters
outdir : /scratch/mdprieto/temp_results/bactopia_output/
Max Job Request Parameters
max_cpus : 2
Nextflow Profile Parameters
condadir : /usr/local/share/bactopia-2.1.x/conda/envs
registry : quay
singularity_cache: /project/6007413/cidgoh_share/singularity_imgs
!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------
If you use bactopia for your analysis please cite:
* Bactopia
https://doi.org/10.1128/mSystems.00190-20
* The nf-core framework
https://doi.org/10.1038/s41587-020-0439-x
* Software dependencies
https://bactopia.github.io/acknowledgements/
--------------------------------------------------------------------
Found 1 Antimicrobial resistance datasets
/scratch/mdprieto/datasets/antimicrobial-resistance/amrfinderdb.tar.gz
Found 4 minmer sketches/signatures
/scratch/mdprieto/datasets/minmer/mash-refseq-k21.msh
/scratch/mdprieto/datasets/minmer/sourmash-genbank-k21.json.gz
/scratch/mdprieto/datasets/minmer/sourmash-genbank-k31.json.gz
/scratch/mdprieto/datasets/minmer/sourmash-genbank-k51.json.gz
Found Prokka proteins file
/scratch/mdprieto/datasets/species-specific/klebsiella-pneumoniae/annotation/klebsiella-pneumoniae.faa
Found Mash Sketch of auto variant calling
/scratch/mdprieto/datasets/species-specific/klebsiella-pneumoniae/minmer/refseq-genomes.msh
Found 1 MLST datasets
/scratch/mdprieto/datasets/species-specific/klebsiella-pneumoniae/mlst/default.tar.gz
Found 1 reference genomes
/scratch/mdprieto/datasets/species-specific/klebsiella-pneumoniae/minmer/refseq-genomes.msh
Will use 5650579 bp for genome size
If something looks wrong, now's your chance to back out (CTRL+C 3 times).
Sleeping for 5 seconds...
--------------------------------------------------------------------
[- ] process > BACTOPIA:GATHER_SAMPLES -
...TRUNCATED ...
(CP19_S19_L001)' for execution -- Execution is retried (1)
slurmstepd: error: *** JOB 51447211 ON cdr535 CANCELLED AT 2022-11-27T14:33:02 DUE TO TIME LIMIT ***
[- ] process > BACTOPIA:GATHER_SAMPLES -
[- ] process > BACTOPIA:GATHER_SAMPLES -
[- ] process > BACTOPIA:QC_READS -
[- ] process > BACTOPIA:ASSEMBLE_GENOME -
[- ] process > BACTOPIA:ASSEMBLY_QC -
[- ] process > BACTOPIA:ANNOTATE_GENOME -
[- ] process > BACTOPIA:MINMER_SKETCH -
[- ] process > BACTOPIA:ANTIMICROBIAL_RESI... -
[- ] process > BACTOPIA:MINMER_QUERY -
[- ] process > BACTOPIA:BLAST -
[- ] process > BACTOPIA:CALL_VARIANTS -
[- ] process > BACTOPIA:MAPPING_QUERY -
[- ] process > BACTOPIA:SEQUENCE_TYPE -
[- ] process > BACTOPIA:CUSTOM_DUMPSOFTWAR... -
[- ] process > BACTOPIA:GATHER_SAMPLES [ 0%] 0 of 4
[- ] process > BACTOPIA:QC_READS -
[- ] process > BACTOPIA:ASSEMBLE_GENOME -
[- ] process > BACTOPIA:ASSEMBLY_QC -
[- ] process > BACTOPIA:ANNOTATE_GENOME -
[- ] process > BACTOPIA:MINMER_SKETCH -
[- ] process > BACTOPIA:ANTIMICROBIAL_RESI... -
[- ] process > BACTOPIA:MINMER_QUERY -
[- ] process > BACTOPIA:BLAST -
[- ] process > BACTOPIA:CALL_VARIANTS -
[- ] process > BACTOPIA:MAPPING_QUERY -
[- ] process > BACTOPIA:SEQUENCE_TYPE -
[- ] process > BACTOPIA:CUSTOM_DUMPSOFTWAR... -
[66/237942] process > BACTOPIA:GATHER_SAMPLES (C1... [ 22%] 2 of 9, failed: 2...
[- ] process > BACTOPIA:QC_READS -
[- ] process > BACTOPIA:ASSEMBLE_GENOME -
[- ] process > BACTOPIA:ASSEMBLY_QC -
[- ] process > BACTOPIA:ANNOTATE_GENOME -
[- ] process > BACTOPIA:MINMER_SKETCH -
[- ] process > BACTOPIA:ANTIMICROBIAL_RESI... -
[- ] process > BACTOPIA:MINMER_QUERY -
[- ] process > BACTOPIA:BLAST -
[- ] process > BACTOPIA:CALL_VARIANTS -
[- ] process > BACTOPIA:MAPPING_QUERY -
[- ] process > BACTOPIA:SEQUENCE_TYPE -
[- ] process > BACTOPIA:CUSTOM_DUMPSOFTWAR... -
[50/00f728] NOTE: Error submitting process 'BACTOPIA:GATHER_SAMPLES (C12_S22_L001)' for execution -- Execution is retried (1)
[66/237942] NOTE: Error submitting process 'BACTOPIA:GATHER_SAMPLES (C19_S18_L001)' for execution -- Execution is retried (1)
[37/f4ee7c] process > BACTOPIA:GATHER_SAMPLES (CP... [ 17%] 3 of 17, failed: ...
[- ] process > BACTOPIA:QC_READS -
[- ] process > BACTOPIA:ASSEMBLE_GENOME -
[- ] process > BACTOPIA:ASSEMBLY_QC -
[- ] process > BACTOPIA:ANNOTATE_GENOME -
[- ] process > BACTOPIA:MINMER_SKETCH -
[- ] process > BACTOPIA:ANTIMICROBIAL_RESI... -
[- ] process > BACTOPIA:MINMER_QUERY -
[- ] process > BACTOPIA:BLAST -
[- ] process > BACTOPIA:CALL_VARIANTS -
[- ] process > BACTOPIA:MAPPING_QUERY -
[- ] process > BACTOPIA:SEQUENCE_TYPE -
[- ] process > BACTOPIA:CUSTOM_DUMPSOFTWAR... -
[50/00f728] NOTE: Error submitting process 'BACTOPIA:GATHER_SAMPLES (C12_S22_L001)' for execution -- Execution is retried (1)
[66/237942] NOTE: Error submitting process 'BACTOPIA:GATHER_SAMPLES (C19_S18_L001)' for execution -- Execution is retried (1)
[37/f4ee7c] NOTE: Error submitting process 'BACTOPIA:GATHER_SAMPLES (CP19_S19_L001)' for execution -- Execution is retried (1)
[37/f4ee7c] process > BACTOPIA:GATHER_SAMPLES (CP... [ 13%] 3 of 23, failed: ...
[- ] process > BACTOPIA:QC_READS -
[- ] process > BACTOPIA:ASSEMBLE_GENOME -
[- ] process > BACTOPIA:ASSEMBLY_QC -
[- ] process > BACTOPIA:ANNOTATE_GENOME -
[- ] process > BACTOPIA:MINMER_SKETCH -
[- ] process > BACTOPIA:ANTIMICROBIAL_RESI... -
[- ] process > BACTOPIA:MINMER_QUERY -
[- ] process > BACTOPIA:BLAST -
[- ] process > BACTOPIA:CALL_VARIANTS -
[- ] process > BACTOPIA:MAPPING_QUERY -
[- ] process > BACTOPIA:SEQUENCE_TYPE -
[- ] process > BACTOPIA:CUSTOM_DUMPSOFTWAR... -
[50/00f728] NOTE: Error submitting process 'BACTOPIA:GATHER_SAMPLES (C12_S22_L001)' for execution -- Execution is retried (1)
[66/237942] NOTE: Error submitting process 'BACTOPIA:GATHER_SAMPLES (C19_S18_L001)' for execution -- Execution is retried (1)
[37/f4ee7c] NOTE: Error submitting process 'BACTOPIA:GATHER_SAMPLES (CP19_S19_L001)' for execution -- Execution is retried (1)
@azmigueldario let's see if we can get this figured out! I think our issue might be singularity within singularity here.
By any chance, can you install Bactopia through Conda? We will not use Conda for the tools themselves, only to install Bactopia so that Nextflow can handle the job submissions to your SLURM cluster.
Here's what I'm thinking:
# Head Node
bactopia -profile test,slurm \
--slurm_opts="--account=XXXXXXXXX" \
--slurm_queue "YOUR_QUEUE_NAME"
# Nextflow then submits jobs to the SLURM cluster
If you are interested, I think it's worth considering the creation of a profile config file. Here's an example of one I use for a cluster here in Wyoming: https://github.com/bactopia/bactopia/blob/master/conf/profiles/arcc.config
This allows me to just add -profile arcc and Nextflow handles all the job submissions to the cluster.
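For reference, a site-specific profile is just a named block inside a Nextflow config file. The following is a minimal sketch, not a copy of arcc.config: the profile name mycluster, the file name, and the placeholder queue, account, and cache path are all assumptions to be replaced with values for your cluster. The settings themselves (process.executor, process.queue, process.clusterOptions, and the singularity scope) are standard Nextflow configuration options.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical location and name for the custom config file.
CONFIG_FILE="${TMPDIR:-/tmp}/mycluster.config"

# Write a Nextflow config that defines one profile called "mycluster".
# The quoted 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > "$CONFIG_FILE" <<'EOF'
profiles {
    mycluster {
        process {
            executor       = 'slurm'
            queue          = 'YOUR_QUEUE_NAME'
            clusterOptions = '--account=XXXXXXXXX'
        }
        singularity {
            enabled    = true
            autoMounts = true
            cacheDir   = '/path/to/singularity_imgs'
        }
    }
}
EOF

echo "Wrote $CONFIG_FILE"
```

With plain Nextflow, a file like this can be loaded with -c and the profile selected by name, for example: nextflow run bactopia/bactopia -c mycluster.config -profile mycluster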
Thank you @rpetit3.
I cannot use Conda on this cluster, although I have access to another one where it can be used. That is one of my alternatives.
I will try to run your code, dig into the config file, and follow up.
No problem. I noticed in your sbatch above there was a module load nextflow.
One thing you can try to do is replacing
singularity exec -e $BIND_MOUNT bactopia_2.1.1.sif bactopia \
with
nextflow run bactopia/bactopia
This might get you past the singularity in singularity bit
Hello again @rpetit3 ,
If I run it from nextflow directly and specify my container:
nextflow run bactopia/bactopia -with-singularity bactopia_2.1.1.sif ...
it runs the new version (2.2.0) of the app from the repo and ignores my container. It runs into an error while downloading the required modules for the updated bactopia.
Pulling Singularity image docker://quay.io/bactopia/gather_samples:2.2.0 [cache /project/6007413/cidgoh_share/singularity_imgs/quay.io-bactopia-gather_samples-2.2.0.img]
Bactopia Execution Summary
---------------------------
Bactopia Version : 2.2.0
Nextflow Version : 22.04.3
Command Line : nextflow run bactopia/bactopia -with-singularity bactopia_2.1.1.sif --samples /home/mdprieto/git/klebsiella_Qatar_2022/input/kleb_qatar_fofn.txt --datasets /scratch/mdprieto/datasets --outdir /scratch/mdprieto/temp_results/bactopia_output/ --species 'Klebsiella pneumoniae' --genome_size median --singularity_cache /project/6007413/cidgoh_share/singularity_imgs --max_cpus 2 --verbose -profile slurm,singularity -resume
Resumed : true
Completed At : 2022-12-01T16:45:00.201158-08:00
Duration : 4m 13s
Success : false
Exit Code : null
Error Report : Error executing process > 'BACTOPIA:GATHER_SAMPLES (C12_S22_L001)'
Caused by:
Failed to pull singularity image
command: singularity pull --name quay.io-bactopia-gather_samples-2.2.0.img.pulling.1669941682580 docker://quay.io/bactopia/gather_samples:2.2.0 > /dev/null
status : 255
message:
INFO: Converting OCI blobs to SIF format
WARNING: 'nodev' mount option set on /scratch, it could be a source of failure during build process
INFO: Starting build...
Getting image source signatures
FATAL: While making image from oci registry: error fetching image to cache: while building SIF from layers: conveyor failed to get: initializing source oci:/project/6007413/cidgoh_share/singularity_imgs/cache/blob:fa8a02b24c2e8e6f2326c1a63d535a3a58d5261c0ead5afc97c05950a0dd38aa: reading blob sha256:eaead16dc43bb8811d4ff450935d607f9ba4baffda4fc110cc402fa43f601d83: Get "https://cdn02.quay.io/sha256/ea/eaead16dc43bb8811d4ff450935d607f9ba4baffda4fc110cc402fa43f601d83?username=None&namespace=bactopia&Expires=1669942286&Signature=PASLAAFLQt~oW2Qphf6pSl0PK8kTFLhoStmSOf8KxXI6nhW1AkTpGXEvK1~7-wtFupkdPIKKyf4xkcq~asHbY8uENOkLHK4ov5bDFbOs6hqe6-yiJEnrX-GlT0CA06T3vUQvLLIj0JCNhqhyX4W5kU8B8qly15GOT~84R1dXG3WVzSXX2nZyKGq5RYLQWVqpVak7GkfMJp1MHMNHupoO3urVKZuJ6Fb8U61WOLGETuOEzXwLwgVOsmu~xp-BOzmR5qSVfzRPH~Ha2dcQZ2NmJ5K7wsBPPM6G~tFdxSUxTKTmeNC-i305~P4d2v55DX1qKilKzmGuMLo-0yNVfuo2ZA__&Key-Pair-Id=APKAJ67PQLWGCSP66DGA": dial tcp 18.65.229.125:443: i/o timeout
I am not knowledgeable in nextflow yet, so I was wondering if I can somehow run the pipeline inside the container I have (2.1.1) to see if it recognizes the modules I already have downloaded.
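One thing that may explain the version jump: nextflow run bactopia/bactopia fetches the repository's default branch unless a revision is given, and Nextflow's -r option selects a specific branch or tag. Below is a minimal sketch of pinning the pipeline to the release that matches the images already in the cache. It assumes that release is tagged v2.1.1 in the repository; the tag name is an assumption and should be checked first, for example with nextflow info bactopia/bactopia. The command is only printed here, and the last line can be uncommented to actually run it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: the 2.1.1 release is tagged "v2.1.1" in bactopia/bactopia.
BACTOPIA_TAG="v2.1.1"

# Paths taken from the run above.
SAMPLES="/home/mdprieto/git/klebsiella_Qatar_2022/input/kleb_qatar_fofn.txt"
IMG_CACHE="/project/6007413/cidgoh_share/singularity_imgs"

# Build the command as an array so arguments with spaces stay intact.
cmd=(
    nextflow run bactopia/bactopia
    -r "$BACTOPIA_TAG"
    -profile slurm,singularity
    --samples "$SAMPLES"
    --datasets /scratch/mdprieto/datasets
    --outdir /scratch/mdprieto/temp_results/bactopia_output/
    --species "Klebsiella pneumoniae"
    --genome_size median
    --singularity_cache "$IMG_CACHE"
    --max_cpus 2
    -resume
)

# Dry run: print the command with shell quoting.
printf '%q ' "${cmd[@]}"
printf '\n'

# "${cmd[@]}"   # uncomment to launch the pipeline
```

Pinning the revision only addresses the 2.1.1 versus 2.2.0 mismatch; it does not by itself fix the registry timeout shown in the error report.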
@azmigueldario let's see if we can get this figured out! I think our issue might be singularity within singularity here.
By chance can you install Bactopia through Conda? We will not use Conda, but use it to let Nextflow handle the job submissions to your SLURM cluster.
Here's what I'm thinking:
# Head Node
bactopia -profile test,slurm \
--slurm_opts="--account=XXXXXXXXX" \
--slurm_queue "YOUR_QUEUE_NAME"
# Nextflow then submits jobs to the SLURM cluster
If you are interested, I think it's worth considering the creation of a profile config file. Here's an example of one I use for a cluster here in Wyoming: https://github.com/bactopia/bactopia/blob/master/conf/profiles/arcc.config
This allows me to just add -profile arcc and Nextflow handles all the job submissions to the cluster.
I tried to follow this suggestion so I could easily set up Bactopia with SLURM on my institution's HPC. I am using a Conda install of Bactopia 2.2.0. My question is: where do I put the config file so that bactopia knows where to find it? When you say to add -profile arcc, is the program looking at the file name or at the name of the profile defined within the config? This is the command I ran and the error I get:
bactopia \
--samples FOFN_test.tsv \
--datasets /ceph/db/bactopia_2.0 \
--maxcpus 4 \
--max_time 480 \
--outdir bactopia_output \
-qs 2 \
-profile ctmr.config \
-resume
N E X T F L O W ~ version 22.10.6
Unknown configuration profile: 'ctmr'
I switched to using the --nfconfig option and that seems to have worked. I'm just curious about the operations using the -profile flag as well. Thanks!
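On the -profile question: as far as Nextflow is concerned, -profile takes the name of a block defined under profiles { } in a config file that has already been loaded, not a file name, which would explain the "Unknown configuration profile" error above. Here is a minimal sketch, assuming the profile inside ctmr.config is called ctmr (the profile body and the scratch directory are hypothetical). Whether --nfconfig makes a custom profile selectable this way is an assumption based on the report above that --nfconfig worked. The command is printed rather than run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory so the sketch does not touch real files.
WORKDIR="$(mktemp -d)"
CONFIG_FILE="$WORKDIR/ctmr.config"

# The profile NAME is the identifier inside profiles { }, here "ctmr".
# The file could be called anything; only the block name matters to -profile.
cat > "$CONFIG_FILE" <<'EOF'
profiles {
    ctmr {
        process.executor = 'slurm'
    }
}
EOF

# Pass the FILE via --nfconfig and the profile NAME via -profile.
cmd=(
    bactopia
    --samples FOFN_test.tsv
    --datasets /ceph/db/bactopia_2.0
    --outdir bactopia_output
    --nfconfig "$CONFIG_FILE"
    -profile ctmr
    -resume
)

# Dry run: print the command with shell quoting.
printf '%q ' "${cmd[@]}"
printf '\n'
```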
Hi @azmigueldario
I'm cleaning up old issues and since this is related to v2, I'm going to go ahead and close this with the recommendation to give v3 a try.
Please reach out if you have any questions or issues!
Cheers, Robert