db_split1 issue
Describe the bug
Hi, I am trying to run NextPolish on PacBio HiFi sequences. I keep running into an unspecified error at the db_split step. My workflow is:
```
ls /scratch/kcl58759/Eco_pacbio_kendall/pb_css_474/cromwell-executions/pb_ccs/c7a3dc30-7f94-40de-ac16-2445f965bfad/call-export_fasta/execution/m64060_210804_174320.hifi_reads.fasta.gz > lgs.fofn
```
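(It may be worth confirming that lgs.fofn holds the intended absolute path and that the read file is readable before launching; a small sanity check, assuming the gzipped FASTA above:)

```bash
# Each line of the fofn should be one readable read file
cat lgs.fofn
# Peek at the first two records of the gzipped FASTA it points to
zcat "$(head -n 1 lgs.fofn)" | head -n 4
```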
Make the config file:

```
# create config file run.cfg
job_type = slurm
job_prefix = nextPolish
task = best
rewrite = yes
rerun = 3
parallel_jobs = 6
multithread_jobs = 5
genome = /scratch/kcl58759/Eco_pacbio_kendall/474.Primary.Hifi.asm/474.Primary.HiFi.asm.p_ctg.fa
genome_size = auto
workdir = ./01_rundir
polish_options = -p {multithread_jobs}

[lgs_option]
lgs_fofn = ./lgs.fofn
lgs_options = -min_read_len 1k -max_depth 100
lgs_minimap2_options = -x map-ont
```
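(Note for HiFi data: the log below lists hifi_fofn and hifi_minimap2_options keys and deletes task 6 "due to missing hifi_fofn", which suggests NextPolish has a dedicated HiFi mode. A hedged sketch of that section, with key names inferred from the hifi_* entries in the log rather than confirmed against the docs:)

```
# Hypothetical HiFi-specific section; verify key names in the NextPolish docs
[hifi_option]
hifi_fofn = ./lgs.fofn
hifi_options = -min_read_len 1k -max_depth 100
hifi_minimap2_options = -x map-pb
```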
Then I submit with:

```
#!/bin/bash
#SBATCH --job-name=NextPolish
#SBATCH --partition=batch
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=6
#SBATCH --mem=90gb
#SBATCH --time=99:00:00
#SBATCH --output=nextpolish.out
#SBATCH --error=nextpolish.err
#SBATCH [email protected]
#SBATCH --mail-type=END,FAIL

nextPolish run.cfg
```
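(Assuming the script above is saved as submit_nextpolish.sh — file name hypothetical — the whole run is then launched with:)

```bash
sbatch submit_nextpolish.sh   # head job; with job_type = slurm, NextPolish submits its own sub-jobs
```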
Error message
```
[60417 INFO] 2022-03-03 13:02:46 NextPolish start...
[60417 INFO] 2022-03-03 13:02:46 version:v1.4.0 logfile:pid60417.log.info
[60417 WARNING] 2022-03-03 13:02:46 Re-write workdir
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 1 due to missing sgs_fofn.
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 1 due to missing sgs_fofn.
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 2 due to missing sgs_fofn.
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 2 due to missing sgs_fofn.
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 6 due to missing hifi_fofn.
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 6 due to missing hifi_fofn.
[60417 INFO] 2022-03-03 13:03:05 scheduled tasks:
[5, 5]
[60417 INFO] 2022-03-03 13:03:05 options:
[60417 INFO] 2022-03-03 13:03:05
rerun: 3
rewrite: 1
kill: None
cleantmp: 0
task: [5, 5]
use_drmaa: 0
submit: None
job_type: sge
sgs_unpaired: 0
sgs_rm_nread: 1
parallel_jobs: 6
align_threads: 5
check_alive: None
job_id_regex: None
sgs_max_depth: 100
lgs_max_depth: 100
lgs_read_type: clr
multithread_jobs: 5
lgs_max_read_len: 0
hifi_max_depth: 100
polish_options: -p 5
lgs_min_read_len: 1k
hifi_max_read_len: 0
genome_size: 36224976
hifi_block_size: 500M
hifi_min_read_len: 1k
job_prefix: nextPolish
sgs_block_size: 500000000
lgs_block_size: 500000000
sgs_use_duplicate_reads: 0
sgs_align_options: bwa mem
hifi_minimap2_options: -x map-pb
lgs_minimap2_options: -x map-pb -t 5
lgs_fofn: /scratch/kcl58759/Eco_pacbio_kendall/./lgs.fofn
workdir: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir
snp_phase: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.snp_phase
snp_valid: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.snp_valid
lgs_polish: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.lgs_polish
kmer_count: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.kmer_count
hifi_polish: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.hifi_polish
score_chain: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.score_chain
genome: /scratch/kcl58759/Eco_pacbio_kendall/474.Primary.Hifi.asm/474.Primary.HiFi.asm.p_ctg.fa
[60417 INFO] 2022-03-03 13:03:05 step 0 and task 5 start:
[60417 INFO] 2022-03-03 13:03:19 Total jobs: 2
[60417 CRITICAL] 2022-03-03 13:03:19 Command 'qsub -pe smp 5 -l vf=2.5G -o /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o -e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh' returned non-zero exit status 1, error info: .
Traceback (most recent call last):
File "/apps/eb/NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2/nextPolish", line 515, in
GCC: 8.3.0 (per the module name)
Python: 3.8.2 (per the module name)
NextPolish: NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2
Could you try to submit the task manually and see what happens?

```
qsub -pe smp 5 -l vf=2.5G -o /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o -e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh
```
I am trying to rework it for SLURM like this:

```
sbatch --mem=90 -pe smp 5 -l vf=2.5G -o /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o -e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh
```

However, it keeps saying it can't open smp. Is that a valid option?
You are using SLURM, but the log shows job_type: sge, which means NextPolish is submitting jobs through SGE's qsub rather than sbatch, so check why that setting was not picked up.
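(A quick way to confirm which scheduler setting the run actually picked up, assuming run.cfg is the file passed on the command line:)

```bash
# Show the job_type line NextPolish will read; expect "job_type = slurm"
grep -n "job_type" run.cfg
```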
Hmm, I am confused about that.
This is what happened when I tried the qsub command:
The command was:

```
/opt/apps/slurm/21.08.5/bin/sbatch -e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e -o /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh 2>&1
```

and the output was:

```
sbatch: error: You must request some amount of memory.
sbatch: error: Batch job submission failed: Job size specification needs to be provided
```
I believe I got this to work with:

```
sbatch --partition=batch --ntasks=1 --cpus-per-task=5 --mem-per-cpu=537 --time=99:00:00 -o /scratch/kcl58759/Eco_pacbio_kendall/01_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o -e /scratch/kcl58759/Eco_pacbio_kendall/01_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e /scratch/kcl58759/Eco_pacbio_kendall/01_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh
```
However, I am unsure whether there are further steps, since I can't seem to find the output files.
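(For reference: db_split is only the first numbered sub-task of the polishing step, so the final outputs would not exist yet after submitting it manually. Per the NextPolish README, the finished pipeline writes the polished assembly into the working directory; assuming the default naming and workdir = ./01_rundir from the config above, a check might be:)

```bash
# Final polished assembly, per the README (path and name assumed from the config above)
ls ./01_rundir/genome.nextpolish.fasta
```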
Hi, see here and ParallelTask for how to change the submit command template.
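(A hedged sketch of what such an override might look like in run.cfg, based on the submit, check_alive and job_id_regex fields visible in the options dump above; the placeholder names are assumed from ParallelTask-style templates and should be verified against its docs:)

```
# Hypothetical submit template for a SLURM cluster that insists on an explicit
# memory request; {cpu}, {mem}, {out}, {err}, {script} are assumed placeholders.
submit = sbatch --partition=batch --cpus-per-task={cpu} --mem-per-cpu={mem} -o {out} -e {err} {script}
```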