
How long and how much memory should NextPolish require?

Open Kendall-Lee opened this issue 3 years ago • 4 comments

Question or Expected behavior: How long should NextPolish take to complete on a ~50Mb long-read genome, and how much memory should I request? I submitted the job with 90GB for 99 hours and it timed out.

Operating system: SLURM, NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2

GCC: gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)

Python: Python 3.8.2

Kendall-Lee · Mar 15 '22 16:03

Hi, it depends on your input and parameters. If you use your own alignment pipeline, it will use fewer resources and run faster; see here

moold · Mar 16 '22 01:03

Hi, I am trying to use my own alignment pipeline to decrease resources needed. Here is my alignment file:

#!/bin/bash
#SBATCH --job-name=NextPolishBWA
#SBATCH --partition=batch
#SBATCH --ntasks=5
#SBATCH --cpus-per-task=10
#SBATCH --mem=90gb
#SBATCH --time=99:00:00
#SBATCH --output=nextpolishself.out
#SBATCH --error=nextpolishself.err
#SBATCH [email protected]
#SBATCH --mail-type=END,FAIL

module load BWA/0.7.17-GCC-8.3.0
ml SAMtools/1.10-GCC-8.3.0
ml NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2

round=2
threads=20
read=/scratch/kcl58759/Eco_pacbio_kendall/pb_css_474/cromwell-executions/pb_ccs/c7a3dc30-7f94-40de-ac16-2445f965bfad/call-export_fasta/execution/m64060_210804_174320.hifi_reads.fasta.gz
read_type=hifi
mapping_option=["hifi"]="asm20"
input=/scratch/kcl58759/Eco_pacbio_kendall/474.Primary.Hifi.asm/474.Primary.HiFi.asm.p_ctg.fa

for ((i=1; i<=2;i++)); do
    minimap2 -ax asm20 [hifi] -t 6 /scratch/kcl58759/Eco_pacbio_kendall/474.Primary.Hifi.asm/474.Primary.HiFi.asm.p_ctg.f /scratch/kcl58759/Eco_pacbio_kendall/pb_css_474/cromwell-executions/pb_ccs/c7a3dc30-7f94-40de-ac16-2445f965bfad/call-export_fasta/execution/m64060_210804_174320.hifi_reads.fasta.gz | samtools sort - -m 2g --threads 6 -o lgs.sort.bam;
    samtools index lgs.sort.bam;
    ls pwd/lgs.sort.bam > lgs.sort.bam.fofn;
    python NextPolish/lib/nextpolish2.py -g /scratch/kcl58759/Eco_pacbio_kendall/474.Primary.Hifi.asm/474.Primary.HiFi.asm.p_ctg.f-l lgs.sort.bam.fofn -r hifi -p 6 -sp -o genome.nextpolish.fa;
    if ((i!=2));then
        mv genome.nextpolish.fa genome.nextpolishtmp.fa;
        input=genome.nextpolishtmp.fa;
    fi;
done;

However, I keep getting these errors:

[ERROR] failed to open file '[hifi]': No such file or directory
python: can't open file 'NextPolish/lib/nextpolish2.py': [Errno 20] Not a directory
mv: cannot stat ‘genome.nextpolish.fa’: No such file or directory
[ERROR] failed to open file '[hifi]': No such file or directory
python: can't open file 'NextPolish/lib/nextpolish2.py': [Errno 20] Not a directory

Is there something I am missing?

Kendall-Lee · Mar 18 '22 16:03

See the minimap2 manual for how to run minimap2; [hifi] is not a valid option.
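
For reference, a minimal dry-run sketch of how the preset could be chosen so that no literal [hifi] token reaches minimap2. The bracketed token looks like a leftover from an array lookup on read_type, though that is a guess. The helper name preset_for and the short file names are hypothetical placeholders; the hifi to asm20 mapping is taken from the script above, while the clr and ont entries are assumptions that should be checked against the NextPolish and minimap2 documentation. The sketch only prints the command, it does not run minimap2.

```shell
#!/bin/sh
# Dry-run sketch: prints the minimap2 command instead of running it.

# Map a NextPolish read type to a minimap2 preset (hypothetical helper).
# hifi -> asm20 comes from the script in this thread; clr and ont are assumptions.
preset_for() {
  case "$1" in
    clr)  echo "map-pb" ;;
    hifi) echo "asm20" ;;
    ont)  echo "map-ont" ;;
    *)    echo "unknown read type: $1" >&2; return 1 ;;
  esac
}

read_type=hifi
threads=6
input=genome.fa            # placeholder: the assembly to polish
reads=hifi_reads.fasta.gz  # placeholder: the long reads

# Resolve the preset once, then pass only its value after -ax.
preset=$(preset_for "$read_type") || exit 1
cmd="minimap2 -ax $preset -t $threads $input $reads"
echo "$cmd"
```

The preset value follows -ax directly, and the next two positional arguments are the reference and the reads. Recent minimap2 releases also provide a map-hifi preset, which may be worth comparing with asm20 for HiFi data.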

moold · Mar 21 '22 00:03

I believe the issue is not with minimap2 but with NextPolish/lib/nextpolish2.py not being available. I cannot find the script online, and it is not made available by ml NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2
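
One way to check whether the script ships with the loaded module is sketched below. It assumes the module puts the nextPolish launcher on PATH and that lib/nextpolish2.py sits either next to the launcher or one directory up; both are assumptions about the install layout, not something confirmed in this thread. The function name find_nextpolish2 is hypothetical.

```shell
#!/bin/sh
# Sketch: look for nextpolish2.py relative to an install directory.
# The candidate locations are assumptions about the install layout.
find_nextpolish2() {
  root="$1"
  for candidate in "$root/lib/nextpolish2.py" "$root/../lib/nextpolish2.py"; do
    if [ -f "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# Derive the install directory from the launcher on PATH, if there is one.
if launcher=$(command -v nextPolish); then
  find_nextpolish2 "$(dirname "$launcher")" ||
    echo "nextpolish2.py not found near $launcher" >&2
else
  echo "nextPolish is not on PATH; load the module first" >&2
fi
```

If a path is printed, it can be passed to python in place of the relative NextPolish/lib/nextpolish2.py, which only resolves when a NextPolish directory exists in the job's working directory.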

Kendall-Lee · Mar 22 '22 16:03