Errors with nextPolish.sh.e
Hi all. I'm trying to polish a Flye assembly of nanopore reads, but I cannot get the command to work: it fails with errors whose source I cannot determine, and I don't know how to proceed. The same errors also appear with the test data (nextPolish test_data/run.cfg).
Operating system: Linux Mint 20.2 Uma (Ubuntu Focal base)
GCC: gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
Python: 3.9.7
NextPolish: v1.4.0
Input files (https://nextpolish.readthedocs.io/en/latest/TUTORIAL.html#polishing-using-long-reads-only)
lgs.fofn:
I have only one FASTQ file, because I'm testing the tool in order to adapt it to CWL; I generated the file with ls /path/to/fastq_runid.fastq > lgs.fofn
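For reference, a minimal sketch of building the fofn (the example paths and the glob pattern are placeholders; with several runs, every FASTQ should be listed):

```sh
# One absolute FASTQ path per line; the path below is a placeholder.
ls /path/to/fastq_runid.fastq > lgs.fofn
# With multiple runs, glob them all instead, e.g.:
# ls /path/to/fastq_pass/*.fastq > lgs.fofn
cat lgs.fofn   # sanity-check the listed paths
```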
run.cfg (modified parts are marked with comments):
[General]
job_type = local
job_prefix = nextPolish
task = best
rewrite = yes
rerun = 3
parallel_jobs = 6
multithread_jobs = 5
genome = ./assembly.fasta # modified
genome_size = auto
workdir = ./01_rundir # modified; tried different paths
polish_options = -p {multithread_jobs}
[lgs_option]
lgs_fofn = ./lgs.fofn
lgs_options = -min_read_len 1k -max_depth 100
lgs_minimap2_options = -x map-ont
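With the config above in place, the run itself is just the following (the binary path is a placeholder and depends on where NextPolish was unpacked):

```sh
# Launch the polishing run with the config above.
path/to/NextPolish/nextPolish run.cfg
```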
Log:
[183854 INFO] 2021-12-03 10:52:24 NextPolish start...
[183854 INFO] 2021-12-03 10:52:24 version:v1.4.0 logfile:pid183854.log.info
[183854 WARNING] 2021-12-03 10:52:24 Delete task: 5 due to missing lgs_fofn.
[183854 WARNING] 2021-12-03 10:52:24 Delete task: 5 due to missing lgs_fofn.
[183854 WARNING] 2021-12-03 10:52:24 Delete task: 6 due to missing hifi_fofn.
[183854 WARNING] 2021-12-03 10:52:24 Delete task: 6 due to missing hifi_fofn.
[183854 INFO] 2021-12-03 10:52:24 scheduled tasks:
[1, 2, 1, 2]
[183854 INFO] 2021-12-03 10:52:24 options:
[183854 INFO] 2021-12-03 10:52:24
rerun: 3
rewrite: 0
kill: None
cleantmp: 0
use_drmaa: 0
submit: None
job_type: local
sgs_unpaired: 0
sgs_rm_nread: 1
lgs_read_type:
parallel_jobs: 6
align_threads: 5
check_alive: None
task: [1, 2, 1, 2]
job_id_regex: None
genome_size: 18910
sgs_max_depth: 100
lgs_max_depth: 100
multithread_jobs: 5
lgs_max_read_len: 0
hifi_max_depth: 100
lgs_block_size: 500M
lgs_min_read_len: 1k
hifi_max_read_len: 0
polish_options: -p 5
hifi_block_size: 500M
hifi_min_read_len: 1k
job_prefix: nextPolish
sgs_use_duplicate_reads: 0
lgs_minimap2_options: -x map-ont
hifi_minimap2_options: -x map-pb
sgs_block_size: 315166.6666666667
sgs_align_options: bwa mem -p -t 5
workdir: path/to/NextPolish.backup0
genome: path/to/flye-output/assembly.fasta
sgs_fofn: path/to/NextPolish.backup0/test.fofn
snp_phase: path/to/NextPolish.backup0/%02d.snp_phase
snp_valid: path/to/NextPolish.backup0/%02d.snp_valid
lgs_polish: path/to/NextPolish.backup0/%02d.lgs_polish
kmer_count: path/to/NextPolish.backup0/%02d.kmer_count
hifi_polish: path/to/NextPolish.backup0/%02d.hifi_polish
score_chain: path/to/NextPolish.backup0/%02d.score_chain
[183854 WARNING] 2021-12-03 10:52:24 mv path/to/NextPolish.backup0 to path/to/NextPolish.backup0.backup0
[183854 INFO] 2021-12-03 10:52:24 step 0 and task 1 start:
[183854 INFO] 2021-12-03 10:52:29 Total jobs: 3
[183854 INFO] 2021-12-03 10:52:29 Submitted jobID:[183883] jobCmd:[path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh] in the local_cycle.
[183883 CRITICAL] 2021-12-03 10:52:29 Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e' returned non-zero exit status 127, error info: .
Traceback (most recent call last):
File "path/to/NextPolish.backup0/./nextPolish", line 515, in <module>
File "path/to/NextPolish.backup0/./nextPolish", line 369, in main
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 347, in start
self._start()
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 371, in _start
self.submit(job)
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 255, in submit
_, stdout, _ = self.run(job.cmd)
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 291, in run
log.critical("Command '%s' returned non-zero exit status %d, error info: %s." % (cmd, p.returncode, stderr))
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1493, in critical
self._log(CRITICAL, msg, args, **kwargs)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1589, in _log
self.handle(record)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1599, in handle
self.callHandlers(record)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1661, in callHandlers
hdlr.handle(record)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 952, in handle
self.emit(record)
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/kit.py", line 42, in emit
raise Exception(record.msg)
Exception: Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.o 2> path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e' returned non-zero exit status 127, error info: .
[183854 INFO] 2021-12-03 10:52:29 Submitted jobID:[183889] jobCmd:[path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh] in the local_cycle.
[183889 CRITICAL] 2021-12-03 10:52:29 Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.o 2> path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.e' returned non-zero exit status 127, error info: .
Traceback (most recent call last):
File "path/to/NextPolish.backup0/./nextPolish", line 515, in <module>
File "path/to/NextPolish.backup0/./nextPolish", line 369, in main
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 347, in start
self._start()
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 371, in _start
self.submit(job)
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 255, in submit
_, stdout, _ = self.run(job.cmd)
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 291, in run
log.critical("Command '%s' returned non-zero exit status %d, error info: %s." % (cmd, p.returncode, stderr))
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1493, in critical
self._log(CRITICAL, msg, args, **kwargs)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1589, in _log
self.handle(record)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1599, in handle
self.callHandlers(record)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1661, in callHandlers
hdlr.handle(record)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 952, in handle
self.emit(record)
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/kit.py", line 42, in emit
raise Exception(record.msg)
Exception: Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.o 2> path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.e' returned non-zero exit status 127, error info: .
[183854 INFO] 2021-12-03 10:52:30 Submitted jobID:[183895] jobCmd:[path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh] in the local_cycle.
[183895 CRITICAL] 2021-12-03 10:52:30 Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.o 2> path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.e' returned non-zero exit status 127, error info: .
Traceback (most recent call last):
File "path/to/NextPolish.backup0/./nextPolish", line 515, in <module>
File "path/to/NextPolish.backup0/./nextPolish", line 369, in main
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 347, in start
self._start()
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 371, in _start
self.submit(job)
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 255, in submit
_, stdout, _ = self.run(job.cmd)
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/task_control.py", line 291, in run
log.critical("Command '%s' returned non-zero exit status %d, error info: %s." % (cmd, p.returncode, stderr))
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1493, in critical
self._log(CRITICAL, msg, args, **kwargs)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1589, in _log
self.handle(record)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1599, in handle
self.callHandlers(record)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 1661, in callHandlers
hdlr.handle(record)
File "path/to/anaconda3/lib/python3.9/logging/__init__.py", line 952, in handle
self.emit(record)
File "path/to/anaconda3/lib/python3.9/site-packages/paralleltask/kit.py", line 42, in emit
raise Exception(record.msg)
Exception: Command '/bin/sh path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh > path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.o 2> path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.e' returned non-zero exit status 127, error info: .
[183854 ERROR] 2021-12-03 10:52:37 db_split failed: please check the following logs:
[183854 ERROR] 2021-12-03 10:52:37 path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e
[183854 ERROR] 2021-12-03 10:52:37 path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.e
[183854 ERROR] 2021-12-03 10:52:37 path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.e
Regards, Alex
Hi, could you paste the content of path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e here?
Hi,
path/to/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e:
hostname
+ hostname
cd path/to/NextPolish/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1
+ cd path/to/NextPolish/NextPolish.backup0/00.score_chain/01.db_split.sh.work/db_split1
time path/to/NextPolish/NextPolish.backup0/bin/seq_split -d path/to/NextPolish/NextPolish.backup0 -m 315166.6666666667 -n 6 -t 5 -i 1 -s 1891000 -p input.sgspart path/to/NextPolish/NextPolish.backup0/test.fofn
+ time path/to/NextPolish/NextPolish.backup0/bin/seq_split -d path/to/NextPolish/NextPolish.backup0 -m 315166.6666666667 -n 6 -t 5 -i 1 -s 1891000 -p input.sgspart path/to/NextPolish/NextPolish.backup0/test.fofn
time: cannot run path/to/NextPolish/NextPolish.backup0/bin/seq_split: No such file or directory
Command exited with non-zero status 127
0.00user 0.00system 0:00.00elapsed ?%CPU (0avgtext+0avgdata 1020maxresident)k
0inputs+0outputs (0major+25minor)pagefaults 0swaps
As the log says, the seq_split executable is missing, so follow the instructions here to reinstall.
BTW, do not forget to run make after downloading.
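In other words, something like this (the release URL follows the pattern in the installation docs and the version is an assumption; check the releases page for the current tarball):

```sh
# Download a release tarball and build the bundled helper binaries.
wget https://github.com/Nextomics/NextPolish/releases/download/v1.4.0/NextPolish.tgz
tar -xvzf NextPolish.tgz
cd NextPolish && make      # compiles the helper binaries, including bin/seq_split
ls bin/seq_split           # confirm the previously missing executable now exists
```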
I reinstalled, but now I get the log below in the nextPolish.sh.e file when running the test data. However, a clean installation on a server runs successfully, so the problem clearly points to my local Python or Anaconda installation. I'd be glad to hear any ideas about what could be causing this issue; for now, feel free to close this question.
Thanks!
hostname
+ hostname
cd path/to/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome1
+ cd path/to/NextPolish/test_data/01_rundir/00.lgs_polish/04.polish.ref.sh.work/polish_genome1
time /path/to/anaconda3/bin/python path/to/NextPolish/lib/nextpolish2.py -sp -p 1 -g path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta -b path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta.blc -i 0 -l path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/lgs.sort.bam.list -r ont -o genome.nextpolish.part000.fasta
+ time /path/to/anaconda3/bin/python path/to/NextPolish/lib/nextpolish2.py -sp -p 1 -g path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta -b path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta.blc -i 0 -l path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/lgs.sort.bam.list -r ont -o genome.nextpolish.part000.fasta
[110589 INFO] 2021-12-07 11:22:42 Corrected step options:
[110589 INFO] 2021-12-07 11:22:42
split: 0
process: 1
auto: True
read_type: 1
block_index: 0
window: 5000000
uppercase: False
alignment_score_ratio: 0.8
alignment_identity_ratio: 0.8
out: genome.nextpolish.part000.fasta
genome: path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta
bam_list: path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/lgs.sort.bam.list
block: path/to/NextPolish/test_data/./01_rundir/00.lgs_polish/input.genome.fasta.blc
[110589 WARNING] 2021-12-07 11:22:42 Adjust -p from 1 to 0, -w from 5000000 to 5000000, logical CPUs:4, available RAM:~6G, use -a to disable automatic adjustment.
Traceback (most recent call last):
File "path/to/NextPolish/lib/nextpolish2.py", line 260, in <module>
main(args)
File "path/to/NextPolish/lib/nextpolish2.py", line 192, in main
pool = Pool(args.process, initializer=start)
File "/path/to/anaconda3/lib/python3.9/multiprocessing/context.py", line 119, in Pool
return Pool(processes, initializer, initargs, maxtasksperchild,
File "/path/to/anaconda3/lib/python3.9/multiprocessing/pool.py", line 205, in __init__
raise ValueError("Number of processes must be at least 1")
ValueError: Number of processes must be at least 1
Command exited with non-zero status 1
0.08user 0.00system 0:00.09elapsed 100%CPU (0avgtext+0avgdata 16528maxresident)k
0inputs+8outputs (0major+2520minor)pagefaults 0swaps
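The traceback seems consistent with the WARNING above: with only ~6G of available RAM, the automatic adjustment lowers -p from 1 to 0, and Python then refuses to create a zero-size worker pool. The last step reproduces with plain Python, nothing NextPolish-specific:

```sh
# Reproduce the final error: multiprocessing rejects a zero-size pool,
# which is what nextpolish2.py requests once -p has been auto-adjusted
# from 1 to 0 on a low-RAM machine.
python3 -c 'from multiprocessing import Pool; Pool(processes=0)'
# -> ValueError: Number of processes must be at least 1
```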
The RAM is too small
Is there a proper way to increase it?
According to the NextPolish FAQ it should be possible to raise it from Paralleltask's default of 3G, but changing it in the cluster.cfg file has no effect, and I don't see where exactly the submit parameter is supposed to be used. I still receive the same error.
Also, how can the memory not be enough when only 3G is being requested?
The compute node you submitted to only has ~6 GB of memory; you cannot change that by adjusting parameters, you need to run it on a different compute node.
EDITED:
Maybe you just forgot to change job_type = local to job_type = sge (or another scheduler), if you want to submit your job to a computer cluster.
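For example (SGE shown; substitute whichever scheduler your cluster runs):

```sh
# Switch the scheduler in run.cfg from local execution to SGE.
sed -i 's/^job_type = local/job_type = sge/' run.cfg
```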
Good to know. At least it's not a problem with my installation or with the files I submitted.
These errors come from a local test; when the tool is run on a cluster it works perfectly.
I was trying to figure out why the behavior was so different, but if it's due to a hardware restriction then there's nothing to be done for now.
Thanks.