Can't use multiple threads via -t
I use the -t 8 option when running ragtag.py correct, but it seems to be only utilizing one CPU core. Is there an error in my command?
My command: ragtag.py correct heterodera_glycines.PRJNA381081.WBPS18.genomic.fa Sk43r400.contig -t 8
Hey, did you solve this? I am having the same issue. Thanks!
Hi! Maybe I can comment on this, although I have a similar issue myself.
According to ragtag.py correct -h, the -t parameter is only passed to minimap2. In the log fragment shared by @ChungLamYu, minimap2 was not invoked because its output already existed from an earlier run, so multithreading brought no benefit.
In my case, though, when running ragtag correct with a reads file, the second minimap2 run, which maps the reads, also seems to ignore the -t parameter:
+ ragtag.py correct --mm2-params '-x asm20 -t64' -R reads.fastq.gz -T corr -o workdir --gff annotation.gff3 reference.fa draft.fa
Wed Jan 10 18:29:20 2024 --- VERSION: RagTag v2.1.0
Wed Jan 10 18:29:20 2024 --- CMD: ragtag.py correct --mm2-params -x asm20 -t64 -R reads.fastq.gz -T corr -o workdir --gff annotation.gff3 reference.fa draft.fa
Wed Jan 10 18:29:20 2024 --- WARNING: Without '-u' invoked, some component/object AGP pairs might share the same ID. Some external programs/databases don't like this. To ensure valid AGP format, use '-u'.
Wed Jan 10 18:29:20 2024 --- INFO: Mapping the query genome to the draft genome
Wed Jan 10 18:29:20 2024 --- INFO: Running: minimap2 -x asm20 -t64 reference.fa draft.fa > workdir/ragtag.correct.asm.paf 2> workdir/ragtag.correct.asm.paf.log
Wed Jan 10 18:38:17 2024 --- INFO: Finished running : minimap2 -x asm20 -t64 reference.fa draft.fa > workdir/ragtag.correct.asm.paf 2> workdir/ragtag.correct.asm.paf.log
Wed Jan 10 18:38:17 2024 --- INFO: Reading whole genome alignments
Wed Jan 10 18:38:19 2024 --- INFO: Filtering and merging alignments
Wed Jan 10 18:38:22 2024 --- INFO: Validating putative query breakpoints via read alignment
Wed Jan 10 18:38:22 2024 --- INFO: Aligning reads to query sequences
Wed Jan 10 18:38:22 2024 --- INFO: Running: minimap2 -ax asm20 -t 1 draft.fa reads.fastq.gz > workdir/ragtag.correct.reads.sam 2> workdir/ragtag.correct.reads.sam.log
How can I use many threads for the second minimap2 run? I also noticed that the second run picks up my -x asm20 setting, which I don't want. Is there a way to set separate parameters for the two mappings in a read-guided ragtag correct run? Another solution would be to accept a custom BAM file as an alternative to having ragtag correct map the reads itself.
Update: multithreading seems to work better if I pass -t 64 outside of the --mm2-params string. The log then shows the first minimap2 run without an explicit -t, but I guess it still uses many cores because it finished much faster. The read mapping then explicitly gets -t 64.
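For reference, here is a minimal sketch of the invocation described in this update, using the same file names as in my log above. It only assembles and prints the command (a dry run), so nothing is executed until the last line is swapped for "$@". Keeping -x asm20 inside --mm2-params is just what I happened to use, not a recommendation.

```shell
# Thread count goes to RagTag's own -t option, not into --mm2-params.
threads=64

# Assemble the command as positional parameters so the quoted
# '-x asm20' string stays a single argument.
set -- ragtag.py correct -t "$threads" --mm2-params '-x asm20' \
    -R reads.fastq.gz -T corr -o workdir --gff annotation.gff3 \
    reference.fa draft.fa

# Dry run: print the command for inspection.
# Replace this line with "$@" to actually run it.
printf '%s\n' "$*"
```

With this form, the thread count lives in one place and the --mm2-params string carries only the alignment preset.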
Much better now, but I still don't feel I can fully control the parameters of the two mapping steps, so it would be great to improve that!
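In case it helps anyone checking the same thing: the thread count each minimap2 call actually received can be read off the "Running:" lines that RagTag prints. Below is a small sketch that filters those lines and extracts the -t value. The two sample lines are copied from my first log in this thread; the variable names are made up for the demo, and in practice you would feed in your own saved console output instead.

```shell
# Two "Running:" lines copied from the log earlier in this thread.
log='Wed Jan 10 18:29:20 2024 --- INFO: Running: minimap2 -x asm20 -t64 reference.fa draft.fa > workdir/ragtag.correct.asm.paf 2> workdir/ragtag.correct.asm.paf.log
Wed Jan 10 18:38:22 2024 --- INFO: Running: minimap2 -ax asm20 -t 1 draft.fa reads.fastq.gz > workdir/ragtag.correct.reads.sam 2> workdir/ragtag.correct.reads.sam.log'

# Keep only the minimap2 invocations, then pull out the -t setting of each.
threads_seen=$(printf '%s\n' "$log" \
    | grep 'INFO: Running: minimap2' \
    | grep -oE -- '-t ?[0-9]+')

printf '%s\n' "$threads_seen"
# prints:
# -t64
# -t 1
```

A minimap2 call with no -t on its command line produces no output here, which matches what I saw for the first run after moving -t out of --mm2-params.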