Add new module vgan/haplocart

Open JoshuaDanielRubin opened this issue 3 years ago • 23 comments

PR checklist

New Module vgan/Haplocart

Closes #2152

  • [x] This comment contains a description of changes (with reason).
  • [x] If you've fixed a bug or added code that should be tested, add tests!
  • [x] If you've added a new tool - have you followed the module conventions in the contribution docs?
  • [x] If necessary, include test data in your PR.
  • [x] Remove all TODO statements.
  • [x] Emit the versions.yml file.
  • [x] Follow the naming conventions.
  • [x] Follow the parameters requirements.
  • [x] Follow the input/output options guidelines.
  • [x] Add a resource label
  • [x] Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • [ ] PROFILE=docker pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • [x] PROFILE=singularity pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • [ ] PROFILE=conda pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware

JoshuaDanielRubin avatar Jan 26 '23 10:01 JoshuaDanielRubin

@jfy133 Requesting review

JoshuaDanielRubin avatar Jan 26 '23 12:01 JoshuaDanielRubin

Did you create this module using the nf-core tooling? A lot of the template code is missing (especially the code that implements the CI tests here). :)

nvnieuwk avatar Jan 26 '23 14:01 nvnieuwk

Ok thank you both for the help and suggestions! I will keep working on this.

JoshuaDanielRubin avatar Jan 31 '23 11:01 JoshuaDanielRubin

@JoshuaDanielRubin just a reminder about this ;) We are almost ready in the eager3 development for haplocart to be integrated!

jfy133 avatar Jun 04 '23 05:06 jfy133

Hi @jfy133, sorry for the delay here. To be honest, I got stuck with the tests failing. I am also a bit behind with PhD work, unfortunately, and cannot really devote much time to this, so perhaps it would be best to go on without me :) I presume we can always integrate HaploCart in the next eager release?

JoshuaDanielRubin avatar Jun 06 '23 09:06 JoshuaDanielRubin

No worries at all! Yes, we can bump it to the next major release. Just shout if you want me to fix the tests for this PR (to make it less annoying ;))

jfy133 avatar Jun 06 '23 15:06 jfy133

Hi @JoshuaDanielRubin! We are working on integrating haplocart in eager3, and I will be finishing up the module.

aidaanva avatar Jun 30 '23 09:06 aidaanva

Hi @JoshuaDanielRubin, I think the module is all ready, but the tests are failing, and since I am not familiar with HaploCart I will need your input. When I run the test for a single run using the file rCRS_simulated_test.fq.gz, I get the following error:

╦ ╦┌─┐┌─┐┬  ┌─┐╔═╗┌─┐┬─┐┌┬┐
╠═╣├─┤├─┘│  │ │║  ├─┤├┬┘ │
╩ ╩┴ ┴┴  ┴─┘└─┘╚═╝┴ ┴┴└─ ┴


Predicting sample: rCRS_simulated_test.fq.gz
Using 2 threads
Processing sample 1 of 1
Mapping reads...
Reading GAM file XKz8KOQ
Done reading GAM file /tmp/XKz8KOQ
Found 0 reads.
terminate called after throwing an instance of 'std::runtime_error'
  what():  [HaploCart] Error, no reads mapped
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Stack trace path: /tmp/vg_crash_NU38XH/stacktrace.txt
Please include the stack trace file in your bug report!

The command that was run seems correct to me:

#!/bin/bash -ue
vgan haplocart \
     \
    -t 2 \
    -fq1 rCRS_simulated_test.fq.gz \
     \
    -o test.txt \
    --hc-files hcfiles \
    -pf test.posterior.txt

cat <<-END_VERSIONS > versions.yml
"test_vgan_haplocart_single_end:VGAN_HAPLOCART":
    vgan: $(vgan version 2>&1 | sed -e "s/vgan version //g;s/ (Mela)//g")
END_VERSIONS

Do you have any suggestions on how to fix it?

aidaanva avatar Jul 07 '23 10:07 aidaanva
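
The log above reports "Found 0 reads", which could mean either that the input file contains no records or that none of them mapped. Here is a minimal sketch for ruling out the first case by counting records in the gzipped FASTQ. `count_fastq_reads` is a hypothetical helper name, and the count assumes the usual four-lines-per-record layout:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count records in a gzipped FASTQ file.
# Assumes the common layout of exactly four lines per record.
count_fastq_reads() {
    local fastq_gz="$1"
    local n_lines
    # Decompress to stdout and count lines without writing a temporary file.
    n_lines=$(gzip -cd "$fastq_gz" | wc -l)
    echo $(( n_lines / 4 ))
}

# Example call (file name taken from the test above):
#   count_fastq_reads rCRS_simulated_test.fq.gz
```

A non-zero count would point to the mapping step rather than the input file. It does not show whether the reads are mitochondrial.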

Hi @aidaanva,

(Sorry for the late reply, I was on vacation.)

Thank you for working on this! I really appreciate the help. It looks like the reads are not being mapped to the reference graph. Are they reads from the mitochondria?

JoshuaDanielRubin avatar Jul 11 '23 07:07 JoshuaDanielRubin

Hi @JoshuaDanielRubin,

I tried with some libraries that have some reads mapping to the mitochondria, though not many.

I also tried with the dataset that you included, "rCRS_simulated_test.fq.gz", but this also failed. Are there mitochondrial reads in that file?

I can look for other files we have in house to do the testing. How many mitochondrial reads should I be aiming for?

Thank you for your time!

aidaanva avatar Jul 14 '23 08:07 aidaanva

@aidaanva

Yes, the rCRS is mitochondrial, so it is strange that not a single read is mapping.

May I ask what the contents of the stack trace are in the following file?

/tmp/vg_crash_NU38XH/stacktrace.txt

JoshuaDanielRubin avatar Jul 16 '23 16:07 JoshuaDanielRubin
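
The crash message names a stack trace under /tmp/vg_crash_*/. Here is a small sketch for listing any such files under a given directory. `find_vg_stacktraces` is a hypothetical helper, and the path pattern is inferred only from the error message in this thread, not from the vg documentation:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list vg crash stack traces under a directory (default: /tmp).
# The vg_crash_*/stacktrace.txt pattern is an assumption based on the error message.
find_vg_stacktraces() {
    local search_dir="${1:-/tmp}"
    # Suppress permission errors from unreadable subdirectories.
    find "$search_dir" -type f -path '*/vg_crash_*/stacktrace.txt' 2>/dev/null || true
}

# Example call:
#   find_vg_stacktraces /tmp
```

If the process ran inside a container, its /tmp may not be the host's /tmp, depending on how directories are bound, so it may be worth running the search from inside the same container or against the task's work directory. That is a guess rather than something confirmed here.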

Also, just to be sure, is the --hc-files argument pointing to the directory with the haplocart files?

JoshuaDanielRubin avatar Jul 18 '23 07:07 JoshuaDanielRubin

@JoshuaDanielRubin the --hc-files argument points to a directory that contains:

ls hcfiles
children.txt  graph.gbwt  graph.giraffe.gbz  graph.snarls  graph_paths  k31_w11.min      parents.txt               path_supports
graph.dist    graph.gg    graph.og           graph.xg      k17_w18.min  mappability.tsv  parsed_pangenome_mapping

I downloaded them following the vgan documentation.

I can't find the file /tmp/vg_crash_NU38XH/stacktrace.txt in my tmp directory. Do you know how to change the path where the program stores that file? Then I could send you the output.

aidaanva avatar Jul 21 '23 12:07 aidaanva
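
One way to double-check the directory passed to --hc-files is to confirm that each expected file exists and is non-empty, since a truncated download could look fine in a plain listing. This is a sketch under stated assumptions: `check_hc_files` is a hypothetical helper, and the file names in the example call are simply copied from the listing above, not an authoritative list of what vgan requires:

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the named entries are missing or empty
# in a directory. Prints one line per problem; returns 1 if any were found.
check_hc_files() {
    local dir="$1"
    shift
    local status=0 name
    for name in "$@"; do
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $name"
            status=1
        elif [ ! -s "$dir/$name" ]; then
            echo "empty: $name"
            status=1
        fi
    done
    return "$status"
}

# Example call (names copied from the `ls hcfiles` output, not from vgan docs):
#   check_hc_files hcfiles graph.gbwt graph.giraffe.gbz graph.dist graph.xg graph.og k17_w18.min k31_w11.min
```

No output and a zero exit status would mean every named entry is present and non-empty. It says nothing about whether the files match the installed vgan version.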

@aidaanva

OK, this is strange :) I suspect the stack trace will not be very informative in any case. To isolate the issue, I can think of two things:

  1. Try an earlier conda version of Haplocart
  2. Try a different input file with human mtDNA reads

JoshuaDanielRubin avatar Jul 26 '23 15:07 JoshuaDanielRubin

I've migrated to nf-test and upgraded to v3.0.0 of the tool. This version does not seem to have the hcfiles input, so I am just testing the FASTQ files directly. It worked with conda in Gitpod but crashed in Docker.

SPPearce avatar May 07 '24 05:05 SPPearce

@JoshuaDanielRubin, @aidaanva, any progress on resolving this?

SPPearce avatar Mar 10 '25 15:03 SPPearce