new module: dysgu=v1.6.2
Hi, I am trying to learn how to add a new module, "dysgu", for analysing structural variants in short or long reads in nf-core/sarek.
### Tasks
- [ ] Creating conda environment for dysgu.nf
- [ ] I am searching for a PR for this module
- [ ] Working on meta.yml and main.nf files
https://github.com/kcleal/dysgu
We should probably divide this module into multiple submodules.
- `dysgu run`: Run using default arguments, wraps fetch and call commands
- `dysgu fetch`: Separate SV reads from input bam file
- `dysgu call`: SV calling
- `dysgu merge`: Merge calls from multiple samples
- `dysgu filter`: Filter SVs, find somatic SVs (version >= 1.5.0)
- `dysgu test`: Run basic tests
Each submodule would be named after its command: dysgu/run, dysgu/fetch, and so on.
For these modifications, do I need to add more inputs to the main.nf and meta.yml files and update them? And for submodules, is it necessary to add more .nf files to define each process?
You can check other modules which have submodules. It basically means one main.nf per subcommand. So one for dysgu run, one for dysgu call and so on. Those will be located within subfolders in the dysgu folder. Check out for example strelka or cnvkit.
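As a rough sketch of that layout, the snippet below scaffolds empty placeholder files for two subcommands. The `modules/nf-core` root, the `environment.yml` file and the `tests/main.nf.test` file are assumptions based on the current nf-core/modules convention, and `MODULES_ROOT` is a hypothetical variable used only for illustration. In practice you would normally let `nf-core modules create dysgu/run` generate these files.

```bash
#!/bin/sh
set -eu

# Assumed root of the modules tree; MODULES_ROOT is a hypothetical override.
MODULES_ROOT="${MODULES_ROOT:-modules/nf-core}"

# One subfolder per dysgu subcommand, each with its own main.nf and meta.yml.
for sub in run call; do
    dir="${MODULES_ROOT}/dysgu/${sub}"
    mkdir -p "${dir}/tests"
    # Empty placeholders only; the real content comes from the module template.
    touch "${dir}/main.nf" "${dir}/meta.yml" "${dir}/environment.yml"
    touch "${dir}/tests/main.nf.test"
done

# List the resulting files so the structure is easy to compare with strelka or cnvkit.
find "${MODULES_ROOT}/dysgu" -type f | sort
```

Adding dysgu/fetch or dysgu/merge later only means adding another name to the loop, which is why starting with one or two submodules does not block the others.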
We can start by adding only dysgu/run as a module, and the other ones can be done at some other time. We just need to restructure the folder in the PR.
At the moment, adding the dysgu call module is more important because I need to run some analysis with the patient data I have, making it the first priority. While this data is processing, I can work on the remaining submodules. I will create a new folder for dysgu call, transfer all the necessary files there, and conduct another test run.
Apologies for the late response; I was leaving work.
Thanks for clearing up all my doubts :))
I am a bit confused now: the module you were working on was clearly using the command dysgu run. We usually name submodules after the command if one is available, so dysgu/call needs to use dysgu call as its command.
Hi @famosab, as the developer of dysgu explains:
"For paired-end data the run command is recommended which wraps fetch and call."
dysgu run is used for paired-end reads and works with both BAM and CRAM files, while for long reads we use dysgu call. It should work with dysgu run.
OK, then we either need two submodules, dysgu/run and dysgu/call, or we make the command depend on the input that is given to the module. I am unsure what would be best practice here. Maybe you can ask in the Slack. Or @maxulysse or @FriederikeHanssen can help, as they are sarek developers.
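To make the second option concrete, here is a minimal sketch of choosing the subcommand from the read type. The function name `dysgu_subcommand` and the labels `paired-end` and `long-read` are hypothetical and not part of dysgu or nf-core. The sketch only picks the subcommand name; the actual arguments would still need to be checked against `dysgu run --help` and `dysgu call --help`.

```bash
#!/bin/sh

# Hypothetical helper: map a read type to the dysgu subcommand discussed above.
# paired-end -> run (wraps fetch and call), long-read -> call.
dysgu_subcommand() {
    case "$1" in
        paired-end) echo "run" ;;
        long-read)  echo "call" ;;
        *)
            # Unknown labels are rejected instead of silently defaulting.
            echo "unknown read type: $1" >&2
            return 1
            ;;
    esac
}

# Show which command each read type would use.
echo "paired-end: dysgu $(dysgu_subcommand paired-end)"
echo "long-read: dysgu $(dysgu_subcommand long-read)"
```

With two separate submodules this logic would live in the pipeline instead, which keeps each module tied to exactly one command and matches the naming convention described earlier.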