DSL2: metagenomics

Open ilight1542 opened this issue 2 years ago • 12 comments

TODOS:

  • [ ] Ensure files passed to metagenomics profiling are identifiable from their output names, e.g. it should be possible to distinguish reads mapped vs. unmapped to the reference. If the user selects 'all', do we want to merge these and just use the raw sequencing reads? I think this is the expected behavior, but it is not always working.
  • [x] Finish testing for maltextract @merszym
  • [x] Finish testing of various input parameters to ensure proper behavior for outputted files (ensure no overwriting of files is occurring with ext.prefix!)
  • [x] Test malt parallel execution (multiple independent submissions)
  • [x] Test warning and parameter combo checks implemented in eager.nf
  • [x] Add any necessary warnings or errors for parameter combos for metagenomics to eager.nf
  • [x] Double check documentation is correct, and easy to understand
  • [x] Check how mixed single-stranded and double-stranded input into malt currently behaves. It currently does not flag the run as single-stranded if one sample is single-stranded (this info may be wiped from the meta map at the mapping step?). Need to check for a single sample too(!)
  • [x] Update so that malt keeps the single-stranded library prep info for later use in maltextract
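
Two of the checks in the list above (no output files overwriting each other, and not losing the single-stranded flag when libraries are mixed) can be illustrated with a small Python sketch. This is not code from the pipeline: the function names, the `strandedness` key, and its `single`/`double` values are assumptions made for illustration only.

```python
from collections import Counter


def output_prefix(sample_id: str, input_type: str) -> str:
    """Build a hypothetical output prefix that records which reads were profiled."""
    return f"{sample_id}_{input_type}"


def find_prefix_collisions(prefixes):
    """Return, sorted, every prefix that occurs more than once."""
    counts = Counter(prefixes)
    return sorted(p for p, n in counts.items() if n > 1)


def group_has_single_stranded(metas):
    """Return True if any library in the group is marked single-stranded.

    Assumes each meta map is a dict with a 'strandedness' key (hypothetical).
    """
    return any(m.get("strandedness") == "single" for m in metas)


if __name__ == "__main__":
    runs = [("sampleA", "mapped"), ("sampleA", "unmapped"), ("sampleB", "all")]

    # Prefix built from the sample ID alone: both sampleA outputs share a name.
    print(find_prefix_collisions([s for s, _ in runs]))

    # Prefix that also encodes the input type: no shared names.
    print(find_prefix_collisions([output_prefix(s, t) for s, t in runs]))

    # A group with one single-stranded library should still be flagged.
    metas = [{"strandedness": "double"}, {"strandedness": "single"}]
    print(group_has_single_stranded(metas))
```

The point of the sketch is only that a prefix has to encode everything that distinguishes two outputs, and that the single-stranded flag should be computed over the whole group rather than taken from one library.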

PR checklist

  • [ ] This comment contains a description of changes (with reason).
  • [ ] If you've fixed a bug or added code that should be tested, add tests!
    • [ ] If you've added a new tool - add to the software_versions process and a regex to scrape_software_versions.py
    • [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/eager/tree/master/.github/CONTRIBUTING.md)
    • [ ] If necessary, also make a PR on the nf-core/eager branch on the nf-core/test-datasets repository.
  • [x] Make sure your code lints (nf-core lint .).
  • [x] Ensure the test suite passes (nextflow run . -profile test,docker).
  • [x] Usage Documentation in docs/usage.md is updated.
  • [x] Output Documentation in docs/output.md is updated.
  • [ ] CHANGELOG.md is updated.
  • [x] README.md is updated (including new tool citations and authors/contributors).

ilight1542 avatar Aug 07 '23 08:08 ilight1542

This PR is against the master branch :x:

  • Do not close this PR
  • Click Edit and change the base to dev
  • This CI test will remain failed until you push a new commit

Hi @ilight1542,

It looks like this pull-request has been made against the nf-core/eager master branch. The master branch on nf-core repositories should always contain code from the latest release. Because of this, PRs to master are only allowed if they come from the nf-core/eager dev branch.

You do not need to close this PR, you can change the target branch to dev by clicking the "Edit" button at the top of this page. Note that even after this, the test will continue to show as failing until you push a new commit.

Thanks again for your contribution!

github-actions[bot] avatar Aug 07 '23 08:08 github-actions[bot]

@ilight1542 maltextract+AMPS works now. However, there are many optional parameters, so I'll do the comprehensive testing on Friday.

merszym avatar Aug 09 '23 17:08 merszym

All tests except MetaPhlAn have passed today (see attached file).

tests.md

ToDo for the next testing:

  • [ ] optional parameters
  • [ ] check the expected output
  • [ ] update the manual_tests.md file

I'm positive that we'll finish the metagenomics section in the next few weeks :)

merszym avatar Oct 06 '23 10:10 merszym

We should also consider implementing this enhancement for BAM filtering: https://github.com/nf-core/eager/issues/945

ilight1542 avatar Oct 20 '23 11:10 ilight1542

nf-core lint overall result: Passed :white_check_mark: :warning:

Posted for pipeline commit cfbba4d

+| ✅ 367 tests passed       |+
#| ❔   1 tests were ignored |#
!| ❗  22 tests had warnings |!

:heavy_exclamation_mark: Test warnings:

  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • pipeline_todos - TODO string in main.nf: Remove this line if you don't need a FASTA file
  • pipeline_todos - TODO string in nextflow.config: Specify your pipeline's command line flags
  • pipeline_todos - TODO string in README.md: Include a figure that guides the user through the major workflow steps. Many nf-core
  • pipeline_todos - TODO string in README.md: Fill in short bullet-pointed list of the default steps in the pipeline
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.
  • pipeline_todos - TODO string in ci.yml: You can customise CI pipeline run tests as required
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • pipeline_todos - TODO string in test.config: Specify the paths to your test data on nf-core/test-datasets
  • pipeline_todos - TODO string in test.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in test_humanbam.config: Specify the paths to your test data on nf-core/test-datasets
  • pipeline_todos - TODO string in test_humanbam.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
  • pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
  • pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in test_nothing.config: Specify the paths to your test data on nf-core/test-datasets
  • pipeline_todos - TODO string in test_nothing.config: Give any required params for the test so that command line flags are not needed

:grey_question: Tests ignored:

  • nextflow_config - Config default ignored: params.contamination_estimation_angsd_hapmap

:white_check_mark: Tests passed:

  • files_exist - File found: .gitattributes
  • files_exist - File found: .gitignore
  • files_exist - File found: .nf-core.yml
  • files_exist - File found: .editorconfig
  • files_exist - File found: .prettierignore
  • files_exist - File found: .prettierrc.yml
  • files_exist - File found: CHANGELOG.md
  • files_exist - File found: CITATIONS.md
  • files_exist - File found: CODE_OF_CONDUCT.md
  • files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
  • files_exist - File found: nextflow_schema.json
  • files_exist - File found: nextflow.config
  • files_exist - File found: README.md
  • files_exist - File found: .github/.dockstore.yml
  • files_exist - File found: .github/CONTRIBUTING.md
  • files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
  • files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
  • files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
  • files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
  • files_exist - File found: .github/workflows/branch.yml
  • files_exist - File found: .github/workflows/ci.yml
  • files_exist - File found: .github/workflows/linting_comment.yml
  • files_exist - File found: .github/workflows/linting.yml
  • files_exist - File found: assets/email_template.html
  • files_exist - File found: assets/email_template.txt
  • files_exist - File found: assets/sendmail_template.txt
  • files_exist - File found: assets/nf-core-eager_logo_light.png
  • files_exist - File found: conf/modules.config
  • files_exist - File found: conf/test.config
  • files_exist - File found: conf/test_full.config
  • files_exist - File found: docs/images/nf-core-eager_logo_light.png
  • files_exist - File found: docs/images/nf-core-eager_logo_dark.png
  • files_exist - File found: docs/output.md
  • files_exist - File found: docs/README.md
  • files_exist - File found: docs/usage.md
  • files_exist - File found: main.nf
  • files_exist - File found: assets/multiqc_config.yml
  • files_exist - File found: conf/base.config
  • files_exist - File found: conf/igenomes.config
  • files_exist - File found: .github/workflows/awstest.yml
  • files_exist - File found: .github/workflows/awsfulltest.yml
  • files_exist - File found: modules.json
  • files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
  • files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
  • files_exist - File not found check: .github/workflows/push_dockerhub.yml
  • files_exist - File not found check: .markdownlint.yml
  • files_exist - File not found check: .nf-core.yaml
  • files_exist - File not found check: .yamllint.yml
  • files_exist - File not found check: bin/markdown_to_html.r
  • files_exist - File not found check: conf/aws.config
  • files_exist - File not found check: docs/images/nf-core-eager_logo.png
  • files_exist - File not found check: lib/Checks.groovy
  • files_exist - File not found check: lib/Completion.groovy
  • files_exist - File not found check: lib/NfcoreTemplate.groovy
  • files_exist - File not found check: lib/Utils.groovy
  • files_exist - File not found check: lib/Workflow.groovy
  • files_exist - File not found check: lib/WorkflowMain.groovy
  • files_exist - File not found check: lib/WorkflowEager.groovy
  • files_exist - File not found check: parameters.settings.json
  • files_exist - File not found check: pipeline_template.yml
  • files_exist - File not found check: Singularity
  • files_exist - File not found check: lib/nfcore_external_java_deps.jar
  • files_exist - File not found check: .travis.yml
  • nextflow_config - Config variable found: manifest.name
  • nextflow_config - Config variable found: manifest.nextflowVersion
  • nextflow_config - Config variable found: manifest.description
  • nextflow_config - Config variable found: manifest.version
  • nextflow_config - Config variable found: manifest.homePage
  • nextflow_config - Config variable found: timeline.enabled
  • nextflow_config - Config variable found: trace.enabled
  • nextflow_config - Config variable found: report.enabled
  • nextflow_config - Config variable found: dag.enabled
  • nextflow_config - Config variable found: process.cpus
  • nextflow_config - Config variable found: process.memory
  • nextflow_config - Config variable found: process.time
  • nextflow_config - Config variable found: params.outdir
  • nextflow_config - Config variable found: params.input
  • nextflow_config - Config variable found: params.validationShowHiddenParams
  • nextflow_config - Config variable found: params.validationSchemaIgnoreParams
  • nextflow_config - Config variable found: manifest.mainScript
  • nextflow_config - Config variable found: timeline.file
  • nextflow_config - Config variable found: trace.file
  • nextflow_config - Config variable found: report.file
  • nextflow_config - Config variable found: dag.file
  • nextflow_config - Config variable (correctly) not found: params.nf_required_version
  • nextflow_config - Config variable (correctly) not found: params.container
  • nextflow_config - Config variable (correctly) not found: params.singleEnd
  • nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
  • nextflow_config - Config variable (correctly) not found: params.name
  • nextflow_config - Config variable (correctly) not found: params.enable_conda
  • nextflow_config - Config timeline.enabled had correct value: true
  • nextflow_config - Config report.enabled had correct value: true
  • nextflow_config - Config trace.enabled had correct value: true
  • nextflow_config - Config dag.enabled had correct value: true
  • nextflow_config - Config manifest.name began with nf-core/
  • nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
  • nextflow_config - Config dag.file ended with .html
  • nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
  • nextflow_config - Config manifest.version ends in dev: 3.0.0dev
  • nextflow_config - Config params.custom_config_version is set to master
  • nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
  • nextflow_config - Lines for loading custom profiles found
  • nextflow_config - nextflow.config contains configuration profile test
  • nextflow_config - Config default value correct: params.igenomes_base= s3://ngi-igenomes/igenomes/
  • nextflow_config - Config default value correct: params.fasta_circularmapper_elongationfactor= 500
  • nextflow_config - Config default value correct: params.custom_config_version= master
  • nextflow_config - Config default value correct: params.custom_config_base= https://raw.githubusercontent.com/nf-core/configs/master
  • nextflow_config - Config default value correct: params.max_cpus= 16
  • nextflow_config - Config default value correct: params.max_memory= 128.GB
  • nextflow_config - Config default value correct: params.max_time= 240.h
  • nextflow_config - Config default value correct: params.publish_dir_mode= copy
  • nextflow_config - Config default value correct: params.max_multiqc_email_size= 25.MB
  • nextflow_config - Config default value correct: params.validate_params= true
  • nextflow_config - Config default value correct: params.pipelines_testdata_base_path= https://raw.githubusercontent.com/nf-core/test-datasets/
  • nextflow_config - Config default value correct: params.sequencing_qc_tool= fastqc
  • nextflow_config - Config default value correct: params.preprocessing_tool= fastp
  • nextflow_config - Config default value correct: params.preprocessing_minlength= 25
  • nextflow_config - Config default value correct: params.preprocessing_trim5p= 0
  • nextflow_config - Config default value correct: params.preprocessing_trim3p= 0
  • nextflow_config - Config default value correct: params.preprocessing_fastp_complexityfilter_threshold= 10
  • nextflow_config - Config default value correct: params.preprocessing_adapterremoval_trimbasequalitymin= 20
  • nextflow_config - Config default value correct: params.preprocessing_adapterremoval_adapteroverlap= 1
  • nextflow_config - Config default value correct: params.preprocessing_adapterremoval_qualitymax= 41
  • nextflow_config - Config default value correct: params.fastq_shard_size= 1000000
  • nextflow_config - Config default value correct: params.mapping_tool= bwaaln
  • nextflow_config - Config default value correct: params.mapping_bwaaln_n= 0.01
  • nextflow_config - Config default value correct: params.mapping_bwaaln_k= 2
  • nextflow_config - Config default value correct: params.mapping_bwaaln_l= 1024
  • nextflow_config - Config default value correct: params.mapping_bwaaln_o= 2
  • nextflow_config - Config default value correct: params.mapping_bwamem_k= 19
  • nextflow_config - Config default value correct: params.mapping_bwamem_r= 1.5
  • nextflow_config - Config default value correct: params.mapping_bowtie2_alignmode= local
  • nextflow_config - Config default value correct: params.mapping_bowtie2_sensitivity= sensitive
  • nextflow_config - Config default value correct: params.mapping_bowtie2_n= 0
  • nextflow_config - Config default value correct: params.mapping_bowtie2_l= 20
  • nextflow_config - Config default value correct: params.mapping_bowtie2_trim5= 0
  • nextflow_config - Config default value correct: params.mapping_bowtie2_trim3= 0
  • nextflow_config - Config default value correct: params.mapping_bowtie2_maxins= 500
  • nextflow_config - Config default value correct: params.bamfiltering_minreadlength= 0
  • nextflow_config - Config default value correct: params.bamfiltering_mappingquality= 0
  • nextflow_config - Config default value correct: params.bamfilter_genomicbamfilterflag= 4
  • nextflow_config - Config default value correct: params.metagenomics_input= unmapped
  • nextflow_config - Config default value correct: params.metagenomics_complexity_tool= bbduk
  • nextflow_config - Config default value correct: params.metagenomics_complexity_entropy= 0.3
  • nextflow_config - Config default value correct: params.metagenomics_prinseq_mode= entropy
  • nextflow_config - Config default value correct: params.metagenomics_prinseq_dustscore= 0.5
  • nextflow_config - Config default value correct: params.metagenomics_krakenuniq_ramchunksize= 16G
  • nextflow_config - Config default value correct: params.metagenomics_malt_mode= BlastN
  • nextflow_config - Config default value correct: params.metagenomics_malt_alignmentmode= SemiGlobal
  • nextflow_config - Config default value correct: params.metagenomics_malt_minpercentidentity= 85
  • nextflow_config - Config default value correct: params.metagenomics_malt_toppercent= 1
  • nextflow_config - Config default value correct: params.metagenomics_malt_minsupportmode= percent
  • nextflow_config - Config default value correct: params.metagenomics_malt_minsupportpercent= 0.01
  • nextflow_config - Config default value correct: params.metagenomics_malt_minsupportreads= 1
  • nextflow_config - Config default value correct: params.metagenomics_malt_maxqueries= 100
  • nextflow_config - Config default value correct: params.metagenomics_malt_memorymode= load
  • nextflow_config - Config default value correct: params.metagenomics_malt_group_size= 0
  • nextflow_config - Config default value correct: params.metagenomics_maltextract_filter= def_anc
  • nextflow_config - Config default value correct: params.metagenomics_maltextract_toppercent= 0.01
  • nextflow_config - Config default value correct: params.metagenomics_maltextract_minpercentidentity= 85.0
  • nextflow_config - Config default value correct: params.deduplication_tool= markduplicates
  • nextflow_config - Config default value correct: params.damage_manipulation_rescale_seqlength= 12
  • nextflow_config - Config default value correct: params.damage_manipulation_rescale_length_5p= 0
  • nextflow_config - Config default value correct: params.damage_manipulation_rescale_length_3p= 0
  • nextflow_config - Config default value correct: params.damage_manipulation_pmdtools_threshold= 3
  • nextflow_config - Config default value correct: params.damage_manipulation_bamutils_trim_double_stranded_none_udg_left= 0
  • nextflow_config - Config default value correct: params.damage_manipulation_bamutils_trim_double_stranded_none_udg_right= 0
  • nextflow_config - Config default value correct: params.damage_manipulation_bamutils_trim_double_stranded_half_udg_left= 0
  • nextflow_config - Config default value correct: params.damage_manipulation_bamutils_trim_double_stranded_half_udg_right= 0
  • nextflow_config - Config default value correct: params.damage_manipulation_bamutils_trim_single_stranded_none_udg_left= 0
  • nextflow_config - Config default value correct: params.damage_manipulation_bamutils_trim_single_stranded_none_udg_right= 0
  • nextflow_config - Config default value correct: params.damage_manipulation_bamutils_trim_single_stranded_half_udg_left= 0
  • nextflow_config - Config default value correct: params.damage_manipulation_bamutils_trim_single_stranded_half_udg_right= 0
  • nextflow_config - Config default value correct: params.genotyping_reference_ploidy= 2
  • nextflow_config - Config default value correct: params.genotyping_pileupcaller_min_base_quality= 30
  • nextflow_config - Config default value correct: params.genotyping_pileupcaller_min_map_quality= 30
  • nextflow_config - Config default value correct: params.genotyping_pileupcaller_method= randomHaploid
  • nextflow_config - Config default value correct: params.genotyping_pileupcaller_transitions_mode= AllSites
  • nextflow_config - Config default value correct: params.genotyping_gatk_call_conf= 30
  • nextflow_config - Config default value correct: params.genotyping_gatk_ug_downsample= 250
  • nextflow_config - Config default value correct: params.genotyping_gatk_ug_out_mode= EMIT_VARIANTS_ONLY
  • nextflow_config - Config default value correct: params.genotyping_gatk_ug_genotype_mode= SNP
  • nextflow_config - Config default value correct: params.genotyping_gatk_ug_defaultbasequalities= -1
  • nextflow_config - Config default value correct: params.genotyping_gatk_hc_out_mode= EMIT_VARIANTS_ONLY
  • nextflow_config - Config default value correct: params.genotyping_gatk_hc_emitrefconf= GVCF
  • nextflow_config - Config default value correct: params.genotyping_freebayes_min_alternate_count= 1
  • nextflow_config - Config default value correct: params.genotyping_freebayes_skip_coverage= 0
  • nextflow_config - Config default value correct: params.genotyping_angsd_glmodel= samtools
  • nextflow_config - Config default value correct: params.genotyping_angsd_glformat= binary
  • nextflow_config - Config default value correct: params.mitochondrion_header= MT
  • nextflow_config - Config default value correct: params.mapstats_preseq_mode= c_curve
  • nextflow_config - Config default value correct: params.mapstats_preseq_stepsize= 1000
  • nextflow_config - Config default value correct: params.mapstats_preseq_terms= 100
  • nextflow_config - Config default value correct: params.mapstats_preseq_maxextrap= 10000000000
  • nextflow_config - Config default value correct: params.mapstats_preseq_bootstrap= 100
  • nextflow_config - Config default value correct: params.mapstats_preseq_cval= 0.95
  • nextflow_config - Config default value correct: params.damagecalculation_tool= damageprofiler
  • nextflow_config - Config default value correct: params.damagecalculation_yaxis= 0.3
  • nextflow_config - Config default value correct: params.damagecalculation_xaxis= 25
  • nextflow_config - Config default value correct: params.damagecalculation_damageprofiler_length= 100
  • nextflow_config - Config default value correct: params.damagecalculation_mapdamage_downsample= 0
  • nextflow_config - Config default value correct: params.host_removal_mode= remove
  • nextflow_config - Config default value correct: params.contamination_estimation_angsd_chrom_name= X
  • nextflow_config - Config default value correct: params.contamination_estimation_angsd_range_from= 5000000
  • nextflow_config - Config default value correct: params.contamination_estimation_angsd_range_to= 154900000
  • nextflow_config - Config default value correct: params.contamination_estimation_angsd_mapq= 30
  • nextflow_config - Config default value correct: params.contamination_estimation_angsd_minq= 30
  • files_unchanged - .gitattributes matches the template
  • files_unchanged - .prettierrc.yml matches the template
  • files_unchanged - CODE_OF_CONDUCT.md matches the template
  • files_unchanged - LICENSE matches the template
  • files_unchanged - .github/.dockstore.yml matches the template
  • files_unchanged - .github/CONTRIBUTING.md matches the template
  • files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
  • files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
  • files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
  • files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
  • files_unchanged - .github/workflows/branch.yml matches the template
  • files_unchanged - .github/workflows/linting_comment.yml matches the template
  • files_unchanged - .github/workflows/linting.yml matches the template
  • files_unchanged - assets/email_template.html matches the template
  • files_unchanged - assets/email_template.txt matches the template
  • files_unchanged - assets/sendmail_template.txt matches the template
  • files_unchanged - assets/nf-core-eager_logo_light.png matches the template
  • files_unchanged - docs/images/nf-core-eager_logo_light.png matches the template
  • files_unchanged - docs/images/nf-core-eager_logo_dark.png matches the template
  • files_unchanged - docs/README.md matches the template
  • files_unchanged - .gitignore matches the template
  • files_unchanged - .prettierignore matches the template
  • actions_ci - '.github/workflows/ci.yml' is triggered on expected events
  • actions_ci - '.github/workflows/ci.yml' checks minimum NF version
  • actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
  • actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
  • actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
  • readme - README Nextflow minimum version badge matched config. Badge: 23.04.0, Config: 23.04.0
  • pipeline_name_conventions - Name adheres to nf-core convention
  • template_strings - Did not find any Jinja template strings (343 files)
  • schema_lint - Schema lint passed
  • schema_lint - Schema title + description lint passed
  • schema_lint - Input mimetype lint passed: 'text/csv'
  • schema_params - Schema matched params returned from nextflow config
  • system_exit - No System.exit calls found
  • actions_schema_validation - Workflow validation passed: awstest.yml
  • actions_schema_validation - Workflow validation passed: branch.yml
  • actions_schema_validation - Workflow validation passed: fix-linting.yml
  • actions_schema_validation - Workflow validation passed: linting.yml
  • actions_schema_validation - Workflow validation passed: clean-up.yml
  • actions_schema_validation - Workflow validation passed: ci.yml
  • actions_schema_validation - Workflow validation passed: linting_comment.yml
  • actions_schema_validation - Workflow validation passed: awsfulltest.yml
  • actions_schema_validation - Workflow validation passed: download_pipeline.yml
  • actions_schema_validation - Workflow validation passed: release-announcements.yml
  • merge_markers - No merge markers found in pipeline files
  • modules_json - Only installed modules found in modules.json
  • multiqc_config - assets/multiqc_config.yml found and not ignored.
  • multiqc_config - assets/multiqc_config.yml contains report_section_order
  • multiqc_config - assets/multiqc_config.yml contains export_plots
  • multiqc_config - assets/multiqc_config.yml contains report_comment
  • multiqc_config - assets/multiqc_config.yml follows the ordering scheme of the minimally required plugins.
  • multiqc_config - assets/multiqc_config.yml contains a matching 'report_comment'.
  • multiqc_config - assets/multiqc_config.yml contains 'export_plots: true'.
  • modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
  • base_config - conf/base.config found and not ignored.
  • modules_config - conf/modules.config found and not ignored.
  • modules_config - SAMTOOLS_CONVERT_BAM_INPUT found in conf/modules.config and Nextflow scripts.
  • modules_config - CAT_FASTQ_CONVERTED_BAM found in conf/modules.config and Nextflow scripts.
  • modules_config - FASTQC found in conf/modules.config and Nextflow scripts.
  • modules_config - FASTQC_PROCESSED found in conf/modules.config and Nextflow scripts.
  • modules_config - MULTIQC found in conf/modules.config and Nextflow scripts.
  • modules_config - FALCO found in conf/modules.config and Nextflow scripts.
  • modules_config - FALCO_PROCESSED found in conf/modules.config and Nextflow scripts.
  • modules_config - FASTP_SINGLE found in conf/modules.config and Nextflow scripts.
  • modules_config - FASTP_PAIRED found in conf/modules.config and Nextflow scripts.
  • modules_config - ADAPTERREMOVAL_SINGLE found in conf/modules.config and Nextflow scripts.
  • modules_config - ADAPTERREMOVAL_PAIRED found in conf/modules.config and Nextflow scripts.
  • modules_config - CAT_FASTQ_ADAPTERREMOVAL found in conf/modules.config and Nextflow scripts.
  • modules_config - GUNZIP_FASTA found in conf/modules.config and Nextflow scripts.
  • modules_config - GUNZIP_PMDFASTA found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_FAIDX found in conf/modules.config and Nextflow scripts.
  • modules_config - PICARD_CREATESEQUENCEDICTIONARY found in conf/modules.config and Nextflow scripts.
  • modules_config - BOWTIE2_BUILD found in conf/modules.config and Nextflow scripts.
  • modules_config - BWA_INDEX found in conf/modules.config and Nextflow scripts.
  • modules_config - GUNZIP_ELONGATED_FASTA found in conf/modules.config and Nextflow scripts.
  • modules_config - CIRCULARMAPPER_CIRCULARGENERATOR found in conf/modules.config and Nextflow scripts.
  • modules_config - BWA_INDEX_CIRCULARISED found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_FLAGSTATS_BAM_INPUT found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_INDEX_BAM_INPUT found in conf/modules.config and Nextflow scripts.
  • modules_config - CAT_FASTQ_UNMAPPED found in conf/modules.config and Nextflow scripts.
  • modules_config - FILTER_BAM_FRAGMENT_LENGTH found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_FASTQ_UNMAPPED found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_VIEW_BAM_FILTERING found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_LENGTH_FILTER_INDEX found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_FASTQ_MAPPED found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_FLAGSTAT_FILTERED found in conf/modules.config and Nextflow scripts.
  • modules_config - SEQKIT_SPLIT2 found in conf/modules.config and Nextflow scripts.
  • modules_config - BWA_ALN found in conf/modules.config and Nextflow scripts.
  • modules_config - BWA_SAMSE found in conf/modules.config and Nextflow scripts.
  • modules_config - ENDORSPY found in conf/modules.config and Nextflow scripts.
  • modules_config - BWA_MEM found in conf/modules.config and Nextflow scripts.
  • modules_config - BOWTIE2_ALIGN found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_INDEX_MEM found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_MERGE_LANES found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_SORT_MERGED_LANES found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_INDEX_MERGED_LANES found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_FLAGSTAT_MERGED_LANES found in conf/modules.config and Nextflow scripts.
  • modules_config - CIRCULARMAPPER_REALIGNSAMFILE found in conf/modules.config and Nextflow scripts.
  • modules_config - PICARD_MARKDUPLICATES found in conf/modules.config and Nextflow scripts.
  • modules_config - DEDUP found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_MERGE_DEDUPPED found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_SORT_DEDUPPED found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_INDEX_DEDUPPED found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_FLAGSTAT_DEDUPPED found in conf/modules.config and Nextflow scripts.
  • modules_config - HOST_REMOVAL found in conf/modules.config and Nextflow scripts.
  • modules_config - PRESEQ_CCURVE found in conf/modules.config and Nextflow scripts.
  • modules_config - PRESEQ_LCEXTRAP found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_VIEW_GENOME found in conf/modules.config and Nextflow scripts.
  • modules_config - BEDTOOLS_COVERAGE_DEPTH found in conf/modules.config and Nextflow scripts.
  • modules_config - BEDTOOLS_COVERAGE_BREADTH found in conf/modules.config and Nextflow scripts.
  • modules_config - BEDTOOLS_MASKFASTA found in conf/modules.config and Nextflow scripts.
  • modules_config - MAPDAMAGE2 found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_INDEX_DAMAGE_RESCALED found in conf/modules.config and Nextflow scripts.
  • modules_config - PMDTOOLS_FILTER found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_INDEX_DAMAGE_FILTERED found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_FLAGSTAT_DAMAGE_FILTERED found in conf/modules.config and Nextflow scripts.
  • modules_config - BAMUTIL_TRIMBAM found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_INDEX_DAMAGE_TRIMMED found in conf/modules.config and Nextflow scripts.
  • modules_config - ANGSD_DOCOUNTS found in conf/modules.config and Nextflow scripts.
  • modules_config - ANGSD_CONTAMINATION found in conf/modules.config and Nextflow scripts.
  • modules_config - PRINT_CONTAMINATION_ANGSD found in conf/modules.config and Nextflow scripts.
  • modules_config - MTNUCRATIO found in conf/modules.config and Nextflow scripts.
  • modules_config - PRINSEQPLUSPLUS found in conf/modules.config and Nextflow scripts.
  • modules_config - BBMAP_BBDUK found in conf/modules.config and Nextflow scripts.
  • modules_config - MALT_RUN found in conf/modules.config and Nextflow scripts.
  • modules_config - CAT_CAT_MALT found in conf/modules.config and Nextflow scripts.
  • modules_config - KRAKEN2_KRAKEN2 found in conf/modules.config and Nextflow scripts.
  • modules_config - KRAKENUNIQ_PRELOADEDKRAKENUNIQ found in conf/modules.config and Nextflow scripts.
  • modules_config - METAPHLAN_METAPHLAN found in conf/modules.config and Nextflow scripts.
  • modules_config - MALTEXTRACT found in conf/modules.config and Nextflow scripts.
  • modules_config - MEGAN_RMA2INFO found in conf/modules.config and Nextflow scripts.
  • modules_config - AMPS found in conf/modules.config and Nextflow scripts.
  • modules_config - TAXPASTA_MERGE found in conf/modules.config and Nextflow scripts.
  • modules_config - TAXPASTA_STANDARDISE found in conf/modules.config and Nextflow scripts.
  • modules_config - QUALIMAP_BAMQC_WITHBED found in conf/modules.config and Nextflow scripts.
  • modules_config - DAMAGEPROFILER found in conf/modules.config and Nextflow scripts.
  • modules_config - CALCULATE_MAPDAMAGE2 found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_DEPTH_SEXDETERRMINE found in conf/modules.config and Nextflow scripts.
  • modules_config - SEXDETERRMINE found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_MERGE_LIBRARIES found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_SORT_MERGED_LIBRARIES found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_INDEX_MERGED_LIBRARIES found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_FLAGSTAT_MERGED_LIBRARIES found in conf/modules.config and Nextflow scripts.
  • modules_config - SAMTOOLS_MPILEUP_PILEUPCALLER found in conf/modules.config and Nextflow scripts.
  • modules_config - SEQUENCETOOLS_PILEUPCALLER found in conf/modules.config and Nextflow scripts.
  • modules_config - COLLECT_GENOTYPES found in conf/modules.config and Nextflow scripts.
  • modules_config - EIGENSTRATDATABASETOOLS_EIGENSTRATSNPCOVERAGE found in conf/modules.config and Nextflow scripts.
  • modules_config - GATK_REALIGNERTARGETCREATOR found in conf/modules.config and Nextflow scripts.
  • modules_config - GATK_INDELREALIGNER found in conf/modules.config and Nextflow scripts.
  • modules_config - GATK_UNIFIEDGENOTYPER found in conf/modules.config and Nextflow scripts.
  • modules_config - BCFTOOLS_INDEX_UG found in conf/modules.config and Nextflow scripts.
  • modules_config - GATK4_HAPLOTYPECALLER found in conf/modules.config and Nextflow scripts.
  • modules_config - FREEBAYES found in conf/modules.config and Nextflow scripts.
  • modules_config - BCFTOOLS_INDEX_FREEBAYES found in conf/modules.config and Nextflow scripts.
  • modules_config - BCFTOOLS_STATS_GENOTYPING found in conf/modules.config and Nextflow scripts.
  • modules_config - ANGSD_GL found in conf/modules.config and Nextflow scripts.
  • nfcore_yml - Repository type in .nf-core.yml is valid: pipeline
  • nfcore_yml - nf-core version in .nf-core.yml is set to the latest version: 2.14.1

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-09-02 09:03:08

github-actions[bot] avatar Nov 03 '23 09:11 github-actions[bot]

RE: keeping strandedness. Since the only meta in malt-run is the meta with the list of read files, keeping info on which samples have single-stranded library prep must be done in multiple malt runs (unless we want to slightly rewrite the malt-run module).

Unless we can keep the meta info and then somehow remerge it with the various rma6 files channel that we get from MALT.out.rma6, I think we need to split the rma6 files by strandedness first and then send them into malt.

Possibility for maintaining strandedness info for downstream maltextract (within metagenomics_profiling.nf):

    reads
        .branch {
            doublestranded: it[0].strandedness == 'double'
            singlestranded: it[0].strandedness == 'single'
        }
        .set { strandedness_ch }

ilight1542 avatar Feb 23 '24 11:02 ilight1542

The only downstream process that relies on strandedness information is maltextract, so we should branch as late as possible (after MALT) and concat the channels directly afterwards.

Problem: The maltextract module doesn't take a meta map in the input channels... Solution: Update the module
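A minimal sketch of the "branch late, concat afterwards" idea, assuming the MALT output arrives as `[ meta, rma6 ]` tuples and that `meta.strandedness` is still present at that point. The channel names and file names are hypothetical and not taken from the eager code base; the maltextract calls themselves are left out.

```nextflow
// Hypothetical sketch: names and channel contents are illustrative only.
workflow {
    // Stand-in for [ meta, rma6 ] tuples coming out of MALT
    ch_rma6 = Channel.of(
        [ [ id: 'lib1', strandedness: 'double' ], 'lib1.rma6' ],
        [ [ id: 'lib2', strandedness: 'single' ], 'lib2.rma6' ]
    )

    // Branch as late as possible, i.e. directly before maltextract
    ch_by_strand = ch_rma6.branch { meta, rma6 ->
        singlestranded: meta.strandedness == 'single'
        doublestranded: true
    }

    // Each branch would feed its own maltextract call here (omitted).
    // Afterwards the two branches are joined back into one channel.
    ch_by_strand.singlestranded
        .concat( ch_by_strand.doublestranded )
        .map { meta, rma6 -> "${meta.id} (${meta.strandedness}): ${rma6}" }
        .view()
}
```

`concat` keeps the order of the source channels; `mix` would also work if ordering does not matter.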

merszym avatar Mar 19 '24 08:03 merszym

The only downstream process that relies on strandedness information is maltextract, so we should branch as late as possible (after MALT) and concat the channels directly afterwards.

Problem: The maltextract module doesn't take a meta map in the input channels... Solution: Update the module

  • [x] Update the maltextract module (in Review) (https://github.com/nf-core/modules/pull/5244)
  • [x] Update the module in eager
  • [x] Keep strandedness-information in the workflow
  • [x] Do the branching for malt_group_size == 0
  • [x] Do the branching for malt_group_size > 0

And finally...

  • [x] Taxpasta-merge works only for >1 sample, add taxpasta_standardize as option
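One way the merge-versus-standardise choice could be routed, as a rough sketch only: collect all profiles into one list and branch on its size. The file name and branch labels are hypothetical, and the actual TAXPASTA_MERGE / TAXPASTA_STANDARDISE process calls are omitted.

```nextflow
// Hypothetical sketch: not the eager implementation.
workflow {
    // Stand-in for the profiles produced by one profiler; here only one sample
    ch_profiles = Channel.of( 'sampleA.profile.txt' )

    // collect() emits a single list holding every profile,
    // so the list size tells us whether merging is possible
    ch_grouped = ch_profiles
        .collect()
        .branch { profiles ->
            multiple: profiles.size() > 1
            single: true
        }

    // More than one profile: would go to taxpasta merge
    ch_grouped.multiple
        .map { profiles -> "merge: ${profiles}" }
        .view()

    // Exactly one profile: would go to taxpasta standardise
    ch_grouped.single
        .map { profiles -> "standardise: ${profiles}" }
        .view()
}
```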

merszym avatar Mar 19 '24 12:03 merszym

I would bundle all documentation-related comments into a separate issue, so that we can merge the (working) branch into dev and then finish the documentation "on top". That way we don't have to go through all the files again and again, and we don't diverge from the dev branch.

merszym avatar Apr 12 '24 12:04 merszym

Open ToDos from code review (after test profiles)

  • [x] Double check test profiles
  • [x] Channel staging (no channel creation from params in the subworkflows)
  • [x] Check if data could still be paired end in metagenomics workflows
  • [x] Maltextract: saveAs directive to drop 'results' folder
  • [ ] Documentation (after merge)
  • [ ] Check channel manipulation before metagenomics (meta.clone()) (after merge)
  • [ ] Multiple Hosts? (new Issue and new PR!)

If I missed anything, please correct me

merszym avatar Apr 26 '24 09:04 merszym

@jfy133 -- I think it is all set for review once more. A quick update RE: strandedness going into metagenomics screening: with the way bamfiltering is currently done, the per-sample outputs (e.g. mapped R1, R2, singletons, unmapped...) are always concatenated into a single channel and run independently.

A major revision of how the I/O from bamfiltering is parsed into metagenomics would be required to get it working while also maintaining metadata for PE reads. Merlin and I feel this is more appropriate as a separate PR/extension.

ilight1542 avatar Jun 28 '24 14:06 ilight1542

(not strandedness as in double/single stranded libraries, but in sequencing mode (paired end, single read))

Currently all channels coming into the metagenomics have the single_end=true parameter in the meta.
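For illustration, a sketch of how a `[ meta, reads ]` tuple could end up with `single_end: true` without modifying the original meta map. The tuple contents are hypothetical, and this is not necessarily how the subworkflow sets the flag.

```nextflow
// Hypothetical sketch: channel contents are illustrative only.
workflow {
    // Stand-in for a tuple as it might leave bamfiltering
    ch_reads = Channel.of(
        [ [ id: 'lib1', single_end: false ], 'lib1_unmapped.fastq.gz' ]
    )

    ch_for_metagenomics = ch_reads.map { meta, reads ->
        // `meta + [...]` returns a new map, so the upstream meta is left untouched
        [ meta + [ single_end: true ], reads ]
    }

    ch_for_metagenomics
        .map { meta, reads -> "${meta.id}: single_end=${meta.single_end}" }
        .view()
}
```

Building a new map rather than assigning into `meta` avoids side effects on other channels that share the same map object.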

merszym avatar Jun 28 '24 14:06 merszym

@jfy133 @ilight1542 All remaining issues are fixed/documented/updated :)

merszym avatar Aug 16 '24 11:08 merszym

@nf-core-bot fix linting

jfy133 avatar Sep 02 '24 08:09 jfy133