AttributeError issue
Have you checked the FAQ? https://github.com/google/deepvariant/blob/r1.6.1/docs/FAQ.md: Yes
Describe the issue:
Setup
- Operating system: Ubuntu 20.04.6 LTS
- DeepVariant version: 1.6
- Installation method (Docker, built from source, etc.): singularity image built from Docker Hub
- Type of data: bacteria whole genome
Steps to reproduce:
- Command:
snakemake pipeline
rule run_deepvariant:
output:
vcf = "../results/deepVariant/{dataset}/{sample}/vcf/{sample}.deepVariant.vcf.gz",
gvcf = "../results/deepVariant/{dataset}/{sample}/vcf/{sample}.deepVariant.g.vcf.gz"
input:
reference_fasta = "/project/databases/bacteroides_genome/reference_genomic.fna",
reads = rules.sam2bam.output.sorted_bam
params:
inter_dir = "../../results/deepVariant/{dataset}/{sample}/intermediate",
log_dir = "../../results/deepVariant/{dataset}/{sample}/log",
work_dir = "/project/",
deepvariant = "/project/software/deepVariant.sif"
shell:
"""
module load singularity/3.7.0
singularity exec -B {params.work_dir} {params.deepvariant} /opt/deepvariant/bin/run_deepvariant
--model_type=WGS
--ref={input.reference_fasta}
--reads={input.reads}
--output_vcf={output.vcf}
--output_gvcf={output.vcf}
--make_examples_extra_args --channels=insert_size
--intermediate_results_dir {params.inter_dir}
--num_shards=6
--logging_dir={params.log_dir}
"""
- Error trace:
***** Running the command:*****
time /opt/deepvariant/bin/vcf_stats_report --input_vcf "../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant.vcf.gz" --outfile_base "../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant"
I0626 19:01:30.369722 139699125458752 genomics_reader.py:222] Reading ../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant.vcf.gz with NativeVcfReader
Traceback (most recent call last):
File "/tmp/Bazel.runfiles_xq721o6r/runfiles/com_google_deepvariant/deepvariant/vcf_stats_report.py", line 103, in
@hangy1 ,
- You can see from the log:
Reading ../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant.vcf.gz
It seems like sample_name is not set correctly, unless you replaced the actual names yourself? Can you confirm that the values are set correctly by printing them before running DeepVariant?
- Please use absolute paths rather than relative paths when you are setting paths.
So instead of using:
gvcf = "../results/deepVariant/{dataset}/{sample}/vcf/{sample}.deepVariant.g.vcf.gz"
Use:
gvcf = "/path/to/vcf/{sample}.deepVariant.g.vcf.gz"
Hope this helps. You can also run the quickstart first: copy-paste the commands, confirm that they run to completion, and then adapt them to the command you are planning to run.
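The two suggestions above (print the values, use absolute paths) can be sketched in a few lines of Python, which is also the language Snakemake rules are written in. This is only an illustration: `resolve_paths` and the `/project/workflow` base directory are hypothetical names, not part of DeepVariant or Snakemake.

```python
import os


def resolve_paths(paths, base_dir=None):
    """Map each name to an absolute, normalized path.

    Relative paths are resolved against base_dir (default: the current
    working directory); absolute paths are only normalized.
    """
    base = base_dir or os.getcwd()
    return {
        name: os.path.normpath(os.path.join(base, path))
        for name, path in paths.items()
    }


# Hypothetical example: print what each relative path expands to
# before handing it to run_deepvariant.
resolved = resolve_paths(
    {
        "output_vcf": "../results/deepVariant/KO_PV/sample/vcf/sample.deepVariant.vcf.gz",
        "intermediate_results_dir": "../../results/deepVariant/KO_PV/sample/intermediate",
    },
    base_dir="/project/workflow",  # assumed working directory
)
for name, path in resolved.items():
    print(f"{name} -> {path}")
```

Printing the resolved values makes it easy to spot paths that use a different number of `../` levels and therefore end up in different directories.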
Sorry about the confusion over <sample_name>; I edited out the actual name. The name appeared correctly, and the files were being generated at the right path too (they were deleted by snakemake because the workflow was incomplete).
It seems from the log that both the VCF and gVCF were generated successfully, but it failed to run vcf_stats_report.py:
***** Running the command:***** time /opt/deepvariant/bin/postprocess_variants --ref "/project/pi_robertmills_umass_edu/databases/bacteroides_genome/<ref_genome>" --infile "../../results/deepVariant/KO_PV/<sample_name>/intermediate/call_variants_output.tfrecord.gz" --outfile "../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant.vcf.gz" --cpus "6" --gvcf_outfile "../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant.vcf.gz" --nonvariant_site_tfrecord_path "../../results/deepVariant/KO_PV/<sample_name>/intermediate/[email protected]"
I0626 19:01:26.925776 139684790912832 postprocess_variants.py:1211] Using sample name from call_variants output. Sample name: default
2024-06-26 19:01:26.928061: I deepvariant/postprocess_variants.cc:94] Read from: ../../results/deepVariant/KO_PV/<sample_name>/intermediate/call_variants_output-00000-of-00001.tfrecord.gz
2024-06-26 19:01:26.930065: I deepvariant/postprocess_variants.cc:109] Total #entries in single_site_calls = 407
I0626 19:01:26.930917 139684790912832 postprocess_variants.py:1313] CVO sorting took 6.503661473592123e-05 minutes
I0626 19:01:26.931080 139684790912832 postprocess_variants.py:1316] Transforming call_variants_output to variants.
I0626 19:01:26.931126 139684790912832 postprocess_variants.py:1318] Using 6 CPUs for parallelization of variant transformation.
I0626 19:01:26.954008 139684790912832 postprocess_variants.py:1211] Using sample name from call_variants output. Sample name: default
I0626 19:01:26.991115 139684790912832 postprocess_variants.py:1386] Processing variants (and writing to temporary file) took 0.00046567519505818686 minutes
I0626 19:01:27.391298 139684790912832 postprocess_variants.py:1407] Finished writing VCF and gVCF in 0.006664212544759115 minutes.
real 0m4.417s
user 0m2.938s
sys 0m0.743s
Also under my target path ../results/deepVariant/KO_PV/<sample_name>/vcf, I do get the file <sample_name>.deepVariant.vcf.gz.tbi
Ok, that means it worked for you. Do you have any further questions?
Only <sample_name>.deepVariant.vcf.gz.tbi (besides the example and call_variants files) was generated; there is no VCF or gVCF in the output dir. I also changed the paths to absolute paths and it didn't help.
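One detail visible in this thread: the Snakemake rule passes {output.vcf} to both --output_vcf and --output_gvcf, and the postprocess_variants log shows the same path for --outfile and --gvcf_outfile. Whether that explains the missing files is not established here, but a quick check for output flags that share a path could look like the sketch below. `find_shared_outputs` is a hypothetical helper, not a DeepVariant function.

```python
import os


def find_shared_outputs(flags):
    """Return groups of flag names whose values normalize to the same path."""
    by_path = {}
    for name, path in flags.items():
        by_path.setdefault(os.path.normpath(path), []).append(name)
    return [sorted(names) for names in by_path.values() if len(names) > 1]


# Values mirror the command in this thread, with a placeholder sample name.
collisions = find_shared_outputs(
    {
        "output_vcf": "../results/deepVariant/KO_PV/sample/vcf/sample.deepVariant.vcf.gz",
        "output_gvcf": "../results/deepVariant/KO_PV/sample/vcf/sample.deepVariant.vcf.gz",
    }
)
for names in collisions:
    print("These flags point at the same file:", ", ".join(names))
```

If the check reports a collision, giving the gVCF its own path (for example the `gvcf` output already declared in the rule) removes the ambiguity.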
@hangy1 , can you please run the quickstart by simply copy-pasting the commands in your system? That way we could pinpoint the issue in a controlled case.
I am closing this due to inactivity. Please reopen if you need further help.
Hello,
I have the exact same issue. In my case, the VCFs, gVCFs and index files are created, but it fails at creating the vcf_stats_report. I have given absolute paths as well.
When I ran the test data, I had the same issue. I tried running the VCF stats report separately as documented here: https://github.com/google/deepvariant/blob/r1.6.1/docs/deepvariant-vcf-stats-report.md and that also didn't work.
Any assistance would be greatly appreciated.
Thank you so much.
Hello @ayeshbond!
Could you please add a log statement with the exact error that you're seeing?
Additionally, you can always disable the creation of the vcf_stats_report in case it's blocking you from running DeepVariant.
Hello @lucasbrambrink,
Thank you very much for the response. I am attaching the log file herewith.
The command I used was:
singularity run -B /usr/lib/locale/:/usr/lib/locale/ deepvariant_1.6.1.sif /opt/deepvariant/bin/run_deepvariant --model_type=WGS --ref="{rest_path}/tools/DeepVariant/ref_genomes/GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta" --reads="{rest_path}/MiniMap_SAM_BAM/11741-KA-0004.sorted.bam" --output_vcf="{rest_path}/deepvar_calls/11741-KA-0004_output.vcf.gz" --output_gvcf="{rest_path}/deepvar_calls/11741-KA-0004_output.g.vcf.gz" --intermediate_results_dir "{rest_path}/deepvar_calls/intermediate_results_dir/0004" --num_shards=32 &> deepvar_0004.log
Also, could you let me know how I can disable the vcf_stats_report? I tried to look for it but had no luck. It doesn't necessarily affect the variant calling, though; the run just reports an error/failure due to this last step.
Thank you very much once again, and please let me know if I can get you any more information.
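For reference, a minimal sketch of how the report step might be switched off when the command is assembled in Python. It assumes that run_deepvariant exposes a boolean flag named vcf_stats_report; that name is not confirmed anywhere in this thread, so please check it against `run_deepvariant --help` for your version. The function name and the file paths are hypothetical.

```python
import shlex


def run_deepvariant_args(ref, reads, output_vcf, output_gvcf, num_shards,
                         stats_report=False):
    """Build the argument list for run_deepvariant."""
    args = [
        "/opt/deepvariant/bin/run_deepvariant",
        "--model_type=WGS",
        f"--ref={ref}",
        f"--reads={reads}",
        f"--output_vcf={output_vcf}",
        f"--output_gvcf={output_gvcf}",
        f"--num_shards={num_shards}",
    ]
    # ASSUMPTION: the boolean flag is called vcf_stats_report.
    # Verify with `run_deepvariant --help` before relying on it.
    args.append(f"--vcf_stats_report={'true' if stats_report else 'false'}")
    return args


# Placeholder paths; substitute your own.
cmd = run_deepvariant_args(
    ref="/path/to/ref.fasta",
    reads="/path/to/sample.sorted.bam",
    output_vcf="/path/to/sample_output.vcf.gz",
    output_gvcf="/path/to/sample_output.g.vcf.gz",
    num_shards=32,
)
print(shlex.join(cmd))
```

The printed string can be appended to the `singularity run ... deepvariant_1.6.1.sif` prefix used above.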