AttributeError issue

Open hangy1 opened this issue 1 year ago • 5 comments

Have you checked the FAQ? https://github.com/google/deepvariant/blob/r1.6.1/docs/FAQ.md: Yes

Describe the issue:

Setup

  • Operating system: Ubuntu 20.04.6 LTS
  • DeepVariant version: 1.6
  • Installation method (Docker, built from source, etc.): Singularity image built from Docker Hub
  • Type of data: bacteria whole genome

Steps to reproduce:

  • Command: snakemake pipeline

    rule run_deepvariant:
        output:
            vcf = "../results/deepVariant/{dataset}/{sample}/vcf/{sample}.deepVariant.vcf.gz",
            gvcf = "../results/deepVariant/{dataset}/{sample}/vcf/{sample}.deepVariant.g.vcf.gz"
        input:
            reference_fasta = "/project/databases/bacteroides_genome/reference_genomic.fna",
            reads = rules.sam2bam.output.sorted_bam
        params:
            inter_dir = "../../results/deepVariant/{dataset}/{sample}/intermediate",
            log_dir = "../../results/deepVariant/{dataset}/{sample}/log",
            work_dir = "/project/",
            deepvariant = "/project/software/deepVariant.sif"
        shell:
            """
            module load singularity/3.7.0
            singularity exec -B {params.work_dir} {params.deepvariant} /opt/deepvariant/bin/run_deepvariant
                --model_type=WGS
                --ref={input.reference_fasta}
                --reads={input.reads}
                --output_vcf={output.vcf}
                --output_gvcf={output.vcf}
                --make_examples_extra_args --channels=insert_size
                --intermediate_results_dir {params.inter_dir}
                --num_shards=6
                --logging_dir={params.log_dir}
            """
  • Error trace:

***** Running the command:*****
time /opt/deepvariant/bin/vcf_stats_report --input_vcf "../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant.vcf.gz" --outfile_base "../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant"

I0626 19:01:30.369722 139699125458752 genomics_reader.py:222] Reading ../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant.vcf.gz with NativeVcfReader
Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_xq721o6r/runfiles/com_google_deepvariant/deepvariant/vcf_stats_report.py", line 103, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/platform/app.py", line 36, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/tmp/Bazel.runfiles_xq721o6r/runfiles/absl_py/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/tmp/Bazel.runfiles_xq721o6r/runfiles/absl_py/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/tmp/Bazel.runfiles_xq721o6r/runfiles/com_google_deepvariant/deepvariant/vcf_stats_report.py", line 93, in main
    vcf_stats.create_vcf_report(
  File "/tmp/Bazel.runfiles_xq721o6r/runfiles/com_google_deepvariant/deepvariant/vcf_stats.py", line 392, in create_vcf_report
    vcf_stats_vis.create_visual_report(
  File "/tmp/Bazel.runfiles_xq721o6r/runfiles/com_google_deepvariant/deepvariant/vcf_stats_vis.py", line 543, in create_visual_report
    _save_html(basename, all_charts)
  File "/tmp/Bazel.runfiles_xq721o6r/runfiles/com_google_deepvariant/deepvariant/vcf_stats_vis.py", line 532, in _save_html
    html_string = _altair_chart_to_html(
  File "/tmp/Bazel.runfiles_xq721o6r/runfiles/com_google_deepvariant/deepvariant/vcf_stats_vis.py", line 513, in _altair_chart_to_html
    altair_chart.save(
  File "/usr/local/lib/python3.8/dist-packages/altair/vegalite/v4/api.py", line 476, in save
    result = save(**kwds)
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/save.py", line 79, in save
    spec = chart.to_dict()
  File "/usr/local/lib/python3.8/dist-packages/altair/vegalite/v4/api.py", line 373, in to_dict
    dct = super(TopLevelMixin, copy).to_dict(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 325, in to_dict
    result = _todict(
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 60, in _todict
    return {
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 61, in <dictcomp>
    k: _todict(v, validate, context)
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 58, in _todict
    return [_todict(v, validate, context) for v in obj]
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 58, in <listcomp>
    return [_todict(v, validate, context) for v in obj]
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 56, in _todict
    return obj.to_dict(validate=validate, context=context)
  File "/usr/local/lib/python3.8/dist-packages/altair/vegalite/v4/api.py", line 373, in to_dict
    dct = super(TopLevelMixin, copy).to_dict(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 325, in to_dict
    result = _todict(
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 60, in _todict
    return {
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 61, in <dictcomp>
    k: _todict(v, validate, context)
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 58, in _todict
    return [_todict(v, validate, context) for v in obj]
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 58, in <listcomp>
    return [_todict(v, validate, context) for v in obj]
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/schemapi.py", line 56, in _todict
    return obj.to_dict(validate=validate, context=context)
  File "/usr/local/lib/python3.8/dist-packages/altair/vegalite/v4/api.py", line 84, in _prepare_data
    data = _pipe(data, data_transformers.get())
  File "/usr/local/lib/python3.8/dist-packages/toolz/functoolz.py", line 628, in pipe
    data = func(data)
  File "/usr/local/lib/python3.8/dist-packages/toolz/functoolz.py", line 304, in __call__
    return self._partial(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/altair/vegalite/data.py", line 19, in default_data_transformer
    return curried.pipe(data, limit_rows(max_rows=max_rows), to_values)
  File "/usr/local/lib/python3.8/dist-packages/toolz/functoolz.py", line 628, in pipe
    data = func(data)
  File "/usr/local/lib/python3.8/dist-packages/toolz/functoolz.py", line 304, in __call__
    return self._partial(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/data.py", line 149, in to_values
    data = sanitize_dataframe(data)
  File "/usr/local/lib/python3.8/dist-packages/altair/utils/core.py", line 283, in sanitize_dataframe
    for col_name, dtype in df.dtypes.iteritems():
  File "/home/hangyin_umass_edu/.local/lib/python3.8/site-packages/pandas/core/generic.py", line 5989, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'iteritems'

hangy1 avatar Jun 26 '24 19:06 hangy1
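
The last frames of the trace above show altair calling `df.dtypes.iteritems()` on a pandas that was loaded from a per-user directory (`~/.local/lib/python3.8/site-packages`), not from the image's `/usr/local/lib/python3.8/dist-packages`. `Series.iteritems` was removed in pandas 2.0, so one plausible reading is that a newer user-installed pandas is shadowing the one the image expects. The following is a minimal diagnostic sketch to run with the same Python that raised the error; the helper names are hypothetical and are not part of DeepVariant.

```python
import importlib.util
import site


def locate_module(name):
    """Return the file path Python would import `name` from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


def is_user_site(path):
    """Heuristic: True if `path` sits under a per-user site-packages dir."""
    if path is None:
        return False
    user_site = site.getusersitepackages()
    return path.startswith(user_site) or "/.local/lib/python" in path


def has_iteritems():
    """True/False if pandas is importable, None if it is not installed."""
    try:
        import pandas as pd
    except ImportError:
        return None
    return hasattr(pd.Series, "iteritems")


if __name__ == "__main__":
    origin = locate_module("pandas")
    print("pandas imported from:", origin)
    print("user-site install:", is_user_site(origin))
    print("Series.iteritems available:", has_iteritems())
```

If the reported origin is under `~/.local`, setting the standard `PYTHONNOUSERSITE=1` environment variable tells Python to skip the per-user site directory. Whether that variable reaches the interpreter inside the container depends on how Singularity is invoked, so treat this as something to try rather than a confirmed fix.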

@hangy1 ,

  1. You can see from the log:
Reading ../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant.vcf.gz

It seems like sample_name is not set correctly, unless you replaced it yourself? Can you confirm that the values are set correctly by printing them before running DeepVariant?

  2. Please use absolute paths rather than relative paths when you are setting paths.

So instead of using:

gvcf = "../results/deepVariant/{dataset}/{sample}/vcf/{sample}.deepVariant.g.vcf.gz"

Use:

gvcf = "/path/to/vcf/{sample}.deepVariant.g.vcf.gz"

Hope this helps. You can also run the quickstart to see if you can simply copy-paste and run the command fully and then adapt it to the command you are planning to run.

kishwarshafin avatar Jun 27 '24 17:06 kishwarshafin
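
The two suggestions above (print the values before running, and use absolute paths) can be sketched in a few lines of Python, which is also the language Snakemake rules are written in. This is only an illustration: `resolve_paths` and the example base directory `/project/workflow` are hypothetical and do not come from the reporter's pipeline.

```python
import os


def resolve_paths(paths, base_dir=None):
    """Map each name to an absolute, normalized path.

    Relative entries are resolved against `base_dir` (default: the current
    working directory); absolute entries are returned normalized.
    """
    base = base_dir or os.getcwd()
    return {
        name: os.path.abspath(os.path.join(base, path))
        for name, path in paths.items()
    }


if __name__ == "__main__":
    # Hypothetical values shaped like the outputs in the rule above.
    outputs = {
        "vcf": "../results/deepVariant/KO_PV/sampleA/vcf/sampleA.deepVariant.vcf.gz",
        "gvcf": "../results/deepVariant/KO_PV/sampleA/vcf/sampleA.deepVariant.g.vcf.gz",
    }
    for name, path in resolve_paths(outputs, base_dir="/project/workflow").items():
        # Print every value before handing it to DeepVariant.
        print(f"{name} = {path}")
```

Printing the resolved values also makes it easy to confirm that every path falls inside a directory that is bind-mounted into the container.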

Sorry about the confusion over <sample_name>; I edited out the actual name. The name appeared correctly, and the files were being generated at the right path too (they were later deleted by snakemake because the workflow was incomplete).

From the log, it seems that both the VCF and the gVCF were generated successfully, but it failed to run vcf_stats_report.py:

***** Running the command:***** time /opt/deepvariant/bin/postprocess_variants --ref "/project/pi_robertmills_umass_edu/databases/bacteroides_genome/<ref_genome>" --infile "../../results/deepVariant/KO_PV/<sample_name>/intermediate/call_variants_output.tfrecord.gz" --outfile "../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant.vcf.gz" --cpus "6" --gvcf_outfile "../results/deepVariant/KO_PV/<sample_name>/vcf/<sample_name>.deepVariant.vcf.gz" --nonvariant_site_tfrecord_path "../../results/deepVariant/KO_PV/<sample_name>/intermediate/[email protected]"

I0626 19:01:26.925776 139684790912832 postprocess_variants.py:1211] Using sample name from call_variants output. Sample name: default
2024-06-26 19:01:26.928061: I deepvariant/postprocess_variants.cc:94] Read from: ../../results/deepVariant/KO_PV/<sample_name>/intermediate/call_variants_output-00000-of-00001.tfrecord.gz
2024-06-26 19:01:26.930065: I deepvariant/postprocess_variants.cc:109] Total #entries in single_site_calls = 407
I0626 19:01:26.930917 139684790912832 postprocess_variants.py:1313] CVO sorting took 6.503661473592123e-05 minutes
I0626 19:01:26.931080 139684790912832 postprocess_variants.py:1316] Transforming call_variants_output to variants.
I0626 19:01:26.931126 139684790912832 postprocess_variants.py:1318] Using 6 CPUs for parallelization of variant transformation.
I0626 19:01:26.954008 139684790912832 postprocess_variants.py:1211] Using sample name from call_variants output. Sample name: default
I0626 19:01:26.991115 139684790912832 postprocess_variants.py:1386] Processing variants (and writing to temporary file) took 0.00046567519505818686 minutes
I0626 19:01:27.391298 139684790912832 postprocess_variants.py:1407] Finished writing VCF and gVCF in 0.006664212544759115 minutes.

real 0m4.417s
user 0m2.938s
sys 0m0.743s

hangy1 avatar Jun 27 '24 20:06 hangy1
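
One detail visible in the postprocess_variants command above is that `--outfile` and `--gvcf_outfile` receive the same path, which matches `--output_gvcf={output.vcf}` in the posted rule. The thread does not establish whether this contributes to the missing files, but it is cheap to check for. The sketch below is a hypothetical helper (not part of DeepVariant or Snakemake) that flags output options sharing one path; it only understands the `--flag=value` form.

```python
from collections import defaultdict

# Output flags to compare; taken from the run_deepvariant command in this thread.
OUTPUT_FLAGS = ("--output_vcf", "--output_gvcf")


def find_shared_outputs(argv, output_flags=OUTPUT_FLAGS):
    """Return {path: [flags]} for every path used by more than one output flag."""
    by_path = defaultdict(list)
    for arg in argv:
        flag, sep, value = arg.partition("=")
        if sep and flag in output_flags:
            by_path[value].append(flag)
    return {path: flags for path, flags in by_path.items() if len(flags) > 1}


if __name__ == "__main__":
    # Placeholder paths shaped like the ones in the report.
    argv = [
        "--model_type=WGS",
        "--output_vcf=/results/sample.deepVariant.vcf.gz",
        "--output_gvcf=/results/sample.deepVariant.vcf.gz",
    ]
    for path, flags in find_shared_outputs(argv).items():
        print(f"{' and '.join(flags)} both write to {path}")
```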

Also under my target path ../results/deepVariant/KO_PV/<sample_name>/vcf, I do get the file <sample_name>.deepVariant.vcf.gz.tbi

hangy1 avatar Jun 28 '24 14:06 hangy1

Ok, that means it worked for you. Do you have any further questions?

kishwarshafin avatar Jun 28 '24 15:06 kishwarshafin

Only <sample_name>.deepVariant.vcf.gz.tbi was generated (besides the example and call_variants files); there is no VCF or gVCF in the output directory. I also changed the paths to absolute paths, and it didn't help.

hangy1 avatar Jun 28 '24 20:06 hangy1

@hangy1 , can you please run the quickstart by simply copy-pasting the commands in your system? That way we could pinpoint the issue in a controlled case.

kishwarshafin avatar Jul 03 '24 19:07 kishwarshafin

I am closing this due to inactivity. Please reopen if you need further help.

kishwarshafin avatar Jul 11 '24 16:07 kishwarshafin

Hello,

I have the exact same issue. In my case, the VCFs, gVCFs and index files are created, but it fails when creating the "vcf_stats_report". I have given absolute paths as well.

When I ran the test data, I had the same issue. I tried running the VCF stats report separately, as documented here: https://github.com/google/deepvariant/blob/r1.6.1/docs/deepvariant-vcf-stats-report.md, and that also didn't work.

Any assistance would be greatly appreciated.

Thank you so much.

ayeshbond avatar Aug 21 '24 17:08 ayeshbond

Hello @ayeshbond!

Could you please add a log statement with the exact error that you're seeing?

Additionally, you can always disable the creation of the vcf_stats_report in case it's blocking you from running DeepVariant.

lucasbrambrink avatar Aug 21 '24 20:08 lucasbrambrink
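
The comment above does not say how the report is disabled. The sketch below assumes that `run_deepvariant` exposes a boolean flag named `--vcf_stats_report`; that name is an assumption here and should be confirmed against `run_deepvariant --help` for the installed version. The helper `build_run_deepvariant_cmd` and the paths are hypothetical.

```python
import shlex


def build_run_deepvariant_cmd(ref, reads, output_vcf, output_gvcf,
                              num_shards, stats_report=False):
    """Assemble a run_deepvariant argument list with the report toggled."""
    cmd = [
        "/opt/deepvariant/bin/run_deepvariant",
        "--model_type=WGS",
        f"--ref={ref}",
        f"--reads={reads}",
        f"--output_vcf={output_vcf}",
        f"--output_gvcf={output_gvcf}",
        f"--num_shards={num_shards}",
    ]
    # ASSUMPTION: the flag name below is not confirmed anywhere in this
    # thread; check `run_deepvariant --help` before relying on it.
    cmd.append(f"--vcf_stats_report={'true' if stats_report else 'false'}")
    return cmd


if __name__ == "__main__":
    cmd = build_run_deepvariant_cmd(
        ref="/data/ref.fasta",           # placeholder path
        reads="/data/sample.sorted.bam",  # placeholder path
        output_vcf="/data/out/sample.vcf.gz",
        output_gvcf="/data/out/sample.g.vcf.gz",
        num_shards=32,
    )
    # Print a shell-safe version to paste after `singularity run ...`.
    print(" ".join(shlex.quote(part) for part in cmd))
```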

Hello @lucasbrambrink,

Thank you very much for the response. I am attaching the log file herewith.

The command I used was:

singularity run -B /usr/lib/locale/:/usr/lib/locale/ deepvariant_1.6.1.sif /opt/deepvariant/bin/run_deepvariant --model_type=WGS --ref="{rest_path}/tools/DeepVariant/ref_genomes/GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta" --reads="{rest_path}/MiniMap_SAM_BAM/11741-KA-0004.sorted.bam" --output_vcf="{rest_path}/deepvar_calls/11741-KA-0004_output.vcf.gz" --output_gvcf="{rest_path}/deepvar_calls/11741-KA-0004_output.g.vcf.gz" --intermediate_results_dir "{rest_path}/deepvar_calls/intermediate_results_dir/0004" --num_shards=32 &> deepvar_0004.log

deepvar_0004.log

Also, could you let me know how I can disable the vcf_stats_report? I tried to look for it but had no luck. It doesn't necessarily affect the variant calling, though; it just gives an error/failure due to this last step.

Thank you very much once again, and please let me know if I can get you any more information.

ayeshbond avatar Aug 23 '24 15:08 ayeshbond