ValueError: Cannot convert float NaN to integer
Hi, I am trying to run the QC step and it raises an error.
Cell In[45], line 14
     11 from pycisTopic.qc import *
     12 path_to_regions = {'D59':os.path.join(work_dir, 'scATAC/consensus_peak_calling/consensus_regions.bed')}
---> 14 metadata_bc, profile_data_dict = compute_qc_stats(
     15     fragments_dict = fragments_dict,
     16     tss_annotation = annot,
     17     stats=['barcode_rank_plot', 'duplicate_rate', 'insert_size_distribution', 'profile_tss', 'frip'],
     18     label_list = None,
     19     path_to_regions = path_to_regions,
     20     n_cpu = 5,
     21     valid_bc = None,
     22     n_frag = 100,
     23     n_bc = None,
     24     tss_flank_window = 1000,
     25     tss_window = 10,
     26     tss_minimum_signal_window = 100,
     27     tss_rolling_window = 10,
     28     remove_duplicates = True,
     29     _temp_dir = os.path.join(tmp_dir + 'ray_spill'))
     31 if not os.path.exists(os.path.join(work_dir, 'scATAC/quality_control')):
     32     os.makedirs(os.path.join(work_dir, 'scATAC/quality_control'))

File ~/.local/lib/python3.8/site-packages/pycisTopic/qc.py:1033, in compute_qc_stats(fragments_dict, tss_annotation, stats, label_list, path_to_regions, n_cpu, partition, valid_bc, n_frag, n_bc, tss_flank_window, tss_window, tss_minimum_signal_window, tss_rolling_window, min_norm, check_for_duplicates, remove_duplicates, use_polars, **kwargs)
...
    543     dtype=dtype,
    544     copy=copy,
    545 )
ValueError: Cannot convert float NaN to integer
Any help regarding this?
Thanks
I am facing the same issue.
Please let me know if anyone has been able to solve it.
@Ajeet1699 and @Citugulia40
This issue seems the same as this one: https://github.com/aertslab/pycisTopic/issues/81.
Could you run the code that I posted as a comment on that issue and report back?
https://github.com/aertslab/pycisTopic/issues/81#issuecomment-1641916325
Best,
Seppe
I get the same error, but only with 'profile_tss'; the other metrics run without problems. @SeppeDeWinter, I ran your code and here is the output:
annot
        Chromosome              Start     Strand  Gene      Transcript_type
90      chrHG1342_HG2282_PATCH  12923     -1      PRAMEF11  protein_coding
92      chrHG1342_HG2282_PATCH  30238     -1      HNRNPCL1  protein_coding
93      chrHG1342_HG2282_PATCH  38599     1       PRAMEF2   protein_coding
94      chrHG1342_HG2282_PATCH  67714     -1      PRAMEF4   protein_coding
95      chrHG1342_HG2282_PATCH  79783     -1      PRAMEF10  protein_coding
...     ...                     ...       ...     ...       ...
249304  chr1                    15617458  1       DDI2      protein_coding
249310  chr1                    15659713  1       RSC1A1    protein_coding
249312  chr1                    15684320  1       PLEKHM2   protein_coding
249313  chr1                    15684390  1       PLEKHM2   protein_coding
249315  chr1                    15684556  1       PLEKHM2   protein_coding
set(annot["Strand"])
{-1, 1}
from pycisTopic.utils import read_fragments_from_file
fragments=read_fragments_from_file(fragments_dict["wt1"])
fragments
        Chromosome              Start     Strand  Gene      Transcript_type
90      chrHG1342_HG2282_PATCH  12923     -1      PRAMEF11  protein_coding
92      chrHG1342_HG2282_PATCH  30238     -1      HNRNPCL1  protein_coding
93      chrHG1342_HG2282_PATCH  38599     1       PRAMEF2   protein_coding
94      chrHG1342_HG2282_PATCH  67714     -1      PRAMEF4   protein_coding
95      chrHG1342_HG2282_PATCH  79783     -1      PRAMEF10  protein_coding
...     ...                     ...       ...     ...       ...
249304  chr1                    15617458  1       DDI2      protein_coding
249310  chr1                    15659713  1       RSC1A1    protein_coding
249312  chr1                    15684320  1       PLEKHM2   protein_coding
249313  chr1                    15684390  1       PLEKHM2   protein_coding
249315  chr1                    15684556  1       PLEKHM2   protein_coding
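import pyranges as pr  # needed for pr.PyRanges below
annotation = annot
flank_window = 1000
# Copy the column subset so the assignments below do not write into a view of annot
tss_space_annotation = annotation[["Chromosome", "Start", "Strand"]].copy()
# Build a window of +/- flank_window around each TSS
tss_space_annotation["End"] = tss_space_annotation["Start"] + flank_window
tss_space_annotation["Start"] = tss_space_annotation["Start"] - flank_window
tss_space_annotation = tss_space_annotation[["Chromosome", "Start", "End", "Strand"]]
tss_space_annotation = pr.PyRanges(tss_space_annotation)
# Join fragments with the TSS windows and convert the result to a DataFrame
overlap_with_TSS = fragments.join(tss_space_annotation, nb_cpu=1).df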
overlap_with_TSS
          Chromosome  Start     End       Name                Score  Start_b   End_b     Strand
0         chr1        922601    922941    GAGGTCCAGGCGCTTC-1  1      922923    924923    NaN
1         chr1        922650    922999    GCGGTGTTCGTAGCGC-1  1      922923    924923    NaN
2         chr1        922655    923037    AAATGAGGTGGGTAGT-1  1      922923    924923    NaN
3         chr1        922682    922934    AGACAAATCGTGATAC-1  1      922923    924923    NaN
4         chr1        922728    923031    CTTGAAGAGACGCCAA-1  1      922923    924923    NaN
...       ...         ...       ...       ...                 ...    ...       ...       ...
62558513  chrY        20756019  20756181  AGCCTGGTCACACGTA-1  1      20755108  20757108  NaN
62558514  chrY        20756029  20756092  GCATGATCAATGATGA-1  1      20755108  20757108  NaN
62558515  chrY        20756212  20756381  ACAAGCTCAGGTGGTA-1  1      20755108  20757108  NaN
62558516  chrY        20756323  20756488  TTATGTCAGTCACGCC-1  1      20755108  20757108  NaN
62558517  chrY        20756460  20756493  AATGGCTCATAGTCCA-1  1      20755108  20757108  NaN
set(overlap_with_TSS["Strand"])
{nan}
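The output above shows that every Strand value is NaN after the join, which fits the reported error: a NaN cannot be converted to an integer. As a minimal sketch in plain Python (no pandas), the snippet below reproduces that failure and adds an early check. The helper names `count_nan` and `to_int_strands` are hypothetical and not part of pycisTopic or PyRanges, and the snippet does not claim to show where the conversion happens inside `compute_qc_stats`.

```python
import math


def count_nan(values):
    """Count float NaN entries in a list of strand values."""
    return sum(1 for v in values if isinstance(v, float) and math.isnan(v))


def to_int_strands(values):
    """Convert strand values to int, failing early with a descriptive message.

    Hypothetical helper for illustration only.
    """
    n_missing = count_nan(values)
    if n_missing:
        raise ValueError(
            f"{n_missing} of {len(values)} strand values are NaN; "
            "check the Strand column before converting to integer"
        )
    return [int(v) for v in values]


# Plain Python refuses to turn NaN into an integer, just like the traceback.
try:
    int(float("nan"))
except ValueError as err:
    raw_error = str(err)

# Strand values as in the annotation ({-1, 1}) convert without problems.
clean = to_int_strands([1.0, -1.0, 1.0])

# Strand values as in the join output ({nan}) trigger the early check.
try:
    to_int_strands([float("nan"), float("nan")])
    check_failed = False
except ValueError:
    check_failed = True
```

On the DataFrame itself, `overlap_with_TSS["Strand"].isna().sum()` gives the same kind of count using the standard pandas API.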
This issue might be solved now. Can you check this out? https://github.com/aertslab/pycisTopic/issues/81#issuecomment-1643460879
Best,
Seppe
Yes, I followed this and it is fixed for me now. Thank you!