pycisTopic

Bug report [BUG] Trying to merge (cis_object_list) but gives an error that it is not defined

Open yojetsharma opened this issue 1 year ago • 1 comments

Describe the bug I am running pycisTopic and everything ran smoothly until I reached the step of merging the objects in cistopic_obj_list:

import os
import warnings
warnings.simplefilter(action='ignore')
import pandas
import pycisTopic
pycisTopic.__version__
'2.0a0'
path_to_regions = os.path.join(out_dir, "consensus_peak_calling/consensus_regions.bed")
path_to_blacklist = "/home/praghu/yojetsharma/softwares/pycisTopic/blacklist/hg38-blacklist.v2.bed"
pycistopic_qc_output_dir = "qc"

from pycisTopic.cistopic_class import create_cistopic_object_from_fragments
import polars as pl

cistopic_obj_list = []
for sample_id in fragments_dict:
    sample_metrics = pl.read_parquet(
        os.path.join(pycistopic_qc_output_dir, f'{sample_id}.fragments_stats_per_cb.parquet')
    ).to_pandas().set_index("CB").loc[ sample_id_to_barcodes_passing_filters[sample_id] ]
    cistopic_obj = create_cistopic_object_from_fragments(
        path_to_fragments = fragments_dict[sample_id],
        path_to_regions = path_to_regions,
        path_to_blacklist = path_to_blacklist,
        metrics = sample_metrics,
        valid_bc = sample_id_to_barcodes_passing_filters[sample_id],
        n_cpu = 1,
        project = sample_id,
        split_pattern = '-'
    )
    cistopic_obj_list.append(cistopic_obj)
2024-09-15 12:15:30,495 cisTopic     INFO     Reading data for d149
2024-09-15 12:18:25,421 cisTopic     INFO     metrics provided!
2024-09-15 12:18:39,466 cisTopic     INFO     Counting fragments in regions
2024-09-15 12:20:54,984 cisTopic     INFO     Creating fragment matrix
/ncbs_gs/nlsas_data/usershares/praghu/yojetsharma/softwares/pycisTopic/src/pycisTopic/cistopic_class.py:886: PerformanceWarning: The following operation may generate 6280216839 cells in the resulting pandas object.
  .unstack(level="Name", fill_value=0)
2024-09-15 12:23:19,172 cisTopic     INFO     Converting fragment matrix to sparse matrix
2024-09-15 12:24:09,916 cisTopic     INFO     Removing blacklisted regions
2024-09-15 12:24:11,687 cisTopic     INFO     Creating CistopicObject
2024-09-15 12:24:15,938 cisTopic     INFO     Done!
2024-09-15 12:24:17,884 cisTopic     INFO     Reading data for ls002
2024-09-15 12:27:16,855 cisTopic     INFO     metrics provided!
2024-09-15 12:27:30,732 cisTopic     INFO     Counting fragments in regions
2024-09-15 12:29:53,260 cisTopic     INFO     Creating fragment matrix
/ncbs_gs/nlsas_data/usershares/praghu/yojetsharma/softwares/pycisTopic/src/pycisTopic/cistopic_class.py:886: PerformanceWarning: The following operation may generate 11394507046 cells in the resulting pandas object.
  .unstack(level="Name", fill_value=0)
2024-09-15 12:33:34,320 cisTopic     INFO     Converting fragment matrix to sparse matrix
2024-09-15 12:35:22,478 cisTopic     INFO     Removing blacklisted regions
2024-09-15 12:35:24,316 cisTopic     INFO     Creating CistopicObject
2024-09-15 12:35:29,299 cisTopic     INFO     Done!
2024-09-15 12:35:31,288 cisTopic     INFO     Reading data for ls003
2024-09-15 12:39:03,988 cisTopic     INFO     metrics provided!
2024-09-15 12:39:20,066 cisTopic     INFO     Counting fragments in regions
2024-09-15 12:42:09,488 cisTopic     INFO     Creating fragment matrix
/ncbs_gs/nlsas_data/usershares/praghu/yojetsharma/softwares/pycisTopic/src/pycisTopic/cistopic_class.py:886: PerformanceWarning: The following operation may generate 7762093599 cells in the resulting pandas object.
  .unstack(level="Name", fill_value=0)
2024-09-15 12:49:48,158 cisTopic     INFO     Converting fragment matrix to sparse matrix
2024-09-15 12:50:58,834 cisTopic     INFO     Removing blacklisted regions
2024-09-15 12:51:02,252 cisTopic     INFO     Creating CistopicObject
2024-09-15 12:51:11,716 cisTopic     INFO     Done!
cistopic_obj = merge(cistopic_obj_list)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[27], line 1
----> 1 cistopic_obj = merge(cistopic_obj_list)

NameError: name 'merge' is not defined

Expected behavior I would expect the merge() function to run as described in the notebook:

cistopic_obj = merge(cistopic_obj_list)
2022-08-09 09:58:30,928 cisTopic     INFO     cisTopic object 1 merged
2022-08-09 09:58:41,004 cisTopic     INFO     cisTopic object 2 merged
2022-08-09 09:58:53,013 cisTopic     INFO     cisTopic object 3 merged
2022-08-09 09:59:09,175 cisTopic     INFO     cisTopic object 4 merged
Version (please complete the following information):

  • Python 3.11

yojetsharma avatar Sep 15 '24 07:09 yojetsharma

Okay, I was able to solve this by explicitly importing CistopicObject and merge before calling merge. dir(pycisTopic) gave me:

['DistributionNotFound',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 '__warningregistry__',
 'cistopic_class',
 'contextlib',
 'fragments',
 'genomic_ranges',
 'get_distribution',
 'plotting',
 'qc',
 'topic_binarization',
 'tss_profile',
 'utils']
import pycisTopic.cistopic_class as cistopic_class

I figured merge would be in cistopic_class, so I listed its contents:

# List all attributes and methods in the module
print(dir(cistopic_class))

# Check for the class directly
print(hasattr(cistopic_class, 'CistopicObject'))
['CistopicObject', 'Self', 'TYPE_CHECKING', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '__warningregistry__', 'annotations', 'cl', 'collapse_duplicates', 'create_cistopic_object', 'create_cistopic_object_chunk', 'create_cistopic_object_from_fragments', 'create_cistopic_object_from_matrix_file', 'dtype', 'get_position_index', 'logging', 'merge', 'non_zero_rows', 'np', 'pd', 'pr', 'prepare_tag_cells', 'read_fragments_to_pyranges', 'region_names_to_coordinates', 'sp', 'sparse', 'subset_list', 'sys']
True

I found merge there, so I imported it:

from pycisTopic.cistopic_class import CistopicObject, merge

cistopic_obj = merge(cistopic_obj_list)
2024-09-15 13:31:30,536 cisTopic     INFO     cisTopic object 1 merged
2024-09-15 13:31:49,202 cisTopic     INFO     cisTopic object 2 merged
print(cistopic_obj)
CistopicObject from project cisTopic_merge with n_cells × n_regions = 60499 × 420461

But I still don't understand why the function wasn't available without importing it explicitly?

yojetsharma avatar Sep 15 '24 08:09 yojetsharma
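
A note on the question above: in Python, an import binds only the names it lists. `import pycisTopic` and `from pycisTopic.cistopic_class import create_cistopic_object_from_fragments` do not put `merge` in the notebook's namespace, which would explain the NameError on the bare `merge(...)` call. Below is a minimal sketch of the same rule using only the standard library. Here `os.path` is a stand-in for `pycisTopic.cistopic_class`, chosen for illustration; the example shows the import rule and does not call pycisTopic.

```python
# Importing one name from a module binds only that name.
from os.path import join

path = join("qc", "sample.parquet")  # works: `join` was imported

try:
    # `basename` also lives in os.path, but it has not been imported yet.
    basename(path)
except NameError as err:
    message = str(err)

# Importing the missing name explicitly makes the call work.
from os.path import basename

name = basename(path)
```

The same reasoning applies to `from pycisTopic.cistopic_class import merge`: the function existed in the module all along, but it had to be imported by name before it could be called unqualified.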

Hi @yojetsharma

Indeed this is not documented well.

You should run it like this:


cistopic_obj = cistopic_obj_list[0].merge(cistopic_obj_list[1:])

All the best,

Seppe

SeppeDeWinter avatar Oct 09 '24 09:10 SeppeDeWinter

Thanks for this update! But I have gone ahead with the analyses using the aforementioned workaround. And so far the pipeline seems to be running smoothly.

yojetsharma avatar Oct 09 '24 10:10 yojetsharma

Hey there, unfortunately this didn't work well for me. I have a list of valid cistopic_obj objects, but merging them did not return a result:

> print(cistopic_obj_list[0])
> print(cistopic_obj_list[1])
> print(cistopic_obj_list[2])
...

CistopicObject from project CEMBA200305_7J with n_cells × n_regions = 5776 × 518512
CistopicObject from project CEMBA200305_9L with n_cells × n_regions = 5924 × 526569
CistopicObject from project CEMBA200312_6H with n_cells × n_regions = 5955 × 520224

> cistopic_obj1 = cistopic_obj_list[0].merge(cistopic_obj_list[1:])
> print(cistopic_obj1)
2024-10-24 09:37:27,834 cisTopic     INFO     cisTopic object 1 merged
2024-10-24 09:37:32,531 cisTopic     INFO     cisTopic object 2 merged
2024-10-24 09:37:37,945 cisTopic     INFO     cisTopic object 3 merged
2024-10-24 09:37:43,948 cisTopic     INFO     cisTopic object 4 merged
2024-10-24 09:37:50,544 cisTopic     INFO     cisTopic object 5 merged
2024-10-24 09:37:58,288 cisTopic     INFO     cisTopic object 6 merged
2024-10-24 09:38:06,515 cisTopic     INFO     cisTopic object 7 merged
None

No warning, no error. I just can't find out the reason. Thanks for any comments.

PhrenoVermouth avatar Oct 24 '24 16:10 PhrenoVermouth

Does the following work?

from pycisTopic.cistopic_class import CistopicObject, merge
cistopic_obj = merge(cistopic_obj_list)

yojetsharma avatar Oct 24 '24 18:10 yojetsharma

Thanks for your swift comment! I'll check it out later, as I'm currently working through the downstream part. Intriguingly, the cistopic_obj_list[0].merge(cistopic_obj_list[1:]) command merges the whole list into cistopic_obj_list[0] in place, so I just renamed that object to cistopic_obj. LOL

PhrenoVermouth avatar Oct 24 '24 22:10 PhrenoVermouth
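
The `None` result reported above is consistent with a method that modifies its object in place and returns nothing, the same convention as `list.sort()`. The toy sketch below shows that pattern. `ToyObject` and its `copy` flag are hypothetical stand-ins written for illustration, not pycisTopic code, so check the `CistopicObject.merge` docstring in your installed version for the actual parameters.

```python
class ToyObject:
    """Hypothetical stand-in for an object that holds a list of cell names."""

    def __init__(self, cells):
        self.cells = list(cells)

    def merge(self, others, copy=False):
        # Collect this object's cells plus those of every other object.
        merged = list(self.cells)
        for other in others:
            merged.extend(other.cells)
        if copy:
            # Return a new object and leave `self` untouched.
            return ToyObject(merged)
        # Otherwise update `self` in place and return None.
        self.cells = merged
        return None


objs = [ToyObject(["a", "b"]), ToyObject(["c"]), ToyObject(["d"])]

as_copy = objs[0].merge(objs[1:], copy=True)  # new object, objs[0] unchanged
result = objs[0].merge(objs[1:])              # in place, so result is None
```

With an in-place method of this kind, assigning the return value to a variable yields `None`, and the merged data lives in the object the method was called on, which matches the behavior described in the last comment.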