Bug report [BUG] Trying to merge (cis_object_list) but gives an error that it is not defined
Describe the bug I am running pycisTopic and everything ran smoothly until I reached the step of merging the objects in cistopic_obj_list:
import os
import warnings
warnings.simplefilter(action='ignore')
import pandas
import pycisTopic
pycisTopic.__version__
'2.0a0'
path_to_regions = os.path.join(out_dir, "consensus_peak_calling/consensus_regions.bed")
path_to_blacklist = "/home/praghu/yojetsharma/softwares/pycisTopic/blacklist/hg38-blacklist.v2.bed"
pycistopic_qc_output_dir = "qc"
from pycisTopic.cistopic_class import create_cistopic_object_from_fragments
import polars as pl
cistopic_obj_list = []
for sample_id in fragments_dict:
    # Per-barcode QC metrics, restricted to barcodes that passed filtering
    sample_metrics = pl.read_parquet(
        os.path.join(pycistopic_qc_output_dir, f'{sample_id}.fragments_stats_per_cb.parquet')
    ).to_pandas().set_index("CB").loc[ sample_id_to_barcodes_passing_filters[sample_id] ]
    # One CistopicObject per sample
    cistopic_obj = create_cistopic_object_from_fragments(
        path_to_fragments = fragments_dict[sample_id],
        path_to_regions = path_to_regions,
        path_to_blacklist = path_to_blacklist,
        metrics = sample_metrics,
        valid_bc = sample_id_to_barcodes_passing_filters[sample_id],
        n_cpu = 1,
        project = sample_id,
        split_pattern = '-'
    )
    cistopic_obj_list.append(cistopic_obj)
2024-09-15 12:15:30,495 cisTopic INFO Reading data for d149
2024-09-15 12:18:25,421 cisTopic INFO metrics provided!
2024-09-15 12:18:39,466 cisTopic INFO Counting fragments in regions
2024-09-15 12:20:54,984 cisTopic INFO Creating fragment matrix
/ncbs_gs/nlsas_data/usershares/praghu/yojetsharma/softwares/pycisTopic/src/pycisTopic/cistopic_class.py:886: PerformanceWarning: The following operation may generate 6280216839 cells in the resulting pandas object.
.unstack(level="Name", fill_value=0)
2024-09-15 12:23:19,172 cisTopic INFO Converting fragment matrix to sparse matrix
2024-09-15 12:24:09,916 cisTopic INFO Removing blacklisted regions
2024-09-15 12:24:11,687 cisTopic INFO Creating CistopicObject
2024-09-15 12:24:15,938 cisTopic INFO Done!
2024-09-15 12:24:17,884 cisTopic INFO Reading data for ls002
2024-09-15 12:27:16,855 cisTopic INFO metrics provided!
2024-09-15 12:27:30,732 cisTopic INFO Counting fragments in regions
2024-09-15 12:29:53,260 cisTopic INFO Creating fragment matrix
/ncbs_gs/nlsas_data/usershares/praghu/yojetsharma/softwares/pycisTopic/src/pycisTopic/cistopic_class.py:886: PerformanceWarning: The following operation may generate 11394507046 cells in the resulting pandas object.
.unstack(level="Name", fill_value=0)
2024-09-15 12:33:34,320 cisTopic INFO Converting fragment matrix to sparse matrix
2024-09-15 12:35:22,478 cisTopic INFO Removing blacklisted regions
2024-09-15 12:35:24,316 cisTopic INFO Creating CistopicObject
2024-09-15 12:35:29,299 cisTopic INFO Done!
2024-09-15 12:35:31,288 cisTopic INFO Reading data for ls003
2024-09-15 12:39:03,988 cisTopic INFO metrics provided!
2024-09-15 12:39:20,066 cisTopic INFO Counting fragments in regions
2024-09-15 12:42:09,488 cisTopic INFO Creating fragment matrix
/ncbs_gs/nlsas_data/usershares/praghu/yojetsharma/softwares/pycisTopic/src/pycisTopic/cistopic_class.py:886: PerformanceWarning: The following operation may generate 7762093599 cells in the resulting pandas object.
.unstack(level="Name", fill_value=0)
2024-09-15 12:49:48,158 cisTopic INFO Converting fragment matrix to sparse matrix
2024-09-15 12:50:58,834 cisTopic INFO Removing blacklisted regions
2024-09-15 12:51:02,252 cisTopic INFO Creating CistopicObject
2024-09-15 12:51:11,716 cisTopic INFO Done!
cistopic_obj = merge(cistopic_obj_list)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[27], line 1
----> 1 cistopic_obj = merge(cistopic_obj_list)
NameError: name 'merge' is not defined
Expected behavior I would expect the merge() function to run as described in the notebook:
cistopic_obj = merge(cistopic_obj_list)
2022-08-09 09:58:30,928 cisTopic INFO cisTopic object 1 merged
2022-08-09 09:58:41,004 cisTopic INFO cisTopic object 2 merged
2022-08-09 09:58:53,013 cisTopic INFO cisTopic object 3 merged
2022-08-09 09:59:09,175 cisTopic INFO cisTopic object 4 merged
Version (please complete the following information):
- Python: 3.11
Okay, I was able to solve this by explicitly importing CistopicObject and merge from pycisTopic.cistopic_class. dir(pycisTopic) gave me:
['DistributionNotFound',
'__builtins__',
'__cached__',
'__doc__',
'__file__',
'__loader__',
'__name__',
'__package__',
'__path__',
'__spec__',
'__version__',
'__warningregistry__',
'cistopic_class',
'contextlib',
'fragments',
'genomic_ranges',
'get_distribution',
'plotting',
'qc',
'topic_binarization',
'tss_profile',
'utils']
import pycisTopic.cistopic_class as cistopic_class
I figured merge would be in cistopic_class, so I listed its contents:
# List all attributes and methods in the module
print(dir(cistopic_class))
# Check for the class directly
print(hasattr(cistopic_class, 'CistopicObject'))
['CistopicObject', 'Self', 'TYPE_CHECKING', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '__warningregistry__', 'annotations', 'cl', 'collapse_duplicates', 'create_cistopic_object', 'create_cistopic_object_chunk', 'create_cistopic_object_from_fragments', 'create_cistopic_object_from_matrix_file', 'dtype', 'get_position_index', 'logging', 'merge', 'non_zero_rows', 'np', 'pd', 'pr', 'prepare_tag_cells', 'read_fragments_to_pyranges', 'region_names_to_coordinates', 'sp', 'sparse', 'subset_list', 'sys']
True
I found merge in that list, so I imported it:
from pycisTopic.cistopic_class import CistopicObject, merge
cistopic_obj = merge(cistopic_obj_list)
2024-09-15 13:31:30,536 cisTopic INFO cisTopic object 1 merged
2024-09-15 13:31:49,202 cisTopic INFO cisTopic object 2 merged
print(cistopic_obj)
CistopicObject from project cisTopic_merge with n_cells × n_regions = 60499 × 420461
But I still don't understand why the function wasn't available without importing it explicitly.
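On the question of why merge was not available: in Python, `from module import name` binds only that one name, so importing create_cistopic_object_from_fragments does not make merge visible as well. A minimal sketch of this behaviour, using the standard library's os.path instead of pycisTopic and modelling the notebook namespace as a plain dict (the `session` name is illustrative only):

```python
# Model a notebook session's namespace as a plain dict (illustrative only).
session = {}

# Importing a single function binds that one name and nothing else.
exec("from os.path import join", session)
after_first_import = ("join" in session, "basename" in session)

# A sibling function from the same module needs its own import.
exec("from os.path import basename", session)
after_second_import = ("join" in session, "basename" in session)

print(after_first_import)   # (True, False)
print(after_second_import)  # (True, True)
```

The same reasoning would apply to any other function in cistopic_class that the notebook calls without importing it.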
Hi @yojetsharma
Indeed this is not documented well.
You should run it like this:
cistopic_obj = cistopic_obj_list[0].merge(cistopic_obj_list[1:])
All the best,
Seppe
Thanks for this update! But I have gone ahead with the analyses using the aforementioned workaround. And so far the pipeline seems to be running smoothly.
Hey there, unfortunately this didn't work well for me. I got a list of good-to-go cistopic_obj, but they failed to be merged:
> print(cistopic_obj_list[0])
> print(cistopic_obj_list[1])
> print(cistopic_obj_list[2])
...
CistopicObject from project CEMBA200305_7J with n_cells × n_regions = 5776 × 518512
CistopicObject from project CEMBA200305_9L with n_cells × n_regions = 5924 × 526569
CistopicObject from project CEMBA200312_6H with n_cells × n_regions = 5955 × 520224
> cistopic_obj1 = cistopic_obj_list[0].merge(cistopic_obj_list[1:])
> print(cistopic_obj1)
2024-10-24 09:37:27,834 cisTopic INFO cisTopic object 1 merged
2024-10-24 09:37:32,531 cisTopic INFO cisTopic object 2 merged
2024-10-24 09:37:37,945 cisTopic INFO cisTopic object 3 merged
2024-10-24 09:37:43,948 cisTopic INFO cisTopic object 4 merged
2024-10-24 09:37:50,544 cisTopic INFO cisTopic object 5 merged
2024-10-24 09:37:58,288 cisTopic INFO cisTopic object 6 merged
2024-10-24 09:38:06,515 cisTopic INFO cisTopic object 7 merged
None
No warning, no error. I just can't find out the reason. Thanks for any comments.
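One possible explanation, going by the None output alone: the merge method may modify the first object in place and return nothing, the way list.sort() does. The sketch below illustrates that pattern with a hypothetical ToyObject class. It is not pycisTopic code and makes no claim about how CistopicObject.merge is implemented.

```python
class ToyObject:
    """Hypothetical stand-in for a CistopicObject; it holds only cell names."""

    def __init__(self, cells):
        self.cells = list(cells)

    def merge(self, others):
        """Merge `others` into this object in place; implicitly returns None."""
        for other in others:
            self.cells.extend(other.cells)


objs = [ToyObject(["a", "b"]), ToyObject(["c"]), ToyObject(["d", "e"])]

result = objs[0].merge(objs[1:])  # in-place merge, nothing is returned
print(result)         # None
print(objs[0].cells)  # ['a', 'b', 'c', 'd', 'e']
```

If the real method behaves like this, printing cistopic_obj_list[0] after the call should show the combined cell count.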
Does the following work:
from pycisTopic.cistopic_class import CistopicObject, merge
cistopic_obj = merge(cistopic_obj_list)
Thanks for your swift comment! I'll check it out afterwards, as I'm going through the downstream part. Intriguingly, the cistopic_obj_list[0].merge(cistopic_obj_list[1:]) command merges the whole list into cistopic_obj_list[0] itself, so I just assigned that object to cistopic_obj. LOL
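If keeping cistopic_obj_list[0] unmodified matters, one option is to merge into a copy instead. A rough sketch, again using a hypothetical ToyObject stand-in and a hypothetical merge_all helper. It assumes an in-place merge method that returns None, and whether copy.deepcopy is practical for real CistopicObjects of this size is untested here.

```python
import copy


class ToyObject:
    """Hypothetical stand-in for a CistopicObject; it holds only cell names."""

    def __init__(self, cells):
        self.cells = list(cells)

    def merge(self, others):
        """Merge `others` into this object in place; implicitly returns None."""
        for other in others:
            self.cells.extend(other.cells)


def merge_all(objs):
    """Return a merged copy of `objs`, leaving every input object untouched."""
    merged = copy.deepcopy(objs[0])
    returned = merged.merge(objs[1:])
    # Handle both styles: in-place (returns None) or returning a new object.
    return merged if returned is None else returned


objs = [ToyObject(["a", "b"]), ToyObject(["c"]), ToyObject(["d", "e"])]
combined = merge_all(objs)

print(combined.cells)  # ['a', 'b', 'c', 'd', 'e']
print(objs[0].cells)   # ['a', 'b']
```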