How to handle empty tiles
Hi,
The images from my spatial transcriptomics data don't fill the entire image, so when I split the image into smaller tiles I get an error that there are no cells (see the error message below). Is there a way to skip empty tiles instead of terminating the for loop?
ValueError Traceback (most recent call last)
File
File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/python/ClusterMap/ClusterMap/clustermap.py:87, in ClusterMap.preprocess(self, dapi_grid_interval, LOF, contamination, pct_filter)
     86 def preprocess(self,dapi_grid_interval=5, LOF=False, contamination=0.1, pct_filter=0.1):
---> 87     preprocessing_data(self.spots, dapi_grid_interval, self.dapi_binary, LOF,contamination, self.xy_radius,pct_filter)
File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/python/ClusterMap/ClusterMap/preprocessing.py:136, in preprocessing_data(spots, dapi_grid_interval, dapi_binary, LOF, contamination, xy_radius, pct_filter)
    134 #compute neighbors within radius for local density
    135 knn = NearestNeighbors(radius=xy_radius)
--> 136 knn.fit(all_points)
    137 spots_array = np.array(spots.loc[:, ['spot_location_2', 'spot_location_1']])
    138 neigh_dist, neigh_array = knn.radius_neighbors(spots_array)
File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/neighbors/_unsupervised.py:166, in NearestNeighbors.fit(self, X, y)
    149 def fit(self, X, y=None):
    150     """Fit the nearest neighbors estimator from the training dataset.
    151
    152     Parameters
   (...)
    164         The fitted nearest neighbors estimator.
    165     """
--> 166     return self._fit(X)
File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/neighbors/_base.py:435, in NeighborsBase._fit(self, X, y)
    433 else:
    434     if not isinstance(X, (KDTree, BallTree, NeighborsBase)):
--> 435         X = self._validate_data(X, accept_sparse="csr")
    437 self._check_algorithm_metric()
    438 if self.metric_params is None:
File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/base.py:561, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, **check_params)
    559     raise ValueError("Validation should be done on X, y or both.")
    560 elif not no_val_X and no_val_y:
--> 561     X = check_array(X, **check_params)
    562     out = X
    563 elif no_val_X and not no_val_y:
File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/utils/validation.py:797, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    795 n_samples = _num_samples(array)
    796 if n_samples < ensure_min_samples:
--> 797     raise ValueError(
    798         "Found array with %d sample(s) (shape=%s) while a"
    799         " minimum of %d is required%s."
    800         % (n_samples, array.shape, ensure_min_samples, context)
    801     )
    803 if ensure_min_features > 0 and array.ndim == 2:
    804     n_features = array.shape[1]
ValueError: Found array with 0 sample(s) (shape=(0, 2)) while a minimum of 1 is required.
I recently resolved this error by adding the code below, which makes the for loop skip the empty tiles.
for tile_num in range(out.shape[0]):
    # Skip the tiles that are known to be empty.
    if tile_num in (6, 7, 15):
        continue
    print(f'tile: {tile_num}')
    ...
where tiles 6, 7, and 15 are empty. Hope it helps.
Yes, that helped. Thanks!
Hi,
Sorry for the late reply. Thanks, @rocketeer1998, for helping! I also skipped tiles with fewer than 20 spots:
if spots_tile.shape[0] < 20: continue
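For anyone who would rather not hard-code tile numbers, here is a rough sketch that combines the two ideas above: skip any tile below a spot-count threshold, and also catch the ValueError as a safety net so one bad tile doesn't stop the loop. Note that `process_tile`, `run_tiles`, and the toy `tiles` list are hypothetical stand-ins, not part of ClusterMap; in a real run the body of `process_tile` would be whatever per-tile steps you already call. `len()` is used here so the toy example runs without pandas; with a DataFrame, `spots_tile.shape[0]` gives the same count.

```python
MIN_SPOTS = 20  # threshold mentioned in the reply above


def process_tile(spots_tile):
    """Hypothetical stand-in for the real per-tile processing steps."""
    if len(spots_tile) == 0:
        # Mimics the failure mode from the traceback: nothing to fit on.
        raise ValueError("Found array with 0 sample(s)")
    return len(spots_tile)


def run_tiles(tiles, min_spots=MIN_SPOTS):
    """Process each tile, skipping sparse ones instead of aborting the loop."""
    results = {}
    skipped = []
    for tile_num, spots_tile in enumerate(tiles):
        # Cheap check first: too few spots to be worth processing.
        if len(spots_tile) < min_spots:
            skipped.append(tile_num)
            continue
        try:
            results[tile_num] = process_tile(spots_tile)
        except ValueError as err:
            # Safety net for tiles that pass the count check but still fail.
            print(f"tile {tile_num} skipped: {err}")
            skipped.append(tile_num)
    return results, skipped


# Toy data: each tile is a list of (x, y) spot coordinates.
tiles = [
    [(i, i) for i in range(50)],      # dense tile
    [],                               # empty tile
    [(0, 0)] * 5,                     # sparse tile, below the threshold
    [(i, 2 * i) for i in range(30)],  # dense tile
]
results, skipped = run_tiles(tiles)
print("processed:", sorted(results), "skipped:", skipped)
```

Keeping a list of the skipped tile numbers makes it easy to check afterwards that only the tiles you expected to be empty were dropped.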