error of cd.pseudo_time()

Open • datou99 opened this issue 2 years ago • 2 comments

I ran cd.pseudo_time() the same way as in Case study 1:

    import random

    dt = 0.05
    t_total = {dt: int(10/dt)}
    n_repeats = 10

    cellDancer_df_update = cd.pseudo_time(cellDancer_df=cellDancer_df,
                                          grid=(30, 30),
                                          dt=dt,
                                          t_total=t_total[dt],
                                          n_repeats=n_repeats,
                                          speed_up=(100, 100),
                                          n_paths=3,
                                          plot_long_trajs=True,
                                          psrng_seeds_diffusion=[i for i in range(n_repeats)],
                                          n_jobs=8)

The error report is:

    Pseudo random number generator seeds are set to: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    Generating Trajectories: 100%|██████████| 30840/30840 [01:32<00:00, 334.26it/s]

    There are 3 clusters. [0 1 2]
    Generating Trajectories: 100%|██████████| 7020/7020 [00:17<00:00, 410.37it/s]

    IndexError                                Traceback (most recent call last)
    /tmp/ipykernel_458358/741018467.py in <module>
         15  plot_long_trajs=True,
         16  psrng_seeds_diffusion=[i for i in range(n_repeats)],
    ---> 17  n_jobs=1)

    ~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/celldancer/pseudo_time.py in pseudo_time(cellDancer_df, grid, dt, t_total, n_repeats, psrng_seeds_diffusion, n_jobs, speed_up, n_paths, plot_long_trajs, save, output_path)
       1291  eps=v_eps,
       1292  n_jobs=n_jobs,
    -> 1293  psrng_seeds_diffusion=psrng_seeds_diffusion)
       1294
       1295  print("--- %s seconds ---" % (time.time() - start_time))

    ~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/celldancer/pseudo_time.py in compute_cell_time(cellDancer_df, embedding, cell_embedding, path_clusters, cell_fate, vel_mesh, cell_grid_idx, grid_mass, sampling_ixs, n_grids, dt, t_total, eps, n_repeats, n_jobs, psrng_seeds_diffusion)
       1041  cell_fate_dict,
       1042  cell_embedding,
    -> 1043  tau = 0.05)
       1044
       1045  #print("\n\nAll inter cluster cell time has been resolved.\n\n\n")

    ~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/celldancer/pseudo_time.py in cell_time_assignment_intercluster(unresolved_cell_time, cell_fate_dict, cell_embedding, tau)
        514  clusterIDs = sorted(np.unique(list(cell_fate_dict.values())))
        515
    --> 516  cutoff = overlap_crit_intracluster(cell_embedding, cell_fate_dict, tau)
        517  #print("Cutoff is ", cutoff)
        518

    ~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/celldancer/pseudo_time.py in overlap_crit_intracluster(cell_embedding, cell_fate_dict, quant)
        751  # drop the self distances
        752  temp3 = temp2[~np.eye(temp2.shape[0], dtype=bool)]
    --> 753  cutoff.append((np.quantile(temp3, quant)))
        754  return max(cutoff)
        755

    <__array_function__ internals> in quantile(*args, **kwargs)

    ~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/numpy/lib/function_base.py in quantile(a, q, axis, out, overwrite_input, interpolation, keepdims)
       3929  raise ValueError("Quantiles must be in the range [0, 1]")
       3930  return _quantile_unchecked(
    -> 3931  a, q, axis, out, overwrite_input, interpolation, keepdims)
       3932
       3933

    ~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/numpy/lib/function_base.py in _quantile_unchecked(a, q, axis, out, overwrite_input, interpolation, keepdims)
       3937  r, k = _ureduce(a, func=_quantile_ureduce_func, q=q, axis=axis, out=out,
       3938  overwrite_input=overwrite_input,
    -> 3939  interpolation=interpolation)
       3940  if keepdims:
       3941  return r.reshape(q.shape + k)

    ~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/numpy/lib/function_base.py in _ureduce(a, func, **kwargs)
       3513  keepdim = (1,) * a.ndim
       3514
    -> 3515  r = func(a, **kwargs)
       3516  return r, keepdim
       3517

    ~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/numpy/lib/function_base.py in _quantile_ureduce_func(failed resolving arguments)
       4048  indices_below.ravel(), indices_above.ravel(), [-1]
       4049  )), axis=0)
    -> 4050  n = np.isnan(ap[-1])
       4051  else:
       4052  # cannot contain nan

    IndexError: index -1 is out of bounds for axis 0 with size 0

Does anyone know the reason for this error? Any help would be appreciated.

Best,
Mia

datou99, Jan 15 '24 04:01

Hi, I am also encountering the same issue with a subset of my data (the rest of it runs normally). Did you ever solve this issue?

Thanks,
Jacqui

thompjac24, Feb 28 '24 15:02

Hi Mia and Jacqui,

Thanks for your interest in cellDancer!

This error usually occurs when multiple lineages exist, and there is insufficient overlap between them in the embedding space used for pseudotime calculation.
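Going only by the traceback above, `np.quantile` ends up working on an array of size 0. The line `temp3 = temp2[~np.eye(temp2.shape[0], dtype=bool)]` drops the diagonal of what appears to be a square matrix, so one way the result can be empty is when that matrix is 1 x 1 (or 0 x 0). The snippet below is only a minimal NumPy sketch of that masking step: the helper name and the toy matrices are made up for illustration, and treating `temp2` as a pairwise distance matrix is an assumption based on the "drop the self distances" comment in the traceback.

```python
import numpy as np


def off_diagonal(dist):
    """Return the off-diagonal entries of a square matrix (same mask as in the traceback)."""
    return dist[~np.eye(dist.shape[0], dtype=bool)]


# Toy stand-ins for a per-cluster matrix (hypothetical data).
three_cells = np.arange(9, dtype=float).reshape(3, 3)
one_cell = np.zeros((1, 1))

print(off_diagonal(three_cells).size)  # 3*3 - 3 off-diagonal entries
print(off_diagonal(one_cell).size)     # nothing is left to take a quantile of
```

If this is what happens in your data, checking how many cells each cluster contributes before the call may help narrow down which cluster is the problematic one.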

The goal of cd.pseudo_time is to provide a uniform pseudotime for all the cells in the given dataset. We assume that nearby cells should have about the same pseudotime. Therefore, a uniform pseudotime can only be obtained when there is enough overlap among the cells in the embedding space.

If you don't expect a uniform time for all the lineages, you might want to split your cells into groups and run cd.pseudo_time for each group.
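A rough sketch of the per-group approach with pandas follows. It assumes the cluster labels live in a column named `clusters`, and the helper `run_per_group`, the toy table, and the cluster-to-lineage mapping are all hypothetical. Whether cd.pseudo_time accepts a subset without any further re-indexing is not confirmed here.

```python
import pandas as pd


def run_per_group(df, group_of_cluster, run_fn, cluster_col="clusters"):
    """Split df by a cluster-to-group mapping and apply run_fn to each part.

    Rows whose cluster is missing from the mapping are dropped by groupby.
    """
    groups = df[cluster_col].map(group_of_cluster)
    results = {}
    for name, part in df.groupby(groups):
        results[name] = run_fn(part.copy())
    return results


# Toy table standing in for cellDancer_df (hypothetical values).
toy = pd.DataFrame({
    "cellID": ["c1", "c2", "c3", "c4"],
    "clusters": ["A", "A", "B", "C"],
})
lineage_map = {"A": "lineage_1", "B": "lineage_1", "C": "lineage_2"}

# run_fn is a placeholder here; in practice it would wrap cd.pseudo_time
# with the same arguments as in the original call.
results = run_per_group(toy, lineage_map, run_fn=lambda part: part)
print({name: len(part) for name, part in results.items()})
```

The pseudotime values obtained this way are only comparable within a group, not across groups.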

If you expect a uniform time for all the lineages, here are a few ideas to ensure overlap between them.

  • Use a different embedding space (only 2D is supported at the moment).
  • Use a coarser grid. The grid here is grid=(30, 30), which means your embedding space is split into a 30 x 30 mesh. If you reduce it to (20, 20), you may find that all the occupied grid cells are connected at that resolution. Of course, this also lowers the overall precision of the pseudotime estimates.
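To get a feel for how grid resolution affects connectivity, here is a small standard-library sketch that bins 2D points onto a grid and counts groups of occupied cells that touch each other (8-neighbour adjacency). It is not cellDancer's own implementation: the function names, the binning rule, and the adjacency rule are assumptions made for illustration, and the toy points stand in for a real embedding.

```python
from collections import deque


def occupied_cells(points, grid):
    """Map 2D points onto a grid and return the set of occupied (i, j) cells."""
    nx, ny = grid
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    # Guard against a zero-width range when all points share a coordinate.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0
    cells = set()
    for x, y in points:
        i = min(int((x - x_min) / x_span * nx), nx - 1)
        j = min(int((y - y_min) / y_span * ny), ny - 1)
        cells.add((i, j))
    return cells


def count_components(cells):
    """Count groups of occupied cells connected through 8-neighbour adjacency."""
    unseen = set(cells)
    n_components = 0
    while unseen:
        n_components += 1
        queue = deque([unseen.pop()])
        while queue:
            i, j = queue.popleft()
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    neighbour = (i + di, j + dj)
                    if neighbour in unseen:
                        unseen.remove(neighbour)
                        queue.append(neighbour)
    return n_components


# Toy embedding: two dense streaks of cells separated by a gap.
left = [(x / 10, 0.0) for x in range(0, 31)]    # x from 0.0 to 3.0
right = [(x / 10, 0.0) for x in range(40, 71)]  # x from 4.0 to 7.0
points = left + right

for grid in [(30, 30), (20, 20), (10, 10), (5, 5)]:
    n = count_components(occupied_cells(points, grid))
    print(grid, "->", n, "connected group(s)")
```

In this toy data the two streaks stay separate at (30, 30) and (20, 20) and merge into one group at (10, 10) and (5, 5). With real data, passing the two embedding columns as `points` gives a quick way to see at which resolution the occupied cells first connect, before re-running cd.pseudo_time.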

Let me know how it works out.

Good luck! Pengzhi

biopzhang, Mar 15 '24 21:03