
Assigning to X after initializing as NoneType

Open tomouellette opened this issue 2 years ago • 2 comments

Please describe your wishes and possible alternatives to achieve the desired result.

This issue is likely related to #464.

So the current functionality allows for a NoneType assignment to X on initialization, e.g.,

import anndata
adata = anndata.AnnData(X=None)

But it seems there's no ability to re-assign a matrix to X, or any multi-dimensional attribute like obsm, once the AnnData object is initialized.

import numpy as np

adata.X = np.ones((3, 1))
ValueError: Data matrix has wrong shape (3, 1), need to be (0, 0).

Is it expected that multi-dimensional parts of an AnnData object cannot be reassigned, or am I missing a specific method for assigning a new matrix to an AnnData object initialized with X=None? This would be useful for me because I am trying to store some multi-dimensional variables in AnnData.obsm before adding any matrix.

tomouellette, Sep 08 '23 15:09

Thanks for the feature request!

Interesting proposal.

So, what's happening here is: we set the size of each dimension based on the first elements assigned there, and we always have an obs and a var. If there's nothing else to guide the size of these dataframes, they have 0 rows.

But I think what you're asking for is basically to say that those dimensions are "uninitialized" instead of having length 0. So adding elements after initialization expands them. I think @LuckyMD suggested something similar once.
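To make the "uninitialized" idea concrete, here is a toy sketch in plain Python and NumPy. It is not anndata code: the LazyContainer class and its set_array method are hypothetical, and it only tracks an observation count that stays None until the first array is assigned.

```python
import numpy as np


class LazyContainer:
    """Toy container whose observation count is unset until first assignment."""

    def __init__(self):
        self.n_obs = None  # None means "uninitialized", not "length 0"
        self.arrays = {}

    def set_array(self, key, value):
        value = np.asarray(value)
        if self.n_obs is None:
            # The first assignment defines the dimension.
            self.n_obs = value.shape[0]
        elif value.shape[0] != self.n_obs:
            raise ValueError(
                f"Array has {value.shape[0]} rows, need {self.n_obs}."
            )
        self.arrays[key] = value


container = LazyContainer()
container.set_array("embedding", np.ones((3, 2)))  # fixes n_obs at 3
container.set_array("X", np.ones((3, 1)))          # accepted: 3 rows

try:
    container.set_array("bad", np.ones((4, 1)))    # rejected: 4 rows
except ValueError as err:
    print(err)
```

The point of the sketch is the distinction between None and 0: a length of 0 is a real constraint that later assignments must match, whereas None defers the constraint to the first assignment.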

This may be a little complicated to do. But I do see how AnnData() is a bit of a useless object at the moment.

Could you tell us a bit more about why you want this? Why don't you have any values to assign when you're creating the object?

ivirshup, Sep 08 '23 21:09

Could you tell us a bit more about why you want this?

So to keep it general, I am working on some tools that analyze some multi-dimensional data D. Basically,

  1. The data D may or may not be associated with sequencing data, so some users will not need the main data matrix X, whereas others who have sequencing data will.
  2. Additionally, some users might be loading data that is upstream of the multi-dimensional data D and would therefore want to initialize the AnnData object with a custom attribute before creating D. In this case, no values are assigned to obsm, obs, var, varm, etc.; only uns is populated.

Why don't you have any values to assign when you're creating the object?

Point 2 addresses this question: the AnnData object is initialized with custom attributes, plus important meta information stored in uns.

Current hack

Right now I can get around this issue by re-assigning all the existing attributes and keys in uns to a new AnnData object. This seems hacky, though, since one would expect an AnnData object instantiated with X=None to have no shape yet. And since the shape attribute has no setter, I can't really hack it.

tomouellette, Sep 11 '23 03:09