`spatialdata_io.xenium` results in `NA` feature names in the `transcripts` Spatial Element
After reading in Xenium data with `spatialdata_io.xenium` for individual samples, some feature names (out of the 5006 predesigned genes) are set to `NA` in the `transcripts` Spatial Element. For example, in one sample the feature names Fga and Kif2b are set to `NA`, while in another sample Cd3e is set to `NA`. All rows of the transcript data are still present, but with `NA` feature names.
This is not the case:
- within the `Tables` spatial element with the cell-summarized matrix. Here, all 5006 features are available.
- when manually reading in the transcript parquet file with `read_parquet` from `dask.dataframe`.
In fact, after testing, the `table` object within the `_get_points` function below still contains all the feature names. The `NA`s are only introduced after `PointsModel.parse()` is called.
```python
def _get_points(path: Path, specs: dict[str, Any]) -> Table:
    table = read_parquet(path / XeniumKeys.TRANSCRIPTS_FILE)
    table["feature_name"] = table["feature_name"].apply(
        lambda x: x.decode("utf-8") if isinstance(x, bytes) else str(x), meta=("feature_name", "object")
    )

    transform = Scale([1.0 / specs["pixel_size"], 1.0 / specs["pixel_size"]], axes=("x", "y"))
    points = PointsModel.parse(
        table,
        coordinates={"x": XeniumKeys.TRANSCRIPTS_X, "y": XeniumKeys.TRANSCRIPTS_Y, "z": XeniumKeys.TRANSCRIPTS_Z},
        feature_key=XeniumKeys.FEATURE_NAME,
        instance_key=XeniumKeys.CELL_ID,
        transformations={"global": transform},
        sort=True,
    )
    return points
```
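
For anyone who wants to reproduce the comparison, the check I did can be sketched roughly as below. This is a minimal, library-independent sketch: `compare_feature_names` and `is_missing` are hypothetical helper names, not part of `spatialdata` or `spatialdata_io`, and the set of values treated as missing is an assumption on my side.

```python
import math
from collections import Counter
from typing import Any, Iterable


def is_missing(value: Any) -> bool:
    """Treat None, float NaN and a few NA-like strings as missing (assumed set)."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value in {"NA", "<NA>", "nan", "None"}


def normalize(value: Any) -> Any:
    """Decode bytes to str, as the reader does for the feature_name column."""
    return value.decode("utf-8") if isinstance(value, bytes) else value


def compare_feature_names(raw: Iterable[Any], parsed: Iterable[Any]) -> dict:
    """Report which feature names exist in the raw column but not in the parsed one."""
    raw_names = [normalize(v) for v in raw]
    parsed_names = [normalize(v) for v in parsed]
    raw_counts = Counter(v for v in raw_names if not is_missing(v))
    parsed_counts = Counter(v for v in parsed_names if not is_missing(v))
    return {
        # number of rows whose feature name is missing after parsing
        "n_missing_rows": sum(is_missing(v) for v in parsed_names),
        # features present in the raw file that vanished entirely after parsing
        "lost_features": sorted(set(raw_counts) - set(parsed_counts)),
    }


# Hypothetical toy input mimicking the pattern seen in one sample
report = compare_feature_names(
    raw=[b"Fga", "Kif2b", "Cd3e", "Fga"],
    parsed=["Fga", float("nan"), "Cd3e", float("nan")],
)
print(report)
```

With real data, the two inputs would be the `feature_name` column of the raw parquet file and of the `transcripts` element, each computed and converted to a plain list first.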
Some additional Xenium specs info:
- Version: 5.1.0
- Panel: predesigned mAtlas_v1
- instrument_sw_version: 3.1.0.0
- analysis_sw_version: xenium-3.1.0.4
Would there be an easy workaround to make sure all feature names are still included?
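
As a stopgap, I imagine the names could be restored from the raw parquet file by looking them up per transcript. The sketch below only illustrates that idea on plain Python rows: `restore_feature_names` is a hypothetical helper, and it assumes every row carries a unique `transcript_id` that survives parsing. A proper fix in the reader would of course be preferable.

```python
import math
from typing import Any


def is_missing(value: Any) -> bool:
    """Treat None and float NaN as missing (assumption about how NA shows up)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def restore_feature_names(
    parsed_rows: list[dict],
    raw_rows: list[dict],
    id_key: str = "transcript_id",
    name_key: str = "feature_name",
) -> list[dict]:
    """Fill missing feature names in parsed rows from the raw rows, matched by id."""
    # Build an id -> name lookup from the raw (unparsed) transcript rows
    lookup = {row[id_key]: row[name_key] for row in raw_rows}
    restored = []
    for row in parsed_rows:
        fixed = dict(row)  # copy so the input rows are left untouched
        if is_missing(fixed.get(name_key)):
            # Fall back to the original value if the id is unknown
            fixed[name_key] = lookup.get(fixed[id_key], fixed.get(name_key))
        restored.append(fixed)
    return restored


# Hypothetical toy rows; row order differs because the reader sorts the points
raw_rows = [
    {"transcript_id": 1, "feature_name": "Fga"},
    {"transcript_id": 2, "feature_name": "Cd3e"},
]
parsed_rows = [
    {"transcript_id": 2, "feature_name": None},
    {"transcript_id": 1, "feature_name": "Fga"},
]
fixed_rows = restore_feature_names(parsed_rows, raw_rows)
```

On the actual dataframes the same lookup would be a join on `transcript_id`, and the categorical `feature_name` column would probably need to be converted to plain strings before new values can be written into it.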
Thanks in advance!