`spatialdata_io.xenium` results in `NA` feature names in the `transcripts` Spatial Element

Open ewouddt opened this issue 8 months ago • 0 comments

After reading in Xenium data with `spatialdata_io.xenium` for individual samples, some feature names (out of the 5006 genes in the predesigned panel) are set to `NA` in the `transcripts` Spatial Element. For example, in one sample the feature names Fga and Kif2b are set to `NA`, while in another sample Cd3e is set to `NA`. All rows of the transcript data are still present, but with `NA` as the feature name.

This is not the case:

- within the Tables spatial element with the cell-summarized matrix, where all 5006 features are available;
- when manually reading in the transcript parquet file with `read_parquet` from `dask.dataframe`.

In fact, after testing, the `table` object within the `_get_points` function below still contains all the feature names. The NAs are only introduced after `PointsModel.parse()` is called.

```python
def _get_points(path: Path, specs: dict[str, Any]) -> Table:
    table = read_parquet(path / XeniumKeys.TRANSCRIPTS_FILE)
    table["feature_name"] = table["feature_name"].apply(
        lambda x: x.decode("utf-8") if isinstance(x, bytes) else str(x), meta=("feature_name", "object")
    )

    transform = Scale([1.0 / specs["pixel_size"], 1.0 / specs["pixel_size"]], axes=("x", "y"))
    points = PointsModel.parse(
        table,
        coordinates={"x": XeniumKeys.TRANSCRIPTS_X, "y": XeniumKeys.TRANSCRIPTS_Y, "z": XeniumKeys.TRANSCRIPTS_Z},
        feature_key=XeniumKeys.FEATURE_NAME,
        instance_key=XeniumKeys.CELL_ID,
        transformations={"global": transform},
        sort=True,
    )
    return points
```

Some additional Xenium specs info:

Version: 5.1.0
Panel: predesigned mAtlas_v1
instrument_sw_version: 3.1.0.0
analysis_sw_version: xenium-3.1.0.4

Would there be an easy workaround to make sure all feature names are still included?

Thanks in advance!

May 21 '25 18:05 ewouddt