Possibility of adding `in_filtered` to read10xVisium(data = "raw")
Hi,
We noticed a discrepancy between the raw and filtered outputs from SpaceRanger, and I'm not sure whether read10xVisium() could address it.
I incorrectly thought that the only difference between raw and filtered was that filtered is the subset of raw where in_tissue is TRUE. If that were the case, reading in the raw data would always let you recover the filtered data through the in_tissue variable.
However, according to https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/latest/output/matrices, raw also includes other background spots that could have in_tissue = TRUE. For example, spots from small holes in the tissue that are not close to the edge.
One solution would be to read in the filtered barcodes when reading in the raw ones, then add a column like in_filtered, specifying if the barcode is in the filtered version or not.
filtered_feature_bc_matrix
├── barcodes.tsv.gz
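As a rough illustration of the idea above, here is a minimal base R sketch that reads both barcode files and flags which raw barcodes also appear in the filtered set. The helper names `read_barcodes()`, `flag_in_filtered()`, and `write_barcodes()` are hypothetical and not part of SpatialExperiment, DropletUtils, or VisiumIO. The barcodes are made up, and the sketch assumes `barcodes.tsv.gz` holds one barcode per line with no header.

```r
# Hypothetical helpers: not part of SpatialExperiment, DropletUtils, or VisiumIO.
read_barcodes <- function(matrix_dir) {
  # Assumes barcodes.tsv.gz holds one barcode per line with no header.
  con <- gzfile(file.path(matrix_dir, "barcodes.tsv.gz"), open = "rt")
  on.exit(close(con))
  readLines(con)
}

flag_in_filtered <- function(raw_dir, filtered_dir) {
  raw <- read_barcodes(raw_dir)
  filtered <- read_barcodes(filtered_dir)
  # One row per raw barcode, flagged by membership in the filtered set.
  data.frame(
    barcode = raw,
    in_filtered = raw %in% filtered,
    stringsAsFactors = FALSE
  )
}

# Toy demonstration: write made-up barcodes to temporary directories.
write_barcodes <- function(matrix_dir, barcodes) {
  dir.create(matrix_dir, recursive = TRUE)
  con <- gzfile(file.path(matrix_dir, "barcodes.tsv.gz"), open = "wt")
  on.exit(close(con))
  writeLines(barcodes, con)
}

outs <- tempfile("outs_")
raw_dir <- file.path(outs, "raw_feature_bc_matrix")
filtered_dir <- file.path(outs, "filtered_feature_bc_matrix")
write_barcodes(raw_dir, c("AAAC-1", "AAAG-1", "AACT-1"))
write_barcodes(filtered_dir, c("AAAC-1", "AACT-1"))

in_filtered_df <- flag_in_filtered(raw_dir, filtered_dir)
print(in_filtered_df)
```

The resulting `in_filtered` column could then be attached to the spot-level annotations of an object built from the raw data, matched by barcode.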
Doing so, though, would require users to have both the filtered and the raw barcode files. I know that most people don't share both sets of barcode files, and likely don't even keep both. We do keep both ourselves, but maybe we are in the minority. That's why I'm not sure whether this issue can be addressed by SpatialExperiment::read10xVisium() or not.
Also maybe this belongs in DropletUtils given https://github.com/drighelli/SpatialExperiment/blob/bb81804decd9cdbe93e436588ab8c8792b5a3b8d/R/read10xVisium.R#L167?
Best, Leo
With info reported by @Nick-Eagles and @prashanthi-ravichandran
Hi Leo, @lcolladotor
This sounds like it could be handled by a helper function.
I imagine that most users have access to, or read, either the raw or the filtered data, but not both.
I take care of this with the processing = c("filtered", "raw") argument in VisiumIO::TENxVisiumList. The helper function could attempt to look at the barcodes in the other 'processing' type and add those annotations to some part of the object (I'm not sure where yet).
Best,
Marcel
@lcolladotor
I made a first attempt of this in the in_filtered branch here https://github.com/waldronlab/VisiumIO/tree/in_filtered
It currently just returns a logical vector of raw %in% filtered
Update: it now provides a data.frame with the barcodes and a logical vector denoting whether those barcodes are in the other dataset, e.g., cbind.data.frame(raw_barcodes, in_filtered = raw %in% filtered)
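To make the described output concrete, here is a small base R sketch of a data.frame with that shape, and of how it might be used to find spots that are marked in_tissue but are absent from the filtered set. This is not the code in the in_filtered branch. The barcodes, the `spots` table, and its in_tissue values are invented for illustration.

```r
# Made-up barcodes standing in for the real raw and filtered sets.
raw_barcodes <- c("AAAC-1", "AAAG-1", "AACT-1", "AAGA-1")
filtered_barcodes <- c("AAAC-1", "AACT-1")

# Same shape as described above: barcodes plus a logical membership column.
res <- data.frame(
  barcode = raw_barcodes,
  in_filtered = raw_barcodes %in% filtered_barcodes,
  stringsAsFactors = FALSE
)

# Hypothetical spot-level annotations with invented in_tissue values.
spots <- data.frame(
  barcode = raw_barcodes,
  in_tissue = c(TRUE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Attach in_filtered by barcode, keeping the row order of `spots`.
spots$in_filtered <- res$in_filtered[match(spots$barcode, res$barcode)]

# Spots flagged as in_tissue that are nevertheless missing from filtered.
mismatch <- spots[spots$in_tissue & !spots$in_filtered, ]
print(mismatch)
```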
Thanks Marcel!
Hi Leo! @lcolladotor
Have you taken a look at the in_filtered branch?
Does the output work for you?
I am considering moving it to the devel branch soon. Let me know if it satisfies the use case.
Thank you!
Best, Marcel