
Option to use TCRblosum for distance calculation

Open riederd opened this issue 11 months ago • 4 comments

TCRblosum, a substitution matrix derived specifically from TCR sequences, is likely to provide a more accurate measure of TCR sequence similarity. However, it is not yet used in scirpy.

I propose adding support for TCRblosum as an alternative substitution matrix for the distance calculation. Ideally, this would be exposed as a parameter of the ir_dist function, so that users can select TCRblosum there.

Publication:

riederd avatar Feb 10 '25 10:02 riederd

It's not very well documented in scirpy, but it should be possible to define a substitution matrix in a file as supported by parasail: https://github.com/jeffdaily/parasail-python?tab=readme-ov-file#substitution-matrices

and then use

ir.pp.ir_dist(adata, metric="alignment", cutoff=10, subst_mat="tcrblosum.txt")

If that works, it would be quite easy to include it natively in scirpy.
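For illustration, here is a minimal sketch of how such a matrix file could be generated. It assumes the text layout described in the parasail README linked above: comment lines starting with `#`, a header row listing the alphabet, one row per symbol, and a final `*` row and column as a catch-all for characters outside the alphabet. The helper name `format_parasail_matrix`, the three-letter alphabet, and all scores are hypothetical placeholders; the real TCRblosum values would have to be taken from the publication.

```python
import tempfile
from pathlib import Path


def format_parasail_matrix(alphabet, scores, default=-1):
    """Render a symmetric substitution matrix as parasail-style text.

    `scores` maps (row, col) symbol pairs to integers; missing pairs are
    looked up in the mirrored order and otherwise fall back to `default`.
    """
    # The last row/column is a non-alphabet catch-all symbol.
    symbols = list(alphabet) + ["*"]
    lines = ["# Toy matrix for illustration only; NOT the real TCRblosum values."]
    # Header row: the alphabet, right-aligned in fixed-width columns.
    lines.append("  " + " ".join(f"{s:>3}" for s in symbols))
    for row in symbols:
        values = (
            scores.get((row, col), scores.get((col, row), default))
            for col in symbols
        )
        lines.append(f"{row} " + " ".join(f"{v:>3}" for v in values))
    return "\n".join(lines) + "\n"


# Hypothetical alphabet and made-up scores, just to show the layout.
toy_scores = {
    ("A", "A"): 4,
    ("C", "C"): 9,
    ("S", "S"): 4,
    ("A", "S"): 1,
    ("A", "C"): 0,
}
matrix_text = format_parasail_matrix("ACS", toy_scores)

# Write the file to a temporary directory so it can be passed on by path.
matrix_path = Path(tempfile.gettempdir()) / "toy_matrix.txt"
matrix_path.write_text(matrix_text)
print(matrix_text)
```

A file written this way, filled with the actual TCRblosum scores over the full amino-acid alphabet, is what the `subst_mat` argument in the call above would point to. Whether scirpy accepts it as-is has not been verified here.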

grst avatar Feb 10 '25 16:02 grst

Missed that, will try it and report back. Thanks!

riederd avatar Feb 10 '25 16:02 riederd

Did it work out?

grst avatar Aug 04 '25 18:08 grst

To be honest, I have not tested it yet; it is still on my list.

riederd avatar Aug 05 '25 07:08 riederd