Option to use TCRblosum for distance calculation
TCRblosum, a substitution matrix derived specifically from TCR sequences, is likely to provide a more accurate measure of TCR sequence similarity, but it is not yet used in scirpy.
I propose adding support for TCRblosum as an alternative substitution matrix for the distance calculation. Ideally, this would be exposed as a parameter of the ir_dist function, so that users can select TCRblosum in place of the default matrix.
Publication:
- TCRblosum is described in Postovskaya et al. (2025)
This is not very well documented in scirpy, but it should be possible to define a substitution matrix in a file, in the format supported by parasail: https://github.com/jeffdaily/parasail-python?tab=readme-ov-file#substitution-matrices
and then use
ir.pp.ir_dist(adata, metric="alignment", cutoff=10, subst_mat="tcrblosum.txt")
If that works, it would be quite easy to include it natively in scirpy.
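For anyone who wants to try this, here is a minimal sketch of how such a matrix file could be produced. It assumes the plain-text layout described in the parasail README linked above: lines starting with '#' are comments, a header row lists the alphabet, each following row starts with its symbol, and the last row and column are a non-alphabet character ('*') that stands for any symbol outside the alphabet. The helper names (placeholder_scores, format_matrix, parse_matrix) and the output file name are hypothetical, and the scores are placeholders, NOT the published TCRblosum values. Those would have to be taken from the publication.

```python
from pathlib import Path

# The 20 standard amino acids. The order is arbitrary, but it must be the
# same for the header row and for the rows of the matrix.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def placeholder_scores(alphabet, match=4, mismatch=-1):
    """Build a toy score table: `match` on the diagonal, `mismatch` elsewhere.

    These are NOT TCRblosum scores. They only illustrate the file layout.
    A trailing '*' symbol is added as the catch-all row/column.
    """
    symbols = list(alphabet) + ["*"]
    return {
        a: {b: (match if a == b and a != "*" else mismatch) for b in symbols}
        for a in symbols
    }


def format_matrix(scores):
    """Render a nested dict of scores as text in the assumed file layout."""
    symbols = list(scores)
    lines = ["# Placeholder substitution matrix (not the real TCRblosum values)"]
    # Header row: the alphabet, indented to line up with the score columns.
    lines.append("  " + " ".join(f"{s:>3}" for s in symbols))
    for a in symbols:
        lines.append(f"{a} " + " ".join(f"{scores[a][b]:>3}" for b in symbols))
    return "\n".join(lines) + "\n"


def parse_matrix(text):
    """Read the text back into a nested dict, as a sanity check on the file."""
    rows = [
        line.split()
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    header = rows[0]
    return {row[0]: dict(zip(header, map(int, row[1:]))) for row in rows[1:]}


if __name__ == "__main__":
    scores = placeholder_scores(AMINO_ACIDS)
    text = format_matrix(scores)
    # The round trip should reproduce the scores exactly.
    assert parse_matrix(text) == scores
    Path("placeholder_matrix.txt").write_text(text)
```

The resulting file path would then be passed as subst_mat in the ir_dist call above. Whether scirpy forwards a file path to parasail unchanged is exactly the part that still needs testing.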
I missed that. I will try it and report back. Thanks!
Did it work out?
To be honest, I have not tested it yet; it is still on my list.