
handled the case for T5 tokenizers

Open — hazemessamm opened this pull request 4 months ago • 1 comment

Hi, thanks for the great work!

I was working with some evaluation models that use ProtT5 and Ankh PLMs. Their tokenizers do not have a `<cls>` token; they only have an `</s>` (EOS) token, so I handled their case so that they work properly with EvoProtGrad.

Decoded example for both ProtT5 and Ankh: `MQMLKMGLV</s>`
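To make the difference concrete, here is a minimal sketch of stripping special tokens from a decoded sequence. The token names and the helper function are illustrative assumptions, not part of EvoProtGrad's API:

```python
# Hypothetical helper: remove special tokens from a decoded sequence string.
# ProtT5/Ankh append only a trailing "</s>" (EOS); BERT-style tokenizers such
# as ESM's also prepend a "<cls>" token. The token list below is an assumption.
SPECIAL_TOKENS = ("<cls>", "<eos>", "</s>", "<pad>")

def strip_special_tokens(decoded: str) -> str:
    """Remove any special tokens from a decoded sequence."""
    for tok in SPECIAL_TOKENS:
        decoded = decoded.replace(tok, "")
    return decoded.strip()

print(strip_special_tokens("MQMLKMGLV</s>"))  # -> MQMLKMGLV
```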

hazemessamm avatar Sep 18 '25 18:09 hazemessamm

Hi, thank you for this pull request!

Having this logic (for instances with `<cls>` and `<eos>` tokens) in _compute_gradients might be problematic: https://github.com/NREL/EvoProtGrad/pull/16/files#r2442682916

I'm thinking that we could move this logic out of _compute_gradients and into the expert's code; that way each expert can handle removal of its own special tokens (`<cls>`, `<eos>`, etc.). This could be done in the protein LM expert's __call__ function.

I think we would need to also add custom expert code for T5/Ankh though!
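A rough sketch of what that refactor could look like, with each expert declaring which special-token positions to trim in its own `__call__`. The class and attribute names here are assumptions for illustration, not EvoProtGrad's actual API:

```python
# Hypothetical expert that trims its own special tokens in __call__,
# rather than having _compute_gradients special-case each tokenizer.
class T5Expert:
    # ProtT5/Ankh tokenizers append only a trailing EOS ("</s>"), no CLS,
    # so there is nothing to trim at the front of the sequence.
    has_cls = False   # BERT-style experts (e.g. ESM) would set this to True
    has_eos = True

    def __call__(self, token_embeddings):
        # token_embeddings: list of per-token vectors for one sequence,
        # including any special tokens the tokenizer added.
        start = 1 if self.has_cls else 0
        end = -1 if self.has_eos else None  # -1 drops the trailing EOS
        return token_embeddings[start:end]


expert = T5Expert()
embs = [[0.1], [0.2], [0.3], [0.9]]  # last entry stands in for "</s>"
print(expert(embs))  # -> [[0.1], [0.2], [0.3]]
```

With this layout, a T5/Ankh expert and a BERT-style expert differ only in their token flags, and _compute_gradients never needs tokenizer-specific branches.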

pemami4911 avatar Oct 18 '25 23:10 pemami4911