Benjamin Anderson

25 comments by Benjamin Anderson

I might have messed this up GitHub-wise because it's showing all the changes you made to my previous speculative decoding example. LMK if this is a problem and I can...

Yep, happy to address all the points you've raised! For renaming the modules so that keys match, how would you suggest handling cases where the Transformers BERT model has more...

Hey, just made some updates responding to the comments here. I changed the setup of BERT so that fewer keys have to be swapped (still some, but fewer). :) Also...

Hey @awni when do you think you'll get around to this one?

Lower precision results were decent/usable I thought! But it's not super important to support given how small these models are; the overhead of quantizing/dequantizing probably isn't worth it in most...

See if it looks better on your end with normalize=False. It doesn't affect metrics based on cosine similarity, but does seem to make a difference for e.g. BankingClassification. If there...
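For anyone following along, here is a rough sketch of why normalization shouldn't move cosine-based metrics. It is plain Python with made-up vectors, and the helper names (`l2_normalize`, `cosine_similarity`) are hypothetical, not from the PR. It only shows that cosine similarity ignores vector length while a magnitude-sensitive quantity like Euclidean distance does not. Whether that is what drives the classification difference is an assumption.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (hypothetical helper, not from the PR)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; independent of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "embeddings" with different magnitudes (illustrative values only).
a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]

raw = cosine_similarity(a, b)
normed = cosine_similarity(l2_normalize(a), l2_normalize(b))

# Cosine similarity agrees before and after normalization, up to float error.
same_cosine = math.isclose(raw, normed, rel_tol=1e-9)

# Euclidean distance is sensitive to magnitude, so it changes.
raw_dist = math.dist(a, b)
normed_dist = math.dist(l2_normalize(a), l2_normalize(b))
```

So anything downstream that looks at raw vectors rather than angles (a classifier fit on the embeddings, say) can plausibly see different inputs depending on the flag.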

Yup, that's what I was getting too. Not sure what explains the remaining discrepancy.

Yeah, my thought is that it makes sentence embeddings usable in MLX, which requires pooling, batch handling, truncation, etc., and also makes it easy to load the models. This...
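To make the "pooling, batches, truncation" part concrete, here is a minimal sketch in plain Python lists. It is not the code from the PR and doesn't use MLX. The function names, `max_len`, `pad_id`, and the choice of masked mean pooling are all assumptions for illustration (mean pooling is one common option; the thread doesn't say which pooling is used). It also assumes every sequence is non-empty.

```python
def truncate_and_pad(batch, max_len, pad_id=0):
    """Truncate each token-id list to max_len, pad to the longest, build masks."""
    truncated = [ids[:max_len] for ids in batch]
    width = max(len(ids) for ids in truncated)
    padded = [ids + [pad_id] * (width - len(ids)) for ids in truncated]
    # 1 marks a real token, 0 marks padding.
    mask = [[1] * len(ids) + [0] * (width - len(ids)) for ids in truncated]
    return padded, mask


def masked_mean_pool(token_vectors, mask):
    """Average the token vectors of each sequence, skipping padded positions."""
    pooled = []
    for vectors, row_mask in zip(token_vectors, mask):
        count = sum(row_mask)
        dim = len(vectors[0])
        pooled.append([
            sum(vec[d] for vec, m in zip(vectors, row_mask) if m) / count
            for d in range(dim)
        ])
    return pooled


# Two sequences of different lengths (token ids are made up).
batch = [[101, 7, 8, 9, 102], [101, 5, 102]]
padded, mask = truncate_and_pad(batch, max_len=4)

# Stand-in for model output: one 2-d vector per token position (fake values).
token_vectors = [[[float(t), 1.0] for t in ids] for ids in padded]
embeddings = masked_mean_pool(token_vectors, mask)
```

The mask is what keeps padded positions from dragging the average toward the pad vector, which matters as soon as a batch mixes short and long inputs.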

Well, while this is stuck I'm working on a separate repo for embeddings in MLX.