Is there a way to contribute to this?
## Transformers

[Added logs to look into the outputs after attention and after mlp_output in transformers](https://github.com/huggingface/transformers/blob/main/src/transformers/models/modernbert/modular_modernbert.py#L739-L754)

Using the example above, I get the following:

```
Encoder before attention, 0 -...
```
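For anyone who wants to reproduce these logs without patching the library, here's a rough sketch using forward hooks instead; the `.attn`/`.mlp` module names and the checkpoint id are assumptions based on the current ModernBERT modeling code, not something this thread depends on.

```python
# Sketch only: capture per-layer attention and MLP outputs with forward hooks
# instead of editing modular_modernbert.py. The ".attn"/".mlp" suffixes and the
# checkpoint id are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "answerdotai/ModernBERT-base"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).eval()

captured = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Attention modules may return a tuple (hidden_states, ...).
        hidden = output[0] if isinstance(output, tuple) else output
        captured[name] = hidden.detach()
    return hook

for name, module in model.named_modules():
    if name.endswith(".attn") or name.endswith(".mlp"):
        module.register_forward_hook(make_hook(name))

inputs = tokenizer(["What is Deep Learning?"], return_tensors="pt")
with torch.no_grad():
    model(**inputs)

for name, tensor in captured.items():
    print(name, tuple(tensor.shape), tensor.flatten()[:4])
```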
I'm going to look into why there are differences here. I may not fully understand why, but I'll step through both libraries.

1. We know they're operating on similar `hidden_states` before...
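Here's a rough sketch of the comparison I have in mind, assuming the logged tensors from each library get dumped to disk (the file names are hypothetical):

```python
# Sketch only: quantify the per-layer drift between the two implementations.
# File names are hypothetical; they stand in for tensors dumped from the logs.
import torch

hf_hidden = torch.load("transformers_hidden_states.pt")
tei_hidden = torch.load("tei_hidden_states.pt")

diff = (hf_hidden.float() - tei_hidden.float()).abs()
print("max abs diff :", diff.max().item())
print("mean abs diff:", diff.mean().item())
print("allclose @ atol=1e-3:", torch.allclose(hf_hidden.float(), tei_hidden.float(), atol=1e-3))
```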
# W_o projection output during forward pass

I'm noticing a larger difference here 🤔

## Transformers

```
tensor([[[ 0.1955,  0.5709,  0.2585,  ..., -0.2987,  0.2474,  0.3109],
         [ 0.0070,  0.2721, -0.4248,  ...,
```
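To check whether the W_o projection itself is amplifying the gap or just passing through a gap that is already in its input, one thing I can try is replaying the projection in plain float32 PyTorch on both pre-projection tensors (dumped the same way as above; the file names are hypothetical):

```python
# Sketch only: apply the same W_o weight to both libraries' pre-projection
# tensors. If the output gap is much larger than the input gap, the projection
# (or its kernel/dtype) is amplifying the drift; otherwise the drift was
# already there before W_o. File names are hypothetical.
import torch

wo_weight = torch.load("wo_weight.pt").float()        # [hidden, hidden]
hf_pre = torch.load("hf_attn_pre_wo.pt").float()      # attention output before W_o
tei_pre = torch.load("tei_attn_pre_wo.pt").float()

hf_out = hf_pre @ wo_weight.T
tei_out = tei_pre @ wo_weight.T
print("input  max abs diff:", (hf_pre - tei_pre).abs().max().item())
print("output max abs diff:", (hf_out - tei_out).abs().max().item())
```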
# Separate example

Going to look at a different example because it looks like with more than 3 texts it changes things quite a bit.

## Example

Using a different...
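To pin down the batch-size effect, a simple check is to send the same pairs in requests of different sizes and compare the scores for the texts they share. A rough sketch against a local TEI instance (the port, query, and passages are placeholders; the `/rerank` payload shape follows the TEI docs):

```python
# Sketch only: score the same passages in a 4-text request and a 3-text
# request and compare. Assumes a reranker served locally by TEI on port 8080.
import requests

query = "What is Deep Learning?"  # placeholder
texts = [  # placeholder passages
    "Deep Learning is a subfield of machine learning.",
    "Cheese is made from milk.",
    "Paris is the capital of France.",
    "Neural networks with many layers are called deep.",
]

def rerank(query, texts):
    resp = requests.post(
        "http://localhost:8080/rerank",
        json={"query": query, "texts": texts},
    )
    resp.raise_for_status()
    return {r["index"]: r["score"] for r in resp.json()}

full = rerank(query, texts)
subset = rerank(query, texts[:3])
for i in range(3):
    print(f"text {i}: 4-text request={full[i]:.6f}  3-text request={subset[i]:.6f}")
```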
@Narsil is there any appetite for looking into this? Otherwise I can try to dig in further!
Interestingly, running this on the CPU yields this:

```
[{'index': 1, 'score': 0.99749833}, {'index': 3, 'score': 0.9912548}, {'index': 0, 'score': 0.010130412}, {'index': 2, 'score': 0.0005193049}]
```

It's worth investigating the...
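One way to see how much of this is a device/precision effect is to run the same pairs through Transformers on CPU float32, GPU float32, and GPU float16 and compare the scores. Rough sketch (the model id and query/passage pair are placeholders):

```python
# Sketch only: compare reranker scores across device/dtype combinations.
# The model id and the query/passage pair are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "some-org/some-reranker"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
pairs = [("What is Deep Learning?", "Deep Learning is a subfield of machine learning.")]

def score(device, dtype=torch.float32):
    model = AutoModelForSequenceClassification.from_pretrained(model_id, torch_dtype=dtype)
    model = model.to(device).eval()
    inputs = tokenizer(
        [q for q, _ in pairs], [t for _, t in pairs],
        return_tensors="pt", padding=True,
    ).to(device)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.sigmoid(logits).float().cpu()

print("cpu  fp32:", score("cpu"))
if torch.cuda.is_available():
    print("cuda fp32:", score("cuda"))
    print("cuda fp16:", score("cuda", torch.float16))
```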
Hey @kozistr! Thanks for the reply above!

> which TEI uses approximated gelu while fusing layers on the backside (I might be wrong).

It does look like it's fusing:...
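For scale, here's a quick sketch of how far the tanh-approximated GELU sits from the exact one on random activations; it only illustrates the size of drift such a swap could introduce per layer, not what TEI's fused kernel actually does:

```python
# Sketch only: exact vs tanh-approximated GELU on random activations.
import torch
import torch.nn.functional as F

x = torch.randn(1024, 1024)
exact = F.gelu(x, approximate="none")
approx = F.gelu(x, approximate="tanh")

diff = (exact - approx).abs()
print("max abs diff :", diff.max().item())
print("mean abs diff:", diff.mean().item())
```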