Why is the model's accuracy so low when using residue embeddings from a pre-trained model?

Open: Gift-OYS opened this issue 1 year ago • 1 comment

I’m using the pre-trained model esm3-sm-open-v1 to extract residue embeddings via link. However, the precision of my model is unexpectedly low. For reference, here’s a small snippet of the residue embeddings (shape: [num_tokens, embedding_dim]):

tensor([[ 175.0000,  102.5000,  -99.0000,  ..., -106.5000,  -35.5000,
           86.0000],
        [ 205.0000,  103.5000, -139.0000,  ..., -328.0000, -224.0000,
          134.0000],
        [ 130.0000,   49.2500,  -26.2500,  ..., -202.0000, -161.0000,
          134.0000],
        ...,
        [  65.0000,  -75.0000,   54.5000,  ...,  -62.0000,  -26.2500,
         -102.5000],
        [ 173.0000,  -60.0000,  205.0000,  ...,   43.0000,  -89.0000,
         -115.5000],
        [  -6.0000,  170.0000,  113.0000,  ...,  -44.0000, -115.0000,
           43.7500]])

Gift-OYS, Dec 05 '24 13:12

I get the same embeddings. I am also concerned that the embedding values are larger in magnitude than those produced by other models such as ProtT5 and ESMC.

thnhan, Feb 16 '25 04:02
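
One way to check whether the raw scale of the embeddings affects a downstream model is to summarize their magnitudes and then normalize each residue vector before training on it. The sketch below is only a sketch: the thread does not confirm that scale is the cause of the low accuracy. The array is random stand-in data shaped like [num_tokens, embedding_dim], the helper names `summarize` and `layer_normalize` are hypothetical, and NumPy is used instead of PyTorch only to keep the example self-contained.

```python
import numpy as np


def summarize(emb):
    """Return basic scale statistics for a [num_tokens, embedding_dim] array."""
    return {
        "mean": float(emb.mean()),
        "std": float(emb.std()),
        "max_abs": float(np.abs(emb).max()),
    }


def layer_normalize(emb, eps=1e-5):
    """Rescale each residue vector to zero mean and unit variance across features."""
    mean = emb.mean(axis=-1, keepdims=True)
    var = emb.var(axis=-1, keepdims=True)
    return (emb - mean) / np.sqrt(var + eps)


# Stand-in data (assumption): random values with a large spread, loosely
# imitating the magnitudes in the snippet above. Replace with real embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(loc=0.0, scale=120.0, size=(8, 16))

normalized = layer_normalize(emb)

print("raw:       ", summarize(emb))
print("normalized:", summarize(normalized))
```

If accuracy changes noticeably after normalization, input scale is a likely factor; if it does not, the cause probably lies elsewhere, for example in which layer or output the embeddings are taken from.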