PEER_Benchmark

EnzymeCommission loss is always NaN when using ESM with a fixed encoder

Open jasperhyp opened this issue 3 years ago • 8 comments

Hi, I used your EnzymeCommission dataset to benchmark ESM, and the loss is still NaN at epoch 20. Should I change the YAML config (copied from GearNet)?
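One way to narrow down a report like this is to find the first epoch at which the logged loss stops being finite. The following is a minimal sketch, assuming the per-epoch losses are available as plain floats; `first_non_finite` is a hypothetical helper and not part of PEER_Benchmark or TorchDrug, and the values are illustrative only.

```python
import math


def first_non_finite(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for epoch, loss in enumerate(losses):
        if not math.isfinite(loss):
            return epoch
    return None


# Illustrative values only, not taken from a real training run.
history = [0.71, 0.65, float("nan"), float("nan")]
bad_epoch = first_non_finite(history)
```

If the very first logged value is already non-finite, that would point toward the inputs or the transform pipeline rather than the optimizer settings, though this is only a heuristic.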

jasperhyp avatar Dec 02 '22 14:12 jasperhyp

The same thing happened for GO (BP, CC, MF).

jasperhyp avatar Dec 02 '22 22:12 jasperhyp

Hi, I have checked the ESM on EC config in the GearNet repo. Everything looks fine to me. The most important settings for reproducing the results are lr_ratio: 0.1 and max_length: 550. I am not sure whether some specific setting (such as batch size or number of GPUs) leads to the training failure.
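The two settings named above can be illustrated in plain Python. This is a sketch under assumptions, not the GearNet code: it assumes lr_ratio scales the encoder's learning rate relative to the task head, and that truncation without random cropping keeps the first max_length residues. `build_param_groups` and `truncate_sequence` are hypothetical names, and the base learning rate is an illustrative value.

```python
import random


def build_param_groups(encoder_params, head_params, lr, lr_ratio):
    """Build optimizer-style parameter groups; the encoder gets lr * lr_ratio (assumed meaning)."""
    return [
        {"params": encoder_params, "lr": lr * lr_ratio},
        {"params": head_params, "lr": lr},
    ]


def truncate_sequence(sequence, max_length, random_crop=False, rng=None):
    """Crop a sequence to at most max_length items."""
    if len(sequence) <= max_length:
        return sequence
    if random_crop:
        # Pick a random window of exactly max_length items.
        rng = rng or random
        start = rng.randint(0, len(sequence) - max_length)
    else:
        # Deterministic crop: keep the leading max_length items.
        start = 0
    return sequence[start:start + max_length]


# Placeholder parameter names and an illustrative base learning rate.
groups = build_param_groups(["enc.w"], ["head.w"], lr=1e-4, lr_ratio=0.1)
cropped = truncate_sequence("M" * 600, max_length=550)
```

Note that when the encoder is frozen entirely, the encoder group's learning rate has no effect, so only the truncation setting would matter in that case.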

ChrisAllenMing avatar Dec 03 '22 01:12 ChrisAllenMing

Interesting... I am using batch_size: 16 and one GPU. But that shouldn't affect losses. Otherwise, everything is the same.

jasperhyp avatar Dec 03 '22 17:12 jasperhyp

Yeah, it's weird. I'm afraid I cannot tell why this is the case. BTW, for ESM on EC, I think batch_size: 16 may easily exceed the maximum GPU memory. I am curious what truncation max_length you use to make it runnable.

ChrisAllenMing avatar Dec 04 '22 01:12 ChrisAllenMing

Hmmm. I'll check the GO dataset to see if there's something weird I'm doing. I used a 32GB GPU. I kept the max_length line in the YAML file, which is 550.

jasperhyp avatar Dec 06 '22 20:12 jasperhyp

Oh, I just noticed that you fix (freeze) the encoder, so you can afford a batch size of 16. The truncation length and other configurations also look fine to me.

ChrisAllenMing avatar Dec 08 '22 03:12 ChrisAllenMing

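It's kind of weird, but when I used

from torchdrug import datasets, transforms

# Deterministically truncate each protein to at most 550 residues.
truncate_transform = transforms.TruncateProtein(max_length=550, random=False)
# Use the residue-level view of each protein.
protein_view_transform = transforms.ProteinView(view="residue")
transform = transforms.Compose([truncate_transform, protein_view_transform])
# data_folder is defined elsewhere in my script.
dataset = datasets.EnzymeCommission(data_folder+"/protein-datasets", transform=transform)

it worked. I am guessing it's a dataset issue, but I can't figure out why.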

jasperhyp avatar Dec 18 '22 23:12 jasperhyp

Hi! Have you tried this config in the GearNet repo? It should work well for EC. If not, you can raise an issue in the GearNet repo.

Oxer11 avatar Dec 22 '22 06:12 Oxer11