PEER_Benchmark: EnzymeCommission loss is always NaN when using ESM with a fixed encoder

Hi, I used your EnzymeCommission dataset to benchmark ESM, and the loss is still NaN at epoch 20. Should I change the yaml (copied from GearNet)?

Dec 02 '22 14:12 jasperhyp

The same thing happened for GO (BP, CC, MF).

Dec 02 '22 22:12 jasperhyp

Hi, I have checked the config for ESM on EC in the GearNet repo, and everything looks fine to me. The most important settings for reproducing the results are lr_ratio: 0.1 and max_length=550. I am not sure whether some specific setting (such as batch size or number of GPUs) could lead to the training failure.

Dec 03 '22 01:12 ChrisAllenMing

Interesting... I am using batch_size: 16 and one GPU. But that shouldn't affect losses. Otherwise, everything is the same.

Dec 03 '22 17:12 jasperhyp

Yeah, it's weird. I'm afraid I cannot tell why this is happening. BTW, for ESM on EC, I think batch_size: 16 may easily exceed the maximum GPU memory. I am curious what truncation max_length you use to make it runnable.

Dec 04 '22 01:12 ChrisAllenMing

Hmmm. I'll check the GO dataset to see if there's something weird I'm doing. I used a 32GB GPU. I kept the max len line in the yaml file, which is 550.

Dec 06 '22 20:12 jasperhyp

Oh, I just noticed that you fix the encoder, which is why you can use a batch size of 16. The truncation length and other configurations also look fine to me.

Dec 08 '22 03:12 ChrisAllenMing

It's kinda weird but if I used

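from torchdrug import datasets, transforms

# Truncate proteins to at most 550 residues (no random cropping), then use the residue-level view.
truncate_transform = transforms.TruncateProtein(max_length=550, random=False)
protein_view_transform = transforms.ProteinView(view="residue")
transform = transforms.Compose([truncate_transform, protein_view_transform])
# data_folder is the local root directory where the datasets are stored.
dataset = datasets.EnzymeCommission(data_folder+"/protein-datasets", transform=transform)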

it worked. I am guessing it's a dataset issue but I can't figure out why.

Dec 18 '22 23:12 jasperhyp
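For readers unfamiliar with the truncation step in the snippet above, here is a minimal sketch of what truncating to 550 residues amounts to, written on plain strings rather than TorchDrug protein objects. The function name `truncate_sequence` and the `random_window` parameter are hypothetical. The assumption that `random=False` keeps the leading `max_length` residues, instead of a randomly placed window, is not confirmed anywhere in this thread.

```python
import random


def truncate_sequence(sequence: str, max_length: int = 550,
                      random_window: bool = False) -> str:
    """Hypothetical stand-in for a protein truncation transform, on plain strings."""
    if len(sequence) <= max_length:
        # Sequences that already fit are returned unchanged.
        return sequence
    if random_window:
        # Pick a random window start so that the window stays inside the sequence.
        start = random.randint(0, len(sequence) - max_length)
    else:
        # Assumption: deterministic mode keeps the leading window.
        start = 0
    return sequence[start:start + max_length]


if __name__ == "__main__":
    long_sequence = "ACDEFGHIKLMNPQRSTVWY" * 30  # 600 residues, made up for illustration
    print(len(truncate_sequence(long_sequence)))  # length after truncation
```

With a fixed window, every epoch sees the same residues for a given protein, which makes runs easier to compare while debugging.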

Hi! Have you tried this config in the GearNet repo? It should work well for EC. If not, you can raise an issue in the GearNet repo.

Dec 22 '22 06:12 Oxer11
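
The thread ends without a root cause. When debugging a NaN loss like this, one low-cost check is to find the first step at which the logged loss stops being finite: a NaN from the very first batch may point at the data or config, while one that appears later may look more like divergence. Below is a minimal sketch in plain Python. The helper `first_nonfinite` and the sample loss values are hypothetical and not part of PEER_Benchmark, GearNet, or TorchDrug.

```python
import math
from typing import Iterable, Optional, Tuple


def first_nonfinite(losses: Iterable[float]) -> Optional[Tuple[int, float]]:
    """Return (step, value) for the first NaN or infinite loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step, value
    return None


if __name__ == "__main__":
    # Made-up per-batch losses, for illustration only.
    logged = [0.71, 0.65, float("nan"), float("nan")]
    hit = first_nonfinite(logged)
    if hit is None:
        print("all losses are finite")
    else:
        print(f"first non-finite loss at step {hit[0]}")
```

If the first bad step is step 0, inspecting that batch's inputs and labels is a reasonable next move before changing the optimizer settings.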