Larger segments as input

Open ivandon15 opened this issue 1 year ago • 1 comment

Hi DiffLinker Team,

Thank you for your great work on this. I was wondering whether the model can sample linkers for larger molecules (for example, connecting two 10-mer peptides). I tried the existing model (I know it isn't appropriate; I just wanted to check whether it would run without error), and it raised a NanError during sample_p_zs_given_zt_only_linker. Is that because the input contains too many atoms?

Thank you for your patience and help!

ivandon15 · Jul 11 '24 05:07

I might have found the solution. The DDPM class calls the GCL class, which uses the unsorted_segment_sum function to process node features. This function takes a normalization_factor whose default value is 100, probably because the original model was applied to small molecules (is that right?). With peptides as input, the large number of atoms makes the values explode at the second diffusion step. When I set normalization_factor to a larger integer, the model no longer produced NaN errors. Does that sound right?
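To illustrate what I mean, here is a minimal pure-Python sketch of a segment sum that divides by a constant, not the repository's actual code. It assumes a fully connected graph and unit-valued messages, and the helper aggregated_magnitude is made up for this example. The point is only that with a fixed normalization_factor, the aggregated value per node grows roughly linearly with the number of atoms.

```python
def unsorted_segment_sum(data, segment_ids, num_segments, normalization_factor=100.0):
    """Sum the rows of `data` into `num_segments` buckets, then divide by a constant.

    Illustrative re-implementation; the real function presumably works on tensors.
    """
    width = len(data[0]) if data else 0
    result = [[0.0] * width for _ in range(num_segments)]
    for row, seg in zip(data, segment_ids):
        for j, value in enumerate(row):
            result[seg][j] += value
    return [[v / normalization_factor for v in row] for row in result]


def aggregated_magnitude(num_atoms, normalization_factor):
    """Hypothetical helper: value aggregated at node 0 of a fully connected graph."""
    # Assumption: every atom receives one unit-valued message from each other atom.
    num_edges = num_atoms * (num_atoms - 1)
    messages = [[1.0] for _ in range(num_edges)]
    receivers = [i for i in range(num_atoms) for _ in range(num_atoms - 1)]
    agg = unsorted_segment_sum(messages, receivers, num_atoms, normalization_factor)
    return agg[0][0]


small = aggregated_magnitude(30, 100.0)        # small-molecule scale
large = aggregated_magnitude(300, 100.0)       # peptide scale, same factor
rescaled = aggregated_magnitude(300, 1000.0)   # peptide scale, larger factor

print(small, large, rescaled)
```

Under these assumptions, the 300-atom input with the default factor aggregates to a value about ten times larger than the 30-atom input, and raising the factor to 1000 brings it back to a similar scale. This only shows the scaling effect; it says nothing about whether a checkpoint trained with the default factor still samples sensible linkers after the change.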

ivandon15 · Jul 11 '24 09:07