How to accelerate protein prediction

Open dahaigui opened this issue 3 years ago • 3 comments

We have built a mutant library for cucumber and have completed resequencing of some individual plants, finding a number of proteins with non-synonymous mutations. We have translated these proteins and hope to predict their structures using AlphaFold2. Since these proteins differ only minimally from the original protein sequences, is it possible to reduce the MSA time to speed up structure prediction?

dahaigui avatar Nov 16 '22 10:11 dahaigui

This is addressed in the "Inferencing many proteins" section of the README. The short answer is "not exactly", but you can reduce the runtime for similar proteins by keeping the network at a fixed size, and a bulk inference script could be built on top of the RunModel.predict method.
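As a rough illustration of the "fixed size" idea, here is a minimal sketch that groups sequences by a padded length, on the assumption that inputs padded to the same length can share one compiled network. It uses plain Python only and does not call any AlphaFold code; the helper names and the step of 64 residues are hypothetical choices for illustration.

```python
from collections import defaultdict


def padded_length(n_residues, step=64):
    """Round a sequence length up to the next multiple of `step` (hypothetical bucket size)."""
    return ((n_residues + step - 1) // step) * step


def group_by_padded_length(sequences, step=64):
    """Map each padded length to the names of the sequences that would share it."""
    groups = defaultdict(list)
    for name, seq in sequences.items():
        groups[padded_length(len(seq), step)].append(name)
    return dict(groups)


# Toy sequences standing in for a reference protein and its mutants.
sequences = {
    "ref": "M" * 100,
    "mutant_a": "M" * 100,
    "mutant_b": "M" * 130,
}
groups = group_by_padded_length(sequences)
print(groups)  # {128: ['ref', 'mutant_a'], 192: ['mutant_b']}
```

Point mutants of one protein all have the same length, so they would fall into a single group regardless of the step chosen.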

tcoates5 avatar Nov 16 '22 15:11 tcoates5

This is good to know. Could you please elaborate with an example, or point me to a resource where I can find out more about this?

smturzo avatar Nov 17 '22 02:11 smturzo

@smturzo

The reference protein sequence is:
>CsaV3_7G025510
MIGRLRMNHCVPDFEMADDFSLPTFSSLTRPRKSSLPDDDVMELLWQNGQVVTHSQNQRSFRKSPPSKFDVSIPQEQAATREIRPSTQLEEHHELFMQEDEMASWLNYPLVEDHNFCSDLLFPAITAPLCANPQPDIRPSATATLTLTPRPPIPPCRRPEVQTSVQFSRNKATVESEPSNSKVMVRESTVVDSCDTPSVGPESRASEMARRKLVEVVNGGGVRYEIARGSDGVRGASVGGDGIGEKEMMTCEMTVTSSPGGSSASAEPACPKLAVDDRKRKGRALDDTECQSEDVEYESADPKKQLRGSTSTKRSRAAEVHNLSERRRRDRINEKMKALQELIPRCNKTDKASMLDEAIEYLKTLQLQVQMMSMGCGMMPMMFPGVQQYLPPPMGMGMGMGMEMGMNRPMMQFHNLLAGSNLPMQAGATAAAHLGPRFPLPPFAMPPVPGNDPSRAQAMNNQPDPMANSVGTQNTTPPSVLGFPDSYQQFLSSTQMQFHMTQALQNQHPVQLNTSRPCTSRGPENRDNHQSG

One of the proteins in our mutant library differs from the reference genome by only one amino acid. Its sequence is:
>Csa_Mutant_125_7G025510
MIGRLRMNHCVPDFEMADDFSLPTFSSLTRPRKSSLPDDDVMELLWQNGQVVTHSQNQRSFRKSPPSKFDVSIPQEQAATREIRPSTQLEEHHELFMQEDEMASWLNYPLVEDHNFCSDLLFPAITAPLCANPQPDIRPSATATLTLTPRPPIPPCRRPEVQTSVQFSRNKATVESEPSNSKVMVRESTVVDSCDTPSVGPESRASEMARRKLVEVVNGGGVRYEIARGSDGVRGASVGGDGIGEKEMMTCEMTVTSSPGGSSASAEPACPKLAVDDRKRKGRALDDTECQSEDVEYESADPKKQLRGSTSTKRSRAAEVHNLSERRRRDRINEKMKALQELIPRCNKTDKASMLDEAIEYLKTLQLQVQMMSMGCGMMPMMFPGVQQYLPPPMGMGMGMGMEMGMNRPMMQFHNLLAGSNLKMQAGATAAAHLGPRFPLPPFAMPPVPGNDPSRAQAMNNQPDPMANSVGTQNTTPPSVLGFPDSYQQFLSSTQMQFHMTQALQNQHPVQLNTSRPCTSRGPENRDNHQSG
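For anyone wanting to confirm where a mutant differs from its reference before running predictions, a minimal sketch of a position-by-position comparison follows. It assumes equal-length sequences (substitutions only, no insertions or deletions); the function name is hypothetical, and the short strings are fragments taken from around the differing residue in the two sequences above, so the reported position is relative to the fragment, not to the full protein.

```python
def point_substitutions(reference, mutant):
    """Return (1-based position, reference residue, mutant residue) for each mismatch."""
    if len(reference) != len(mutant):
        raise ValueError("sequences differ in length; an alignment is needed instead")
    return [
        (i + 1, r, m)
        for i, (r, m) in enumerate(zip(reference, mutant))
        if r != m
    ]


# Fragments around the differing residue, not the full sequences.
ref_fragment = "GSNLPMQAG"
mut_fragment = "GSNLKMQAG"
subs = point_substitutions(ref_fragment, mut_fragment)
print(subs)  # [(5, 'P', 'K')]
```

Passing the full reference and mutant sequences instead of the fragments would give the position in the full-length protein.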

dahaigui avatar Nov 17 '22 08:11 dahaigui