Index error/ List Index out of range

Open dsairam789 opened this issue 4 years ago • 8 comments

I was predicting the structure of a complex with the Colab notebook (AlphaFold2) by uploading unpaired alignments of the monomers. I encountered the error "List index out of range" at the prediction step and am unsure how to resolve it. I tried resetting the run using Factory run and then Run all, but the error persists. Could you help me with this?

P.S. After looking through the issues reported by other users, I noticed that my problem is similar to this one (https://github.com/sokrypton/ColabFold/issues/128).

I am happy to provide any further information.

dsairam789 avatar Jan 10 '22 09:01 dsairam789

Other details about the issue

Notebook Name: AlphaFold2.ipynb

Input Query : MDADKIVFKVNNQVVSLKPEIIVDQHEYKYPAIKDLKKPCITLGKAPDLNKAYKSVLSGM SAAKLDPDDVCSYLAAAMQFFEGTCPEDWTSYGIVIARKGDKITPGSLVEIKRTDVEGNW ALTGGMELTRDPTVPEHASLVGLLLSLYRLSKISGQNTGNYKTNIADRIEQIFETAPFVK IVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLV SFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPY SSNAVGHVFNLIHFVGCYMGQVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFF RDEKELQEYEAAELTKTDVALADDGTVNSDDEDYFSGETRSPEAVYTRIMMNGGRLKRSH IRRYVSVSSNHQARPNSFAEFLNKTYSSDS:MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMGRLHLDD GKSPNPGE

Parameters Checked
AMBER: Yes
Save to Google_Drive: Yes
MSA Mode: Custom
Model_Type: Alphafold2-ptm
Pair Mode: Unpaired+Paired
Attachments: two screenshots (Screenshot_2022-01-13 Google Colaboratory), MSA_A3M_Files.zip

dsairam789 avatar Jan 13 '22 15:01 dsairam789

You need to express your complex in a single a3m file. My answer to the following issue explains how the MSAs (a3ms) should be formatted: https://github.com/sokrypton/ColabFold/issues/76

martin-steinegger avatar Jan 13 '22 16:01 martin-steinegger

Thanks for your reply and the link to the earlier thread. I did exactly as you suggested (in #76), but I get a similar error. (Attachments: two screenshots and ezyzip.zip.)

dsairam789 avatar Jan 13 '22 16:01 dsairam789

I made a small example that runs. Make sure the entries in the header (#) line are separated by tabs:

#423,68	1,1
>P06747	P06025
YKYPAIKDLKKPCITLGKAPDLNKAYKSVLSCMSAAKLDPDDVCSYLAAAMQFFEGTCPEDWTSYGIVIARKGDKITPGSLVEIKRTDVEGNWALTGGMELTRDPTVPEHASLVGLLLSLYRLSKISGQSTGNYKTNIADRIEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLVSFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMGQVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFFRDEKELQEYEAAELTKTDVALADDGTVNSDDEDYFSGETRSPEAVYTRIIMNGGRLKRSHIRRYVSVSSNHQARPNSFAEFLNKTYSSDSMSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMGRLHLDDGKSPNPGE
>P06747
YKYPAIKDLKKPCITLGKAPDLNKAYKSVLSCMSAAKLDPDDVCSYLAAAMQFFEGTCPEDWTSYGIVIARKGDKITPGSLVEIKRTDVEGNWALTGGMELTRDPTVPEHASLVGLLLSLYRLSKISGQSTGNYKTNIADRIEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLVSFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMGQVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFFRDEKELQEYEAAELTKTDVALADDGTVNSDDEDYFSGETRSPEAVYTRIIMNGGRLKRSHIRRYVSVSSNHQARPNSFAEFLNKTYSSDS--------------------------------------------------------------------
>UniRef100_A0A0 Nucleocapsid protein (Fragment) n=1 Tax=Rabies lyssavirus TaxID=11292 RepID=A0A096XWU8_9RHAB
----AIKDLKKPSITLGKAPDLNKAYKSVLSGMNAAKLDPDDVCSYLAAAMQFFEGTCPEDWTSYGILIARKGDKITPDSLVEIKRTDVEGNWALTGGMELTRDPTVSEHASLVGLLLSLYRLSKISGQNTGNYKTNIADRIEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLVSFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMGQVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFFRDEKELQEYEAAELTKTDVALADDGTVHSDDEDYFSGETRSPEAVYTR--------------------------------------------------------------------------------------------------------------
>P06025
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMGRLHLDDGKSPNPGE
>UniRef100_A0A0 Phosphoprotein n=1 Tax=Rabies lyssavirus TaxID=11292 RepID=A0A023GV82_9RHAB
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMRRLQLDDGKPSNL--

sml.a3m.zip

martin-steinegger avatar Jan 13 '22 17:01 martin-steinegger
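The tab requirement in the example above is easy to miss, because many editors silently convert tabs to spaces. The following is a minimal sketch of a pre-upload check, not part of ColabFold. The function names `parse_complex_header` and `check_rows` are hypothetical, and the code makes three assumptions based only on the example in this thread: the first header field lists the chain lengths, the second lists the number of copies of each chain, and every alignment row sits on a single line and spans the concatenated chains.

```python
def parse_complex_header(line):
    """Parse a header such as '#423,68<TAB>1,1' into (lengths, copies).

    Assumption: first field = chain lengths, second field = copies per chain.
    """
    if not line.startswith("#"):
        raise ValueError("header line must start with '#'")
    fields = line[1:].rstrip("\n").split("\t")
    if len(fields) != 2:
        raise ValueError(
            "expected two tab-separated fields; check that a tab, "
            "not spaces, separates them"
        )
    lengths = [int(x) for x in fields[0].split(",")]
    copies = [int(x) for x in fields[1].split(",")]
    if len(lengths) != len(copies):
        raise ValueError("lengths and copies must have the same number of entries")
    return lengths, copies


def check_rows(a3m_text):
    """Return (name, columns) for every row whose width differs from the header."""
    lines = a3m_text.splitlines()
    lengths, _ = parse_complex_header(lines[0])
    expected = sum(lengths)
    problems = []
    name = None
    for line in lines[1:]:
        if line.startswith(">"):
            name = line[1:]
        elif line.strip():
            # Lowercase letters mark insertions in a3m and occupy no column.
            columns = sum(1 for c in line.strip() if not c.islower())
            if columns != expected:
                problems.append((name, columns))
    return problems
```

Running `check_rows` on the file contents before uploading returns an empty list when every row has the width announced in the header, and otherwise names the rows that need padding with `-`.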

Hello guys, I found this post when looking for the answer to this "list out of range" error. In my case, I want to predict a protein structure, just with the standard parameters, but this error still appears every time I try. The sequence is:

"MENPMERELSCTICMELYNEPLLLPCAHTFCRKCLEDLIAKSNGFAATATSTCHGDDGGDCRPEADEPCDVPTDSKDSDGIVINCPTCRREVILRSETGLNGLVRNFLLDSIVSRYKQQETSIVRQTCEICDDKPAITKCVQCQVTYCDLCRNMCHPNKGVFNTHQLVPLSHDGPVQRTYGIQPLQCFVHNGEGLKLYCDDCRIPICFICERFGEHKGHHVREAHSAFKTIKEQMSDNVATMIGLQSDIDDFLIALSNGRDSIQEDAASLKEQVSNSCQRLHATINQKEQEMHEVIAKDMTNKLYEIQQKMTMCQDKQRRLSGLVQFAQVVLENNNETSIFLTSAASLDDRLTSELSNSPNLQLNEAFLTFDHIIIDVKSAIRGIKKMQARKITEPKVPIITKTEIMNRNGVMLSVNHEPCDVKWYDVAYQPINGQWMTMRWEVMADQSGASLEDHIHLGNLDYDCEYYFCVCMTNVAGTSPWSQTVAVKTKTVATEFILNEETAHPALLLSEDGKTVKRREDYIHKKMKVSEKIQLGMRFVRNVHCILGDVILSDDAHYWEVEASQSGATSYAIGVATADCHRNQQLGTNKSSWCLEIMGITARGYHNDRCTRIKHNLECNSTRRFGILLDYQRRALEFFYKEKLLLSYAVNAKVTDLCPAFDLTNSSAKLSIITGLKIPEFLNTCHVHIQQKMGDSLERELSCAVCMELYTDPLLLPCAHSFCRSCLPDVLKRNSNQKSGHSRLVCPSCRFTVELDKRGIDGLPRNFLLDNIIERYKEEKSTDGRPVKVKGVACDVCADSGGAKASKTCIQCGVSYCDRCLRTYHPSKGVFSKHKLVKATRNPKRKDVYCPEHDDELIKMYCVQCKTPVCYLCDRFGGHKGHQVAELKTSYKLMKETLSSNLAQLVSKMANVNEFIITLERKGESIQTNAAVMHQRISEEFAVIKAMLEQRERVMHTKISEETARKLLLLKQQNMACQDKLHNTAGLIQYTREVLKEEVPAALLLTGASLDDRLNCAIDSCPQLQPNTADDFSHVILNLEYEKQIIQKMDLLTIKAPEKPRMGGHIEVRNTVHLSIKHAPCIVDSYDMGVCKSGGLWDYFKIECGKGDERETDEYRLVKEDLAFDSEYFFKARVRNKAGASGWSNVFPARTGPQAMTFRLDPETAHEDLVIISAGRSVIYQPRPRGFWVMQEEGKASTGRFHGRALSVLADVVLATGVHYWEVTTQVAEKQHGESLSHYTDRDASVYRGDIAIGLAKQNCNRDLCLGSDGSSWALRLPSNGGNWYVAHKNKQHVISAASAIDSCPRSQFPAGLHVGILVDFTHHKLRFYDCNRKILLYSCDQISKEKRLCPALEISDSSYQLNLRTGTGIPDYANSK" image

At the beginning everything is fine, but at the modelling step it keeps running until the RAM is full and this error pops up. I don't know what to do; I tried on two different computers and it continues to happen. Other errors also appear (I don't have a screenshot), but they are pretty much the same.

If anyone knows how to solve it I would be really happy.

Scrupy07 avatar Jun 09 '22 00:06 Scrupy07

This is the error I usually encounter when modelling this protein (screenshot attached).

Does anyone know how to fix it?

Scrupy07 avatar Jun 09 '22 15:06 Scrupy07

This is an error that you'll see if the previous step failed and didn't return any models (so none can be displayed).

In this case, that looks like a large protein. On Google Colab, the upper limit is a length of about 1400.

sokrypton avatar Jun 09 '22 15:06 sokrypton
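The notebook's own code is not shown in this thread, so the following is only a generic Python illustration of the mechanism described above: indexing into an empty list of results raises exactly this message. The helper `best_model` is hypothetical and simply shows one way to guard against the empty case.

```python
def best_model(models):
    """Return the first model, or None when the prediction step returned nothing."""
    if not models:
        return None
    return models[0]


models = []  # stands in for a prediction step that failed and produced no models

try:
    models[0]
except IndexError as err:
    message = str(err)  # the text Python reports for an empty list

print(message)
print(best_model(models))
```

Seeing this error at the display step therefore usually means the cause is earlier in the run, for example the prediction running out of memory.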

Thanks for replying. This protein is 1379 residues long, so I think it is close to the limit. Perhaps I could divide the protein into three fragments with overlapping regions and use them to get an idea of what the whole structure looks like. What do you think?

Scrupy07 avatar Jun 09 '22 15:06 Scrupy07
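The fragment idea above can be sketched as follows. This is a minimal illustration, not a recommendation from the ColabFold authors: the function name `overlapping_fragments` is hypothetical, and the default overlap of 50 residues is an arbitrary assumption that would need to be chosen with the protein's domain boundaries in mind.

```python
def overlapping_fragments(seq, n_fragments=3, overlap=50):
    """Split seq into n_fragments windows whose neighbours share `overlap` residues.

    Returns a list of (start, end, fragment) with 1-based inclusive positions.
    """
    n = len(seq)
    if n_fragments < 1 or overlap < 0 or n < n_fragments:
        raise ValueError("invalid fragment count, overlap, or sequence length")
    # Window length w must satisfy n_fragments*w - (n_fragments-1)*overlap >= n.
    total = n + (n_fragments - 1) * overlap
    w = -(-total // n_fragments)  # ceiling division
    if w <= overlap:
        raise ValueError("overlap is too large for this sequence length")
    step = w - overlap
    fragments = []
    for i in range(n_fragments):
        start = i * step
        end = min(start + w, n)
        fragments.append((start + 1, end, seq[start:end]))
    return fragments


if __name__ == "__main__":
    demo = "A" * 1379  # placeholder with the length mentioned in the thread
    for start, end, frag in overlapping_fragments(demo):
        print(start, end, len(frag))
```

Each fragment can then be submitted as a separate query, and the shared residues give a region on which neighbouring predictions can be superimposed and compared.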