
pdb100_230517_seq does not exist issue

Open rakeshr10 opened this issue 1 year ago • 3 comments

Hi,

I set up the ColabFold databases using your database setup script. I ran colabfold_search with --use-templates 1 and provided the template path with --db2 pdb100_230517_dir/pdb100_230517. I get this error: pdb100_230517_dir/pdb100_230517_seq does not exist.

Can you let me know why pdb100_230517_seq is not present in the database directory even though I used your database setup script? I set MMSEQS_NO_INDEX to 1.

Also, I am a bit confused by the two PDB databases. I used pdb100_230517, created from pdb100_230517.fasta.gz, for the colabfold_search step, but where should I use pdb100_foldseek_230517?

Regards Rakesh

rakeshr10 Feb 29 '24 18:02

I put up this same issue last month and it was fixed for me. Try updating your ColabFold version? #560

More recently, they also fixed colabfold_search so that it produces templates for inputs with more than one sequence. That fix recommends using the .csv input mode for correct output. #567

Also, as an FYI, the input to --db2 should be just the database name 'pdb100_230517' or, in previous versions, the .m8 file. #571
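
To make the --db2 point concrete, here is a minimal Python sketch that assembles a colabfold_search command and reduces the template database argument to its bare name. It assumes the usual positional order (query, database directory, output directory) and that the template database files sit directly in the database directory; `build_search_command` and all paths are hypothetical, so check the result against `colabfold_search --help` for your version.

```python
from pathlib import PurePosixPath


def build_search_command(query, db_dir, out_dir, template_db):
    """Assemble an argv list for colabfold_search with templates enabled.

    Hypothetical helper: the positional order (query, db_dir, out_dir)
    is an assumption to verify against `colabfold_search --help`.
    """
    # Keep only the final path component, so that
    # "pdb100_230517_dir/pdb100_230517" becomes "pdb100_230517".
    db2_name = PurePosixPath(template_db).name
    return [
        "colabfold_search",
        str(query),
        str(db_dir),
        str(out_dir),
        "--use-templates", "1",
        "--db2", db2_name,
    ]


cmd = build_search_command(
    "query.fasta",                      # placeholder input file
    "/path/to/databases",               # placeholder database directory
    "msas",                             # placeholder output directory
    "pdb100_230517_dir/pdb100_230517",  # value used in the original report
)
print(" ".join(cmd))
```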

NickWoodall Feb 29 '24 20:02

Thanks @NickWoodall. Your suggestions worked.

I have another question: how do I pass multiple MSA files and multiple m8 files together in a folder as input for colabfold_batch?

Regards Rakesh

rakeshr10 Mar 02 '24 04:03

As far as I know, you need a separate colabfold_batch command for each m8 file with its corresponding a3m file. The m8 files basically make a custom template database which is shared for the whole run. So to keep the .m8 files separate, I think you need to input them one at a time (programmatically generating a new command line for each).
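
A rough sketch of what "programmatically generating a new command line for each" could look like in Python. Several things here are assumptions rather than confirmed behavior: that each a3m file has an m8 file with the same stem in the same folder (adjust `m8_suffix` to match what your colabfold_search version writes), and that the flags `--templates`, `--pdb-hit-file`, and `--local-pdb-path` are what your colabfold_batch version expects. The function name and all paths are hypothetical; please verify the flags with `colabfold_batch --help` before running anything.

```python
import shlex
from pathlib import Path


def build_batch_commands(msa_dir, out_root, pdb_dir, m8_suffix=".m8",
                         hit_flag="--pdb-hit-file",
                         pdb_flag="--local-pdb-path"):
    """Return one colabfold_batch argv list per (a3m, m8) pair.

    Hypothetical helper. The flag names are assumptions; confirm them
    with `colabfold_batch --help` for your installed version.
    """
    commands = []
    for a3m in sorted(Path(msa_dir).glob("*.a3m")):
        # Assumed naming convention: <stem>.a3m pairs with <stem><m8_suffix>.
        m8 = a3m.with_name(a3m.stem + m8_suffix)
        if not m8.exists():
            continue  # skip MSAs that have no template hit file
        # Separate output folder per query so results do not mix.
        out_dir = Path(out_root) / a3m.stem
        commands.append([
            "colabfold_batch", str(a3m), str(out_dir),
            "--templates",
            hit_flag, str(m8),
            pdb_flag, str(pdb_dir),
        ])
    return commands


if __name__ == "__main__":
    # Placeholder directories; prints the commands instead of running them.
    for cmd in build_batch_commands("msas", "predictions", "/path/to/pdb"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to inspect them; once they look right, each list could be passed to `subprocess.run(cmd, check=True)` in turn.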

NickWoodall Mar 04 '24 14:03