Error "complex_name"
When trying to run DiffDock to predict 24 protein-ligand complexes, I get the following error related to "complex_name":
(diffdock) rjrich@rjr-sd1:~/DiffDock$ python -m inference --config default_inference_args.yaml --protein_ligand_csv DjCES/DjCES_dd.csv --out_dir DjCES/results
/home/rjrich/anaconda3/envs/diffdock/lib/python3.9/site-packages/Bio/pairwise2.py:278: BiopythonDeprecationWarning: Bio.pairwise2 has been deprecated, and we intend to remove it in a future release of Biopython. As an alternative, please consider using Bio.Align.PairwiseAligner as a replacement, and contact the Biopython developers if you still need the Bio.pairwise2 module.
warnings.warn(
Traceback (most recent call last):
File "/home/rjrich/anaconda3/envs/diffdock/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
return self._engine.get_loc(casted_key)
File "index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
File "index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'complex_name'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/rjrich/anaconda3/envs/diffdock/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/rjrich/anaconda3/envs/diffdock/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/rjrich/DiffDock/inference.py", line 318, in
I was finally able to track down the source of the "complex_name" error. I have been using LibreOffice Calc for Linux to create my CSV files. When I opened my CSV file in a text editor, I discovered that all the underscores ("_") had been converted to "+AF8-", so the header read "complex+AF8-name" instead of "complex_name". I then did a search-and-replace to convert all instances of "+AF8-" to "_" and saved the file as CSV from the text editor. After doing this, I was able to run DiffDock-L to dock a set of 24 different ligands into a protein receptor.
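For anyone who hits the same thing: "+AF8-" is how UTF-7 writes an underscore, so the file was presumably exported with the UTF-7 character set rather than UTF-8. Below is a minimal Python sketch of the same repair, assuming the whole file really is UTF-7. The function name repair_utf7_csv is made up for illustration and is not part of DiffDock. If the file is actually UTF-8 and only happens to contain "+" characters, decoding it this way could mangle them, so check the result before using it.

```python
import csv
import io
from pathlib import Path


def repair_utf7_csv(src: Path, dst: Path) -> list:
    """Decode src as UTF-7, write it to dst as UTF-8, and return the header row.

    Hypothetical helper: assumes the CSV was saved with the UTF-7 character set.
    """
    # Under UTF-7, "+AF8-" decodes to "_"; letters, digits and commas are unchanged.
    text = src.read_bytes().decode("utf-7")

    # Write the repaired text back out as UTF-8 without touching the line endings.
    dst.write_bytes(text.encode("utf-8"))

    # Return the first row so the caller can confirm the column names look right.
    return next(csv.reader(io.StringIO(text)), [])
```

Calling it on the original file and a new output path returns the header row, so it is easy to confirm that "complex_name" is present before running inference again.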
However, the output consisted of 234 SDF files with names such as "rank1_confidence-1.04.sdf". What would I need to do to have DiffDock-L generate output files with names corresponding to the ligand names, e.g., "ligand01.sdf" or to the protein and ligand names, e.g., "protein_01_ligand01.sdf"?