The "skip" issue encountered after running the inference
python -m inference --config default_inference_args.yaml --protein_path ./rec_1.pdb --ligand ./ZINC01535869.mol2 --out_dir ./outache/ --inference_steps 20 --samples_per_complex 40 --batch_size 10 --actual_steps 18 --no_final_step_noise
The following error occurred and no output was produced. Does anyone know how to solve it? Thanks.
Skipping complex_0 because of the error: 'X' HAPPENING | The confidence dataset did not contain ['complex_0']. We are skipping this complex. 1it [00:00, 1.13it/s] Failed for 0 complexes Skipped 1 complexes
I had the same issue when I ran the inference file directly, but somehow it worked when I used app/main.py.
Thanks, I'll try it.
I had a similar issue using DiffDock v1.1.1.
However, it works with the example data provided in https://github.com/gcorso/DiffDock/tree/main/data/1a0q:
python -m inference --config diffdock_default_inference_args.yaml --protein_path 1a0q_protein_processed.pdb --ligand_description 1a0q_ligand.sdf --out_dir test1-1a0q
But it fails with my data:
python -m inference --config diffdock_default_inference_args.yaml --protein_path AF-P00519-F1-model_v4.pdb --ligand_description DB00619.sdf --out_dir AF-P00519-F1-model_v4/DB00619
This raises the following error:
Processing 1 of 1 batches (1 sequences)
0it [00:00, ?it/s]/app/diffdock/datasets/parse_chi.py:91: RuntimeWarning: invalid value encountered in cast
Y = indices.astype(int)
[2024-Apr-27 13:01:30 CEST] WARNING -The test dataset did not contain complex_0 for DB00619.sdf and AF-P00519-F1-model_v4.pdb. We are skipping this complex.
1it [00:00, 1.04it/s]
[2024-Apr-27 13:01:30 CEST] WARNING -
Failed for 0 / 1 complexes.
Skipped 1 / 1 complexes.
Skipping complex_0 because of the error:
Sizes of tensors must match except in dimension 1. Expected size 1130 but got size 1022 for tensor number 1 in the list.
These two files can be used for testing after removing the .txt extension.
Is there anything wrong with my input data? How can I solve this issue?
Further test:
It fails with AF-P00519-F1-model_v4.pdb and 1a0q_ligand.sdf:
python -m inference --config diffdock_default_inference_args.yaml --protein_path AF-P00519-F1-model_v4.pdb --ligand_description 1a0q_ligand.sdf --out_dir AF-P00519-F1-model_v4/DB00619
It raises this error:
Generating ESM language model embeddings
Processing 1 of 1 batches (1 sequences)
0it [00:00, ?it/s]/app/diffdock/datasets/parse_chi.py:91: RuntimeWarning: invalid value encountered in cast
Y = indices.astype(int)
[2024-Apr-27 13:08:53 CEST] WARNING -The test dataset did not contain complex_0 for 1a0q_ligand.sdf and AF-P00519-F1-model_v4.pdb. We are skipping this complex.
1it [00:00, 1.10it/s]
[2024-Apr-27 13:08:53 CEST] WARNING -
Failed for 0 / 1 complexes.
Skipped 1 / 1 complexes.
Skipping complex_0 because of the error:
Sizes of tensors must match except in dimension 1. Expected size 1130 but got size 1022 for tensor number 1 in the list.
But it works with 1a0q_protein_processed.pdb and DB00619.sdf:
python -m inference --config diffdock_default_inference_args.yaml --protein_path 1a0q_protein_processed.pdb --ligand_description DB00619.sdf --out_dir 1a0q_protein/DB00619
This suggests the error comes from the PDB file rather than from the ligand.
Note that I also tried keeping only the lines starting with ATOM in the PDB file:
AF-P00519-F1-model_v4-truncated.pdb.txt
But there is still the same error.
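For anyone who wants to reproduce that truncation step, here is a minimal sketch of it in Python. The function names and file paths are my own choices for illustration, not part of DiffDock, and this is only one way to keep the ATOM records.

```python
def keep_atom_records(pdb_lines):
    """Return only the lines that start with ATOM (drops HETATM, headers, TER, etc.)."""
    return [line for line in pdb_lines if line.startswith("ATOM")]


def truncate_pdb(src_path, dst_path):
    """Write the ATOM records of src_path to dst_path and return how many were kept."""
    with open(src_path) as src:
        kept = keep_atom_records(src)
    with open(dst_path, "w") as dst:
        dst.writelines(kept)
    return len(kept)
```

For example, `truncate_pdb("AF-P00519-F1-model_v4.pdb", "AF-P00519-F1-model_v4-truncated.pdb")`. This kind of filtering leaves the number of residues in the ATOM records unchanged, so it would not be expected to help if the failure depends on protein length.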
The PDB file comes from the AlphaFold website:
- https://alphafold.ebi.ac.uk/entry/P00519
- https://alphafold.ebi.ac.uk/files/AF-P00519-F1-model_v4.pdb
One last test was successful, using a PDB file produced directly by an AlphaFold run I did myself:
python -m inference --config diffdock_default_inference_args.yaml --protein_path ranked_0.pdb --ligand_description DB00619.sdf --out_dir ranked_0/DB00619
This is possibly because of the protein length restriction (1022) on DiffDock inference, which would match the "Expected size 1130 but got size 1022" mismatch in the error message. See issue #199.
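To check whether a given PDB file runs into this, a rough sketch like the following counts residues per chain before running inference. It is not part of DiffDock: the function names are hypothetical, and I am assuming the 1022 limit applies to each chain separately and that the file follows the standard fixed-column PDB layout.

```python
MAX_RESIDUES = 1022  # limit mentioned in this thread; assumed here to be per chain


def count_residues_per_chain(pdb_lines):
    """Count distinct residues per chain using the ATOM records of a PDB file."""
    residues = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        chain_id = line[21:22]                  # column 22: chain identifier
        res_key = (line[22:26], line[26:27])    # columns 23-27: residue number + insertion code
        residues.setdefault(chain_id, set()).add(res_key)
    return {chain: len(keys) for chain, keys in residues.items()}


def chains_over_limit(pdb_lines, limit=MAX_RESIDUES):
    """Return {chain: residue_count} for chains longer than the limit."""
    counts = count_residues_per_chain(pdb_lines)
    return {chain: n for chain, n in counts.items() if n > limit}
```

Both functions accept any iterable of lines, so an open file handle can be passed directly, e.g. `chains_over_limit(open("AF-P00519-F1-model_v4.pdb"))`. An empty result means no chain exceeds the assumed limit.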
Thank you @prathithbhargav for this information, I was not aware of this limitation.