
Help understanding the output txt

Open syedzayyan opened this issue 2 years ago • 6 comments

Hello!

So I ran the model on 130ish compounds to test and then tested 5 compounds. I have an output that is like this:

1 3 2 57

I don't really understand what these numbers mean. Did it select compounds 1 and 2 from the 5 compounds? If so, what do the 3 and 57 mean? The number of conformations that fit?

syedzayyan avatar Feb 23 '23 11:02 syedzayyan

Hello!

The first number is the stereoisomer ID and the second is the conformation ID. With this information you can find the 3D representations of the compounds in your SDF file.
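As a rough illustration of that lookup, here is a minimal Python sketch. It rests on assumptions that are not confirmed in this thread: that the output is a flat sequence of (stereoisomer ID, conformation ID) pairs, and that each SDF record can be identified by its title line. The function names `parse_hits`, `split_sdf` and `find_records` are hypothetical helpers, not part of psearch.

```python
def parse_hits(line):
    """Group whitespace-separated integers into (stereo_id, conf_id) pairs.

    Assumption: values come in pairs, stereoisomer ID first.
    """
    values = [int(token) for token in line.split()]
    if len(values) % 2:
        raise ValueError("expected an even number of values")
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]


def split_sdf(text):
    """Split SDF text into records; each record is terminated by a '$$$$' line."""
    records, current = [], []
    for row in text.splitlines():
        if row.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(row)
    return records


def find_records(records, title):
    """Return records whose first line (the molecule title) equals `title`."""
    return [r for r in records if r.split("\n", 1)[0].strip() == title]


hits = parse_hits("1 3 2 57")
```

How the stereoisomer and conformation IDs are actually stored in the SDF file is not stated here, so the title passed to `find_records` has to be whatever your file really uses.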

Thanks for the feedback on the difficulties in understanding the results. We are already preparing a new, more comfortable version of this tool.

If you have any more questions, I will gladly answer them.

meddwl avatar Feb 23 '23 19:02 meddwl

Just to be clear, those are the compounds from the screening test, in the conformations that got selected as active compound candidates, right?

So when docking to confirm binding, would those selected conformations be the ones to use?

Apologies for the silly question, I am a beginner in this world.

syedzayyan avatar Feb 23 '23 19:02 syedzayyan

It is assumed that these are the conformations in which those compounds should bind to the target, but this is not 100% certain. When docking, you may use this conformation as the starting one, or a random conformation. Either way, the starting conformation will be changed during docking. In some cases the output docking conformation will correspond to the conformation retrieved by the pharmacophore model, but again there is no guarantee.

DrrDom avatar Mar 01 '23 13:03 DrrDom

Thank you so much for the replies. Helped me understand a ton.

syedzayyan avatar Mar 02 '23 10:03 syedzayyan

There are many output files in "screen_results". Initially I was under the impression that they were all the same, since I tried a small database and got only two duplicate files. With a large dataset, however, there are many files, with SMILES IDs and conformer IDs. Why are there so many files, and what do they mean?

syedzayyan avatar Mar 22 '23 18:03 syedzayyan

Also, the paper mentions that the centroids of clusters are selected for training and the rest are used for testing. In this search version, is there any test set, or are all ligands in a cluster selected?

syedzayyan avatar Mar 22 '23 18:03 syedzayyan