
prepare_dataset issue

Open julianaamorim opened this issue 2 years ago • 4 comments

Hello, I have just installed psearch and all the environment dependencies in conda. I downloaded the acetylcholinesterase (AChE) dataset to run a test, and while preparing the dataset I came across the following error:

Traceback (most recent call last):
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/site-packages/psearch/prepare_dataset.py", line 61, in common
    create_db.main_params(dbout_fname=filenames[4],
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/site-packages/psearch/scripts/create_db.py", line 156, in main_params
    for i, res in enumerate(p.imap_unordered(map_process_mol,
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/multiprocessing/pool.py", line 451, in <genexpr>
    return (item for chunk in result for item in chunk)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/multiprocessing/pool.py", line 873, in next
    raise value
OSError: File error: Invalid input file /home/juliana/Downloads/Ache/compounds/inactive_conf.sdf

Process Process-1:
Traceback (most recent call last):
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/site-packages/psearch/prepare_dataset.py", line 61, in common
    create_db.main_params(dbout_fname=filenames[4],
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/site-packages/psearch/scripts/create_db.py", line 156, in main_params
    for i, res in enumerate(p.imap_unordered(map_process_mol,
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/multiprocessing/pool.py", line 451, in <genexpr>
    return (item for chunk in result for item in chunk)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/juliana/anaconda3/envs/psearch/lib/python3.12/multiprocessing/pool.py", line 873, in next
    raise value
OSError: File error: Invalid input file /home/juliana/Downloads/Ache/compounds/active_conf.sdf

No file in the "compounds" folder was generated... Any suggestions on how to move forward?

Thanks in advance...

Juliana

julianaamorim, Jan 17 '24 17:01
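Since the error names `active_conf.sdf` and `inactive_conf.sdf` as invalid input and the post says nothing was generated in the "compounds" folder, one quick diagnostic is to confirm whether those conformer files exist and are non-empty before the database step runs. The sketch below is not part of psearch; `check_input_files` is a hypothetical helper that uses only the standard library.

```python
from pathlib import Path


def check_input_files(paths):
    """Return a dict mapping each path to 'missing', 'empty', or 'ok'.

    Hypothetical helper for diagnosis only; it is not part of psearch.
    """
    report = {}
    for p in map(Path, paths):
        if not p.exists():
            report[str(p)] = "missing"   # file was never written
        elif p.stat().st_size == 0:
            report[str(p)] = "empty"     # file exists but has no content
        else:
            report[str(p)] = "ok"
    return report
```

Calling it with the two paths from the traceback, for example `check_input_files(["/home/juliana/Downloads/Ache/compounds/active_conf.sdf", "/home/juliana/Downloads/Ache/compounds/inactive_conf.sdf"])`, would show whether the failure comes from an earlier step that never wrote the files.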

Which branch did you install psearch from? You have to use the gen_pharms branch; it is the most recent. Unfortunately, we have not yet fixed all the remaining bugs and merged it into master. Also, I have never used psearch with Python 3.12; the newest version I have used is 3.9. However, that should not be a problem.

DrrDom, Jan 18 '24 08:01
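To see at a glance whether an environment is newer than the Python version mentioned in the reply above (3.9), a small check like the following could be run inside the conda environment. This is only a sketch: `newer_than_tested` is a hypothetical name, and the `(3, 9)` default simply mirrors the version quoted in the reply, not a documented psearch requirement.

```python
import sys


def newer_than_tested(version=None, max_tested=(3, 9)):
    """Return True if the interpreter is newer than the newest version tried.

    `max_tested` defaults to (3, 9), the version mentioned in the thread;
    this is an assumption, not an official compatibility limit.
    """
    version = sys.version_info if version is None else version
    # Compare only (major, minor); patch releases are ignored.
    return tuple(version[:2]) > tuple(max_tested)


if newer_than_tested():
    print("Python", sys.version.split()[0], "is newer than the last version reported as used")
```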

When I run psearch on my current working dataset, it gets stuck at the training set list... I tried adding some checks in select_training_set_rdkit.py to understand the problem, but to no avail...

>> psearch -p my_models_2/created_pharmacophores/ -i beta_2_short.smi -d dbs/beta.dat -c 4
Size of df before generating fingerprints: (101, 3)
Size of df after generating fingerprints: (101, 4)
Size of df_mols before concatenation: (101, 4)
Size of df_mols after concatenation: (101, 5)
100 molecules screened 00:00:01
external_statistics.txt: (0.009s)

Any light?

julianaamorim, Jan 27 '24 23:01

Does it return any pharmacophore models? If not, it may be that it cannot create the training sets. I ran into something similar in the past, and it would be reasonable to implement another modeling mode in which all input ligands are taken as the training set, without the selection step used now. It may also be a bug. If you can share your data set, I can look at it when I have time, but I cannot promise to do so quickly.

DrrDom, Jan 28 '24 17:01

Increasing the number of active compounds solved the problem...

julianaamorim, Mar 14 '24 00:03
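Because the fix turned out to be the number of active compounds, it can help to count the activity labels in the input file before modeling. The sketch below assumes a tab-separated `.smi` file whose third column holds the activity label (for example "active" or "inactive"). That layout is an assumption consistent with the three-column data frame in the log, and `count_activity_labels` is a hypothetical helper, not a psearch function.

```python
import csv
from collections import Counter


def count_activity_labels(lines, label_column=2):
    """Count the values in the label column of tab-separated lines.

    Assumes columns are: SMILES, compound id, activity label (assumption).
    Rows that are too short to contain the label column are skipped.
    """
    counts = Counter()
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) > label_column:
            counts[row[label_column].strip().lower()] += 1
    return counts
```

For the file used in the thread this could be called as `with open("beta_2_short.smi") as fh: print(count_activity_labels(fh))`; a very small count of actives would point to the same cause as reported above.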