add pdb matching function

Open LilySnow opened this issue 3 years ago • 1 comments

PDB matching is a very useful step in PDB analysis. It would be nice to add it to pdb2sql.

Expected behavior:

INPUT:

a reference pdb file with multiple chains
a set of pdb files for the same protein complex, but with different residue numbering and chain IDs

OUTPUT:

a chain ID mapping between each pdb file and the reference pdb
pdb files renumbered based on the reference pdb, with chain IDs also changed to match the reference pdb

Ideally, we hope to separate pdb_matching into two functions (steps):

Step 1. pdb_match_chn_batch.py: match the chain IDs of the pdb files to ref.pdb. Output _newChnID.pdb files. Note: this step can be skipped if the chain IDs of the model.pdb files already match the reference. This step is also error-prone when multiple chains are highly similar to each other, so a visual check by a human is necessary.
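To make Step 1 more concrete, here is a rough sketch of how the chain matching could work. It is not taken from either existing solution. It assumes the per-chain sequences have already been extracted as one-letter strings, it uses `difflib.SequenceMatcher` from the standard library as a stand-in for a proper sequence alignment, and the names `similarity`, `match_chains` and `similar_ref_chains` (as well as the 0.9 threshold) are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import permutations


def similarity(seq_a, seq_b):
    """Rough similarity in [0, 1]; a stand-in for a real alignment score."""
    return SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()


def match_chains(ref_seqs, model_seqs):
    """Return {model_chain_id: ref_chain_id} with the highest total similarity.

    Tries every assignment, so it is only practical for a handful of chains.
    """
    ref_ids = list(ref_seqs)
    model_ids = list(model_seqs)
    if len(ref_ids) != len(model_ids):
        raise ValueError("reference and model must have the same number of chains")

    best_score, best_map = -1.0, {}
    for perm in permutations(ref_ids):
        score = sum(
            similarity(model_seqs[m], ref_seqs[r]) for m, r in zip(model_ids, perm)
        )
        if score > best_score:
            best_score, best_map = score, dict(zip(model_ids, perm))
    return best_map


def similar_ref_chains(ref_seqs, threshold=0.9):
    """List pairs of reference chains that are similar enough to be confused.

    The threshold is an arbitrary assumption; any pair returned here is a
    case where the automatic mapping should be checked by eye.
    """
    ids = list(ref_seqs)
    return [
        (a, b)
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
        if similarity(ref_seqs[a], ref_seqs[b]) >= threshold
    ]
```

For example, `match_chains({"A": "MKTAYIAKQR", "B": "GSHMLEDPVV"}, {"X": "GSHMLEDPV", "Y": "MKTAYIAKQR"})` maps `X` to `B` and `Y` to `A`. When `similar_ref_chains` returns any pairs, sequence similarity alone cannot tell those chains apart, which is exactly the case where the visual check is needed.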

Step 2. pdb_renum_batch.py: align each pdb file to ref.pdb and renumber its residues accordingly. Output _renum.pdb files.
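Step 2 could look roughly like the sketch below, for a single chain whose ID already matches the reference. Again this is only an illustration under assumptions: `difflib.SequenceMatcher` stands in for a real alignment, insertion codes are ignored, and the names `renumber_map` and `set_resseq` are hypothetical. The column positions follow the fixed-width PDB ATOM/HETATM record, where the residue number sits in columns 23-26.

```python
from difflib import SequenceMatcher


def renumber_map(ref_seq, ref_numbers, model_seq):
    """For each model residue, return the reference residue number.

    Residues that do not align to the reference get None and need to be
    handled separately (dropped, or numbered by hand).
    """
    if len(ref_seq) != len(ref_numbers):
        raise ValueError("ref_seq and ref_numbers must have the same length")

    new_numbers = [None] * len(model_seq)
    matcher = SequenceMatcher(None, model_seq, ref_seq, autojunk=False)
    # Each block is a run of identical residues: model[a:a+size] == ref[b:b+size].
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            new_numbers[block.a + offset] = ref_numbers[block.b + offset]
    return new_numbers


def set_resseq(atom_line, number):
    """Rewrite the residue number (columns 23-26) of an ATOM/HETATM line.

    Only valid for numbers that fit in four characters.
    """
    return atom_line[:22] + f"{number:>4d}" + atom_line[26:]
```

With a reference sequence `"ACDEFGH"` numbered 10 to 16, `renumber_map("ACDEFGH", list(range(10, 17)), "DEFG")` gives `[12, 13, 14, 15]`, so a model that starts at residue 1 would be shifted to the reference numbering.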

There are two existing solutions:

https://github.com/LilySnow/PDB-matching (python + cpp)
https://github.com/DeepRank/haddock-tools/commit/ed9beee4437a58ecf9dbc7961b38a63cb5b9e282 (python, by the haddock group)

Maybe we could use these solutions as the basis?

Sep 27 '22 09:09 LilySnow

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

Oct 28 '22 04:10 github-actions[bot]