add pdb matching function
pdb matching is a very useful step in pdb analysis. It would be nice to add it to pdb2sql.
Expected behavior:
INPUT:
- a reference pdb file with multiple chains
- a set of pdb files for the same protein complex but with different numbering and chain IDs
OUTPUT:
- chain ID mapping
- pdb files renumbered based on the reference pdb. Chain IDs are also changed based on the reference pdb
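To make the chain ID mapping output concrete, here is a minimal sketch, not an existing pdb2sql API. It assumes the per-chain one-letter sequences have already been extracted from each pdb file. The function name `match_chains` is hypothetical, and `difflib.SequenceMatcher` is only a standard-library stand-in for a proper sequence alignment score.

```python
from difflib import SequenceMatcher
from itertools import permutations


def match_chains(ref_seqs, model_seqs):
    """Return {model_chain_id: ref_chain_id} with the highest total similarity.

    ref_seqs and model_seqs map chain IDs to one-letter sequences.
    Hypothetical helper: brute force over permutations, so it is only
    practical for complexes with a handful of chains.
    """
    ref_ids = sorted(ref_seqs)
    model_ids = sorted(model_seqs)
    if len(ref_ids) != len(model_ids):
        raise ValueError("reference and model must have the same number of chains")

    def score(model_id, ref_id):
        # Similarity ratio in [0, 1]; a stand-in for a real alignment score.
        matcher = SequenceMatcher(
            None, model_seqs[model_id], ref_seqs[ref_id], autojunk=False
        )
        return matcher.ratio()

    # Try every assignment of reference chains to model chains.
    best = max(
        permutations(ref_ids),
        key=lambda perm: sum(score(m, r) for m, r in zip(model_ids, perm)),
    )
    return dict(zip(model_ids, best))


ref = {"A": "MKTAYIAKQR", "B": "GSHMLEDPVV"}
model = {"X": "GSHMLEDPV", "Y": "MKTAYIAKQR"}
print(match_chains(ref, model))  # {'X': 'B', 'Y': 'A'}
```

When two chains are nearly identical (for example in a homodimer), several assignments score almost the same, so the result of a sketch like this still needs a visual check.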
Ideally, we hope to separate pdb_matching into two functions (steps):
Step 1. pdb_match_chn_batch.py: match chain IDs of pdb files to ref.pdb. Output _newChnID.pdb files. Note: This step can be skipped if the chain IDs of the model.pdb files already match ref.pdb. This step is also error-prone when multiple chains are highly similar to each other, so a human visual check is necessary.
Step 2. pdb_renum_batch.py: align and renumber pdb files to ref.pdb. Output _renum.pdb files.
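The renumbering in Step 2 could look roughly like the sketch below. It is an illustration under assumptions, not the method used by either existing solution: `renumber_map` and `rewrite_atom_line` are hypothetical names, and `difflib.SequenceMatcher` replaces a real pairwise alignment (such as Needleman-Wunsch), so it only pairs identical residues. The column positions follow the fixed-width pdb ATOM record (chain ID in column 22, residue number in columns 23-26).

```python
from difflib import SequenceMatcher


def renumber_map(ref_seq, ref_nums, model_seq):
    """Map each model residue index to a reference residue number.

    ref_seq and model_seq are one-letter sequences of one matched chain pair,
    ref_nums holds the reference residue number for each position of ref_seq.
    Model residues without an identical aligned partner map to None.
    """
    if len(ref_seq) != len(ref_nums):
        raise ValueError("ref_seq and ref_nums must have the same length")
    mapping = [None] * len(model_seq)
    matcher = SequenceMatcher(None, model_seq, ref_seq, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[block.a + k] = ref_nums[block.b + k]
    return mapping


def rewrite_atom_line(line, new_chain, new_resnum):
    """Return an ATOM/HETATM line with the chain ID and residue number replaced."""
    # 0-based slices: line[21] is the chain ID, line[22:26] the residue number.
    return line[:21] + new_chain + f"{new_resnum:>4d}" + line[26:]


# The reference chain starts at residue 10; the model lacks the first residue.
print(renumber_map("MKTAYIAK", list(range(10, 18)), "KTAYIA"))
# [11, 12, 13, 14, 15, 16]
```

Residues that map to `None` (insertions or mutations relative to ref.pdb) need an explicit policy, for example dropping them or reporting them, before the _renum.pdb file is written.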
There are two existing solutions:
- https://github.com/LilySnow/PDB-matching (python + cpp)
- https://github.com/DeepRank/haddock-tools/commit/ed9beee4437a58ecf9dbc7961b38a63cb5b9e282 (python, by the haddock group)
Maybe we could use these solutions as a basis?