
Interpretation of sidechainnet IDs in train split

Open sidnarayanan opened this issue 3 years ago • 1 comments

Hi, thank you for providing such a great and well-documented resource to the field!

I have a hopefully-simple question about some of the sidechainnet IDs I'm seeing in the CASP12 dump. While most IDs have the form <pdb_id>_<chain_num>_<chain_id>, there are a number that look like 2OJ6_d2oj6c1. How do I interpret these? I spot-checked a few on RCSB, and they all contained a single protein (either alone, or as a multimer).

I'm asking because I'd like to align the SCN IDs with UniProt IDs, and UniProt provides a mapping to <pdb_id>:<chain_id>.

sidnarayanan avatar Aug 09 '22 16:08 sidnarayanan

Hello, thanks for your interest! These are ASTRAL IDs; please see #21 for a more complete description.

jonathanking avatar Aug 10 '22 15:08 jonathanking
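
For anyone landing here with the same goal, below is a rough sketch of how the two ID shapes from the question could be split into a PDB ID and chain ID. It is not part of sidechainnet: `parse_scn_id`, `ScnId`, and `to_pdb_chain_key` are hypothetical names, and the example ID `1A0C_1_A` is made up for illustration. It assumes the usual 7-character SCOP/ASTRAL domain identifier layout (a leading letter, the 4-character PDB ID, one chain character, one domain character), where `_` means no chain and `.` means the domain spans several chains. Uppercasing the chain letter is also an assumption, so spot-check the results against RCSB before relying on them.

```python
import re
from typing import NamedTuple, Optional


class ScnId(NamedTuple):
    """Hypothetical container for a parsed train-split ID."""
    pdb_id: str
    chain_id: Optional[str]  # None when the ASTRAL ID names no single chain
    kind: str                # "pdb_chain" or "astral"


# <pdb_id>_<chain_num>_<chain_id>, e.g. the made-up "1A0C_1_A"
_PDB_CHAIN = re.compile(
    r"^(?P<pdb>[0-9A-Za-z]{4})_(?P<num>\d+)_(?P<chain>[0-9A-Za-z]+)$"
)
# <pdb_id>_<astral_id>, e.g. "2OJ6_d2oj6c1"
# Assumed layout of the ASTRAL part: letter + 4-char PDB ID + chain + domain.
_ASTRAL = re.compile(
    r"^(?P<pdb>[0-9A-Za-z]{4})_(?P<sid>[a-z][0-9a-z]{4}[0-9A-Za-z_.][0-9A-Za-z_])$"
)


def parse_scn_id(scn_id: str) -> ScnId:
    """Split a train-split ID into PDB ID and chain ID (best effort)."""
    m = _PDB_CHAIN.match(scn_id)
    if m:
        return ScnId(m["pdb"].upper(), m["chain"], "pdb_chain")
    m = _ASTRAL.match(scn_id)
    if m:
        chain_char = m["sid"][5]  # sixth character of the ASTRAL ID
        # "_" (no chain) and "." (multiple chains) give no single chain ID.
        chain_id = None if chain_char in "_." else chain_char.upper()
        return ScnId(m["pdb"].upper(), chain_id, "astral")
    raise ValueError(f"Unrecognized ID format: {scn_id!r}")


def to_pdb_chain_key(parsed: ScnId) -> Optional[str]:
    """Build a <pdb_id>:<chain_id> key, or None if no single chain is known."""
    if parsed.chain_id is None:
        return None
    return f"{parsed.pdb_id}:{parsed.chain_id}"


if __name__ == "__main__":
    for raw in ("2OJ6_d2oj6c1", "1A0C_1_A"):
        print(raw, "->", to_pdb_chain_key(parse_scn_id(raw)))
```

IDs where the ASTRAL part has no single chain return `None` from `to_pdb_chain_key`, so they can be collected and resolved by hand rather than silently mapped to the wrong chain.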