Bad performance when there is no RMSD information

Open Dadiao-shuai opened this issue 2 years ago • 2 comments

I am comparing two experiments: normal training versus training without RMSD. I expected that, as long as the binary label and the affinity label are given, the training would not differ much. However, the RMSD-free training gave bizarre performance: [image]

I used the same arguments and gninatypes files to train the model, starting from crossdock_default2018.caffemodel and using a modified default2018.model. The rmsd column is removed from the RMSD-free types file, which looks like this:

0 3.906 pdb2019_refi_train_gninatypes/4u6w/4u6w_rec.gninatypes redock_default2018_pdbbind_v2019_docked_gninatypes/4u6w_docked_7.gninatypes
1 5.47 pdb2019_refi_train_gninatypes/1gi1/1gi1_rec.gninatypes pdb2019_refi_train_gninatypes/1gi1/1gi1_ligand.gninatypes
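The column removal described above can be sketched as follows. This is a minimal illustration, not the script actually used: it assumes the original types lines have five whitespace-separated columns in the order label, affinity, rmsd, receptor, ligand, and the helper name `drop_rmsd_column` and the sample line are hypothetical.

```python
def drop_rmsd_column(line: str) -> str:
    """Remove the RMSD field from one types-file line.

    Assumes the five-column layout: label affinity rmsd receptor ligand.
    """
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}: {line!r}")
    label, affinity, _rmsd, receptor, ligand = fields
    return " ".join([label, affinity, receptor, ligand])


# Hypothetical input line; the RMSD value 2.5 and the file names are
# invented for illustration only.
example = "0 3.906 2.5 rec.gninatypes lig.gninatypes"
converted = drop_rmsd_column(example)
print(converted)
```

Raising on an unexpected column count keeps a file that has already been converted from being silently truncated a second time.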

This is the model data layer. I commented out the rmsd_true top. In the TEST phase I set has_rmsd to false; in the TRAIN phase I set balanced to true, stratify_receptor to false, and has_rmsd to false:

layer {
  name: "data"
  type: "MolGridData"
  top: "data"
  top: "label"
  top: "affinity"
  # top: "rmsd_true"
  include {
    phase: TEST
  }
  molgrid_data_param {
    source: "TESTFILE"
    batch_size: 50
    dimension: 23.5
    resolution: 0.500000
    shuffle: false
    ligmap: "completelig"
    recmap: "completerec"
    balanced: false
    has_affinity: true
    has_rmsd: false
    root_folder: "DATA_ROOT"
  }
}
  
layer {
  name: "data"
  type: "MolGridData"
  top: "data"
  top: "label"
  top: "affinity"
  # top: "rmsd_true"
  include {
    phase: TRAIN
  }
  molgrid_data_param {
    source: "TRAINFILE"
    batch_size: 50
    dimension: 23.5
    resolution: 0.500000
    shuffle: true
    balanced: true
    jitter: 0.000000
    ligmap: "completelig"
    recmap: "completerec"
    stratify_receptor: false
    stratify_affinity_min: 0
    stratify_affinity_max: 0
    stratify_affinity_step: 1.000000
    has_affinity: true
    has_rmsd: false
    random_rotation: true
    random_translate: 6
    root_folder: "DATA_ROOT"
  }
}

I also deleted the rmsd layer:

layer {
  name: "rmsd"
  type: "AffinityLoss"
  bottom: "affinity_output"
  bottom: "affinity"
  top: "rmsd"
...

Dadiao-shuai avatar Dec 25 '23 01:12 Dadiao-shuai

I have wondered about this as well: how can the gnina model be trained to predict only affinity, without the CNNscore binary label?

Maybe gnina/script/affinity could handle this task, but I am not sure whether it is recommended.

JonasLi-19 avatar Dec 26 '23 16:12 JonasLi-19

It really looks like your prediction is going through a sigmoid (values range from 0 to 1), which an affinity prediction should not do.

dkoes avatar Jan 10 '24 16:01 dkoes
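
To illustrate the sigmoid observation: a sigmoid maps any real input into the open interval (0, 1), so a sigmoided output can never reach affinity labels such as 3.906 or 5.47 from the types lines quoted in the issue. The sketch below is an assumption-laden check, not part of gnina: the raw outputs are invented and `looks_sigmoided` is a hypothetical heuristic.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function; its output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


# Affinity labels copied from the two types-file lines in the issue.
labels = [3.906, 5.47]

# Hypothetical raw network outputs, chosen only for illustration.
raw_outputs = [-4.0, 0.0, 4.0, 10.0]
squashed = [sigmoid(x) for x in raw_outputs]


def looks_sigmoided(predictions, labels):
    """Heuristic: every prediction is in [0, 1] while some label exceeds 1."""
    return all(0.0 <= p <= 1.0 for p in predictions) and max(labels) > 1.0


print(squashed)
print(looks_sigmoided(squashed, labels))
```

If a check like this flags the predicted affinities, it is worth confirming in the model file which blob is written out as the affinity prediction and whether any sigmoid or softmax layer sits between the affinity output and that blob.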