scripts Bad performance when there is no RMSD information

Here I compare two experiments: normal training versus training without RMSD. I expected that, as long as the label and the affinity label are given, the training would not differ much. However, the RMSD-free training resulted in bizarre performance:

I used the same args and gninatypes files to train the model from crossdock_default2018.caffemodel using a modified default2018.model. The RMSD column is removed from the RMSD-free types file, which looks like this:

0 3.906 pdb2019_refi_train_gninatypes/4u6w/4u6w_rec.gninatypes redock_default2018_pdbbind_v2019_docked_gninatypes/4u6w_docked_7.gninatypes
1 5.47 pdb2019_refi_train_gninatypes/1gi1/1gi1_rec.gninatypes pdb2019_refi_train_gninatypes/1gi1/1gi1_ligand.gninatypes
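For reference, removing the RMSD column from a types file can be sketched as below. This is only an illustration, assuming the original lines use the five-column layout `label affinity rmsd receptor ligand`. The helper name `strip_rmsd`, the RMSD value 7.25 and the short file names are hypothetical placeholders, not taken from my actual data.

```python
def strip_rmsd(line: str) -> str:
    """Drop the RMSD column from one types-file line.

    Assumes the five-column layout: label affinity rmsd receptor ligand.
    Lines that do not have exactly five fields are returned unchanged
    (apart from surrounding whitespace).
    """
    fields = line.split()
    if len(fields) != 5:
        return line.strip()
    label, affinity, _rmsd, receptor, ligand = fields
    return " ".join([label, affinity, receptor, ligand])


# Hypothetical input line; the RMSD value and file names are made up.
example = "0 3.906 7.25 rec.gninatypes lig.gninatypes"
print(strip_rmsd(example))
```

Lines that already have four columns pass through untouched, so running the helper twice on the same file does no harm.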

And this is the model's data layer. I commented out the rmsd_true top. In the TEST phase I set has_rmsd to false; in the TRAIN phase I set balanced to true, stratify_receptor to false, and has_rmsd to false:

layer {
  name: "data"
  type: "MolGridData"
  top: "data"
  top: "label"
  top: "affinity"
  # top: "rmsd_true"
  include {
    phase: TEST
  }
  molgrid_data_param {
    source: "TESTFILE"
    batch_size: 50
    dimension: 23.5
    resolution: 0.500000
    shuffle: false
    ligmap: "completelig"
    recmap: "completerec"
    balanced: false
    has_affinity: true
    has_rmsd: false
    root_folder: "DATA_ROOT"
  }
}
  
layer {
  name: "data"
  type: "MolGridData"
  top: "data"
  top: "label"
  top: "affinity"
  # top: "rmsd_true"
  include {
    phase: TRAIN
  }
  molgrid_data_param {
    source: "TRAINFILE"
    batch_size: 50
    dimension: 23.5
    resolution: 0.500000
    shuffle: true
    balanced: true
    jitter: 0.000000
    ligmap: "completelig"
    recmap: "completerec"
    stratify_receptor: false
    stratify_affinity_min: 0
    stratify_affinity_max: 0
    stratify_affinity_step: 1.000000
    has_affinity: true
    has_rmsd: false
    random_rotation: true
    random_translate: 6
    root_folder: "DATA_ROOT"
  }
}
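A quick way to double-check which tops are still active after commenting, and what has_rmsd is set to, is a small text scan of the model file. This is a rough sketch that assumes one setting per line and `#` comments at the start of a line, as in the layers above. The helper names `active_tops` and `has_rmsd_flags` are hypothetical, and this is plain regex matching, not a real prototxt parser.

```python
import re


def active_tops(prototxt: str) -> list[str]:
    """Return the names of all uncommented `top:` entries, in file order."""
    return re.findall(r'^[ \t]*top:[ \t]*"([^"]+)"', prototxt, flags=re.MULTILINE)


def has_rmsd_flags(prototxt: str) -> list[str]:
    """Return every uncommented has_rmsd value ('true' or 'false'), in file order."""
    return re.findall(r'^[ \t]*has_rmsd:[ \t]*(true|false)', prototxt, flags=re.MULTILINE)


# Shortened stand-in for one data layer; not the full model file.
sample = '''
layer {
  name: "data"
  top: "data"
  top: "label"
  top: "affinity"
  # top: "rmsd_true"
  molgrid_data_param {
    has_rmsd: false
  }
}
'''

print(active_tops(sample))
print(has_rmsd_flags(sample))
```

Run over the whole model file, the scan also picks up tops of non-data layers, so it only helps to confirm that rmsd_true no longer appears anywhere.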

I also deleted the rmsd layer:

layer {
  name: "rmsd"
  type: "AffinityLoss"
  bottom: "affinity_output"
  bottom: "affinity"
  top: "rmsd"
...

Dec 25 '23 01:12 Dadiao-shuai

I have thought about this question as well: how can the gnina model be trained to predict only affinity, without the CNNscore binary label?

Maybe gnina/script/affinity could solve this task? I am not sure whether that is recommended, though.

Dec 26 '23 16:12 JonasLi-19

It really looks like your prediction is going through a sigmoid (values range from 0 to 1), which an affinity prediction should not do.

Jan 10 '24 16:01 dkoes
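
To illustrate the sigmoid point: a sigmoid maps any real input into the open interval (0, 1), so a squashed output can never reach affinity labels such as the 3.906 and 5.47 in the types lines quoted above. The sketch below is only a sanity check one could run on dumped predictions. The function names and the raw output values are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function; the result is always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def looks_squashed(predictions, low: float = 0.0, high: float = 1.0) -> bool:
    """True if every prediction lies in [low, high], a hint of a sigmoid-like output."""
    return all(low <= p <= high for p in predictions)


affinity_labels = [3.906, 5.47]   # affinity labels from the types lines in this thread
raw_outputs = [-2.0, 0.0, 4.0]    # hypothetical pre-activation values
squashed = [sigmoid(x) for x in raw_outputs]

print(looks_squashed(squashed))         # predictions confined to [0, 1]
print(looks_squashed(affinity_labels))  # labels fall outside [0, 1]
```

If every predicted affinity on the test set falls inside [0, 1] while the labels do not, the affinity output is probably wired through the wrong layer.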