Aug 4 2022
ICM User's Guide
9.3 Model Types

9.3.1 Kernel Chemical Classification/Regression (kcc) models


Hybrid Fingerprints

Dataset

IC50, EC50, Ki, and Kd data were downloaded from ChEMBL18, combined, and converted to pKd values. Compounds with pKd > 5 (i.e., active at better than 10 micromolar) were classified as positives. 90% of the ChEMBL18 compounds were assigned to the training set and the remaining 10% to the external test set. A subset of compounds with pKd > 7 against any targets from ChEMBL18 was added to the training set as decoys. The approved drugs were added to the external test set as decoys.
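The dataset preparation described above can be illustrated with a minimal Python sketch. It is not the procedure Molsoft used: the function names, the nanomolar input unit, and the seeded random split are all assumptions made for illustration. Only the pKd > 5 threshold and the 90/10 proportions come from the text.

```python
import math
import random

def to_pkd(activity_nm: float) -> float:
    """Convert an activity in nanomolar (IC50, EC50, Ki or Kd) to -log10(molar)."""
    if activity_nm <= 0:
        raise ValueError("activity must be positive")
    # 1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(activity_nm)

def is_positive(pkd: float, threshold: float = 5.0) -> bool:
    """Positives are compounds with pKd strictly above the threshold (5 = 10 micromolar)."""
    return pkd > threshold

def split_train_test(compound_ids, test_fraction=0.1, seed=0):
    """Hypothetical random split; the guide does not say how compounds were assigned."""
    ids = list(compound_ids)
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]

if __name__ == "__main__":
    pkd = to_pkd(250.0)  # a 250 nM compound
    print(round(pkd, 2), is_positive(pkd))
```

For the dfz models, which use an 80/20 split, the same sketch would be called with `test_fraction=0.2`.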

Training

A Kernel Chemical Classification/Regression (kcc) model was trained using the training set compounds. The performance of the model was evaluated using the external test set.

Predicted Value

The kcc model returns two scores:

The kcc score is the classification score: The positives have a median kcc score of 1. The decoys have a median kcc score of 0. The kcc score is renamed to MolClass in the pairwise table.

The kca score is the regression score, i.e. the predicted pKd value. A kca score of 6 indicates micromolar (1 micromolar) activity. The kca score is renamed to MolpKd in the pairwise table.

In the pairwise table, a pPvalue is calculated for both the kcc and the kca score. The larger of the two is taken as the final pPvalue for that model. The pPvalue is the -log10 of the probability that the compound belongs to the random decoys. A pPvalue of 1 indicates that the compound is comparable to a decoy at the 90th percentile. A pPvalue of 2 indicates that the compound is comparable to a decoy at the 99th percentile.
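One way to read the pPvalue definition is as an empirical rank against the decoy score distribution. The sketch below follows that reading. It is an assumption, not the documented ICM calculation: the guide does not say how the probability is estimated, and the function names and the 1/N floor for compounds that beat every decoy are hypothetical.

```python
import math
from bisect import bisect_left

def p_pvalue(score: float, decoy_scores) -> float:
    """-log10 of the fraction of decoys scoring at least as high as `score`."""
    ranked = sorted(decoy_scores)
    n = len(ranked)
    if n == 0:
        raise ValueError("need at least one decoy score")
    n_at_least = n - bisect_left(ranked, score)
    # Assumption: floor the count at 1 so a compound above all decoys
    # gets the largest finite pPvalue, log10(n), instead of infinity.
    p = max(n_at_least, 1) / n
    return -math.log10(p)

def model_p_pvalue(class_score, class_decoys, pkd_score, pkd_decoys) -> float:
    """Final pPvalue for a model: the larger of the two per-score pPvalues."""
    return max(p_pvalue(class_score, class_decoys),
               p_pvalue(pkd_score, pkd_decoys))
```

With 100 decoys, a compound that ties the 10th best decoy gets p = 0.1 and a pPvalue of 1, matching the 90th percentile reading in the text.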

9.3.2 Docking to Ligand Field Z-Score (dfz) models


APF

Dataset

If data are available from ChEMBL18: IC50, EC50, Ki, and Kd data were downloaded from ChEMBL18, combined, and converted to pKd values. Compounds with pKd > 5 (i.e., active at better than 10 micromolar) were classified as positives. 80% of the ChEMBL18 compounds were assigned to the training set and the remaining 20% to the external test set. A subset of compounds with pKd > 7 against any targets from ChEMBL18 was added to the training set as decoys. The approved drugs were added to the external test set as decoys.

If ChEMBL data are not available, approved drugs were used as decoys to calculate the Z-Score.

Training

The dfz model was trained in the following way:

1. For any target, all of its associated mammalian pocketome entries were used.
2. If ChEMBL data are not available, the pocketome entry with the highest number of co-crystallized ligands was used.
3. If a ChEMBL training set is available, it was docked to each of the pocketome entries using the APF method.
4. The best combination of template clusters was selected to maximize differentiation between actives and decoys.
5. The Z-Score was calculated using the mean and standard deviation of the APF scores of the approved drugs.

Predicted Value

The dfz model returns one score:

The dfz score is the Z-Score: a score of 1 means the compound is 1 standard deviation above the mean score of the approved-drug decoys. The dfz score is renamed to MolZScore in the pairwise table.

In the pairwise table, a pPvalue is calculated for the Z-Score. The pPvalue is the -log10 of the probability that the compound belongs to the random decoys. A pPvalue of 1 indicates that the compound is comparable to a decoy at the 90th percentile. A pPvalue of 2 indicates that the compound is comparable to a decoy at the 99th percentile.
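The Z-Score is a standard normalization against the approved-drug score distribution. A minimal sketch follows. The function name is hypothetical, and the use of the population standard deviation is an assumption, since the guide does not say whether the sample or population form is used. The sign convention follows the text: a higher score is further above the decoy mean.

```python
from statistics import mean, pstdev

def z_score(apf_score: float, drug_apf_scores) -> float:
    """Standard deviations by which `apf_score` exceeds the approved-drug mean."""
    mu = mean(drug_apf_scores)
    sigma = pstdev(drug_apf_scores)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("decoy scores have zero variance")
    return (apf_score - mu) / sigma

if __name__ == "__main__":
    drugs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up scores, mean 5, sd 2
    print(z_score(7.0, drugs))
```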

9.3.3 Docking to Ligand Field Classification/Regression (dfa) models


Hybrid 4D/2D

Dataset

IC50, EC50, Ki, and Kd data were downloaded from ChEMBL18, combined, and converted to pKd values. Compounds with pKd > 5 (i.e., active at better than 10 micromolar) were classified as positives. 90% of the ChEMBL18 compounds were assigned to the training set and the remaining 10% to the external test set. A subset of compounds with pKd > 7 against any targets from ChEMBL18 was added to the training set as decoys. The approved drugs were added to the external test set as decoys.

Training

The dfa model is trained in the following steps:

1. Either:
   a. The training set was docked to the 4D maps of all the pocketome entries associated with the target.
   b. If no pocketome entry is available, the training set compounds were aligned in 3D.
2. The training set compounds were combined with the pocketome co-crystallized ligands (if available) and clustered by APF.
3. A subset of ligands was selected from each cluster as the APF template.
4. All training set compounds were docked to all clusters using the APF method.
5. The best combination of clusters was selected to maximize recognition of actives from decoys.
6. For each selected cluster, a pKd regression model was trained using the 3D poses of the ligands above a certain APF score cutoff.

Predicted Value

Any compound is predicted using the dfa models in the following way: The compound is docked using the APF method to each of the template clusters. The compound is assigned to the cluster that gives the highest normalized APF score. The 3D regression model of that cluster is then used to predict the pKd value of the compound, provided its APF score is within the score cutoff.
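The prediction flow above can be sketched as a small dispatch routine. Everything here is hypothetical scaffolding: `TemplateCluster`, its fields, and the `dock` and `predict_pkd` callables stand in for the APF docking and 3D regression steps, which are performed by ICM and are not reproduced. The sketch also assumes that "within the score cutoff" means a normalized APF score at or above the cutoff.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

@dataclass
class TemplateCluster:
    name: str
    score_cutoff: float                  # minimum normalized APF score (assumption)
    dock: Callable[[str], float]         # compound -> normalized APF score
    predict_pkd: Callable[[str], float]  # compound -> pKd from the 3D regression model

def predict_dfa(compound: str,
                clusters: Sequence[TemplateCluster]) -> Tuple[str, Optional[float]]:
    """Return (assigned cluster name, predicted pKd or None if below the cutoff)."""
    if not clusters:
        raise ValueError("at least one template cluster is required")
    # Dock to every template cluster and keep the best-scoring one.
    scored = [(cluster.dock(compound), cluster) for cluster in clusters]
    best_score, best = max(scored, key=lambda pair: pair[0])
    # The regression model is applied only when the score passes the cutoff.
    if best_score < best.score_cutoff:
        return best.name, None
    return best.name, best.predict_pkd(compound)
```

Returning `None` for compounds that miss the cutoff keeps "no prediction" distinct from a low predicted pKd.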

The dfa model returns two scores:

The dfc score is the classification score: The positives have a median dfc score of 1. The decoys have a median dfc score of 0. The dfc score is renamed to MolClass in the pairwise table.

The dfa score is the regression score, i.e. the predicted pKd value. A dfa score of 6 indicates micromolar (1 micromolar) activity. The dfa score is renamed to MolpKd in the pairwise table.

In the pairwise table, a pPvalue is calculated for both the dfc and the dfa score. The larger of the two is taken as the final pPvalue for that model. The pPvalue is the -log10 of the probability that the compound belongs to the random decoys. A pPvalue of 1 indicates that the compound is comparable to a decoy at the 90th percentile. A pPvalue of 2 indicates that the compound is comparable to a decoy at the 99th percentile.

9.3.4 Docking to Protein Pocket Classification/Regression (dpc) models


Hybrid Docking

Dataset

IC50, EC50, Ki, and Kd data were downloaded from ChEMBL18, combined, and converted to pKd values. Compounds with pKd > 5 (i.e., active at better than 10 micromolar) were classified as positives. 90% of the ChEMBL18 compounds were assigned to the training set and the remaining 10% to the external test set. A subset of compounds with pKd > 7 against any targets from ChEMBL18 was added to the training set as decoys. The approved drugs were added to the external test set as decoys.

Training

The dpc model is trained in the following steps:

1. The pocketome entry with the most co-crystallized ligands associated with the target was selected.
2. The residues around the pocket were clustered, and selected representative PDBs were retained.
3. All training set compounds were docked to the 4D maps of the pocket in the presence of the co-crystallized ligands in the form of an APF template.
4. A score cutoff that maximizes sensitivity and accuracy was selected.
5. All compounds within that score cutoff were used to train a pKd prediction model using their 3D poses.
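Step 4 can be illustrated with a simple threshold scan. The guide does not state how sensitivity and accuracy are combined or in which direction the docking score runs, so this sketch makes two explicit assumptions: the objective is the plain sum of sensitivity and accuracy, and a lower docking score is better, so a compound passes when its score is at or below the cutoff. The function name is hypothetical.

```python
def select_score_cutoff(scores, is_active):
    """Pick the cutoff maximizing sensitivity + accuracy (assumed objective)."""
    scores = list(scores)
    is_active = list(is_active)
    n = len(scores)
    n_pos = sum(is_active)
    if n == 0 or n_pos == 0:
        raise ValueError("need at least one compound and one active")
    best_cutoff, best_objective = None, float("-inf")
    for cutoff in sorted(set(scores)):
        # Assumption: lower docking score is better, so "pass" means score <= cutoff.
        passed = [s <= cutoff for s in scores]
        tp = sum(p and a for p, a in zip(passed, is_active))
        tn = sum((not p) and (not a) for p, a in zip(passed, is_active))
        sensitivity = tp / n_pos
        accuracy = (tp + tn) / n
        objective = sensitivity + accuracy
        if objective > best_objective:  # ties keep the stricter (lower) cutoff
            best_cutoff, best_objective = cutoff, objective
    return best_cutoff
```

If higher scores were better in a given setup, the comparison in `passed` would be flipped.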

Predicted Value

Any compound is predicted using the dpc models in the following way: The compound is docked to the 4D maps of the pocket in the presence of the APF template. The 3D regression model of the pocket is then used to predict the pKd value of the compound, provided its docking score is within the score cutoff.

The dpc model returns two scores:

The dpc score is the classification score: The positives have a median dpc score of 1. The decoys have a median dpc score of 0. The dpc score is renamed to MolClass in the pairwise table.

The dpa score is the regression score, i.e. the predicted pKd value. A dpa score of 6 indicates micromolar (1 micromolar) activity. The dpa score is renamed to MolpKd in the pairwise table.

In the pairwise table, a pPvalue is calculated for both the dpc and the dpa score. The larger of the two is taken as the final pPvalue for that model. The pPvalue is the -log10 of the probability that the compound belongs to the random decoys. A pPvalue of 1 indicates that the compound is comparable to a decoy at the 90th percentile. A pPvalue of 2 indicates that the compound is comparable to a decoy at the 99th percentile.


Copyright© 1989-2020, Molsoft,LLC - All Rights Reserved.
This document contains proprietary and confidential information of Molsoft, LLC.
The content of this document may not be disclosed to third parties, copied or duplicated in any form,
in whole or in part, without the prior written permission from Molsoft, LLC.