Aug 4 2022
Hybrid Fingerprints

Dataset
IC50, EC50, Ki and Kd data were downloaded from ChEMBL18, combined, and converted to pKd values. Compounds with pKd > 5 (i.e. more potent than 10 microMolar) were classified as positives. 90% of the ChEMBL18 compounds were assigned to the training set. The remaining 10% were assigned to the external test set. A subset of compounds with pKd > 7 against any target from ChEMBL18 was added to the training set as decoys. The approved drugs were added to the external test set as decoys.

Training
A Kernel Chemical Classification/Regression (kcc) model was trained on the training set compounds. The performance of the model was evaluated on the external test set.

Predicted Value
The kcc model returns two scores:
The kcc score is the classification score. Positives have a median kcc score of 1. Decoys have a median kcc score of 0. The kcc score is renamed to MolClass in the pairwise table.
The kca score is the regression score of the pKd value. A kca score of 6 indicates microMolar activity. The kca score is renamed to MolpKd in the pairwise table.
In the pairwise table, a pPvalue is calculated for both the kcc and the kca score. The maximum of the two is taken as the final pPvalue for the model. The pPvalue is the -Log of the probability that the compound belongs to the random decoys. A pPvalue of 1 indicates the compound is comparable to the 90th percentile decoy. A pPvalue of 2 indicates the compound is comparable to the 99th percentile decoy.
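The pPvalue described above can be illustrated with a minimal Python sketch. The source does not say how the probability is estimated, so the empirical estimate below (fraction of decoys scoring at least as high, with add-one smoothing) is an assumption, and the function names are hypothetical.

```python
import math

def p_pvalue(score, decoy_scores):
    """Empirical -log10 of the probability that `score` comes from the decoys.

    Assumption of this sketch: the probability is estimated as the fraction
    of decoys scoring at least as high as the compound, with add-one
    smoothing so that the logarithm is always defined.
    """
    n_at_least = sum(1 for d in decoy_scores if d >= score)
    p = (n_at_least + 1) / (len(decoy_scores) + 1)
    return -math.log10(p)

def final_p_pvalue(classification_pp, regression_pp):
    """The final pPvalue of a model is the maximum of its two pPvalues."""
    return max(classification_pp, regression_pp)

def percentile_from_p_pvalue(pp):
    """Decoy percentile implied by a pPvalue: 1 -> 90, 2 -> 99."""
    return 100.0 * (1.0 - 10.0 ** (-pp))
```

For example, a compound that outscores all but 9 of 99 decoys gets a pPvalue of about 1 under this estimate, which corresponds to the 90th percentile of the decoys.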
APF

Dataset
If data is available from ChEMBL18:
IC50, EC50, Ki and Kd data were downloaded from ChEMBL18, combined, and converted to pKd values. Compounds with pKd > 5 (i.e. more potent than 10 microMolar) were classified as positives. 80% of the ChEMBL18 compounds were assigned to the training set. The remaining 20% were assigned to the external test set. A subset of compounds with pKd > 7 against any target from ChEMBL18 was added to the training set as decoys. The approved drugs were added to the external test set as decoys.
If ChEMBL data is not available, the approved drugs were used as decoys to calculate the Z-Score.
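The conversion to pKd, the pKd > 5 labeling, and the 80%/20% split described in this Dataset section can be sketched as follows. This is an illustration only: the assumption that activities arrive in nanoMolar, the random shuffle used for the split, and all function names are hypothetical and not taken from the source.

```python
import math
import random

def pkd_from_nanomolar(value_nm):
    """Convert an activity value in nM (assumed unit) to pKd = -log10(molar)."""
    return -math.log10(value_nm * 1e-9)

def is_positive(pkd, threshold=5.0):
    """Compounds with pKd above 5 (more potent than 10 microMolar) are positives."""
    return pkd > threshold

def split_train_test(compounds, train_fraction=0.8, seed=0):
    """Randomly assign compounds to a training set and an external test set.

    A seeded random shuffle is an assumption; the source does not say how
    compounds were assigned.
    """
    shuffled = list(compounds)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]
```

With these definitions a 1 microMolar compound (1000 nM) has a pKd of about 6 and is labeled positive, while a 100 microMolar compound has a pKd of about 4 and is not.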
Training
The dfz model was trained in the following way:
1. For any target, all of its associated mammalian pocketome entries were used.
2. If ChEMBL data is not available, the pocketome entry with the highest number of co-crystallized ligands was used.
3. If a ChEMBL training set is available, it was docked to each pocketome entry using the APF method.
4. The best combination of template clusters was selected to maximize differentiation between actives and decoys.
5. The Z-Score was calculated using the mean and standard deviation of the APF scores of the approved drugs.
Predicted Value
The dfz model returns one score:
The dfz score is the Z-Score. A score of 1 means the compound is 1 standard deviation above the mean score of the approved drug decoys. The dfz score is renamed to MolZScore in the pairwise table.
In the pairwise table, a pPvalue is calculated for the Z-Score. The pPvalue is the -Log of the probability that the compound belongs to the random decoys. A pPvalue of 1 indicates the compound is comparable to the 90th percentile decoy. A pPvalue of 2 indicates the compound is comparable to the 99th percentile decoy.
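The Z-Score returned by the dfz model can be written out as a short Python sketch. The function name is hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption, since the source does not specify it.

```python
import statistics

def z_score(apf_score, approved_drug_scores):
    """Number of standard deviations `apf_score` lies above the mean APF
    score of the approved drug decoys.

    Assumption: sample standard deviation (statistics.stdev) is used.
    """
    mean = statistics.mean(approved_drug_scores)
    sd = statistics.stdev(approved_drug_scores)
    return (apf_score - mean) / sd
```

For approved drug scores of 0, 2 and 4 (mean 2, sample standard deviation 2), a compound scoring 4 has a Z-Score of 1.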
Hybrid 4D/2D

Dataset
IC50, EC50, Ki and Kd data were downloaded from ChEMBL18, combined, and converted to pKd values. Compounds with pKd > 5 (i.e. more potent than 10 microMolar) were classified as positives. 90% of the ChEMBL18 compounds were assigned to the training set. The remaining 10% were assigned to the external test set. A subset of compounds with pKd > 7 against any target from ChEMBL18 was added to the training set as decoys. The approved drugs were added to the external test set as decoys.
Training
The dfa model is trained in the following steps:
1. Either:
   a. The training set was docked to the 4D maps of all the pocketome entries associated with the target.
   b. If no pocketome entry is available, the training set compounds were aligned in 3D.
2. The training set compounds were combined with the pocketome co-crystallized ligands (if available) and clustered in APF.
3. A subset of ligands was selected from each cluster as the APF template.
4. All training set compounds were docked to all clusters using the APF method.
5. The best combination of clusters was selected to maximize recognition of actives from decoys.
6. For each selected cluster, a pKd regression model was trained using the 3D poses of the ligands above a certain APF score cutoff.
Predicted Value
Any compound is predicted using the dfa models in the following way:
The compound is docked to each of the template clusters using the APF method.
The compound is assigned to the cluster that gives the highest normalized APF score.
The 3D regression model of that cluster is then used to predict the pKd value of the compound if the APF score is within the score cutoff.
The dfa model returns two scores:
The dfc score is the classification score. Positives have a median dfc score of 1. Decoys have a median dfc score of 0. The dfc score is renamed to MolClass in the pairwise table.
The dfa score is the regression score of the pKd value. A dfa score of 6 indicates microMolar activity. The dfa score is renamed to MolpKd in the pairwise table.
In the pairwise table, a pPvalue is calculated for both the dfc and the dfa score. The maximum of the two is taken as the final pPvalue for the model. The pPvalue is the -Log of the probability that the compound belongs to the random decoys. A pPvalue of 1 indicates the compound is comparable to the 90th percentile decoy. A pPvalue of 2 indicates the compound is comparable to the 99th percentile decoy.
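The dfa prediction flow (assign the compound to the best-scoring cluster, then apply that cluster's regression model only if the score passes the cutoff) can be sketched in Python. This is a hedged illustration: the docking itself is not shown, the data structures and the function name are hypothetical, and "within the score cutoff" is interpreted as "at or above the cutoff", in line with the training step that uses ligands above the APF score cutoff.

```python
from typing import Callable, Dict, Optional, Tuple

def predict_with_clusters(
    normalized_apf: Dict[str, float],
    poses: Dict[str, object],
    regressors: Dict[str, Callable[[object], float]],
    score_cutoff: float,
) -> Tuple[str, Optional[float]]:
    """Return (assigned cluster, predicted pKd or None).

    normalized_apf: normalized APF score of the compound per template cluster
    poses:          docked 3D pose of the compound per cluster (opaque here)
    regressors:     per-cluster pKd regression model, called on a pose
    """
    # Assign the compound to the cluster with the highest normalized APF score.
    best = max(normalized_apf, key=normalized_apf.get)
    # Assumption: a score below the cutoff means no pKd prediction is made.
    if normalized_apf[best] < score_cutoff:
        return best, None
    return best, regressors[best](poses[best])
```

Returning None when the score falls outside the cutoff keeps the "no prediction" case explicit instead of reporting a pKd from a pose the regression model was never trained to handle.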
Hybrid Docking

Dataset
IC50, EC50, Ki and Kd data were downloaded from ChEMBL18, combined, and converted to pKd values. Compounds with pKd > 5 (i.e. more potent than 10 microMolar) were classified as positives. 90% of the ChEMBL18 compounds were assigned to the training set. The remaining 10% were assigned to the external test set. A subset of compounds with pKd > 7 against any target from ChEMBL18 was added to the training set as decoys. The approved drugs were added to the external test set as decoys.
Training
The dpc model is trained in the following steps:
1. The pocketome entry with the most co-crystallized ligands associated with the target is selected.
2. The residues around the pocket were clustered, and selected representative PDBs were retained.
3. All training set compounds were docked to the 4D maps of the pocket in the presence of the co-crystallized ligands in the form of an APF template.
4. A score cutoff was selected to maximize sensitivity and accuracy.
5. All compounds within that score cutoff were used to train a pKd prediction model using their 3D poses.
Predicted Value
Any compound is predicted using the dpc models in the following way:
The compound is docked to the 4D maps of the pocket in the presence of the APF template.
The 3D regression model of the pocket is then used to predict the pKd value of the compound if the docking score is within the score cutoff.
The dpc model returns two scores:
The dpc score is the classification score. Positives have a median dpc score of 1. Decoys have a median dpc score of 0. The dpc score is renamed to MolClass in the pairwise table.
The dpa score is the regression score of the pKd value. A dpa score of 6 indicates microMolar activity. The dpa score is renamed to MolpKd in the pairwise table.
In the pairwise table, a pPvalue is calculated for both the dpc and the dpa score. The maximum of the two is taken as the final pPvalue for the model. The pPvalue is the -Log of the probability that the compound belongs to the random decoys. A pPvalue of 1 indicates the compound is comparable to the 90th percentile decoy. A pPvalue of 2 indicates the compound is comparable to the 99th percentile decoy.
Copyright © 1989-2020, Molsoft, LLC - All Rights Reserved.
This document contains proprietary and confidential information of Molsoft, LLC. The content of this document may not be disclosed to third parties, copied or duplicated in any form, in whole or in part, without the prior written permission from Molsoft, LLC.