MolScreen Contains a Panel of >2500 High Quality 2D and 3D Models.
MolScreen is a set of high quality 2D fingerprint and 3D pharmacophore models for a broad range of pharmacology and toxicology targets.
The models can be used for lead discovery or counter screening. They use MolSoft's 2D QSAR/Fingerprint and 3D Atomic Property Fields (Totrov 2008) methods. There are currently approximately 2500 models for 1200 targets.
The models can be screened directly using MolSoft's ICM-Pro + VLS software. Alternatively, we can screen a set of chemicals for you via our contract research services. Please contact us for more information about how to use MolScreen.
MolScreen Applications
MolScreen can be used for:
- Target Identification - Search chemicals against a set of protein targets.
- Lead Identification - Identify chemicals that can bind against a protein target.
- Profiling - Multiple protein targets versus multiple chemicals.
- Drug Repurposing - Search for new protein targets for available drugs.
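The four applications above are different views of the same chemical-by-target score matrix. The following is a minimal sketch of that idea, not MolScreen output: the compound names, target names, scores, and helper functions are all hypothetical, and it assumes only that a higher score means a more likely binder.

```python
# Hypothetical scores[chemical][target]; higher is assumed to mean "more likely active".
scores = {
    "cpd_A": {"TARGET1": 0.91, "TARGET2": 0.12, "TARGET3": 0.40},
    "cpd_B": {"TARGET1": 0.05, "TARGET2": 0.77, "TARGET3": 0.63},
    "cpd_C": {"TARGET1": 0.58, "TARGET2": 0.30, "TARGET3": 0.88},
}


def rank_targets(scores, chemical):
    """Target identification / repurposing: targets for one chemical, best first."""
    row = scores[chemical]
    return sorted(row, key=row.get, reverse=True)


def rank_chemicals(scores, target):
    """Lead identification: chemicals for one target, best first."""
    return sorted(scores, key=lambda c: scores[c][target], reverse=True)


def hits(scores, threshold):
    """Profiling: every (chemical, target) pair scoring at or above the threshold."""
    return sorted(
        (chemical, target)
        for chemical, row in scores.items()
        for target, score in row.items()
        if score >= threshold
    )
```

Reading the matrix by row gives target identification and repurposing, reading it by column gives lead identification, and thresholding the whole matrix gives a profile.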
Available Models
You can download and view the available models using the links below (updated 3/8/2019). Each model name starts with the three-letter abbreviation of the model type, as described below, followed by the gene name.
Model Types
There are two categories of models:
1. ADMET (mcp)
- CACO2, hERG, HALFLIFE, LD50, CYP, Tox21, etc.
- Property models, either regression or classification
2. Different types of Activity Models
- Approximately 2500 models against 1200 targets.
- Machine Learning (kcc), Ligand Field - 3D Atomic Properties Field (dfz), 4D Docking/3D-QSAR (dpc), 3D APF/3D-QSAR (dfa)
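Because each model name begins with a three-letter type abbreviation followed by the gene name, a name can be split mechanically. The sketch below is illustrative only: the separator between prefix and gene name is not stated on this page, so the code assumes an optional underscore, hyphen, dot, or space, and the example name `kcc_DRD2` is hypothetical.

```python
# Prefixes and descriptions taken from the model type list above.
MODEL_TYPES = {
    "mcp": "ADMET / chemical property",
    "kcc": "Machine learning (2D fingerprint)",
    "dfz": "Ligand field (3D Atomic Property Fields)",
    "dpc": "4D docking / 3D-QSAR",
    "dfa": "3D APF / 3D-QSAR",
}


def split_model_name(name):
    """Split a model name into (prefix, gene, type description).

    Assumption: the first three characters are the type prefix and any
    separator character that follows is not part of the gene name.
    """
    prefix, rest = name[:3], name[3:]
    gene = rest.lstrip("_-. ")
    return prefix, gene, MODEL_TYPES.get(prefix.lower(), "unknown")


print(split_model_name("kcc_DRD2"))  # hypothetical model name
```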
About the Models
Machine Learning Models - Hybrid 2D QSAR/Fingerprint Models kcc (+kca)
kcc(+kca): Kernel regression Chemical fingerprint Classification/Activity prediction
- Currently: 999 mammalian models
- Training set: ChEMBL Ki, IC50, EC50
- Reports a kcc (classification) score and a kca (pKd regression) score
- Median training set: 370 ligands
- Median external test set AUC: 96%
- Median external test set Q2: 0.5
- Extremely fast (thousands of cpds in min)
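For readers unfamiliar with the two validation metrics quoted for these models, here is a minimal sketch of how AUC and Q2 are commonly computed on an external test set. It is not MolSoft's validation code. The Q2 shown uses the mean of the test-set observations as its baseline, which is one of several conventions in use; the page does not say which one was applied.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve: the fraction of (active, decoy) pairs in which
    the active scores higher, with ties counted as half."""
    actives = [s for label, s in zip(labels, scores) if label == 1]
    decoys = [s for label, s in zip(labels, scores) if label == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = sum(
        1.0 if a > d else 0.5 if a == d else 0.0
        for a in actives
        for d in decoys
    )
    return wins / (len(actives) * len(decoys))


def q2(observed, predicted):
    """Predictive Q2 = 1 - PRESS / total sum of squares on held-out data."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / total


# Toy data: 1 = active, 0 = decoy.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
print(q2([5.0, 6.0, 7.0], [5.2, 5.9, 6.8]))
```

An AUC of 1.0 means every active outranks every decoy and 0.5 is random ranking. A Q2 of 1.0 is a perfect pKd prediction and 0 is no better than predicting the mean.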
Training:
- Cluster Actives by fingerprint
- Add 40k ChEMBL actives as decoys
- Kernel function to each cluster -> probability score (kcc/MolClass Score)
- Partial Least Squares Regression for each cluster + Kernel Regression (kca/MolpKd Score)
- MolScore: combine MolpKd and MolSimilarity to known binders
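The first three training steps can be illustrated with a toy stand-in. This is not the kcc algorithm: the fingerprint type, clustering method, kernel, and bandwidth are not given on this page, so the sketch assumes fingerprints are sets of on-bits, uses Tanimoto similarity, a simple leader clustering, and a Gaussian kernel on Tanimoto distance. All function names and data are hypothetical.

```python
import math


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def leader_cluster(fingerprints, threshold=0.5):
    """Greedy clustering: join the first cluster whose leader is similar enough."""
    clusters = []
    for fp in fingerprints:
        for cluster in clusters:
            if tanimoto(fp, cluster[0]) >= threshold:
                cluster.append(fp)
                break
        else:
            clusters.append([fp])
    return clusters


def kernel(a, b, bandwidth=0.3):
    """Gaussian kernel on Tanimoto distance (an assumed kernel choice)."""
    distance = 1.0 - tanimoto(a, b)
    return math.exp(-(distance ** 2) / (2 * bandwidth ** 2))


def class_score(query, clusters, decoys, bandwidth=0.3):
    """Best per-cluster ratio of active density to (active + decoy) density.

    Requires a non-empty decoy list.
    """
    decoy_density = sum(kernel(query, d, bandwidth) for d in decoys) / len(decoys)
    best = 0.0
    for cluster in clusters:
        active_density = sum(kernel(query, a, bandwidth) for a in cluster) / len(cluster)
        total = active_density + decoy_density
        if total > 0:
            best = max(best, active_density / total)
    return best


actives = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}]
decoys = [{20, 21, 22}, {30, 31, 32}]
clusters = leader_cluster(actives)
print(len(clusters), class_score({1, 2, 3, 4}, clusters, decoys))
```

Scoring each cluster separately lets a query that resembles one chemotype score well even when the other clusters of actives look nothing like it.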
Ligand Field Docking Models - 3D Atomic Property Field Models (dfz)
dfz: Docking to ligand Field Z-score prediction model
- Built using Atomic Property Fields
- Currently: 504 mammalian models
- Pocketome ligands/custom alignment as APF template
- ChEMBL cpds for validation
- Median AUC: 92%, 139 cpds vs decoy
- Fast-ish (single template cluster ~5 sec per cpd)
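The "Z-score" in the dfz name suggests that a compound's field score is reported relative to a reference distribution. The sketch below shows the generic Z-score calculation only. Which reference set MolScreen uses, and whether lower or higher raw scores are better, is not stated on this page, so the decoy scores and the function name here are assumptions.

```python
import statistics


def z_score(score, reference_scores):
    """Number of sample standard deviations a score lies from the reference mean."""
    mean = statistics.mean(reference_scores)
    spread = statistics.stdev(reference_scores)  # needs at least two reference values
    if spread == 0:
        raise ValueError("reference scores have zero spread")
    return (score - mean) / spread


# Hypothetical raw scores for a decoy set and one query compound.
decoy_scores = [1.0, 2.0, 3.0]
print(z_score(5.0, decoy_scores))
```

Expressing scores this way makes results comparable across models whose raw score ranges differ.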
Pocket Docking 3D QSAR Models
dpc: Docking to Pocket Classification/Activity
- Currently: 343 mammalian models w/ AUC > 80%
- Training set: ChEMBL Ki, IC50, EC50, Drugbank assignment
- Median size: 307 ligands
- Median external Q2: 0.53
- Median external AUC: 95%
Training:
- Pocketome -> Clustering of pocket residues
- 4D Docking w/ co-crystallized ligand as APF template
- Docking Score -> Probability score (dpc/MolClass score)
- 3D QSAR training of activity -> (dpa/MolpKd score)
- MolScore: combine MolpKd and MolSimilarity to known binders
Hybrid 4D/2D - Hybrid Models (dfa)
dfa: Docking to ligand Field Activity prediction
- Currently: 612 mammalian models w/ AUC > 80%
- Training set: ChEMBL Ki, IC50, EC50, Drugbank assignment
- Median size: 270 ligands
- Median external Q2: 0.65
- Median external AUC: 96%
Training:
- Also from Pocketome -> 4D Docking + Ligand APF template
- Cpd align to ligand template -> cluster by 3D poses
- APF Score -> Probability Score (dfc/MolClass score)
- 3D-QSAR training for each cpd cluster (dfa/MolpKd score)
- MolScore: combine MolpKd and MolSimilarity to known binders
ADMET Models
mcp: Miscellaneous Chemical Property Models
- Currently 38 models, mostly from PubChem data
- All validated by external test set (20% of data set aside)
- Regression models (mean external test set Q2: 0.7):
  - CACO2, PAMPA permeability, LD50 (mg/kg), half-life (hr)
- Classification models (median external test set AUC: 84%):
  - hERG, PGP inhibitor, PGP substrate, PAINS
  - Cytochrome P450 1A2, 2C19, 2C9, 2D6, 3A4
  - 25 Tox21 classifiers, including estrogen agonist/antagonist, genotoxicity, aromatase, etc.
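The external validation described above, with 20% of the data set aside, corresponds to a simple hold-out split. The following is a minimal sketch assuming a random split; the page does not say how the held-out compounds were chosen (random, by scaffold, or by date), and the function name and seed are hypothetical.

```python
import random


def holdout_split(records, test_fraction=0.2, seed=0):
    """Shuffle the records and set aside a fraction as an external test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train, test = holdout_split(range(100))
print(len(train), len(test))
```

The model is fitted on `train` only, and Q2 or AUC is then reported on `test`, which the model has never seen.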