In this chapter we describe the tools available for analyzing protein structure. These tools include calculating RMSD, identifying closed cavities, calculating contact and surface areas, measuring angles and distances, and generating Ramachandran plots.
Chapter Contents: Related Chains | Calculate RMSD | Contact Areas | Closed Cavities | Surface Area | Distances | Planar Angle | Dihedral Angle | Ramachandran Plot | Ramachandran Export | Protein Health | Local Flexibility | Protein-Protein Interface Prediction | Identify Ligand Pockets
4.5.1 Find Related Chains
This option allows you to search the currently loaded PDB files or ICM objects and identify chains which are similar and/or related.
To do this:
- Select the objects or PDB files you want to compare.
- Tools/Analysis/Find Related Chains
- Click OK to confirm the selection you made.
- A table will be displayed with the following columns:
name1 = name of the query molecule
name2 = name of the hit
len1 = length of the query
len2 = length of the hit
seqid = sequence identity (percent)
consensus = consensus sequence
NOTE: This option is for protein structures only, not for chemical compounds. For chemicals, you can use the command line functions Rmsd (http://molsoft.com/man/icm-functions.html#Rmsd) and Srmsd (http://molsoft.com/man/icm-functions.html#Srmsd).
4.5.2 Calculate RMSD
To calculate the RMSD between two protein structures:
- Read the two structures you wish to compare into ICM (File/Open, PDB Search, or Read in Chemical).
- Select one of the two molecules, for example by double-clicking on the name of the structure in the ICM Workspace, and convert this selection to an orange selection.
- Select the second molecule. You should now have one orange and one green selection in the graphics display.
- Select whether you wish the atoms to be superimposed onto one another or kept in place. The kept-in-place option is ideal for comparing docked structures.
- Choose whether to match atoms by alignment or by exactly matching atom names.
- Select which atom types you wish to superimpose.
The RMSD value will be displayed in the terminal window.
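For the kept-in-place case, the reported quantity can be illustrated with a short calculation. The following is a minimal Python sketch, not ICM's implementation: it assumes the atoms of the two structures have already been matched one-to-one and listed in the same order, and the function name rmsd_in_place and the coordinates are hypothetical.

```python
import math

def rmsd_in_place(coords_a, coords_b):
    """RMSD between two matched lists of (x, y, z) tuples, without superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Accumulate the squared distance between each matched atom pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical example: the second pose is the first one shifted by 1 Å along z.
pose_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
pose_2 = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
print(rmsd_in_place(pose_1, pose_2))  # 1.0
```

With superposition enabled, the same formula would be applied after one structure has been optimally rotated and translated onto the other, so the value can only stay the same or decrease.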
4.5.3 Contact Areas
- Read in a protein structure (File/Open or PDB Search).
- Select the region you wish to analyze.
- Tools/Analysis/Contact Areas
- The xstick display in the region will be scaled according to the atom/residue contact area. For example, residues making large contacts with a ligand will be displayed in a thicker xstick representation (and colored yellow) than those making less significant contacts.
- A table will be displayed listing the contact area, the exposed area, the contact area as a percentage of the exposed area, the nearest atom of each residue, and the distance.
4.5.4 Identify Closed Cavities
This tool identifies cavities within a molecule that are completely closed. If you are looking for buried and open pockets, use icmPocketFinder instead.
- Read in a protein structure (File/Open or PDB Search).
- Convert the protein structure to an ICM object.
- Tools/Analysis/Closed Cavities
- Use the drop down arrow to locate the molecule you are interested in.
- Enter the minimum volume of the cavities you wish to identify.
- Click OK
- The closed cavities will be displayed in the meshes section of the ICM Workspace and a table of the cavities will be displayed. Double-click on a row in the table to jump to a particular closed cavity and select the residues surrounding it.
What is the difference between Closed Cavities and icmPocketFinder? A closed cavity (cavityFinder) is purely geometrical/topological: it is a part of the molecular surface that is completely disconnected from the exterior surface. This means that a probe sphere of 1.4 Å radius (representing a water molecule) cannot pass into or out of the cavity, assuming the protein is completely rigid.
icmPocketFinder identifies pockets that are likely to contain ligands, whether open or closed. Pockets are defined by physical interaction rather than by a geometric criterion. The blobs of 'pocket density' generated by icmPocketFinder represent continuous regions of space where there is significant favorable van der Waals interaction with the receptor.
4.5.5 Surface Area
This option calculates the solvent-accessible area of each selection in multiple objects and stores it in a table. If a molecule is specified in a multi-molecular object, the surface area of the isolated molecule is calculated and other molecules are ignored. The area is reported in square Angstroms and the probe radius is taken from the variable waterRadius.
Output: the macro creates the table AREA. An empty comment field is added for the user's future use. If the table already exists, new rows are appended.
To calculate a surface area:
- Read in a protein structure (File/Open or PDB Search).
- Select the region you wish to analyze.
- Tools/Analysis/Surface Area
- A table will be displayed listing the residues in the selection along with the corresponding total surface area.
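To make the idea of a solvent-accessible area and a probe radius concrete, here is a small Python sketch of a common point-sampling approach (the Shrake-Rupley scheme). It is an illustration under stated assumptions, not the algorithm ICM uses: the function names, the default probe radius of 1.4 Å (the water probe size quoted elsewhere in this chapter), the number of sample points, and the atom radii are all assumptions made for the example.

```python
import math

def sphere_points(n):
    """Roughly uniform points on a unit sphere (golden-spiral construction)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        points.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return points

def accessible_area(atoms, probe=1.4, n_points=200):
    """Per-atom solvent-accessible area in square Angstroms.

    atoms: list of (x, y, z, radius) tuples; probe: assumed probe radius.
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        expanded = ri + probe  # radius of the probe-centre sphere around atom i
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + expanded * ux, yi + expanded * uy, zi + expanded * uz
            # A sample point is buried if it lies inside any other expanded atom.
            buried = any(
                (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < (rj + probe) ** 2
                for j, (xj, yj, zj, rj) in enumerate(atoms)
                if j != i
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * expanded ** 2 * exposed / n_points)
    return areas

# Hypothetical example: two overlapping atoms expose less area than isolated ones.
isolated = accessible_area([(0.0, 0.0, 0.0, 1.7)])[0]
pair = accessible_area([(0.0, 0.0, 0.0, 1.7), (1.5, 0.0, 0.0, 1.7)])
print(round(isolated, 2), [round(a, 2) for a in pair])
```

Because the other atoms of the input list are the only ones that can bury a sample point, passing a single molecule reproduces the "isolated molecule" behaviour described above.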
4.5.6 Distances
There are two approaches to calculating and displaying distances between atoms. You can either use the options in the Labels tab or use Tools/Analysis/Distance.
To display all to all distances:
- Select the atoms between which you would like to find the distance. (See selection toolbar)
- Tools/Analysis/Distance
- Select all to all
To display intermolecular distances:
- Select the atoms between which you would like to find the distance. (See selection toolbar)
- Tools/Analysis/Distance
- Select intermolecular
To display the distances between the same atoms in two objects:
- Select the atoms between which you would like to find the distance. (See selection toolbar)
- Tools/Analysis/Distance
- Select same atoms in two objects
NOTE: Distances can be displayed and undisplayed in the 3D labels section of the ICM Workspace. You can change the color of a distance label by right-clicking on it in the ICM Workspace. You can also export the distances to a table.
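The three distance modes differ only in which atom pairs are considered. The Python sketch below illustrates that difference under simple assumptions: atoms are represented as hypothetical (molecule, atom name, coordinates) tuples, and the function names are invented for the example rather than taken from ICM.

```python
import math
from itertools import combinations

def all_to_all(atoms):
    """Distances between every pair of atoms in the selection."""
    return [(a, b, math.dist(a[2], b[2])) for a, b in combinations(atoms, 2)]

def intermolecular(atoms):
    """Distances only between atoms that belong to different molecules."""
    return [(a, b, d) for a, b, d in all_to_all(atoms) if a[0] != b[0]]

def same_atoms_in_two_objects(object_1, object_2):
    """Distances between equally named atoms; inputs map atom name -> (x, y, z)."""
    shared = sorted(set(object_1) & set(object_2))
    return {name: math.dist(object_1[name], object_2[name]) for name in shared}

# Hypothetical selection: two atoms in molecule m1, one atom in molecule m2.
selection = [
    ("m1", "n", (0.0, 0.0, 0.0)),
    ("m1", "ca", (1.5, 0.0, 0.0)),
    ("m2", "o", (0.0, 3.0, 0.0)),
]
print(len(all_to_all(selection)), len(intermolecular(selection)))  # 3 2
print(same_atoms_in_two_objects({"ca": (0.0, 0.0, 0.0)}, {"ca": (0.0, 0.0, 2.0)}))
```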
4.5.7 Planar Angle
To find the planar angle between three atoms:
- Select Tools/Analysis/PlanarAngle
- Right-click on each of the three atoms you wish to use and select its name. The fields next to First atom, Second atom, and Third atom should now contain the names of your atoms.
- Click Apply to display the angle in the terminal window.
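The measurement itself is the angle between the two vectors that run from the middle atom to the outer atoms. The following Python sketch shows that calculation; it assumes the second atom is the vertex of the angle, and the function name planar_angle is hypothetical.

```python
import math

def planar_angle(first, second, third):
    """Angle first-second-third in degrees, with the second atom as the vertex."""
    v1 = [first[i] - second[i] for i in range(3)]
    v2 = [third[i] - second[i] for i in range(3)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm_1 = math.sqrt(sum(a * a for a in v1))
    norm_2 = math.sqrt(sum(b * b for b in v2))
    if norm_1 == 0.0 or norm_2 == 0.0:
        raise ValueError("the vertex atom must not coincide with an outer atom")
    # Clamp to [-1, 1] to guard against rounding just outside the acos domain.
    cosine = max(-1.0, min(1.0, dot / (norm_1 * norm_2)))
    return math.degrees(math.acos(cosine))

# Hypothetical coordinates forming a right angle at the origin.
print(round(planar_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)), 1))  # 90.0
```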
4.5.8 Dihedral Angle
To find the dihedral angle defined by four atoms:
- Select Tools/Analysis/Dihedral Angles.
- Right-click on each of the four atoms you wish to use and select its name. The fields next to Atom 1, Atom 2, Atom 3, and Atom 4 should now contain the names of your atoms.
- To obtain the correct angle, select the atoms in the order indicated in the accompanying diagram.
- Click Apply to display the dihedral angle in the terminal window.
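The order of the four atoms matters because the dihedral angle is the rotation about the bond between the second and third atoms. The Python sketch below uses the standard atan2 formulation to show this; the function name dihedral_angle and the coordinates are hypothetical, and the sign convention shown here is an assumption that may differ from the one ICM reports.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_angle(p1, p2, p3, p4):
    """Signed dihedral angle p1-p2-p3-p4 in degrees, in the range (-180, 180]."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)  # the central bond, which acts as the rotation axis
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the plane through atoms 1, 2, 3
    n2 = _cross(b2, b3)  # normal of the plane through atoms 2, 3, 4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: the outer bonds are perpendicular when viewed down the axis.
atoms = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(round(dihedral_angle(*atoms), 1))  # 90.0
```

Supplying the same four atoms in reverse order gives the same value, whereas swapping only the two outer atoms generally does not, which is why the selection order should be followed.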
4.5.9 Ramachandran Plot Interactive
To make an interactive Ramachandran plot:
- Read in a protein structure (File/Open or PDB Search).
- Select the structure you wish to build the plot for. You can do this by double-clicking on the name of the structure in the ICM Workspace (a selection is highlighted in blue in the ICM Workspace and with green crosses in the graphical display), or by right-clicking and dragging over the whole structure in the graphical display.
- Tools/Analysis/Ramachandran Plot Interactive
- The interactive Ramachandran plot will be displayed in a table called RAMA.
- You can view the Omega, Phi/Psi (Gly) or Phi/Psi angles by clicking on the tabs at the top of the plot. Each point is linked to the data in the table RAMA and also to the graphical display. Clicking on a point in the plot will highlight the corresponding angles in the table and also center on this region in the 3D display.
4.5.10 Export Ramachandran Plot
- Read in a protein structure (File/Open or PDB Search).
- Select the structure you wish to build the plot for. You can do this by double-clicking on the name of the structure in the ICM Workspace (a selection is highlighted in blue in the ICM Workspace and with green crosses in the graphical display), or by right-clicking and dragging over the whole structure in the graphical display.
- Tools/Analysis/Ramachandran Plot Export
A PostScript viewer must be installed on your machine in order to view the plot. One can be downloaded from http://www.cs.wisc.edu/~ghost/. Once the software is installed, tell ICM where it is located by entering the path name in File/Preferences.
NOTE: You can also save the plot as an image directly in ICM without using the export option: right-click on the plot and select save as image. Another approach is to export the RAMA table to Excel and use the plotting tools there: right-click on the table name tab and select "Export to Excel", or save the table as ".csv".
4.5.11 Protein Health
Theory
The protein health option calculates the energy strain of a structure in ICM. It is generally a good idea to investigate the energy strain of any protein structure before undertaking processes such as docking. It is also essential to use this tool after making a model (see Molecular Modeling) to identify strained regions, which can then be corrected by an optimization procedure.
The protein health option calculates the relative energy of each residue in a selection and colors the selected residues by strain. This macro uses statistics obtained in the following paper: Maiorov, V.N. and Abagyan, R.A. (1998) Energy strain in three-dimensional protein structures. Folding and Design, 3, 259-269.
To use the Protein Health option:
- Read in a protein structure (File/Open or PDB Search).
- Convert your PDB structure into an ICM object.
- Make a selection of the residues you wish to analyze.
- Tools/3D Predict/Protein Health. A dialog window will be displayed.
- The scale of the coloring can be changed by altering the value in the trimEnergy data entry box.
- Click OK. The structure will be colored according to energy strain (red = high) and a table of residue energies will be displayed.
- A table and plot of normalized energies for each amino acid in the selection will be displayed. The table is ranked and colored by residues with poor normalized energies. Click on a row in the table or a point in the plot to center on the residue.
4.5.12 Local Flexibility
This option systematically samples rotamers for each residue side chain in the input selection and uses the resulting conformational ensembles to evaluate energy-weighted RMSDs for every side-chain atom. These are stored in the 'field' values on atoms and can be used, for example, to color the structure by side-chain flexibility. The conformational entropy of each residue side chain is also calculated and stored in a table. If the l_entropyBfactor flag is on, the atom RMSDs are normalized within the residue to reflect its total conformational entropy. If the l_bfactor flag is set, the B-factors are reset to the same values that are placed in the atom 'field', and the occupancy is set to decrease with increasing RMSD: O = 1/(1 + 2*rmsd).
- Read in a PDB file (File/Open or PDB Search tab).
- Convert it to an ICM object.
- Tools/3D Predict/Local Flexibility
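The occupancy relation quoted above, O = 1/(1 + 2*rmsd), can be checked with a few values. This short Python sketch only evaluates that formula; the function name occupancy_from_rmsd and the sample RMSD values are hypothetical.

```python
def occupancy_from_rmsd(rmsd):
    """Occupancy O = 1 / (1 + 2 * rmsd) for a non-negative atom RMSD in Angstroms."""
    if rmsd < 0.0:
        raise ValueError("rmsd must be non-negative")
    return 1.0 / (1.0 + 2.0 * rmsd)

# Rigid atoms (rmsd = 0) keep full occupancy; flexible atoms get lower values.
for rmsd in (0.0, 0.5, 2.0):
    print(rmsd, occupancy_from_rmsd(rmsd))  # 1.0, 0.5 and 0.2 respectively
```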
4.5.13 Protein-Protein Interface Prediction
The ICM Optimal Docking Area method is a useful way of predicting likely protein-protein interaction interfaces. It is most useful when you do not have mutational or other experimental data indicating the likely protein-protein docking site. This procedure can save you time during docking by focusing the docking only on the areas of the receptor and ligand most likely to interact.
Theory
ODA (Optimal Docking Areas) is a new method to predict protein-protein interaction sites on protein surfaces. It identifies optimal surface patches with the lowest docking desolvation energy values as calculated by atomic solvation parameters (ASP) derived from octanol/water transfer experiments and adjusted for protein-protein docking. The predictor has been benchmarked on 66 non-homologous unbound structures, and the identified interaction points (top 10 ODA hot-spots) are correctly located in 70% of the cases (80% if we disregard NMR structures). For a description of the method see Fernandez-Recio et al Proteins (2005) 127: 9632.
To display the optimal docking area:
- Convert the PDB file to an ICM object.
- Tools/3D Predict/Protein Interface by ODA
- If you select the Residue Table option, the average ODA score for each residue will be displayed in a table. The lower the score, the higher the chance that the residue is involved in protein-protein interactions. Regions colored red have a low ODA score and regions colored blue have a high score.
4.5.14 Identify Ligand Binding Pockets
Theory
The ICM Pocket Finder method (1-2) uses only the protein structure for the prediction of cavities and clefts. No prior knowledge of the substrate is required. The position and size of the ligand-binding pocket are determined based on a transformation of the Lennard-Jones potential by convolution with a Gaussian kernel of a certain size, a grid map of a binding potential and construction of equipotential surfaces along the maps. The pockets are displayed graphically as a surface and the dimensions of each pocket are presented in an interactive table and plot.
Factors that can influence ligand binding to a pocket include the pocket volume and area, buriedness, hydrophobicity, and how compact the pocket is. All of these properties are calculated by icmPocketFinder and tabulated. Scientists at Merck used MolSoft's icmPocketFinder algorithm to define a way of quantifying the "druggability" of a protein target (3). The metric they used is called Drug-Like Density (DLID), and this score is also provided in the results table.
A good example of the use of the icmPocketFinder method is in the database called Pocketome. The Pocketome (www.pocketome.org) is an encyclopedia of conformational ensembles of all druggable binding sites that can be identified experimentally from co-crystal structures in the Protein Data Bank (4).
1. An, J., Totrov, M. & Abagyan, R. Pocketome via comprehensive identification and classification of ligand binding envelopes. Mol. Cell. Proteomics 4, 752 (2005).
2. Abagyan, R. & Kufareva, I. The flexible pocketome engine for structural chemogenomics. Methods Mol. Biol. Clifton NJ 575, 249-279 (2009).
3. Sheridan, R. P., Maiorov, V. N., Holloway, M. K., Cornell, W. D. & Gao, Y.-D. Drug-like density: a method of quantifying the "bindability" of a protein target based on a very large set of pockets and drug-like ligands from the Protein Data Bank. J. Chem. Inf. Model. 50, 2029-2040 (2010).
4. Kufareva, I., Ilatovskiy, A. V. & Abagyan, R. Pocketome: an encyclopedia of small-molecule binding sites in 4D. Nucleic Acids Res. 40, D535-540 (2012).
To predict pockets:
- Read in a protein structure (File/Open or PDB Search).
- Convert the protein structure to an ICM object.
- Tools/3D Predict/icmPocketFinder
- Enter a tolerance level (4.6 is the default value and we recommend that you use it). The lower the tolerance, the more pockets are predicted; the higher the tolerance, the fewer pockets are predicted.
- Check the box create sequence sites if you wish the site to be labeled.
- Check the box display results to see the predicted pockets as grobs in the display panel.
- Check the box keep compounds if you wish the compounds (ligands) in the receptor to be included in the prediction. If you do not check this box, the pockets will be calculated based on the receptor without ligands.
- Click OK to run icmPocketFinder. The results will be displayed in a table.
Results
- Click on a row of the results table to select the residues surrounding the pocket.
- The pocket view can be toggled on or off in the meshes section of the ICM Workspace. Right click on the name of the mesh in the ICM workspace to change properties such as color.
- A fully interactive plot (Area vs Volume) is also provided. Pockets that fall within the blue shaded region have a "drug-like" volume and area.
About the icmPocketFinder Results Table
Factors that can influence ligand binding to a pocket include the pocket volume and area, buriedness, hydrophobicity, and how compact the pocket is. These values are reported in the results table:
- Volume - volume of the pocket in Å³.
- Area - area of the pocket in Å².
- Hydrophobicity - the fraction of the pocket surface in contact with hydrophobic protein residues (values range from 0 to 1).
- Buriedness - calculated as follows: first, the solvent-accessible surface area of the pocket (probe radius 1.4 Å) is measured in isolation. Then the solvent-accessible surface area of the pocket covered by its shell is measured. The ratio of the second number to the first is the fraction buried. The lowest possible value is 0.5, i.e. the pocket is completely open and the surface is flat. The highest is 1.0, i.e. the pocket is completely buried.
- DLID - Merck's Drug-Like Density score (see Sheridan et al., J. Chem. Inf. Model. 2010). A value >0.5 is considered "druggable".
- Radius - the radius of an ideal spherical cavity with the same volume as the pocket blob: (3 × Volume / (4 × Pi))^(1/3).
- Nonsphericity - gives an indication of how spherical the pocket is: Area / (area of an ideal spherical cavity). It is 1.0 if the cavity is spherical.
- Conservation - average conservation (% identity) of residues in contact with the pocket. The calculation is performed if a multiple alignment is present and linked to the protein chain being analyzed.
- RelCons - the above conservation, relative to the average over the entire chain.
- Type - returns the ICM selection for the residues forming the pocket.
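The Radius, Nonsphericity, and Buriedness columns follow directly from the formulas given in the list above. The Python sketch below evaluates them for a hypothetical pocket; the function names are invented for the example, and it assumes that the "ideal spherical cavity" used for Nonsphericity is the sphere with the same volume as the pocket.

```python
import math

def equivalent_radius(volume):
    """Radius of a sphere with the given volume: (3 * V / (4 * pi)) ** (1/3)."""
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)

def nonsphericity(area, volume):
    """Pocket area divided by the area of the equal-volume sphere (assumed)."""
    radius = equivalent_radius(volume)
    return area / (4.0 * math.pi * radius ** 2)

def buriedness(area_in_isolation, area_covered_by_shell):
    """Ratio of the second solvent-accessible area measurement to the first."""
    return area_covered_by_shell / area_in_isolation

# Hypothetical pocket: a perfect sphere of radius 3 Å, so nonsphericity is 1.0.
sphere_volume = 4.0 / 3.0 * math.pi * 3.0 ** 3
sphere_area = 4.0 * math.pi * 3.0 ** 2
print(round(equivalent_radius(sphere_volume), 3))           # 3.0
print(round(nonsphericity(sphere_area, sphere_volume), 3))  # 1.0
```

Any pocket that is elongated or irregular has more area than the equal-volume sphere, so its nonsphericity is greater than 1.0.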