ICM Manual v.3.9
by Ruben Abagyan, Eugene Raush and Max Totrov
Copyright © 2020, Molsoft LLC
Nov 24 2024

Files

_macro file. A collection of ICM macros.


This file contains a set of ICM macros. You can use them, modify them, or browse them to develop your own macros. The _macro file is loaded by the call _macro command.

_startup. ICM startup file


This ICM script contains a set of commands executed automatically when ICM is invoked. ICM searches for the file first in the current directory and then in the directory defined by the UNIX environment variable ICMHOME. Read more about the _startup file in the customization: _startup section.
 
 s_pdbDir     = "/data/pdb/" # set it to the place where PDB lives 
 pdbDirStyle  = "pdb1abc.ent" # style currently distributed by PDB 
 s_helpEngine = "icm"        # reasonable default, HTML-help is 
                             # an alternative 
 
                             # you may have your own PROSITE updated file 
 s_prositeDat = Getenv("ICMHOME")+"/prosite.dat" 
 
                             # xpsview may be more standard 
 s_psViewer   = "/usr/opt/bin/gs -q" 
 
                             # better be accessible only for you 
 s_tempDir = "/usr/tmp/" 
# 
 read libraries              # they will be read from $ICMHOME 
 
 call _aliases               # by default it will be taken 
                             # from the directory defined by 
                             # environmental variable $ICMHOME 
 call _macro                 # by default it will be taken 
                             # from the directory defined by 
                             # environmental variable $ICMHOME 
 
 print "...ICM startup file executed..." 


foldbank.db


Bank of assigned secondary structures (foldbank.db). This text file may be created by the _mkSegmentLib script and contains secondary structures for a non-redundant set of protein chains. Description of fields:
  • NA - chain name ('m' usually stands for main or 'NO chain identifier')
  • RZ - resolution. NMR entries get 9.99 (they may be actually worse than that).
  • ER - all-atom RMSD-error upon PDB->ICM conversion. Beware of entries with ER > 0.5!!
  • SE - amino acid sequence as extracted from the structure (not SEQRES)
  • SX - authors' secondary structure assignment, all _____ if not provided (as in 1knt.m).
  • SS - secondary structure automatically assigned by ICM using a modified Kabsch and Sander algorithm.
The comment line (starting with ##) contains the serial number of the entry.
Example with two entries:
 
... 
... 
## 355 
NA 4tpi.i 
RZ 2.20 
ER 0.027 
SE RPDFCLEPPYTGPCRARIIRYFYNAKAGLCQTFVYGGCRAKRNNFKSAEDCMRTCGGA 
SX _______________EEEEEEEEEE__EEEEEEEEE__________HHHHHHHHHH__ 
SS ___GGG___________EEEEEEE____EEEEEEE_________B__HHHHHHHH___ 
... 
... 
## 364 
NA 1knt.m 
RZ 1.60 
ER 0.015 
SE TDICKLPKDEGTCRDFILKWYYDPNTKSCARFWYGGCGGNENKFGSQKECEKVCA 
SX _______________________________________________________ 
SS _GGGG______B____EEEEEEE____EEEEEEE__B______B__HHHHHHHH_ 
... 
... 
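The field layout above is simple enough to read with a few lines of code. The following is a minimal Python sketch, not part of ICM; the function name parse_foldbank is hypothetical, and it assumes that every entry starts with a "## serial" line and that each field occupies exactly one line, as in the example.

```python
def parse_foldbank(text):
    """Split foldbank.db-style text into one dict per entry."""
    entries, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("##"):
            # '## <serial>' opens a new entry
            current = {"serial": int(line[2:].strip())}
            entries.append(current)
        elif current is not None and line[:2] in ("NA", "RZ", "ER", "SE", "SX", "SS"):
            key, _, value = line.partition(" ")
            current[key] = value.strip()
    return entries


sample = """\
## 364
NA 1knt.m
RZ 1.60
ER 0.015
SE TDICKLPKDEGTCRDFILKWYYDPNTKSCARFWYGGCGGNENKFGSQKECEKVCA
SX _______________________________________________________
SS _GGGG______B____EEEEEEE____EEEEEEE__B______B__HHHHHHHH_
"""

entries = parse_foldbank(sample)
# Keep only chains whose PDB->ICM conversion error is acceptable (ER <= 0.5)
reliable = [e["NA"] for e in entries if float(e["ER"]) <= 0.5]
print(reliable)
```

Lines such as "..." are skipped because they do not start with a known field key.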



Atom codes (icm.cod)


This text file contains a description of (1) atom types and references to (2) MMFF types, (3) van der Waals types (type 0 or 9 to ignore), (4) hydrogen bonding types (type 1 means no H-bonds) and (5) hydration types (0 to ignore). The real numbers are the atomic mass (6) and surface (7). The character (8) is used to define the atom color. The format is free. Example lines:
 
#      (1)  (2)  (3)  (4)  (5) (6)     (7)   (8)  *  Comment 
#>     cd   mmff vw   hb   hd     wt    sf   na  comment 
cod    63    6   18    6    6  16.000  40.77 o    * o in r-c-oh (thr,ser) 
cod    71   32   19    6    8  16.000  36.79 o    * o- in carboxylate ion 
cod    92   12   61    1    1  35.453 133.8  Cl   *    (MMFF) 
cod   223   38   13    4    3  14.007  61.16 n    * pyridin nitrogen (MMFF) 


Bond angle bending and improper torsion deformation parameters (icm.bbt)


This text file contains a factor (kcal/mole) and an equilibrium angle in degrees for the bond angle bending deformation energy for different types of angles.
 
#   Type   Factor  OptAngle(a0).   E=Factor*(a-a0)^2
# 
bbt    1  160.7000  115.0000  ca-c#-n 
bbt    2  128.2000  120.5000  ca-c#=o 
... 


Bond stretching parameters (icm.bst)


This text file contains a factor (kcal/mole) and an equilibrium bond length in Angstroms for the bond stretching energy for different types of bonds.
 
#     E=Factor*(b-b0)^2
#     Type   Factor  BondLength(b0). 
#>    ity    eybs      eqbl     bt   comment 
#    
bst     1  500.0       1.4530  1 cn  n-ca 
bst     2 1150.0       1.3250  1 cn  c#-n 
bst     3  460.0       1.5300  1 cc  ca-c# 
bst     4  430.0       1.5300  1 cc  ca-cb 
... 
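As an illustration of the stretching formula in the file header, the sketch below reads one bst record and evaluates E=Factor*(b-b0)^2 for a slightly stretched bond. It is a Python sketch rather than ICM code; the names parse_bst and stretch_energy are hypothetical, and only the first three numeric fields of the record are used.

```python
def parse_bst(line):
    """Parse one 'bst' record: type, factor (kcal/mole), equilibrium length (Angstroms)."""
    fields = line.split()
    if fields[0] != "bst":
        raise ValueError("not a bst record")
    return {"type": int(fields[1]), "factor": float(fields[2]), "b0": float(fields[3])}


def stretch_energy(record, b):
    """E = Factor*(b - b0)^2, as given in the icm.bst header comment."""
    return record["factor"] * (b - record["b0"]) ** 2


n_ca = parse_bst("bst     1  500.0       1.4530  1 cn  n-ca")
energy = stretch_energy(n_ca, 1.4730)  # n-ca bond stretched by 0.02 Angstroms
print(round(energy, 3))
```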


Conformational stack ( *.cnf )


This binary file contains descriptions of several conformations of the same molecule. It cannot be edited. The stack is automatically generated and saved in the course of Monte Carlo or systematic search procedures. Alternatively, the stack may be created directly by the store conf command. To read or write a stack use:
 read stack [s_StackName]
 write stack [s_StackName]

Distance restraint types ( icm.cnt or *.cnt )


The file describes legal types of drestraints that impose attraction or repulsion between atom pairs (e.g. NOE distance restraints derived from NMR data). This penalty term is called "cn". The system icm.cnt file can be edited; alternatively, user files of the same format (e.g. mydist.cnt) can be created and loaded with the read drestraint type command. The file contains:
 
#      type    weight    lower     upper  sharpness 
#                                              4 special types for S-S bonds 
ssSS1         10.0       2.04      2.04     10.0 # Sharp well for S-S dist. 
ssSS2          2.0       2.04      2.04      1.0 # Wide well  for S-S dist. 
ssSC           5.0       3.052     3.052    10.0 # S -Cb distance 
ssCC           3.0       3.855     3.855    10.0 # Cb-Cb distance 
global    1    1.0       0.0       3.0           # a global drestraint 
global    2    1.0       2.0       4.0           # a global drestraint 
local    12    1.0       2.5       2.8       1.0 # a local drestraint 
Both local and global distance restraints force two atoms to stay between the lower and upper boundaries. However, the local restraints diminish at large distances (similar to van der Waals interactions), whereas the global restraints grow bi-quadratically as the deviation from the target distance range increases. You can keep and read several *.cnt files. If the type numbers overlap, the previously read types are redefined.

Self-sufficient harmonic drestraints that do not require a separately defined type were introduced in version 3.6-2. The older drestraints with external types continue to be supported.
See also related commands: read drestraint type, show drestraint type, set drestraint type, make drestraint type.



Distance restraints ( *.cn )


This file contains a list of atom pairs for which interatomic distances should be restrained according to the types defined in a separate icm.cnt file. The .cn files are created by the user. The supplied icm.cn file is just an example.
 
#  ml1  re1   at1   ml2 re2   at2  cn_type 
cn crn 1 val  hg22   * 1 val  ca      1 
cn  *  1 val  hg23   * 1 val  cg1     1 
cn  *  2 gln  ca     * 1 val  hg11    1 
cn  *  2 gln  ca     * 1 val  ha      1 
The molecule name and residue number (e.g. 14, 25A, etc.) are normally used to find an atom. An asterisk instead of the molecule name means that only the residue number should be matched. See also: read drestraint, show drestraint, set drestraint, make drestraint, and (un)display drestraint.
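The record layout and the asterisk rule can be sketched as follows. This is a Python illustration, not ICM code: parse_cn and matches are hypothetical names, the ten-field layout is read off the example lines above, and residue numbers are kept as strings so that values such as 25A survive.

```python
def parse_cn(line):
    """Parse one 'cn' record into two atom specifications and a cn type."""
    f = line.split()
    if len(f) != 10 or f[0] != "cn":
        raise ValueError("expected a 10-field cn record")
    first = {"mol": f[1], "resnum": f[2], "resname": f[3], "atom": f[4]}
    second = {"mol": f[5], "resnum": f[6], "resname": f[7], "atom": f[8]}
    return first, second, int(f[9])


def matches(spec, mol, resnum, atom):
    """An asterisk as molecule name means: match on the residue number only."""
    mol_ok = spec["mol"] == "*" or spec["mol"] == mol
    return mol_ok and spec["resnum"] == str(resnum) and spec["atom"] == atom


first, second, cn_type = parse_cn("cn crn 1 val  hg22   * 1 val  ca      1")
print(matches(first, "crn", 1, "hg22"))   # molecule name must be 'crn'
print(matches(second, "xyz", 1, "ca"))    # '*' accepts any molecule name
```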

Graphics objects ( *.gro )


This text file contains streams of POINT coordinates (i.e. a triple of floats in one line, preceded by an integer reference number), LINE descriptors (i.e. a pair of integer reference numbers of previously described POINTS in one line) and/or TRIPLES (or TRIANGLES, i.e. triples of integer reference numbers of POINTS). The order of streams is arbitrary, provided that referenced POINTS are already described. Either LINES or TRIPLES can be omitted. Graphics objects can be read, written, displayed, or made from a 3D map.
 
1 1.00 -1.00 0.00 
2 1.00  1.00 1.00 
3 0.00  2.00 0.50 
 1 2 
 1 3 
 2 3 
1 2 3 
Check the content of other .gro files in your ICM directory. ICM also understands the Wavefront obj-format ( files *.obj and *.off ) as well as 3DXML from Dassault Systemes, and two Collada formats, namely .DAE and .KMZ. The .KMZ format is a zip file that contains .DAE files and additional texture files. You can create your own dot, wire or solid graphics objects either manually or automatically.
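The three record kinds in the .gro example can be told apart by their number of fields: four for a POINT, two for a LINE, three for a TRIPLE. The Python sketch below relies on that observation, which is an assumption drawn from the example rather than a documented rule; parse_gro is a hypothetical name and not an ICM function.

```python
def parse_gro(text):
    """Read POINT, LINE and TRIPLE records from .gro-style text."""
    points, lines, triangles = {}, [], []
    for raw in text.splitlines():
        tok = raw.split()
        if len(tok) == 4:
            # POINT: integer reference number followed by x, y, z
            points[int(tok[0])] = tuple(float(v) for v in tok[1:])
        elif len(tok) in (2, 3):
            # LINE (2 references) or TRIPLE (3 references)
            refs = tuple(int(v) for v in tok)
            missing = [r for r in refs if r not in points]
            if missing:
                raise ValueError(f"undescribed POINT(s) {missing}")
            (lines if len(refs) == 2 else triangles).append(refs)
    return points, lines, triangles


sample = """\
1 1.00 -1.00 0.00
2 1.00  1.00 1.00
3 0.00  2.00 0.50
 1 2
 1 3
 2 3
1 2 3
"""
points, lines, triangles = parse_gro(sample)
print(len(points), lines, triangles)
```

The check for undescribed POINTS mirrors the requirement that referenced POINTS must already be described when a LINE or TRIPLE uses them.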

ICM HTML help file ( icm.htm )


This file contains this manual.

Hydrogen bonding types ( icm.hbt )


A and B parameters for the A/r^12 - B/r^10 potential between HB donors and acceptors. See Nemethy et al. for reference.
Example lines:
 
#     i  j     B        A        E      r0 
#    
hbt   2  4   8244.0   32897.0  0.550   2.190      * n-h...n 
hbt   3  4   8244.0   32897.0  0.550   2.190      * o-h...n 
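A short worked check shows how the columns relate. For E(r) = A/r^12 - B/r^10, setting dE/dr = 0 gives r_min = sqrt(6A/(5B)). The Python sketch below evaluates this for the example lines; reading the E and r0 columns as the well depth and the optimal distance is an assumption based on the column header, and hb_minimum is a hypothetical name.

```python
import math


def hb_minimum(A, B):
    """Minimum of E(r) = A/r**12 - B/r**10: returns (r_min, well_depth)."""
    r_min = math.sqrt(6.0 * A / (5.0 * B))        # from dE/dr = 0
    depth = -(A / r_min**12 - B / r_min**10)      # positive well depth
    return r_min, depth


r_min, depth = hb_minimum(A=32897.0, B=8244.0)
print(f"{r_min:.3f} {depth:.3f}")
```

Under this reading, the computed values agree with the tabulated r0 = 2.190 and E = 0.550 to within rounding.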


Hydration parameters ( icm.hdt )


Parameters to calculate solvation energy based on atomic solvent-accessible surfaces (see solvation term). The file contains the current ICM set and several currently inactive sets (e.g. Eisenberg and McLachlan (1986), Wesson and Eisenberg (1992) ) that are commented out.
Example lines:
 
rwater  1.4000  # water radius used to roll around the molecule 
# 
#       1      2          3      4     5       6      7    8
# ty,        eyhd,   ey_apolar, ra,   exvo, ey_membrane
hdt     1    0.0100   0.0151   1.950  21.15 -0.00824  c aliphatic
hdt     2   -0.0090   0.0177   1.800  12.57 -0.02646  c aromatic
hdt     4   -0.2800  -0.0548   1.700  13.63 -0.03390  n+ nz lys+
  1. reference type number
  2. solvation energy density from vacuum-water transfer experiments for a given hydration type
  3. solvation energy density from octanol-water transfer experiments for a given hydration type
  4. radius used to calculate accessible surface
  5. excluded volume
  6. membrane (lipophilic) implicit solvation parameters, see surfaceMethod = 4 and TOOLS.membrane array
  7. atom code name
  8. comment

The solvation parameters can be temporarily changed and the product of the current solvent accessible areas by the parameters can be returned with the Area ( as_ R_newParams energy ) function.
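The surface-based energy can be sketched as a sum of accessible area times the per-type density. This is a Python illustration and not the ICM implementation: parse_hdt and solvation_energy are hypothetical names, the areas are made-up numbers, and the units of the result depend on the units of the densities, which the file excerpt does not state.

```python
def parse_hdt(line):
    """Parse one 'hdt' record into a dict keyed by the column names above."""
    f = line.split()
    return {
        "type": int(f[1]), "eyhd": float(f[2]), "ey_apolar": float(f[3]),
        "radius": float(f[4]), "exvo": float(f[5]), "ey_membrane": float(f[6]),
        "atom": f[7], "comment": " ".join(f[8:]),
    }


def solvation_energy(areas_by_type, params, column="eyhd"):
    """Sum of (accessible area) * (energy density) over hydration types."""
    return sum(area * params[t][column] for t, area in areas_by_type.items())


records = [
    "hdt     1    0.0100   0.0151   1.950  21.15 -0.00824  c aliphatic",
    "hdt     4   -0.2800  -0.0548   1.700  13.63 -0.03390  n+ nz lys+",
]
params = {rec["type"]: rec for rec in map(parse_hdt, records)}

# Hypothetical accessible areas per hydration type (illustration only)
areas = {1: 30.0, 4: 10.0}
energy = solvation_energy(areas, params)
print(round(energy, 3))
```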



Configuration file (icm.cfg)


This file contains limits and memory requirements for ICM. ICM searches for it in the current directory ( ./ ) first and, if it is not found there, in the directory defined by the UNIX environment variable $ICMHOME or in the $HOME/.icm/config/ directory ( $USERPROFILE/.icm/config/ for Windows), if present.
You may edit the file and change the limits. The MnArrays parameter controls the sizes of three array types: rarray , iarray and sarray .
 
# ICM configuration file. Free format 
# Mn stands for "Maximal Number of" 
# Mx stands for "Maximal Size of" 
BufferSpace 2097152 # ICM will not let you decrease BufferSpace less than 131072 
MnResidueTypes  200 
MnSequences   20000 
MnAlignments   1500 
MnProfiles       40 
MnGrobs         200 
MnArrays        600 
MnTables        140 
MnMaps           40 
MnMacros        400 
XTermFont  *-fixed-medium-*-*-*-24-* # to set font in the terminal window 
Xterm      xwsh                      # default for SGI 
# Xterm    xterm                     # default for Linux and other UN*Xes              


Colors ( icm.clr )


This file contains the default color and font settings. The default icm.clr file resides in the $ICMHOME directory. The LIBRARY.clr variable defines the default path and name of the icm.clr file.
Keep your own color and graphics controls file in the ~/.icm directory. Example of a ~/.icm/user_startup.icm file ( $USERPROFILE/user_startup.icm under Windows ):
 
LIBRARY.clr = Getenv("HOME")+"/.icm/icm.clr" 
read color    # load your custom settings 

Modify the file if needed. The following lines are recognized (free format):
 
# CONFIGURABLE GRAPHICS.mode translation table  
#   Use keywords Left Mid Right, Shift Ctrl Alt Dbl, At        
#   TopNN LeftNN RightNN BottomNN, where NN is a percentage of the zone 
#   Modes 0,3,4,5,14,15 require a hit in 'At' = (atom | grob)  
#   otherwise control falls through to next best appropriate action 
#   Some modes have submode switches listed in parentheses ()  
#   Users are encouraged to modify bindings to their needs     
# ---mode--combination------------- # equivalent GRAPHICS.mode preference 
mode   0  Right-At                  # popup (in GUI only) 
mode   1  (Shift)-Left              # Rotation 
mode   2  (Shift)-Mid               # Translation 
mode   3  (Ctrl)-Shift-Right-At     # Label atoms 
mode   4  (Ctrl)-Dbl-Right-At       # Label residues 
mode   5  (Shift)-Ctrl-Left-At      # Change torsion angles 
mode   6  (Shift)-Bottom5-Left      # Rotation of the view 
mode   7  (Shift)-Top5-Left         # Z-axis rotation 
mode   8  Left5-Mid                 # Zoom 
mode   9  Alt-Mid                   # Move rear clipping plane 
mode  10  Ctrl-Mid                  # Move front clipping plane 
mode  11  Ctrl-Alt-Mid              # Slab 
mode  12  (Shift)-Right             # Rectangular selection 
mode  13  (Shift)-Ctrl-Left         # Lasso selection 
mode  14  (Shift)-Ctrl-Alt-Right-At # Connect to molecule 
mode  15  Shift-Ctrl-Dbl-Right-At   # Set alignment cursor 
mode  16  Ctrl-Mid-At               # Drag atoms 
mode  17  Right5-Mid                # Z-translate 

 
#--------------{Colors}------------ 
# -----color------------RRGGBB--A_real_if_not_1 
.... 
color lightgreen        # 80ff80 
color rita              # ff1b00  0.3 
color darkseagreen      # 8fbc8f 
... 
#--------{ Atom/Grob/Font Colors }-------- 
atom c grey  # c is the first character of chemical element. 
background    black 
#             color    font      size bold italic underline 
atomFont      rose       times     12  0   0    0 
varFont       yellow     symbol    12  0   0    0 
residueFont   green      helvetica 18  0   0    0 
grobFont      green      helvetica 18  0   0    0 
stringFont    green      times     24  0   0    0 
auxiliaryFont green      symbol    28  0   0    0 
fixedFont     green      courier   12  0   0    0 
# 
alphaRibbon red 
piRibbon    blue 
threetenRibbon magenta 
betaRibbon  green 
coilRibbon  yellow 
#- 0:127 rainbow colors (address them by number: color 15.5) ------- 
# ---------i-color---- 
rainbow     0  # 0000ff 
rainbow    63  # ffffff 
rainbow   127  # ff0000 
.... 
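Note that in color lines the # character introduces the RRGGBB value and is not a comment. The following Python sketch, which is not ICM code, decodes such a line into red, green, blue and alpha components; parse_color is a hypothetical name, and the optional trailing real is read as the alpha value following the "A_real_if_not_1" hint in the header.

```python
def parse_color(line):
    """Decode 'color <name> # RRGGBB [alpha]' into (name, (r, g, b, alpha))."""
    f = line.split()
    if len(f) < 4 or f[0] != "color" or f[2] != "#":
        raise ValueError("not a color record")
    name, hexcode = f[1], f[3]
    r, g, b = (int(hexcode[i:i + 2], 16) for i in (0, 2, 4))
    alpha = float(f[4]) if len(f) > 4 else 1.0   # alpha defaults to 1
    return name, (r, g, b, alpha)


print(parse_color("color lightgreen        # 80ff80"))
print(parse_color("color rita              # ff1b00  0.3"))
```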


Electron density map ( *.map )


This binary file contains a complete description of the electron density map, compatible with the format devised by Phil Evans. Maps are stored as a 3-dimensional array preceded by a header which contains all the necessary information about the map. See "The CCP4 Suite" manual for details.

A crystallographic electron density map can be converted to a rectangular equally spaced grid map with the make map potential m_source command.

To create an electron density map from an object in ICM use the make map potential command which by default creates a map called m_atoms . Example:


read pdb "1xbb"           # Syk kinase with gleevec
make map potential a_sti  # density map only for the gleevec 
display m_atoms only


MC simulation trajectory ( *.trj )


This binary file contains a description of a set of geometric parameters (free variables, usually torsions and overall rotation/translation variables) participating in the MC simulation, followed by a stream of their values for each conformation accepted during the simulation, together with the energy of each accepted conformation. Movies can be created or appended during MC simulation runs, and then played in any direction with optional smoothing, superimposition (to the initial conformation) and/or centering. These files tend to be large, so watch them carefully and do not create them unless needed. See also: display trajectory.

ICM-object ( *.ob )


A binary non-editable file describing one or several molecules forming an ICM-object. In addition to the information available in a PDB-file, it contains a description of atomic charges, tree-like connectivity, detailed atom codes, information about which internal coordinates are constrained, references to energy parameters, secondary structure, etc. The object can be read and written.

Parameters for "mf" mean force term ( icm.pmf )


Example lines:


#  types:  icm  pmf
pmft   50   5
pmft   53   6
pmft   54   6
pmft   55   6
.....
#       midi       mxdi     steps   (interpolation range)
pmfh        0.0     9.0    46
#       type1      type2    energy  (interpolation points)
pmf   1   1   4.253138   4.253138   4.253138   4.253138 ..........

Lines starting with the "pmft" key define the mapping of general ICM atom types (as defined in icm.cod) to a (smaller) set of atom types for the "mf" energy term. A single line starting with "pmfh" defines the range of distances and the number of interpolation steps for energy calculations using a cubic spline. Lines starting with the "pmf" key define the energy function for a specific pair of atom types (first two integers); the subsequent reals are function values at different distances, from minimal to maximal as defined in the "pmfh" line.




Residue library ( icm.res or *.res )


The main residue library, describing all "residues" and molecules which can constitute a legal ICM-object. You can create your own entry either manually or using the write library command and add the entry to the icm.res file. You can also keep it in a separate file and append the file to the LIBRARY.res sarray (i.e. LIBRARY.res=LIBRARY.res//"usr" followed by the read library command). An example of an entry for a pro residue:
 
#  resName 1-ch Type AccSurf Eentropy LongName 
nare  pro   P  AMINO    150  0.0  proline 
rem 
rem        _________atom_________   _dihedral_angles   __bond_angles__   _bond_lengths 
rem       /                      \ /                \ /               \ /             \ 
rem    at na    cd lwat   qu    gu na  fe   vuva   ey na  fe  vuva   ey na  fe  vuva ey qfm 
#    Fields: 
#       1 2      3    4      5   6 7    8        9 10 11  12 13      14 15  16    17 18  19 
# 
atre    1 n    216    0 -0.285   0 psi  +  180.000  2 an   . 118.000  1 bn   . 1.340 31 
atre    2 ca   113    1  0.050   0 omgp +  180.000 24 aca  . 121.000  1 bca  . 1.465  1 
atre    3 ha     1    2  0.040   0 fha  .  116.800  0 aha  . 110.200  1 bha  . 1.090  0 
atre    4 cb   112    2 -0.025   0 fcb  . -120.850  0 acb  . 103.700 37 bcb  . 1.530  4 
atre    5 hb1    1    4  0.015   0 fhb1 .  120.200  0 ahb1 . 111.600  1 bhb1 . 1.090  0 
atre    6 hb2    1    4  0.015   0 fhb2 . -120.200  0 ahb2 . 111.600  1 bhb2 . 1.090  0 
atre    7 cg   112    4 -0.050   0 xi1  .   27.400 21 acg  . 103.700  1 bcg  . 1.502  4 
atre    8 hg1    1    7  0.025   0 fhg1 .  120.450  0 ahg1 . 111.200  1 bhg1 . 1.090  0 
atre    9 hg2    1    7  0.025   0 fhg2 . -120.450  0 ahg2 . 111.200  1 bhg2 . 1.090  0 
atre   10 cd   114    7  0.100   0 xi2  .  -35.600 21 acd  . 105.300 39 bcd  . 1.501  4 
atre   11 hd1    5   10  0.010   0 fhd1 . -119.730  0 ahd1 . 111.600  1 bhd1 . 1.090  0 
atre   12 hd2    5   10  0.010   0 xi3  .  -90.760 21 ahd2 . 111.600  1 bhd2 . 1.090  0 
atre   13 c    121    2  0.455   1 phip .  -68.800 19 ac   . 112.300  3 bc   . 1.520  3 
atre   14 o     81   13 -0.385   1 fo   .  180.000  0 ao   . 120.500  1 bo   . 1.230  5 
#    F 20 
lwat   13 
#       F 21 
exbo    1   10 
Eentropy is the entropic contribution to the free energy for a fully accessible residue divided by the solvent accessible surface of this residue (in gly gly X gly gly environment) and multiplied by a factor of 1000.
Fields:
  1. relative atom number
  2. atom name
  3. main atom code (see icm.cod file). It in turn refers to other codes such as hydrogen bonding code, van der Waals code and hydration code.
  4. previous atom in a connectivity graph of the directed ICM-tree.
  5. electric charge
  6. groups of close charges (they should not be separated by the distance cutoff in interaction lists)
  7. torsion name
  8. fixation status (+ free variable, . fixed)
  9. torsion angle (degrees)
  10. torsion energy type (see icm.tot file).
  11. bond angle name
  12. bond angle fixation status
  13. bond angle value (degrees).
  14. bond angle deformation energy type (see icm.bbt file).
  15. bond length name
  16. bond fixation status
  17. bond length (Angstroms)
  18. bond stretching energy type (see icm.bst file).
  19. (qfm) formal charge (may be +1, -1/3, -1/2 etc.), if any
  20. (lwat) exit atom of the residue
  21. (exbo) additional (non-tree) covalent bonds (atom1 atom2). Normal ICM-tree bonds form regular directed graph without cycles, therefore all the remaining bonds should be declared separately.

The following modified amino acids are also present in icm.res :

  • cme C S,S-(2-hydroxyethyl)thiocysteine
  • csd A 3-sulfinoalanine
  • cso C S-hydroxycysteine
  • kcx K lysine Nz-carboxylic acid
  • hyp P 4-hydroxyproline
  • ptr Y O-phosphotyrosine
  • tys Y O-sulfo-L-tyrosine
  • llp K lysine-pyridoxal phosphate
  • sep S phosphoserine
  • tpo T phosphothreonine
The following code will generate an up-to-date list of amino acids in the library:
 t = Table(residue)
 tt = t.type == 'Amino' & t.desc !~ "*simple*" & t.desc !~ "*united*"

Access to 250 modified amino acids. ICM also contains a table called AminoAcids with objects and codes for 250 amino acids. The mutateResidue2 res code macro can be used to change an amino acid to the specified one.

See also: LIBRARY.res , Table (residue)



Object Variables ( *.var )


Text file containing either a subset or a complete set of internal coordinates (variables). Usually created by the write vs_var command. It may also be typed manually. In contrast to an object file ( *.ob) which is a complete description of an object, icm.var may contain any selection of variables. These variables can be read and automatically assigned to a molecule according to molecule name, residue number and variable name.
Examples:
 
rem Va fx Atom  Residue  Mol Obj VaType Symm  Value 
va psi  + n       3  gln  mol dl   2    1   180.00 
va phi  + c       3  gln  mol dl   1    1   -60.00 
va psi  + n       4  met  mol dl   2    1   120.00 
va phi  + c       4  met  mol dl   1    1   180.00 


Multidimensional variable restraint types ( icm.rst or *.rst )


This file defines generic attraction zones in internal coordinate space (usually torsion space) in terms of residue name pattern (* for any residue type), relative residue number, and variable names. After the types are loaded, you may use them to assign specific vrestraints using the set vrestraint command. Three example restraint types (the third one, marked with 'rse', will be used as a penalty term; the first two will be used for BPMC steps):
 
------------------------------------ Two-variable vrestraint 
rs  aa      -3.300     0.700     0.542 
va ala* 1 phi    -63.200    22.500 
va    * 2 psi    -38.540    25.500 
------------------------------------ One-variable vrestraint 
rs  vt      -3.511     0.700     0.670 
va val* 1 xi1    174.690    29.225 
-------------------------------------- 
rse fmr     -3.267     0.700     0.525 
va phe* 1 xi1    -66.780    29.700 
va    * 1 xi2     98.680    75.100 
Explanation of fields:
  • aa - rs-name (not longer than 4 characters). Usually the first character is a 1-char residue name, and the second is the zone character (a - alpha, b - beta, g - gamma, d - delta, t - trans, p - plus 60, m - minus 60, n - null)
  • -3.300 - well depth ( should be negative )
  • 0.700 - flat fraction of the well
  • 0.542 - occupancy (probability) of the well with respect to other wells which could be assigned to the same set of variables.
  • ala* - residue name pattern
  • 1 - relative residue number
  • phi - variable name
  • -63.2 - center of the well for 1 phi
  • 22.5 - size of the well (well is -63.2 +- 22.5 )
  • * 2 psi - residue name pattern (* means any), relative residue number (psi formally belongs to C atom of the next residue, that is why the relative number is 2) and variable name
WARNING: remember that the outer borders of the two-dimensional restraints are ellipses, rather than rectangles (the same for the multidimensional ones). To create a sloped surface for all variables involved in the restraint, the well sizes for these variables should be greater than 180.*sqrt(number of angles in the rs) (255.0 for 2 angles).
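The elliptical border and the 180.*sqrt(n) rule from the warning can be checked numerically. The Python sketch below treats the well sizes as semi-axes of an ellipse centered on the well, which is an interpretation of the warning rather than ICM's actual restraint function; angle_diff and inside_well are hypothetical names.

```python
import math


def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def inside_well(values, centers, sizes):
    """True if the point lies inside the ellipse with the given semi-axes."""
    d2 = sum((angle_diff(v, c) / s) ** 2 for v, c, s in zip(values, centers, sizes))
    return d2 <= 1.0


centers, sizes = (-63.2, -38.54), (22.5, 25.5)        # the 'aa' example above
corner = (-63.2 + 0.9 * 22.5, -38.54 + 0.9 * 25.5)
print(inside_well(corner, centers, sizes))            # inside the rectangle only

# The farthest point of torsion space is 180 degrees away in every variable,
# so with equal sizes s it is covered when sqrt(n)*180/s <= 1.
min_size = 180.0 * math.sqrt(2)
farthest = (centers[0] + 180.0, centers[1] + 180.0)
print(round(min_size, 1), inside_well(farthest, centers, (255.0, 255.0)))
```

The corner point is within +-size of the center in each variable separately, yet falls outside the ellipse, which is the situation the warning describes.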

Multidimensional variable restraints ( *.rs )


This file has a format similar to that of the icm.rst file. The differences are:
  • rse and rs fields indicate whether the zone will be used for energy or probability, respectively.
  • .rs file contains specific residue numbers rather than the relative ones.
  • instead of the residue name pattern the file should contain molecule name or "*".

Example:
 
------------------------------------ Two-variable vrestraint 
rs  aa      -3.300     0.700     0.542 
va 1crn 1 phi    -63.200    22.500 
va 1crn 2 psi    -38.540    25.500 
------------------------------------ One-variable vrestraint 
rse vt      -3.511     0.700     0.670 
va    * 3 xi1    174.690    29.225 
-------------------------------------- 
rs  fmr     -3.267     0.700     0.525 
va 1crn 4 xi1    -66.780    29.700 
va 1crn 4 xi2     98.680    75.100 


A sample *.col file


The file shows an example of a multicolumn file which can be read with the read column command. Arrays r, e, s and ent will be created.
 
# Entropies of several amino acids 
#>-r---e------s--------ent--- 
arg 2.13 184.920211 11.5181 
asn 0.81 99.516352 8.1393 
asp 0.61 89.629773 6.8057 
cys 1.14 85.947233 13.263 
gln 2.02 129.68481 15.576 
glu 1.65 119.957333 13.75 
his 0.99 121.124577 8.173 
ile 0.75 132.717054 5.651 
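The "#>" header line names the columns, with dashes acting as separators. A minimal Python sketch of how such a file maps to named arrays is shown below; it is not the ICM reader, read_columns is a hypothetical name, and treating every token that parses as a number as a real is a simplification.

```python
def read_columns(text):
    """Read a '#>'-headed multicolumn file into a dict of named lists."""
    names, columns = [], {}
    for line in text.splitlines():
        if line.startswith("#>"):
            # Column names are separated by runs of dashes
            names = [n for n in line[2:].split("-") if n]
            columns = {n: [] for n in names}
        elif line.startswith("#") or not line.strip():
            continue  # ordinary comment or blank line
        else:
            for name, tok in zip(names, line.split()):
                try:
                    value = float(tok)
                except ValueError:
                    value = tok   # keep non-numeric tokens as strings
                columns[name].append(value)
    return columns


sample = """\
# Entropies of several amino acids
#>-r---e------s--------ent---
arg 2.13 184.920211 11.5181
asn 0.81 99.516352 8.1393
"""
columns = read_columns(sample)
print(columns["r"], columns["ent"])
```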


A sample *.tab file


The file shows an example of a .tab file which can be read with the read table command. It is similar to the previous file but additionally contains a header string. Table t consisting of the header string t.titl and arrays t.r, t.e, t.s and t.ent will be created.
 
#>s t.titl 
 Entropies of several amino acids 
#>T t 
#>-r---e------s--------ent--- 
arg 2.13 184.920211 11.5181 
asn 0.81 99.516352 8.1393 
asp 0.61 89.629773 6.8057 


Torsion parameters ( icm.tot )


The file contains torsion parameters according to Momany et al., 1975. Parameters for type 21 for pro taken from Venkatachalam et al., (1974), Macromolecules, 7, 212, parameters for types 22-23 for cooh taken from Karplus et al., J.Comp.Chem., (1983),4,187-217, DNA parameters are from Veal and Wilson, 1991. We added extra terms and modified the original Momany et al. parameters (psi and xi3 of Met). The format is free.
 
#                              +----symmetry-----+ 
#        maxEner sign fold exact heavy Pseudo selChar phase 
# 
tot   0    0.00    0    0    1    1     1 -  0.  # fixed dihedrals  
tot   2    0.25    1    1    1    1     1 S  90. # psi 
tot  44    0.50    1    1    1    1     1 S  90. # psi : removing the ECEPP alpha bias 
tot   3   10.00   -1    2    1    1     1 S  0.  # omg 
tot   4    1.35    1    3    1    1     1 H  0.  # xi CH2-CH2  
tot   8    0.90    1    3    3    3     3 M  0.  # nh3 term.group of lys,lysn 
tot  14    0.00    0    2    1    1     2 H  0.  # xi2 of his (+-90) 
tot   5    1.00    1    3    1    1     1 H  0.   # xi3 met  
tot   5    1.17   -1    1    1    1     1 H  0.   # xi3 met additional torsion 
Torsion energy is calculated as:
 maxEner*(1 + sign*Cos(fold * torsion_angle))
Symmetry is the rotational symmetry in different situations:
  • Exact is the exact symmetry (implies presence of all atoms, including hydrogens).
  • Heavy implies presence of only the heavy atoms (no hydrogens) but uniqueness of different atom types.
  • Pseudo implies that all heavy atoms are equivalent, and hydrogens are ignored.
The last character is a short reference name which can be used in vs_var. For example, v_//M specifies all the torsions rotating terminal hydrogen atoms with symmetry higher than 1, v_//H specifies side-chain torsions rotating heavy atoms, etc.
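The torsion energy expression can be evaluated directly for the example records. The Python sketch below implements only the expression as quoted in this section, with the angle in degrees; it ignores the phase column, which the quoted expression does not include, and torsion_term is a hypothetical name.

```python
import math


def torsion_term(max_ener, sign, fold, angle_deg):
    """maxEner*(1 + sign*Cos(fold * torsion_angle)), angle in degrees."""
    return max_ener * (1.0 + sign * math.cos(math.radians(fold * angle_deg)))


# Type 4 (xi CH2-CH2): maxEner 1.35, sign +1, fold 3
print(round(torsion_term(1.35, 1, 3, 0.0), 3))    # eclipsed
print(round(torsion_term(1.35, 1, 3, 60.0), 3))   # staggered

# Type 3 (omg): maxEner 10.00, sign -1, fold 2
print(round(torsion_term(10.0, -1, 2, 180.0), 3))  # trans
print(round(torsion_term(10.0, -1, 2, 90.0), 3))   # perpendicular
```

With sign +1 and fold 3 the term vanishes at the staggered positions, and with sign -1 and fold 2 it vanishes at 0 and 180 degrees.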

Van der Waals parameters ( icm.vwt )


The file contains ECEPP/2 parameters for peptides ( Momany et al., 1975, Nemethy et al., 1983), parameters for DNA atoms: Veal and Wilson, 1991 and other parameters (unpublished).
Example lines:
 
#    type pzat   n_el  energy   Deq     Rvw   Rvwel electroRadii  
#    
vwt   1   0.42   0.85  0.0370   2.92    1.200 1.200 * h aliphatic 
vwt   2   0.42   0.85  0.0610   2.68    1.200 0.808 * h amide,amine 
vwt   7   1.51   5.20  0.1400   3.74    1.700 1.700 * c carbonyl 
vwt  39   0.00   0.00  0.55     5.911   2.631 2.631 * pseudo atom 
Each line contains:
  1. type reference number (see icm.cod file)
  2. atomic polarizability x 10^24 (cm^3);
  3. effective number of electrons
  4. -e(kk), kcal/mol - depth of energy minimum at the optimal inter-atomic distance
  5. r(kk), Angstroms - equilibrium (optimal) distance between two atoms of the same type
  6. van der Waals radius (used in graphics,electrostatics,etc)
  7. electrostatic radii, used to calculate geometrical surface boundary in boundary element method and MIMEL calculations
Normally van der Waals parameters A and B are calculated from polarizability and effective number of electrons (fields 2 and 3). However, if these two fields contain zeros, parameters A and B are calculated directly from the energy depth and equilibrium distance (e.g. for type 39).

Protein databank file ( *.pdb or *.ent )


Protein Data Bank formatted files consist of x,y,z coordinates, occupancies, and B-factors.
Examples:
 
ATOM    1  N   THR     1      17.047  14.099   3.625  1.00 13.79 
ATOM    2  CA  THR     1      16.967  12.784   4.338  1.00 10.80 
ATOM    3  C   THR     1      15.685  12.755   5.133  1.00  9.19 
This file does not provide a complete and unambiguous description of a molecular object. Therefore an object resulting from the read pdb command has a special type and needs conversion in order to become a full-scale ICM-object for which energy calculations are possible.
See also: convert and minimize tether commands.

Sequence ( *.seq *.pir *.gcg *.msf *.gb )


Acceptable formats of sequence files: FASTA and ICM format ( *.seq the simplest and the most natural):
 
> Name1 comment1 comment2 
AGFDSTREMNH-FQW 
> Name2 
RTPIYQWSCCVANMKL  
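The FASTA-like layout above (a ">" line with a name and optional comments, followed by sequence lines) can be read as follows. This is a minimal Python sketch and not the ICM reader; parse_fasta is a hypothetical name, and taking the first word after ">" as the sequence name is an assumption based on the example.

```python
def parse_fasta(text):
    """Map each sequence name to its concatenated sequence lines."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # First word after '>' is the name; the rest are comments
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


sample = """\
> Name1 comment1 comment2
AGFDSTREMNH-FQW
> Name2
RTPIYQWSCCVANMKL
"""
records = parse_fasta(sample)
print(records)
```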

PIR format: ( *.pir )
 
 >P1;Azur_Pses4 
     Length: 80 
 AECSVDIQGN DQMQFSTNAI TVDKACKTFT VNLSHPGSLP KNVMGHNWVL TTAADMQGVV 
 TDGMAAGLDK NYVKDGDTRV* 
 // 
 >P2;Azur_Pses3 
     Length: 50 
 AECSVDIQGN DQMQFSTNAI TVDKACKTFT VNLSHPGSLP KNVMGHNWVL* 

GCG format ( difficult to generate and impossible to edit because of the CheckSum):
 
Azur_Alcfa  Length: 69  Check: 4484  .. 
 
       1  ACDVSIEGND AMQFNTKSIV VDKTCKEFTI NLKHTGKLPK AAMGHNVVVS 
      51  DGMKAGLNND YVKAGDERV 

MSF format: an obsolete multiple-sequence format for alignments. Not editable; contains checksums.
GB: GenBank format
Entries start with a field name followed by a tab and the value. NCBI allows one to save entries in this format.
 
LOCUS     (entry code) 
..........(other fields) ... 
ORIGIN ...(then the sequence)  
        1 tctaaataag ttttacacaa aataagttat .. 




ICM-sequence file ( *.se )


Contains molecule names and sequences. A simple example with two peptides:
 
ml a 
se gly ala ser pro tyr his 
se phe trp tyr 
ml b 
se ala ala ser asn 
A more advanced example with numbering, N- and C-termini and D-amino acids:
 
ml sub1 
se 0 nter 1 gly 2 ala 2A Dglu 4 asp cooh 
ml water 
se 18 hoh 
The ml field followed by a molecule name signals the start of a new molecule. The se field marks sequence lines (free format). Residue names should correspond to entries in the icm.res residue library (or the icmff.res library if you intend to use the ICMFF force field). Residue numbers (if any) may be arbitrary, may be negative, and may contain additional characters (e.g. 15A, 15B, etc.). Terminal modifiers (nter, nh3+, cooh, coo-, conh, etc.) may be specified explicitly. Cysteine bridges may be specified as in cys(1) .... cys(1) .... cys(2) ... cys(2). Certain modules (ligand/peptide docking _dockScan and conformer generation _confGen) further understand modres fields that can follow the se field:


modres 6 c1ccccc1
modres 7/og OC

This allows creation of various modified or unnatural amino acids. In the above example a phenyl ring will be placed instead of Cbeta of residue 6 (and anything that follows in the side chain), and a methoxy group will replace Ogamma of residue 7. _confGen and _dockScan further understand the special terminal groups nvtr and cvtr as an indication that N- to C- cyclization is to be applied.
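
The ml/se layout is simple enough to read with a few lines of Python, sketched below. One assumption is made that the manual does not spell out: a token counts as a residue number if it is an optional minus sign, digits, and optional trailing letters (0, 18, 2A); everything else is treated as a residue or terminal name. parse_se is a hypothetical helper, and modres lines are ignored here.

```python
import re

# Assumed shape of a residue number: -12, 15, 15A ...
NUMBER = re.compile(r"-?\d+[A-Za-z]*$")


def parse_se(text):
    """Return {molecule name: [(residue number or None, residue name), ...]}."""
    molecules = {}
    current = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        key, rest = tokens[0], tokens[1:]
        if key == "ml" and rest:
            current = rest[0]
            molecules[current] = []
        elif key == "se" and current is not None:
            number = None
            for tok in rest:
                if NUMBER.match(tok):
                    number = tok  # applies to the next residue name
                else:
                    molecules[current].append((number, tok))
                    number = None
    return molecules


sample = """ml sub1
se 0 nter 1 gly 2 ala 2A Dglu 4 asp cooh
ml water
se 18 hoh
"""
molecules = parse_se(sample)
print(molecules["sub1"])
```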



ICM-alignment file


ICM-format for sequence alignments. The consensus string contains the following symbols:
symbol      description
space       gap in at least one of the sequences
character   this amino acid is conserved in all the sequences
+           positively charged amino acids ( R,K )
-           negatively charged amino acids ( D,E )
^           small amino acids: ( A,S,G,S )
%           aromatic residues ( F,Y,W )
#           hydrophobic amino acid (F,I,L,M,P,V,W)
~           polar amino acid ( C,D,E,G,H,N,Q,S,T,Y )
dot         the rest (no consensus, no gap)
The file looks like this:
 
# comments 
# Consensus:                      .C~.~I.^ND.MQ.~.K~#.V~K~CK~FT#~LKH.GK#.K..MG 
Azur_Alcde   MLAKATLAIVLSAASLPVLAAQCEATIESNDAMQYNLKEMVVDKSCKQFTVHLKHVGKMAKVAMG 
Azur_Alcfa   ---------------------ACDVSIEGNDSMQFNTKSIVVDKTCKEFTINLKHTGKLPKAAMG 
Azur_Alcsp   --------------------AECSVDIAGNDQMQFDKKEITVSKSCKQFTVNLKHPGKLAKNVMG 
 
# Consensus: FCSFPGH#^#MKG.# 
Azur_Alcde   FCSFPGHWAMMKGTLKLSN 
Azur_Alcfa   FCSFPGHWSIMKGTIELGS 
Azur_Alcsp   FCSFPGHFALMKGVL---- 

Residues can be colored by consensus with the color alignment rs_ command.
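
The block layout and the consensus symbols can be cross-checked with a short Python sketch. It joins the blocks of each sequence and tests whether a consensus symbol is compatible with an alignment column according to the symbol table. It does not try to reproduce how ICM chooses between overlapping classes (for example aromatic versus hydrophobic), since that order is not described here. The names parse_alignment and symbol_is_consistent are hypothetical.

```python
# Residue classes copied from the consensus symbol table.
CLASSES = {
    "+": set("RK"),
    "-": set("DE"),
    "^": set("ASG"),
    "%": set("FYW"),
    "#": set("FILMPVW"),
    "~": set("CDEGHNQSTY"),
}


def parse_alignment(text):
    """Return {name: aligned sequence}, joining blocks in file order."""
    seqs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].startswith("#"):
            continue  # blank, comment or consensus line
        seqs[parts[0]] = seqs.get(parts[0], "") + "".join(parts[1:])
    return seqs


def symbol_is_consistent(symbol, column):
    """True if a consensus symbol is compatible with one column."""
    residues = set(column)
    if symbol == " ":
        return "-" in residues        # gap in at least one sequence
    if "-" in residues:
        return False
    if symbol in CLASSES:
        return residues <= CLASSES[symbol]
    if symbol == ".":
        return True                   # "the rest": class order not checked
    return residues == {symbol}       # fully conserved residue


sample = """# Consensus: FCSFPGH#^#MKG.#
Azur_Alcde   FCSFPGHWAMMKGTLKLSN
Azur_Alcfa   FCSFPGHWSIMKGTIELGS
Azur_Alcsp   FCSFPGHFALMKGVL----
"""
seqs = parse_alignment(sample)
consensus = "FCSFPGH#^#MKG.#"
columns = list(zip(*seqs.values()))
ok = all(symbol_is_consistent(s, c) for s, c in zip(consensus, columns))
print(ok)
```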

ICM all-file: a file with multiple icm objects.


A file containing several ICM-shell objects separated by the following separators:
 
#> type1  ICM-shell-object-name1 
.... obj1..... 
.... obj1..... 
.............. 
#> type2  ICM-shell-object-name2 
.... obj2..... 
.... obj2..... 
.............. 
 etc. 
Examples of legal separators appear in the sample file a.all below, which contains an integer, a real, a logical, a real array, a PDB file and a table. You can read all of this from a file, or simply mark the lines and paste them into your ICM session after the command read all unix cat, followed by Ctrl-D.
 
#>i numberOffset 
  0 
#>r lineWidth 
  1.00 
#>l logo 
  yes 
#>R boxx 
 0. 0. 1. 1. 
#>brk 
ATOM      1  n   leu m   1       2.602 -12.770  -6.750  1.00 20.00 
ATOM      2  ca  leu m   1       2.423 -11.442  -7.311  1.00 20.00 
ATOM      3  cb  leu m   1       0.947 -11.187  -7.625  1.00 20.00 
ATOM      4  cg  leu m   1       0.758 -11.068  -9.138  1.00 20.00 
ATOM      5  cd1 leu m   1       1.487  -9.824  -9.649  1.00 20.00 
ATOM      6  cd2 leu m   1       1.335 -12.309  -9.822  1.00 20.00 
#>s tt.h 
this is a header string of table tt. The arrays follow. 
#>i tt.i 
15 
#>T tt  
#> a b c d 
1 2. bla  13 
3 5. bli  13 
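
A file like this can be split into its objects with a small Python sketch. It follows the sample above and assumes that a separator is "#>" immediately followed by a type code, so that the table's column-header line "#> a b c d" stays inside the table body. split_all_file is a hypothetical helper, not part of ICM.

```python
import re

# "#>" followed directly by a type code, then an optional object name.
SEPARATOR = re.compile(r"^#>(\S+)[ \t]*(.*)$")


def split_all_file(text):
    """Return a list of (type code, object name, body lines)."""
    objects = []
    for line in text.splitlines():
        m = SEPARATOR.match(line)
        if m:
            objects.append((m.group(1), m.group(2).strip(), []))
        elif objects:
            objects[-1][2].append(line)
    return objects


sample = """#>i numberOffset
  0
#>R boxx
 0. 0. 1. 1.
#>T tt
#> a b c d
1 2. bla  13
3 5. bli  13
"""
objects = split_all_file(sample)
for code, name, body in objects:
    print(code, name, len(body))
```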


Residue comparison table ( icm.cmp or *.cmp )


A triangular matrix with relative residue exchange frequencies (see actual file). The amino acid character line serves as a ruler. Use your favorite comparison matrix.

Protein profiles ( *.prf )


A profile table contains residue preferences for each residue type in each sequence position. The preferences may be derived from a multiple sequence alignment or from a three-dimensional structure.
Examples:
 
Cons A    B    C    D    E    F  etc.   Z  Gap Len .. 
C   35  -32  143  -42  -52  -12  etc. -62  100  100 
P   55   17    6   17   17  -71  etc.  26  100 


Integer array ( *.iar )


File looking like this (free-format):
 
# everything which is not a number will be skipped 
1 2 4 
  9 
numbers may be in a row, or column or be in an arbitrary order. 
 -14  9 


String array ( *.sar )


Any text file. Each line becomes a separate element of a string array.

Matrix ( *.mat )


File looking like this:
 
1 0 0 
0 1 0 
0 1 1 
or like this:
 
# my matrix 
0. 1. 1. 2.2 
1. 0. blu  1. 2. # text will be skipped 
# 1. 1. 0. 3.      this line is commented out 
In the latter case the result of the read matrix command is a matrix of two rows, {0. 1. 1. 2.2 } and {1. 0. 1. 2.}. Lines can be commented out with the # sign. All fields that do not look like numbers are skipped. If your matrix is symmetric, you may specify only the upper left or the lower right triangle, like this:
 
1. 
1. 2 
1. 2 -1. 
1. 2  3. 5. 
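
The reading rules above can be mimicked in Python as follows. This is only a sketch of the described behavior, not ICM's reader: lines starting with # are dropped, fields that do not look like numbers are skipped, and a triangle with rows of increasing length is assumed to hold the lower half of a symmetric matrix including the diagonal. read_matrix and expand_triangle are hypothetical names.

```python
import re

# What "looks like a number" is assumed to mean here.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def read_matrix(text):
    """Return the numeric rows, skipping comments and non-numeric fields."""
    rows = []
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        row = [float(t) for t in line.split() if NUMBER.fullmatch(t)]
        if row:
            rows.append(row)
    return rows


def expand_triangle(rows):
    """Mirror a triangle (row i has i+1 values) into a full symmetric matrix."""
    n = len(rows)
    return [[rows[max(i, j)][min(i, j)] for j in range(n)] for i in range(n)]


full = read_matrix("""# my matrix
0. 1. 1. 2.2
1. 0. blu  1. 2. # text will be skipped
# 1. 1. 0. 3.      this line is commented out
""")
sym = expand_triangle(read_matrix("1.\n1. 2\n1. 2 -1.\n1. 2  3. 5.\n"))
print(full)
print(sym)
```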


Numerical data (real arrays) ( *.rar )


File may contain arbitrarily mixed numbers and strings. Strings will be skipped and numbers will form an array. A hash sign # at the beginning of a line comments this line out.
Examples:
 
# 1.2 
1.4 
1.8 
rem 2.2 
This file will produce the array {1.4 1.8 2.2}.



Copyright© 1989-2024, Molsoft,LLC - All Rights Reserved. This document contains proprietary and confidential information of Molsoft, LLC. The content of this document may not be disclosed to third parties, copied or duplicated in any form, in whole or in part, without the prior written permission from Molsoft, LLC.