Jul 1 2004
[ _macro | _startup | _startupCheck | foldbank.db | foldbank.seg | icm.cod | icm.bbt | icm.bst | icm.cnf | icm.cnt | icm.cn | icm.gro | icm.htm | icm.hbt | icm.hdt | icm.cfg | icm.clr | icm.map | icm.mov | icm.ob | icm.res | icm.var | icm.rst | icm.rs | icm.col | icm.tab | icm.tot | icm.vwt | icm.pdb | icm.seq | icm.se | icm.ali | icm.all | icm.cmp | icm.prf | icm.iar | icm.sar | icm.mat | icm.rar ]
This file contains a set of ICM macros. You can use them, modify them,
or browse them to develop your own macros. _macro is loaded by the
call _macro command.
This ICM script contains a set of commands issued automatically upon
invoking ICM. The file will be searched for in the directory defined
by the UNIX environmental variable ICMHOME.
This location may be different from the $ICMHOME directory. It allows
users to share the ICM executable but have their individual _startup
files.
Important: edit this file to customize your environment. A template to modify follows.
s_pdbDir = "/data/pdb/" # set it to the place where PDB lives
pdbDirStyle = "pdb1abc.ent" # style currently distributed by PDB
s_helpEngine = "icm" # reasonable default, HTML-help is
# an alternative
# you may have your own PROSITE updated file
s_prositeDat = Getenv("ICMHOME")+"/prosite.dat"
# xpsview may be more standard
s_psViewer = "/usr/opt/bin/gs -q"
# better be accessible only for you
s_tempDir = "/usr/tmp/"
#
read libraries # they will be read from $ICMHOME
call _aliases # by default it will be taken
# from the directory defined by
# environmental variable $ICMHOME
call _macro # by default it will be taken
# from the directory defined by
# environmental variable $ICMHOME
print "...ICM startup file executed..."
This script checks for the presence of, and access to, the directories and files
used by ICM and specified by some ICM-shell string variables.
Running this script is recommended when customizing ICM.
Bank of assigned secondary structures (foldbank.db)
This text file may be created by
_mkSegmentLib script and
contains secondary structures for a nonredundant set of protein chains.
Description of fields:
- NA - chain name ('m' usually stands for main or 'NO chain identifier')
- RZ - resolution. NMR entries get 9.99 (they may be actually worse than that).
- ER - all-atom RMSD-error upon PDB->ICM conversion. Beware of entries with ER > 0.5!!
- SE - amino acid sequence as extracted from the structure (not SEQRES)
- SX - authors' secondary structure assignment, all _____ if not provided (as in 1knt.m).
- SS - secondary structure assigned automatically by ICM using a modified
Kabsch and Sander algorithm.
- The commented field (##) contains a serial number.
Two example entries:
...
...
## 355
NA 4tpi.i
RZ 2.20
ER 0.027
SE RPDFCLEPPYTGPCRARIIRYFYNAKAGLCQTFVYGGCRAKRNNFKSAEDCMRTCGGA
SX _______________EEEEEEEEEE__EEEEEEEEE__________HHHHHHHHHH__
SS ___GGG___________EEEEEEE____EEEEEEE_________B__HHHHHHHH___
...
...
## 364
NA 1knt.m
RZ 1.60
ER 0.015
SE TDICKLPKDEGTCRDFILKWYYDPNTKSCARFWYGGCGGNENKFGSQKECEKVCA
SX _______________________________________________________
SS _GGGG______B____EEEEEEE____EEEEEEE__B______B__HHHHHHHH_
...
...
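The record layout above can be read with a few lines of code. The following is a minimal sketch, not part of ICM: it assumes each field sits on its own line, starts with its two-letter tag, and that the `##` serial-number line opens a new record. The function name `parse_foldbank` and the dictionary keys are illustrative choices.

```python
def parse_foldbank(text):
    """Parse foldbank.db-style text into a list of dict records (sketch)."""
    records, current = [], None
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # blank lines and "..." placeholders carry no field
        tag, value = parts[0], parts[1].strip()
        if tag == "##":  # assumption: the serial number starts a new record
            current = {"serial": int(value)}
            records.append(current)
        elif current is not None and tag in ("NA", "RZ", "ER", "SE", "SX", "SS"):
            # RZ and ER are numeric; the other fields are kept as strings
            current[tag] = float(value) if tag in ("RZ", "ER") else value
    return records


sample = """\
## 364
NA 1knt.m
RZ 1.60
ER 0.015
SE TDICKLPKDEGTCRDFILKWYYDPNTKSCARFWYGGCGGNENKFGSQKECEKVCA
SX _______________________________________________________
SS _GGGG______B____EEEEEEE____EEEEEEE__B______B__HHHHHHHH_
"""
rec = parse_foldbank(sample)[0]
print(rec["NA"], rec["RZ"], rec["ER"])
```

Filtering on `rec["ER"] > 0.5` would then flag the entries the text warns about.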
This text file contains descriptions of segment (or vector) representations
of protein three-dimensional structures.
Example:
sis.m scorpion insectotoxin i5a _E_H_E_E_ 1 1 2 3 6 8 4 4 1 5 3
-1 2.50 1.11 -0.95 3.95 2.97 -3.10 5.41 -3.20 -10.98 13.48 -0.74 -12.05
11.31 4.10 -3.08 13.56 0.33 -0.05 5.08 -8.21 -5.66 4.15 -7.57 -8.58 8.77 -1.09 2.21 6.15 0.21 7.04
Each molecule is represented by a single line containing the following fields:
- sis.m : molecular selection (note 'm' is used as a chain identifier if the pdb-file has
no chain information).
- scorpion insectotoxin i5a : long name (up to the 30th position)
- _E_H_E_E_ : secondary structure ( '_' coil, 'E' extended, 'H' helix )
- 1 1 2 3 6 8 4 4 1 5 3 -1 : a segment list; the first integer indicates
the format for the segment list, the last -1 is a terminator.
There are two formats, indicated by numbers 1 and 0.
- 1 : concise : ResNumberOffset 1st_SegmentLength 2nd_Segment_Length 3rd_Segment_Length ...
- 0 : full format (e.g. 0 74b 4 78b 8 86b 4 90b 5 -1 ) : 1st_Res_Number/char 1st_Segment_Length 2nd_Res_Number/char 2nd_Segment_Length ...
- 2.50 1.11 -0.95 ... : x,y,z for all reference points.
The full format allows more complex residue numbering which may frequently be found in the pdb-entries
(i.e. 4 5 6 8 9 9a 9b 9c 10 12 ..).
The concise format is used for the regularly numbered molecules.
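As an illustration of the field layout, here is a hedged sketch that splits one concise-format record into its parts. It is not ICM code. It assumes whitespace-separated tokens, that the bare token `-1` appears only as the segment-list terminator, and that the secondary-structure string is the last token before the segment list made only of `_`, `E` and `H`. The full format (flag 0) is deliberately not handled.

```python
def parse_segment_record(record):
    """Split one concise-format foldbank.seg record into its fields (sketch)."""
    tokens = record.split()
    end = tokens.index("-1")  # terminator of the segment list
    # Assumption: the secondary-structure token contains only '_', 'E', 'H'.
    ss_idx = max(i for i in range(1, end) if set(tokens[i]) <= set("_EH"))
    seg = tokens[ss_idx + 1:end]
    if seg[0] != "1":
        raise ValueError("only the concise format (flag 1) is handled here")
    xyz = [float(t) for t in tokens[end + 1:]]
    return {
        "selection": tokens[0],
        "name": " ".join(tokens[1:ss_idx]),
        "ss": tokens[ss_idx],
        "offset": int(seg[1]),                   # ResNumberOffset
        "lengths": [int(t) for t in seg[2:]],    # one length per segment
        "points": [tuple(xyz[i:i + 3]) for i in range(0, len(xyz), 3)],
    }


record = (
    "sis.m scorpion insectotoxin i5a _E_H_E_E_ 1 1 2 3 6 8 4 4 1 5 3 -1 "
    "2.50 1.11 -0.95 3.95 2.97 -3.10 5.41 -3.20 -10.98 13.48 -0.74 -12.05 "
    "11.31 4.10 -3.08 13.56 0.33 -0.05 5.08 -8.21 -5.66 4.15 -7.57 -8.58 "
    "8.77 -1.09 2.21 6.15 0.21 7.04"
)
info = parse_segment_record(record)
print(info["ss"], info["lengths"], len(info["points"]))
```

In this example record the secondary-structure string has nine characters, the list holds nine segment lengths, and the coordinates form ten reference points.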
This text file contains descriptions of (1) atom types and references to
(2) MMFF types, (3) van der Waals types (type 0 or 9 to ignore),
(4) hydrogen bonding types (type 1 means no H-bonds), and
(5) hydration types (0 to ignore).
The real numbers are atomic mass (6) and surface (7). The character (8) is used to define atom color.
The format is free. Example lines:
# (1) (2) (3) (4) (5) (6) (7) (8) * Comment
#> cd mmff vw hb hd wt sf na comment
cod 63 6 18 6 6 16.000 40.77 o * o in r-c-oh (thr,ser)
cod 71 32 19 6 8 16.000 36.79 o * o- in carboxylate ion
cod 92 12 61 1 1 35.453 133.8 Cl * (MMFF)
cod 223 38 13 4 3 14.007 61.16 n * pyridin nitrogen (MMFF)
This text file contains a factor (kcal/mole) and an equilibrium angle
in degrees for the bond angle bending deformation energy for different
types of angles.
# Type Factor OptAngle(a0). E=Factor*(a-a0)^2
#
bbt 1 160.7000 115.0000 ca-c#-n
bbt 2 128.2000 120.5000 ca-c#=o
...
This text file contains a factor (kcal/mole) and an equilibrium bond
length in Angstroms for the bond stretching energy for different types
of bonds.
# E=Factor*(b-b0)^2
# Type Factor BondLength(b0).
#> ity eybs eqbl bt comment
#
bst 1 500.0 1.4530 1 cn n-ca
bst 2 1150.0 1.3250 1 cn c#-n
bst 3 460.0 1.5300 1 cc ca-c#
bst 4 430.0 1.5300 1 cc ca-cb
...
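The formula in the file header can be evaluated directly. The sketch below is illustrative only: it assumes the third and fourth tokens of a `bst` line are the factor and the equilibrium length b0, as in the example lines, and that the result is in the same energy units as the factor. The function names are hypothetical.

```python
def parse_bst(line):
    """Return (type, factor, b0) from one 'bst' line (sketch)."""
    tokens = line.split()
    if tokens[0] != "bst":
        raise ValueError("not a bst line")
    return int(tokens[1]), float(tokens[2]), float(tokens[3])


def bond_stretch_energy(factor, b0, b):
    """E = Factor * (b - b0)^2, with b and b0 in Angstroms."""
    return factor * (b - b0) ** 2


bond_type, factor, b0 = parse_bst("bst 1 500.0 1.4530 1 cn n-ca")
# Stretching the n-ca bond by 0.05 Angstroms from equilibrium:
energy = bond_stretch_energy(factor, b0, b0 + 0.05)
print(bond_type, round(energy, 3))
```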
This binary file contains descriptions of several conformations of the
same molecule. You cannot edit this file. The stack is
automatically generated and saved in the course of Monte Carlo
or systematic search procedures.
Alternatively, the stack may be created directly by the store conf
command. To read or write a stack use:
read stack [s_StackName]
write stack [s_StackName]
The file describes legal types of drestraints
used to impose attraction or repulsion between atom pairs
(e.g. NOE distance restraints derived from NMR data).
This penalty term is called "cn".
The system icm.cnt file can be edited; however, user files (e.g. mydist.cnt)
of the same format can also be created and loaded with the
read drestraint type command.
The file contains:
# type weight lower upper sharpness
# 4 special types for S-S bonds
ssSS1 10.0 2.04 2.04 10.0 # Sharp well for S-S dist.
ssSS2 2.0 2.04 2.04 1.0 # Wide well for S-S dist.
ssSC 5.0 3.052 3.052 10.0 # S -Cb distance
ssCC 3.0 3.855 3.855 10.0 # Cb-Cb distance
global 1 1.0 0.0 3.0 # a global drestraint
global 2 1.0 2.0 4.0 # a global drestraint
local 12 1.0 2.5 2.8 1.0 # a local drestraint
Both local and global distance restraints
force two atoms to stay between the upper and lower boundaries.
However, the local restraints diminish at large distances (similar to van der
Waals interactions), whereas the global restraints grow bi-quadratically as
the deviation from the target distance range increases.
You can have and read several *.cnt files.
If the type numbers overlap, the previous types are redefined.
See also related commands:
read drestraint type, show drestraint type, set drestraint type, make drestraint type.
Contains a list of atom pairs for which interatomic distances should be
restrained according to the types defined in a separate icm.cnt file.
The .cn files are created by the user. The supplied icm.cn file
is just an example.
# ml1 re1 at1 ml2 re2 at2 cn_type
cn crn 1 val hg22 * 1 val ca 1
cn * 1 val hg23 * 1 val cg1 1
cn * 2 gln ca * 1 val hg11 1
cn * 2 gln ca * 1 val ha 1
Molecule name and residue number (e.g. 14, 25A, etc.) are normally used to
find an atom. An asterisk instead of the molecule name means that only the residue
number should be matched.
See also: read drestraint, show drestraint, set drestraint, make drestraint,
and (un)display drestraint.
This text file contains streams of POINT coordinates (i.e. a triple of
floats in one line, preceded by an integer reference number), LINE
descriptors (i.e. pair of integer numbers of recently described POINTS
in one line) and/or TRIPLES (or TRIANGLES, i.e. triples of integer numbers
of POINTS). The order of streams is arbitrary, provided that referenced
POINTS are already described. Either LINES or TRIPLES can be omitted.
Graphics objects can be read, written, displayed, or made from a 3D map.
1 1.00 -1.00 0.00
2 1.00 1.00 1.00
3 0.00 2.00 0.50
1 2
1 3
2 3
1 2 3
Check the content of other .gro files in your icm directory.
ICM also understands the Wavefront obj-format ( files *.obj ).
You can create your own dot, wire or solid graphics objects
either manually or automatically.
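To make the POINT / LINE / TRIPLE streams concrete, here is a minimal reader sketch. It is an illustration under stated assumptions, not the ICM parser: a line with four tokens is taken as a POINT, two integers as a LINE, and three integers as a TRIPLE. Real .gro files may contain additional header lines that this sketch ignores. The name `parse_gro` is hypothetical.

```python
def parse_gro(text):
    """Classify lines of a .gro-style stream by token count (sketch)."""
    points, lines, triangles = {}, [], []
    for raw in text.splitlines():
        tok = raw.split()
        if len(tok) == 4:
            # reference number followed by x, y, z
            points[int(tok[0])] = tuple(float(t) for t in tok[1:])
        elif len(tok) in (2, 3):
            refs = tuple(int(t) for t in tok)
            # referenced POINTS must already be described
            missing = [r for r in refs if r not in points]
            if missing:
                raise ValueError(f"undefined point(s): {missing}")
            (lines if len(tok) == 2 else triangles).append(refs)
    return points, lines, triangles


sample = """\
1 1.00 -1.00 0.00
2 1.00 1.00 1.00
3 0.00 2.00 0.50
1 2
1 3
2 3
1 2 3
"""
points, lines, triangles = parse_gro(sample)
print(len(points), lines, triangles)
```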
This file contains this manual.
A and B parameters for the A/r^12 - B/r^10
potential between HB donors and acceptors. See
Nemethy et al. for reference.
Example lines:
# i j B A E r0
#
hbt 2 4 8244.0 32897.0 0.550 2.190 * n-h...n
hbt 3 4 8244.0 32897.0 0.550 2.190 * o-h...n
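As a worked check of the 12-10 form, the sketch below evaluates A/r^12 - B/r^10 and locates its minimum analytically. Setting the derivative to zero gives r0^2 = 12A/(10B), and the well depth is -E(r0). Reading the E and r0 columns as the well depth and the position of the minimum is an interpretation, although the example numbers are consistent with it. The function names are illustrative.

```python
def hb_energy(a, b, r):
    """12-10 hydrogen-bond potential: A/r^12 - B/r^10."""
    return a / r**12 - b / r**10


def hb_minimum(a, b):
    """Return (r0, depth): dE/dr = 0 gives r0^2 = 12A / (10B)."""
    r0 = (1.2 * a / b) ** 0.5
    return r0, -hb_energy(a, b, r0)


# Values from the example line: B = 8244.0, A = 32897.0
r0, depth = hb_minimum(a=32897.0, b=8244.0)
print(round(r0, 3), round(depth, 3))
```

The computed minimum lies close to the tabulated r0 = 2.190 and E = 0.550, within rounding of the parameters.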
Parameters to calculate solvation energy based on atomic
solvent-accessible surfaces (see solvation term).
The file contains several sets (e.g.
Eisenberg and McLachlan (1986), Wesson and Eisenberg (1992) ),
although all but one of them are commented out.
Example lines:
rwater 1.4000 # water radius used to roll around the molecule
#
# 1 2 3 4 5
#
hdt 4 -0.0500 1.7000 0.0016 n+
- reference type number
- solvation energy density from vacuum-water transfer experiments for a given hydration type
- solvation energy density from octanol-water transfer experiments for a given hydration type
- radius used to calculate accessible surface
- comment
This file contains limits and memory requirements for ICM.
It will be searched for in the current directory ( ./ ) first and, if not found,
in the directory defined by the UNIX environmental variable $ICMHOME
or in the $HOME/.icm/config/ directory ( $USERPROFILE/.icm/config/ for Windows), if present.
You may edit the file and change the limits. The MnArrays parameter controls the sizes of three types
of arrays: rarray , iarray and sarray .
# ICM configuration file. Free format
# Mn stands for "Maximal Number of"
# Mx stands for "Maximal Size of"
BufferSpace 2097152 # ICM will not let you decrease BufferSpace less than 131072
MnResidueTypes 200
MnSequences 20000
MnAlignments 1500
MnProfiles 40
MnGrobs 200
MnArrays 600
MnTables 140
MnMaps 40
MnMacros 400
XTermFont *-fixed-medium-*-*-*-24-* # to set font in the terminal window
Xterm xwsh # default for SGI
# Xterm xterm # default for Linux and other UN*Xes
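For scripts that need to inspect these limits outside ICM, a free-format reader can be sketched as follows. This is an assumption-laden illustration, not ICM's own parser: it treats everything after `#` as a comment, the first token as the parameter name, and the rest of the line as its value. It would misread any value that itself contains `#`.

```python
def parse_icm_cfg(text):
    """Read 'Name value  # comment' lines into a dict (sketch)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        # numeric limits become ints; everything else stays a string
        settings[parts[0]] = int(value) if value.isdigit() else value
    return settings


sample = """\
# ICM configuration file. Free format
BufferSpace 2097152 # ICM will not let you decrease BufferSpace less than 131072
MnArrays 600
XTermFont *-fixed-medium-*-*-*-24-* # to set font in the terminal window
Xterm xwsh # default for SGI
# Xterm xterm # default for Linux and other UN*Xes
"""
cfg = parse_icm_cfg(sample)
print(cfg["MnArrays"], cfg["Xterm"])
```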
This file contains default color and font settings. The default icm.clr file
resides in the $ICMHOME directory.
The LIBRARY.clr variable defines the default path and name of the icm.clr
file.
Keep your own color and graphics controls file in the ~/.icm directory.
Example of ~/.icm/user_startup.icm file ( $USERPROFILE/user_startup.icm under Windows ):
LIBRARY.clr = Getenv("HOME")+"/.icm/icm.clr"
read color # load your custom settings
Modify the file if needed. The following lines are recognized (free format):
# CONFIGURABLE GRAPHICS.mode translation table
# Use keywords Left Mid Right, Shift Ctrl Alt Dbl, At
# TopNN LeftNN RightNN BottomNN, where NN is a percentage of the zone
# Modes 0,3,4,5,14,15 require a hit in 'At' = (atom | grob)
# otherwise control falls through to next best appropriate action
# Some modes have submode switches listed in parentheses ()
# Users are encouraged to modify bindings to their needs
# ---mode--combination------------- # equivalent GRAPHICS.mode preference
mode 0 Right-At # popup (in GUI only)
mode 1 (Shift)-Left # Rotation
mode 2 (Shift)-Mid # Translation
mode 3 (Ctrl)-Shift-Right-At # Label atoms
mode 4 (Ctrl)-Dbl-Right-At # Label residues
mode 5 (Shift)-Ctrl-Left-At # Change torsion angles
mode 6 (Shift)-Bottom5-Left # Rotation of the view
mode 7 (Shift)-Top5-Left # Z-axis rotation
mode 8 Left5-Mid # Zoom
mode 9 Alt-Mid # Move rear clipping plane
mode 10 Ctrl-Mid # Move front clipping plane
mode 11 Ctrl-Alt-Mid # Slab
mode 12 (Shift)-Right # Rectangular selection
mode 13 (Shift)-Ctrl-Left # Lasso selection
mode 14 (Shift)-Ctrl-Alt-Right-At # Connect to molecule
mode 15 Shift-Ctrl-Dbl-Right-At # Set alignment cursor
mode 16 Ctrl-Mid-At # Drag atoms
mode 17 Right5-Mid # Z-translate
#--------------{Colors}------------
# -----color------------RRGGBB--A_real_if_not_1
....
color lightgreen # 80ff80
color rita # ff1b00 0.3
color darkseagreen # 8fbc8f
...
#--------{ Atom/Grob/Font Colors }--------
atom c grey # c is the first character of chemical element.
background black
# color font size bold italic underline
atomFont rose times 12 0 0 0
varFont yellow symbol 12 0 0 0
residueFont green helvetica 18 0 0 0
grobFont green helvetica 18 0 0 0
stringFont green times 24 0 0 0
auxiliaryFont green symbol 28 0 0 0
fixedFont green courier 12 0 0 0
#
alphaRibbon red
piRibbon blue
threetenRibbon magenta
betaRibbon green
coilRibbon yellow
#- 0:127 rainbow colors (address them by number: color 15.5) -------
# ---------i-color----
rainbow 0 # 0000ff
rainbow 63 # ffffff
rainbow 127 # ff0000
....
This binary file contains a complete description of the electron density
map, compatible with the format devised by Phil Evans. Maps are stored
as a 3-dimensional array preceded by a header which contains all the
necessary information about the map. See "The CCP4 Suite" manual for
details.
This binary file contains a description of a set of geometric parameters
(free variables, usually torsions and overall rotation/translation
variables), participating in MC simulation, followed by a stream of
their values for each conformation accepted during the simulation,
together with the energy of each accepted conformation. Movies can
be created or appended during MC simulation runs, and then played in
any direction with optional smoothing, superimposition (to the initial
conformation) and/or centering. These files tend to be large; watch
them carefully and do not create them needlessly. See also:
display movie.
A binary, non-editable file describing one or several molecules forming
an ICM-object. In addition to the information available in a PDB-file it
contains a description of atomic charges, tree-like connectivity,
detailed atom codes, information about which internal coordinates are
constrained, references to energy parameters, secondary structure, etc.
The object can be read and written.
The main residue library, describing all "residues" and molecules which can constitute a legal ICM-object.
You can create your own entry either manually or using the write library command
and add the entry to the icm.res file.
You can also keep it in a separate file and append the file to the LIBRARY.res sarray
(i.e. LIBRARY.res=LIBRARY.res//"usr" followed by the read library
command).
An example of an entry for a pro residue:
# resName 1-ch Type AccSurf Eentropy LongName
nare pro P AMINO 150 0.0 proline
rem
rem _________atom_________ _dihedral_angles __bond_angles__ _bond_lengths
rem / \ / \ / \ / \
rem at na cd lwat qu gu na fe vuva ey na fe vuva ey na fe vuva ey qfm
# Fields:
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
#
atre 1 n 216 0 -0.285 0 psi + 180.000 2 an . 118.000 1 bn . 1.340 31
atre 2 ca 113 1 0.050 0 omgp + 180.000 24 aca . 121.000 1 bca . 1.465 1
atre 3 ha 1 2 0.040 0 fha . 116.800 0 aha . 110.200 1 bha . 1.090 0
atre 4 cb 112 2 -0.025 0 fcb . -120.850 0 acb . 103.700 37 bcb . 1.530 4
atre 5 hb1 1 4 0.015 0 fhb1 . 120.200 0 ahb1 . 111.600 1 bhb1 . 1.090 0
atre 6 hb2 1 4 0.015 0 fhb2 . -120.200 0 ahb2 . 111.600 1 bhb2 . 1.090 0
atre 7 cg 112 4 -0.050 0 xi1 . 27.400 21 acg . 103.700 1 bcg . 1.502 4
atre 8 hg1 1 7 0.025 0 fhg1 . 120.450 0 ahg1 . 111.200 1 bhg1 . 1.090 0
atre 9 hg2 1 7 0.025 0 fhg2 . -120.450 0 ahg2 . 111.200 1 bhg2 . 1.090 0
atre 10 cd 114 7 0.100 0 xi2 . -35.600 21 acd . 105.300 39 bcd . 1.501 4
atre 11 hd1 5 10 0.010 0 fhd1 . -119.730 0 ahd1 . 111.600 1 bhd1 . 1.090 0
atre 12 hd2 5 10 0.010 0 xi3 . -90.760 21 ahd2 . 111.600 1 bhd2 . 1.090 0
atre 13 c 121 2 0.455 1 phip . -68.800 19 ac . 112.300 3 bc . 1.520 3
atre 14 o 81 13 -0.385 1 fo . 180.000 0 ao . 120.500 1 bo . 1.230 5
# F 20
lwat 13
# F 21
exbo 1 10
Eentropy is the entropic contribution to the free energy for a fully accessible residue
divided by the solvent accessible surface
of this residue (in gly gly X gly gly environment) and multiplied by a factor of 1000.
Fields:
- relative atom number
- atom name
- main atom code (see the icm.cod file). It in turn refers to other codes such as the hydrogen bonding code, van der Waals code and hydration code.
- previous atom in the connectivity graph of the directed ICM-tree
- electric charge
- group of close charges (they should not be separated by distance cutoffs in interaction lists)
- torsion name
- fixation status (+ free variable, . fixed)
- torsion angle (degrees)
- torsion energy type (see the icm.tot file)
- bond angle name
- bond angle fixation status
- bond angle value (degrees)
- bond angle deformation energy type (see the icm.bbt file)
- bond length name
- bond fixation status
- bond length (Angstroms)
- bond stretching energy type (see the icm.bst file)
- (qfm) formal charge (may be +1, -1/3, -1/2 etc.), if any
- (lwat) exit atom of the residue
- (exbo) additional (non-tree) covalent bonds (atom1 atom2). Normal ICM-tree bonds form a regular directed graph without cycles, therefore all the remaining bonds should be declared separately.
See also: LIBRARY.res.
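The first nineteen fields map onto the tokens of an `atre` line one to one, which the following sketch demonstrates. The dictionary keys are hypothetical labels chosen for this illustration and are not ICM names. The sketch assumes whitespace-separated tokens and treats the nineteenth field (qfm) as optional, since the example lines omit it.

```python
# (label, converter) for fields 1-18; the labels are illustrative only
ATRE_FIELDS = [
    ("number", int), ("name", str), ("code", int), ("previous", int),
    ("charge", float), ("charge_group", int),
    ("torsion_name", str), ("torsion_fix", str),
    ("torsion", float), ("torsion_type", int),
    ("angle_name", str), ("angle_fix", str),
    ("angle", float), ("angle_type", int),
    ("bond_name", str), ("bond_fix", str),
    ("bond", float), ("bond_type", int),
]


def parse_atre(line):
    """Turn one 'atre' line into a dict keyed by field label (sketch)."""
    tokens = line.split()
    if tokens[0] != "atre" or len(tokens) < 19:
        raise ValueError("expected 'atre' followed by at least 18 fields")
    values = tokens[1:]
    atom = {label: conv(tok) for (label, conv), tok in zip(ATRE_FIELDS, values)}
    # Field 19 (qfm) may be a fraction such as -1/3, so it is kept as text.
    atom["formal_charge"] = values[18] if len(values) > 18 else None
    return atom


atom = parse_atre(
    "atre 13 c 121 2 0.455 1 phip . -68.800 19 ac . 112.300 3 bc . 1.520 3"
)
print(atom["name"], atom["previous"], atom["torsion_name"], atom["bond"])
```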
Text file containing either a subset or a complete set of internal coordinates
(variables). Usually created by the write vs_var command.
It may also be typed manually.
In contrast to an object file ( *.ob) which is a complete description
of an object, icm.var may contain any selection of variables.
These variables can be read and automatically assigned
to a molecule according to molecule name, residue number and variable name.
Examples:
rem Va fx Atom Residue Mol Obj VaType Symm Value
va psi + n 3 gln mol dl 2 1 180.00
va phi + c 3 gln mol dl 1 1 -60.00
va psi + n 4 met mol dl 2 1 120.00
va phi + c 4 met mol dl 1 1 180.00
This file defines generic attraction zones in internal coordinate space (usually
torsion space) in terms of residue name pattern (* for any
residue type), relative residue number, and variable names. After the
types are loaded, you may use them to assign specific vrestraints using the
set vrestraint command. Three example restraint types follow (the third one, marked with
'rse', will be used as a penalty term; the first two will be used
for BPMC steps):
------------------------------------ Two-variable vrestraint
rs aa -3.300 0.700 0.542
va ala* 1 phi -63.200 22.500
va * 2 psi -38.540 25.500
------------------------------------ One-variable vrestraint
rs vt -3.511 0.700 0.670
va val* 1 xi1 174.690 29.225
--------------------------------------
rse fmr -3.267 0.700 0.525
va phe* 1 xi1 -66.780 29.700
va * 1 xi2 98.680 75.100
Explanation of fields:
- aa - rs-name (not longer than 4 characters). Usually the
first character is a 1-char residue name, and the second is the zone
character (a - alpha, b-beta, g-gamma,d-delta,t-trans,p-plus 60, m-minus
60,n-null)
- -3.300 - well depth ( should be negative )
- 0.700 - flat fraction of the well
- 0.542 - occupancy (probability) of the well with respect
to other wells which could be assigned to the same set of variables.
- ala* - residue name pattern
- 1 - relative residue number
- phi - variable name
- -63.2 - center of the well for 1 phi
- 22.5 - size of the well (well is -63.2 +- 22.5 )
- * 2 psi - residue name pattern (* means any), relative
residue number (psi formally belongs to C atom of the next residue,
that is why the relative number is 2) and variable name
WARNING: remember that the outer borders of the two-dimensional
restraints are ellipses, rather than rectangles (the same for the multidimensional ones).
To create a sloped surface for all variables involved in the
restraint, the well sizes for these variables should be
greater than 180.*sqrt(number of angles in the rs) (255.0 for 2 angles).
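The threshold quoted above is simple arithmetic and can be checked directly. The sketch below is illustrative only, and the function name is hypothetical: it evaluates 180*sqrt(n) for a restraint with n angles.

```python
import math


def sloped_well_size(n_angles):
    """Smallest well size (degrees) above which the surface is sloped
    for all variables, per the rule 180 * sqrt(number of angles)."""
    return 180.0 * math.sqrt(n_angles)


for n in (1, 2, 3):
    print(n, round(sloped_well_size(n), 1))
```

For two angles this gives about 254.6 degrees, so the 255.0 quoted in the text is the same bound rounded up.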
This file has a format similar to that of the
icm.rst file. The differences are:
- rse and rs fields indicate whether the zone will be
used for energy or probability, respectively.
- .rs file contains specific residue numbers
rather than the relative ones.
- instead of the residue name pattern the file should contain molecule name or "*".
Example:
------------------------------------ Two-variable vrestraint
rs aa -3.300 0.700 0.542
va 1crn 1 phi -63.200 22.500
va 1crn 2 psi -38.540 25.500
------------------------------------ One-variable vrestraint
rse vt -3.511 0.700 0.670
va * 3 xi1 174.690 29.225
--------------------------------------
rs fmr -3.267 0.700 0.525
va 1crn 4 xi1 -66.780 29.700
va 1crn 4 xi2 98.680 75.100
The file shows an example of a multicolumn file which can be read with the
read column command. Arrays r, e, s and ent will be created.
# Entropies of several amino acids
#>-r---e------s--------ent---
arg 2.13 184.920211 11.5181
asn 0.81 99.516352 8.1393
asp 0.61 89.629773 6.8057
cys 1.14 85.947233 13.263
gln 2.02 129.68481 15.576
glu 1.65 119.957333 13.75
his 0.99 121.124577 8.173
ile 0.75 132.717054 5.651
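The `#>` header line names the columns, with dashes acting as filler between the names. A reader for this layout can be sketched as below. This is an illustration, not the read column command: it assumes column names are identifiers separated by dashes, that data rows are whitespace-separated, and that non-numeric cells (such as the residue names) stay as strings.

```python
import re


def read_columns(text):
    """Read a '#>'-headed multicolumn file into {name: [values]} (sketch)."""
    names, columns = [], {}
    for line in text.splitlines():
        if line.startswith("#>"):
            # "-r---e------s--------ent---" -> ["r", "e", "s", "ent"]
            names = re.findall(r"[A-Za-z_]\w*", line[2:])
            columns = {name: [] for name in names}
        elif line.strip() and not line.startswith("#"):
            for name, tok in zip(names, line.split()):
                try:
                    columns[name].append(float(tok))
                except ValueError:
                    columns[name].append(tok)  # keep text cells as strings
    return columns


sample = """\
# Entropies of several amino acids
#>-r---e------s--------ent---
arg 2.13 184.920211 11.5181
asn 0.81 99.516352 8.1393
"""
cols = read_columns(sample)
print(list(cols), cols["r"], cols["ent"])
```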
The file shows an example of a .tab file which can be read with the
read table command. It is similar to the previous
file, but additionally carries a header string and a table name.
Table t, consisting of the header string t.titl and
arrays t.r, t.e, t.s and t.ent, will
be created.
#>s t.titl
Entropies of several amino acids
#>T t
#>-r---e------s--------ent---
arg 2.13 184.920211 11.5181
asn 0.81 99.516352 8.1393
asp 0.61 89.629773 6.8057
The file contains torsion parameters according to Momany et al., 1975.
Parameters for type 21 for pro were taken from Venkatachalam et al.,
(1974), Macromolecules, 7, 212;
parameters for types 22-23 for cooh were taken from Karplus et al., J.Comp.Chem.,
(1983), 4, 187-217; DNA parameters are from Veal and Wilson, 1991.
We added extra terms and modified the original Momany et al. parameters
(psi and xi3 of Met). The format is free.
# +----symmetry-----+
# maxEner sign fold exact heavy Pseudo selChar phase
#
tot 0 0.00 0 0 1 1 1 - 0. # fixed dihedrals
tot 2 0.25 1 1 1 1 1 S 90. # psi
tot 44 0.50 1 1 1 1 1 S 90. # psi : removing the ECEPP alpha bias
tot 3 10.00 -1 2 1 1 1 S 0. # omg
tot 4 1.35 1 3 1 1 1 H 0. # xi CH2-CH2
tot 8 0.90 1 3 3 3 3 M 0. # nh3 term.group of lys,lysn
tot 14 0.00 0 2 1 1 2 H 0. # xi2 of his (+-90)
tot 5 1.00 1 3 1 1 1 H 0. # xi3 met
tot 5 1.17 -1 1 1 1 1 H 0. # xi3 met additional torsion
Torsion energy is calculated as: maxEner*(1 + sign*Cos(fold * torsion_angle))
Symmetry is a rotational symmetry in different situations:
Exact is the exact symmetry (implies presence of all atoms, including hydrogens),
Heavy implies presence of only the heavy atoms (no hydrogens)
but uniqueness of different atom types.
Pseudo implies that all heavy atoms are equivalent, and hydrogens
are ignored. The last character is a short reference name which can be
used in vs_var. For example: v_//M specifies all the torsions rotating
terminal hydrogen atoms with symmetry higher than 1, v_//H side-chain
torsions rotating heavy atoms, etc.
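The energy expression can be evaluated with a short sketch. Two assumptions are made here and are not stated in the text: that several `tot` rows sharing a type number (such as the two rows for type 5) are summed, and that the phase column can be ignored because the formula as given does not include it. The function name and tuple layout are illustrative.

```python
import math


def torsion_energy(terms, angle_deg):
    """Sum maxEner * (1 + sign * cos(fold * angle)) over all terms.

    terms: iterable of (maxEner, sign, fold) tuples taken from 'tot' rows.
    """
    angle = math.radians(angle_deg)
    return sum(m * (1.0 + s * math.cos(f * angle)) for m, s, f in terms)


omg = [(10.00, -1, 2)]                    # tot 3: omg
xi3_met = [(1.00, 1, 3), (1.17, -1, 1)]   # tot 5: both rows (assumed additive)

print(round(torsion_energy(omg, 180.0), 3))     # trans omega
print(round(torsion_energy(omg, 90.0), 3))      # top of the omega barrier
print(round(torsion_energy(xi3_met, 180.0), 3))
```

With sign = -1 and fold = 2 the omega term vanishes at 0 and 180 degrees and peaks at 2*maxEner for a perpendicular peptide bond.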
The file contains ECEPP/2 parameters for peptides (Momany et al., 1975,
Nemethy et al., 1983), parameters for DNA atoms (Veal and Wilson, 1991)
and other parameters (unpublished).
Example lines:
# type pzat n_el energy Deq Rvw Rvwel electroRadii
#
vwt 1 0.42 0.85 0.0370 2.92 1.200 1.200 * h aliphatic
vwt 2 0.42 0.85 0.0610 2.68 1.200 0.808 * h amide,amine
vwt 7 1.51 5.20 0.1400 3.74 1.700 1.700 * c carbonyl
vwt 39 0.00 0.00 0.55 5.911 2.631 2.631 * pseudo atom
Each line contains:
- type reference number (see icm.cod file)
- atomic polarizability * 10^24 (cm^3)
- effective number of electrons
- -e(kk), kcal/mol - depth of the energy minimum at the optimal
interatomic distance
- r(kk), Angstroms - equilibrium (optimal) distance between two atoms of the
same type
- van der Waals radius (used in graphics, electrostatics, etc.)
- electrostatic radii, used to calculate the geometrical surface boundary in boundary element method and MIMEL calculations
Normally van der Waals parameters A and B are calculated
from polarizability and effective number of electrons (fields 2 and 3).
However, if these two fields contain zeros, parameters A and B
are calculated directly from the energy depth and equilibrium distance
(e.g. for type 39).
Protein Data Bank formatted files consist of x,y,z coordinates,
occupancies, and B-factors.
Examples:
ATOM 1 N THR 1 17.047 14.099 3.625 1.00 13.79
ATOM 2 CA THR 1 16.967 12.784 4.338 1.00 10.80
ATOM 3 C THR 1 15.685 12.755 5.133 1.00 9.19
This file does not provide a complete and unambiguous description of
a molecular object. Therefore an object resulting from the
read pdb
command has a special type and needs conversion in order to
become a full-scale ICM-object for which energy calculations are
possible.
See also: convert and minimize tether commands.
Acceptable formats of sequence files:
FASTA and ICM format ( *.seq, the simplest and the most natural):
> Name1 comment1 comment2
AGFDSTREMNH-FQW
> Name2
RTPIYQWSCCVANMKL
PIR format: ( *.pir )
>P1;Azur_Pses4
Length: 80
AECSVDIQGN DQMQFSTNAI TVDKACKTFT VNLSHPGSLP KNVMGHNWVL TTAADMQGVV
TDGMAAGLDK NYVKDGDTRV*
//
>P2;Azur_Pses3
Length: 50
AECSVDIQGN DQMQFSTNAI TVDKACKTFT VNLSHPGSLP KNVMGHNWVL*
GCG format ( difficult to generate and impossible to edit because of the CheckSum):
Azur_Alcfa Length: 69 Check: 4484 ..
1 ACDVSIEGND AMQFNTKSIV VDKTCKEFTI NLKHTGKLPK AAMGHNVVVS
51 DGMKAGLNND YVKAGDERV
MSF format - obsolete multiple sequence format for alignments. Noneditable, contains CheckSums.
GB - GenBank format
Entries start with field names followed by a tabulation and the value.
NCBI allows saving entries in this format.
LOCUS (entry code)
..........(other fields) ...
ORIGIN ...(then the sequence)
1 tctaaataag ttttacacaa aataagttat ..
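Of the formats listed, the FASTA-style *.seq layout is the easiest to handle outside ICM. The following is a minimal reader sketch under simple assumptions: a line starting with `>` opens a new sequence, the first word after `>` is the name, further words are comments, and all following lines up to the next `>` belong to that sequence. It does not handle the PIR, GCG, MSF or GenBank layouts.

```python
def read_fasta(text):
    """Read FASTA-style text into {name: sequence} (sketch)."""
    sequences, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else "unnamed"  # placeholder name
            sequences[name] = ""
        elif name is not None:
            # sequence lines may be wrapped; spaces are dropped
            sequences[name] += line.replace(" ", "")
    return sequences


sample = """\
> Name1 comment1 comment2
AGFDSTREMNH-FQW
> Name2
RTPIYQWSCCVANMKL
"""
seqs = read_fasta(sample)
print(seqs)
```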
This file contains molecular names and sequences. A simple example with two peptides:
ml a
se gly ala ser pro tyr his
se phe trp tyr
ml b
se ala ala ser asn
A more advanced example with numbering, N- and C-termini and D-amino acids:
ml sub1
se 0 nter 1 gly 2 ala 2A Dglu 4 asp cooh
ml water
se 18 hoh
The ml field followed by a molecule name signals that a new molecule
is started. The se field indicates sequence lines (free format).
Residue names should correspond to entries in the icm.res
residue library. Residue numbers (if any) may be arbitrary or negative, and may
contain additional characters (e.g. 15A, 15B, etc.). Terminal modifiers
(nter, nh3+, cooh, coo-, conh, etc.) may be explicitly specified.
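The ml/se layout can be read with the sketch below. It relies on one assumption not stated in the text: a token starting with a digit or a minus sign is a residue number attached to the next residue name, and residue names never start that way. The function name is hypothetical.

```python
def parse_se(text):
    """Read ml/se lines into {molecule: [(number_or_None, residue), ...]}."""
    molecules, current = {}, None
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0] == "ml":
            current = molecules.setdefault(tokens[1], [])
        elif tokens[0] == "se" and current is not None:
            number = None
            for tok in tokens[1:]:
                if tok[0].isdigit() or tok[0] == "-":
                    number = tok  # e.g. "2A" or "-3": belongs to next residue
                else:
                    current.append((number, tok))
                    number = None
    return molecules


sample = """\
ml sub1
se 0 nter 1 gly 2 ala 2A Dglu 4 asp cooh
ml water
se 18 hoh
"""
mols = parse_se(sample)
print(mols["sub1"])
```

Several `se` lines under the same `ml` line are appended in order, so a long sequence can be wrapped as in the two-peptide example.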
ICM-format for sequence alignments.
The consensus string contains the following symbols:
symbol | description
---|---
space | gap in at least one of the sequences
character | this amino acid is conserved in all the sequences
+ | positively charged amino acids ( R,K )
- | negatively charged amino acids ( D,E )
^ | small amino acids ( A,S,G,S )
% | aromatic residues ( F,Y,W )
# | hydrophobic amino acids ( F,I,L,M,P,V,W )
~ | polar amino acids ( C,D,E,G,H,N,Q,S,T,Y )
dot | the rest (no consensus, no gap)
The file looks like this:
# comments
# Consensus: .C~.~I.^ND.MQ.~.K~#.V~K~CK~FT#~LKH.GK#.K..MG
Azur_Alcde MLAKATLAIVLSAASLPVLAAQCEATIESNDAMQYNLKEMVVDKSCKQFTVHLKHVGKMAKVAMG
Azur_Alcfa ---------------------ACDVSIEGNDSMQFNTKSIVVDKTCKEFTINLKHTGKLPKAAMG
Azur_Alcsp --------------------AECSVDIAGNDQMQFDKKEITVSKSCKQFTVNLKHPGKLAKNVMG
# Consensus: FCSFPGH#^#MKG.#
Azur_Alcde FCSFPGHWAMMKGTLKLSN
Azur_Alcfa FCSFPGHWSIMKGTIELGS
Azur_Alcsp FCSFPGHFALMKGVL----
Residues can be colored by consensus with the
color alignment rs_ command.
A file containing several ICM-shell objects divided by the following separators:
#> type1 ICM-shell-object-name1
.... obj1.....
.... obj1.....
..............
#> type2 ICM-shell-object-name2
.... obj2.....
.... obj2.....
..............
etc.
Legal separators:
A sample file a.all containing an integer, real, logical, real array,
a pdb-file and a table. You can read all this from a file or
simply mark the lines and paste them into your ICM-session after the command:
read all unix cat followed by Ctrl-D.
#>i numberOffset
0
#>r lineWidth
1.00
#>l logo
yes
#>R boxx
0. 0. 1. 1.
#>brk
ATOM 1 n leu m 1 2.602 -12.770 -6.750 1.00 20.00
ATOM 2 ca leu m 1 2.423 -11.442 -7.311 1.00 20.00
ATOM 3 cb leu m 1 0.947 -11.187 -7.625 1.00 20.00
ATOM 4 cg leu m 1 0.758 -11.068 -9.138 1.00 20.00
ATOM 5 cd1 leu m 1 1.487 -9.824 -9.649 1.00 20.00
ATOM 6 cd2 leu m 1 1.335 -12.309 -9.822 1.00 20.00
#>s tt.h
this is a header string of table tt. The arrays follow.
#>i tt.i
15
#>T tt
#> a b c d
1 2. bla 13
3 5. bli 13
A triangular matrix with relative residue exchange frequencies (see actual
file). The amino acid character line serves as a ruler. Use your favorite
comparison matrix.
A profile table contains residue preferences for each residue type in
each sequence position. The preferences may be derived from a multiple
sequence alignment or from three-dimensional structure.
Examples:
Cons A B C D E F etc. Z Gap Len ..
C 35 -32 143 -42 -52 -12 etc. -62 100 100
P 55 17 6 17 17 -71 etc. 26 100
File looking like this (free-format):
# everything which is not a number will be skipped
1 2 4
9
numbers may be in a row, or column or be in an arbitrary order.
-14 9
Any text file. Each line will be a separate element of a
string array.
File looking like this:
1 0 0
0 1 0
0 1 1
or like this:
# my matrix
0. 1. 1. 2.2
1. 0. blu 1. 2. # text will be skipped
# 1. 1. 0. 3. this line is commented out
In the latter case the result of the
read matrix
command is a matrix of two rows {0. 1. 1. 2.2 } and {1. 0. 1. 2.} . Lines
can be commented out with the # sign. All the fields which do not look like
numbers are skipped. If your matrix is symmetric, you may specify only
the upper left or the lower right triangle like this:
1.
1. 2
1. 2 -1.
1. 2 3. 5.
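The comment and skipping rules described above can be mimicked in a few lines. This sketch is not the read matrix command: it assumes everything after a `#` is a comment, drops any token that does not parse as a number, and ignores lines left empty. It does not expand a triangle into a full symmetric matrix.

```python
def read_matrix(text):
    """Collect numeric tokens row by row, skipping comments and text."""
    rows = []
    for raw in text.splitlines():
        row = []
        for tok in raw.split("#", 1)[0].split():
            try:
                row.append(float(tok))
            except ValueError:
                pass  # fields that do not look like numbers are skipped
        if row:
            rows.append(row)
    return rows


sample = """\
# my matrix
0. 1. 1. 2.2
1. 0. blu 1. 2. # text will be skipped
# 1. 1. 0. 3. this line is commented out
"""
matrix = read_matrix(sample)
print(matrix)
```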
File may contain arbitrarily mixed numbers and strings. Strings will be
skipped and numbers will form an array. A hash sign # at the beginning of
a line comments this line out.
Examples:
# 1.2
1.4
1.8
rem 2.2
This file will lead to the {1.4 1.8 2.2} array.