Copyright © 2020, Molsoft LLC Jun 5 2024
Let us imagine that we have decided to compare two structures deposited in the PDB. We read both entries into the ICM shell and define the following levels of organization: each entry forms an object, each object contains one or several molecules, protein molecules contain amino acid residues, and residues consist of atoms. In the superimpose command we then need to specify, or select, the molecules, residues or atoms which should be superimposed.

The ICM shell language has a flexible way of selecting subsets of atoms, amino-acid residues, molecules and objects, as well as torsion angles and other internal geometrical parameters of molecules. Most of the ICM commands and functions dealing with molecules (for example display, delete, minimize) operate on an arbitrary selection.

What does a selection look like? For example, the selection a_2./2:14/c* selects carbon atoms of residues 2 to 14 of the second object. The general syntax of a selection is the following:

    prefix _ [ object(s) . ] molecule(s) / residue(s) / atom(s) or variable(s)

The object section including the dot (e.g. 1crn. ) may be omitted. In this case the selection is performed in the current object. There can be as many as five sections, separated by _ . / and / .

Examples:

    a_2ins.a,b/lys,arg/ca,cb,n*   # atom selection, '*' - any string
    a_2ins.a,b/2:10/n,ca,c        # atom selection
    v_crn./lys,arg/phi,PSI        # variable selection

(Note the use of the PSI torsion in the last example.)

Storing selections in named variables. Selections can be assigned to a variable (e.g. x = a_//c* ) and can be combined in an expression by logical and ( & ) or logical or ( | ), e.g. ( a_//n* & a_//ca ).
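Three prefix types: a_ , v_ and V_ . The prefix defines one of the three selection types: a_ selects objects, molecules, residues or atoms; v_ selects free internal variables; V_ selects both free and fixed internal variables.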
A selection can also be assigned to a named variable. Example:

    aa = a_//ca,c,n   # the backbone
    show aa

The object and molecule sections are separated by a period; all other sections are separated by slashes. Inside each section, arguments in a list are separated by commas ( , ), while ranges are written with a colon ( from:to ).
There are four principal levels of selection: object, molecule, residue, and atom or variable. The level is defined by the "lowest" section explicitly specified in a selection (e.g. a_1.1/2:4 is a residue-level selection, while a_//ca is an atom selection). These selections are referred to as os_ , ms_ , rs_ , and as_ or vs_ , respectively. If the selection level is not important, or the level is the lowest one (atoms or variables), selections are referred to as as_ or vs_ .

The selection level of interactive graphics selections is controlled by the GRAPHICS.selectionLevel preference. To change it from the command line, assign an appropriate level to this variable, e.g. GRAPHICS.selectionLevel="atom" . The selection level can also be changed from the GUI.
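The rule that the lowest explicitly specified section sets the level can be modeled outside ICM. The Python sketch below is only an illustration of that rule under simplifying assumptions: it is not ICM's parser, the function names `split_sections` and `selection_level` are hypothetical, and it only understands the prefix, period and slash separators and double-quoted patterns described in the text.

```python
# Illustrative model only: NOT ICM's parser, just the level rule from the text.
LEVELS = ("object", "molecule", "residue", "atom")


def split_sections(selection):
    """Return (prefix, [object, molecule, residue, atom_or_variable])."""
    prefix, body = selection[:2], selection[2:]
    if prefix not in ("a_", "v_", "V_"):
        raise ValueError("selection must start with a_, v_ or V_")
    obj = ""
    rest = [""]            # molecule, residue, atom/variable sections
    seen_period = False
    in_quotes = False
    for ch in body:
        if ch == '"':
            in_quotes = not in_quotes      # separators inside quotes are literal
            rest[-1] += ch
        elif ch == "." and not in_quotes and not seen_period and len(rest) == 1:
            # text before the first period is the object section
            obj, rest, seen_period = rest[0], [""], True
        elif ch == "/" and not in_quotes:
            rest.append("")                # start the next section
        else:
            rest[-1] += ch
    rest += [""] * (3 - len(rest))         # pad missing trailing sections
    return prefix, [obj] + rest[:3]


def selection_level(selection):
    """Name the level set by the lowest non-empty section."""
    prefix, sections = split_sections(selection)
    lowest = 0                             # a bare prefix addresses the current object
    for index, text in enumerate(sections):
        if text:
            lowest = index
    level = LEVELS[lowest]
    if level == "atom" and prefix in ("v_", "V_"):
        level = "variable"
    return level


if __name__ == "__main__":
    for sel in ("a_1,3.", "a_3.mol1", "a_1.1/2:4", "a_//ca", "v_//phi,psi"):
        print(sel, "->", selection_level(sel))
```

The sketch treats a bare a_ as an object-level selection of the current object, which matches the way the text uses a_ on its own.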
Examples of different selection levels (note that object and molecule names are arbitrary):

    a_1,3.  a_mod*.  a_*.  a_"*benz?n*".       # object selections
    a_3.mol1  a_zinc  a_$molNum  a_*.*         # molecule selections
    a_/3:29,as?,ala  a_/*  a_*./"VHC?[!W]A"    # residue selections
    a_//h?,c*  a_//T  v_//phi,psi              # atom or variable selections

For example, a_1,3. is an object selection, and a_/ala is a residue selection.

Each section may contain a negation symbol ! at the beginning. It selects everything except the specified items. You can only use the negation symbol in the first position of a section, and the negation always applies to the whole section. For example, a_/!ala,gly is right, while a_/ala,!gly is wrong.

If the object section together with the separating period is skipped, the selection addresses the current object rather than all objects.
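The negation rule can be made concrete with a small check. This is a Python sketch rather than ICM code: `parse_section` is a hypothetical helper, and it assumes a single unquoted section (the text between two separators) as input.

```python
def parse_section(section):
    """Split one unquoted selection section into (negated, items).

    A leading '!' negates the whole section; a '!' in front of a later
    item is rejected, mirroring the rule that a_/!ala,gly is right
    while a_/ala,!gly is wrong.
    """
    negated = section.startswith("!")
    body = section[1:] if negated else section
    items = body.split(",")
    if any(item.startswith("!") for item in items):
        raise ValueError("negation is only allowed at the start of a section")
    return negated, items


if __name__ == "__main__":
    print(parse_section("!ala,gly"))
    print(parse_section("3:29,as?,ala"))
```

Quoted patterns are left out on purpose, because a ! inside a bracket expression such as [!W] is part of the pattern and not a section negation.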
Matching. Objects, molecules, residues, atoms and variables may be referred to by their names. Objects and molecules can additionally be referred to by their sequential numbers (e.g. a_1.2 ). To select by a numerical name, use a backslash before the name, e.g. a_\123 . Metacharacters such as * ? [] can also be used for pattern matching (e.g. v_//?vt* ).

Full syntax. A complete description of selection syntax for each level is as follows:
( a_ obj. or just a_ for the current object ):
Example:

    read object s_icmhome+"all"
    show a_                     # the current object
    show a_1,2:3.
    show a_s1?.
    show a_"*Th[iy]o*".//!h*
    # here we select by comment
    set comment a_ "tag1 tag2 tag3, description"
    show a_"*tag2*".
    show a_"*tag2?tag3*".       # use ? for space
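The patterns in the example above use the metacharacters * , ? and [] . Python's standard fnmatch module happens to use the same three metacharacters, so it can serve as a rough way to predict which names or comments a pattern would catch. This is an assumption made for illustration: whether ICM's matcher agrees with fnmatch in every corner case (case sensitivity, for example) is not stated here, and the `matches` helper is hypothetical.

```python
from fnmatch import fnmatchcase


def matches(text, pattern):
    """Case-sensitive glob match: * any string, ? any one character, [] a set."""
    return fnmatchcase(text, pattern)


if __name__ == "__main__":
    comment = "tag1 tag2 tag3, description"
    print(matches(comment, "*tag2*"))        # pattern with * on both sides
    print(matches(comment, "*tag2?tag3*"))   # ? stands in for the space
    print(matches("asn", "as?"), matches("ala", "as?"))
```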
by name: a_s_name , e.g. a_m2 or a_1.m2 in the current ( a_ ) or the first ( a_1 ) object, respectively. (Note that there is no dot at the end.) If the name starts with a digit or is one of the reserved one-letter types (see below), add a backslash before the name, e.g. a_\123 , a_\A .

by pattern: a_s_namePattern ( a_w* - all water molecules in the current object)

by number(s): a_number ( a_2 , a_3.2,4,7 ) - relative number of molecule(s)

by range(s): a_num1:num2 ( a_2:5 , a_2:5,10:12 ) - number range

by chemical formula (F): a_Fformula1,Fformula2.. The chemical formula must be the same as the one returned by the ICM String( ms_ ) function without hydrogens, e.g.

    read pdb "1abe"
    show a_FC5O5          # selects 2 arabinose molecules
    String( a_2//!h* )
    C5O5

by special symbol for types of molecules: a_specialSymbol[,specialSymbol2..]
Note that if a molecule name coincides with any of the above characters ( i.e. "ACHLMNQRSTUW" ), ICM gives preference to the type selection. To select by molecule name, use a backslash (e.g. a_1.\A for a chain named "A" ).

Examples:

    nice "1dnk"          # one peptide, two dna chains and other mols
    a_A                  # the peptide
    a_N                  # the two DNA chains
    a_A,N                # the peptide and the DNA chains
    rename a_1 "A"
    a_\A                 # chain NAMED "A"
    read pdb "2ins"
    delete a_W
    read pdb "1e8s"
    show a_TRACE         # shows Ca-trace molecules of two proteins and one RNA

Some special cases:

    a_*      # all molecules in the current object
    a_a      # molecule 'a' in the current object
    a_.a     # molecules 'a' in all objects
    a_*.a    # the same as a_.a

Selecting water molecules from PDB files by their 'residue-field' number. Water molecules in PDB files are numbered, and the numbers are stored in the residue field. For consistency, we convert these numbers into residue numbers. At the same time, the names of water molecules are built sequentially, like this: w1,w2,w3 . This way one can use both sequential numbering via molecule names and PDB-file numbering via residue numbers.

    read pdb "1sri"
    show a_w12:w15       # by molecule name, sequential numbering
    show a_w*/719:721    # by original pdb number

Converting any selection to molecules with the Mol function. A selection of any level (atoms, residues or objects) can be converted to molecules with the Mol ( selection ) function. Example:

    Mol(Sphere(a_zinc a_1,2 8.))   # Sphere returns atoms
With respect to objects and molecules there are the following possibilities:
Residue field specifications (for all molecules in the current object).

by name: a_/resName ( e.g. a_/his , or a_/\001 ; here we had to start with a backslash because the residue name looked like a number)

by residue name pattern: a_/resNamePattern ( e.g. a_/as? - asn or asp). A useful tip for DNA or RNA selections: quite often bases are modified. To select A, T, G, C, U and their modifications, use a_/??a or a_/??t or a_/??g or a_/??c or a_/??u , respectively.

by residue number(s): a_/numChar ( a_/3 or a_/15A ) - a PDB residue number may contain additional characters.

by residue range(s): a_/numChar1:numChar2 ( a_/4:15,20:25 ) - reference residue number range

by amino acid sequence pattern: a_/"seqPattern" ( a_/"G?GTE" ) - selects the fragment with the matching amino acid sequence. Example selecting all residues preceding prolines (the first expression selects dipeptides with proline in the second position, the second one excludes prolines):

    show a_/"?P" & a_/!pro*

by string and integer shell variables: use the dollar substitution, e.g.

    build string "ASDF"
    i=2; j=3;
    a_/$i:$j
    s = "12:13,15:19"
    a_/$s/c*

Notice that value substitution for integer and string shell variables without the leading dollar symbol is obsolete.
by special symbols and expressions:

a_/B - barcode residues, see Pattern( rs ). E.g. a_1.2/BL2LL . The gap length is calculated from the residue labels; see also the Q selection.

a_/C resConservationCode - selects residues by consensus letter, see below.

a_/DR - displayed residues in the ribbon representation only

a_/DL - displayed residues with residue labels

residues identical to their homology target residues

a_/Q - barcode residues, see the B selection above, e.g. a_1.2/QL2LL . The gap length is calculated from the order of actual residues; the labels are ignored.

by secondary structure: a_/S sec_struct_chars - residues with certain secondary structure (e.g. a_/SH - only helices; a_/SEH - sheets and helices; a_/S_ - only coil)

a_/U - unknown residues not described in the ICM residue library
by functional features: a_/F[SiteChars] or a_/F"siteID" or a_/Flocal SITE.labelStyle - residue selection by the one-letter site type or the site ID, respectively. The letter F refers to the word feature, as in the FT (feature table) field of Swissprot entries. The types along with their one-letter codes are listed in the glossary site entry. The default string, used by the a_/F selection, is defined by the SITE.defSelect string (you may redefine it); it defines important local features such as binding sites, as opposed to domain-type sites such as signal peptides, zinc fingers and other protein domains. The PDB entries do not comply with the standard SWISSPROT site definitions, such as ACT_SITE, BINDING etc., and are assigned the user type F (selection a_/FF ). Example:

    nice "1as6"
    show site
    color ribbon a_/F magenta
    show a_/FF
    show a_/F"cu3"        # select only site named cu3
    show a_/F"MUTAGEN"    # sites so defined in Swissprot
    set site a_1.1 "FT SITE 15 15 My favorite residue" label=2
    show a_/F2            # select by site label display style number

Converting selections to residue level: the Res ( selection ) function will convert any selection of higher or lower level to the residue level. Example:

    a_/SH & a_/pro            # a proline in a helix
    Res(Sphere(a_/pro 2.))    # expand to the neighboring residues
( a_//atoms ):

by name: a_//name ( a_.//ca ; ca is the usual name for the alpha carbon )

by name pattern: a_//namePattern ( a_.//c* for all carbons )

by special symbols and expressions:

alternative atom positions in X-ray structures

a_//A alterCharacter - select alternative positions of the specified type (e.g. read pdb "1cbn" ; show a_//Ab ). See also the set comment "A" as_ command. This selection breaks down if an alternative has the character of one of the elements Ac, Ag, Al, Am, Ar, As, At, Au. A newer (superior) form of this selection is a_//:char1char2.. , e.g. a_//c*:ab

a_//A will select all atoms marked as alternatives (both main and secondary alternatives). This selection, in contrast to the explicit one ( e.g. a_//:c ), will also select the unmarked alternatives, which are recognized as residues with the first coordinate less than 0.2 A away from the same atom of the previous residue.

a_//AS will select only the Secondary alternatives (e.g. color magenta a_//AS ). If you delete the a_//Aa atoms, then a_//Ab becomes the main alternative and the other ones become secondary. If you want to delete the primary, do not forget to clear the alternative flag with set comment as " " . The AS selection will also recognize the residues in the PDB file that are not marked by the alter character (see the a_//A description above). E.g.

    delete a_//AS                # delete secondary alternatives, do not need to clear
    # delete a_//A & a_//!AS     # delete primary alternatives
    set comment a_//A " "        # clear the flag
    convert
a_//MatomMmffCodeNum[:atomCodeMmffNum2] - by mmff code, e.g. a_1.//M3,M10:15 . The atom types are described in the icm.cod file.
Special named selections:

graphically selected atoms: as_graph - the as_graph selection contains graphically selected objects, molecules, residues, or atoms. The level of selection depends on the GRAPHICS.selectionLevel preference. The level can be changed from the GUI or from the command line.

strained atoms (atoms with high energy gradient): a_//G - strained atoms (gradient vector longer than selectMinGrad). You can also use the display gradient command. Example:

    build string "his trp trp"
    display
    randomize v_//phi,psi
    selectMinGrad = 100.
    show energy
    display a_//G ball
    display gradient

hydrophobic atoms: a_//H

hydrogen bonding donors and acceptors:

a_//HA - hbond acceptors, including atoms of the following ICM types: (50:90,201,205:207,213,214,216,217,220:223,225,228:230,234:236,239:241,246,255,281:295)

a_//HD - hbond donors.

a_//E - donors and acceptors combined (includes non-aliphatic hydrogens and atoms of the following ICM types: (50:90,201,205:207,213,214,216,217,220:223,225,228:230,234:236,239:241,246,255,281:295))

a_//I - donors and acceptors of the a_//E selection that are buried. This selection requires that the show area command is used beforehand.
residue label atoms (one atom per residue at which the residue label is displayed) a_//L
aromatic atoms: a_//R - selects heavy atoms connected by aromatic bonds and the hydrogens attached to them. Example:

    build string "HWYP"
    display skin
    color skin a_//R magenta

ring atom selection: a_//R[RA3456789] - selects atoms in rings.

    build string "HWYP"
    display xstick a_
    color a_//RA & a_//R6 green       # 6 member aromatic rings
    color a_//RA & a_//R5 yellow      # 5 member aromatic rings
    color ! a_//RA & a_//R5 magenta   # 5 member non-aromatic rings
See also: V_//FC to select chiral phase angles.
The position of each atom branch is determined by the positions of the preceding atoms and three parameters: dihedral angle, planar angle and bond length. The dihedral angle for the main branch atom is the torsion angle itself, while for the secondary branch atoms the dihedral angle consists of the torsion angle plus the phase angle. The default fixation is given in the ICM residue library and can be changed by the fix and unfix commands. Individual free variables can be rotated interactively with Ctrl-LeftMB-Atom-Click and drag.

A v_ selection can also be assigned to a named variable. Example:

    aa = v_//phi,psi   # the backbone torsions
    unfix only aa
    unfix only v_/10:15/phi,psi

V_ : selecting among all internal coordinates. Finally, the V_ selection selects both free and fixed variables in molecular objects of ICM type. You always need this type of selection in the unfix command. It makes no sense to unfix variables which are free already.

Here is a list of variable selection specifications:

by name: v_//name ( v_//phi )

by name pattern: v_//namePattern ( v_//x* ); use an asterisk * for any string and a question mark ? for any character. Example: v_//?vt* selects the 6 "virtual" variables defining rigid body rotation and translation.

torsion variables: v_//TtorsionCodeNum[:torsionCodeNum2] - select by torsion angle code as described in the icm.tot file, e.g. v_//T11 selects the amide group torsion angle; v_//T10:15 selects torsion codes from 10 to 15

angles (planar angle variables): v_//AangleCodeNum[:angleCodeNum2] - select by planar angle code as described in the icm.bbt file.

bond length variables: v_//BbondCodeNum[:bondCodeNum2] - select by bond length code as described in the icm.bst file.

displayed variable labels: v_//DL - selects variables with displayed variable labels
Psi torsions not shifted to the next residue:

    vPhi = v_/3/phi
    vPsi = v_/3/PSI
    # BUT !!!
    vPhi = v_//phi* & a_/3
    vPsi = v_//PSI & Next( a_/3 )

methyl group torsions: v_//M - torsion angles rotating Methyl-type terminal hydrogens (excluding polar hydrogens)

polar hydrogen torsions: v_//P - torsion angles rotating Polar hydrogens (e.g. hydroxyl group)

essential (non-hydrogen) torsions: v_//H - side chain torsion angles rotating "Heavy" atoms

standard set of free torsions (excludes rings): v_//S - all "Standard" free torsion angles as defined in the icm.tot file. Note that v_//M , v_//P and v_//H do not overlap; they are mutually exclusive. v_//S contains v_//M , v_//P and v_//H as well as other standard torsion angles. It does not include the positional ones; use V_//SV or V_//S,V to add them (e.g. unfix only V_//SV ).

phase angles: v_//F - select all phase angles (usually they are fixed, so use V_//F ); V_//FC - select phase angles related to the chiral centers (see set chiral and montecarlo chiral )

all torsion angles: v_//T - select all free torsion angles; V_//T for all torsion angles including the fixed ones. v_//TiType - select free torsion angles of the specified type; V_//T for all torsion angles including the fixed ones.

positional variables
v_//V - select all six positional variables, a 6-pack for each molecule or its part (see convert rs_ ); use v_//VV to select the nine positional variables belonging to the first three atoms (three of them are not really necessary, since only six variables are independent).
E.g. v_//V , v_2//V , v_//VV
Substituting ICM-shell variables into a selection. You can insert the value of an integer or string ICM-shell variable anywhere inside your selection by using a $ (dollar sign) prefix. (Note that this is a general ICM-shell substitution mechanism.) Example:

    selstr="!w*/14:19"    # a string constant
    display a_$selstr

Logical operations. You can also assign a selection to a variable (e.g. backbone=a_//ca,c,n ) and combine several selections using logical operators (example: show a_/3:6 & backbone ).
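The dollar substitution is plain text replacement performed before the selection is interpreted, which a few lines of Python can imitate. This is a sketch under stated assumptions, not ICM's implementation: the `substitute` helper and the `variables` dictionary are hypothetical, and only simple names (no array indexing such as $rg[i]) are handled.

```python
import re


def substitute(selection, variables):
    """Replace every $name in a selection string with the value of that variable."""
    def lookup(match):
        # str() lets integer variables be spliced in the same way as strings
        return str(variables[match.group(1)])
    return re.sub(r"\$([A-Za-z_]\w*)", lookup, selection)


if __name__ == "__main__":
    print(substitute("a_$selstr", {"selstr": "!w*/14:19"}))
    print(substitute("a_/$i:$j", {"i": 2, "j": 3}))
```

Because the substitution is textual, a string variable may carry several sections at once, as selstr does with its molecule and residue parts.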
To identify contiguous ranges of residues in a residue selection, use the String ( rs_ ) function, which converts your selection into a string expression suitable for entering into an ICM shell. For example, to find all prolines surrounded by two other helical residues (a helical proline plus the next and previous residues), we might do the following:

    read pdb "1dkf"
    rrange = String( a_/"?P?" )   # the result would look like "a_a.b/5:7,30:32"
    rg = Split(rrange,"/,|")      # split into sarray with {"a_a.b","5:7","30:32"}
                                  # bar (|) helps with multiple chains
    okrg={""}
    k=0                           # counter for good residue triplets with HHH and ?P?
    for i=2,Nof(rg)
      if Nof(Split(rg[i],":")) != 2 continue     # ignore molecular names
      if Sstructure( a_/$rg[i] ) == "HHH" then   # compare with ss-pattern
        k = k+1
        okrg[k] = rg[i]
      endif
    endfor
    # now ok-ranges are stored in okrg string array e.g. {"5:7"}
    # to use them Sum(okrg,",")
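The string handling in the loop above (split on / , and | , then keep only the from:to tokens) can be checked in isolation. The Python sketch below mirrors just that step: `residue_ranges` is a hypothetical helper, the input string is the illustrative value from the comment in the script, and the secondary-structure test done by Sstructure is deliberately left out because it needs ICM.

```python
import re


def residue_ranges(selection_string):
    """Return the from:to tokens found in a String( rs_ )-style result."""
    tokens = re.split(r"[/,|]", selection_string)
    ranges = []
    for token in tokens[1:]:              # the ICM loop also starts at the second item
        if len(token.split(":")) != 2:    # skip molecule names such as a_b
            continue
        ranges.append(token)              # keep as text: PDB numbers may carry letters
    return ranges


if __name__ == "__main__":
    found = residue_ranges("a_a.b/5:7,30:32")
    print(found)
    print(",".join(found))                # plays the role of Sum(okrg,",")
```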
Copyright© 1989-2024, Molsoft,LLC - All Rights Reserved. This document contains proprietary and confidential information of Molsoft, LLC. The content of this document may not be disclosed to third parties, copied or duplicated in any form, in whole or in part, without the prior written permission from Molsoft, LLC.