ICM Language Reference
Hierarchical cluster trees

[ Tree representatives ]

The records, or rows, of any table can be clustered into a hierarchical tree, and one or several trees associated with this table can be stored with it, displayed and edited in the ICM GUI, and deleted.

A tree is created with the make tree command. You can choose (1) the tree type and (2) the distance function between two table rows, and set a number of other arguments. A tree object is then added to the header of the table and stored together with the table. The table gets a new column with the tree order and, optionally, two new elements: a column with the branch number at a certain level (option split) and the distance matrix (option matrix).
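To make the terms above concrete, here is a minimal sketch in Python (not ICM) of what building a hierarchical tree over table rows involves. Euclidean distance and average linkage are assumptions chosen for illustration, and the names `pairwise_distances` and `build_tree` are hypothetical; the algorithm actually used by make tree may differ.

```python
import math

def pairwise_distances(rows):
    """Euclidean distance between every pair of rows (assumed metric)."""
    return [[math.dist(p, q) for q in rows] for p in rows]

def build_tree(dist):
    """Average-linkage agglomerative clustering (assumed linkage).

    Returns (merges, order): merges is a list of (members_a, members_b, height)
    and order is the left-to-right leaf order of the finished tree.
    """
    clusters = [[i] for i in range(len(dist))]   # each row starts as its own leaf
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # average distance over all cross-cluster pairs of rows
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]       # concatenation defines the leaf order
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges, clusters[0]

# Five rows with three numeric columns (0-based row indices in this sketch)
rows = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0.1, 0.1)]
merges, order = build_tree(pairwise_distances(rows))
root_height = merges[-1][2]                      # merge distance at the root node
```

Here `order` plays the role of the tree-order column and `root_height` the role of the root-node distance; cutting the list of merges at a given height would give the branch numbers that the split option stores.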

The related commands and functions:
make tree : create tree object and attach it to the table
Split function : split cluster by threshold or number of clusters
split command : change the position of tree cursor (separator) and recalculate new cluster numbers
Name( table.cluster i_tree [index,label,matrix,sort,split] ) : names of important table columns
Max( table.cluster ) : the distance of the root node
Distance of the cluster splitting level
Nof( table.cluster tree ) : clusters
Centers of clusters

Example:


# create five points (the rows of m) and compute their distance matrix
m=Matrix(5,3)
m[2,1:3]={1. 0. 0.}
m[3,1:3]={1. 1. 0.}
m[4,1:3]={1. 1. 1.}
m[5,1:3]={1. 0.1 0.1}
D = Distance( m )

# create a table and move distance matrix into header
group table t { "a" "b" "c" "d" "e" } "label" {1. 2. 2. 1. 4. } "val"
group table t append header D "dm"
make tree t distance = "dm"       # uses external distance matrix for clustering

# get cluster number with threshold set to the middle
cl = Split( t.cluster, Max( t.cluster )/2 ) 
add column t cl name="cl"

# group by cluster and take rows by smallest value of "val" column
group t.cl t.val "min" all "refmin" name="t1"
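The logic of the example above can be mirrored in plain Python as a rough cross-check. This is a sketch under stated assumptions, not a description of ICM internals: it assumes that the unset first row of m is all zeros, that the distances are Euclidean, and that the tree uses average linkage. The helper `agglomerate` is hypothetical, and row indices are 0-based here.

```python
import math

def agglomerate(dist, threshold=math.inf):
    """Average-linkage merging that stops once the closest pair exceeds threshold.

    Returns (clusters, last_height): the remaining clusters as lists of row
    indices, and the height of the last merge that was performed.
    """
    clusters = [[i] for i in range(len(dist))]
    last_height = 0.0
    while len(clusters) > 1:
        # closest pair of clusters by average cross-cluster distance
        d, a, b = min(
            (sum(dist[i][j] for i in clusters[a] for j in clusters[b])
             / (len(clusters[a]) * len(clusters[b])), a, b)
            for a in range(len(clusters)) for b in range(a + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters = ([c for k, c in enumerate(clusters) if k not in (a, b)]
                    + [clusters[a] + clusters[b]])
        last_height = d
    return clusters, last_height

points = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0.1, 0.1)]
labels = ["a", "b", "c", "d", "e"]
vals = [1.0, 2.0, 2.0, 1.0, 4.0]
dist = [[math.dist(p, q) for q in points] for p in points]

_, root_height = agglomerate(dist)                 # analogue of Max( t.cluster )
clusters, _ = agglomerate(dist, root_height / 2)   # analogue of Split at half the root distance

# one representative per cluster: the row with the smallest "val"
representatives = sorted(labels[min(c, key=lambda i: vals[i])] for c in clusters)
```

Under these assumptions only rows "b" and "e" are close enough to share a cluster at half the root distance, and "b" is kept for that cluster because its val is smaller.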

Selecting N representatives from clusters


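This involves several steps: read the table, build the tree with make tree, and take the cluster centers at a chosen distance threshold with Index( table.cluster center ... ).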

Example:


read table mol s_icmhome + "drug_groups.sdf"
make tree drug_groups
I = Index( drug_groups.cluster center 0.4 ) # divide at threshold 0.4
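One common way to define the center of a cluster is its medoid, the member with the smallest total distance to the other members. The Python sketch below illustrates that idea only; whether ICM uses this exact definition is an assumption, and the distance matrix, the cluster lists, and the helper `medoid` are hypothetical. Row indices are 0-based here.

```python
def medoid(members, dist):
    """Index of the member with the smallest total distance to the others."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))

# Hypothetical symmetric distance matrix for six rows, and the clusters
# that a cut at threshold 0.4 is assumed to have produced.
dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8, 0.9],
    [0.1, 0.0, 0.1, 0.8, 0.9, 0.9],
    [0.2, 0.1, 0.0, 0.9, 0.9, 0.8],
    [0.9, 0.8, 0.9, 0.0, 0.3, 0.9],
    [0.8, 0.9, 0.9, 0.3, 0.0, 0.9],
    [0.9, 0.9, 0.8, 0.9, 0.9, 0.0],
]
clusters = [[0, 1, 2], [3, 4], [5]]

# one representative row index per cluster
centers = [medoid(c, dist) for c in clusters]
```

Raising the threshold merges more rows into each cluster and so yields fewer representatives; lowering it yields more.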

