R: Linking a cluster dendrogram to a density plot

Asked: 2011-12-14 17:02:14

Tags: r

I have a two-panel plot in R, which I made with the following code:

par(mfrow=c(1,2))

rmsd <- read.table(textConnection("
pdb rmsd
1grl_edited.pdb 1.5118
1oel_edited.pdb 1.1758
1ss8_edited.pdb 0.8576
1gr5_edited.pdb 1.8301
1j4z_edited.pdb 0.7892
1kp8.pdb    0.1808
1kpo_edited.pdb 0.7879
1mnf.pdb    1.2371
1xck.pdb    1.6820
2c7e_edited.pdb 5.4446
2cgt_edited.pdb 9.9108
2eu1.pdb    54.1764
2nwc.pdb    1.6026
2yey.pdb    61.4931
"), header=TRUE)

dat <- read.table(textConnection("
pdb      PA      EHSS 
1gr5_edited.pdb 21518.0 29320.0
1grl_edited.pdb 21366.0 28778.0
1j4z_edited.pdb 21713.0 29636.0
1kp8.pdb    21598.0 29423.0
1kpo_edited.pdb 21718.0 29643.0
1mnf.pdb    21287.0 29035.0
1oel_edited.pdb 21377.0 29054.0
1ss8_edited.pdb 21543.0 29459.0
1sx3.pdb    21651.0 29585.0
1xck.pdb    21191.0 28857.0
2c7e_edited.pdb 22930.0 31120.0
2cgt_edited.pdb 22807.0 31058.0
2eu1.pdb    22323.0 30569.0
2nwc.pdb    21338.0 29326.0
2yey.pdb    21032.0 28670.0
"), header=TRUE, row.names=NULL)

d <- dist(rmsd$rmsd, method = "euclidean")
fit <- hclust(d, method="ward")
plot(fit, labels=rmsd$pdb)
groups <- cutree(fit, k=3)

rect.hclust(fit, k=3, border="red")

#for (i in dat[1]){for (z in i){ if (z=="1sx3.pdb"){print (z)}}}

den.PA <- density(dat$PA)
plot(den.PA)
for (i in dat$PA){
    lineat = i
    lineheight <- den.PA$y[which.min(abs(den.PA$x - lineat))]
    lines(c(lineat, lineat), c(0, lineheight), col = "red")
}

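The left panel shows the clustering of the RMSD values, and the right panel shows the density plot of "PA". The density plot contains one extra value because the reference structure is included in it. The reference is not part of the RMSD clustering because its RMSD would obviously be 0. The reference file in dat is 1sx3.pdb.

The cluster plot has three red boxes. How can I color these boxes differently, so that the left box is red, the middle box is green, and the right box is blue? I then need the density plot to mirror this: values inside the red box should get red lines on the density plot, values inside the green box should get green lines, and so on.

Is it also possible to pick out the reference structure and color it black on the density plot?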

2 Answers:

Answer 0 (score: 2)

This code will do what you want. You were almost there; it just needed some sorting and indexing.

par(mfrow=c(1,2))

d <- dist(rmsd$rmsd, method = "euclidean")
fit <- hclust(d, method="ward")
plot(fit, labels=rmsd$pdb)
groups <- cutree(fit, k=3)

cols = c('red', 'green', 'blue')

rect.hclust(fit, k=3, border=cols)

#for (i in dat[1]){for (z in i){ if (z=="1sx3.pdb"){print (z)}}}

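# rect.hclust() colors the boxes from left to right, while cutree() numbers the
# clusters by first appearance in the data; reorder cols so that cols[g] is the
# color of the box drawn around cluster g
cols = cols[sort(unique(groups[fit$order]), index.return=TRUE)$ix]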

den.PA <- density(dat$PA)
plot(den.PA)
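for (i in 1:length(dat$PA)){
    lineat = dat$PA[i]
    lineheight <- den.PA$y[which.min(abs(den.PA$x - lineat))]
    # look up the cluster by file name, since rmsd and dat are ordered differently
    col = cols[groups[which(rmsd$pdb == as.character(dat[i, 'pdb']))]]
    # the reference (1sx3.pdb) is not in rmsd and has no cluster: draw it in black
    if (length(col) == 0) col = 'black'
    lines(c(lineat, lineat), c(0, lineheight), col = col)
}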
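The reordering step is the least obvious part of the code above, so here is a small worked example of it on its own. The left-to-right cluster ids below are made up for illustration (they are not the ids from the question's data), and the names left_to_right, box_cols and cols_by_cluster are hypothetical.

```r
# Hypothetical result of unique(groups[fit$order]): the cluster id inside each
# box, read from left to right (made-up values, not taken from the question)
left_to_right <- c(3, 1, 2)

# Border colors in drawing order: left box red, middle box green, right box blue
box_cols <- c("red", "green", "blue")

# order(left_to_right)[g] is the position of cluster g among the boxes, so
# indexing box_cols with it gives a vector whose g-th entry is cluster g's color
cols_by_cluster <- box_cols[order(left_to_right)]
print(cols_by_cluster)
```

Here cluster 1 sits in the middle box and gets "green", cluster 2 sits in the right box and gets "blue", and cluster 3 sits in the left box and gets "red". order(x) returns the same permutation as sort(x, index.return=TRUE)$ix.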


Answer 1 (score: 0)

You can pass a vector of colors to border, for example:

t <- rect.hclust(fit, k=3, border=c("red",'green','blue'))

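Note that I saved the output. It is a list with one element per box, from left to right, and each element holds the observations inside that box: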

[[1]]
[1] 12 14

[[2]]
 [1]  1  2  3  4  5  6  7  8  9 13

[[3]]
[1] 10 11

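Then change your loop slightly. The indices in t are row numbers of rmsd, which is ordered differently from dat, so each structure is first looked up by file name:

for (i in 1:length(dat$PA)){
    lineat = dat$PA[i]
    lineheight <- den.PA$y[which.min(abs(den.PA$x - lineat))]
    # row of this structure in rmsd; NA for the reference 1sx3.pdb, which is not in rmsd
    j <- match(as.character(dat$pdb[i]), as.character(rmsd$pdb))
    if(j %in% t[[1]]) lines(c(lineat, lineat), c(0, lineheight), col = "red")
    if(j %in% t[[2]]) lines(c(lineat, lineat), c(0, lineheight), col = "green")
    if(j %in% t[[3]]) lines(c(lineat, lineat), c(0, lineheight), col = "blue")
    if(is.na(j)) lines(c(lineat, lineat), c(0, lineheight), col = "black")
}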

That last bit of code is not very elegant, though; I'm sure someone can come up with a better solution.
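One possible tidier variant is sketched below. It is not from either answer. It rebuilds the question's data with data.frame() so that it runs on its own, turns the list returned by rect.hclust() into one color per row of rmsd, and matches the two tables by file name with match(). The names box_cols, boxes, row_cols, line_cols and heights are made up for this sketch. Two choices are assumptions rather than the original method: method = "ward.D" is the name that R 3.1.0 and later use for the old "ward" method, and approx() interpolates the density height linearly instead of taking the nearest grid point.

```r
# Data copied from the question
rmsd <- data.frame(
  pdb = c("1grl_edited.pdb", "1oel_edited.pdb", "1ss8_edited.pdb", "1gr5_edited.pdb",
          "1j4z_edited.pdb", "1kp8.pdb", "1kpo_edited.pdb", "1mnf.pdb", "1xck.pdb",
          "2c7e_edited.pdb", "2cgt_edited.pdb", "2eu1.pdb", "2nwc.pdb", "2yey.pdb"),
  rmsd = c(1.5118, 1.1758, 0.8576, 1.8301, 0.7892, 0.1808, 0.7879, 1.2371, 1.6820,
           5.4446, 9.9108, 54.1764, 1.6026, 61.4931),
  stringsAsFactors = FALSE)
dat <- data.frame(
  pdb = c("1gr5_edited.pdb", "1grl_edited.pdb", "1j4z_edited.pdb", "1kp8.pdb",
          "1kpo_edited.pdb", "1mnf.pdb", "1oel_edited.pdb", "1ss8_edited.pdb",
          "1sx3.pdb", "1xck.pdb", "2c7e_edited.pdb", "2cgt_edited.pdb", "2eu1.pdb",
          "2nwc.pdb", "2yey.pdb"),
  PA = c(21518, 21366, 21713, 21598, 21718, 21287, 21377, 21543, 21651, 21191,
         22930, 22807, 22323, 21338, 21032),
  stringsAsFactors = FALSE)

# Assumption: R >= 3.1.0, where the old "ward" method is called "ward.D"
fit <- hclust(dist(rmsd$rmsd, method = "euclidean"), method = "ward.D")
box_cols <- c("red", "green", "blue")

par(mfrow = c(1, 2))
plot(fit, labels = rmsd$pdb)
# boxes[[b]] holds the rmsd row numbers inside the b-th box from the left
boxes <- rect.hclust(fit, k = 3, border = box_cols)

# One color per row of rmsd, taken from the box that contains the row
row_cols <- character(nrow(rmsd))
for (b in seq_along(boxes)) row_cols[boxes[[b]]] <- box_cols[b]

# Match dat to rmsd by file name; the reference has no match and becomes black
line_cols <- row_cols[match(dat$pdb, rmsd$pdb)]
line_cols[is.na(line_cols)] <- "black"

den.PA <- density(dat$PA)
plot(den.PA)
# Density height at each PA value (linear interpolation on the density grid)
heights <- approx(den.PA$x, den.PA$y, xout = dat$PA)$y
segments(dat$PA, 0, dat$PA, heights, col = line_cols)
```

Because the colors are looked up by file name, the result does not depend on rmsd and dat being in the same row order, and any structure missing from rmsd, here only 1sx3.pdb, is drawn in black.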