Plotting very large data with many zeros

Time: 2016-12-17 16:01:00

Tags: r heatmap hclust

This is a small part of a very large data set:

df<- structure(list(A = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0.68906, 0, 0, 0, 0, 0, 0, 0, 0, 0.13597, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0), B = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0.40001, 0, 0, 0, 0, 0.69718, 0, 0, 0, 0, 0, 0, 0, 
0, 0.090752, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), C = c(0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0.84068, 0, 0, 0, 0.34713, 0, 0, 0, 0, 0.65201, 
0, 0, 0.25725, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
), D = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.86419, 0, 0, 0, 0.3845, 
0, 0, 0, 0, 0.67091, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0), E = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1.1083, 0.8324, 
0, 0, 0, 0.38499, 0, 0, 0, 0, 0.69064, 0, 0, 0.14596, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), F = c(0, 0, 0, 0, 0, 
0, 0, 0, 0, 1.0954, 0.74426, 0, 0, 0, 0.37715, 0, 0, 0, 0, 0.68884, 
0, 0, 0.20826, 0, 0.38782, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0), G = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1.0985, 0.66651, 0, 0, 
0, 0, 0, 0, 0, 0, 0.68861, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1.1812, 
0, 0, 0, 0, 0, 0, 0, 0)), .Names = c("A", "B", "C", "D", "E", 
"F", "G"), class = "data.frame", row.names = c(NA, -39L))

What I want is to display the values with more emphasis when the data contain a large number of zeros.

This is how I plot it:

library(gplots)  # provides heatmap.2()

eucl_dist <- dist(df, method = "euclidean")
hie_clust <- hclust(eucl_dist, method = "complete")
my_palette <- colorRampPalette(c("green", "yellow", "red"))(n = 1000)
# heatmap.2() expects a numeric matrix, so convert the data frame
heatmap.2(as.matrix(df), scale = "none", Colv = FALSE, Rowv = as.dendrogram(hie_clust),
          xlab = "X", ylab = "Y", key = TRUE, keysize = 1.1, trace = "none",
          density.info = "none", margins = c(4, 4), col = my_palette, dendrogram = "row")

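But as you can see, even in this small example the zeros dominate, and when the data are very large it becomes impossible to see anything. I also cannot change the position of the values.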

1 Answer:

Answer 0: (score: 0)

You are asking several questions here; I will try to answer the ones I can see.

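Zeros dominate

Zeros dominate your data, but what do the zeros mean? Without a deeper understanding of what a zero actually represents, it is hard to prescribe a single best way to handle them.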
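Before deciding how to treat the zeros, it can help to measure how dominant they are. The sketch below uses a small made-up matrix `m` (not the data from the question) and, under the assumption that a zero means "nothing measured", drops the rows that are zero everywhere. Whether dropping them is appropriate depends entirely on what a zero means in the real data.

```r
# Hypothetical toy matrix standing in for the real data: 6 rows, 3 columns.
m <- matrix(c(0,   0,   0,
              0.7, 0.7, 0.6,
              0,   0,   0,
              0,   0.4, 0.3,
              0,   0,   0,
              0.1, 0,   0),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("A", "B", "C")))

# Fraction of cells that are exactly zero.
zero_share <- mean(m == 0)

# Rows that contain at least one non-zero value.
keep <- rowSums(m != 0) > 0

# Assumption: all-zero rows carry no information, so they can be dropped
# before clustering and plotting.
m_nonzero <- m[keep, , drop = FALSE]

print(zero_share)
print(dim(m_nonzero))
```

If the zeros are real measurements rather than missing signal, keeping every row and changing only the color scale is probably the safer choice.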

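Colormap

The multicolored colormap you chose is not the best way to depict quantitative data. I would suggest a simple white-to-blue ramp (or a color of your choice), so that the zeros are drawn in white and fade into the background while the non-zero values stand out. Example (changing only my_palette <- colorRampPalette(c("white", "cornflowerblue"))(n = 1000)):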

[Figure: example heatmap drawn with the white-to-blue palette]
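For a self-contained version of the palette change, here is a sketch on toy data. The matrix `toy` and its dimensions are made up for illustration, and the plot is only drawn if the gplots package (which provides `heatmap.2`) is installed. With `scale = "none"` and no negative values, the color bins start at the data minimum, so zeros fall into the first bin and are drawn white.

```r
# Hypothetical toy data, mostly zeros, standing in for the real data frame.
set.seed(1)
toy <- matrix(0, nrow = 20, ncol = 5,
              dimnames = list(paste0("r", 1:20), LETTERS[1:5]))
toy[sample(length(toy), 15)] <- runif(15, 0.1, 1.2)

# Same clustering as in the question.
hie_clust <- hclust(dist(toy, method = "euclidean"), method = "complete")

# White-to-blue ramp: the first color is white, the last is cornflowerblue.
my_palette <- colorRampPalette(c("white", "cornflowerblue"))(n = 1000)

# Draw the heatmap only when gplots is available.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(toy, scale = "none", Colv = FALSE,
                    Rowv = as.dendrogram(hie_clust), dendrogram = "row",
                    trace = "none", density.info = "none", key = TRUE,
                    margins = c(4, 4), col = my_palette)
} else {
  message("gplots is not installed; skipping the plot")
}
```

Only the `col` argument differs from the call in the question; the clustering and layout are unchanged.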

Changing the position of values

I am not sure what you mean, but the layout is fixed by the dendrogram you defined.
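To illustrate why the positions cannot be chosen freely, the sketch below (toy data, hypothetical names) reads the row order out of the clustering. `hclust` stores the leaf order in `$order`, and a heatmap drawn with that dendrogram follows it. Branches can be flipped with `reorder()` from the stats package, but rows merged into the same cluster stay next to each other.

```r
# Hypothetical toy data: 6 rows forming two obvious groups.
toy <- rbind(r1 = c(0,   0,   0),
             r2 = c(1,   1,   1),
             r3 = c(0,   0,   0.1),
             r4 = c(1,   1,   0.9),
             r5 = c(0,   0.1, 0),
             r6 = c(0.9, 1,   1))

hc <- hclust(dist(toy, method = "euclidean"), method = "complete")

# Leaf order imposed by the dendrogram.
print(rownames(toy)[hc$order])

# Flipping branches by a weight (here the row means) changes the leaf order
# only within what the tree allows: each cluster's rows remain contiguous.
dend <- reorder(as.dendrogram(hc), wts = rowMeans(toy))
print(labels(dend))
```

Passing `Rowv = FALSE` instead of a dendrogram would keep the rows in their original order, at the cost of losing the clustering.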