How to create a 2D grid, raster, or heatmap from group values that contain NA?

Asked: 2018-06-13 20:44:39

Tags: r heatmap r-raster

Given the following data:

df <- data.frame(
  Group_ID = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4),
  WBHO = runif(20, 1.0, 7.0),
  SI = runif(20, 1.0, 7.0),
  OORT = c(2.34, 4.64, NA, 5.32, 3.23, 6.01, 5.43, 4.78, 3.98, 3.80,
           4.45, NA, NA, 3.18, 4.87, NA, NA, 5.73, 3.52, 4.89),
  LMX = runif(20, 1.0, 7.0),
  RL = runif(20, 1.0, 7.0),
  AL = c(1.54, NA, 1.08, 6.77, NA, NA, 4.56, NA, 5.34, 4.32,
         2.45, 3.86, 6.21, 2.89, 7.32, 6.43, NA, 4.56, 3.89, 6.16),
  SL = runif(20, 1.0, 7.0),
  RV = runif(20, 1.0, 7.0),
  PT = runif(20, 1.0, 7.0),
  SD = runif(20, 1.0, 7.0),
  HT = runif(20, 1.0, 7.0),
  RTL = c(2.45, NA, 6.04, 2.88, 3.49, 2.30, NA, 5.32, 2.39, NA,
          3.62, 3.22, 4.87, 2.91, 5.41, NA, NA, 4.78, 6.20, NA),
  INB = runif(20, 1.0, 7.0),
  ETB = runif(20, 1.0, 7.0)
)

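Now I would like to create a raster, 2D grid, or heatmap that gives a good overview of all variables per group ("Group_ID"), using the mean values (x-axis showing the groups, y-axis showing all variables). Cells with values from 1 to 3 should be green, 3 to 5 yellow, and 5 to 7 green. I have the following code to create a df that combines the variables into one column, with the values and the group they belong to in two other columns: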

library(dplyr)
library(tidyr)

df_new <- df %>%
  gather(key = "variable", value = "value", -Group_ID)

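But this does not work because NAs are included. However, I would like to keep those rows as NA. Is there a way to do this in the same step?

Then I would like to create a raster. I was given the following code for that, but I am not entirely sure how to apply it in this case: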

library(raster)

r <- raster(ncol=nrow(df_new), nrow=15, xmn=0, xmx=4, ymn=0, ymx=15)
values(r) <- as.vector(as.matrix(df$WBHO, df$SI, df$OORT, df$LMX, df$RL, df$AL, df$SL, df$RV, df$PT, df$SD, df$HT, df$RTL,
                             df$INB, df$ETB)
plot(r, axes=F, box=F, asp=NA)
axis(1, at=seq(), 0:9)
axis(2, at=seq(), c("", colnames(df_new)), las=1)

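Thanks for your help!

1 Answer:

Answer 0 (score: 1)

We can use dplyr and tidyr to calculate the mean values. After that, we can use the cut function to categorize the values. Then we can use geom_tile from ggplot2 to plot the heatmap. Specify x as variable, y as Group_ID (converted to a factor), and fill as value2. There is no need for the raster package.

It is not clear why you want two groups (1-3 and 5-7) to both be green. My example assigns red to the 5-7 group, but you can easily change that as needed.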

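library(dplyr)
library(tidyr)

df_new <- df %>%
  # Long format: one row per Group_ID/variable/value (NA rows are kept)
  gather(key = "variable", value = "value", - Group_ID) %>%
  group_by(Group_ID, variable) %>%
  # Mean per group and variable, ignoring NA values
  summarise(value = mean(value, na.rm = TRUE)) %>%
  # Bin the means into (1,3], (3,5] and (5,7]
  mutate(value2 = cut(value, breaks = c(1, 3, 5, 7), labels = c("Low", "Medium", "High"))) %>%
  ungroup()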
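
To show how the NA values move through this pipeline, here is a minimal sketch on a made-up data frame. The name toy and its values are hypothetical and not part of the question's data. As far as I know, gather() keeps NA rows by default (na.rm = FALSE), mean(..., na.rm = TRUE) skips them, and cut() returns NA for anything it cannot bin, such as the NaN that results when a group has no observed values at all. The include.lowest = TRUE argument is my own addition so that a mean of exactly 1 would still fall into the "Low" class.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two groups, two variables, some NAs
toy <- data.frame(
  Group_ID = c(1, 1, 2, 2),
  A = c(2, NA, 6, 4),
  B = c(NA, NA, 3, 5)
)

toy_long <- toy %>%
  gather(key = "variable", value = "value", -Group_ID)

# gather() keeps the NA rows, so all 4 x 2 = 8 rows survive
nrow(toy_long)              # 8
sum(is.na(toy_long$value))  # 3

toy_means <- toy_long %>%
  group_by(Group_ID, variable) %>%
  summarise(value = mean(value, na.rm = TRUE)) %>%
  mutate(value2 = cut(value, breaks = c(1, 3, 5, 7),
                      labels = c("Low", "Medium", "High"),
                      include.lowest = TRUE)) %>%
  ungroup()

# Group 1 / variable B has only NAs: its mean is NaN and its class is NA
toy_means
```

So the NA rows do not need special handling before the summary step. A cell only ends up without a class when every value in that group/variable combination is missing.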

library(ggplot2)

ggplot(df_new, aes(x = variable, y = factor(Group_ID), fill = value2)) +
  geom_tile() +
  scale_fill_manual(values = c("Low" = "Green", "Medium" = "Yellow", "High" = "Red")) + 
  labs(
    y = "Group_ID"
  )
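
If you prefer the layout described in the question (groups on the x-axis, variables on the y-axis), swapping the two aesthetics should be enough. The sketch below uses a small hand-made data frame, means, as a hypothetical stand-in for a summary table with Group_ID, variable and value2 columns. It also sets na.value in scale_fill_manual so that cells without a class get a neutral colour. The grey shade and the white tile borders are just my choices.

```r
library(ggplot2)

# Hypothetical summarised data; the NA row stands for a
# group/variable cell that had no observed values
means <- data.frame(
  Group_ID = c(1, 1, 2, 2),
  variable = c("AL", "OORT", "AL", "OORT"),
  value2 = factor(c("Low", NA, "High", "Medium"),
                  levels = c("Low", "Medium", "High"))
)

p <- ggplot(means, aes(x = factor(Group_ID), y = variable, fill = value2)) +
  geom_tile(colour = "white") +  # white borders separate the cells
  scale_fill_manual(
    values = c("Low" = "green", "Medium" = "yellow", "High" = "red"),
    na.value = "grey80"          # fill for cells whose class is NA
  ) +
  labs(x = "Group_ID", y = "Variable", fill = "Mean")

p
```

To get the colour scheme from the question, change "High" = "red" to "High" = "green".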
