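I've been working on a heatmap for several days but can't get the final formatting of the gridlines right. Please see the code and the attached image below. What I'm trying to do with geom_tile() is align the gridlines with the edges of the heatmap tiles, so that each tile fills the inside of a grid cell like a box. I can align the gridlines with geom_raster(), but then the y-axis labels tick at the top or bottom of the tile, and I need them to tick at the center (see the red highlight). I also can't wrap a white border around the geom_raster() tiles, so the color blocks look a bit messy in my original data set. I'd appreciate help with the formatting code. Many thanks!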
library(ggplot2)
library(dplyr)
library(forcats)
# The data set in long format
y<- c("A","A","A","A","B","B","B","B","B","C","C","C","D","D","D")
x<- c("2020-03-01","2020-03-15","2020-03-18","2020-03-18","2020-03-01","2020-03-01","2020-03-01","2020-03-01","2020-03-05","2020-03-06","2020-03-05","2020-03-05","2020-03-20","2020-03-20","2020-03-21")
v<-data.frame(y,x)
#approach 1 using geom_tile but gridline does not align with borders of the tiles
v%>%
count(y,x,drop=FALSE)%>%
arrange(n)%>%
ggplot(aes(x=x,y=fct_reorder(y,n,sum)))+
geom_tile(aes(fill=n),color="white", size=0.25)
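I tried running similar code from another post but couldn't get it to work. I think that because my x variable is a count variable for the y variable, it can't be formatted as a factor variable for specifying xmin and xmax in geom_rect().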
#approach 2 using geom_raster but y-axis label can't tick at the center of tiles and there's no border around the tile to differentiate between tiles.
v%>%
count(y,x,drop=FALSE)%>%
arrange(n)%>%
ggplot()+
geom_raster(aes(x=x,y=fct_reorder(y,n,sum),fill=n),hjust=0,vjust=0)
Answer 0 (score: 3)
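I think it makes sense to keep the tick marks, and in turn the gridlines, where they are. To still get the effect you want, I suggest expanding the data to include all possible combinations and setting na.value to a neutral fill color: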
library(tidyr)  # provides expand()
# all possible combinations
all <- v %>% expand(y, x)
# join with all, n will be NA for obs. in all that are not present in v
v <- v %>%
  group_by(y, x) %>%
  summarize(n = n(), .groups = "drop") %>%
  right_join(all, by = c("y", "x"))
ggplot(data = v,
       aes(x = x, y = fct_reorder(y, n, function(x) sum(x, na.rm = TRUE)))) + # note that you must account for the NA values now
  geom_tile(aes(fill = n), color = "white", size = 0.25) +
  scale_fill_continuous(na.value = 'grey90') +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))
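As a possible shortcut for the expand-and-join step, tidyr's complete() can fill in the missing y/x combinations in a single call. This is only a sketch of an alternative, not the code from the answer itself; the names counts and full are made up for illustration, and the data is the example from the question.

```r
library(dplyr)
library(tidyr)

# Example data from the question
y <- c("A","A","A","A","B","B","B","B","B","C","C","C","D","D","D")
x <- c("2020-03-01","2020-03-15","2020-03-18","2020-03-18","2020-03-01",
       "2020-03-01","2020-03-01","2020-03-01","2020-03-05","2020-03-06",
       "2020-03-05","2020-03-05","2020-03-20","2020-03-20","2020-03-21")
v <- data.frame(y, x)

# One row per observed (y, x) pair, with its count in n
counts <- v %>% count(y, x)

# Add a row for every (y, x) pair that never occurs; n is NA for those rows
full <- counts %>% complete(y, x)

nrow(full)            # 4 values of y times 7 distinct dates = 28 rows
sum(is.na(full$n))    # 19 combinations are not present in v
```

full can then be passed to the same ggplot() call, with scale_fill_continuous(na.value = 'grey90') coloring the empty combinations.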
Answer 1 (score: 2)
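This is a bit of a hack. My approach converts the categorical variables to numeric ones, which adds minor gridlines to the plot that line up with the tile edges. To get rid of the major gridlines I simply use theme(). Drawback: the breaks and labels have to be set manually.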
library(ggplot2)
library(dplyr)
library(forcats)
v1 <- v %>%
count(y,x,drop=FALSE)%>%
arrange(n) %>%
mutate(y = fct_reorder(y, n, sum),
y1 = as.integer(y),
x = factor(x),
x1 = as.integer(x))
labels_y <- levels(v1$y)
breaks_y <- seq_along(labels_y)
labels_x <- levels(v1$x)
breaks_x <- seq_along(labels_x)
ggplot(v1, aes(x=x1, y=y1))+
geom_tile(aes(fill=n), color="white", size=0.25) +
scale_y_continuous(breaks = breaks_y, labels = labels_y) +
scale_x_continuous(breaks = breaks_x, labels = labels_x) +
theme(panel.grid.major = element_blank())
Created on 2020-05-23 by the reprex package (v0.3.0)
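The reason this works is that a tile centered on integer k spans k - 0.5 to k + 0.5, and the default minor gridlines fall halfway between the integer breaks. A variation, sketched below under my own assumptions, sets minor_breaks explicitly so the outermost tile edges also get a line. The names edges_x and edges_y are hypothetical, and linewidth assumes ggplot2 3.4.0 or later (older versions use size, as in the code above).

```r
library(ggplot2)
library(dplyr)
library(forcats)

# Example data from the question
y <- c("A","A","A","A","B","B","B","B","B","C","C","C","D","D","D")
x <- c("2020-03-01","2020-03-15","2020-03-18","2020-03-18","2020-03-01",
       "2020-03-01","2020-03-01","2020-03-01","2020-03-05","2020-03-06",
       "2020-03-05","2020-03-05","2020-03-20","2020-03-20","2020-03-21")
v <- data.frame(y, x)

v1 <- v %>%
  count(y, x) %>%
  mutate(y  = fct_reorder(y, n, sum),
         x  = factor(x),
         y1 = as.integer(y),   # numeric position of each y level
         x1 = as.integer(x))   # numeric position of each x level

# Tile k covers k - 0.5 to k + 0.5, so the tile edges sit at the half-integers
edges_y <- seq(0.5, nlevels(v1$y) + 0.5, by = 1)
edges_x <- seq(0.5, nlevels(v1$x) + 0.5, by = 1)

p <- ggplot(v1, aes(x = x1, y = y1)) +
  geom_tile(aes(fill = n), color = "white", linewidth = 0.25) +
  scale_y_continuous(breaks = seq_len(nlevels(v1$y)), labels = levels(v1$y),
                     minor_breaks = edges_y, expand = c(0, 0)) +
  scale_x_continuous(breaks = seq_len(nlevels(v1$x)), labels = levels(v1$x),
                     minor_breaks = edges_x, expand = c(0, 0)) +
  theme(panel.grid.major = element_blank())   # keep only the edge-aligned lines

print(p)
```

The gridlines are drawn underneath the tiles, so they only show in the cells that have no data, where they trace the edges of the missing tiles.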
Edit: checked with longer variable names
y<- c("John Doe","John Doe","John Doe","John Doe","Mary Jane","Mary Jane","Mary Jane","Mary Jane","Mary Jane","C","C","C","D","D","D")
x<- c("2020-03-01","2020-03-15","2020-03-18","2020-03-18","2020-03-01","2020-03-01","2020-03-01","2020-03-01","2020-03-05","2020-03-06","2020-03-05","2020-03-05","2020-03-20","2020-03-20","2020-03-21")
v<-data.frame(y,x)