我有一个数据框(df),其中包含三个分类变量,分别称为site,purchase和happycustomer。
我想使用gglot2的geom_tile功能来创建客户体验的热图。我喜欢X轴上的网站,在y轴上购买,以及happycustomer作为填充。我希望热图能够显示按网站和购买分组的快乐客户的百分比(即happycustomer的值为y的那些客户)。
我的问题是,目前情节既有快乐又有不快乐的顾客。
非常感谢任何帮助。
起点(df):
df <- data.frame(site=c("GA","NY","BO","NY","BO","NY","BO","NY","BO","GA","NY","GA","NY","NY","NY"),purchase=c("a1","a2","a1","a1","a3","a1","a1","a3","a1","a2","a1","a2","a1","a2","a1"),happycustomer=c("n","y","n","y","y","y","n","y","n","y","y","y","n","y","n"))
当前代码:
library(ggplot2)
library(dplyr)
df %>%
group_by(site, purchase,happycustomer) %>%
summarize(bin = sum(happycustomer==happycustomer)) %>%
group_by(site,happycustomer) %>%
mutate(bin_per = (bin/sum(bin)*100)) %>%
ggplot(aes(site,purchase)) + geom_tile(aes(fill = bin_per),colour = "white") + geom_text(aes(label = round(bin_per, 1))) +
scale_fill_gradient(low = "blue", high = "red")
答案 0 :(得分:0)
以下是具有两个数据框的解决方案。
happyDF <- df %>%
filter(happycustomer == "y") %>%
group_by(site, purchase) %>%
summarise( n = n() )
totalDF <- df %>%
group_by(site, purchase) %>%
summarise( n = n() )
ggplot代码:
merge(happyDF, totalDF, by=c("site", "purchase") ) %>%
mutate(prop = 100 * (n.x / n.y) ) %>%
ggplot(., aes(site, purchase)) +
geom_tile(aes(fill = prop),colour = "white") +
geom_text(aes(label = round(prop, 1))) +
scale_fill_gradient(low = "blue", high = "red")