My data frame looks like this:
plant distance
one 0
one 1
one 2
one 3
one 4
one 5
one 6
one 7
one 8
one 9
one 9.9
two 0
two 1
two 2
two 3
two 4
two 5
two 6
two 7
two 8
two 9
two 9.5
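For each level of plant, I want to split distance into groups by a fixed interval (for example, interval = 3) and compute the percentage of observations that fall into each group. Finally, I want to plot the percentage of each group for each level, as shown below:
My code: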
library(ggplot2)
library(dplyr)
library(scales)  # provides percent() for the axis labels
dat <- data %>%
  # the third positional argument of cut() is labels, so F returns integer group codes
  mutate(group = factor(cut(distance, seq(0, max(distance), 3), F))) %>%
  group_by(plant, group) %>%
  summarise(percentage = n()) %>%
  mutate(percentage = percentage / sum(percentage))
p <- ggplot(dat, aes(x = plant, y = percentage, fill = group)) +
  geom_bar(stat = "identity", position = "stack") +
  scale_y_continuous(labels = percent)
p
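But my plot looks like this: group 4 is missing.
I found that dat is wrong: group 4 is NA.
The likely cause is that group 4 spans less than interval = 3, so my question is: how can I fix this? Thanks in advance!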
Answer 0 (score: 0)
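I have solved this problem. The reason is that the breaks in cut(distance, seq(0, max(distance), 3), F) do not cover the minimum and maximum values: the sequence stops at the last multiple of 3 below max(distance), and by default the lowest break itself is excluded from the first interval, so those values become NA.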
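To make the cause concrete, here is a minimal sketch on a small hypothetical vector (the values are chosen for illustration, not taken from the full data set). It shows that the breaks stop at 9 and that both the lowest value and anything above the last break come back as NA:

```r
# Hypothetical sample of distances, including the minimum (0) and a value above 9
distance <- c(0, 1, 2, 3, 9, 9.9)

# seq() stops at the last multiple of 3 that does not exceed max(distance)
breaks <- seq(0, max(distance), 3)
print(breaks)

# Intervals are (0,3], (3,6], (6,9]: 0 and 9.9 fall outside all of them
codes <- cut(distance, breaks, labels = FALSE)
print(codes)
```

The first and last elements of codes are NA, which matches the missing group in the plot.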
Here is my solution:
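dat <- my_data %>%
  # Breaks start at min(distance) and step by 3; seq() rounds length.out up,
  # so for this data the breaks run past max(distance)
  mutate(group = factor(cut(distance,
                            seq(from = min(distance), by = 3, length.out = n() / 3 + 1),
                            include.lowest = TRUE))) %>%
  count(plant, group) %>%
  group_by(plant) %>%
  mutate(percentage = n / sum(n))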
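As an alternative sketch rather than a replacement for the solution above, the breaks can be derived from the data range instead of the row count. This assumes distances are non-negative and start at 0; the interval and breaks variables are introduced here for illustration, and the data frame is rebuilt from the table in the question:

```r
library(dplyr)
library(ggplot2)

# Rebuild the example data from the question
my_data <- data.frame(
  plant = rep(c("one", "two"), each = 11),
  distance = c(0:9, 9.9, 0:9, 9.5)
)

interval <- 3  # assumed bin width, as in the question
# Extend the last break to the next multiple of the interval so max(distance) is covered
breaks <- seq(0, ceiling(max(my_data$distance) / interval) * interval, by = interval)

dat <- my_data %>%
  # include.lowest = TRUE keeps distance == 0 in the first bin
  mutate(group = cut(distance, breaks, include.lowest = TRUE)) %>%
  count(plant, group) %>%
  group_by(plant) %>%
  mutate(percentage = n / sum(n)) %>%
  ungroup()

# Stacked bars, one per plant, with the y axis shown as percentages
p <- ggplot(dat, aes(x = plant, y = percentage, fill = group)) +
  geom_col(position = "stack") +
  scale_y_continuous(labels = scales::percent)
p
```

With this data the breaks are 0, 3, 6, 9, 12, so every observation lands in one of four groups and no NA group appears. Tying the breaks to max(distance) rather than to n() keeps the number of bins independent of how many rows the data frame has.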