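I have a data frame "calls1" and would like to know how to create a new variable "PercCallsMo": the share of total calls (from the "CallsHandled" variable) that each call queue "QUEUE" accounts for in a given month "MON1_12". My sample data is below: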
structure(list(MON1_12 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), QUEUE = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L), .Label = c("APPLICATION_STATUS", "BENEFITS", "BILLING"
), class = "factor"), CallsHandled = c(9L, 3L, 10L, 27L, 64L,
17L, 10L, 58L, 8L, 29L, 32L, 12L, 2L, 6L, 1L, 3L, 2L, 2L, 2L,
2L)), .Names = c("MON1_12", "QUEUE", "CallsHandled"), class = "data.frame", row.names = c(NA,
-20L))
The result I expect shows, on every row, the "PercCallsMo" that the row's "QUEUE" represents within its month "MON1_12", and should look like this:
structure(list(MON1_12 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), QUEUE = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L), .Label = c("APPLICATION_STATUS", "BENEFITS", "BILLING"
), class = "factor"), CallsHandled = c(9L, 3L, 10L, 27L, 64L,
17L, 10L, 58L, 8L, 29L, 32L, 12L, 2L, 6L, 1L, 3L, 2L, 2L, 2L,
2L), PercCallsMo = c(0.362962963, 0.362962963, 0.362962963, 0.362962963,
0.554878049, 0.554878049, 0.554878049, 0.488888889, 0.488888889,
0.37195122, 0.37195122, 0.148148148, 0.148148148, 0.148148148,
0.073170732, 0.073170732, 0.073170732, 0.073170732, 0.073170732,
0.073170732)), .Names = c("MON1_12", "QUEUE", "CallsHandled",
"PercCallsMo"), class = "data.frame", row.names = c(NA, -20L))
Answer 0 (score: 2)
You can do it like this:
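library(dplyr)

calls1 <- calls1 %>%
  group_by(MON1_12) %>%
  # total calls across all queues in each month
  mutate(month_total = sum(CallsHandled)) %>%
  group_by(MON1_12, QUEUE) %>%
  # each queue's share of its month's total, repeated on every row of that queue
  mutate(PercCallsMo = sum(CallsHandled) / month_total) %>%
  ungroup() %>%
  select(-month_total)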
Answer 1 (score: 0)
Using base R:
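A minimal base R sketch of the same calculation, assuming `ave()` is an acceptable way to get the group totals. This is an illustration and not necessarily the code this answer originally contained; `queue_total` and `month_total` are names introduced here, and the sample data is rebuilt with `data.frame()` so the snippet runs on its own.

```r
# Sample data from the question, rebuilt with data.frame() for brevity.
calls1 <- data.frame(
  MON1_12 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L,
              1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
  QUEUE = factor(rep(c("APPLICATION_STATUS", "BENEFITS", "BILLING"),
                     times = c(7, 4, 9))),
  CallsHandled = c(9L, 3L, 10L, 27L, 64L, 17L, 10L, 58L, 8L, 29L, 32L,
                   12L, 2L, 6L, 1L, 3L, 2L, 2L, 2L, 2L)
)

# Calls per queue within each month, repeated on every row of the group
# (queue_total is a hypothetical helper name).
queue_total <- ave(calls1$CallsHandled, calls1$MON1_12, calls1$QUEUE, FUN = sum)

# Calls across all queues within each month (month_total is also hypothetical).
month_total <- ave(calls1$CallsHandled, calls1$MON1_12, FUN = sum)

# Each queue's share of its month's total.
calls1$PercCallsMo <- queue_total / month_total
```

`ave()` returns a vector the same length as its input, so no merge or reordering is needed: both totals line up with the original rows, and the division is done row by row.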