Extending this answer given by @G. Grothendieck, how can I pass multiple grouping variables to dplyr inside a function?
Say I have this data:
# Data
set.seed(1)
dfx <- data.frame(nLive = sample(x = 10, size = 40, replace = TRUE),
nDead = sample(x = 3, size = 40, replace = TRUE),
areaA = c(rep("A", 20), rep("B", 20)),
areaB = rep( c( rep("yes", 10), rep("no", 10)), 2),
year = rep(c(2000,2002,2004,2006,2008),4)
)
I want to group by year and, optionally, by up to 2 other variables.
@G. Grothendieck's example works well for specifying 1 index:
UnFun <- function(dat, index) {
dat %>%
group_by(year) %>%
regroup(list(index)) %>%
summarise(n = n() )
}
> UnFun(dfx, "areaA")
Source: local data frame [2 x 2]
areaA n
1 A 20
2 B 20
> UnFun(dfx, "areaB")
Source: local data frame [2 x 2]
areaB n
1 no 20
2 yes 20
But when I try to group by both (or by year alone), I get either an error or a wrong answer:
> UnFun(dfx, list("areaA", "areaB"))
Error: cannot convert to symbol (SYMSXP)
> UnFun(dfx, c("areaA", "areaB"))
Source: local data frame [2 x 2]
areaA n
1 A 20
2 B 20
> UnFun(dfx, NULL)
Error: cannot convert to symbol (SYMSXP)
Any tips on how to correctly specify 0, 1, or 2 grouping options?
Thanks, R community!
Answer 0 (score: 0)
This does work:
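library(dplyr)
UnFun <- function(dat, index) {
# index: character vector of extra grouping columns (may be NULL)
dat %>%
# group_by_() accepts symbols/strings through .dots; it is deprecated in later dplyr releases
group_by_(.dots = c(quote(year), index)) %>%
tally()
}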
UnFun(dfx, c("areaA", "areaB"))
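Since group_by_() was deprecated in later dplyr versions, here is a minimal sketch of the same idea using across() and all_of(), assuming dplyr 1.0.0 or newer. The function name CountBy is hypothetical (not from the original answer), and it assumes index is either NULL or a character vector of column names.

```r
library(dplyr)

# Same data as in the question
set.seed(1)
dfx <- data.frame(nLive = sample(x = 10, size = 40, replace = TRUE),
                  nDead = sample(x = 3, size = 40, replace = TRUE),
                  areaA = c(rep("A", 20), rep("B", 20)),
                  areaB = rep(c(rep("yes", 10), rep("no", 10)), 2),
                  year = rep(c(2000, 2002, 2004, 2006, 2008), 4))

# CountBy is a hypothetical name; index is NULL or a character vector.
CountBy <- function(dat, index = NULL) {
  # c("year", NULL) collapses to "year", so 0 extra groups works too
  group_cols <- c("year", index)
  dat %>%
    group_by(across(all_of(group_cols))) %>%
    summarise(n = n(), .groups = "drop")
}

CountBy(dfx)                        # year only
CountBy(dfx, "areaA")               # year + areaA
CountBy(dfx, c("areaA", "areaB"))   # year + areaA + areaB
```

Building the full vector of column names first means the 0, 1, and 2 index cases all go through the same code path, so NULL needs no special handling.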