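Grouping and summarising by a variable inside a function

Date: 2017-08-01 19:42:17

Tags: r plyr

How can I group by a variable and summarise with ddply when the grouping column is passed in as a function argument?

For example: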

library(plyr)

sample <- function(x, g){
  print(g)
  print(x[[g]])
  res = ddply(x, ~x[[g]], summarise, value = mean(value))
  return(res)
}

x = data.frame(type = c('a', 'a', 'a', 'b'), 
               age = c(20, 21, 21, 10), 
               value = c(100, 120, 121, 150))
sample(x = x, g = 'age')

This fails with:

 Error in (function(x, i, exact) if (is.matrix(i)) as.matrix(x)[[i]] else .subset2(x,  : 
  object 'g' not found 

even though the function prints:

[1] "age"
[1] 20 21 21 10

Why does R find g when printing, but not when grouping?

Edit: I would like the output to be:

  x[["age"]] value
1         10 150.0
2         20 100.0
3         21 120.5

3 answers:

Answer 0 (score: 0)

This comes down to the environment in which g is set by '='. Try calling your function this way:

sample(x = x, g <- 'age') 
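A small base R sketch of what that call form does differently, as far as the argument itself is concerned: `g <- 'age'` is an ordinary assignment evaluated in the caller's environment when the argument is first used, so a variable is also created outside the function. The names `show_arg` and `g_demo` are hypothetical and only serve this illustration; that ddply then picks up the variable from there is this answer's claim, not something the sketch proves.

```r
# Hypothetical helper: returning g forces evaluation of the argument.
show_arg <- function(g) g

# Passing an assignment as the argument: the assignment runs in the
# caller's environment once the argument is evaluated.
returned <- show_arg(g_demo <- "age")

# The variable now exists outside the function as a side effect.
created <- exists("g_demo")
print(returned)
print(created)
```

With `g = 'age'`, by contrast, the value is only bound to the parameter `g` inside the function and nothing is created in the caller's environment.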

Or you can simply use

# g instead of ~x[[g]]
res = ddply(x, g, summarise, value = mean(value))
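Put together, the second suggestion could look like the sketch below. It assumes g is a character string naming the grouping column, which ddply accepts directly as its grouping variable. The function name `summarise_by` is hypothetical and chosen only to avoid masking `base::sample()`.

```r
library(plyr)

# Hypothetical helper name; g is a character string such as "age".
summarise_by <- function(x, g) {
  # Pass the column name itself instead of the formula ~x[[g]]
  ddply(x, g, summarise, value = mean(value))
}

x <- data.frame(type = c("a", "a", "a", "b"),
                age = c(20, 21, 21, 10),
                value = c(100, 120, 121, 150))

result <- summarise_by(x, "age")
print(result)
```

Unlike the output requested in the question, the grouping column in the result is named `age` rather than `x[["age"]]`.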

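Answer 1 (score: 0)

Here is a solution using the dplyr package. To get the grouping to evaluate correctly with a column name passed as a string, I had to use group_by_, which is going to be deprecated.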

library(dplyr)

x = data.frame(type = c('a', 'a', 'a', 'b'), 
               age = c(20, 21, 21, 10), 
               value = c(100, 120, 121, 150))

sample <- function(x, g){
  print(g)
  print(x[[g]])
  res <- group_by_(x, g) %>% summarise(mean(value))
  #res = ddply(x, ~x[[g]], summarise, value = mean(value))
  return(res)
}

sample(x = x, g = 'age') 
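Since group_by_ is on its way out, a possible replacement for string column names is sketched below. It assumes dplyr 1.0.0 or later, where `across()` and `all_of()` are available; the function name `summarise_by_name` is hypothetical and not part of the original answer.

```r
library(dplyr)

# Hypothetical helper name; g is a character string such as "age".
summarise_by_name <- function(x, g) {
  x %>%
    group_by(across(all_of(g))) %>%   # select the grouping column by name
    summarise(value = mean(value))
}

x <- data.frame(type = c("a", "a", "a", "b"),
                age = c(20, 21, 21, 10),
                value = c(100, 120, 121, 150))

result <- summarise_by_name(x, "age")
print(result)
```

`all_of()` makes it explicit that g holds column names as strings, and it raises an error if a named column does not exist.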

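Answer 2 (score: 0)

I would use tidyeval, which ships with recent versions of dplyr. Note that the grouping column is then passed as a bare name, not as a string: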

library(dplyr)

sample <- function(x, g){
  # Capture the bare column name passed as g, then unquote it with !!
  var <- dplyr::enquo(g)
  res = x %>% group_by(!!var) %>% summarise(age_mean = mean(value))
  return(res)
}

x = data.frame(type = c('a', 'a', 'a', 'b'), 
           age = c(20, 21, 21, 10), 
           value = c(100, 120, 121, 150))
sample(x, age)

# A tibble: 3 x 2
     age age_mean
  <dbl>    <dbl>
1    10    150.0
2    20    100.0
3    21    120.5
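In newer releases the enquo/!! pair can be written more compactly with the embrace operator `{{ }}`. The sketch below assumes rlang 0.4.0 or later together with a dplyr version that supports it; the function name `summarise_by_col` and the result column name `value_mean` are hypothetical.

```r
library(dplyr)

# Hypothetical helper name; g is passed as a bare column name.
summarise_by_col <- function(x, g) {
  x %>%
    group_by({{ g }}) %>%                  # same effect as enquo() followed by !!
    summarise(value_mean = mean(value))
}

x <- data.frame(type = c("a", "a", "a", "b"),
                age = c(20, 21, 21, 10),
                value = c(100, 120, 121, 150))

result <- summarise_by_col(x, age)
print(result)
```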