For loop to compute group means (ignoring NA)

Time: 2019-12-02 13:20:55

Tags: r for-loop group-by na

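I want to create a for loop that does the following:

  1. Creates a1, a2, ... a10 as variables holding the group means
  2. Computes the mean of the variables b1, b2, b3, ... b10 by the grouping variable groupid
  3. Ignores NA when computing the means; I am using na.rm=TRUE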
df <- within(df, {a1 = ave(as.numeric(as.character(b1)), groupid, FUN=function(x) mean(x, na.rm=TRUE))})  
df <- within(df, {a2 = ave(as.numeric(as.character(b2)), groupid, FUN=function(x) mean(x, na.rm=TRUE))})
.
.
.
df <- within(df, {a10 = ave(as.numeric(as.character(b10)), groupid, FUN=function(x) mean(x, na.rm=TRUE))})
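
For concreteness, here is a minimal sketch of what one of these lines does on a toy data frame. The data values and the single column b1 are assumptions made for illustration; they are not from the original question.

```r
# Toy data (hypothetical): two groups, b1 stored as character with one NA
df <- data.frame(
  groupid = c(1, 1, 2, 2),
  b1 = c("1", "3", NA, "5"),
  stringsAsFactors = FALSE
)

# ave() returns a vector as long as its input, holding each row's group mean;
# na.rm = TRUE drops the NA before averaging within group 2
df <- within(df, {
  a1 = ave(as.numeric(as.character(b1)), groupid,
           FUN = function(x) mean(x, na.rm = TRUE))
})

df$a1
# [1] 2 2 5 5
```

Group 1 averages 1 and 3, while group 2 has only the value 5 left once the NA is removed.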

How can I rewrite these 10 silly lines as an elegant for loop?

2 Answers:

Answer 0: (score: 2)

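Since groupid is the same for every column, we can use mutate_at to take the mean of all columns whose names match the pattern b\\d+, creating new columns with the suffix "a" (b1_a, b2_a, ...).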

library(dplyr)
df %>%
   group_by(groupid) %>%
   mutate_at(vars(matches('^b\\d+$')), list(a = ~ mean(., na.rm = TRUE)))    
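
In dplyr 1.0.0 and later, mutate_at() is superseded by across(). A hedged sketch of the same idea follows. The toy data frame is hypothetical, and the .names expression is just one possible way to get columns named a1, a2, ... instead of b1_a, b2_a, ...; it assumes the b columns are already numeric.

```r
library(dplyr)

# Toy data (hypothetical): two groups, two b-columns, one NA in each
df <- data.frame(
  groupid = c(1, 1, 2, 2),
  b1 = c(1, 3, NA, 5),
  b2 = c(10, 20, 30, NA)
)

out <- df %>%
  group_by(groupid) %>%
  mutate(across(
    matches("^b\\d+$"),
    ~ mean(.x, na.rm = TRUE),
    # Name each new column by swapping the leading "b" for "a" (b1 -> a1)
    .names = "{sub('^b', 'a', .col)}"
  )) %>%
  ungroup()
```

If the b columns are factors or character, as the as.numeric(as.character(...)) calls in the question suggest, the conversion would need to happen inside the function, for example ~ mean(as.numeric(as.character(.x)), na.rm = TRUE).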

Answer 1: (score: 2)

Maybe try the following:

# Evaluate one generated within() call per column, reassigning df each time so the new columns accumulate
for (k in 1:10) df <- eval(parse(text = sprintf("within(df, {a%d = ave(as.numeric(as.character(b%d)), groupid, FUN=function(x) mean(x, na.rm=TRUE))})", k, k)))
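
The same result can also be reached without building code as strings, by indexing columns by name with [[ ]]. This is only a sketch under assumptions: the toy data and the column count n_cols are hypothetical, and it is an alternative to the answer above rather than a restatement of it.

```r
# Toy data (hypothetical): b-columns stored as character, one NA in each
df <- data.frame(
  groupid = c(1, 1, 2, 2),
  b1 = c("1", "3", NA, "5"),
  b2 = c("10", "20", "30", NA),
  stringsAsFactors = FALSE
)

n_cols <- 2  # hypothetical; the question has 10 columns (b1 ... b10)

for (k in seq_len(n_cols)) {
  # Convert the k-th b-column to numeric, as in the question
  b <- as.numeric(as.character(df[[paste0("b", k)]]))
  # Store the per-row group mean in the matching a-column
  df[[paste0("a", k)]] <- ave(b, df$groupid,
                              FUN = function(x) mean(x, na.rm = TRUE))
}
```

Building the column names with paste0() keeps the loop body ordinary R code, so errors point at the failing line rather than at a parsed string.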