Question

I have a sample dataset containing climate data for different seasons:

df <- data.frame(season=rep(1:5,2),year=rep(1:2,each=5),
      temp=c(2,4,3,5,2,4,1,5,4,3),ppt=c(4,3,1,5,6,2,1,2,2,2),
      samples=c(22,25,24,31,31,29,28,31,30,32))

I can easily compute the mean of each climate variable for every season in every year:

aggregate(df[,c('temp','ppt')], by = list(df$season,df$year), function(x) mean(x,na.rm=TRUE))

However, I would like to compute a weighted mean for each season and year combination, using the variable samples as the weights.

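Essentially, I want to replace mean with weighted.mean inside aggregate(). That requires adding a second argument to my function, and that argument has to change along with x: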

    function(x,w) weighted.mean(x,w,na.rm=TRUE)

However, I am not sure how to make the weights argument of weighted.mean() ('w') vary with each subset of the aggregated data.
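
For a single subset, the computation being asked for might look like the sketch below. Picking year 1 as the subset, and the names yr1, wm_temp_yr1 and manual, are assumptions made only for illustration. The open problem is getting the same pairing of values and weights for every group.

```r
df <- data.frame(season=rep(1:5,2),year=rep(1:2,each=5),
      temp=c(2,4,3,5,2,4,1,5,4,3),ppt=c(4,3,1,5,6,2,1,2,2,2),
      samples=c(22,25,24,31,31,29,28,31,30,32))

# Hypothetical single subset: all rows from year 1
yr1 <- df[df$year == 1, ]

# Weighted mean of temp, with that subset's own samples as weights
wm_temp_yr1 <- weighted.mean(yr1$temp, yr1$samples, na.rm = TRUE)

# The same quantity written out: sum(w * x) / sum(w)
manual <- sum(yr1$samples * yr1$temp) / sum(yr1$samples)
```

The difficulty is that aggregate() hands FUN one column's subset at a time, so the matching slice of samples is not available inside the function.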

Can I do all of this within the aggregate function?

Any suggestions would be great!

Answer 1

Try summarise_each from dplyr. It works on data previously grouped with group_by and applies the function to multiple columns:

library(dplyr)
df %>% group_by(season, year) %>%
        summarise_each(funs(weighted.mean(., samples,na.rm=T)), temp,ppt)
# Source: local data frame [10 x 5]
# Groups: season, year [10]
# 
#    season  year  temp   ppt samples
#    (int) (int) (dbl) (dbl)   (dbl)
# 1       1     1     2     4      22
# 2       2     1     4     3      25
# 3       3     1     3     1      24
# 4       4     1     5     5      31
# 5       5     1     2     6      31
# 6       1     2     4     2      29
# 7       2     2     1     1      28
# 8       3     2     5     2      31
# 9       4     2     4     2      30
# 10      5     2     3     2      32
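
In current dplyr releases summarise_each() and funs() are deprecated, so a rough equivalent using summarise() with across() (dplyr 1.0.0 or later) may be useful. The sketch below also shows one possible way to stay inside aggregate(), by aggregating over row indices so the function can reach both the values and the weights. Grouping by year alone is an assumption made here so that each group has more than one row and the weights have a visible effect. The names by_year and idx are likewise only illustrative.

```r
library(dplyr)

df <- data.frame(season=rep(1:5,2),year=rep(1:2,each=5),
      temp=c(2,4,3,5,2,4,1,5,4,3),ppt=c(4,3,1,5,6,2,1,2,2,2),
      samples=c(22,25,24,31,31,29,28,31,30,32))

# across() applies the lambda to each selected column;
# `samples` is looked up within the current group.
by_year <- df %>%
  group_by(year) %>%
  summarise(across(c(temp, ppt), ~ weighted.mean(.x, samples, na.rm = TRUE)),
            .groups = "drop")

# Base R alternative: aggregate row indices instead of values, so FUN can
# index into both the value column and the weight column.
idx <- aggregate(seq_len(nrow(df)), by = list(year = df$year),
                 FUN = function(i) weighted.mean(df$temp[i], df$samples[i],
                                                 na.rm = TRUE))
```

With group_by(season, year) on this sample data every group holds a single row, so the weighted means simply equal the raw values. The index-based aggregate() call handles one variable at a time, and its result lands in a column named x.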
