I am trying to aggregate a data frame using the function weighted.mean and keep getting an error. My data looks like this:
dat <- data.frame(date, nWords, v1, v2, v3, v4 ...)
I tried something like this:
aggregate(dat, by = list(dat$date), weighted.mean, w = dat$nWords)
but got
Error in weighted.mean.default(X[[1L]], ...) :
'x' and 'w' must have the same length
There is another thread that answers this question using plyr, but only for a single variable. I want to aggregate all variables this way.
Answer 0: (score: 1)
You can do this with data.table:
library(data.table)
#set up your data
dat <- data.frame(date = c("2012-01-01","2012-01-01","2012-01-01","2013-01-01",
"2013-01-01","2013-01-01","2014-01-01","2014-01-01","2014-01-01"),
nwords = 1:9, v1 = rnorm(9), v2 = rnorm(9), v3 = rnorm(9))
#make it into a data.table
dat = data.table(dat, key = "date")
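# grab the column names we want, generalized for v1:vwhatever
# (named cols rather than c, so it does not shadow base::c)
cols = colnames(dat)[-c(1,2)]
#get the weighted mean by date for each column
for(n in cols){
  # (n) := assigns to the column whose name is stored in n;
  # the older with = FALSE form is deprecated in current data.table
  dat[,
      (n) := weighted.mean(get(n), nwords),
      by = date]
}
#keep only the unique dates and weighted means
wms = unique(dat[,nwords:=NULL])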
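As a possible alternative to the loop, here is a minimal sketch that computes all the weighted means in one call with .SD and .SDcols, without modifying dat by reference. The example data and the names cols and wms are only illustrative, and it assumes the weight column is called nwords as above:

```r
library(data.table)

# illustrative data with the same layout as above
set.seed(1)
dat <- data.table(
  date   = rep(c("2012-01-01", "2013-01-01", "2014-01-01"), each = 3),
  nwords = 1:9,
  v1 = rnorm(9), v2 = rnorm(9), v3 = rnorm(9)
)

# every column except the grouping column and the weight column
cols <- setdiff(names(dat), c("date", "nwords"))

# within each date, apply weighted.mean to each column in .SD;
# nwords is subset to the current group, so x and w always match in length
wms <- dat[, lapply(.SD, weighted.mean, w = nwords), by = date, .SDcols = cols]
wms
```

This should return one row per date and leaves the original dat untouched.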
Answer 1: (score: 0)
Try using by:
# your numeric data
x <- 111:120
# the weights
ww <- 10:1
mat <- cbind(x, ww)
# the group variable (in your case is 'date')
y <- c(rep("A", 7), rep("B", 3))
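# by() hands each group's rows to the function, so the weights are subset with the values
by(data = data.frame(mat), INDICES = y, FUN = function(d) weighted.mean(d$x, d$ww))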
If you want the result as a data frame, I suggest the plyr package:
plyr::ddply(data.frame(mat, y), "y", plyr::summarise, wm = weighted.mean(x, ww))
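For the layout in the question (one weight column and many value columns), the same idea can be sketched in base R. The original aggregate call fails because aggregate passes only one group's piece of each column to weighted.mean, while w = dat$nWords is still the full-length vector. Splitting the rows first keeps the values and weights aligned. The example data and the names cols and res are assumptions for illustration:

```r
# illustrative data shaped like the data frame in the question
set.seed(1)
dat <- data.frame(
  date   = rep(c("2012-01-01", "2013-01-01", "2014-01-01"), each = 3),
  nWords = 1:9,
  v1 = rnorm(9), v2 = rnorm(9), v3 = rnorm(9)
)

# every column except the grouping column and the weight column
cols <- setdiff(names(dat), c("date", "nWords"))

# split the rows by date, then weight each group's columns by that group's nWords
res <- do.call(rbind, lapply(split(dat, dat$date), function(d) {
  data.frame(date = d$date[1],
             lapply(d[cols], weighted.mean, w = d$nWords))
}))
rownames(res) <- NULL
res
```

This gives one row per date with a weighted mean for each of the remaining columns.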