My data looks like this:
Week_ID County State date ZCTA T_mean_F Precipitation holiday Units
523 Carroll Iowa 01/01/2010 51401 5.669194 0 1 0
523 Carroll Iowa 01/01/2010 51430 5.757368 0 1 0
523 Carroll Iowa 01/01/2010 51436 5.355239 0 1 0
523 Carroll Iowa 01/01/2010 51440 6.055060 0 1 0
523 Carroll Iowa 01/01/2010 51443 5.806877 0 1 0
523 Carroll Iowa 01/01/2010 51444 5.995150 0 1 0
523 Carroll Iowa 01/01/2010 51451 5.003030 0 1 0
523 Carroll Iowa 01/01/2010 51455 6.342612 0 1 0
523 Carroll Iowa 01/01/2010 51459 5.500786 0 1 0
523 Carroll Iowa 01/01/2010 51463 6.303967 0 1 0
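This is just the first 10 rows. The full data set has many different Week_ID and ZCTA values.
What I'd like to do is take the average of "T_mean_F", and the sum of "Precipitation" and "Units", by ZCTA & Week_ID, ideally in one call. The end result would look something like this (just an example, not actual output):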
Week_ID ZCTA T_mean_avg Prep_avg Units
523 51401 5.669194 2 10
524 51401 5.757368 3 12
525 51401 5.355239 7 14
Here is what I tried:
Rollup = Wthr_UMW_dwu[,.(T_mean_avg = mean(T_mean_F),Prep_avg = mean(Precipitaton), Units=sum(Units)), by=.(ZCTA,Week_ID)]
and
Rollup_1<- aggregate(cbind(T_mean_F,Precipitation,Units) ~ ZCTA + Week_ID, data=Wthr_UMW_dwu, FUN = function(x) c(mn=mean(x), MN=mean(x), n = sum(x)))
I modeled both attempts on previous questions on this topic, and both produced errors.
Does anyone know a smooth/elegant way to solve this?
Thanks, -Keith
Answer 0 (score: 1)
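library(data.table)
# x stands for your data (Wthr_UMW_dwu in the question); setDT converts it to a data.table by reference
setDT(x)
# One grouped call: a different summary per column, grouped by ZCTA and Week_ID
x[, .(
  avg.T_mean_F = mean(T_mean_F),
  avg.P = mean(Precipitation),
  s.Units = sum(Units)
), by = .(ZCTA, Week_ID)]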
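To make the grouped call easy to check, here is a minimal self-contained sketch on a small toy table. The values in `dt` are made up for illustration and are not the asker's data; the names `dt`, `rollup` and `Prep_sum` are likewise hypothetical. It sums Precipitation, as the question text asks, rather than averaging it. As an aside, the first attempt in the question spells the column `Precipitaton`, which would probably be enough on its own to raise an "object not found" error.

```r
library(data.table)

# Toy data (made-up values): two ZCTAs observed over two weeks
dt <- data.table(
  Week_ID       = c(523, 523, 523, 523, 524, 524),
  ZCTA          = c(51401, 51401, 51430, 51430, 51401, 51401),
  T_mean_F      = c(5, 7, 6, 8, 10, 12),
  Precipitation = c(0, 1, 0, 0, 2, 3),
  Units         = c(0, 2, 1, 1, 4, 0)
)

# One grouped call: mean of temperature, sums of precipitation and units
rollup <- dt[, .(
  T_mean_avg = mean(T_mean_F),
  Prep_sum   = sum(Precipitation),
  Units      = sum(Units)
), by = .(Week_ID, ZCTA)]

print(rollup)
```

Each Week_ID/ZCTA pair ends up as one row in `rollup`. Swapping `sum(Precipitation)` for `mean(Precipitation)` gives the averaged version used in the answer above.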