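Referring to the question "Calculating average of based on condition", I need to calculate the average of column E based on column F.
Below is part of my data frame df; my actual data has 65K values.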
E F
3.130658445 -1
4.175605237 -1
4.949554963 0
4.653496112 0
4.382672845 0
3.870951272 0
3.905365677 0
3.795199341 0
3.374740696 0
3.104690415 0
2.801178871 0
2.487881321 0
2.449349554 0
2.405409636 0
2.090901539 0
1.632416356 0
1.700583696 0
1.846504012 0
1.949797831 0
1.963114449 0
2.033100326 0
2.014312751 0
1.997178247 0
2.143775497 0
Following the solution provided in that post, here is my script.
setDT(df)[, Avg := c(rep(mean(head(df$E, 5)), 5), rep(0, .N-5)),
cumsum(c(TRUE, diff(abs(F)!=1)==1))]
But when I run it, I get the following error.
Error in rep(0, .N - 5) : invalid 'times' argument
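A likely cause of this error, sketched here in base R on a small made-up sample shaped like the data above: the grouping expression puts the two leading F = -1 rows into their own group, so .N - 5 is negative there and rep() rejects it. The helper first5_avg is hypothetical (not from the linked post) and assumes the intent is "mean of the first up to 5 values in each group, zeros afterwards".

```r
# Small illustrative sample shaped like the question's data (values rounded)
E <- c(3.13, 4.18, 4.95, 4.65, 4.38, 3.87, 3.91, 3.80)
F <- c(-1, -1, 0, 0, 0, 0, 0, 0)

# Same grouping expression as in the script:
# a new group starts where abs(F) changes from 1 to something else
grp <- cumsum(c(TRUE, diff(abs(F) != 1) == 1))
sizes <- table(grp)   # group 1 has 2 rows, group 2 has 6 rows

# For the 2-row group, .N - 5 is -3, and rep() refuses a negative count
msg <- tryCatch(rep(0, 2 - 5), error = function(e) conditionMessage(e))

# Hypothetical helper: mean of the first (up to) 5 values,
# padded with zeros, never asking rep() for a negative count
first5_avg <- function(x) {
  k <- min(5, length(x))
  c(rep(mean(head(x, 5)), k), rep(0, length(x) - k))
}
Avg <- ave(E, grp, FUN = first5_avg)
```

In data.table terms the same guard would presumably be min(5, .N) and max(.N - 5, 0) in place of the fixed 5 and .N - 5. Note also that inside the grouped expression the per-group column is E; df$E refers to the whole column.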
Answer 0: (score: 1)
Using aggregate:
agg <- aggregate(df$E,by=list(df$F), FUN=mean)
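Your script uses data.table syntax, but the question says you have a data frame. With data.table (df must first be converted, e.g. with setDT(df)):
# this will retain all rows and return mean as a new column (per group)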
df[, Mean:=mean(E), by=list(F)]
# this will return means per group only
df[, mean(E),by=.(F)]
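For readers without data.table, the same two results (one mean per group, or the group mean repeated on every row) can be sketched in base R. This is a minimal illustration using only the first four rows of the question's data; the column name Mean mirrors the data.table line above.

```r
# Illustrative subset of the question's data (first four rows only)
df <- data.frame(
  E = c(3.130658445, 4.175605237, 4.949554963, 4.653496112),
  F = c(-1, -1, 0, 0)
)

# One row per group: mean of E for each distinct value of F
agg <- aggregate(E ~ F, data = df, FUN = mean)

# Keep all rows: ave() repeats each group's mean on every row of that group
df$Mean <- ave(df$E, df$F, FUN = mean)
```

The formula interface names the result columns F and E, whereas the by=list(...) form shown earlier produces generic names such as Group.1 and x.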
Answer 1: (score: 0)
Try this:
dt <- data.table(df)
dt[,Avg:=mean(E),by="F"]
dt <- unique(dt,by="F")
Here is the result:
E F Avg
1: 3.130658 -1 3.653132
2: 4.949555 0 2.797826
If you only run this:
dt <- data.table(df)
dt[,Avg:=mean(E),by="F"]
you get:
E F Avg
1: 3.130658 -1 3.653132
2: 4.175605 -1 3.653132
3: 4.949555 0 2.797826
4: 4.653496 0 2.797826
5: 4.382673 0 2.797826
6: 3.870951 0 2.797826
7: 3.905366 0 2.797826
8: 3.795199 0 2.797826
9: 3.374741 0 2.797826
10: 3.104690 0 2.797826
11: 2.801179 0 2.797826
12: 2.487881 0 2.797826
13: 2.449350 0 2.797826
14: 2.405410 0 2.797826
15: 2.090902 0 2.797826
16: 1.632416 0 2.797826
17: 1.700584 0 2.797826
18: 1.846504 0 2.797826
19: 1.949798 0 2.797826
20: 1.963114 0 2.797826
21: 2.033100 0 2.797826
22: 2.014313 0 2.797826
23: 1.997178 0 2.797826
24: 2.143775 0 2.797826