My data frame looks like this:
df = data.table(type=rep(x=LETTERS[1:2], each=4),year=list(2009,2010,2013,2016,2003,2005,2009,2015), outcome = list(1,2,1,4,3,1,5,3))
type year outcome
1: A 2009 1
2: A 2010 2
3: A 2013 1
4: A 2016 4
5: B 2003 3
6: B 2005 1
7: B 2009 5
8: B 2015 3
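What I want to do is, for each row, compute the previous mean of outcome, grouped by type.
By "previous" I mean that, for a row r with type = A, I want the mean of outcome over all rows j with type = A and j.year < r.year.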
In this case, it would give:
type year outcome previousMean
1: A 2009 1 0
2: A 2010 2 1
3: A 2013 1 1.5
4: A 2016 4 1.33
5: B 2003 3 0
6: B 2005 1 3
7: B 2009 5 2
8: B 2015 3 3
Thanks.
Answer 0: (score: 0)
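For each 'type', we can loop over the sequence of rows, subset 'outcome' up to each position, take the mean, unlist, concatenate with 0 (the first row has no previous values), and assign (:=) to create 'previousMean'. This assumes the rows are already ordered by year within each type.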
# seq_len() avoids the 1:0 pitfall when a type has only one row
df[, previousMean := c(0, unlist(lapply(seq_len(.N - 1L),
    function(i) mean(outcome[seq_len(i)])))), by = type]
Another option is cummean from dplyr:
library(dplyr)
df[, previousMean := c(0,cummean(outcome)[-.N]), by = type]
Data (with year and outcome as plain numeric vectors rather than lists):
library(data.table)
df = data.table(type=rep(x=LETTERS[1:2], each=4),
                year=c(2009,2010,2013,2016,2003,2005,2009,2015),
                outcome = c(1,2,1,4,3,1,5,3))
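
If you would rather not load dplyr, the same result can probably be had with data.table alone. The following is only a sketch: it sorts explicitly by type and year instead of relying on the input order, and it assumes no two rows of the same type share a year (with ties, the strict j.year < r.year condition from the question would need extra handling). The helper names prev_sum and prev_n are just illustrative.

```r
library(data.table)

# Same example data as in the question, with plain numeric columns
df <- data.table(type = rep(LETTERS[1:2], each = 4),
                 year = c(2009, 2010, 2013, 2016, 2003, 2005, 2009, 2015),
                 outcome = c(1, 2, 1, 4, 3, 1, 5, 3))

# Sort so that "previous" means "earlier year within the same type"
setorder(df, type, year)

df[, previousMean := {
  prev_sum <- shift(cumsum(outcome), fill = 0)  # sum of outcomes before this row
  prev_n   <- seq_len(.N) - 1L                  # number of rows before this row
  # The first row of each type has no earlier rows; use 0 as in the question
  ifelse(prev_n == 0L, 0, prev_sum / prev_n)
}, by = type]

print(df)
```

Dividing a lagged running sum by a running count avoids recomputing the mean from scratch for every row, which may matter for large groups.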