Conditional count, but without NA

Time: 2017-04-18 15:39:31

Tags: r count data.table

I have a data table like this:

library(data.table)
mydata<-
data.table(comname=c("hon","hon","hon","acer","acer","acer","acer","acer","acer"),
oversea=c(1,0,1,1,0,1,1,1,0),
year=c(1991,1992,1993,1981,1982,1983,1983,1984,1985),
hopecount=c(0,0,1,0,0,1,1,2,2))

   comname oversea year hopecount
1:     hon       1 1991         0
2:     hon       0 1992         0
3:     hon       1 1993         1
4:    acer       1 1981         0
5:    acer       0 1982         0
6:    acer       1 1983         1
7:    acer       1 1983         1
8:    acer       1 1984         2
9:    acer       0 1985         2

Within each comname, I number the distinct years in which oversea == 1, starting from zero:

mydata[oversea==1, mycount := match(year, unique(year))-1, comname];mydata

I want mycount to equal hopecount, but on rows where oversea == 0, mycount is NA.

Is there a way to make rows with oversea == 0 "not count" and be filled with the "previous count" instead of NA, as in the hopecount column?

Thanks a lot ^^

1 Answer:

Answer 0: (score: 0)

We can use na.locf from the zoo package to replace each NA element with the previous adjacent non-NA element:

library(data.table)
library(zoo)
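# Step 1: zero-based index of each distinct year on oversea == 1 rows; other rows stay NA.
# Step 2: carry the last count forward per company; na.rm = FALSE keeps any leading NA
# so the result always has the same length as the group.
mydata[oversea==1, hopecount2 := match(year, unique(year))-1, comname
          ][, hopecount2 := na.locf(hopecount2, na.rm = FALSE), comname]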
identical(mydata$hopecount, mydata$hopecount2)
#[1] TRUE
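
If you would rather not depend on zoo, the same two steps can probably be done with data.table alone. The sketch below assumes data.table 1.12.4 or later, where nafill() is available. The column name hopecount3 is made up for illustration and is not from the question.

```r
library(data.table)

# Same example data as in the question.
mydata <- data.table(
  comname   = c("hon","hon","hon","acer","acer","acer","acer","acer","acer"),
  oversea   = c(1,0,1,1,0,1,1,1,0),
  year      = c(1991,1992,1993,1981,1982,1983,1983,1984,1985),
  hopecount = c(0,0,1,0,0,1,1,2,2)
)

# Step 1: zero-based index of each distinct year, computed only on
# oversea == 1 rows; rows with oversea == 0 are left as NA.
mydata[oversea == 1, hopecount3 := match(year, unique(year)) - 1, by = comname]

# Step 2: carry the last observed count forward within each company.
# nafill() leaves leading NAs in place, so the length never changes.
mydata[, hopecount3 := nafill(hopecount3, type = "locf"), by = comname]

# Compare against the expected column from the question.
identical(mydata$hopecount, mydata$hopecount3)
```

Note that if a company's first row has oversea == 0, there is no earlier count to carry forward, so that row stays NA with either approach.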