Remove observations that are not followed by an observation in the next consecutive year

Time: 2017-11-30 08:59:25

Tags: r

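I want to remove observations that are not followed by an observation in the next consecutive year. This also applies to the last observation.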

dput(head(df))
structure(list(ID = c(13302, 13302, 14401, 14401, 14401, 14401),
               Jaar = c(2012, 2015, 2012, 2013, 2015, 2016)),
          class = "data.frame", row.names = c(NA, -6L))

In this example, both observations for ID 13302 would be removed, and the 2013 and 2016 observations for ID 14401 would be removed.

Can anyone help with the code in R? Thanks in advance!

3 Answers:

Answer 0 (score: 3)

A slightly shorter version:

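# ind = gap in years to the next row within each ID (NA for the last row); keep rows where the gap is 1
data.table::setDT(df)[, ind := c(diff(Jaar), NA), by = "ID"][ind %in% 1, ]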

Output:

      ID Jaar ind
1: 14401 2012   1
2: 14401 2015   1
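
For readers who want to see the intermediate `ind` column without data.table, the same idea can be sketched in base R. This is only an illustration, not part of the original answer: it rebuilds the question's sample data, assumes rows are already sorted by `Jaar` within each `ID`, and uses `ave()` in place of the `by = "ID"` grouping. The name `kept` is made up for this sketch.

```r
# Sample data from the question, rebuilt as a plain data.frame
df <- data.frame(ID   = c(13302, 13302, 14401, 14401, 14401, 14401),
                 Jaar = c(2012, 2015, 2012, 2013, 2015, 2016))

# Gap in years to the next observation within each ID;
# the last row of every ID has no successor, so it gets NA
df$ind <- ave(df$Jaar, df$ID, FUN = function(x) c(diff(x), NA))

# %in% returns FALSE for NA, so the last row of each ID is dropped as well
kept <- df[df$ind %in% 1, c("ID", "Jaar")]
print(kept)
```

Using `%in% 1` rather than `== 1` avoids NA values in the row index, which is why the one-liner above needs no explicit NA handling.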

Answer 1 (score: 2)

Try this code:

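library(data.table)
df <- data.table(ID = c(13302, 13302, 14401, 14401, 14401, 14401), Jaar = c(2012, 2015, 2012, 2013, 2015, 2016))
# years until the next observation within each ID (NA for the last row of an ID)
df[, diff := shift(Jaar, type = 'lead'), by = 'ID'][, diff := diff - Jaar]
# run id over consecutive rows that share the same gap
df[, id := rleid(diff)]
# keep rows followed by the next year; head(.SD, n = 1) keeps only the first row of each such run
df[diff == 1][, head(.SD, n = 1), by = 'id'][, .(ID, Jaar)]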

Output:

      ID Jaar
1: 14401 2012
2: 14401 2015

Answer 2 (score: 2)

Here is another data.table approach:

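library(data.table)
setDT(df)
# TRUE where the next row of the same ID is exactly one year later;
# data.table treats the NA produced for each ID's last row as FALSE when subsetting
df[df[, Jaar+1 == shift(Jaar, type = "lead"), by = ID]$V1]
#      ID Jaar
#1: 14401 2012
#2: 14401 2015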
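
One caveat that seems worth spelling out: the approaches above compare each row with the row that follows it, so they appear to rely on the data being sorted by `Jaar` within each `ID`, as it is in the question's sample. The base R sketch below is an illustration under that assumption, not code from any of the answers. It uses a hypothetical shuffled copy of the sample data, and the names `df_unsorted`, `df_sorted`, `next_jaar`, `keep` and `result` are invented here.

```r
# Hypothetical shuffled copy of the question's sample data
df_unsorted <- data.frame(ID   = c(14401, 13302, 14401, 14401, 13302, 14401),
                          Jaar = c(2016, 2015, 2013, 2012, 2012, 2015))

# Sort by ID, then by year, so that "next row" means "next observed year"
df_sorted <- df_unsorted[order(df_unsorted$ID, df_unsorted$Jaar), ]

# Year of the following observation within each ID (NA for the last row)
next_jaar <- ave(df_sorted$Jaar, df_sorted$ID,
                 FUN = function(x) c(x[-1], NA))

# Keep a row only if the following observation is exactly one year later
keep <- !is.na(next_jaar) & next_jaar == df_sorted$Jaar + 1
result <- df_sorted[keep, ]
print(result)
```

With data.table, a call to `setorder(df, ID, Jaar)` before applying any of the answers should serve the same purpose.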