I want to remove observations that are not followed by a consecutive year. This includes the last observation.
dput(head(df))
structure(list(ID = c(13302, 13302, 14401, 14401, 14401, 14401),
    Jaar = c(2012, 2015, 2012, 2013, 2015, 2016)),
    class = "data.frame", row.names = c(NA, -6L))
In this example, both observations for ID 13302 would be removed, and for ID 14401 the 2013 and 2016 observations would be removed.
Can anyone help with the code in R? Thanks in advance!
Answer 0 (score: 3)
A slightly shorter version:
data.table::setDT(df)[,ind:=c(diff(Jaar),NA),by="ID"][ind %in% 1,]
ID Jaar ind
1: 14401 2012 1
2: 14401 2015 1
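For readers who want to see the same idea without data.table, here is a minimal base R sketch. It is not part of the original answer; the names `gap` and `res` are chosen only for illustration, and it assumes the rows are already sorted by `Jaar` within each `ID`, as in the question's sample.

```r
df <- data.frame(
  ID   = c(13302, 13302, 14401, 14401, 14401, 14401),
  Jaar = c(2012, 2015, 2012, 2013, 2015, 2016)
)

# Gap to the next year within each ID; the last row of an ID
# has no successor, so it gets NA.
gap <- ave(df$Jaar, df$ID, FUN = function(x) c(diff(x), NA))

# %in% never returns NA, so rows with an NA gap are dropped
# together with rows whose gap is larger than 1.
res <- df[gap %in% 1, ]
print(res)
```

Using `%in% 1` rather than `== 1` is what lets the last row per ID fall out without a separate `is.na()` check.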
Answer 1 (score: 2)
Try this code:
library(data.table)
df <- data.table(ID = c(13302, 13302, 14401, 14401, 14401, 14401), Jaar = c(2012, 2015, 2012, 2013, 2015, 2016))
df[, diff := shift(Jaar, type = 'lead'), by = 'ID'][, diff := diff - Jaar]
df[, id := rleid(diff)]
df[diff == 1][, head(.SD, n = 1), by = 'id'][, .(ID, Jaar)]
Output
ID Jaar
1: 14401 2012
2: 14401 2015
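One behaviour of the `rleid()` step is worth checking against your own data: it keeps only the first row of each run of equal gaps. The sketch below uses hypothetical data (a single ID observed in three consecutive years, not taken from the question) and illustrative column names `gap` and `run` to show the difference from a plain "has a successor" filter.

```r
library(data.table)

# Hypothetical data: one ID with three consecutive years
dt <- data.table(ID = c(1, 1, 1), Jaar = c(2012, 2013, 2014))

# Gap to the next year within each ID (NA for the last row)
dt[, gap := shift(Jaar, type = "lead") - Jaar, by = ID]
# Run id over equal consecutive gaps, as in the answer above
dt[, run := rleid(gap)]

# Keeps only the first row of each run of gap == 1
first_of_run <- dt[gap == 1][, head(.SD, n = 1), by = run][, .(ID, Jaar)]

# Keeps every row that is followed by the next year
all_with_successor <- dt[gap == 1, .(ID, Jaar)]

print(first_of_run)
print(all_with_successor)
```

On the question's sample both variants give the same two rows, so which one is wanted depends on how longer runs of consecutive years should be treated. Note also that `rleid()` is computed over the whole table here, not per ID.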
Answer 2 (score: 2)
Here is another data.table approach:
library(data.table)
setDT(df)
df[df[, Jaar+1 == shift(Jaar, type = "lead"), by = ID]$V1]
# ID Jaar
#1: 14401 2012
#2: 14401 2015
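This approach relies on the rows being grouped by `ID` and ordered by `Jaar`, because the grouped result is lined up with the table by position. If that is not guaranteed, a possible sketch is to sort first with `setorder()`. The shuffled data and the names `dt` and `res` below are assumptions for illustration only.

```r
library(data.table)

# Same observations as in the question, but in shuffled order
dt <- data.table(
  ID   = c(14401, 13302, 14401, 14401, 13302, 14401),
  Jaar = c(2016, 2015, 2012, 2015, 2012, 2013)
)

# Sort by ID, then by year, so the grouped result matches row order
setorder(dt, ID, Jaar)

# TRUE where the next row of the same ID is exactly one year later;
# the last row per ID yields NA, which data.table treats as FALSE in i
res <- dt[dt[, Jaar + 1 == shift(Jaar, type = "lead"), by = ID]$V1]
print(res)
```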