Filtering records up to a specified value in a column, by Id, in R

Time: 2016-04-08 13:40:10

Tags: r

I have the following data (df):

    Id     event    label
     1    eating    0
     1    walking   0
     1    sleeping  finish
     1    dreaming  stage changed
     1    snoring   0
     2    drinking  0
     2    running   finish
     2    resting   0
     2    relaxing  0

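Here, for each Id (case), label = "finish" marks the completion of the case. I am trying to keep the records up to and including label = "finish" and delete the remaining records for that Id. The result should look like this: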

    Id     event    label
     1    eating    0
     1    walking   0
     1    sleeping  finish
     2    drinking  0
     2    running   finish

I tried the following, but it did not help. Any suggestions would be appreciated. Thanks.

 df <- data.table(df)
 setDT(df)[label =="finish", by=parent_id]

2 Answers:

Answer 0 (score: 3)

Using data.table, we can do this:

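library(data.table)
# Within each Id, keep rows 1 through the row where label is "finish"
# (this relies on exactly one "finish" per Id)
setDT(df)[, .SD[1:which(label == "finish")], by = Id]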
#   Id    event  label
#1:  1   eating      0
#2:  1  walking      0
#3:  1 sleeping finish
#4:  2 drinking      0
#5:  2  running finish
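
The one-liner above relies on every Id having exactly one "finish" row. If that cannot be guaranteed, a more defensive variant could look like the sketch below. It rebuilds the question's data so it runs on its own. Keeping the whole group when an Id has no "finish" is an assumption about the desired behaviour, and `n_keep` and `result` are illustrative names.

```r
library(data.table)

# Example data copied from the question
df <- data.table(
  Id    = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  event = c("eating", "walking", "sleeping", "dreaming", "snoring",
            "drinking", "running", "resting", "relaxing"),
  label = c("0", "0", "finish", "stage changed", "0",
            "0", "finish", "0", "0")
)

result <- df[, {
  # Position of the first "finish" in this group; if there is none,
  # fall back to the group size .N so the whole group is kept (assumption)
  n_keep <- match("finish", label, nomatch = .N)
  head(.SD, n_keep)
}, by = Id]
```

Because `match()` returns only the first match, an Id with several "finish" rows is cut at the first one instead of producing a warning from `1:which(...)`.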

Answer 1 (score: 2)

If every Id has a "finish" and all the observations are sorted as shown above, here is a somewhat longer answer using base R:

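# First row of each Id (assumes rows are grouped by Id)
start <- which(!duplicated(df$Id))
# Row holding "finish" for each Id (assumes exactly one per Id)
end <- which(df$label == "finish")
# Loop over group positions, not Id values, so Ids need not be 1, 2, ...
keepObs <- unlist(lapply(seq_along(start), function(i) start[i]:end[i]))

dfKeepers <- df[keepObs, ]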
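
For comparison, here is a base R sketch that drops the "exactly one finish per Id" requirement by counting, within each Id, how many "finish" rows came before the current row. It is an alternative technique, not the method above. It rebuilds the question's data so it runs on its own, and `is_finish` and `finished_before` are illustrative names.

```r
# Example data copied from the question
df <- data.frame(
  Id    = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  event = c("eating", "walking", "sleeping", "dreaming", "snoring",
            "drinking", "running", "resting", "relaxing"),
  label = c("0", "0", "finish", "stage changed", "0",
            "0", "finish", "0", "0"),
  stringsAsFactors = FALSE
)

# 1 where the row is a "finish" row, 0 otherwise
is_finish <- as.integer(df$label == "finish")

# Per Id: number of "finish" rows strictly before the current row
finished_before <- ave(is_finish, df$Id, FUN = function(x) cumsum(x) - x)

# Keep rows up to and including the first "finish" of each Id
dfKeepers <- df[finished_before == 0, ]
```

An Id with no "finish" row is kept in full here, which may or may not be what is wanted. The rows still need to be in chronological order within each Id.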