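Subset based on a comparison of rows within the same column

Time: 2016-06-24 19:28:56

Tags: r

I want to select only those Persons who have "nothing" in the EVENT column after their last "RFA".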

Input:

structure(list(Person = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
3L, 3L, 4L, 5L, 5L, 5L, 5L), Date = c("13/04/13", "14/05/14", 
"14/05/14", "15/02/15", "13/04/13", "14/05/14", "14/08/14", "14/09/14", 
"14/08/15", "15/10/12", "15/10/14", "15/10/12", "04/03/13", "05/03/13", 
"06/03/13", "07/03/13"), EVENT = c("RFA", "RFA", "RFA", "nothing", 
"RFA", "EMR", "nothing", "RFA", "nothing", "nothing", "nothing", 
"EMR", "RFA", "RFA", "RFA", "nothing")), .Names = c("Person", 
"Date", "EVENT"), class = "data.frame", row.names = c(NA, -16L
))

Output:

Person  Date    EVENT
1   13/04/13    RFA
1   14/05/14    RFA
1   14/05/14    RFA
1   15/02/15    nothing
2   13/04/13    RFA
2   14/05/14    EMR
2   14/08/14    nothing
2   14/09/14    RFA
2   14/08/15    nothing
5   04/03/13    RFA
5   05/03/13    RFA
5   06/03/13    RFA
5   07/03/13    nothing

My attempt:

library(dplyr)
PostAblation<-Therap %>% 
  arrange(Person, as.Date(Therap$Date, '%d/%m/%y')) %>% 
  group_by(Person) %>% 
  filter(last(EVENT == "nothing") & EVENT == "RFA")

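But I am not getting the result I expect.

3 answers:

Answer 0: (score: 3)

I think your logic is a bit more complicated than it needs to be. Something like this may work: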

df %>% group_by(Person) %>% filter(EVENT[max(which(EVENT == "RFA")) + 1] == "nothing")

Source: local data frame [13 x 3]
Groups: Person [3]

   Person     Date   EVENT
    (int)    (chr)   (chr)
1       1 13/04/13     RFA
2       1 14/05/14     RFA
3       1 14/05/14     RFA
4       1 15/02/15 nothing
5       2 13/04/13     RFA
6       2 14/05/14     EMR
7       2 14/08/14 nothing
8       2 14/09/14     RFA
9       2 14/08/15 nothing
10      5 04/03/13     RFA
11      5 05/03/13     RFA
12      5 06/03/13     RFA
13      5 07/03/13 nothing
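The filter above depends on the row order within each Person, so the rows need to be in chronological order first. A minimal sketch of that sorting step in base R, assuming the dd/mm/yy format from the question; the `events` data frame and its values are hypothetical and only serve as an illustration:

```r
# Hypothetical, deliberately unsorted sample in the question's dd/mm/yy format
events <- data.frame(
  Person = c(1L, 1L, 1L),
  Date   = c("15/02/15", "13/04/13", "14/05/14"),
  EVENT  = c("nothing", "RFA", "RFA"),
  stringsAsFactors = FALSE
)

# Parse the strings as real dates; sorting the raw strings would compare the day first
parsed <- as.Date(events$Date, format = "%d/%m/%y")

# Order by Person, then chronologically within each Person
events_sorted <- events[order(events$Person, parsed), ]
events_sorted$EVENT
# "RFA" "RFA" "nothing"
```

Sorting on the parsed dates rather than on the character column matters here, because "15/02/15" sorts after "14/05/14" as text only by coincidence of the day field.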

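This assumes your data is already ordered by Person and Date, and it keeps a Person only if the last RFA is immediately followed by nothing. A modified version would be: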

df %>% group_by(Person) %>% filter(max(which(EVENT == "nothing")) > max(which(EVENT == "RFA")) & 
       length(which(EVENT == "RFA")) != 0)

This is less strict than the first version: it holds as long as the Person has both nothing and RFA, and a nothing occurs somewhere after the last RFA.
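To make the difference between the two conditions concrete, here is a small base R sketch that applies both checks to a single Person's EVENT vector. The helper names `strict_check` and `relaxed_check` and the sample vector are hypothetical and not part of the answer; they only mirror the two filter conditions:

```r
# Strict check: the entry right after the last "RFA" must be "nothing"
strict_check <- function(ev) {
  rfa <- which(ev == "RFA")
  if (length(rfa) == 0) return(FALSE)
  # isTRUE() turns the NA from an out-of-range index into FALSE
  isTRUE(ev[max(rfa) + 1] == "nothing")
}

# Relaxed check: some "nothing" occurs anywhere after the last "RFA"
relaxed_check <- function(ev) {
  rfa <- which(ev == "RFA")
  nothing <- which(ev == "nothing")
  length(rfa) > 0 && length(nothing) > 0 && max(nothing) > max(rfa)
}

# Hypothetical sequence: an "EMR" sits between the last "RFA" and "nothing"
ev <- c("RFA", "EMR", "nothing")
strict_check(ev)   # FALSE
relaxed_check(ev)  # TRUE
```

On the question's data both checks select the same Persons; they only diverge when another event sits between the last RFA and a later nothing.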

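Answer 1: (score: 2)

You can split the data on Person and look within each piece for the maximum index of the "RFA" entries. Then add 1 to that index and check whether the entry for the next EVENT is "nothing". If so, you keep that Person: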

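# Split the data frame into one piece per Person
splitPerson <- split(d, d$Person)

# Index of the row right after each Person's last "RFA"
afterNothing <- lapply(splitPerson, function(ii) max(which(ii$EVENT == "RFA")) + 1)

# Person ids (list names, not list positions) whose next row is "nothing"
keepers <- names(which(mapply(function(x, y) x[["EVENT"]][y] == "nothing", splitPerson, afterNothing)))

d[d[["Person"]] %in% keepers, ]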
#   Person     Date   EVENT
#1       1 13/04/13     RFA
#2       1 14/05/14     RFA
#3       1 14/05/14     RFA
#4       1 15/02/15 nothing
#5       2 13/04/13     RFA
#6       2 14/05/14     EMR
#7       2 14/08/14 nothing
#8       2 14/09/14     RFA
#9       2 14/08/15 nothing
#13      5 04/03/13     RFA
#14      5 05/03/13     RFA
#15      5 06/03/13     RFA
#16      5 07/03/13 nothing

Answer 2: (score: 1)

Another option is to use data.table:

library(data.table)
setDT(df)[, if(any(EVENT == "RFA") & all(EVENT[tail(which(EVENT == "RFA"), 
                1)+1]=="nothing")) .SD , Person]
#    Person     Date   EVENT
#1:      1 13/04/13     RFA
#2:      1 14/05/14     RFA
#3:      1 14/05/14     RFA
#4:      1 15/02/15 nothing
#5:      2 13/04/13     RFA
#6:      2 14/05/14     EMR
#7:      2 14/08/14 nothing
#8:      2 14/09/14     RFA
#9:      2 14/08/15 nothing
#10:     5 04/03/13     RFA
#11:     5 05/03/13     RFA
#12:     5 06/03/13     RFA
#13:     5 07/03/13 nothing