Remove the rows of a data frame that lie between two marker values in a column

Asked: 2017-04-22 15:04:25

Tags: r dataframe row subset

Consider the data frame df:

df <- data.frame(time = c(1,2,3,4,5,6,7,8,9,10), marker = c(NA,NA,NA,"stop",NA,NA,NA,"start",NA,NA), behaviour = c("Rest","Rest","Rest","Rest","Awake","Awake","Awake","Awake","Awake","Rest"))

   time marker behaviour
1     1   <NA>      Rest
2     2   <NA>      Rest
3     3   <NA>      Rest
4     4   stop      Rest
5     5   <NA>     Awake
6     6   <NA>     Awake
7     7   <NA>     Awake
8     8  start     Awake
9     9   <NA>     Awake
10   10   <NA>      Rest

I would like to subset the data based on the column marker so that the rows between the elements "stop" and "start" are excluded, and df ends up looking like this:

time marker behaviour
   1   <NA>      Rest
   2   <NA>      Rest
   3   <NA>      Rest
   4   stop      Rest
   8   start     Awake
   9   <NA>     Awake
   10  <NA>      Rest

3 Answers:

Answer 0 (score: 1)

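We can use a numeric index to subset the rows. This assumes exactly one "stop" followed by one "start", with at least one row between them: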

i1 <- with(df, which(marker %in% c("stop", "start")))
df[-((i1[1]+1):(i1[2]-1)),]
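
The range `(i1[1]+1):(i1[2]-1)` counts backwards when "stop" and "start" sit on adjacent rows, because `:` in R happily builds descending sequences. Below is a minimal sketch of a guarded variant, not part of the original answer. The names `i_stop`, `i_start`, `between` and `res_idx` are made up for illustration, and it still assumes a single "stop" that comes before a single "start".

```r
df <- data.frame(
  time = 1:10,
  marker = c(NA, NA, NA, "stop", NA, NA, NA, "start", NA, NA),
  behaviour = c(rep("Rest", 4), rep("Awake", 5), "Rest")
)

# Positions of the first "stop" and the first "start"; which() skips NA.
i_stop  <- which(df$marker == "stop")[1]
i_start <- which(df$marker == "start")[1]

# Rows strictly between the two markers; an empty vector when they are adjacent.
between <- seq_len(max(i_start - i_stop - 1, 0)) + i_stop

# A logical index is used because df[-integer(0), ] would return zero rows.
res_idx <- df[!(seq_len(nrow(df)) %in% between), ]
res_idx
```

With the example data, `between` is rows 5 to 7, so the result matches the desired output in the question.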

If there are multiple "start"/"stop" pairs, we can instead do

grp <- with(df, c(0, head(cumsum(marker == "stop" & !is.na(marker)),-1)))
df[with(df, ave(marker == "start" & !is.na(marker),
             grp, FUN = function(x) !any(x)|cumsum(x)>0)),]
#   time marker behaviour
#1     1   <NA>      Rest
#2     2   <NA>      Rest
#3     3   <NA>      Rest
#4     4   stop      Rest
#8     8  start     Awake
#9     9   <NA>     Awake
#10   10   <NA>      Rest
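
To see what the grouped version does with more than one pair, here is a small sketch that runs the same logic on a made-up data frame. `df_multi`, `is_stop`, `is_start`, `keep` and `res_multi` are hypothetical names, and the data are invented for illustration only.

```r
df_multi <- data.frame(
  time = 1:12,
  marker = c(NA, "stop", NA, NA, "start", NA, NA, "stop", NA, "start", NA, NA),
  stringsAsFactors = FALSE
)

is_stop  <- !is.na(df_multi$marker) & df_multi$marker == "stop"
is_start <- !is.na(df_multi$marker) & df_multi$marker == "start"

# Shift the running count of "stop" by one row, so each "stop" row closes
# its own group and the next group begins on the row after it.
grp <- c(0, head(cumsum(is_stop), -1))

# Within a group: keep every row if the group has no "start",
# otherwise keep only the rows from "start" onwards.
keep <- ave(is_start, grp, FUN = function(x) !any(x) | cumsum(x) > 0)

res_multi <- df_multi[keep, ]
res_multi$time
```

Rows 3, 4 and 9 fall between a "stop" and the following "start", so they are the ones dropped.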

Answer 1 (score: 1)

df <- data.frame(time = c(1,2,3,4,5,6,7,8,9,10), marker = c("NA","NA","NA","stop","NA","NA","NA","start","NA","NA"), behaviour = c("Rest","Rest","Rest","Rest","Awake","Awake","Awake","Awake","Awake","Rest"))

# first row after "stop"
df1 <- as.integer(row.names(df[df$marker=="stop",]))+1
# last row before "start"
df2 <- as.integer(row.names(df[df$marker=="start",]))-1
# drop everything from df1 to df2
ans <- df[-(df1:df2),]
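
Note that this answer rebuilds df with the string "NA" in place of a real missing value. With a real `NA`, `marker == "stop"` returns `NA` for those rows, and using that as a logical row index produces all-NA rows. A short sketch of the difference, using a made-up vector `marker_na` (hypothetical, not from the original post):

```r
marker_na <- c(NA, "stop", NA, "start")

# == propagates NA, so the comparison is NA wherever the marker is missing.
cmp <- marker_na == "stop"

# which() drops NA and returns only the positions that are TRUE.
pos_which <- which(marker_na == "stop")

# %in% never returns NA, so it is safe to use directly as a logical index.
cmp_in <- marker_na %in% "stop"
```

Either `which()` or `%in%` lets the same row arithmetic work on the question's original data frame, where the markers are real `NA` values.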

Answer 2 (score: 1)

A cumsum solution (I'm also using data.table, but you don't have to) that generalizes to multiple stop/start values:

library(data.table)
dt <- as.data.table(df)

dt[, drop := list(cumsum(marker=="stop" & !is.na(marker)) - 
                    cumsum(marker=="start" & !is.na(marker)))][drop==0 | marker == "stop"]

   #    time marker behaviour drop
   # 1:    1     NA      Rest    0
   # 2:    2     NA      Rest    0
   # 3:    3     NA      Rest    0
   # 4:    4   stop      Rest    1
   # 5:    8  start     Awake    0
   # 6:    9     NA     Awake    0
   # 7:   10     NA      Rest    0
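
Since data.table is optional here, the same idea can be sketched in base R. This is an illustrative translation rather than the answerer's code. The names `is_stop`, `is_start`, `open` and `res_cumsum` are made up, and it assumes every "stop" is eventually followed by a "start".

```r
df <- data.frame(
  time = 1:10,
  marker = c(NA, NA, NA, "stop", NA, NA, NA, "start", NA, NA),
  behaviour = c(rep("Rest", 4), rep("Awake", 5), "Rest")
)

# %in% returns FALSE for NA, so no separate !is.na() check is needed.
is_stop  <- df$marker %in% "stop"
is_start <- df$marker %in% "start"

# 1 from a "stop" row up to (but not including) the next "start", else 0.
open <- cumsum(is_stop) - cumsum(is_start)

# Keep rows outside an open stop/start window, plus the "stop" rows themselves.
res_cumsum <- df[open == 0 | is_stop, ]
res_cumsum
```

The `| is_stop` part is needed because the counter is already 1 on the "stop" row itself, which the question wants to keep.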