Consider the data frame df:
df <- data.frame(time = c(1,2,3,4,5,6,7,8,9,10), marker = c(NA,NA,NA,"stop",NA,NA,NA,"start",NA,NA), behaviour = c("Rest","Rest","Rest","Rest","Awake","Awake","Awake","Awake","Awake","Rest"))
time marker behaviour
1 1 <NA> Rest
2 2 <NA> Rest
3 3 <NA> Rest
4 4 stop Rest
5 5 <NA> Awake
6 6 <NA> Awake
7 7 <NA> Awake
8 8 start Awake
9 9 <NA> Awake
10 10 <NA> Rest
I want to subset the data based on the marker column, excluding the rows that lie between the "stop" and "start" elements, so that df looks like this:
time marker behaviour
1 <NA> Rest
2 <NA> Rest
3 <NA> Rest
4 stop Rest
8 start Awake
9 <NA> Awake
10 <NA> Rest
Answer 0 (score: 1)
We can use numeric indexing to subset the rows:
i1 <- with(df, which(marker %in% c("stop", "start")))
df[-((i1[1]+1):(i1[2]-1)),]
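The two lines above assume exactly one "stop" followed by one "start", with at least one row between them. If the markers were adjacent, (i1[1]+1):(i1[2]-1) would count downward and drop the marker rows themselves. A minimal sketch that guards against that case, assuming the same single stop/start pair (the names n_between, between and res are only illustrative):

```r
df <- data.frame(
  time = 1:10,
  marker = c(NA, NA, NA, "stop", NA, NA, NA, "start", NA, NA),
  behaviour = c(rep("Rest", 4), rep("Awake", 5), "Rest")
)

# Positions of the two markers (assumes one "stop" followed by one "start")
i1 <- which(df$marker %in% c("stop", "start"))

# Rows strictly between the markers; seq_len(0) is empty when they are adjacent
n_between <- i1[2] - i1[1] - 1
between <- i1[1] + seq_len(n_between)

# Logical indexing also avoids df[-integer(0), ], which would return zero rows
res <- df[!(seq_len(nrow(df)) %in% between), ]
res
```

For the example data this keeps rows 1 to 4 and 8 to 10, matching the desired output.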
If there are multiple "start"/"stop" pairs, we can do
grp <- with(df, c(0, head(cumsum(marker == "stop" & !is.na(marker)),-1)))
df[with(df, ave(marker == "start" & !is.na(marker),
grp, FUN = function(x) !any(x)|cumsum(x)>0)),]
# time marker behaviour
#1 1 <NA> Rest
#2 2 <NA> Rest
#3 3 <NA> Rest
#4 4 stop Rest
#8 8 start Awake
#9 9 <NA> Awake
#10 10 <NA> Rest
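To see how the grouping works, it may help to run the same logic step by step on data with two stop/start pairs. The following is only a sketch: the data frame df2 and the helper vectors is_stop, is_start and keep are made up for illustration.

```r
# Hypothetical example data with two stop/start pairs
df2 <- data.frame(
  time = 1:12,
  marker = c(NA, "stop", NA, "start", NA, NA, "stop", NA, NA, "start", NA, NA)
)

is_stop  <- !is.na(df2$marker) & df2$marker == "stop"
is_start <- !is.na(df2$marker) & df2$marker == "start"

# A row's group is the number of "stop" markers seen strictly before it,
# so each "stop" row closes its group and the next row opens a new one
grp <- c(0, head(cumsum(is_stop), -1))

# Within a group: keep everything if it contains no "start",
# otherwise keep only the rows from the first "start" onward
keep <- ave(is_start, grp, FUN = function(x) !any(x) | cumsum(x) > 0)

kept_times <- df2$time[keep]
kept_times
```

Here rows 3, 8 and 9 fall between a "stop" and the following "start", so they are the only ones dropped.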
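Answer 1 (score: 1)
# Note: the missing markers here are the string "NA", not real NA values,
# so the == comparisons below never return NA.
df <- data.frame(time = c(1,2,3,4,5,6,7,8,9,10), marker = c("NA","NA","NA","stop","NA","NA","NA","start","NA","NA"), behaviour = c("Rest","Rest","Rest","Rest","Awake","Awake","Awake","Awake","Awake","Rest"))
# First row to drop: the row after "stop"
df1 <- as.integer(row.names(df[df$marker=="stop",]))+1
# Last row to drop: the row before "start"
df2 <- as.integer(row.names(df[df$marker=="start",]))-1
ans <- df[-(df1:df2),]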
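This answer builds df with the string "NA" rather than real missing values. With real NA values, marker == "stop" evaluates to NA for the missing entries, and indexing a data frame with that vector returns extra all-NA rows. A small sketch of the difference, using a made-up marker vector; wrapping the comparison in which() drops the NA results:

```r
# Hypothetical marker vector with real missing values
marker <- c(NA, "stop", NA, "start", NA)

# Comparing against NA yields NA, not FALSE
cmp <- marker == "stop"
cmp

# which() returns only the positions that are TRUE, ignoring NA
stop_idx  <- which(marker == "stop")
start_idx <- which(marker == "start")

# Rows to drop lie strictly between the two markers
drop_rows <- (stop_idx + 1):(start_idx - 1)
drop_rows
```

The same which() calls could stand in for the row.names() lookups above when the column contains real NA values.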
Answer 2 (score: 1)
A cumsum solution (I also use data.table, but you don't have to) that generalizes to multiple stop/start values:
library(data.table)
dt <- as.data.table(df)
dt[, drop := list(cumsum(marker=="stop" & !is.na(marker)) -
cumsum(marker=="start" & !is.na(marker)))][drop==0 | marker == "stop"]
# time marker behaviour drop
# 1: 1 NA Rest 0
# 2: 2 NA Rest 0
# 3: 3 NA Rest 0
# 4: 4 stop Rest 1
# 5: 8 start Awake 0
# 6: 9 NA Awake 0
# 7: 10 NA Rest 0
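Since data.table is optional here, the same running-difference idea can be sketched in base R. This is an illustrative equivalent, not part of the original answer; the names is_stop, is_start, inside and res are assumptions.

```r
df <- data.frame(
  time = 1:10,
  marker = c(NA, NA, NA, "stop", NA, NA, NA, "start", NA, NA),
  behaviour = c(rep("Rest", 4), rep("Awake", 5), "Rest")
)

is_stop  <- !is.na(df$marker) & df$marker == "stop"
is_start <- !is.na(df$marker) & df$marker == "start"

# Running count of stops minus starts: 1 from a "stop" row up to the row
# before the matching "start", 0 everywhere else
inside <- cumsum(is_stop) - cumsum(is_start)

# Keep rows outside a gap, plus the "stop" rows themselves
res <- df[inside == 0 | is_stop, ]
res
```

Because the counters are cumulative, the same code handles several alternating stop/start pairs without modification, as long as every "stop" is followed by a "start" before the next "stop".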