In R

Time: 2017-01-04 00:44:38

Tags: r

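I have two columns, Time and Event. There are two events, A and B. Once event A occurs, I want to find when the next event B occurs. The column Time_EventB is the desired output.

Here is the data frame: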

df <- data.frame(Event = sample(c("A", "B", ""), 20, replace = TRUE), Time = paste("t", seq(1,20)))


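  1. What is the R code for finding the next instance of a value (B in this case)?
  2. Once that instance of B is found, what code returns the corresponding value from the Time column?
  3. The code should look something like this: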

    data$Time_EventB <- ifelse(data$Event == "A", <Code for returning time of next instance of B>, "")
    

    In Excel, this can be done using VLOOKUP.

2 Answers:

Answer 0 (score: 1)

Here is a simple solution:

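set.seed(1)
df <- data.frame(Event = sample(c("A", "B", ""), size = 20, replace = TRUE), time = 1:20)

# Row positions of each event type
as <- which(df$Event == "A")
bs <- which(df$Event == "B")
# For each A row, find the row of the nearest B that comes after it
next_b <- sapply(as, function(a) {
    diff <- bs - a
    if (all(diff < 0)) return(NA)    # no B after this A
    bs[min(diff[diff > 0]) == diff]  # the B with the smallest positive distance
})
# Write the time of that B onto the A rows; all other rows stay NA
df$next_b <- NA
df$next_b[as] <- df$time[next_b]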

> df
   Event time next_b
1      A    1      2
2      B    2     NA
3      B    3     NA
4           4     NA
5      A    5      8
6           6     NA
7           7     NA
8      B    8     NA
9      B    9     NA
10     A   10     14
11     A   11     14
12     A   12     14
13         13     NA
14     B   14     NA
15         15     NA
16     B   16     NA
17         17     NA
18         18     NA
19     B   19     NA
20         20     NA
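
The same lookup can probably also be written without the per-row `sapply()` call. The following is only a sketch on a small hand-made data frame: `ev`, `a_rows`, `b_rows` and `idx` are names made up for this illustration, not part of the answer above. It relies on base R's `findInterval()`, which for each A row counts how many B rows lie at or before it.

```r
# Hypothetical toy data (not the sampled data above), so the result is easy to check by eye
ev <- data.frame(Event = c("A", "B", "", "A", "", "B", "A"), time = 1:7)

a_rows <- which(ev$Event == "A")
b_rows <- which(ev$Event == "B")

# findInterval() gives, for each A row, the number of B rows at or before it;
# adding 1 gives the position within b_rows of the first B after that A.
idx <- findInterval(a_rows, b_rows) + 1

# A position past the end of b_rows yields NA, meaning "no later B"
ev$next_b <- NA_integer_
ev$next_b[a_rows] <- ev$time[b_rows[idx]]

print(ev)
```

Because `which()` returns sorted row positions, `b_rows` already satisfies the sorted-input requirement of `findInterval()`. The sketch assumes the rows are in time order, as they are in the question.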

Answer 1 (score: 0)

Here is an attempt using a "rolling join" from the data.table package:

library(data.table)
setDT(df)

df[Event=="B", .(time, nextb=time)][df, on="time", roll=-Inf][Event != "A", nextb := NA][]

#    time nextb Event
# 1:    1     2     A
# 2:    2    NA     B
# 3:    3    NA     B
# 4:    4    NA      
# 5:    5     8     A
# 6:    6    NA      
# 7:    7    NA      
# 8:    8    NA     B
# 9:    9    NA     B
#10:   10    14     A
#11:   11    14     A
#12:   12    14     A
#13:   13    NA      
#14:   14    NA     B
#15:   15    NA      
#16:   16    NA     B
#17:   17    NA      
#18:   18    NA      
#19:   19    NA     B
#20:   20    NA   

Using the data borrowed from @thc.
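
Conceptually, a backward roll (`roll=-Inf`) carries the next observation backward: each row picks up the nearest B time at or after it, and the final step blanks out every row that is not an A. For readers without data.table, that idea can be sketched in base R roughly as follows. This is an illustration under assumptions, not the package's implementation, and `ev`, `b_idx` and `next_idx` are hypothetical names on a hand-made data frame.

```r
# Hypothetical toy data (not the sampled data used in the answers)
ev <- data.frame(Event = c("A", "B", "", "A", "", "B", "A"), time = 1:7)

n <- nrow(ev)

# B rows keep their own row number; every other row gets a sentinel one past the end
b_idx <- ifelse(ev$Event == "B", seq_len(n), n + 1L)

# Reverse cumulative minimum: for each row, the nearest B row at or after it
next_idx <- rev(cummin(rev(b_idx)))

# Indexing one past the end yields NA ("no later B"); keep the value on A rows only
ev$nextb <- ifelse(ev$Event == "A", ev$time[next_idx], NA_integer_)

print(ev)
```

As with the join, B rows would match themselves before the non-A rows are reset to NA, which is why the last step filters on `Event == "A"`.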