Determine whether a recent event occurred within a group

Time: 2016-02-13 03:04:23

Tags: r

I am trying to compute a variable that depends on the values of several other columns, but in other rows. Here is some sample data:

set.seed(2)
df1 <- data.frame(Participant = c(rep(1, 5), rep(2, 7), rep(3, 10)),
                  Action = sample(c(rep("Play", 9), rep("Other", 13))),
                  time = c(sort(runif(5, 1, 100)), sort(runif(7, 1, 100)), sort(runif(10, 1, 100))))
df1$Action[2] ="Play" # edited to provide important test case

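What I want is a column that tests whether the last "Play" event happened at most 10 seconds ago (according to the time column). If there was no "Play" event in the past 10 seconds, StillPlaying should be "n" regardless of the current action. Here is a sample of the result I am after: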

   Part Action  time        StillPlaying
1   1   Play    15.77544    n
2   1   Play    15.89964    y
3   1   Other   35.37995    n
4   1   Play    49.38855    n
5   1   Other   83.85203    n
6   2   Other   2.031038    n
7   2   Play    14.10483    n
8   2   Other   17.29958    y
9   2   Play    36.3492     n
10  2   Play    81.20902    n
11  2   Other   87.01724    y
12  2   Other   96.30176    n

1 answer:

Answer 0 (score: 2)

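It looks like you want to group by participant and flag every row whose most recent earlier "Play" action happened within the last 10 seconds, whatever the row's own action is. You can do this with group_by from dplyr, using cummax to find the time at which the last "Play" action occurred: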

library(dplyr)
df1 %>%
  group_by(Participant) %>%
  mutate(StillPlaying=ifelse(time - c(-100, head(cummax(ifelse(Action == "Play", time, -100)), -1)) <= 10, "y", "n"))
#    Participant Action      time StillPlaying
#          (dbl) (fctr)     (dbl)        (chr)
# 1            1   Play 15.775439            n
# 2            1   Play 15.899643            y
# 3            1  Other 35.379953            n
# 4            1   Play 49.388550            n
# 5            1  Other 83.852029            n
# 6            2  Other  2.031038            n
# 7            2   Play 14.104828            n
# 8            2  Other 17.299582            y
# 9            2   Play 36.349196            n
# 10           2   Play 81.209022            n
# ..         ...    ...       ...          ...
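
To make the one-liner easier to follow, here is a rough step-by-step breakdown of the same expression for a single participant. The toy vectors and the intermediate variable names are made up for illustration and are not part of df1. The sketch assumes, as in the sample data, that times are sorted in ascending order within a participant and that no time is smaller than -90, so the -100 sentinel can never fall inside the 10-second window.

```r
# Toy data for one participant (hypothetical values, not taken from df1)
action <- c("Play", "Play", "Other", "Play", "Other")
time   <- c(15.8, 15.9, 35.4, 49.4, 53.9)

# Step 1: keep the time of every "Play" row; all other rows get the
# sentinel -100, which is too far in the past to ever count as recent.
play_time <- ifelse(action == "Play", time, -100)

# Step 2: the running maximum is the time of the latest "Play"
# up to and including the current row (relies on sorted times).
last_play_incl <- cummax(play_time)

# Step 3: shift by one row so that a row never counts itself.
last_play_before <- c(-100, head(last_play_incl, -1))

# Step 4: flag rows whose previous "Play" is at most 10 seconds back.
still_playing <- ifelse(time - last_play_before <= 10, "y", "n")
still_playing
```

The shift in step 3 is what makes the first "Play" of a participant come out as "n": at that row there is no earlier "Play" to look back on.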

If you would rather stay in base R, you can apply the same basic command in a split-apply-combine pattern:

do.call(rbind, lapply(split(df1, df1$Participant), function(x) {
  x$StillPlaying <- ifelse(x$time - c(-100, head(cummax(ifelse(x$Action == "Play", x$time, -100)), -1)) <= 10, "y", "n")
  x
}))
#      Participant Action      time StillPlaying
# 1.1            1   Play 15.775439            n
# 1.2            1   Play 15.899643            y
# 1.3            1  Other 35.379953            n
# 1.4            1   Play 49.388550            n
# 1.5            1  Other 83.852029            n
# 2.6            2  Other  2.031038            n
# 2.7            2   Play 14.104828            n
# 2.8            2  Other 17.299582            y
# 2.9            2   Play 36.349196            n
# 2.10           2   Play 81.209022            n
# 2.11           2  Other 87.017243            y
# 2.12           2  Other 96.301761            n
# ...
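
As a possible variation on the base R approach, the logic can be wrapped in a small helper and written back with unsplit, which keeps the original row order and row names instead of the "1.1", "1.2" names produced by do.call(rbind, ...). This is only a sketch: the function name still_playing, its window argument, the use of -Inf instead of -100 as the sentinel, and the toy data frame are all assumptions made for this example, not part of the answer above.

```r
# Hypothetical helper: "y" if an earlier "Play" row lies within `window`
# time units of the current row, "n" otherwise. Assumes `time` is sorted
# in ascending order within the group.
still_playing <- function(action, time, window = 10) {
  # Time of the latest "Play" up to and including each row (-Inf if none yet)
  last_play <- cummax(ifelse(action == "Play", time, -Inf))
  # Shift by one row so that a row does not count itself
  last_play_before <- c(-Inf, head(last_play, -1))
  ifelse(time - last_play_before <= window, "y", "n")
}

# Toy data (made up for this sketch)
df <- data.frame(
  Participant = c(1, 1, 1, 2, 2),
  Action      = c("Play", "Other", "Other", "Other", "Play"),
  time        = c(5, 12, 30, 3, 8)
)

# Compute the flag per participant, then put the pieces back in row order
pieces <- lapply(split(df, df$Participant),
                 function(x) still_playing(x$Action, x$time))
df$StillPlaying <- unsplit(pieces, df$Participant)
df
```

Using -Inf as the sentinel avoids having to know the range of the time column in advance, since the gap to -Inf is always larger than any finite window.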