Calculate time to next different (successful) event in grouped longitudinal data

Time: 2016-06-03 00:22:33

Tags: r dplyr

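I have a dataset of people's attendance at appointments. When someone misses an appointment, I want to calculate the number of days until they next attend, or return NA if they never attend again.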

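While writing up this question I came up with a solution: calculate the days between consecutive events, then take the reverse cumulative sum of those gaps (see here), grouped by patient and by change in attendance status (see here). I am posting it in case it helps someone else, or in case someone spots an error or can suggest a better approach.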
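The reverse cumulative sum mentioned above can be sketched on its own in base R. This is only an illustration: the helper name `rev_cumsum` is hypothetical, and the gap values 39 and 2 are the first two gaps for patient A in the data below.

```r
# Gaps (in days) from each appointment to the following one, for a single
# run of consecutive missed appointments that ends at an attended one.
gaps <- c(39, 2)

# Hypothetical helper: reverse, accumulate, reverse back, so that each
# element becomes the sum of itself and everything after it.
rev_cumsum <- function(x) rev(cumsum(rev(x)))

days_to_next_success <- rev_cumsum(gaps)
print(days_to_next_success)  # 41 and 2

# A trailing NA (no later appointment) propagates to every element of the
# run, which is what yields NA for people who never attend again.
print(rev_cumsum(c(10, 5, NA)))  # all NA
```

Because `cumsum` propagates NA forward, reversing first means a missing final gap marks the whole run as NA rather than only its last row.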

library(dplyr)

df <- data.frame(
  id    = rep(c("A", "B"), each = 5),
  event = c(FALSE, FALSE, TRUE, TRUE, FALSE,
            FALSE, TRUE, FALSE, TRUE, TRUE),
  date  = as.Date(c("2016-01-02", "2016-02-10", "2016-02-12", "2016-07-05", "2016-12-28",
                    "2016-01-16", "2016-02-11", "2016-02-15", "2016-04-20", "2016-10-23")))

df %>%
  # Sort data (if not already)
  arrange(id, date) %>%
  group_by(id) %>%
  mutate(
    # Calculate days before next appointment
    days_next_event = lead(date) - date,
    # Number each run of consecutive identical attendance statuses
    # (a logical default keeps the comparison type-consistent)
    event_chng_n = cumsum(event != lag(event, default = TRUE))) %>%
  group_by(id, event_chng_n) %>%
  mutate(
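    # Calculate days before next change in event ('cumsum' not defined for "difftime" objects)
    # Note: in each id's final run there is no later change; the NA gap is
    # treated as 0, so the value is the days to the last recorded appointment
    days_next_chng = rev(cumsum(rev(as.numeric(
      ifelse(is.na(days_next_event), 0, days_next_event)
      )))),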
    # Calculate days before next success
    days_next_success = ifelse(event, 0, rev(cumsum(rev(
      as.numeric(days_next_event)
    )))))


Source: local data frame [10 x 7]
Groups: id, event_chng_n [7]

       id event       date days_next_event event_chng_n days_next_chng days_next_success
   (fctr) (lgl)     (date)          (dfft)        (int)          (dbl)             (dbl)
1       A FALSE 2016-01-02         39 days            1             41                41
2       A FALSE 2016-02-10          2 days            1              2                 2
3       A  TRUE 2016-02-12        144 days            2            320                 0
4       A  TRUE 2016-07-05        176 days            2            176                 0
5       A FALSE 2016-12-28         NA days            3              0                NA
6       B FALSE 2016-01-16         26 days            1             26                26
7       B  TRUE 2016-02-11          4 days            2              4                 0
8       B FALSE 2016-02-15         65 days            3             65                65
9       B  TRUE 2016-04-20        186 days            4            186                 0
10      B  TRUE 2016-10-23         NA days            4              0                 0
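As a cross-check on the `days_next_success` column, the same quantity can be computed by brute force in base R: for each appointment, look up the first attended appointment on or after it for the same patient. This is a minimal sketch rather than a replacement for the dplyr pipeline; the helper `days_to_next_success` and the column name `days_next_success_check` are hypothetical, and the data frame is rebuilt so the block runs on its own.

```r
df <- data.frame(
  id    = rep(c("A", "B"), each = 5),
  event = c(FALSE, FALSE, TRUE, TRUE, FALSE,
            FALSE, TRUE, FALSE, TRUE, TRUE),
  date  = as.Date(c("2016-01-02", "2016-02-10", "2016-02-12", "2016-07-05", "2016-12-28",
                    "2016-01-16", "2016-02-11", "2016-02-15", "2016-04-20", "2016-10-23")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for one patient's rows (already sorted by date),
# days from each appointment to the first attended one on or after it,
# or NA if no such appointment exists.
days_to_next_success <- function(date, event) {
  idx <- seq_along(date)
  vapply(idx, function(i) {
    later <- which(event & idx >= i)
    if (length(later) == 0) NA_real_ else as.numeric(date[later[1]] - date[i])
  }, numeric(1))
}

# Sort, compute per patient, and put the results back in row order
df <- df[order(df$id, df$date), ]
df$days_next_success_check <- unsplit(
  lapply(split(df, df$id), function(d) days_to_next_success(d$date, d$event)),
  df$id
)

print(df)
```

On the example data this should agree with the `days_next_success` column printed above. The lookup is quadratic in the number of appointments per patient, so it is meant for checking results on small data, not for large datasets.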

0 answers:

No answers