I have the start, the end, and the duration of some processes.
process_start        process_end          hourly_process_duration
2019-01-01 00:00:00  2019-01-01 12:00:00  12
2019-01-01 12:00:00  2019-01-01 13:00:00  1
NA                   NA                   11
NA                   NA                   15
2019-01-02 15:00:00  2019-01-02 18:00:00  3
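hourly_process_duration is always available. The processes are continuous: when one process ends, the next one begins.
I need to replace the NAs correctly, as in this example: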
process_start        process_end          hourly_process_duration
2019-01-01 00:00:00  2019-01-01 12:00:00  12
2019-01-01 12:00:00  2019-01-01 13:00:00  1
2019-01-01 13:00:00  2019-01-02 00:00:00  11
2019-01-02 00:00:00  2019-01-02 15:00:00  15
2019-01-02 15:00:00  2019-01-02 18:00:00  3
Answer 0 (score: 1)
Here is one option for filling in the missing datetimes:
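library(dplyr)
library(lubridate)
df1 %>%
  # a missing start is the previous row's end; a missing end is the next row's start
  mutate(process_start = coalesce(process_start, lag(process_end)),
         process_end = coalesce(process_end, lead(process_start))) %>%
  # parse the character columns as date-times
  mutate_at(vars(process_start, process_end), ymd_hms) %>%
  # fill each remaining NA with midnight of the day of the following value;
  # this matches the sample data but does not use hourly_process_duration
  mutate_at(vars(process_start, process_end),
            list(~ replace(., is.na(.), floor_date(.[which(is.na(.)) + 1], "day"))))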
# process_start process_end hourly_process_duration
#1 2019-01-01 00:00:00 2019-01-01 12:00:00 12
#2 2019-01-01 12:00:00 2019-01-01 13:00:00 1
#3 2019-01-01 13:00:00 2019-01-02 00:00:00 11
#4 2019-01-02 00:00:00 2019-01-02 15:00:00 15
#5 2019-01-02 15:00:00 2019-01-02 18:00:00 3
# data (define df1 before running the pipeline above)
df1 <- structure(list(
  process_start = c("2019-01-01 00:00:00", "2019-01-01 12:00:00",
                    NA, NA, "2019-01-02 15:00:00"),
  process_end = c("2019-01-01 12:00:00", "2019-01-01 13:00:00",
                  NA, NA, "2019-01-02 18:00:00"),
  hourly_process_duration = c(12L, 1L, 11L, 15L, 3L)),
  class = "data.frame", row.names = c(NA, -5L))
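
Since the question states that hourly_process_duration is always known and that each process starts when the previous one ends, the gaps could also be filled from the durations directly rather than by flooring to midnight. The following is only a sketch in base R under those two assumptions; fill_process_times is a hypothetical helper name, and UTC is an assumed time zone (the question does not specify one).

```r
# Sample data from the question (character timestamps with NA gaps)
df1 <- data.frame(
  process_start = c("2019-01-01 00:00:00", "2019-01-01 12:00:00",
                    NA, NA, "2019-01-02 15:00:00"),
  process_end = c("2019-01-01 12:00:00", "2019-01-01 13:00:00",
                  NA, NA, "2019-01-02 18:00:00"),
  hourly_process_duration = c(12L, 1L, 11L, 15L, 3L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: walks the rows in order and fills each gap.
# Assumes rows are sorted, processes are back to back, and durations are in hours.
fill_process_times <- function(df) {
  fmt <- "%Y-%m-%d %H:%M:%S"
  start <- as.POSIXct(df$process_start, format = fmt, tz = "UTC")  # tz is an assumption
  end <- as.POSIXct(df$process_end, format = fmt, tz = "UTC")
  for (i in seq_len(nrow(df))) {
    # a missing start is the end of the previous process
    if (is.na(start[i]) && i > 1) start[i] <- end[i - 1]
    # a missing end is the start plus the known duration (hours to seconds)
    if (is.na(end[i])) end[i] <- start[i] + df$hourly_process_duration[i] * 3600
  }
  df$process_start <- start
  df$process_end <- end
  df
}

filled <- fill_process_times(df1)
print(filled)
```

For the sample data this should reproduce the desired table from the question. A loop is used because each filled row depends on the row filled just before it; if the first row itself had a missing start, it would stay NA, since there is no earlier end to copy from.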