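I want to add a new variable based on the time range between two variables. Any interval that falls within 8:01-20:00 should be "day", any interval within 20:01-8:00 should be "night", and anything that spans both should be "mixed".
I tried adding the variable manually, but I would like to know whether this can be done more easily.
#Current database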
id<-c("m1","m1","m1","m2","m2","m2","m3","m4","m4")
x<-c("2020-01-03 10:00:00","2020-01-03 16:00:00","2020-01-03 19:20:00","2020-01-05 10:00:00","2020-01-05 15:20:00","2020-01-05 20:50:00","2020-01-06 06:30:00","2020-01-08 06:30:00","2020-01-08 07:50:00")
start<-strptime(x,"%Y-%m-%d %H:%M:%S")
y<-c("2020-01-03 16:00:00","2020-01-03 19:20:00","2020-01-03 20:50:00","2020-01-05 15:20:00","2020-01-05 20:50:00","2020-01-05 22:00:00","2020-01-06 07:40:00","2020-01-08 07:50:00","2020-01-08 08:55:00")
end<-strptime(y,"%Y-%m-%d %H:%M:%S")
mydata<-data.frame(id,start,end)
#output
day.night<-c("day","day","mixed","day","mixed","night","night","night","mixed")
newdata<-cbind(mydata,day.night)
Edit: Sorry, I forgot to add the dates.
Answer 0 (score: 1)
One way using dplyr is to convert start.time and end.time to POSIXct objects, compare the values against the interval boundaries, and apply the labels with case_when.
library(dplyr)
# `data` holds the time-only columns start.time and end.time ("HH:MM")
data %>%
  mutate(start.time1 = as.POSIXct(start.time, format = "%H:%M"),
         end.time1 = as.POSIXct(end.time, format = "%H:%M"),
         day.night = case_when(
           # entirely inside 08:01-20:00 (boundaries included)
           start.time1 >= as.POSIXct('08:01:00', format = "%T") &
             end.time1 <= as.POSIXct('20:00:00', format = "%T") ~ "day",
           # starts at 20:01 or later, or starts and ends by 08:00
           start.time1 >= as.POSIXct('20:01:00', format = "%T") |
             (start.time1 <= as.POSIXct('08:00:00', format = "%T") &
                end.time1 <= as.POSIXct('08:00:00', format = "%T")) ~ "night",
           TRUE ~ "mixed")) %>%
  select(names(data), day.night)
#  id start.time end.time day.night
#1 m1      10:00    16:00       day
#2 m1      16:00    19:20       day
#3 m1      19:20    20:50     mixed
#4 m2      10:00    15:20       day
#5 m2      15:20    20:50     mixed
#6 m2      20:50    22:00     night
#7 m3      06:30    07:40     night
#8 m4      06:30    07:50     night
#9 m4      07:50    08:55     mixed
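The snippet above relies on a data frame `data` with `start.time` and `end.time` columns, which is no longer shown in the edited question. As a rough sketch, the same labels can be reproduced in base R by converting each "HH:MM" string to minutes since midnight. The data frame below is reconstructed from the printed output, `to_minutes` and `label_interval` are hypothetical helper names, and the code assumes every interval starts and ends on the same calendar day.

```r
# Hypothetical reconstruction of the time-only input, taken from the
# output printed above (the original `data` object is not shown).
data <- data.frame(
  id         = c("m1", "m1", "m1", "m2", "m2", "m2", "m3", "m4", "m4"),
  start.time = c("10:00", "16:00", "19:20", "10:00", "15:20",
                 "20:50", "06:30", "06:30", "07:50"),
  end.time   = c("16:00", "19:20", "20:50", "15:20", "20:50",
                 "22:00", "07:40", "07:50", "08:55"),
  stringsAsFactors = FALSE
)

# Convert "HH:MM" strings to minutes since midnight (hypothetical helper).
to_minutes <- function(hhmm) {
  parts <- strsplit(hhmm, ":", fixed = TRUE)
  vapply(parts,
         function(p) as.integer(p[1]) * 60L + as.integer(p[2]),
         integer(1))
}

# Label an interval, assuming start and end are on the same day.
# Day is 08:01-20:00 inclusive; every other minute counts as night.
label_interval <- function(start_min, end_min) {
  day_lo <- 8L * 60L + 1L  # 08:01
  day_hi <- 20L * 60L      # 20:00
  is_day   <- start_min >= day_lo & end_min <= day_hi
  is_night <- (start_min < day_lo & end_min < day_lo) |
              (start_min > day_hi & end_min > day_hi)
  ifelse(is_day, "day", ifelse(is_night, "night", "mixed"))
}

data$day.night <- label_interval(to_minutes(data$start.time),
                                 to_minutes(data$end.time))
```

Working in plain minutes avoids attaching today's date to every value, but an interval that crosses midnight (for example 23:00 to 02:00) would need extra handling that this sketch does not attempt.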
Edit
If we also have dates, one way is to replace the date part of start and end with today's date, so that only the times are compared.
df1 <- mydata %>%
  # swap each date for today's date so that only the clock time differs
  mutate(start1 = as.POSIXct(sub("\\d+-\\d+-\\d+", Sys.Date(), start)),
         end1 = as.POSIXct(sub("\\d+-\\d+-\\d+", Sys.Date(), end)),
         day.night = case_when(
           # entirely inside 08:01-20:00 (boundaries included)
           start1 >= as.POSIXct('08:01:00', format = "%T") &
             end1 <= as.POSIXct('20:00:00', format = "%T") ~ "day",
           # starts at 20:01 or later, or starts and ends by 08:00
           start1 >= as.POSIXct('20:01:00', format = "%T") |
             (start1 <= as.POSIXct('08:00:00', format = "%T") &
                end1 <= as.POSIXct('08:00:00', format = "%T")) ~ "night",
           TRUE ~ "mixed"))
df1 %>% select(names(mydata), day.night)
#  id               start                 end day.night
#1 m1 2020-01-03 10:00:00 2020-01-03 16:00:00       day
#2 m1 2020-01-03 16:00:00 2020-01-03 19:20:00       day
#3 m1 2020-01-03 19:20:00 2020-01-03 20:50:00     mixed
#4 m2 2020-01-05 10:00:00 2020-01-05 15:20:00       day
#5 m2 2020-01-05 15:20:00 2020-01-05 20:50:00     mixed
#6 m2 2020-01-05 20:50:00 2020-01-05 22:00:00     night
#7 m3 2020-01-06 06:30:00 2020-01-06 07:40:00     night
#8 m4 2020-01-08 06:30:00 2020-01-08 07:50:00     night
#9 m4 2020-01-08 07:50:00 2020-01-08 08:55:00     mixed
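If substituting today's date feels fragile, a possible alternative is to read the hour and minute straight from the date-times and bucket them with `findInterval`. This is only a sketch under a few assumptions: `minutes_of_day` is a hypothetical helper, the timestamps are parsed with `tz = "UTC"` (the question uses the local time zone), seconds are ignored, and each interval is assumed to start and end on the same date.

```r
# Sample data from the question, parsed as POSIXct.
# tz = "UTC" is an assumption made here to keep the example reproducible.
id <- c("m1", "m1", "m1", "m2", "m2", "m2", "m3", "m4", "m4")
x <- c("2020-01-03 10:00:00", "2020-01-03 16:00:00", "2020-01-03 19:20:00",
       "2020-01-05 10:00:00", "2020-01-05 15:20:00", "2020-01-05 20:50:00",
       "2020-01-06 06:30:00", "2020-01-08 06:30:00", "2020-01-08 07:50:00")
y <- c("2020-01-03 16:00:00", "2020-01-03 19:20:00", "2020-01-03 20:50:00",
       "2020-01-05 15:20:00", "2020-01-05 20:50:00", "2020-01-05 22:00:00",
       "2020-01-06 07:40:00", "2020-01-08 07:50:00", "2020-01-08 08:55:00")
mydata <- data.frame(id,
                     start = as.POSIXct(x, tz = "UTC"),
                     end   = as.POSIXct(y, tz = "UTC"))

# Minutes since midnight for a date-time vector (hypothetical helper;
# seconds are dropped).
minutes_of_day <- function(t) {
  lt <- as.POSIXlt(t)
  lt$hour * 60L + lt$min
}

# Breakpoints at 08:01 (481) and 20:01 (1201) split the day into
# bucket 0 = early night, 1 = day, 2 = late night.
breaks <- c(8L * 60L + 1L, 20L * 60L + 1L)
start_bucket <- findInterval(minutes_of_day(mydata$start), breaks)
end_bucket   <- findInterval(minutes_of_day(mydata$end), breaks)

# Same bucket at both ends: use that bucket's label; otherwise "mixed".
bucket_label <- c("night", "day", "night")
mydata$day.night <- ifelse(start_bucket == end_bucket,
                           bucket_label[start_bucket + 1L],
                           "mixed")
```

On the sample data this gives the same `day.night` vector the question lists as the desired output. Intervals that run past midnight would need the date checked as well, which is left out here.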