How to add a new night/day variable based on the time range between two variables

Time: 2019-07-16 14:03:26

Tags: r dataframe datetime

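I would like to add a new variable based on the time range between two variables. An interval within 8:01-20:00 should be labeled "day", an interval within 20:01-8:00 should be labeled "night", and anything that spans both should be labeled "mixed".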

I tried adding the variable manually, but I am trying to understand whether this can be done more easily.

#Current database
id<-c("m1","m1","m1","m2","m2","m2","m3","m4","m4")
x<-c("2020-01-03 10:00:00","2020-01-03 16:00:00","2020-01-03 19:20:00","2020-01-05 10:00:00","2020-01-05 15:20:00","2020-01-05 20:50:00","2020-01-06 06:30:00","2020-01-08 06:30:00","2020-01-08 07:50:00")
start<-strptime(x,"%Y-%m-%d %H:%M:%S")
y<-c("2020-01-03 16:00:00","2020-01-03 19:20:00","2020-01-03 20:50:00","2020-01-05 15:20:00","2020-01-05 20:50:00","2020-01-05 22:00:00","2020-01-06 07:40:00","2020-01-08 07:50:00","2020-01-08 08:55:00")
end<-strptime(y,"%Y-%m-%d %H:%M:%S")
mydata<-data.frame(id,start,end)

#Desired output
day.night<-c("day","day","mixed","day","mixed","night","night","night","mixed")
newdata<-cbind(mydata,day.night)

Edit: Sorry, I forgot to add the dates.

1 answer:

Answer 0: (score: 1)

One way using dplyr is to convert start.time and end.time to POSIXct objects, compare the values against the day and night boundaries, and apply the labels with case_when.

library(dplyr)

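# `data` is the question's original data frame, with time-only columns start.time and end.time
data %>%
   # Parsing with a time-only format attaches today's date, so all values are comparable
   mutate(start.time1 = as.POSIXct(start.time, format = "%H:%M"), 
          end.time1 =  as.POSIXct(end.time, format = "%H:%M"), 
          day.night =  case_when(
          # day: both ends fall within 08:01-20:00 (bounds inclusive)
          start.time1 >= as.POSIXct('08:01:00', format = "%T") &
          end.time1 <= as.POSIXct('20:00:00', format = "%T") ~ "day",
          # night: starts at 20:01 or later, or both ends fall at or before 08:00
          start.time1 >= as.POSIXct('20:01:00', format = "%T") |
          (start.time1 <= as.POSIXct('08:00:00', format = "%T") & 
           end.time1 <= as.POSIXct('08:00:00', format = "%T")) ~ "night",
          # everything else spans both periods
          TRUE ~ "mixed")) %>%
   select(names(data), day.night)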

#  id start.time end.time day.night
#1 m1      10:00    16:00       day
#2 m1      16:00    19:20       day
#3 m1      19:20    20:50     mixed
#4 m2      10:00    15:20       day
#5 m2      15:20    20:50     mixed
#6 m2      20:50    22:00     night
#7 m3      06:30    07:40     night
#8 m4      06:30    07:50     night
#9 m4      07:50    08:55     mixed
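
The `data` object used above is not defined on this page; it is presumably the question's original data frame with time-only columns. As a rough base R sketch under that assumption, the data can be rebuilt from the printed output and labeled by converting each "HH:MM" string to minutes since midnight. The helper `to_minutes` and the segment coding are illustrative names, not part of the original answer, and the sketch assumes every interval starts and ends on the same day.

```r
# Assumed reconstruction of the time-only data, copied from the printed output above
data <- data.frame(
  id         = c("m1", "m1", "m1", "m2", "m2", "m2", "m3", "m4", "m4"),
  start.time = c("10:00", "16:00", "19:20", "10:00", "15:20", "20:50", "06:30", "06:30", "07:50"),
  end.time   = c("16:00", "19:20", "20:50", "15:20", "20:50", "22:00", "07:40", "07:50", "08:55"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "HH:MM" -> minutes since midnight
to_minutes <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]), numeric(1))
}

# Segment of the day: 0 = up to 08:00, 1 = 08:01-20:00, 2 = 20:01 onward
breaks <- c(8 * 60 + 1, 20 * 60 + 1)
seg_start <- findInterval(to_minutes(data$start.time), breaks)
seg_end   <- findInterval(to_minutes(data$end.time), breaks)

# Same segment at both ends: "day" for segment 1, "night" for 0 or 2; otherwise "mixed"
data$day.night <- ifelse(seg_start != seg_end, "mixed",
                         ifelse(seg_start == 1, "day", "night"))

print(data)
```

Working in minutes avoids any dependence on the date on which the code is run, at the cost of not being able to recognize intervals that cross midnight.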

Edit

If we also have dates, one way is to replace the date part of start and end with today's date so that only the times are compared.

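df1 <- mydata %>%
         # Swap the date part for today's date so only the clock time matters
         mutate(start1 = as.POSIXct(sub("\\d+-\\d+-\\d+", Sys.Date(), start)), 
                end1 = as.POSIXct(sub("\\d+-\\d+-\\d+", Sys.Date(), end)),
                day.night =  case_when(
                # day: both ends fall within 08:01-20:00 (bounds inclusive)
                start1 >= as.POSIXct('08:01:00', format = "%T") &
                end1 <= as.POSIXct('20:00:00', format = "%T") ~ "day",
                # night: starts at 20:01 or later, or both ends fall at or before 08:00
                start1 >= as.POSIXct('20:01:00', format = "%T") |
                (start1 <= as.POSIXct('08:00:00', format = "%T") & 
                 end1 <= as.POSIXct('08:00:00', format = "%T")) ~ "night",
                # everything else spans both periods
                TRUE ~ "mixed")) 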

df1 %>% select(names(mydata), day.night)
#    id               start                 end day.night
#1 m1 2020-01-03 10:00:00 2020-01-03 16:00:00       day
#2 m1 2020-01-03 16:00:00 2020-01-03 19:20:00       day
#3 m1 2020-01-03 19:20:00 2020-01-03 20:50:00     mixed
#4 m2 2020-01-05 10:00:00 2020-01-05 15:20:00       day
#5 m2 2020-01-05 15:20:00 2020-01-05 20:50:00     mixed
#6 m2 2020-01-05 20:50:00 2020-01-05 22:00:00     night
#7 m3 2020-01-06 06:30:00 2020-01-06 07:40:00     night
#8 m4 2020-01-08 06:30:00 2020-01-08 07:50:00     night
#9 m4 2020-01-08 07:50:00 2020-01-08 08:55:00     mixed
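
Replacing the date with today's date discards the information about whether an interval runs past midnight. A possible alternative, sketched here in base R, keeps the real dates and compares clock times as minutes since midnight. The helpers `minute_of_day`, `day_segment` and `classify_interval` are hypothetical names introduced for illustration; the sketch also assumes the timestamps are stored in UTC and ignores seconds.

```r
# The question's data, parsed as POSIXct (UTC is an assumption made for this sketch)
id <- c("m1", "m1", "m1", "m2", "m2", "m2", "m3", "m4", "m4")
start <- as.POSIXct(c("2020-01-03 10:00:00", "2020-01-03 16:00:00", "2020-01-03 19:20:00",
                      "2020-01-05 10:00:00", "2020-01-05 15:20:00", "2020-01-05 20:50:00",
                      "2020-01-06 06:30:00", "2020-01-08 06:30:00", "2020-01-08 07:50:00"),
                    tz = "UTC")
end <- as.POSIXct(c("2020-01-03 19:20:00", "2020-01-03 19:20:00", "2020-01-03 20:50:00",
                    "2020-01-05 15:20:00", "2020-01-05 20:50:00", "2020-01-05 22:00:00",
                    "2020-01-06 07:40:00", "2020-01-08 07:50:00", "2020-01-08 08:55:00"),
                  tz = "UTC")
end[1] <- as.POSIXct("2020-01-03 16:00:00", tz = "UTC")  # first interval ends at 16:00, as in the question
mydata <- data.frame(id, start, end)

# Hypothetical helper: clock time as minutes since midnight (seconds are ignored)
minute_of_day <- function(x) {
  as.integer(format(x, "%H")) * 60L + as.integer(format(x, "%M"))
}

# Segment of the day: 0 = up to 08:00, 1 = 08:01-20:00, 2 = 20:01 onward
day_segment <- function(x) {
  findInterval(minute_of_day(x), c(8 * 60 + 1, 20 * 60 + 1))
}

classify_interval <- function(start, end) {
  seg_s <- day_segment(start)
  seg_e <- day_segment(end)
  # Number of calendar days crossed, taken from the formatted dates
  days <- as.integer(as.Date(format(end, "%Y-%m-%d")) - as.Date(format(start, "%Y-%m-%d")))
  is_day    <- days == 0 & seg_s == 1 & seg_e == 1
  is_night  <- days == 0 & seg_s == seg_e & seg_s != 1
  overnight <- days == 1 & seg_s == 2 & seg_e == 0  # evening through the next morning
  ifelse(is_day, "day", ifelse(is_night | overnight, "night", "mixed"))
}

mydata$day.night <- classify_interval(mydata$start, mydata$end)
print(mydata)
```

With this version an interval such as 22:00 to 06:00 the next morning is labeled "night", while anything longer than one night or touching the 08:01-20:00 window falls through to "mixed".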