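I have been trying to create a new column based on time ranges: punctual (before 12:00:00), late (between 12:00:00 and 15:00:00) and very late (after 15:00:00). I can create the column from a fixed time, but not from a range.
Data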
time worker Day
1 2020-07-21 15:25:00 Ryan Tue
2 2020-07-21 11:20:00 Tim Tue
3 2020-07-21 11:30:00 John Tue
4 2020-07-21 14:00:00 Adam Tue
Desired output
time worker Day Arrival
1 2020-07-21 15:25:00 Ryan Tue very late
2 2020-07-21 11:20:00 Tim Tue punctual
3 2020-07-21 11:30:00 John Tue punctual
4 2020-07-21 14:00:00 Adam Tue late
Code that fails
df<-df %>% mutate(hour = lubridate::hour(time), minutes = lubridate::minutes(time),Arrival = case_when(hour <- 12 | (hour == 12 & minutes <= 30) ~ 'punctual',
hour <- 15 | (hour == 15 & minutes <= 30) ~ 'late',
TRUE ~ 'very late'))
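Answer 0 (score: 1)
You can use case_when and specify each condition separately. Conditions are checked in order, so a row that fails hour < 12 only needs to be tested against hour < 15: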
library(dplyr)
library(lubridate)
df %>%
  mutate(hour = hour(time),
         Arrival = case_when(hour < 12 ~ 'punctual',
                             hour < 15 ~ 'late',
                             TRUE ~ 'very late'))
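Or use cut in base R by specifying the breaks. The intervals are open on the left by default, so include.lowest = TRUE is needed for hour 0 to fall into the first bin instead of becoming NA.
df$Arrival <- cut(as.integer(format(df$time, '%H')), c(0, 11, 14, 23),
                  c('punctual', 'late', 'very late'), include.lowest = TRUE)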
df
# time worker Day Arrival
#1 2020-07-21 15:25:00 Ryan Tue very late
#2 2020-07-21 11:20:00 Tim Tue punctual
#3 2020-07-21 11:30:00 John Tue punctual
#4 2020-07-21 14:00:00 Adam Tue late
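As a minimal sketch of how cut treats the interval boundaries, the snippet below bins a few illustrative hour values (the vectors hours and labs are made up for this example, not taken from the question). It suggests why the lowest break deserves attention when finite breaks start at 0.

```r
hours <- c(0, 11, 12, 14, 15, 23)
labs <- c("punctual", "late", "very late")

# Default: intervals are (0,11], (11,14], (14,23], so hour 0 is not covered
default_cut <- cut(hours, breaks = c(0, 11, 14, 23), labels = labs)

# include.lowest = TRUE closes the first interval on the left: [0,11]
closed_cut <- cut(hours, breaks = c(0, 11, 14, 23), labels = labs,
                  include.lowest = TRUE)

# Compare both results side by side
print(data.frame(hours, default_cut, closed_cut))
```

Only the first row differs between the two columns: an arrival during the midnight hour is NA with the defaults and punctual with include.lowest = TRUE.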
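To extend this to minute precision, we can use case_when as shown below. The cutoffs 12:30 and 15:30 are taken from the code in the question rather than from its description, so with them the 15:25 arrival is classed as late; adjust the values to match the ranges you actually need.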
df %>%
  mutate(hour = hour(time),
         minutes = minute(time),
         Arrival = case_when(
           hour < 12 | (hour == 12 & minutes <= 30) ~ 'punctual',
           hour < 15 | (hour == 15 & minutes <= 30) ~ 'late',
           TRUE ~ 'very late'))
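An alternative that may be simpler for exact clock-time cutoffs is to convert each timestamp to seconds since midnight and bin that number. The following is only a sketch in base R under stated assumptions: the data frame arrivals is a hypothetical copy of the sample data, exactly 12:00:00 is treated as late, and exactly 15:00:00 as very late (the question does not say which side the boundaries belong to).

```r
# Hypothetical copy of the sample data, with time stored as POSIXct in UTC
arrivals <- data.frame(
  time = as.POSIXct(c("2020-07-21 15:25:00", "2020-07-21 11:20:00",
                      "2020-07-21 11:30:00", "2020-07-21 14:00:00"),
                    tz = "UTC"),
  worker = c("Ryan", "Tim", "John", "Adam")
)

# Seconds elapsed since midnight, in the time zone stored on the column
secs <- as.numeric(format(arrivals$time, "%H")) * 3600 +
  as.numeric(format(arrivals$time, "%M")) * 60 +
  as.numeric(format(arrivals$time, "%S"))

# right = FALSE gives left-closed bins: [00:00, 12:00), [12:00, 15:00), [15:00, 24:00)
arrivals$Arrival <- cut(secs,
                        breaks = c(0, 12 * 3600, 15 * 3600, 24 * 3600),
                        labels = c("punctual", "late", "very late"),
                        right = FALSE)
print(arrivals)
```

With the 12:00:00 and 15:00:00 cutoffs from the question's description, this labels the four sample rows very late, punctual, punctual and late, matching the desired output.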
Data
Make sure the time column in your data is of class POSIXct.
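If time was read in as character rather than POSIXct, a conversion along these lines might be needed first. This is a sketch only: the vector raw is hypothetical, and the format string and the UTC time zone are assumptions that should be adapted to how your data is actually stored.

```r
# Hypothetical character timestamps, e.g. as read from a CSV file
raw <- c("2020-07-21 15:25:00", "2020-07-21 11:20:00")

# Parse into POSIXct; format and tz are assumptions about the input
parsed <- as.POSIXct(raw, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Check the class and extract the hour to confirm the parse worked
print(class(parsed))
print(format(parsed, "%H"))
```

The sample data used in this answer, with time already stored as POSIXct, is: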
df <- structure(list(time = structure(c(1595345100, 1595330400, 1595331000,
1595340000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
worker = c("Ryan", "Tim", "John", "Adam"), Day = c("Tue",
"Tue", "Tue", "Tue")), row.names = c("1", "2", "3", "4"), class = "data.frame")