Creating rows for a range of hours

Time: 2017-03-18 20:10:43

Tags: r time

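I am currently working with data from food trucks, organized by applicant, day of the week, start time, and end time.

I have been asked to create separate rows or columns that describe whether each hour falls within the range between the start time and the end time (open, not open).

Is there a way to ask R to return, for each hour of the day, which hours fall within the range of the start time and end time, and label them as open? And then, for every hour that is not in the range, do the same and label it as not open?

I tried using a for loop, but was not successful.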

for(Yes in c("1","2","3","4","5","6","7","8","9","10","11","12","13","14",
             "15","16","17","18","19","20","21","22","23","24"))
{
    print(Yes)
    if(Yes %in% (NSFS$starthour %between% NSFS$endhour))
}



DayOfWeekStr Applicant           starthour  endhour  locationid
Friday       Natan's Catering    12         13       437207
Friday       Linda's Catering    10         15       760539
Wednesday    Mang Hang Catering  12         13       559779
Sunday       Tacos Santana       17         22       453014
Friday       Breaking Bread Inc. 14         18       934995

1 answer:

Answer 0 (score: 0)

I assume you have an input table where start_hour and end_hour are integers, e.g.:

#  applicant    day start_hour end_hour
#1         a monday          9       10
#2         a monday         12       12
#3         a monday         14       16
#4         a monday         17       18

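You can use seq to find all hours between start and end. The idea in the following code is to generate one data.table with the open hours (dt_open_hours) and another data.table (dt_all_hours) with all possible open hours (using the applicants and days in the input data). By merging the two data.tables, the resulting table contains all possible combinations of applicant, day and hour, but the state (Open / Not Open) comes only from dt_open_hours. The final step is to convert the missing values (NA) to "Not Open".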
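Before the full data.table code, here is a minimal base-R sketch of the core idea for a single row. The values 12 and 15, and the variable names open_hours, all_hours and state, are made up for illustration and are not part of the answer's code.

```r
# Hypothetical single row: open from hour 12 through hour 15 (inclusive)
start_hour <- 12L
end_hour   <- 15L

# All hours during which the truck is open
open_hours <- seq(from = start_hour, to = end_hour, by = 1)

# Flag each hour of the day (1 to 24, as in the question) as Open / Not Open
all_hours <- 1:24
state <- ifelse(all_hours %in% open_hours, "Open", "Not Open")

data.frame(hour = all_hours, state = state)
```

The data.table code that follows applies the same idea to every row at once, and uses a join instead of %in% to fill in the hours that are not open.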

library(data.table)

dt <- structure(list(applicant = structure(c(1L, 1L, 1L, 1L), .Label = "a", class = "factor"),
                     day = structure(c(1L, 1L, 1L, 1L), .Label = "monday", class = "factor"),
                     start_hour = c(9L, 12L, 14L, 17L),
                     end_hour = c(10L, 12L, 16L, 18L)),
                .Names = c("applicant", "day", "start_hour", "end_hour"),
                class = "data.frame", row.names = c(NA, -4L))

# Convert to data.table
setDT(dt)

# Assign unique row_ids so that seq works on one row at a time
dt[, row_id := seq(1, nrow(dt))]

# Convert start and end hour into a sequence of hours between start and end
dt_open_hours <- dt[, .(state = "Open",
                        hour = as.integer(seq(from = start_hour, to = end_hour, by = 1))),
                    by = .(row_id, applicant, day)]

# Remove row_id column
dt_open_hours[, row_id := NULL]

# Generate data.table with all combinations of applicant, day and hour
dt_all_hours <- CJ(applicant = unique(dt_open_hours[, applicant]),
                   day = unique(dt_open_hours[, day]),
                   hour = seq(1, 24))

# Merge
out <- dt_open_hours[dt_all_hours, on=.(applicant, day, hour)]
out[is.na(state), state := "Not Open"]

The out data.table looks like this:

#    applicant    day    state hour
# 1:         a monday Not Open    1
# 2:         a monday Not Open    2
# 3:         a monday Not Open    3
# 4:         a monday Not Open    4
# 5:         a monday Not Open    5
# 6:         a monday Not Open    6
# 7:         a monday Not Open    7
# 8:         a monday Not Open    8
# 9:         a monday     Open    9
#10:         a monday     Open   10
#11:         a monday Not Open   11
#12:         a monday     Open   12
#13:         a monday Not Open   13
#14:         a monday     Open   14
#15:         a monday     Open   15
#16:         a monday     Open   16
#17:         a monday     Open   17
#18:         a monday     Open   18
#19:         a monday Not Open   19
#20:         a monday Not Open   20
#21:         a monday Not Open   21
#22:         a monday Not Open   22
#23:         a monday Not Open   23
#24:         a monday Not Open   24
#    applicant    day    state hour

Update: using the following updated input data.frame:

  DayOfWeekStr Applicant           starthour endhour locationid
1 Friday       Natan's Catering    12        13      437207
2 Friday       Linda's Catering    10        15      760539
3 Wednesday    Mang Hang Catering  12        13      559779
4 Sunday       Tacos Santana       17        22      453014
5 Friday       Breaking Bread Inc. 14        18      934995

A few modifications were made to the code (besides using the column names provided in the updated input data.frame), mainly to incorporate the locationid variable. dt_open_hours contains the hours between starthour and endhour for the unique combinations of Applicant, DayOfWeekStr and locationid. dt_all_hours contains the hours 1 to 24 for the same unique combinations of Applicant, DayOfWeekStr and locationid. The merge of dt_open_hours and dt_all_hours is done on Applicant, DayOfWeekStr, locationid (new) and hour.
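A minimal sketch of how the modifications described above might look. This is a reconstruction from that description, not necessarily the answerer's exact code. The input is typed in by hand from the table in the update, and the helper name combos is hypothetical.

```r
library(data.table)

# Input from the update, entered by hand for this sketch
dt <- data.table(
  DayOfWeekStr = c("Friday", "Friday", "Wednesday", "Sunday", "Friday"),
  Applicant    = c("Natan's Catering", "Linda's Catering", "Mang Hang Catering",
                   "Tacos Santana", "Breaking Bread Inc."),
  starthour    = c(12L, 10L, 12L, 17L, 14L),
  endhour      = c(13L, 15L, 13L, 22L, 18L),
  locationid   = c(437207L, 760539L, 559779L, 453014L, 934995L)
)

# One row_id per input row so that seq() is evaluated one row at a time
dt[, row_id := seq_len(.N)]

# Expand each input row into one row per open hour
dt_open_hours <- dt[, .(state = "Open",
                        hour = as.integer(seq(from = starthour, to = endhour, by = 1))),
                    by = .(row_id, Applicant, DayOfWeekStr, locationid)]
dt_open_hours[, row_id := NULL]

# Hours 1 to 24 for each Applicant / DayOfWeekStr / locationid combination
# that occurs in the input (combos is a hypothetical helper name)
combos <- unique(dt[, .(Applicant, DayOfWeekStr, locationid)])
dt_all_hours <- combos[, .(hour = 1:24), by = .(Applicant, DayOfWeekStr, locationid)]

# Join: every row of dt_all_hours is kept; state is NA where the truck is closed
out <- dt_open_hours[dt_all_hours, on = .(Applicant, DayOfWeekStr, locationid, hour)]
out[is.na(state), state := "Not Open"]
```

Building dt_all_hours from the combinations that actually occur, rather than crossing every applicant with every day and location via CJ, keeps the result at 24 rows per input combination. This sketch also assumes that no combination has overlapping intervals; overlaps would produce duplicate hour rows after the join.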