Calculate the total time difference between each entry and exit for each employee

Time: 2017-11-06 04:37:26

Tags: r

My data looks like this:

data female;
    input cost_female :comma. @@;
    datalines;
871 684 795 838 1,033 917 1,047 723 1,179 707 817 846 975 868 1,323 791 1,157 932 1,089 770
;

data male;
    input cost_male :comma.  @@;
    datalines;
792 765 511 520 618 447 548 720 899 788 927 657 851 702 918 528 884 702 839 878
;

data repair_costs;
    merge female male;
run;

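Now I need to get the total time in and the total break time for each staff member. For example, for staff number 124:

  1. Total time in: (last exit time - first entry time) - total break time, i.e. (14:00 - 07:00) - (09:30 - 09:00)

  2. Total break time: the time between each intermediate exit and the next entry, i.e. (09:30 - 09:00)

I am struggling to solve this. Can anyone help?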
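To make the two definitions concrete, here is a minimal base R sketch of the arithmetic for staff 124 only. The variable names (`entry`, `exit_time`, `break_time`, `time_in`) are illustrative assumptions, and the times are simply the ones quoted in the example above.

```r
# Entry and exit times for staff 124, as quoted in the example above.
# The date part defaults to today; only the time of day matters here.
entry     <- as.POSIXct(c("07:00", "09:30"), format = "%H:%M", tz = "UTC")
exit_time <- as.POSIXct(c("09:00", "14:00"), format = "%H:%M", tz = "UTC")

# Total break time: each entry after the first minus the exit before it
break_time <- sum(as.numeric(
  difftime(entry[-1], exit_time[-length(exit_time)], units = "hours")
))

# Total time in: (last exit - first entry) - total break time
span    <- as.numeric(difftime(exit_time[length(exit_time)], entry[1], units = "hours"))
time_in <- span - break_time

break_time  # 0.5 hours
time_in     # 6.5 hours
```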

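1 answer:

Answer 0 (score: 1)

If you parse the times as POSIXct, you can subtract them (or use difftime directly to control the units). Subtracting them returns difftime objects, which can be summed: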

library(tidyverse)

df <- structure(list(Staff = c(123L, 123L, 123L, 123L, 123L, 123L, 124L, 124L, 124L, 124L), 
                     Event = c("Entry", "Exit", "Entry", "Exit", "Entry", "Exit", "Entry", "Exit", "Entry", "Exit"), 
                     Time = c("07:00 Hrs", "08:15 Hrs", "08:30 Hrs", "11:15 Hrs", "11:30 Hrs", "15:00 Hrs", "07:00 Hrs", "09:00 Hrs", "09:30 Hrs", "14:00 Hrs")), 
                .Names = c("Staff", "Event", "Time"), class = "data.frame", row.names = c(NA, 10L))

df2 <- df %>% 
    group_by(Staff) %>% 
    mutate(i = cumsum(Event == 'Entry'),    # add index to allow reshaping
           Time = as.POSIXct(Time, format = '%H:%M')) %>%    # parse to datetime
    spread(Event, Time) %>%    # reshape to wide form
    mutate(work_time = Exit - Entry, 
           break_time = lead(Entry) - Exit)

df2
#> # A tibble: 5 x 6
#> # Groups:   Staff [2]
#>   Staff     i               Entry                Exit  work_time break_time
#>   <int> <int>              <dttm>              <dttm>     <time>     <time>
#> 1   123     1 2017-11-06 07:00:00 2017-11-06 08:15:00 1.25 hours    15 mins
#> 2   123     2 2017-11-06 08:30:00 2017-11-06 11:15:00 2.75 hours    15 mins
#> 3   123     3 2017-11-06 11:30:00 2017-11-06 15:00:00 3.50 hours    NA mins
#> 4   124     1 2017-11-06 07:00:00 2017-11-06 09:00:00 2.00 hours    30 mins
#> 5   124     2 2017-11-06 09:30:00 2017-11-06 14:00:00 4.50 hours    NA mins

# now just aggregate
df2 %>% summarise_at(vars(work_time, break_time), sum, na.rm = TRUE)
#> # A tibble: 2 x 3
#>   Staff work_time break_time
#>   <int>    <time>     <time>
#> 1   123 7.5 hours    30 mins
#> 2   124 6.5 hours    30 mins
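
Note that the summary above reports work time in hours and break time in minutes, because plain subtraction picks the units automatically. As a rough sketch of the "use difftime directly" option, the base R version below forces both totals into hours. It assumes the rows for each staff member are already in chronological order and strictly alternate Entry/Exit; `total_hours` is a hypothetical helper written for this illustration, not a function from any package, and the " Hrs" suffix is dropped from the times for simplicity.

```r
# Same records as in the answer above, with the " Hrs" suffix dropped
df <- data.frame(
  Staff = c(123L, 123L, 123L, 123L, 123L, 123L, 124L, 124L, 124L, 124L),
  Event = rep(c("Entry", "Exit"), times = 5),
  Time  = c("07:00", "08:15", "08:30", "11:15", "11:30", "15:00",
            "07:00", "09:00", "09:30", "14:00"),
  stringsAsFactors = FALSE
)
df$Time <- as.POSIXct(df$Time, format = "%H:%M", tz = "UTC")

# Hypothetical helper: totals for one staff member, both in hours.
# Assumes rows are ordered in time and alternate Entry, Exit, Entry, ...
total_hours <- function(d) {
  entry     <- d$Time[d$Event == "Entry"]
  exit_time <- d$Time[d$Event == "Exit"]
  # Work: each exit minus its matching entry
  work <- sum(as.numeric(difftime(exit_time, entry, units = "hours")))
  # Break: each entry after the first minus the preceding exit
  brk <- sum(as.numeric(
    difftime(entry[-1], exit_time[-length(exit_time)], units = "hours")
  ))
  data.frame(Staff = d$Staff[1], work_hours = work, break_hours = brk)
}

# Apply the helper per staff member and stack the results
totals <- do.call(rbind, lapply(split(df, df$Staff), total_hours))
totals
```

With explicit `units = "hours"`, both columns are plain numbers on the same scale, so they can be compared or added without checking the difftime units first.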