How to find the number of hours between two dates in Python, excluding weekends and certain holidays? (BusinessHours package)

Time: 2017-02-23 07:15:18

Tags: pandas numpy python-3.5

I am trying to find a clean way to calculate the number of hours between two dates, excluding weekends and certain holidays.

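I found that the BusinessHours package (https://pypi.python.org/pypi/BusinessHours/1.01) can do this. However, I could not find any instructions on how to use the package (the actual syntax), especially how to pass in holidays. I found the package's source code (https://github.com/dnel/BusinessHours/blob/master/BusinessHours.py), but I am still not sure. My guess is that it would look something like this: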

date1 = pd.to_datetime('2017-01-01 00:00:00')
date2 = pd.to_datetime('2017-01-22 12:00:00')
import BusinessHour
gethours(date1, date2, worktiming=[8, 17], weekends=[6, 7])

But where do I pass in the holidays? And if I do not want to exclude non-office hours, would I set worktiming=[0,23]?

If anyone knows how to use this package, please let me know. I would appreciate it.

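P.S.: I know numpy has a function (busday_count) that returns the number of business days between two dates, but nothing that returns the result in hours. Any other pandas or numpy function that can do the job is also welcome. Thanks.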
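
For reference, the numpy route mentioned above can be sketched as follows. It only counts whole business days, so multiplying by a fixed day length ignores partial first and last days. The holiday date and the 9-hour day (matching worktiming=[8, 17]) are illustrative assumptions, not part of any package.

```python
import numpy as np

# Illustrative holiday list (assumption): 2 January 2017 treated as a holiday.
holidays = ["2017-01-02"]

# Count weekdays in [2017-01-01, 2017-01-22), skipping the listed holidays.
# The end date itself is not included in the count.
business_days = np.busday_count("2017-01-01", "2017-01-22", holidays=holidays)

# Whole-day approximation: 9 working hours per day (08:00 to 17:00).
hours_per_day = 9
approx_hours = int(business_days) * hours_per_day

print(business_days, approx_hours)
```

This is why busday_count alone is not enough here: a start or end time in the middle of a working day still has to be handled separately.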

3 answers:

Answer 0: (score: 0)

The latest pip install of this package (1.2) has a bug on line 51: "extraday" needs to be changed to "extradays".

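I have also been searching the web for working code to calculate business hours and business days. This package needs a little tweaking, but it works well once you get it up and running.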

This is what I have in my notebook:

#import BusinessHours
from BusinessHours import BusinessHours as bh
import numpy as np
import pandas as pd
from pandas import Series, DataFrame

date1 = pd.to_datetime('2017-01-01 00:00:00')
date2 = pd.to_datetime('2017-01-22 12:00:00')
bh(date1, date2, worktiming=[8, 17], weekends=[6, 7]).gethours()

This is also in the source code:

'''
holidayfile - A file consisting of the predetermined office holidays. 
Each date starts in a new line and currently must only be in the format 
dd-mm-yyyy
'''
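
A holiday file in that format could be produced with a few lines like the following. This is only a sketch: the dates and the file name are made up for illustration, and the file is written to a temporary directory.

```python
import os
import tempfile
from datetime import date

# Hypothetical holiday dates, for illustration only.
holidays = [date(2017, 1, 2), date(2017, 1, 26)]

# One date per line, formatted dd-mm-yyyy as the docstring requires.
lines = [d.strftime("%d-%m-%Y") for d in holidays]

# Written to a temporary directory here; any readable path would do.
holiday_path = os.path.join(tempfile.mkdtemp(), "holidays.txt")
with open(holiday_path, "w") as f:
    f.write("\n".join(lines) + "\n")

print(lines)
```

Going by the docstring, the resulting path would presumably be passed as holidayfile when constructing the object; that call is not verified here.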

Hope this helps.

Answer 1: (score: 0)

Try the package named business-duration on PyPI.

Example code:

from business_duration import businessDuration
import pandas as pd
from datetime import time,datetime
import holidays as pyholidays

startdate = pd.to_datetime('2017-01-01 00:00:00')
enddate = pd.to_datetime('2017-01-22 12:00:00')

holidaylist = pyholidays.Australia()
unit='hour'
#By default Saturday and Sunday are excluded

print(businessDuration(startdate,enddate,holidaylist=holidaylist,unit=unit))
# Output: 335.99611

holidaylist:
{datetime.date(2017, 1, 1): "New Year's Day",
 datetime.date(2017, 1, 2): "New Year's Day (Observed)",
 datetime.date(2017, 1, 26): 'Australia Day',
 datetime.date(2017, 3, 6): 'Canberra Day',
 datetime.date(2017, 4, 14): 'Good Friday',
 datetime.date(2017, 4, 15): 'Easter Saturday',
 datetime.date(2017, 4, 17): 'Easter Monday',
 datetime.date(2017, 4, 25): 'Anzac Day',
 datetime.date(2017, 6, 12): "Queen's Birthday",
 datetime.date(2017, 9, 26): 'Family & Community Day',
 datetime.date(2017, 10, 2): 'Labour Day',
 datetime.date(2017, 12, 25): 'Christmas Day',
 datetime.date(2017, 12, 26): 'Boxing Day'}
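
As a rough cross-check of the output above, the same span can be counted with the standard library alone. This sketch assumes that every weekday that is not a holiday counts for a full 24 hours, which seems to be what the call does when no start and end times are given. The function name and the three-date holiday set are made up for the illustration.

```python
from datetime import date, datetime, timedelta

# Only the January 2017 entries from the holiday list matter for this span.
holidays_2017 = {date(2017, 1, 1), date(2017, 1, 2), date(2017, 1, 26)}

def full_day_business_hours(start, end, holidays):
    """Sum the hours in [start, end] that fall on non-holiday weekdays."""
    total = timedelta(0)
    day = start.date()
    while day <= end.date():
        # weekday() is 0 for Monday through 6 for Sunday.
        if day.weekday() < 5 and day not in holidays:
            day_start = datetime.combine(day, datetime.min.time())
            day_end = day_start + timedelta(days=1)
            overlap = min(day_end, end) - max(day_start, start)
            if overlap > timedelta(0):
                total += overlap
        day += timedelta(days=1)
    return total.total_seconds() / 3600

total_hours = full_day_business_hours(
    datetime(2017, 1, 1, 0, 0, 0),
    datetime(2017, 1, 22, 12, 0, 0),
    holidays_2017,
)
print(total_hours)
```

There are 14 qualifying days in the span (3 to 6, 9 to 13 and 16 to 20 January), so this gives 336 hours, close to the 335.99611 reported above.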

Answer 2: (score: 0)

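I reused code from the source and put together this piece of code, which seems to work (for UK holidays), but I would welcome opinions on how to improve it. I know it is not particularly elegant, but it may help someone. By the way, I would like to find a way to plug a calendar from the holidays library into this code.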

In any case, for now it does not need many libraries, only pandas and datetime, which may be a plus.


import pandas as pd
import datetime
from pandas.tseries.offsets import CDay
from pandas.tseries.holiday import (
    AbstractHolidayCalendar, DateOffset, EasterMonday,
    GoodFriday, Holiday, MO,
    next_monday, next_monday_or_tuesday)

# This function will calculate the number of working minutes by first
# generating a time series of business days. Then it will calculate the 
# precise working minutes for the start and end date, and use the total 
# working hours for each day in-between.

def count_mins(starttime,endtime, bus_day_series, bus_start_time,bus_end_time):
    mins_in_working_day=(bus_end_time-bus_start_time)*60

    # now we are going to take the series of business days (pre-calculated)
    # and sub select the period provided as argument of the function
    # we could do the calculation of that "calendar" in the function itself
    # but to improve performance, we calculate it separately and then
    # call the function with that series as argument, provided the dates
    # fall within the calculated range, of course
    days = bus_day_series[starttime.date():endtime.date()] 

    daycount = len(days)
    if len(days)==0:
        return 0
    else:
        first_day_start = days.iloc[0].replace(hour=bus_start_time, minute=0)
        first_day_end = days.iloc[0].replace(hour=bus_end_time, minute=0)
        first_period_start = max(first_day_start, starttime)
        first_period_end = min(first_day_end, endtime)     
        if first_period_end<=first_period_start:
            first_day_mins=0
        else:
            first_day_sec=first_period_end - first_period_start
            first_day_mins=first_day_sec.seconds/60
        if daycount == 1:
            return first_day_mins
        else:
            # the last day always starts at bus_start_time
            last_period_start = days.iloc[-1].replace(hour=bus_start_time, minute=0)
            last_day_end = days.iloc[-1].replace(hour=bus_end_time, minute=0)
            last_period_end = min(last_day_end, endtime)       
            if last_period_end<=last_period_start:
                last_day_mins=0
            else:
                last_day_sec=last_period_end - last_period_start
                last_day_mins=last_day_sec.seconds/60            
            middle_days_mins=0
            if daycount>2:
                middle_days_mins=(daycount-2)*mins_in_working_day
            return first_day_mins + last_day_mins + middle_days_mins


# Holiday calendar used to build the series of business days
# for the period we are interested in
class EnglandAndWalesHolidayCalendar(AbstractHolidayCalendar):
    rules = [
        Holiday('New Years Day', month=1, day=1, observance=next_monday),
        GoodFriday,
        EasterMonday,
        Holiday('Early May bank holiday',
                month=5, day=1, offset=DateOffset(weekday=MO(1))),
        Holiday('Spring bank holiday',
                month=5, day=31, offset=DateOffset(weekday=MO(-1))),
        Holiday('Summer bank holiday',
                month=8, day=31, offset=DateOffset(weekday=MO(-1))),
        Holiday('Christmas Day', month=12, day=25, observance=next_monday),
        Holiday('Boxing Day',
                month=12, day=26, observance=next_monday_or_tuesday)
    ]

# From this point on, this is how we use the function



# Here we hardcode a start/end date to create the list of business days
cal = EnglandAndWalesHolidayCalendar()
dayindex = pd.bdate_range(datetime.date(2019,1,1),datetime.date.today(),freq=CDay(calendar=cal))
day_series = dayindex.to_series()


# Convenience function to simplify how we call the main function
# It will take a pre calculated day_series.
def bus_hr(ts_start, ts_end, day_series ):
    BUS_START=8 
    BUS_END=20
    minutes = count_mins(ts_start, ts_end, day_series, BUS_START, BUS_END)
    return int(round(minutes/60,0))


#A set of checks that the function is working properly
assert bus_hr( pd.Timestamp(2019,9,30,6,1,0) , pd.Timestamp(2019,10,1,9,0,0),day_series) == 13
assert bus_hr( pd.Timestamp(2019,10,3,10,30,0) , pd.Timestamp(2019,10,3,23,30,0),day_series)==10
assert bus_hr( pd.Timestamp(2019,8,25,10,30,0) , pd.Timestamp(2019,8,27,10,0,0),day_series) ==2
assert bus_hr( pd.Timestamp(2019,12,25,8,0,0) , pd.Timestamp(2019,12,25,17,0,0),day_series) ==0
assert bus_hr( pd.Timestamp(2019,12,26,8,0,0) , pd.Timestamp(2019,12,26,17,0,0),day_series) ==0
assert bus_hr( pd.Timestamp(2019,12,27,8,0,0) , pd.Timestamp(2019,12,27,17,0,0),day_series) ==9
assert bus_hr( pd.Timestamp(2019,6,24,5,10,44) , pd.Timestamp(2019,6,24,7,39,17),day_series)==0
assert bus_hr( pd.Timestamp(2019,6,24,5,10,44) , pd.Timestamp(2019,6,24,8,29,17),day_series)==0
assert bus_hr( pd.Timestamp(2019,6,24,5,10,44) , pd.Timestamp(2019,6,24,10,0,0),day_series)==2
assert bus_hr(pd.Timestamp(2019,4,30,21,19,0) , pd.Timestamp(2019,5,1,16,17,56),day_series)==8
assert bus_hr(pd.Timestamp(2019,4,30,21,19,0) , pd.Timestamp(2019,5,1,20,17,56),day_series)==12
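
On the wish to plug in dates from another source such as the holidays library: one possible route is the holidays argument of CDay, which takes a plain list of dates instead of a calendar class. The sketch below uses a hard-coded list as a stand-in (hypothetical values); with the holidays library the list might come from the keys of its dict-like object, which is an assumption not tested here.

```python
import datetime
import pandas as pd
from pandas.tseries.offsets import CDay

# Stand-in for dates taken from another source (hypothetical values).
extra_holidays = [datetime.date(2019, 12, 25), datetime.date(2019, 12, 26)]

# Pass the dates as ISO strings through the holidays argument of CDay.
custom_day = CDay(holidays=[d.isoformat() for d in extra_holidays])

# Business days between 23 and 31 December 2019, skipping the two holidays.
custom_index = pd.date_range("2019-12-23", "2019-12-31", freq=custom_day)
custom_day_series = custom_index.to_series()

business_dates = [d.strftime("%Y-%m-%d") for d in custom_index]
print(business_dates)
```

A series built this way could then be passed to bus_hr in place of day_series, as long as the queried dates fall inside its range.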