Splitting working days across months

Time: 2020-08-10 08:59:55

Tags: python pandas dataframe

df1

no  From        To          check   
1   27-Jan-20   28-Mar-20   a                                    
2   28-Mar-20   12-Apr-20   a                                 
3   29-May-20   29-May-20   b                             
4   5-Apr-20    12-Apr-20   b                                 

df2

col1    col2
a       9-Apr-20
b       30-Mar-20

df

no  From        To          check   total   Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1   27-Jan-20   28-Mar-20   a       45      5   20  20
2   28-Mar-20   12-Apr-20   a       9               2   7
3   29-May-20   29-May-20   b       1                       1
4   5-Apr-20    12-Apr-20   b       5                   5

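I need to calculate two things:

  1. The "total" column, based on the working days between "From" and "To", taking into account any holidays in df2.
  2. The split of the "total" column across the corresponding months (the Jan to Dec columns).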

For part 1, the "total" column in df1 was calculated using

np.busday_count('2020-01-27','2020-03-28')

But this is not accurate, and it cannot take the holidays (df2) into account. I tried to fill the dataframe column directly using

df['total']=np.busday_count(df1['From'].astype('datetime64[D]')
,df1['To'].astype('datetime64[D]'))

but it gives an error.

1 answer:

Answer 0: (score: 0)

You can use bdate_range inside a custom function.
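As a quick illustration of the idea for a single row, here is a minimal sketch using row 2 of df1 (28-Mar-20 to 12-Apr-20, check "a", whose holiday in df2 is 9-Apr-20). The variable names are only for illustration, and passing freq="C" together with holidays is one possible way to drop the holiday, not necessarily what the question intended.

```python
import pandas as pd

# Row 2 of df1: 28-Mar-20 to 12-Apr-20, with the df2 holiday for check "a"
start = pd.Timestamp("2020-03-28")
end = pd.Timestamp("2020-04-12")
holidays = [pd.Timestamp("2020-04-09")]

# freq="C" selects a custom business-day calendar (Mon-Fri by default),
# which is what makes bdate_range honour the `holidays` argument
days = pd.bdate_range(start=start, end=end, freq="C", holidays=holidays)

# Count the remaining working days per month number (1 = Jan ... 12 = Dec)
per_month = pd.Series(days.month).value_counts().sort_index()
per_month_dict = {int(k): int(v) for k, v in per_month.items()}
print(per_month_dict)  # {3: 2, 4: 7}
```

The answer's function that follows takes a related route: it counts plain business days per month and then subtracts the holiday.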

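import pandas as pd

# dict of num to month mapping, e.g. {1: 'JAN', ..., 12: 'DEC'}
# (an undocumented pandas attribute, so it may change between versions)
months = pd.tseries.frequencies.MONTH_ALIASES

df2['col2'] = pd.to_datetime(df2['col2'], dayfirst=True)

# start from the question's data and look up each row's holiday month
df = df1.copy()
df['holiday'] = df['check'].map(df2.set_index('col1')['col2']).dt.month

def count_by_month(s):
    start, end, holiday = s['From'], s['To'], s['holiday']

    # business days (Mon-Fri) in the range, counted per month number
    valid_dates = pd.bdate_range(start=start, end=end).month
    count = dict(pd.Series(valid_dates).value_counts())

    # subtract holidays; note this only checks that the holiday's month
    # occurs in the range, not that the date itself falls inside it
    if holiday in count:
        count[holiday] -= 1

    return pd.concat([s, pd.Series({v: count.get(k, 0) for k, v in months.items()})], axis=0)

# apply row by row, then add 'total' as the sum of the monthly counts
df = df.apply(count_by_month, axis=1)
df.insert(4, 'total', df[list(months.values())].sum(axis=1))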

print(df)

   no         From            To check  total  holiday  JAN  FEB  MAR  APR  \
0   1    27-Jan-20     28-Mar-20     a     45        4    5   20   20    0   
1   2    28-Mar-20     12-Apr-20     a      9        4    0    0    2    7   
2   3    29-May-20     29-May-20     b      1        3    0    0    0    0   
3   4    5-Apr-20      12-Apr-20     b      5        3    0    0    0    5   

   MAY  JUN  JUL  AUG  SEP  OCT  NOV  DEC  
0    0    0    0    0    0    0    0    0  
1    0    0    0    0    0    0    0    0  
2    1    0    0    0    0    0    0    0  
3    0    0    0    0    0    0    0    0  
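
The function above compares only the month of the holiday with the months in the range. If the holiday's actual date should decide whether a day is removed, a NumPy-based variant along the following lines may be closer to what the question asks. This is a sketch under assumptions: the helper working_days_per_month and the names MONTHS, holiday_lists and result are made up for illustration, the weekend is taken to be Saturday and Sunday, and "To" is treated as inclusive.

```python
import numpy as np
import pandas as pd

MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()

df1 = pd.DataFrame({
    "no": [1, 2, 3, 4],
    "From": ["27-Jan-20", "28-Mar-20", "29-May-20", "5-Apr-20"],
    "To": ["28-Mar-20", "12-Apr-20", "29-May-20", "12-Apr-20"],
    "check": ["a", "a", "b", "b"],
})
df2 = pd.DataFrame({"col1": ["a", "b"], "col2": ["9-Apr-20", "30-Mar-20"]})

fmt = "%d-%b-%y"
start = pd.to_datetime(df1["From"], format=fmt).to_numpy().astype("datetime64[D]")
end = pd.to_datetime(df1["To"], format=fmt).to_numpy().astype("datetime64[D]")

# check value -> list of ISO date strings, e.g. {'a': ['2020-04-09'], ...}
holiday_lists = (
    df2.assign(col2=pd.to_datetime(df2["col2"], format=fmt).dt.strftime("%Y-%m-%d"))
       .groupby("col1")["col2"].apply(list).to_dict()
)

def working_days_per_month(first, last, holidays):
    """Hypothetical helper: Mon-Fri days in [first, last] per month, minus holidays."""
    days = np.arange(first, last + np.timedelta64(1, "D"))  # include `last`
    hol = np.array(holidays, dtype="datetime64[D]")
    days = days[np.is_busday(days, holidays=hol)]
    # months since 1970-01, reduced to 0 = Jan ... 11 = Dec
    month_index = days.astype("datetime64[M]").astype(int) % 12
    return np.bincount(month_index, minlength=12)

counts = np.array([
    working_days_per_month(s, e, holiday_lists.get(c, []))
    for s, e, c in zip(start, end, df1["check"])
])

result = df1.copy()
result["total"] = counts.sum(axis=1)
result = pd.concat(
    [result, pd.DataFrame(counts, columns=MONTHS, index=df1.index)], axis=1
)
print(result)
```

On the sample data this gives totals of 45, 9, 1 and 5, the same as the expected df in the question. Months are folded together regardless of year, so a range spanning more than twelve months would need an extra year column.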