Pandas: read dates from a CSV and group by total business days per month

Time: 2019-10-11 02:42:15

Tags: python pandas datetime pandas-groupby

data.CSV

ID  Activity Month  Activity Date
0   04/2019         04-01-2019
1   05/2019         05-13-2019
2   05/2019         05-25-2019
3   06/2019         06-10-2019
4   06/2019         06-19-2019
5   07/2019         07-15-2019
6   07/2019         07-18-2019
7   07/2019         07-29-2019
8   08/2019         06-03-2019
9   08/2019         06-15-2019
10  08/2019         06-20-2019

My plan

Read the CSV:

import pandas as pd

df = pd.read_csv('data.CSV')

Convert to datetime:

df['Activity Date'] = pd.to_datetime(df['Activity Date'], dayfirst=True)
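One thing worth checking in this step: the sample rows look month-first (05-13-2019 cannot be a day-first date), so `dayfirst=True` may read a value such as 04-01-2019 as 4 January. A minimal sketch, assuming the whole column is in MM-DD-YYYY form, passes an explicit `format` instead. The small `sample` frame below is built inline from the first rows of the question purely for illustration.

```python
import pandas as pd

# Illustrative frame built from the first three rows shown in the question.
sample = pd.DataFrame(
    {
        "Activity Month": ["04/2019", "05/2019", "05/2019"],
        "Activity Date": ["04-01-2019", "05-13-2019", "05-25-2019"],
    }
)

# Assumption: every value is month-day-year, so state the format explicitly
# rather than letting pandas guess the day/month order.
sample["Activity Date"] = pd.to_datetime(sample["Activity Date"], format="%m-%d-%Y")

parsed_months = sample["Activity Date"].dt.month.tolist()
parsed_days = sample["Activity Date"].dt.day.tolist()
print(parsed_months, parsed_days)
```

With an explicit format, a value that does not match raises an error instead of being silently reinterpreted, which makes mixed-format files easier to spot.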

Group by the 'Activity Month' column:

grouped = df.groupby(['Activity Month'])['Activity Date'].count()

print(grouped)

Activity Month
04/2019    15532
05/2019    13924
06/2019    12822
07/2019    14067
08/2019    10939
Name: Activity Date, dtype: int64

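While grouping the dates, perform the business-day calculation:

This is the part I'm not sure how to do. I'm lost.

The code I use to calculate business days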

import calendar
import datetime

x = datetime.date(2019, 4, 1)
cal = calendar.Calendar()
working_days = len([x for x in cal.itermonthdays2(x.year, x.month) if x[0] !=0 and x[1] < 5])
print ("Total business days for month (" + str(x.month) +  ") is " + str(working_days) + " days")
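As a cross-check on the `calendar`-based count, the same Monday-to-Friday tally can be sketched with NumPy's `busday_count`. This is an alternative technique, not the code from the question; the helper name `business_days_in_month` is hypothetical, and public holidays are ignored, as they are in the snippet above.

```python
import numpy as np

def business_days_in_month(year: int, month: int) -> int:
    """Count Monday-Friday days in the given month (holidays are not excluded)."""
    # A month-precision datetime64; adding 1 moves to the following month.
    start = np.datetime64(f"{year:04d}-{month:02d}", "M")
    end = start + 1
    # busday_count counts weekdays in the half-open range [start, end).
    return int(np.busday_count(start.astype("datetime64[D]"), end.astype("datetime64[D]")))

for month in range(4, 9):
    days = business_days_in_month(2019, month)
    print(f"Total business days for month ({month}) is {days} days")
```

`busday_count` also accepts a `holidays` argument, which could be used if public holidays should not count as working days.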

The output I want

Total business days for month (4) is 22 days
Total business days for month (5) is 23 days
Total business days for month (6) is 20 days
Total business days for month (7) is 23 days
Total business days for month (8) is 22 days

1 Answer:

Answer 0 (score: 1)

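I'm not entirely clear on the problem statement, but if you want to calculate the number of business days for each Activity Month, you can wrap your calculation in a function and then apply that function to the Activity Month column (apply, much like a lambda expression, essentially runs a for-loop over each row of the specified column).

import calendar
import datetime

grouped = df.groupby(['Activity Month'])['Activity Date'].count().reset_index()

def get_business_days(x):
    # x is an 'MM/YYYY' string such as '04/2019'
    month, year = x.split('/')
    first_day = datetime.date(int(year), int(month), 1)
    cal = calendar.Calendar()
    # itermonthdays2 yields (day_of_month, weekday) pairs; day 0 is padding
    # from adjacent months, and weekday < 5 means Monday to Friday
    working_days = len([d for d in cal.itermonthdays2(first_day.year, first_day.month) if d[0] != 0 and d[1] < 5])
    return ("Total business days for month (" + str(first_day.month) + ") is " + str(working_days) + " days")

grouped['Activity Month'].apply(get_business_days)

The output is a Series of text:

0    Total business days for month (4) is 22 days
1    Total business days for month (5) is 23 days
2    Total business days for month (6) is 20 days
3    Total business days for month (7) is 23 days
4    Total business days for month (8) is 22 days

However, storing repeated information in every cell is a bad idea. It is better to return just the count (working_days) rather than embedding it in a string.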
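The closing suggestion could look roughly like the sketch below: the function returns a plain integer, which is stored in its own column next to the per-month row count. The function name `count_business_days` and the column names `Activity Count` and `Business Days` are made up for illustration, and the frame is rebuilt inline from the sample rows in the question so the snippet runs on its own.

```python
import calendar

import pandas as pd

def count_business_days(activity_month: str) -> int:
    """Return the number of Monday-Friday days for an 'MM/YYYY' string."""
    month, year = (int(part) for part in activity_month.split("/"))
    cal = calendar.Calendar()
    # Skip padding days (day == 0) and weekends (weekday 5 and 6).
    return sum(
        1
        for day, weekday in cal.itermonthdays2(year, month)
        if day != 0 and weekday < 5
    )

# Sample rows copied from the question.
df = pd.DataFrame(
    {
        "Activity Month": ["04/2019", "05/2019", "05/2019", "06/2019", "06/2019",
                           "07/2019", "07/2019", "07/2019", "08/2019", "08/2019", "08/2019"],
        "Activity Date": ["04-01-2019", "05-13-2019", "05-25-2019", "06-10-2019", "06-19-2019",
                          "07-15-2019", "07-18-2019", "07-29-2019", "06-03-2019", "06-15-2019", "06-20-2019"],
    }
)

grouped = (
    df.groupby("Activity Month")["Activity Date"]
    .count()
    .reset_index(name="Activity Count")
)
# A numeric column can be sorted, summed, or formatted later as needed.
grouped["Business Days"] = grouped["Activity Month"].apply(count_business_days)
print(grouped)
```

Keeping the count numeric means the sentence from the desired output can still be produced at print time, for example with an f-string, without storing the same text in every row.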