How to compare pandas DataFrame timestamps with dates from an .ics file

Time: 2018-11-07 08:41:43

Tags: python pandas dataframe icalendar

I have a DataFrame with a column of Timestamps:

    Timestamp   
0   2017-11-09 14:55:29 
1   2017-11-09 14:58:29 
2   2017-11-09 15:01:29 

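I also have an .ics file containing a holiday calendar, which I have downloaded to my drive (the full calendar is at https://raw.githubusercontent.com/PanderMusubi/dutch-holidays/master/DutchHolidays.ics).

A sample entry looks like this: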

BEGIN:VEVENT
DTSTAMP:20180712T151328Z
SUMMARY:Eerste Paasdag (Easter Sunday)
UID:20180712T151328Z-17127-0077-en@katana
DTSTART;VALUE=DATE:20180401
DTEND;VALUE=DATE:20180402
ATTACH:https://nl.wikipedia.org/wiki/Eerste_Paasdag
CATEGORIES:Public Holiday
TRANSP:TRANSPARENT
END:VEVENT

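I want to create a binary column named "Holiday" next to df.Timestamp that shows 1 if the date of the timestamp matches the date of an event with "CATEGORIES:Public Holiday", and 0 otherwise. This question is somewhat similar, but I don't understand the json or walk parts: parse dates with icalendar and compare to python datetime

This is what I have tried so far, but I am new to this, so it may well be wrong: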

import icalendar
calendar = icalendar.Calendar.from_ical('/Users/dpezim/Desktop/Python/DutchHolidays.ics')

for i in df.Timestamp:
    for event in calendar.walk('VEVENT'):
        if event['DTSTART'].dt <= i <= event['DTEND'].dt:
            df = df.assign(Holiday=1)
        else: 
            df = df.assign(Holiday=0)
return df

I get this error:

ValueError: Content line could not be parsed into parts: '/Users/dpezim/Desktop/Python/DutchHolidays.ics': /Users/dpezim/Desktop/Python/DutchHolidays.ics

1 Answer:

Answer 0 (score: 2)

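This code reads the ics file from the URL and extracts all the events from it. It then iterates over every value in the Timestamp column of the DataFrame df and compares its date with the event dates. If an event date matches the timestamp's date, the code checks the category of the event and sets the corresponding value in the holidayCheck list accordingly. At the end, the list is assigned to a new Holiday column of the DataFrame.

Please let me know if this helps. Thanks.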

import pandas as pd
from urllib.request import urlopen
import ics

url = 'https://raw.githubusercontent.com/PanderMusubi/dutch-holidays/master/DutchHolidays.ics'

# Download the calendar and parse it with the third-party ics package
calendar = ics.Calendar(urlopen(url).read().decode('iso-8859-1'))
events = calendar.events
holidayCheck = []

# df is the DataFrame from the question, with a datetime Timestamp column
for _datetime in df.Timestamp:
    dfDate = _datetime.strftime('%Y%m%d')

    # 1 if any Public Holiday event starts on this date, otherwise 0
    isHoliday = 0
    for event in events:
        eventDate = event.begin.strftime('%Y%m%d')
        # event.categories is a set and may be empty
        if dfDate == eventDate and 'Public Holiday' in event.categories:
            isHoliday = 1
            break
    holidayCheck.append(isHoliday)

df = df.assign(Holiday=holidayCheck)