I have the following DataFrames:
Site
OfflineFrom | OfflineTo  | ShiftDays | Site |
---------------------------------------------
2017-10-02  | 2017-10-10 | 6         | ID   |
2017-10-13  | 2017-11-10 | 6         | ID   |
2017-11-15  | 2017-12-09 | 6         | ID   |
2017-10-03  | 2017-10-11 | 6         | IN   |
2017-10-03  | 2017-10-10 | 6         | IN   |
Holiday
Holiday    | SiteID |
---------------------
2017-10-07 | ID     |
2017-10-08 | ID     |
2017-09-12 | ID     |
2017-10-08 | IN     |
I want logic such that, for every holiday belonging to a site that falls between OfflineFrom and OfflineTo, one day is subtracted from ShiftDays.
The expected result is:
OfflineFrom | OfflineTo  | ShiftDays | Site |
---------------------------------------------
2017-10-02  | 2017-10-10 | 4         | ID   |
2017-10-13  | 2017-11-10 | 6         | ID   |
2017-11-15  | 2017-12-09 | 6         | ID   |
2017-10-03  | 2017-10-11 | 6         | IN   |
2017-10-03  | 2017-10-10 | 5         | IN   |
Any code would be appreciated. Thanks.
The code I use to set this up and test it is:
# Evaluate if Holiday by Site is within OfflineFrom and OfflineTo
# Subtract the holiday from ShiftDays if it is so
import numpy as np
import pandas as pd
from datetime import datetime, time
# Prepare site ID series
s1 = pd.Series('ID', index = range(3))
s2 = pd.Series('IN', index = range(2))
# Series.append was removed in pandas 2.0; pd.concat does the same job
site = pd.concat([s1, s2], ignore_index=True)
# Prepare OfflineFrom and OfflineTo series with datetime
offf = pd.DataFrame({'year': [2017, 2017, 2017, 2017, 2017],
                     'month': [10, 10, 10, 10, 10],
                     'day': [2, 5, 10, 20, 25]})
offt = pd.DataFrame({'year': [2017, 2017, 2017, 2017, 2017],
                     'month': [10, 10, 10, 10, 10],
                     'day': [10, 10, 18, 23, 28]})
offf = pd.to_datetime(offf)
offt = pd.to_datetime(offt)
# Make a series with ShiftDays as 6
sd = pd.Series(6, index = range(5))
# Assemble all these to a single dataframe
site = pd.DataFrame({'Site': site, 'OfflineFrom': offf, 'OfflineTo': offt, 'ShiftDays': sd})
holiday = pd.DataFrame({'SiteID': ['ID', 'ID', 'IN'],
                        'Holiday': [datetime.strptime('07-09-2017', '%d-%m-%Y'),
                                    datetime.strptime('12-09-2017', '%d-%m-%Y'),
                                    datetime.strptime('08-09-2017', '%d-%m-%Y')]})
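# Series[:, None] is no longer supported in pandas 2.x; index the underlying array instead
test = pd.DataFrame((holiday.Holiday.values[:, None] >= site.OfflineFrom.values)
                    & (holiday.Holiday.values[:, None] <= site.OfflineTo.values))
x = holiday.Holiday.values[:, None]  # shape (3, 1): one row per holiday
y = site.OfflineFrom.values          # shape (5,): one entry per site row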
Answer 0 (score: 1)
You can use numpy broadcasting:
site.ShiftDays -= ((holiday.Holiday.values[:, None] >= site.OfflineFrom.values)
                   & (holiday.Holiday.values[:, None] <= site.OfflineTo.values)
                   & (holiday.SiteID.values[:, None] == site.Site.values)).sum(axis=0)
I haven't tested how efficient this is, though...
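For anyone who wants to try the broadcasting idea on the tables from the question, here is a minimal self-contained sketch. It assumes a recent pandas where `Series.to_numpy()` is available; the names `h_date`, `h_site`, `hits` and `adjusted` are only illustrative. Each holiday becomes a row and each site record a column, so summing the boolean matrix over axis 0 counts the matching holidays per site record. Note that under the stated rule the fourth row (IN, 2017-10-03 to 2017-10-11) also contains the IN holiday 2017-10-08, so this sketch gives 5 there rather than the 6 shown in the question's expected table.

```python
import pandas as pd

# Sample data copied from the tables in the question
site = pd.DataFrame({
    "OfflineFrom": pd.to_datetime(["2017-10-02", "2017-10-13", "2017-11-15",
                                   "2017-10-03", "2017-10-03"]),
    "OfflineTo": pd.to_datetime(["2017-10-10", "2017-11-10", "2017-12-09",
                                 "2017-10-11", "2017-10-10"]),
    "ShiftDays": [6, 6, 6, 6, 6],
    "Site": ["ID", "ID", "ID", "IN", "IN"],
})
holiday = pd.DataFrame({
    "Holiday": pd.to_datetime(["2017-10-07", "2017-10-08",
                               "2017-09-12", "2017-10-08"]),
    "SiteID": ["ID", "ID", "ID", "IN"],
})

# Column vectors of shape (4, 1): one row per holiday
h_date = holiday["Holiday"].to_numpy()[:, None]
h_site = holiday["SiteID"].to_numpy()[:, None]

# Broadcasting (4, 1) against (5,) gives a (4, 5) boolean matrix:
# hits[i, j] is True when holiday i belongs to the site of row j
# and lies inside that row's offline window (both ends inclusive)
hits = (
    (h_date >= site["OfflineFrom"].to_numpy())
    & (h_date <= site["OfflineTo"].to_numpy())
    & (h_site == site["Site"].to_numpy())
)

# Summing over axis 0 counts matching holidays per site row
adjusted = site["ShiftDays"] - hits.sum(axis=0)
print(adjusted.tolist())  # [4, 6, 6, 5, 5]
```

The intermediate matrix has one cell per (holiday, site row) pair, so memory grows with the product of the two table sizes; that is worth keeping in mind for large inputs.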