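I have daily data and a loop that finds the third Friday of a month and then sets a column's value to 2 for every day within 20 days of that third Friday. However, the flag is only applied to the days after the third Friday, and I don't understand why. My dataframe "merged" looks like this: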
Date ID Window
01/01/2000 1 0
01/01/2000 1 0
02/01/2000 2 0
02/01/2000 2 0
The current code is as follows:
#Get third friday in a month Friday:
c = calendar.Calendar(firstweekday=calendar.SUNDAY)
year = 2000; month = 3
monthcal = c.monthdatescalendar(year,month)
third_friday = [day for week in monthcal for day in week if \
day.weekday() == calendar.FRIDAY and \
day.month == month][2]
#Loop through dates to change the window column:
for beg in pd.date_range("2000-01-01", "2017-05-01"):
    beg= third_friday
    merged["window"].loc[beg: beg + pd.to_timedelta(20,"D")] = 2
    merged["window"].loc[beg: beg - pd.to_timedelta(20,"D")] = 2
    #repeat the same for the next Fridays:
    if month==12:
        year=year+1
        month=0
    if year>=2017 and month>=3:
        break
    month = month +3
    monthcal = c.monthdatescalendar(year,month)
    third_friday = [day for week in monthcal for day in week if \
                    day.weekday() == calendar.FRIDAY and \
                    day.month == month][2]
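When I run this code, the window column is not set to 2 for the days before the third Friday. Only the 20 days after the third Friday become 2. Does anyone know what I am doing wrong?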
Answer 0 (score: 1)
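The easiest approach is to define a method that computes the third Friday of a month, given a year and a month. Either use your own approach with calendar, or something like this also works: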
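def third_friday_of(year, month):
    # The third Friday always falls on day 15 through 21 of the month.
    daterange = pd.date_range(start='%i/%i/15' % (year, month), end='%i/%i/21' % (year, month), freq='D')
    # weekday == 4 is Friday; exactly one day in this 7-day range matches.
    return daterange[daterange.weekday == 4][0]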
This returns a pandas.Timestamp, but that is a subclass of datetime.datetime, so it should not cause any further problems in your program.
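As a quick sanity check, here is a minimal sketch that compares a day-15-to-21 variant of the function with the calendar-based approach from the question. The helper name third_friday_calendar and the use of pd.Timestamp(year, month, 15) with periods=7 are my own choices for illustration, not part of the original code.

```python
import calendar
import pandas as pd

def third_friday_of(year, month):
    # The third Friday of any month falls on day 15 through 21.
    candidates = pd.date_range(start=pd.Timestamp(year, month, 15), periods=7, freq="D")
    return candidates[candidates.weekday == 4][0]

def third_friday_calendar(year, month):
    # Hypothetical helper wrapping the question's list comprehension.
    c = calendar.Calendar(firstweekday=calendar.SUNDAY)
    fridays = [day for week in c.monthdatescalendar(year, month) for day in week
               if day.weekday() == calendar.FRIDAY and day.month == month]
    return fridays[2]

# Both approaches should agree on every month checked.
for year, month in [(2000, 3), (2000, 6), (2017, 3)]:
    ts = third_friday_of(year, month)
    assert ts.date() == third_friday_calendar(year, month)
    print(year, month, ts.date())
```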
I would also define a separate method that actually modifies the DataFrame, taking the interval and the window as parameters:
def process_dataframe(df, begin_year, begin_month, end_year, end_month, interval_months=3, window=20):
    # Assumes df has a sorted DatetimeIndex.
    # 'MS' (month start) yields one date every interval_months months, end month included.
    dates = pd.date_range(start=pd.Timestamp(begin_year, begin_month, 1), end=pd.Timestamp(end_year, end_month, 1), freq='%iMS' % interval_months)
    for d in dates:
        third_friday = third_friday_of(d.year, d.month)
        # print(d, third_friday)
        # Flag every row within `window` days on either side of the third Friday.
        df.loc[third_friday - pd.Timedelta(window, unit='D') : third_friday + pd.Timedelta(window, unit='D'), 'Window'] = 2
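To show the idea end to end, here is a small self-contained sketch on made-up data. The frame below is only a hypothetical stand-in for merged (two rows per day for the year 2000, indexed by date), and the loop is inlined rather than wrapped in a function, so treat it as an illustration under those assumptions.

```python
import pandas as pd

def third_friday_of(year, month):
    # The third Friday of any month falls on day 15 through 21.
    days = pd.date_range(start=pd.Timestamp(year, month, 15), periods=7, freq="D")
    return days[days.weekday == 4][0]

# Hypothetical stand-in for `merged`: two rows per day, sorted DatetimeIndex.
idx = pd.date_range("2000-01-01", "2000-12-31", freq="D").repeat(2)
merged = pd.DataFrame({"ID": 1, "Window": 0}, index=idx)

window = pd.Timedelta(20, unit="D")
# One month start every three months: March, June, September, December.
for month_start in pd.date_range("2000-03-01", "2000-12-01", freq="3MS"):
    tf = third_friday_of(month_start.year, month_start.month)
    # Row slice and column label in a single .loc call.
    merged.loc[tf - window : tf + window, "Window"] = 2

flagged = merged.index[merged["Window"] == 2]
print(flagged.min().date(), flagged.max().date())
```

Days on both sides of each third Friday end up flagged, because the earlier date is always the start of the slice.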
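The reason it probably doesn't work for you is that
merged["window"].loc[beg: beg - pd.to_timedelta(20,"D")] = 2
should be
merged["window"].loc[beg - pd.to_timedelta(20,"D"):beg] = 2
A label slice has to run from the earlier date to the later one. With the start after the end, the slice is empty, so nothing is assigned.
merged["window"].loc[beg: beg + pd.to_timedelta(20,"D")] = 2
has a second problem of its own. With merged["window"] you take a Series first, and it is not 100% clear or guaranteed whether you get a view or a copy. It is better to do the selection and the assignment in a single .loc, as in my code.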
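The slice-order point is easy to verify on a toy frame. This is a minimal sketch with made-up daily data (the frame and the fixed date 2000-03-17 are assumptions for illustration), showing that the reversed slice selects nothing while the ordered one selects the 21 days up to and including the third Friday.

```python
import pandas as pd

# Hypothetical daily frame standing in for `merged`.
idx = pd.date_range("2000-02-01", "2000-04-30", freq="D")
merged = pd.DataFrame({"window": 0}, index=idx)

beg = pd.Timestamp("2000-03-17")  # third Friday of March 2000
delta = pd.to_timedelta(20, "D")

# Start label after end label: the slice is empty, so an assignment does nothing.
reversed_rows = merged.loc[beg : beg - delta]
print(len(reversed_rows))

# Earlier label first: rows from 2000-02-26 through 2000-03-17.
ordered_rows = merged.loc[beg - delta : beg]
print(len(ordered_rows))

# Single .loc call with both the row slice and the column label,
# which avoids assigning through an intermediate Series.
merged.loc[beg - delta : beg + delta, "window"] = 2
```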