Flag dates that fall within a range

Time: 2015-11-16 13:43:44

Tags: python pandas

I have the following DataFrame:

     exdiv_date    expiry_date
0    2015-09-18    2015-12-18
1    2015-11-20    2015-12-18
2    NaN           2016-01-20
3    2015-12-26    2016-01-15
4    NaN           2015-11-21

I need to flag every row whose exdiv_date is after today and before expiry_date. The output should be:

     exdiv_date    expiry_date   flag
0    2015-09-18    2015-12-18    False
1    2015-11-20    2015-12-18    True
2    NaN           2016-01-20    False
3    2015-12-26    2016-01-15    True
4    NaN           2015-11-21    False

As the example shows, some rows have no exdiv_date (i.e. NaN). I made sure that exdiv_date and expiry_date are datetime columns as follows:

df['exdiv_date'] = pd.to_datetime(df['exdiv_date'])
df['expiry_date'] = pd.to_datetime(df['expiry_date'])

I tried this:

mask = (df['exdiv_date'] > dt.date.today) & (df['exdiv_date'] < df['expiry_date'])
df.loc[mask, 'flag'] = True

But this raises an error. I think it is caused by the NaN values, but I am not sure how to work around it.

1 answer:

Answer 0: (score: 1)

The problem is the missing parentheses: use dt.date.today() rather than dt.date.today.
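To see why the parentheses matter, here is a minimal sketch using only the standard library. The names `method` and `value` are made up for illustration. Without parentheses you get a reference to the method itself, which cannot be meaningfully compared with a column of dates.

```python
import datetime as dt

# Without parentheses: a reference to the method, not a date
method = dt.date.today
# With parentheses: the method is called and returns a date object
value = dt.date.today()

print(callable(method))             # True
print(isinstance(method, dt.date))  # False
print(isinstance(value, dt.date))   # True
```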

Optionally, you can use np.where to fill the flag column with True or False in a single step.

import datetime as dt
import numpy as np

# df after the pd.to_datetime conversion:
#  exdiv_date expiry_date
#0 2015-09-18  2015-12-18
#1 2015-11-20  2015-12-18
#2        NaT  2016-01-20
#3 2015-12-26  2016-01-15
#4        NaT  2015-11-21

# Call today() with parentheses so the comparison uses an actual date
mask = (df['exdiv_date'] > dt.date.today()) & (df['exdiv_date'] < df['expiry_date'])
# Only the matching rows are set; all other rows stay NaN
df.loc[mask, 'flag'] = True
print(df)
#  exdiv_date expiry_date  flag
#0 2015-09-18  2015-12-18   NaN
#1 2015-11-20  2015-12-18  True
#2        NaT  2016-01-20   NaN
#3 2015-12-26  2016-01-15  True
#4        NaT  2015-11-21   NaN

# If the condition holds, write True to the flag column, otherwise False
df['flag'] = np.where((df['exdiv_date'] > dt.date.today()) & (df['exdiv_date'] < df['expiry_date']), True, False)
print(df)
#  exdiv_date expiry_date   flag
#0 2015-09-18  2015-12-18  False
#1 2015-11-20  2015-12-18   True
#2        NaT  2016-01-20  False
#3 2015-12-26  2016-01-15   True
#4        NaT  2015-11-21  False
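
For anyone who wants to reproduce this end to end, here is a self-contained sketch; it is not the only way to do it. It makes two assumptions that are not in the original post. First, the reference date is fixed to the day the question was asked so the result is repeatable. Second, it compares against a pd.Timestamp instead of dt.date.today(), because recent pandas versions may raise a TypeError when a datetime64 column is compared with a plain datetime.date. The boolean mask is assigned directly, so np.where is not needed.

```python
import pandas as pd

# Sample data copied from the question; None stands in for the missing dates
df = pd.DataFrame({
    "exdiv_date": ["2015-09-18", "2015-11-20", None, "2015-12-26", None],
    "expiry_date": ["2015-12-18", "2015-12-18", "2016-01-20", "2016-01-15", "2015-11-21"],
})
df["exdiv_date"] = pd.to_datetime(df["exdiv_date"])    # None becomes NaT
df["expiry_date"] = pd.to_datetime(df["expiry_date"])

# Assumption: fixed reference date for reproducibility; in real use this
# could be pd.Timestamp.today().normalize()
today = pd.Timestamp("2015-11-16")

# Comparisons involving NaT evaluate to False, so rows without an
# exdiv_date end up flagged False without special handling
df["flag"] = (df["exdiv_date"] > today) & (df["exdiv_date"] < df["expiry_date"])
print(df)
```

With the fixed reference date, rows 1 and 3 are flagged True and the rest False, which matches the output requested in the question.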