Pandas DateTime comparison does not produce accurate results

Asked: 2015-06-29 12:57:40

Tags: python datetime pandas filter dataframe

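I have a DataFrame that I am trying to filter down to the rows where the value in the date column falls between StartDate and FinishDate. To do this, I create datetime columns from the string values of these dates with pandas.to_datetime, and then filter on those columns.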

    result['date'] = pd.to_datetime(result.DateCreated)
    result['StartDate'] = pd.to_datetime(result.StartDate)
    result['FinishDate'] = pd.to_datetime(result.FinishDate)
    result = result[(result.date >= result.StartDate) &
                    (result.date <= result.FinishDate)]
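
For reference, here is a minimal self-contained sketch of the same approach, built from three rows of the sample data in this question. The explicit `format` strings and the names `in_range` and `filtered` are assumptions added for illustration; they are not part of the original code.

```python
import pandas as pd

# Three rows copied from the question's sample data, as strings before conversion.
result = pd.DataFrame({
    "DateCreated": ["2009-06-08 00:00:00", "2009-10-08 00:00:00", "2010-01-28 00:00:00"],
    "StartDate": ["2009-05-01", "2009-08-01", "2010-01-01"],
    "FinishDate": ["2009-06-30", "2009-12-31", "2010-04-30"],
})

# Parse with explicit formats (an assumption) so day/month order is never inferred.
result["date"] = pd.to_datetime(result["DateCreated"], format="%Y-%m-%d %H:%M:%S")
result["StartDate"] = pd.to_datetime(result["StartDate"], format="%Y-%m-%d")
result["FinishDate"] = pd.to_datetime(result["FinishDate"], format="%Y-%m-%d")

# Row-wise mask: each date is compared with the bounds on its own row.
in_range = (result["date"] >= result["StartDate"]) & (result["date"] <= result["FinishDate"])
filtered = result[in_range]
print(filtered)
```

With cleanly parsed input like this, every row passes the filter, which is the behaviour I expected from my own data.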

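A portion of the data used is below. The StartDate and FinishDate columns on the left hold the values after the code above has run; the ones on the right are the initial values, which I included in case the problem lies in to_datetime.
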
,date,StartDate,FinishDate,startboundry,finishboundry,DateCreated,StartDate,FinishDate
0,2009-06-08,2009-05-01,2009-06-30,False,True,2009-06-08 00:00:00,2009-05-01,2009-06-30
1,2009-10-08,2009-08-01,2009-12-31,False,True,2009-10-08 00:00:00,2009-08-01,2009-12-31
2,2010-01-28,2010-01-01,2010-04-30,False,True,2010-01-28 00:00:00,2010-01-01,2010-04-30
3,2010-05-27,2010-05-01,2010-06-30,False,True,2010-05-27 00:00:00,2010-05-01,2010-06-30
4,2010-09-22,2010-08-01,2010-12-31,False,True,2010-09-22 00:00:00,2010-08-01,2010-12-31
5,2011-01-13,2011-01-01,2011-04-30,False,True,2011-01-13 00:00:00,2011-01-01,2011-04-30
6,2011-05-26,2011-05-01,2011-06-30,False,True,2011-05-26 00:00:00,2011-05-01,2011-06-30
7,2009-01-20,2009-01-01,2009-04-30,False,True,2009-01-20 00:00:00,2009-01-01,2009-04-30
8,2009-05-11,2009-05-01,2009-06-30,False,True,2009-05-11 00:00:00,2009-05-01,2009-06-30
9,2009-10-05,2009-08-01,2009-12-31,False,True,2009-10-05 00:00:00,2009-08-01,2009-12-31

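Several of these rows evaluate the first condition, (result.date >= result.StartDate), as False even though it is clearly true.

For example, 2009-06-08 comes after 2009-05-01 both chronologically and lexicographically, so the result would be wrong even if this were only a string comparison.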
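
One way to check that claim in isolation is sketched below, using row 0 of the data. It assumes the `startboundry` and `finishboundry` columns are plain per-row comparisons, which is a guess at how they were produced. This sketch does not reproduce the False value; if the same check on the real frame gives False, the cause is likely somewhere upstream of the comparison.

```python
import pandas as pd

# Row 0 of the question's data, as strings before conversion.
result = pd.DataFrame({
    "DateCreated": ["2009-06-08 00:00:00"],
    "StartDate": ["2009-05-01"],
    "FinishDate": ["2009-06-30"],
})
date_columns = ["DateCreated", "StartDate", "FinishDate"]
for col in date_columns:
    result[col] = pd.to_datetime(result[col])

# Each column should report a datetime64 dtype rather than object.
print(result.dtypes)

# Hypothetical reconstruction of the two diagnostic columns.
result["startboundry"] = result["DateCreated"] >= result["StartDate"]
result["finishboundry"] = result["DateCreated"] <= result["FinishDate"]
print(result[["startboundry", "finishboundry"]])
```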

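Edit to add some version information, which I collected while making sure the python, pandas, etc. versions were the same, shared here in case it is relevant:

pandas version 0.16.2, python version 2.7.9, ipython 3.2.0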

1 answer:

Answer 0 (score: 0)

To filter the DataFrame you can use between:

    df[df.date1.between(date2,date3)]
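
A small runnable sketch of that suggestion follows, assuming `date2` and `date3` are meant to be columns of the same DataFrame (the line above leaves that open). The third row is made up to show a value outside its range.

```python
import pandas as pd

df = pd.DataFrame({
    "date1": pd.to_datetime(["2009-06-08", "2009-10-08", "2010-07-15"]),
    "date2": pd.to_datetime(["2009-05-01", "2009-08-01", "2010-05-01"]),
    "date3": pd.to_datetime(["2009-06-30", "2009-12-31", "2010-06-30"]),
})

# Series.between accepts list-like bounds, so each date1 value is checked
# against the date2/date3 values on the same row; both ends are inclusive by default.
mask = df.date1.between(df.date2, df.date3)
inside = df[mask]
print(inside)
```

This is equivalent to combining the `>=` and `<=` comparisons with `&`, so it shortens the filter but would not by itself explain a comparison that returns an unexpected False.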