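I can select the rows of a pandas DataFrame that fall between two dates by first setting the datetime column created as the index and then slicing the DataFrame. Now I want to run a new query that also involves a second datetime column, modifieddate, i.e.: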
df = df.set_index(['created'])
print (df)
                       name        modifieddate
created
2014-01-01 16:07:07    john 2014-01-01 16:07:07
2014-01-04 16:07:07  harold 2014-01-04 16:07:07
2014-01-04 16:07:07   clara 2014-01-04 18:07:07
2014-01-05 16:07:07   emily 2014-01-06 16:07:07
2014-01-08 16:07:07  smiths 2014-01-08 16:07:07
2014-01-09 20:07:07   clara 2014-01-09 20:07:07
2014-01-10 18:07:07   clara 2014-01-10 18:07:07
2014-01-10 16:07:07    john 2014-01-11 16:07:07
I want to select the rows where created and modifieddate are equal and fall between the given datetimes 2014-01-04 16:07:07 and 2014-01-10 16:07:07:
                       name        modifieddate
created
2014-01-04 16:07:07  harold 2014-01-04 16:07:07
2014-01-08 16:07:07  smiths 2014-01-08 16:07:07
2014-01-09 20:07:07   clara 2014-01-09 20:07:07
Answer 0 (score: 2)
You can use between with boolean indexing:
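s = '2014-01-04 16:07:07'
e = '2014-01-10 16:07:07'
# between() is inclusive at both ends by default
# keep rows where both datetimes lie in [s, e] and created equals modifieddate
df = df[(df.index.to_series().between(s, e)) &
        (df.modifieddate.between(s, e)) &
        (df.index == df.modifieddate)]
print(df)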
                       name        modifieddate
created
2014-01-04 16:07:07  harold 2014-01-04 16:07:07
2014-01-08 16:07:07  smiths 2014-01-08 16:07:07
2014-01-09 20:07:07   clara 2014-01-09 20:07:07
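For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The sample frame is rebuilt by hand from the data in the question, and the helper name unmodified_between is hypothetical, not part of pandas. It compares raw values rather than labelled Series, which is one way to sidestep alignment on the duplicated index label 2014-01-04 16:07:07.

```python
import pandas as pd

# Sample data copied from the question; constructing it this way is an assumption.
df = pd.DataFrame({
    "created": pd.to_datetime([
        "2014-01-01 16:07:07", "2014-01-04 16:07:07", "2014-01-04 16:07:07",
        "2014-01-05 16:07:07", "2014-01-08 16:07:07", "2014-01-09 20:07:07",
        "2014-01-10 18:07:07", "2014-01-10 16:07:07",
    ]),
    "name": ["john", "harold", "clara", "emily",
             "smiths", "clara", "clara", "john"],
    "modifieddate": pd.to_datetime([
        "2014-01-01 16:07:07", "2014-01-04 16:07:07", "2014-01-04 18:07:07",
        "2014-01-06 16:07:07", "2014-01-08 16:07:07", "2014-01-09 20:07:07",
        "2014-01-10 18:07:07", "2014-01-11 16:07:07",
    ]),
}).set_index("created")


def unmodified_between(frame, start, end):
    """Return rows whose index equals modifieddate and lies in [start, end].

    Hypothetical helper for illustration only.
    """
    created = frame.index.to_series()
    # Compare plain arrays so duplicate index labels cannot affect alignment.
    same = created.to_numpy() == frame["modifieddate"].to_numpy()
    # between() is inclusive at both ends by default.
    in_range = created.between(start, end).to_numpy()
    return frame[same & in_range]


result = unmodified_between(df, "2014-01-04 16:07:07", "2014-01-10 16:07:07")
print(result)
```

Since the two datetimes are required to be equal, checking the range on created alone is enough here; the separate between check on modifieddate selects the same rows.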
Answer 1 (score: 1)
This assumes your created column is not the index.
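import datetime
# boolean mask with .loc (.ix is no longer available in current pandas)
df2 = df.loc[(df.created == df.modifieddate) & (df.created >= datetime.datetime(2014, 1, 4, 16, 7, 7))
             & (df.created <= datetime.datetime(2014, 1, 10, 16, 7, 7))]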
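The same idea can be written a bit more compactly with Series.between. The sketch below is only an illustration: it assumes created is an ordinary column on a frame with the default integer index, and it rebuilds the sample data from the question by hand. The names start, end and mask are made up for the example.

```python
import pandas as pd

# Sample data from the question, with created kept as a regular column (assumption).
df = pd.DataFrame({
    "created": pd.to_datetime([
        "2014-01-01 16:07:07", "2014-01-04 16:07:07", "2014-01-04 16:07:07",
        "2014-01-05 16:07:07", "2014-01-08 16:07:07", "2014-01-09 20:07:07",
        "2014-01-10 18:07:07", "2014-01-10 16:07:07",
    ]),
    "name": ["john", "harold", "clara", "emily",
             "smiths", "clara", "clara", "john"],
    "modifieddate": pd.to_datetime([
        "2014-01-01 16:07:07", "2014-01-04 16:07:07", "2014-01-04 18:07:07",
        "2014-01-06 16:07:07", "2014-01-08 16:07:07", "2014-01-09 20:07:07",
        "2014-01-10 18:07:07", "2014-01-11 16:07:07",
    ]),
})

start = pd.Timestamp("2014-01-04 16:07:07")
end = pd.Timestamp("2014-01-10 16:07:07")

# Rows that were never modified and were created within [start, end] (inclusive).
mask = (df["created"] == df["modifieddate"]) & df["created"].between(start, end)
df2 = df.loc[mask]
print(df2)
```

If created is already the index, calling df.reset_index() first gives a frame of this shape.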