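I am having trouble selecting data from a pandas DataFrame with between_time. When the start and end of the query fall on two different days, the result is empty. I am using pandas 0.17.1 (Python 2.7).
I have the following DataFrame: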
from datetime import datetime
import pandas as pd
from pandas import Timestamp
mydf = pd.DataFrame.from_dict({'azi': {Timestamp('2015-05-12 00:00:14.348000'): 109.801,
Timestamp('2015-05-12 00:00:36.125000'): 109.994,
Timestamp('2015-05-12 00:00:57.599000'): 109.60299999999999,
Timestamp('2015-05-12 00:01:14.576000'): 100.2},
'ele': {Timestamp('2015-05-12 00:00:14.348000'): 180.001,
Timestamp('2015-05-12 00:00:36.125000'): 179.999,
Timestamp('2015-05-12 00:00:57.599000'): 179.999,
Timestamp('2015-05-12 00:01:14.576000'): 180.001}})
This gives:
                             azi      ele
2015-05-12 00:00:14.348  109.801  180.001
2015-05-12 00:00:36.125  109.994  179.999
2015-05-12 00:00:57.599  109.603  179.999
2015-05-12 00:01:14.576  100.200  180.001
The following query fails:
mydf['azi'].between_time(datetime(2015, 5, 11, 23, 59, 59, 850000), datetime(2015, 5, 12, 0, 1, 59, 850000))
It returns:
Series([], Name: azi, dtype: float64)
However, the following query works:
mydf['azi'].between_time(datetime(2015, 5, 11, 0, 0, 0, 0), datetime(2015, 5, 12, 0, 1, 59, 850000))
and returns the correct answer:
2015-05-12 00:00:14.348 109.801
2015-05-12 00:00:36.125 109.994
2015-05-12 00:00:57.599 109.603
2015-05-12 00:01:14.576 100.200
Name: azi, dtype: float64
Questions:
Answer 0: (score: 0)
You can find plenty of information on working with a datetime index in the docs. In your case, you can try loc:
In [147]: mydf['azi'].loc[datetime(2015, 5, 11, 23, 59, 59, 850000): datetime(2015, 5, 12, 0, 1, 59, 850000)]
Out[147]:
2015-05-12 00:00:14.348 109.801
2015-05-12 00:00:36.125 109.994
2015-05-12 00:00:57.599 109.603
2015-05-12 00:01:14.576 100.200
Name: azi, dtype: float64
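For reference, here is a self-contained sketch of the same selection. It rebuilds only the azi column from the question and assumes a reasonably recent pandas; the names azi, start, end, selected and masked are just for illustration. The boolean mask is an equivalent alternative to the label slice, not something the question used.

```python
from datetime import datetime

import pandas as pd

# Rebuild the 'azi' column from the question with a sorted DatetimeIndex.
index = pd.to_datetime([
    "2015-05-12 00:00:14.348",
    "2015-05-12 00:00:36.125",
    "2015-05-12 00:00:57.599",
    "2015-05-12 00:01:14.576",
])
azi = pd.Series([109.801, 109.994, 109.603, 100.2], index=index, name="azi")

# Full datetimes: the window starts late on May 11 and ends on May 12.
start = datetime(2015, 5, 11, 23, 59, 59, 850000)
end = datetime(2015, 5, 12, 0, 1, 59, 850000)

# Label-based slicing on a DatetimeIndex compares date and time together,
# and both end points are inclusive.
selected = azi.loc[start:end]

# The same window written as an explicit boolean mask.
masked = azi[(azi.index >= start) & (azi.index <= end)]

print(selected)
```

Both forms compare complete timestamps, so a window that crosses midnight needs no special handling.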
That addresses your bullet points. As for 1), see the explanation from @Jeff.
Answer 1: (score: 0)
The docstring says it all: between_time selects by time of day.
In [67]: mydf.between_time?
Signature: mydf.between_time(start_time, end_time, include_start=True, include_end=True)
Docstring:
Select values between particular times of the day (e.g., 9:00-9:30 AM)
Parameters
----------
start_time : datetime.time or string
end_time : datetime.time or string
include_start : boolean, default True
include_end : boolean, default True
Returns
-------
values_between_time : type of caller
File: ~/pandas/pandas/core/generic.py
Type: instancemethod
In [68]: mydf
Out[68]:
                             azi      ele
2015-05-12 00:00:14.348  109.801  180.001
2015-05-12 00:00:36.125  109.994  179.999
2015-05-12 00:00:57.599  109.603  179.999
2015-05-12 00:01:14.576  100.200  180.001
In [70]: mydf.between_time('00:00:30','00:01:00')
Out[70]:
                             azi      ele
2015-05-12 00:00:36.125  109.994  179.999
2015-05-12 00:00:57.599  109.603  179.999
Alternatively, you can use partial-string indexing (see here) to select based on dates; the bounds can be strings or datetimes.
In [73]: mydf.loc['20150512 00:00:30':'20150512 00:01:00']
Out[73]:
                             azi      ele
2015-05-12 00:00:36.125  109.994  179.999
2015-05-12 00:00:57.599  109.603  179.999
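I think .between_time should actually raise on objects that are neither a .time nor convertible from a string, but IIRC it was done this way for ease of implementation.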
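Until something like that exists, a caller can guard against it by hand. The wrapper below is only a sketch of the idea: strict_between_time is a hypothetical helper, not part of pandas, and the check it performs is my assumption of what "raise on non-time input" would look like.

```python
from datetime import datetime, time

import pandas as pd


def strict_between_time(obj, start_time, end_time):
    """Hypothetical helper: call between_time, but reject full datetimes."""
    for value in (start_time, end_time):
        # datetime.datetime is not a subclass of datetime.time, so full
        # datetimes (and pandas Timestamps) are rejected here.
        if not isinstance(value, (time, str)):
            raise TypeError(
                "expected datetime.time or str, got %s" % type(value).__name__
            )
    return obj.between_time(start_time, end_time)


index = pd.to_datetime([
    "2015-05-12 00:00:14.348",
    "2015-05-12 00:00:36.125",
    "2015-05-12 00:00:57.599",
    "2015-05-12 00:01:14.576",
])
azi = pd.Series([109.801, 109.994, 109.603, 100.2], index=index, name="azi")

# Strings that parse as times of day are accepted.
ok = strict_between_time(azi, "00:00:30", "00:01:00")

# Full datetimes raise instead of being silently reduced to a time of day.
try:
    strict_between_time(
        azi,
        datetime(2015, 5, 11, 23, 59, 59, 850000),
        datetime(2015, 5, 12, 0, 1, 59, 850000),
    )
    raised = False
except TypeError:
    raised = True
```

If you really do want a time-of-day filter and only have datetimes at hand, passing start.time() and end.time() makes that intent explicit.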