Question

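I am working with a pandas Series that has a DatetimeIndex. The desired outcome is a DataFrame containing all rows within the range specified in the .loc[] call.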

When I try the following code:

aapl.index = pd.to_datetime(aapl.index)
print(aapl.loc[pd.Timestamp('2010-11-01'):pd.Timestamp('2010-12-30')])

I get back:

Empty DataFrame
Columns: [Open, High, Low, Close, Volume, ExDividend, SplitRatio, 
AdjOpen, AdjHigh, AdjLow, AdjClose, AdjVolume]
Index: []

Just to reiterate, my desired outcome is a subset of the DataFrame containing all rows within the range (2010-11-01):(2010-12-30).

Answer 1

IIUC：

import pandas_datareader as web
aapl = web.get_data_yahoo('aapl')

aapl.loc['2010-11-01':'2010-12-30']

Use partial string indexing and slice.
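To illustrate partial string indexing without a network call, here is a minimal sketch on synthetic data. The frame `prices` and its `Close` column are made up for the example and stand in for `aapl`; only the slicing pattern matters.

```python
import pandas as pd

# Synthetic stand-in for the price data: one row per day of 2010.
idx = pd.date_range("2010-01-01", periods=365, freq="D")
prices = pd.DataFrame({"Close": range(365)}, index=idx)

# String slices on a DatetimeIndex include both endpoints.
window = prices.loc["2010-11-01":"2010-12-30"]

# A partial date string selects every row in the matching period.
november = prices.loc["2010-11"]

print(window.index.min(), window.index.max())
print(len(window), len(november))
```

Note that this relies on the index being sorted in ascending order, which is the case for `pd.date_range`.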

Answer 2

It seems you need to convert the index to datetime, then use standard indexing/slicing notation.

import pandas as pd

df = pd.DataFrame(list(range(365)))

# these lines are for demonstration purposes only
df['date'] = pd.date_range('2010-1-1', periods=365, freq='D').astype(str)
df = df.set_index('date')

df.index = pd.to_datetime(df.index)

res = df[pd.Timestamp('2010-11-01'):pd.Timestamp('2010-11-10')]

#               0
# date           
# 2010-11-01  304
# 2010-11-02  305
# 2010-11-03  306
# 2010-11-04  307
# 2010-11-05  308
# 2010-11-06  309
# 2010-11-07  310
# 2010-11-08  311
# 2010-11-09  312
# 2010-11-10  313

Answer 3

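Out of curiosity, I tried putting the most recent date as the start of the selection and the earlier date as the end. To my surprise this worked, but the time series data came back in reverse order.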

In:

aapl.loc[pd.Timestamp('2010-12-30'):pd.Timestamp('2010-11-01')]

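So... duh, I realized my time series data must be stored in reverse order. The question now is, how do I sort a DataFrame with a DatetimeIndex into the correct order?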

The desired order has the earliest date as the first row and the nth (latest) date as the last row.

****** ****** EDIT

aapl.index = pd.to_datetime(aapl.index)
aapl =  aapl.sort_index(ascending=True)

aaplrange = aapl.loc[pd.Timestamp('2010-11-01'):pd.Timestamp('2010-12-30')]

Works!
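For anyone who wants to reproduce this, here is a small sketch on made-up data stored newest-first. The frame `df` and its `Close` column are hypothetical; the point is that an ascending slice on a descending index comes back empty, and that checking `is_monotonic_increasing` before slicing catches the problem.

```python
import pandas as pd

# Hypothetical data stored newest-first, like the frame in the question.
idx = pd.date_range("2010-01-01", periods=365, freq="D")[::-1]
df = pd.DataFrame({"Close": range(365)}, index=idx)

start, end = pd.Timestamp("2010-11-01"), pd.Timestamp("2010-12-30")

# On a descending index, an ascending slice selects nothing.
empty = df.loc[start:end]

# Check the ordering and sort ascending before slicing.
if not df.index.is_monotonic_increasing:
    df = df.sort_index()

subset = df.loc[start:end]
print(len(empty), len(subset))
```

Sorting once up front is probably simpler than remembering to reverse the slice endpoints everywhere.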

Selecting a range of DatetimeIndex rows using .loc (Pandas, Python 3)
