How do I make the date level of a MultiIndex a datetime index?

Time: 2019-02-20 19:44:02

Tags: pandas

I have this df:

                      open     high      low    close    volume
date       symbol                                              
2014-02-20 AAPL    69.9986  70.5252  69.4746  69.7569  76529103
           MSFT    33.5650  33.8331  33.4087  33.7259  27541038
2014-02-21 AAPL    69.9727  70.2061  68.8967  68.9821  69757247
           MSFT    33.8956  34.2619  33.8241  33.9313  38030656
2014-02-24 AAPL    68.7063  69.5954  68.6104  69.2841  72364950
           MSFT    33.6723  33.9269  33.5382  33.6723  32143395

It is produced by this code:

from datetime import datetime
from iexfinance.stocks import get_historical_data
from pandas_datareader import data
import matplotlib.pyplot as plt
import pandas as pd

start =  '2014-01-01'
end = datetime.today().utcnow()
symbol = ['AAPL', 'MSFT']

prices = pd.DataFrame()
datasets_test = []
for d in symbol:
    data_original = data.DataReader(d, 'iex', start, end)
    data_original['symbol'] = d
    prices = pd.concat([prices,data_original],axis=0)
prices = prices.set_index(['symbol'], append=True)
prices.sort_index(inplace=True)

When I try to get the day of the week:

A['day_of_week'] = features.index.get_level_values('date').weekday

I get this error message:

  

AttributeError: 'Index' object has no attribute 'weekday'

I tried converting the date index to datetime:

prices['date'] = pd.to_datetime(prices['date'])

But I got this error:

  

KeyError: 'date'

Any idea how I can keep both index levels (date + symbol) but convert one of them (date) to datetime, so that I can get the day of the week?

1 answer:

Answer 0: (score: 1)

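It looks like the date level of your index contains strings rather than datetime objects. The KeyError occurs because date is an index level, not a column, so prices['date'] cannot find it. One solution is to reset all MultiIndex levels to columns, convert the date column to datetime, and then set the MultiIndex back. After that you can use pandas datetime attributes such as .weekday as usual.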

prices = prices.reset_index()
prices['date'] = pd.to_datetime(prices['date'])
prices = prices.set_index(['date', 'symbol'])

prices.index.get_level_values('date').weekday
Int64Index([3, 3, 4, 4, 0, 0, 1, 1, 2, 2,
            ...
            1, 1, 2, 2, 3, 3, 4, 4, 1, 1],
           dtype='int64', name='date', length=2516)
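
As a possible alternative to resetting the index, the date level can be converted in place with MultiIndex.set_levels. The following is only a minimal sketch: the small frame is a hypothetical stand-in built from a few rows of the question's data (not the real IEX download), and it assumes the date level is the first level and holds ISO-formatted strings.

```python
import pandas as pd

# Hypothetical stand-in for the question's frame: the 'date' level holds strings.
idx = pd.MultiIndex.from_product(
    [["2014-02-20", "2014-02-21", "2014-02-24"], ["AAPL", "MSFT"]],
    names=["date", "symbol"],
)
prices = pd.DataFrame(
    {"close": [69.7569, 33.7259, 68.9821, 33.9313, 69.2841, 33.6723]},
    index=idx,
)

# Diagnostic: a non-datetime dtype here (object or string) explains the
# missing .weekday attribute.
print(prices.index.get_level_values("date").dtype)

# Convert only the unique values of the 'date' level (assumed to be level 0);
# the 'symbol' level and the row order are left untouched.
date_level = pd.to_datetime(prices.index.levels[0])
prices.index = prices.index.set_levels(date_level, level="date")

# The level is now a DatetimeIndex, so .weekday is available (Monday = 0).
prices["day_of_week"] = prices.index.get_level_values("date").weekday
print(prices)
```

This converts each distinct date once instead of every row, which may matter on long histories. The reset_index approach above is arguably easier to read, though.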