I want to use pandas_datareader to fetch stock data. I have the data, but the columns come back as a MultiIndex.
_data.columns
MultiIndex(levels=[['High', 'Low', 'Open', 'Close', 'Volume', 'Adj Close'], ['MSFT']],
codes=[[0, 1, 2, 3, 4, 5], [0, 0, 0, 0, 0, 0]],
names=['Attributes', 'Symbols'])
from pandas_datareader import data as pdr
import yfinance
_data = pdr.get_data_yahoo(['MSFT'], start='2019-01-01', end='2019-05-30')
The format I want has a single-level index, so that I can use the data for plotting:
symbol date price
0 MSFT 2000-01-01 39.81
1 MSFT 2000-02-01 36.35
2 MSFT 2000-03-01 43.22
3 MSFT 2000-04-01 28.37
4 MSFT 2000-05-01 25.45
Answer 0 (score: 3)
There are several "price" columns to choose from; I picked 'Adj Close'. This is essentially the same as ChrisA's comment.
_data.stack()['Adj Close'].reset_index(name='Price')
Date Symbols Price
0 2019-01-02 MSFT 100.318642
1 2019-01-03 MSFT 96.628120
2 2019-01-04 MSFT 101.122223
3 2019-01-07 MSFT 101.251190
4 2019-01-08 MSFT 101.985329
.. ... ... ...
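To get from there to the symbol/date/price layout asked for in the question, the same stack-based approach can be followed by a rename and a column reorder. The sketch below is only an illustration: it builds a small synthetic frame (values copied from the output above) instead of downloading data, and the names `long_df` and the lowercase column labels are assumptions made for the example.

```python
import pandas as pd

# Synthetic stand-in for the downloaded frame, so the example runs offline.
dates = pd.to_datetime(["2019-01-02", "2019-01-03", "2019-01-04"])
columns = pd.MultiIndex.from_product(
    [["Close", "Adj Close"], ["MSFT"]], names=["Attributes", "Symbols"]
)
_data = pd.DataFrame(
    [[101.120003, 100.318642], [97.400002, 96.628120], [101.930000, 101.122223]],
    index=pd.Index(dates, name="Date"),
    columns=columns,
)

# Move the Symbols level into the row index, keep one price column,
# then flatten the index and reorder the columns.
long_df = (
    _data.stack(level="Symbols")["Adj Close"]
    .reset_index(name="price")
    .rename(columns={"Date": "date", "Symbols": "symbol"})
)[["symbol", "date", "price"]]
print(long_df)
```

Because the symbol ends up in its own column, the same code should also handle a frame that holds several tickers.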
Answer 1 (score: 1)
I think you need reset_index and melt.
from pandas_datareader import data as pdr
import yfinance
_data = pdr.get_data_yahoo(['MSFT'], start='2019-01-01', end='2019-05-30')
print(_data)
Attributes High Low Open Close Volume Adj Close
Symbols MSFT MSFT MSFT MSFT MSFT MSFT
Date
2019-01-02 101.750000 98.940002 99.550003 101.120003 35329300.0 100.318642
2019-01-03 100.190002 97.199997 100.099998 97.400002 42523600.0 96.628120
2019-01-04 102.510002 98.930000 99.720001 101.930000 44060600.0 101.122223
2019-01-07 103.269997 100.980003 101.639999 102.059998 35656100.0 101.251190
2019-01-08 103.970001 101.709999 103.040001 102.800003 31514400.0 101.985329
We can use reset_index and melt, with Date as our id_vars:
df = _data.reset_index().melt(id_vars='Date') # You can filter out attributes if you don't need them.
print(df)
Date Attributes Symbols value
0 2019-01-02 High MSFT 101.750000
1 2019-01-03 High MSFT 100.190002
2 2019-01-04 High MSFT 102.510002
3 2019-01-07 High MSFT 103.269997
4 2019-01-08 High MSFT 103.970001
5 2019-01-09 High MSFT 104.879997
6 2019-01-10 High MSFT 103.750000
7 2019-01-11 High MSFT 103.440002
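As the comment in the code above suggests, the melted frame can be filtered down to a single attribute. The following is a minimal sketch on synthetic data (values copied from the printed frame). As a variation, it uses `melt(ignore_index=False)` followed by `reset_index()` rather than `id_vars`, which assumes pandas 1.1 or later; the names `melted` and `adj_close` are made up for the example.

```python
import pandas as pd

# Synthetic stand-in for the downloaded frame, so the example runs offline.
dates = pd.to_datetime(["2019-01-02", "2019-01-03"])
columns = pd.MultiIndex.from_product(
    [["High", "Adj Close"], ["MSFT"]], names=["Attributes", "Symbols"]
)
_data = pd.DataFrame(
    [[101.750000, 100.318642], [100.190002, 96.628120]],
    index=pd.Index(dates, name="Date"),
    columns=columns,
)

# Melt every column while keeping the Date index, then turn Date into a column.
melted = _data.melt(ignore_index=False).reset_index()

# Keep only the rows for one attribute and drop the now-constant column.
adj_close = (
    melted[melted["Attributes"] == "Adj Close"]
    .drop(columns="Attributes")
    .rename(columns={"value": "Price"})
    .reset_index(drop=True)
)
print(adj_close)
```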
Answer 2 (score: 1)
A possible workaround:
from pandas_datareader import data as pdr
data = pdr.get_data_yahoo(['MSFT'], start='2019-01-01', end='2019-05-30')
data.columns = data.columns.get_level_values(0)  # position-safe; levels[0] holds the unique labels and need not follow column order
data['symbol'] = 'MSFT'
data.head()
Attributes High Low Open Close Volume Adj Close symbol
Date
2019-01-02 101.750000 98.940002 99.550003 101.120003 35329300.0 100.318642 MSFT
2019-01-03 100.190002 97.199997 100.099998 97.400002 42523600.0 96.628120 MSFT
2019-01-04 102.510002 98.930000 99.720001 101.930000 44060600.0 101.122223 MSFT
2019-01-07 103.269997 100.980003 101.639999 102.059998 35656100.0 101.251190 MSFT
2019-01-08 103.970001 101.709999 103.040001 102.800003 31514400.0 101.985329 MSFT
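One caveat when flattening this way: `levels[0]` holds the unique labels of the level, which are not guaranteed to be in the same order as the columns themselves, whereas `get_level_values(0)` returns one label per column in column order. The small sketch below illustrates the difference on a hand-built MultiIndex; the two-column layout is an assumption chosen only to make the mismatch visible.

```python
import pandas as pd

# Columns deliberately listed in non-alphabetical order.
columns = pd.MultiIndex.from_product(
    [["Close", "Adj Close"], ["MSFT"]], names=["Attributes", "Symbols"]
)

# Unique labels of the first level (stored sorted by from_product).
unique_labels = list(columns.levels[0])

# One label per column, in the actual column order.
per_column_labels = list(columns.get_level_values(0))

print(unique_labels)
print(per_column_labels)
```

In the frame from the question the two happen to agree, so either assignment gives the same column names there.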
Answer 3 (score: 0)
You need to use MultiIndex droplevel: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.MultiIndex.droplevel.html
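A minimal sketch of what that could look like, assuming the column levels are named 'Attributes' and 'Symbols' as in the question. The frame here is synthetic and the name `flat` is made up for the example.

```python
import pandas as pd

# Synthetic stand-in for the downloaded frame, so the example runs offline.
columns = pd.MultiIndex.from_product(
    [["Close", "Adj Close"], ["MSFT"]], names=["Attributes", "Symbols"]
)
_data = pd.DataFrame(
    [[101.120003, 100.318642], [97.400002, 96.628120]],
    index=pd.Index(pd.to_datetime(["2019-01-02", "2019-01-03"]), name="Date"),
    columns=columns,
)

# Drop the Symbols level so only the attribute names remain as column labels.
flat = _data.copy()
flat.columns = flat.columns.droplevel("Symbols")
print(flat)
```

This only makes sense when the frame holds a single symbol; with several tickers, dropping the level would leave duplicate column names.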