Concat two pandas DataFrames on Dates only, ignoring Symbol (data from pandas_datareader)

Time: 2018-05-22 17:27:55

Tags: python pandas pandas-datareader

I am using pandas_datareader to import stock market data:

from pandas_datareader import data, wb
import pandas as pd
import datetime
start = datetime.datetime(2006, 1,1)
end = datetime.datetime(2016, 1,1)
boaml = data.DataReader('BAC', 'morningstar', start, end)
citi = data.DataReader('C', 'morningstar', start, end)

The data looks clean, as shown by the output of citi.head():

                   Close   High    Low   Open   Volume
Symbol Date
C      2006-01-02  485.3  487.1  482.2  483.5        0
       2006-01-03  492.9  493.8  481.1  490.0  1536700
       2006-01-04  483.8  491.0  483.5  488.6  1852790
       2006-01-05  486.2  487.8  484.0  484.4  1015470
       2006-01-06  486.2  489.0  482.0  488.8  1358930

Now, when I try to join them with pd.concat(), I get NaN in the upper-right and lower-left corners of the resulting matrix:

bank_stocks = pd.concat([boaml, citi], axis=1, join='outer')

Looking at bank_stocks.head():

                   Close   High    Low   Open      Volume  Close  High  Low  Open  Volume
Symbol Date
BAC    2006-01-02  46.15  46.36  45.91  46.02         0.0    NaN   NaN  NaN   NaN     NaN
       2006-01-03  47.08  47.18  46.15  46.92  16197900.0    NaN   NaN  NaN   NaN     NaN
       2006-01-04  46.58  47.24  46.45  47.00  17427400.0    NaN   NaN  NaN   NaN     NaN
       2006-01-05  46.64  46.83  46.32  46.58  14668900.0    NaN   NaN  NaN   NaN     NaN
       2006-01-06  46.57  46.91  46.35  46.80  11965700.0    NaN   NaN  NaN   NaN     NaN

And bank_stocks.tail():

                   Close  High  Low  Open  Volume  Close   High    Low   Open      Volume
Symbol Date
C      2015-12-28    NaN   NaN  NaN   NaN     NaN  52.38  52.57  51.96  52.57   8760674.0
       2015-12-29    NaN   NaN  NaN   NaN     NaN  52.98  53.22  52.74  52.76  10153634.0
       2015-12-30    NaN   NaN  NaN   NaN     NaN  52.30  52.94  52.25  52.84   8763137.0
       2015-12-31    NaN   NaN  NaN   NaN     NaN  51.75  52.39  51.75  52.07  11275231.0
       2016-01-01    NaN   NaN  NaN   NaN     NaN  51.75  51.75  51.75  51.75         0.0

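(Apologies in advance if the output is hard to read; I hope the code makes the problem easy to reproduce.)

I know the problem lies with Symbol, but I tried MultiIndexing and it did not work.

Any idea how to get a matrix that joins the stock data of boaml and citi under the same dates, without the NaN values?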

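1 Answer:

Answer 0: (score: 2)

The level-0 'Symbol' in your MultiIndex is causing the problem. Each row is labelled by a (Symbol, Date) pair, so no BAC row shares a label with a C row and nothing lines up. Try dropping that level and then concatenating: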

citi.index = citi.index.droplevel()
boaml.index = boaml.index.droplevel()

pd.concat([citi.add_suffix('_citi'), boaml.add_suffix('_boaml')], axis = 1)
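
To see the effect without downloading anything, here is a minimal sketch that rebuilds the same index layout from made-up numbers. The make_frame helper, the two-column layout and the price values are assumptions for illustration only, not real market data. DataFrame.droplevel is used here in place of assigning to .index; it needs pandas 0.24 or later.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the DataReader output: a (Symbol, Date) MultiIndex
# with made-up prices. Only two columns are kept to make the result easy to read.
dates = pd.date_range("2006-01-02", periods=5, freq="B", name="Date")


def make_frame(symbol, base):
    """Build a small frame indexed by (Symbol, Date), like the one in the question."""
    index = pd.MultiIndex.from_product([[symbol], dates], names=["Symbol", "Date"])
    close = base + np.arange(len(dates), dtype=float)
    return pd.DataFrame({"Close": close, "Open": close - 0.5}, index=index)


boaml = make_frame("BAC", 46.0)
citi = make_frame("C", 485.0)

# Row labels are (Symbol, Date) pairs, so the two frames share no labels:
# the outer join stacks them and fills the off-diagonal blocks with NaN.
misaligned = pd.concat([boaml, citi], axis=1, join="outer")

# After dropping Symbol, Date is the only index level and the rows line up.
boaml_by_date = boaml.droplevel("Symbol")
citi_by_date = citi.droplevel("Symbol")

# Option 1: flat column names, one suffix per bank (as in the answer above).
flat = pd.concat(
    [citi_by_date.add_suffix("_citi"), boaml_by_date.add_suffix("_boaml")],
    axis=1,
)

# Option 2: keep the ticker as the top level of a column MultiIndex.
nested = pd.concat([boaml_by_date, citi_by_date], axis=1, keys=["BAC", "C"])

print(misaligned.shape, flat.shape, nested.shape)
```

With the second option the ticker is not lost: nested["C"] returns the Citigroup columns and nested["BAC"]["Close"] a single price series, which may be more convenient than suffixes if more banks are added later.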