Concatenating several stock price DataFrames

Time: 2017-02-07 19:09:18

Tags: python pandas dataframe pandas-datareader

I am fetching monthly price data from Yahoo with pandas_datareader, as follows:

import pandas_datareader.data as web
fb = web.get_data_yahoo('FB', '06/01/2012', interval='m')
amzn = web.get_data_yahoo('AMZN', '06/01/2012', interval='m')
nflx = web.get_data_yahoo('NFLX', '06/01/2012', interval='m')
goog = web.get_data_yahoo('GOOG', '06/01/2012', interval='m')

I then clean each one up to keep only the adjusted closing price, like this:

import pandas as pd
amzn = amzn.rename(columns={'Adj Close': 'AMZN'})
amzn = pd.DataFrame(amzn['AMZN'], columns=['AMZN']) 
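
Since the same cleaning is applied to every ticker, it could be wrapped in a small helper. The following is only a sketch: `adj_close_only` is a hypothetical name, and the tiny frame with made-up values stands in for the downloaded data so the example runs without network access.

```python
import pandas as pd


def adj_close_only(prices: pd.DataFrame, ticker: str) -> pd.DataFrame:
    """Keep only the 'Adj Close' column and rename it to the ticker symbol."""
    # Double brackets select a one-column DataFrame rather than a Series.
    return prices[['Adj Close']].rename(columns={'Adj Close': ticker})


# Made-up values standing in for a frame returned by the data reader.
dates = pd.to_datetime(['2012-06-01', '2012-07-02'])
raw = pd.DataFrame({'Close': [1.0, 2.0], 'Adj Close': [1.5, 2.5]}, index=dates)

cleaned = adj_close_only(raw, 'AMZN')
print(cleaned)
```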

I repeat this cleaning for all four DataFrames. Once that is done, I want to combine the four DataFrames into one. To do that, I am using:

data = pd.concat([fb, amzn, nflx, goog])

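However, this produces a DataFrame in which three of the four columns are NaN. I have confirmed that the dates match. Why does this happen? Any insight is appreciated.

1 Answer:

Answer 0: (score: 3)

There is a better way, using pandas.Panel: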

In [20]: p = web.get_data_yahoo(['FB','AMZN','NFLX','GOOG'], '06/01/2012', interval='m')

In [21]: p.loc['Adj Close']
Out[21]:
                  AMZN          FB        GOOG        NFLX
Date
2012-06-01  228.350006   31.100000  289.745758    9.784286
2012-07-02  233.300003   21.709999  316.169373    8.121428
2012-08-01  248.270004   18.059999  342.203369    8.531428
2012-09-04  254.320007   21.660000  376.873779    7.777143
2012-10-01  232.889999   21.110001  339.810760   11.320000
2012-11-01  252.050003   28.000000  348.836761   11.672857
2012-12-03  250.869995   26.620001  353.337280   13.227143
2013-01-02  265.500000   30.980000  377.468170   23.605715
2013-02-01  264.269989   27.250000  400.200500   26.868572
2013-03-01  266.489990   25.580000  396.698975   27.040001
2013-04-01  253.809998   27.770000  411.873840   30.867144
2013-05-01  269.200012   24.350000  435.175598   32.321430
2013-06-03  277.690002   24.879999  439.746002   30.155714
2013-07-01  301.220001   36.799999  443.432343   34.925713
2013-08-01  280.980011   41.290001  423.027710   40.558571
2013-09-03  312.640015   50.230000  437.518250   44.172855
2013-10-01  364.029999   50.209999  514.776123   46.068573
2013-11-01  393.619995   47.009998  529.266602   52.257141
2013-12-02  398.790009   54.650002  559.796204   52.595715
2014-01-02  358.690002   62.570000  589.896118   58.475716
2014-02-03  362.100006   68.459999  607.218811   63.661430
2014-03-03  336.369995   60.240002  556.972473   50.290001
2014-04-01  304.130005   59.779999  526.662415   46.005714
2014-05-01  312.549988   63.299999  559.892578   59.689999
2014-06-02  324.779999   67.290001  575.282593   62.942856
...                ...         ...         ...         ...
2015-02-02  380.160004   78.970001  558.402527   67.844284
2015-03-02  372.100006   82.220001  548.002441   59.527142
2015-04-01  421.779999   78.769997  537.340027   79.500000
2015-05-01  429.230011   79.190002  532.109985   89.151428
2015-06-01  434.089996   85.769997  520.510010   93.848572
2015-07-01  536.150024   94.010002  625.609985  114.309998
2015-08-03  512.890015   89.430000  618.250000  115.029999
2015-09-01  511.890015   89.900002  608.419983  103.260002
2015-10-01  625.900024  101.970001  710.809998  108.379997
2015-11-02  664.799988  104.239998  742.599976  123.330002
2015-12-01  675.890015  104.660004  758.880005  114.379997
2016-01-04  587.000000  112.209999  742.950012   91.839996
2016-02-01  552.520020  106.919998  697.770020   93.410004
2016-03-01  593.640015  114.099998  744.950012  102.230003
2016-04-01  659.590027  117.580002  693.010010   90.029999
2016-05-02  722.789978  118.809998  735.719971  102.570000
2016-06-01  715.619995  114.279999  692.099976   91.480003
2016-07-01  758.809998  123.940002  768.789978   91.250000
2016-08-01  769.159973  126.120003  767.049988   97.449997
2016-09-01  837.309998  128.270004  777.289978   98.550003
2016-10-03  789.820007  130.990005  784.539978  124.870003
2016-11-01  750.570007  118.419998  758.039978  117.000000
2016-12-01  749.869995  115.050003  771.820007  123.800003
2017-01-03  823.479980  130.320007  796.789978  140.710007
2017-02-01  807.640015  132.059998  801.340027  140.970001

[57 rows x 4 columns]
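
Panel was deprecated in pandas 0.20 and removed in pandas 0.25, so `p.loc['Adj Close']` only works on older installations. A rough modern equivalent is a DataFrame with two-level columns, where selecting the top-level key returns the per-ticker table. The sketch below builds such a frame by hand with made-up values; the assumption that a multi-ticker download has this (attribute, ticker) column layout should be checked against the installed pandas_datareader version.

```python
import pandas as pd

dates = pd.to_datetime(['2012-06-01', '2012-07-02'])

# Two-level columns: price attribute on the outer level, ticker on the inner one.
columns = pd.MultiIndex.from_product([['Adj Close', 'Close'], ['AMZN', 'FB']])

# Made-up values, for illustration only.
wide = pd.DataFrame(
    [[1.5, 2.5, 1.0, 2.0],
     [3.5, 4.5, 3.0, 4.0]],
    index=dates,
    columns=columns,
)

# Selecting the outer key drops that level and leaves one column per ticker.
adj_close = wide['Adj Close']
print(adj_close)
```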

The Panel's axes:

In [22]: p.axes
Out[22]:
[Index(['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'], dtype='object'),
 DatetimeIndex(['2012-06-01', '2012-07-02', '2012-08-01', '2012-09-04',
                '2012-10-01', '2012-11-01', '2012-12-03', '2013-01-02',
                '2013-02-01', '2013-03-01', '2013-04-01', '2013-05-01',
                '2013-06-03', '2013-07-01', '2013-08-01', '2013-09-03',
                '2013-10-01', '2013-11-01', '2013-12-02', '2014-01-02',
                '2014-02-03', '2014-03-03', '2014-04-01', '2014-05-01',
                '2014-06-02', '2014-07-01', '2014-08-01', '2014-09-02',
                '2014-10-01', '2014-11-03', '2014-12-01', '2015-01-02',
                '2015-02-02', '2015-03-02', '2015-04-01', '2015-05-01',
                '2015-06-01', '2015-07-01', '2015-08-03', '2015-09-01',
                '2015-10-01', '2015-11-02', '2015-12-01', '2016-01-04',
                '2016-02-01', '2016-03-01', '2016-04-01', '2016-05-02',
                '2016-06-01', '2016-07-01', '2016-08-01', '2016-09-01',
                '2016-10-03', '2016-11-01', '2016-12-01', '2017-01-03',
                '2017-02-01'],
               dtype='datetime64[ns]', name='Date', freq=None),
 Index(['AMZN', 'FB', 'GOOG', 'NFLX'], dtype='object')]
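
As for the NaN values in the question: `pd.concat` stacks its inputs along the rows by default (`axis=0`). Because each cleaned frame has a differently named column, the result takes the union of the columns and fills every cell a frame does not supply with NaN. Passing `axis=1` aligns the frames on their shared date index instead. The sketch below uses two small frames with made-up values in place of the real downloads.

```python
import pandas as pd

dates = pd.to_datetime(['2012-06-01', '2012-07-02', '2012-08-01'])

# Made-up values; each frame has one column named after its ticker.
fb = pd.DataFrame({'FB': [1.0, 2.0, 3.0]}, index=dates)
amzn = pd.DataFrame({'AMZN': [4.0, 5.0, 6.0]}, index=dates)

# Default axis=0: rows are appended, so each row has a value in only one column.
stacked = pd.concat([fb, amzn])

# axis=1: frames are placed side by side and aligned on the date index.
side_by_side = pd.concat([fb, amzn], axis=1)

print(stacked.shape, int(stacked.isna().sum().sum()))
print(side_by_side.shape, int(side_by_side.isna().sum().sum()))
```

With four tickers the same effect gives the pattern described in the question: every row of the stacked result holds one real value and three NaN.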