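Converting multiple DataFrames into a multi-index DataFrame

Asked: 2020-04-29 15:51:12

Tags: python pandas

I downloaded a bunch of stock data from Yahoo Finance. Each DataFrame looks like this: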

          Date  Open  High   Low  Close  Adj Close  Volume
0   2019-03-11  2.73  2.81  2.71   2.75       2.75  243900
1   2019-03-12  2.66  2.78  2.66   2.75       2.75   69200
2   2019-03-13  2.75  2.80  2.71   2.77       2.77   61200
3   2019-03-14  2.77  2.79  2.75   2.75       2.75   48800
4   2019-03-15  2.76  2.79  2.75   2.79       2.79  124400
..         ...   ...   ...   ...    ...        ...     ...
282 2020-04-22  3.61  3.75  3.61   3.71       3.71  312900
283 2020-04-23  3.74  3.77  3.66   3.76       3.76   99800
284 2020-04-24  3.78  3.78  3.63   3.63       3.63   89100
285 2020-04-27  3.70  3.70  3.55   3.64       3.64   60600
286 2020-04-28  3.70  3.74  3.64   3.70       3.70  248300

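I need to combine the data so that it looks like the multi-index format below, but I'm at a loss. I've tried many combinations along the lines of pd.concat([list of dfs], zip(cols, symbols), axis=[0,1]) without hitting on anything that works. Any help would be appreciated!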

                Adj Close         Close             High              Low               Open               Volume               
                CHNR  GNSS  SGRP  CHNR  GNSS  SGRP  CHNR  GNSS  SGRP  CHNR  GNSS  SGRP  CHNR  GNSS  SGRP   CHNR    GNSS   SGRP
Date                                                                                                                          
2019-04-30      1.85  3.08  0.69  1.85  3.08  0.69  1.94  3.10  0.70  1.74  3.05  0.67  1.74  3.07  0.70  24800   23900  30400
2019-05-01      1.81  3.15  0.65  1.81  3.15  0.65  1.85  3.17  0.69  1.75  3.06  0.62  1.76  3.09  0.67  15500   72800  85900
2019-05-02      1.80  3.12  0.66  1.80  3.12  0.66  1.87  3.16  0.66  1.76  3.10  0.65  1.80  3.16  0.65  12900   28100  97200
2019-05-03      1.85  3.14  0.67  1.85  3.14  0.67  1.89  3.19  0.69  1.74  3.06  0.62  1.74  3.12  0.62  43200   31300  27500
2019-05-06      1.85  3.13  0.66  1.85  3.13  0.66  1.89  3.25  0.69  1.75  3.11  0.65  1.79  3.11  0.67  37000   50200  31500
...              ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...    ...     ...    ...
2020-04-22      0.93  3.71  0.73  0.93  3.71  0.73  1.04  3.75  0.73  0.93  3.61  0.69  0.93  3.61  0.72   2600  312900  14600
2020-04-23      1.01  3.76  0.74  1.01  3.76  0.74  1.01  3.77  0.77  0.94  3.66  0.73  0.94  3.74  0.73   2500   99800  15200
2020-04-24      1.05  3.63  0.76  1.05  3.63  0.76  1.05  3.78  0.77  0.92  3.63  0.74  1.05  3.78  0.74   4400   89100   1300
2020-04-27      1.03  3.64  0.76  1.03  3.64  0.76  1.07  3.70  0.77  0.92  3.55  0.76  1.07  3.70  0.77   6200   60600   3500
2020-04-28      1.00  3.70  0.77  1.00  3.70  0.77  1.07  3.74  0.77  0.96  3.64  0.75  1.07  3.70  0.77  22300  248300  26100

Edit, following Quang Hoang's suggestion:

I tried:

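import pandas as pd
# assumed: stock_data is a dict of per-symbol DataFrames, stocks the matching list of symbols
ret = pd.concat(stock_data.values(), keys=stocks, axis=1)
ret = ret.swaplevel(0, 1, axis=1)  # move the field name above the symbol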

This gives the following output, which looks closer but is still a bit off:

           Date   Open   High    Low  Close Adj Close Volume       Date  Open  High   Low Close Adj Close    Volume       Date  Open  High   Low Close Adj Close Volume
           CHNR   CHNR   CHNR   CHNR   CHNR      CHNR   CHNR       GNSS  GNSS  GNSS  GNSS  GNSS      GNSS      GNSS       SGRP  SGRP  SGRP  SGRP  SGRP      SGRP   SGRP
0    2010-04-29  11.39  11.74  11.39  11.57     11.57   3100 2019-03-11  2.73  2.81  2.71  2.75      2.75  243900.0 2010-04-29  0.79  0.79  0.79  0.79      0.79      0
1    2010-04-30  11.60  11.61  11.50  11.56     11.56   5400 2019-03-12  2.66  2.78  2.66  2.75      2.75   69200.0 2010-04-30  0.79  0.79  0.79  0.79      0.79      0
2    2010-05-03  11.95  11.95  11.22  11.44     11.44  19400 2019-03-13  2.75  2.80  2.71  2.77      2.77   61200.0 2010-05-03  0.79  0.79  0.79  0.79      0.79      0
3    2010-05-04  11.20  11.49  11.20  11.46     11.46  10700 2019-03-14  2.77  2.79  2.75  2.75      2.75   48800.0 2010-05-04  0.79  0.79  0.66  0.79      0.79   9700
4    2010-05-05  11.50  11.60  11.25  11.50     11.50  13400 2019-03-15  2.76  2.79  2.75  2.79      2.79  124400.0 2010-05-05  0.69  0.80  0.67  0.80      0.80   6700
...         ...    ...    ...    ...    ...       ...    ...        ...   ...   ...   ...   ...       ...       ...        ...   ...   ...   ...   ...       ...    ...
2512 2020-04-22   0.93   1.04   0.93   0.93      0.93   2600        NaT   NaN   NaN   NaN   NaN       NaN       NaN 2020-04-22  0.72  0.73  0.69  0.73      0.73  14600
2513 2020-04-23   0.94   1.01   0.94   1.01      1.01   2500        NaT   NaN   NaN   NaN   NaN       NaN       NaN 2020-04-23  0.73  0.77  0.73  0.74      0.74  15200
2514 2020-04-24   1.05   1.05   0.92   1.05      1.05   4400        NaT   NaN   NaN   NaN   NaN       NaN       NaN 2020-04-24  0.74  0.77  0.74  0.76      0.76   1300
2515 2020-04-27   1.07   1.07   0.92   1.03      1.03   6200        NaT   NaN   NaN   NaN   NaN       NaN       NaN 2020-04-27  0.77  0.77  0.76  0.76      0.76   3500
2516 2020-04-28   1.07   1.07   0.96   1.00      1.00  22300        NaT   NaN   NaN   NaN   NaN       NaN       NaN 2020-04-28  0.77  0.77  0.75  0.77      0.77  26100
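
In the output above, Date is still an ordinary column repeated for each symbol, and the rows are lined up on the default 0..2516 integer index rather than on dates, which is why GNSS (whose data starts in 2019) shows NaT/NaN at the bottom. One possible way to get closer to the target layout is to set Date as the index of each frame before concatenating. The following is only a minimal sketch under assumptions: stock_data is taken to be a dict mapping symbol to DataFrame, and the small frames here are hypothetical stand-ins (a few rows copied from the tables above), not the real download.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded data: symbol -> DataFrame.
# Only two fields and a few rows are included to keep the sketch short.
stock_data = {
    "CHNR": pd.DataFrame({
        "Date": pd.to_datetime(["2020-04-24", "2020-04-27", "2020-04-28"]),
        "Close": [1.05, 1.03, 1.00],
        "Volume": [4400, 6200, 22300],
    }),
    "GNSS": pd.DataFrame({
        "Date": pd.to_datetime(["2020-04-27", "2020-04-28"]),
        "Close": [3.64, 3.70],
        "Volume": [60600, 248300],
    }),
}

# Index every frame by Date so that concat aligns rows on dates
# instead of on the positional 0..N index.
by_date = {sym: df.set_index("Date") for sym, df in stock_data.items()}

# Passing a dict makes its keys the outer column level: (symbol, field).
ret = pd.concat(by_date, axis=1).sort_index()

# Swap to (field, symbol) and sort the columns so fields are grouped together.
ret = ret.swaplevel(0, 1, axis=1).sort_index(axis=1)

# Optional: keep only the dates for which every symbol has data.
common = ret.dropna()

print(ret)
```

With an outer join (the default), dates missing for a symbol show up as NaN, and a Volume column containing NaN is stored as float. Whether to keep those rows or drop them, as in common, depends on what the combined table is meant to be used for.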

0 answers:

No answers yet