Combining two DataFrames in Pandas

Date: 2018-03-02 14:50:19

Tags: python pandas merge data-manipulation

I have two DataFrames:

1)

                          2017 Hours
name            Month   

a               January   199.25
                February  203.25
                March     220.75
                April     203.50
                May       242.50
                June      261.25
                July      278.50
                August    227.75
                September 160.75
                October   213.50
                November  230.75
                December  159.75


2)

                          2018 Hours
name            Month   

a               January   199.25
                February  203.25
                March     220.75
                April     203.50
                May       242.50
                June      261.25
                July      278.50
                August    227.75
                September 160.75
                October   213.50
                November  230.75
                December  159.75

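I want to combine the two DataFrames into one for plotting. My goal is a simple line chart with hours on the y-axis and months on the x-axis, with one line for 2017 and another for 2018.

I would like a df that looks like this: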

                                       Hours
name            Month     Year

a               January   2017         199.25
                February  2017         203.25
                March     2017         220.75
                April     2017         203.50
                May       2017         242.50
                June      2017         261.25
                July      2017         278.50
                August    2017         227.75
                September 2017         160.75
                October   2017         213.50
                November  2017         230.75
                December  2017         159.75
                January   2018         199.25
                February  2018         203.25
                March     2018         220.75
                April     2018         203.50
                May       2018         242.50
                June      2018         261.25
                July      2018         278.50
                August    2018         227.75
                September 2018         160.75
                October   2018         213.50
                November  2018         230.75
                December  2018         159.75

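Any help is greatly appreciated!

2 Answers:

Answer 0: (score: 2)

I think you first need to set the same column name in both DataFrames, then use concat with the keys parameter to distinguish the DataFrames, and finally use reset_index to turn the MultiIndex levels into columns: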

df1.columns = ['Hour']
df2.columns = ['Hour']
df = pd.concat([df1, df2], keys=(2017, 2018)).reset_index().rename(columns={'level_0':'Year'})
print (df)
    Year name      Month    Hour
0   2017    a    January  199.25
1   2017    a   February  203.25
2   2017    a      March  220.75
3   2017    a      April  203.50
4   2017    a        May  242.50
5   2017    a       June  261.25
6   2017    a       July  278.50
7   2017    a     August  227.75
8   2017    a  September  160.75
9   2017    a    October  213.50
10  2017    a   November  230.75
11  2017    a   December  159.75
12  2018    a    January  199.25
13  2018    a   February  203.25
14  2018    a      March  220.75
15  2018    a      April  203.50
16  2018    a        May  242.50
17  2018    a       June  261.25
18  2018    a       July  278.50
19  2018    a     August  227.75
20  2018    a  September  160.75
21  2018    a    October  213.50
22  2018    a   November  230.75
23  2018    a   December  159.75
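
To get the exact layout asked for in the question (Year as the innermost index level rather than a column), a variation on the same idea might look like the sketch below. The construction of df1 and df2 is an assumption about how the question's frames are built (a name/Month MultiIndex with one hours column each); the names argument of concat labels the index levels so no rename is needed afterwards.

```python
import pandas as pd

months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]
hours = [199.25, 203.25, 220.75, 203.50, 242.50, 261.25,
         278.50, 227.75, 160.75, 213.50, 230.75, 159.75]

# Assumed reconstruction of the question's frames: a (name, Month) MultiIndex
# and a single hours column per year.
index = pd.MultiIndex.from_product([["a"], months], names=["name", "Month"])
df1 = pd.DataFrame({"2017 Hours": hours}, index=index)
df2 = pd.DataFrame({"2018 Hours": hours}, index=index)

# Give both frames the same column name so the values land in one column,
# then let keys/names add a labelled Year level in front of the index.
long_df = pd.concat(
    [df1.rename(columns={"2017 Hours": "Hours"}),
     df2.rename(columns={"2018 Hours": "Hours"})],
    keys=[2017, 2018],
    names=["Year", "name", "Month"],
)

# Move Year to the innermost position; the row order is left unchanged.
long_df = long_df.reorder_levels(["name", "Month", "Year"])
print(long_df)
```

This keeps the 2017 rows first and the 2018 rows second, as in the desired output.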

But for plotting, this should work better:

df = (pd.concat([df1['2017 Hours'], df2['2018 Hours']], keys=(2017, 2018), axis=1)
       .reset_index(level=0, drop=True))
print (df)
             2017    2018
Month                    
January    199.25  199.25
February   203.25  203.25
March      220.75  220.75
April      203.50  203.50
May        242.50  242.50
June       261.25  261.25
July       278.50  278.50
August     227.75  227.75
September  160.75  160.75
October    213.50  213.50
November   230.75  230.75
December   159.75  159.75
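
With the data in this wide shape, each column becomes one line when plotted. A minimal sketch, assuming matplotlib is installed; the frame is rebuilt here from the values printed above so the snippet runs on its own, and the variable name wide is only illustrative. Note that the sample values are identical for both years, so the two lines will coincide.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]
hours = [199.25, 203.25, 220.75, 203.50, 242.50, 261.25,
         278.50, 227.75, 160.75, 213.50, 230.75, 159.75]

# Stand-in for the wide frame printed above: one column per year,
# months as the index in calendar order.
wide = pd.DataFrame({2017: hours, 2018: hours},
                    index=pd.Index(months, name="Month"))

# DataFrame.plot draws one line per column against the row positions.
ax = wide.plot(marker="o")
ax.set_xlabel("Month")
ax.set_ylabel("Hours")

# Label every month explicitly instead of relying on the default tick spacing.
ax.set_xticks(range(len(wide)))
ax.set_xticklabels(wide.index, rotation=45)
ax.figure.tight_layout()
# plt.show() or ax.figure.savefig(...) would display or save the chart.
```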

Answer 1: (score: 0)

Convert the columns to a MultiIndex

df1.columns=df1.columns.str.split(' ',expand=True)

df1.swaplevel(0,1,axis=1).stack()
Out[946]:
                     Hours
name Month
a    January  2017  199.25
     February 2017  203.25
     March    2017  220.75

df2.columns=df2.columns.str.split(' ',expand=True)

Then use concat:

pd.concat([df1.swaplevel(0,1,axis=1).stack(),df2.swaplevel(0,1,axis=1).stack()])
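
Put together, the steps above might look like the following self-contained sketch. The sample frames are an assumed reconstruction of the question's data, and to_long is a hypothetical helper name introduced only for this illustration. One thing to be aware of: because the year comes from splitting a column label, it ends up in the index as a string ('2017'), not an integer.

```python
import pandas as pd

months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]
hours = [199.25, 203.25, 220.75, 203.50, 242.50, 261.25,
         278.50, 227.75, 160.75, 213.50, 230.75, 159.75]

# Assumed reconstruction of the question's frames.
index = pd.MultiIndex.from_product([["a"], months], names=["name", "Month"])
df1 = pd.DataFrame({"2017 Hours": hours}, index=index)
df2 = pd.DataFrame({"2018 Hours": hours}, index=index)


def to_long(df):
    """Split '2017 Hours' into ('2017', 'Hours') and move the year into the index."""
    out = df.copy()  # leave the caller's frame untouched
    out.columns = out.columns.str.split(" ", expand=True)
    # After the swap the columns are ('Hours', '2017'); stack() moves the
    # last column level (the year) into the row index.
    out = out.swaplevel(0, 1, axis=1).stack()
    return out.rename_axis(index=["name", "Month", "Year"])


result = pd.concat([to_long(df1), to_long(df2)])
print(result)
```

If integer years are preferred, the keys-based concat from the first answer avoids the string conversion.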