pandas hierarchical grouping by multiple columns

Asked: 2017-01-10 10:59:15

Tags: python sorting pandas multiple-columns multi-index

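I want to group by the columns 'Number3' and 'Event' and get the desired result shown below. Note that the first column is the index. I want to save the desired result into a new DataFrame.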

     Number1 Event        Number2  Number3
0      20    clouds        30        404
1      22    lightening    32        404
2      23    playing       33        405
3      25    clouds        35        410
4      24    sleeping      34        407
5      26    lightening    36        410
6      21    rain          31        404
7      27    rain          37        410


Desired Result:

Number3     Event          Number1   Number2
   404   0  clouds          20         30
         1  lightening      22         32
         6  rain            21         31
   405   2  playing         23         33
   410   3  clouds          25         35
         5  lightening      26         36
         7  rain            27         37
   407   4  sleeping        24         34

1 Answer:

Answer 0 (score: 0)

You need set_index to move both columns into a MultiIndex:

df1 = df.set_index(['Number3', 'Event'])
print (df1)
                    Number1  Number2
Number3 Event                       
404     clouds           20       30
        lightening       21       31
        rain             22       32
405     playing          23       33
410     sun              24       34
420     clouds           25       35
        lightening       26       36
        rain             27       37
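Since the question asks for the result in a new DataFrame, here is a minimal sketch showing that set_index returns a new object by default and leaves df untouched. The three-row frame is an illustrative subset of the question's data, not the full table.

```python
import pandas as pd

# Small illustrative frame (a subset of the question's rows).
df = pd.DataFrame({
    "Number1": [20, 22, 23],
    "Event": ["clouds", "lightening", "playing"],
    "Number2": [30, 32, 33],
    "Number3": [404, 404, 405],
})

# set_index returns a new DataFrame by default (inplace=False),
# so df keeps all four of its original columns.
df1 = df.set_index(["Number3", "Event"])

print(list(df.columns))       # columns of the untouched original
print(list(df1.index.names))  # the two columns now form the index levels
print(list(df1.columns))      # only the remaining data columns
```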

But if you need to keep the original index as well, add the parameter append=True, and then use swaplevel to move Number3 in front of it:

df1 = df.set_index(['Number3', 'Event'], append=True).swaplevel(0,1)
print (df1)
                      Number1  Number2
Number3   Event                       
404     0 clouds           20       30
        1 lightening       21       31
        2 rain             22       32
405     3 playing          23       33
410     4 sun              24       34
420     5 clouds           25       35
        6 lightening       26       36
        7 rain             27       37
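To see what append=True and swaplevel each do to the index, the level names can be inspected after every step. This is a small sketch on an illustrative three-row subset of the question's data.

```python
import pandas as pd

# Small illustrative frame (a subset of the question's rows).
df = pd.DataFrame({
    "Number1": [20, 22, 23],
    "Event": ["clouds", "lightening", "playing"],
    "Number2": [30, 32, 33],
    "Number3": [404, 404, 405],
})

# append=True keeps the existing (unnamed) index as the first level
# and adds Number3 and Event after it.
appended = df.set_index(["Number3", "Event"], append=True)
names_after_append = list(appended.index.names)

# swaplevel(0, 1) exchanges the first two levels, so Number3 comes
# first and the original index sits between Number3 and Event.
swapped = appended.swaplevel(0, 1)
names_after_swap = list(swapped.index.names)

print(names_after_append)
print(names_after_swap)
```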

Edit, after the question was edited:

Add sort_index so that rows with the same Number3 end up next to each other:

# wrap the chain in parentheses so it can span several lines
df1 = (df.set_index(['Number3', 'Event'], append=True)
         .swaplevel(0,1)
         .sort_index(level='Number3'))
print (df1)
                      Number1  Number2
Number3   Event                       
404     0 clouds           20       30
        1 lightening       22       32
        6 rain             21       31
405     2 playing          23       33
407     4 sleeping         24       34
410     3 clouds           25       35
        5 lightening       26       36
        7 rain             27       37
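For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the frame from the question's table and runs the same chain. The DataFrame construction is my own transcription of the posted data. Note that sort_index orders Number3 ascending (404, 405, 407, 410), which differs from the order of first appearance used in the desired result.

```python
import pandas as pd

# Data transcribed from the table in the question.
df = pd.DataFrame({
    "Number1": [20, 22, 23, 25, 24, 26, 21, 27],
    "Event": ["clouds", "lightening", "playing", "clouds",
              "sleeping", "lightening", "rain", "rain"],
    "Number2": [30, 32, 33, 35, 34, 36, 31, 37],
    "Number3": [404, 404, 405, 410, 407, 410, 404, 410],
})

df1 = (
    df.set_index(["Number3", "Event"], append=True)  # keep the old index as a level
      .swaplevel(0, 1)                               # put Number3 first
      .sort_index(level="Number3")                   # bring equal Number3 values together
)

print(df1)
```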