Pivot/reindex the MultiIndex columns in this DataFrame?

Time: 2016-04-14 23:57:19

Tags: python pandas

Suppose I have a DataFrame, produced by describe(), that has a MultiIndex in its column index:

txid   FOOBAR                                                           
           count      mean        std    min   25%   50%   75%    max   
meas 0     233.0  8.064378  76.225789 -127.0 -62.0  14.0  78.0  126.0   
...  

txid   DEADBEEF                                                        
           count     mean        std    min    25%  50%    75%    max  
meas 0     60.0  7.866667  78.921215 -127.0 -55.75  3.0  81.75  126.0  
...

How can I move the "txid" key down into the rows of the DataFrame, so that it has twice as many rows:

           txid       count      mean        std    min   25%   50%   75%    max
meas 0     FOOBAR     233.0  8.064378  76.225789 -127.0 -62.0  14.0  78.0  126.0
...
meas 0     DEADBEEF   60.0  7.866667  78.921215 -127.0 -55.75  3.0  81.75  126.0

Update

A slight improvement on the answer provided by @jezrael:

df1 = df.groupby('txid').apply(lambda x: x.describe())
# start improvements
df1.index.rename('idx',level=1,inplace=True)
df1.reset_index(inplace=True)
df1 = df1.pivot(columns='txid',index='idx')
df1 = df1.T
df1.index.rename('meas',level=0,inplace=True)
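
For anyone who wants to run the update end to end, here is a minimal self-contained sketch of the same pivot-based steps. The sample values are illustrative assumptions (they mirror the small frame used in the answer), the names `stats`, `wide` and `long` are hypothetical, and `rename_axis` is used in place of the in-place index renames:

```python
import pandas as pd

# Illustrative sample data (an assumption; not the asker's real data).
df = pd.DataFrame({
    'meas 0': [12123, 11123, 10231, 10233],
    'txid': ['DEADBEEF', 'FOOBAR', 'DEADBEEF', 'FOOBAR'],
    'meas 1': [1, 2, 3, 3],
})

# describe() per txid; the result is indexed by (txid, statistic name).
stats = (
    df.groupby('txid')[['meas 0', 'meas 1']]
      .apply(lambda g: g.describe())
      .rename_axis(index=['txid', 'idx'])
      .reset_index()
)

# Statistics as rows, (measurement, txid) pairs as columns.
wide = stats.pivot(index='idx', columns='txid')

# Transpose so that each (measurement, txid) pair becomes one row.
long = wide.T.rename_axis(index=['meas', 'txid'])
print(long)
```

Calling `long.reset_index(level='txid')` afterwards turns txid into an ordinary column, which matches the layout asked for in the question.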

1 answer:

Answer 0: (score: 1)

You can use pivot_table together with T, and finish with reset_index and rename_axis:

print(df)
   meas 0      txid  meas 1
0   12123  DEADBEEF       1
1   11123    FOOBAR       2
2   10231  DEADBEEF       3
3   10233    FOOBAR       3

# parentheses let the method chain span several lines
df1 = (df.groupby('txid').apply(lambda x: x.describe())
         .reset_index()
         .rename(columns={'level_1':'ind'}))

print(df1)
        txid    ind        meas 0    meas 1
0   DEADBEEF  count      2.000000  2.000000
1   DEADBEEF   mean  11177.000000  2.000000
2   DEADBEEF    std   1337.846030  1.414214
3   DEADBEEF    min  10231.000000  1.000000
4   DEADBEEF    25%  10704.000000  1.500000
5   DEADBEEF    50%  11177.000000  2.000000
6   DEADBEEF    75%  11650.000000  2.500000
7   DEADBEEF    max  12123.000000  3.000000
8     FOOBAR  count      2.000000  2.000000
9     FOOBAR   mean  10678.000000  2.500000
10    FOOBAR    std    629.325035  0.707107
11    FOOBAR    min  10233.000000  2.000000
12    FOOBAR    25%  10455.500000  2.250000
13    FOOBAR    50%  10678.000000  2.500000
14    FOOBAR    75%  10900.500000  2.750000
15    FOOBAR    max  11123.000000  3.000000
    
# pivot, transpose, move txid from the index into a column, drop the columns name
df1 = (df1.pivot_table(columns='txid', index='ind').T
          .reset_index(level=1)
          .rename_axis(None, axis=1))
print(df1)
            txid       25%      50%       75%  count      max     mean  \
meas 0  DEADBEEF  10704.00  11177.0  11650.00    2.0  12123.0  11177.0   
meas 0    FOOBAR  10455.50  10678.0  10900.50    2.0  11123.0  10678.0   
meas 1  DEADBEEF      1.50      2.0      2.50    2.0      3.0      2.0   
meas 1    FOOBAR      2.25      2.5      2.75    2.0      3.0      2.5   

            min          std  
meas 0  10231.0  1337.846030  
meas 0  10233.0   629.325035  
meas 1      1.0     1.414214  
meas 1      2.0     0.707107
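
As a cross-check, the same layout can probably also be reached without apply and pivot_table, by calling describe() on the groupby directly and stacking the measurement level of the columns. This is a swapped-in technique, not the method from the answer above. It is a minimal sketch, assuming the sample data shown earlier, with `wide` and `long` as hypothetical names. Recent pandas 2.x releases may emit a FutureWarning about the stack implementation.

```python
import pandas as pd

# Sample data copied from the answer above.
df = pd.DataFrame({
    'meas 0': [12123, 11123, 10231, 10233],
    'txid': ['DEADBEEF', 'FOOBAR', 'DEADBEEF', 'FOOBAR'],
    'meas 1': [1, 2, 3, 3],
})

# One row per txid; columns are a MultiIndex of (measurement, statistic).
wide = df.groupby('txid')[['meas 0', 'meas 1']].describe()

long = (
    wide.stack(level=0)              # move the measurement level into the index
        .swaplevel(0, 1)             # index becomes (measurement, txid)
        .sort_index()                # group the rows by measurement
        .reset_index(level='txid')   # txid becomes an ordinary column
)
print(long)
```

The order of the statistic columns may differ between pandas versions, so select columns by name rather than by position.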