Adding a calculated column to a MultiIndex pandas DataFrame

Time: 2017-09-26 15:49:41

Tags: python-3.x pandas dataframe

               star_rating          duration         
Date           20170829 20170830 20170829 20170830
genre                                           
Action         1038.1   1038.1  15917.0  16598.0
Adventure       595.0    595.0   9386.0  10113.0
Animation       490.7    490.7   5811.0   5989.0
Biography       596.9    596.9   9661.0  10002.0
Comedy         1211.7   1211.7  16616.0  16786.0

In[86]: df2.columns
Out[86]: 
MultiIndex(levels=[['star_rating', 'duration'], [20170829, 20170830]],
           labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
           names=[None, 'Date'])
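
For anyone who wants to reproduce this, a frame with the same shape can be rebuilt roughly as follows. This is only a sketch that assumes the values printed above are the complete data; the variable names `columns` and `genres` are illustrative.

```python
import pandas as pd

# Outer level: the measure; inner level ("Date"): the integer date labels
columns = pd.MultiIndex.from_product(
    [["star_rating", "duration"], [20170829, 20170830]],
    names=[None, "Date"],
)
genres = pd.Index(
    ["Action", "Adventure", "Animation", "Biography", "Comedy"], name="genre"
)

# One row per genre, in the same column order as the printed table
df2 = pd.DataFrame(
    [
        [1038.1, 1038.1, 15917.0, 16598.0],
        [595.0, 595.0, 9386.0, 10113.0],
        [490.7, 490.7, 5811.0, 5989.0],
        [596.9, 596.9, 9661.0, 10002.0],
        [1211.7, 1211.7, 16616.0, 16786.0],
    ],
    index=genres,
    columns=columns,
)
print(df2)
```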

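Hi all, I have the table df2 shown above, and I want to insert a Diff column under each top-level column, computed as a simple subtraction: 20170830 - 20170829. The result should look like this: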

            star_rating                     duration        
Date        20170829    20170830    Diff    20170829    20170830    Diff
genre                       
Action      1038.1      1038.1      0       15917       16598       681
Adventure   595         595         0       9386        10113       727
Animation   490.7       490.7       0       5811        5989        178
Biography   596.9       596.9       0       9661        10002       341
Comedy      1211.7      1211.7      0       16616       16786       170

If the dates were the top level, I could simply use df2['diff'] = df2[20170830] - df2[20170829].

I'm new to MultiIndex, so I'd appreciate any ideas to get me started. Thanks in advance.

1 answer:

Answer 0: (score: 0)

Let's try this (df here is the question's df2):

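import pandas as pd

# Within each top-level group, subtract the earlier date from the later one.
# (Transposing avoids groupby(axis=1), which newer pandas versions deprecate.)
df1 = df.T.groupby(level=0).diff().T.dropna(axis=1)

# Only the 20170830 columns survive dropna; relabel them as 'diff'
df1 = df1.rename(columns={20170830: 'diff'}, level=1)

# Cast the date labels to str so they can be sorted alongside 'diff'
df.columns = df.columns.set_levels(df.columns.levels[1].astype(str), level=1)

df_out = pd.concat([df, df1], axis=1).sort_index(axis=1)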

Output:

          duration                 star_rating              
Date      20170829 20170830   diff    20170829 20170830 diff
genre                                                       
Action     15917.0  16598.0  681.0      1038.1   1038.1  0.0
Adventure   9386.0  10113.0  727.0       595.0    595.0  0.0
Animation   5811.0   5989.0  178.0       490.7    490.7  0.0
Biography   9661.0  10002.0  341.0       596.9    596.9  0.0
Comedy     16616.0  16786.0  170.0      1211.7   1211.7  0.0
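
If keeping the original column order (star_rating before duration) matters, one alternative is to build each top-level block separately and glue the blocks back together with `pd.concat(..., keys=...)`. The following is a minimal sketch, not the approach used above; it rebuilds the question's frame from the printed values, and the names `measures`, `pieces`, `block` and `result` are made up for illustration.

```python
import pandas as pd

# Rebuild the question's frame from the values shown in the question
df2 = pd.DataFrame(
    [
        [1038.1, 1038.1, 15917.0, 16598.0],
        [595.0, 595.0, 9386.0, 10113.0],
        [490.7, 490.7, 5811.0, 5989.0],
        [596.9, 596.9, 9661.0, 10002.0],
        [1211.7, 1211.7, 16616.0, 16786.0],
    ],
    index=pd.Index(
        ["Action", "Adventure", "Animation", "Biography", "Comedy"], name="genre"
    ),
    columns=pd.MultiIndex.from_product(
        [["star_rating", "duration"], [20170829, 20170830]], names=[None, "Date"]
    ),
)

# Top-level labels in their order of appearance
measures = df2.columns.get_level_values(0).unique()

pieces = []
for measure in measures:
    # Selecting one top-level label leaves a plain frame with the two date columns
    block = df2[measure].copy()
    block["Diff"] = block[20170830] - block[20170829]
    pieces.append(block)

# keys= restores the top level, one key per block, in the original order
result = pd.concat(pieces, axis=1, keys=measures)
print(result)
```

Because nothing is sorted here, the date labels can stay as integers and each Diff column ends up right after its two date columns.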