           star_rating            duration
Date          20170829 20170830   20170829 20170830
genre
Action          1038.1   1038.1    15917.0  16598.0
Adventure        595.0    595.0     9386.0  10113.0
Animation        490.7    490.7     5811.0   5989.0
Biography        596.9    596.9     9661.0  10002.0
Comedy          1211.7   1211.7    16616.0  16786.0
In [86]: df2.columns
Out[86]:
MultiIndex(levels=[['star_rating', 'duration'], [20170829, 20170830]],
           labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
           names=[None, 'Date'])
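Hi everyone, I have the table df2 above, and I want to insert a difference column that is the simple subtraction 20170830 - 20170829. The desired result: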
           star_rating                 duration
Date          20170829 20170830 Diff   20170829 20170830 Diff
genre
Action          1038.1   1038.1    0      15917    16598  681
Adventure          595      595    0       9386    10113  727
Animation        490.7    490.7    0       5811     5989  178
Biography        596.9    596.9    0       9661    10002  341
Comedy          1211.7   1211.7    0      16616    16786  170
If the dates were at the top level, I could simply use df2['diff'] = df2[20170830] - df2[20170829].
I'm new to MultiIndex, so I'd appreciate any ideas to get me started. Thanks in advance.
Answer 0 (score: 0)
Let's try:
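# df is the question's df2; diff within each top-level group, then drop the all-NaN first-date columns
df1 = df.groupby(level=0, axis=1).diff().dropna(axis=1)
# relabel the second level of the diff columns as 'diff'
df1.columns = df1.columns.set_levels(['diff','diff'], level=1)
# cast the date labels to str so they can be sorted together with 'diff'
df.columns = df.columns.set_levels(df.columns.get_level_values(1).astype(str), level=1)
df_out = pd.concat([df, df1], axis=1).sort_index(axis=1)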
Output:
            duration                  star_rating
Date        20170829 20170830   diff     20170829 20170830 diff
genre
Action       15917.0  16598.0  681.0       1038.1   1038.1  0.0
Adventure     9386.0  10113.0  727.0        595.0    595.0  0.0
Animation     5811.0   5989.0  178.0        490.7    490.7  0.0
Biography     9661.0  10002.0  341.0        596.9    596.9  0.0
Comedy       16616.0  16786.0  170.0       1211.7   1211.7  0.0
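
The snippet above relies on `groupby(..., axis=1)`, which is deprecated in recent pandas releases, so here is a minimal alternative sketch that avoids it. It assumes the frame looks exactly like the question's df2 (rebuilt below from the printed values) and that only the two dates shown exist. The names `earlier`, `later`, `order`, and `result` are just illustrative. It keeps the dates as integers and the original column order, which is closer to the layout asked for in the question.

```python
import pandas as pd

# Rebuild the question's df2 from the values printed above.
columns = pd.MultiIndex.from_product(
    [["star_rating", "duration"], [20170829, 20170830]],
    names=[None, "Date"],
)
df2 = pd.DataFrame(
    [
        [1038.1, 1038.1, 15917.0, 16598.0],
        [595.0, 595.0, 9386.0, 10113.0],
        [490.7, 490.7, 5811.0, 5989.0],
        [596.9, 596.9, 9661.0, 10002.0],
        [1211.7, 1211.7, 16616.0, 16786.0],
    ],
    index=pd.Index(
        ["Action", "Adventure", "Animation", "Biography", "Comedy"], name="genre"
    ),
    columns=columns,
)

# Take one cross-section per date; each has star_rating/duration as columns.
earlier = df2.xs(20170829, axis=1, level="Date")
later = df2.xs(20170830, axis=1, level="Date")

# Subtract, then put the result back under a second-level label "Diff".
diff = later - earlier
diff.columns = pd.MultiIndex.from_product(
    [diff.columns, ["Diff"]], names=[None, "Date"]
)

# Append the Diff columns and spell out the column order explicitly,
# which avoids having to sort a level that mixes ints and strings.
order = [
    (metric, label)
    for metric in ["star_rating", "duration"]
    for label in [20170829, 20170830, "Diff"]
]
result = pd.concat([df2, diff], axis=1)[order]
print(result)
```

Individual columns can then be addressed with a tuple key, for example `result[("duration", "Diff")]`.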