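I need some guidance on Python pandas, since it is unfamiliar territory for me as a front-end developer. I am now familiar with the DataFrame concept. I want to find a way to create a new DataFrame by comparing two other DataFrames. What should I be looking for in pandas to do this?
For example, consider df1 as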
Date col1 col2 col3 id
2017-04-14 2482 1 0 a2
2017-04-15 2483 1 0 a3
and df2 as
Date col1 col2 col3 id
2017-04-15 2483 10 20 a3
2017-04-14 2482 11 0 a2
What I want to achieve is a new DataFrame that contains the details of the values that differ:
Date df1_value df2_value diff_col_name val_diff id
2017-04-14 1 11 col2 -10 a2
2017-04-15 1 10 col2 -9 a3
2017-04-15 0 20 col3 -20 a3
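So far I have been able to join the two DataFrames on id with df1.merge(df2, on='id', how='left'), but what should the next step be? How do I compare the differences and create the final DataFrame?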
Answer 0 (score: 0)
Setup
import pandas as pd

df1 = pd.DataFrame({'Date': {0: '2017-04-14', 1: '2017-04-15'},
                    'col1': {0: 2482, 1: 2483},
                    'col2': {0: 1, 1: 1},
                    'col3': {0: 0, 1: 0},
                    'id': {0: 'a2', 1: 'a3'}})
df2 = pd.DataFrame({'Date': {0: '2017-04-15', 1: '2017-04-14'},
                    'col1': {0: 2483, 1: 2482},
                    'col2': {0: 10, 1: 11},
                    'col3': {0: 20, 1: 0},
                    'id': {0: 'a3', 1: 'a2'}})
Solution
# Melt both DataFrames from wide to long format, then merge them on the key columns.
dfm = pd.merge(pd.melt(df1, id_vars=['Date', 'id']),
               pd.melt(df2, id_vars=['Date', 'id']),
               how='outer', on=['Date', 'id', 'variable'])
# Rename the columns (the merge yields Date, id, variable, value_x, value_y).
dfm.columns = ['Date', 'id', 'diff_col_name', 'df1_value', 'df2_value']
# Compute the difference between the two values.
dfm['val_diff'] = dfm.df1_value - dfm.df2_value
# Reorder the columns.
dfm = dfm[['Date', 'df1_value', 'df2_value', 'diff_col_name', 'val_diff', 'id']]
# Keep only the rows where the values differ.
dfm = dfm[dfm.val_diff != 0]
Out[2001]:
Date df1_value df2_value diff_col_name val_diff id
2 2017-04-14 1 11 col2 -10 a2
3 2017-04-15 1 10 col2 -9 a3
5 2017-04-15 0 20 col3 -20 a3
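If you need this comparison more than once, the same melt-and-merge steps could be wrapped in a small helper. The sketch below is only one way to do it: the function name `diff_frames` and its `keys` parameter are made up for illustration, and it assumes both frames share the same key columns and that the remaining columns are numeric. It also sorts the result and resets the index, so the row order does not depend on how the merge orders its keys.

```python
import pandas as pd


def diff_frames(left, right, keys=("Date", "id")):
    """Hypothetical helper: one row per (key, column) cell that differs."""
    keys = list(keys)
    # Wide to long: every non-key column becomes a (diff_col_name, value) row.
    long_left = left.melt(id_vars=keys, var_name="diff_col_name",
                          value_name="df1_value")
    long_right = right.melt(id_vars=keys, var_name="diff_col_name",
                            value_name="df2_value")
    merged = long_left.merge(long_right, how="outer",
                             on=keys + ["diff_col_name"])
    merged["val_diff"] = merged["df1_value"] - merged["df2_value"]
    # Keep only the cells that changed, then sort for a predictable row order.
    changed = merged[merged["val_diff"] != 0]
    changed = changed.sort_values(keys + ["diff_col_name"]).reset_index(drop=True)
    return changed[["Date", "df1_value", "df2_value",
                    "diff_col_name", "val_diff", "id"]]


df1 = pd.DataFrame({"Date": ["2017-04-14", "2017-04-15"],
                    "col1": [2482, 2483], "col2": [1, 1],
                    "col3": [0, 0], "id": ["a2", "a3"]})
df2 = pd.DataFrame({"Date": ["2017-04-15", "2017-04-14"],
                    "col1": [2483, 2482], "col2": [10, 11],
                    "col3": [20, 0], "id": ["a3", "a2"]})

result = diff_frames(df1, df2)
print(result)
```

Note that because the merge is an outer join, a key that exists in only one of the frames produces a NaN in val_diff, and NaN != 0 evaluates to True, so such rows are kept in the result as well.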