I have two DataFrames, df1 and df2.
The first DataFrame contains people's names:
df1
   NAME
0  Paul
1  Jack
2  Anna
3   Tom
4   Eva
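The second DataFrame records, by name, the amount each person received and paid. Some people in it are not in df1, for example Zack, and some people from df1 do not appear in it, for example Tom: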
df2
  Receiver Payer  Amount
0     Paul  Jack     300
1     Anna  Paul     600
2     Anna   Eva     100
3      Eva  Zack     400
I want to create a DataFrame that contains the total amount each person received and paid. So:
df3
   NAME  RECEIVED  PAYED
0  Paul       300    600
1  Jack         0    300
2  Anna       700      0
3   Tom       NaN    NaN
4   Eva       400    100
Answer 0 (score: 3)
Use:
df3 = (df1.join(df2.melt('Amount', value_name='NAME', var_name='type')
                   .groupby(['NAME', 'type'])['Amount']
                   .sum()
                   .unstack(fill_value=0), on='NAME'))
print(df3)
   NAME  Payer  Receiver
0  Paul  600.0     300.0
1  Jack  300.0       0.0
2  Anna    0.0     700.0
3   Tom    NaN       NaN
4   Eva  100.0     400.0
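Explanation:
melt reshapes df2 into long form with one row per person and role (Receiver or Payer), groupby with sum totals the amounts per person and role, unstack turns the roles into columns, and join attaches the result to df1 on NAME. Because join is a left join by default, Tom is kept with NaN and Zack is dropped.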
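The chained expression can also be run one step at a time. The following is a minimal sketch, assuming the input frames are rebuilt from the tables in the question; the intermediate names `long` and `totals` are illustrative and not part of the original answer.

```python
import pandas as pd

# Input frames rebuilt from the tables in the question (assumed layout).
df1 = pd.DataFrame({'NAME': ['Paul', 'Jack', 'Anna', 'Tom', 'Eva']})
df2 = pd.DataFrame({
    'Receiver': ['Paul', 'Anna', 'Anna', 'Eva'],
    'Payer': ['Jack', 'Paul', 'Eva', 'Zack'],
    'Amount': [300, 600, 100, 400],
})

# Step 1: long form, one row per (person, role) pair.
long = df2.melt('Amount', value_name='NAME', var_name='type')

# Step 2: total per person and role, then one column per role.
totals = long.groupby(['NAME', 'type'])['Amount'].sum().unstack(fill_value=0)

# Step 3: left join onto df1, keeping only the names listed in df1.
df3 = df1.join(totals, on='NAME')
print(df3)
```

Splitting the chain this way makes it easier to inspect `long` and `totals` when the result is not what you expect.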
Another solution, using pivot_table:
df3 = (df1.join(df2.melt('Amount', value_name='NAME', var_name='type')
                   .pivot_table(index='NAME',
                                columns='type',
                                values='Amount',
                                aggfunc='sum',
                                fill_value=0), on='NAME'))
print(df3)
   NAME  Payer  Receiver
0  Paul  600.0     300.0
1  Jack  300.0       0.0
2  Anna    0.0     700.0
3   Tom    NaN       NaN
4   Eva  100.0     400.0
If necessary, rename the columns at the end:
df3 = df3.rename(columns={'Receiver': 'RECEIVED', 'Payer': 'PAYED'})
print(df3)
   NAME  PAYED  RECEIVED
0  Paul  600.0     300.0
1  Jack  300.0       0.0
2  Anna    0.0     700.0
3   Tom    NaN       NaN
4   Eva  100.0     400.0
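The renamed result still lists PAYED before RECEIVED, while the question asks for RECEIVED first. A small sketch of reordering the columns by selecting them in the desired order follows; the starting frame here is a hand-built stand-in for the joined result and is an assumption, not output taken from a real run.

```python
import pandas as pd

# Hand-built stand-in for the joined result (assumption for illustration).
df3 = pd.DataFrame({
    'NAME': ['Paul', 'Jack', 'Anna', 'Tom', 'Eva'],
    'Payer': [600.0, 300.0, 0.0, float('nan'), 100.0],
    'Receiver': [300.0, 0.0, 700.0, float('nan'), 400.0],
})

# Rename first, then select the columns in the order the question asks for.
df3 = df3.rename(columns={'Receiver': 'RECEIVED', 'Payer': 'PAYED'})
df3 = df3[['NAME', 'RECEIVED', 'PAYED']]
print(df3)
```

Selecting with a list of column names returns a new DataFrame, so the original column order is not modified in place.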
Details:
print(df2.melt('Amount', value_name='NAME', var_name='type'))
   Amount      type  NAME
0     300  Receiver  Paul
1     600  Receiver  Anna
2     100  Receiver  Anna
3     400  Receiver   Eva
4     300     Payer  Jack
5     600     Payer  Paul
6     100     Payer   Eva
7     400     Payer  Zack
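The next intermediate step, aggregating the melted frame, shows where the zeros come from. This is a minimal sketch assuming df2 as given in the question; `per_role`, `with_nan`, and `totals` are illustrative names.

```python
import pandas as pd

# df2 rebuilt from the table in the question (assumed layout).
df2 = pd.DataFrame({
    'Receiver': ['Paul', 'Anna', 'Anna', 'Eva'],
    'Payer': ['Jack', 'Paul', 'Eva', 'Zack'],
    'Amount': [300, 600, 100, 400],
})

long = df2.melt('Amount', value_name='NAME', var_name='type')
per_role = long.groupby(['NAME', 'type'])['Amount'].sum()

# Without fill_value, a person missing one role gets NaN in that column.
with_nan = per_role.unstack()

# With fill_value=0, the missing role is reported as 0 instead.
totals = per_role.unstack(fill_value=0)
print(totals)
```

Zack still appears in `totals` and is removed only by the left join onto df1. Tom never appears in `totals`, which is why his row ends up as NaN rather than 0 after the join.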