I have 2 DataFrames, transactions and offsets.
offsets:
Contact Account Name
0 TODD HOWARD
1 TODD HOWARD
2 JEFF COX
3 JEFF COX
4 TODD HOWARD
5 JEFF COX
6 MIKE BALDWIN
transactions:
Contact Account Name
0 TODD HOWARD
1 TODD HOWARD
2 JEFF COX
3 JEFF COX
4 TODD HOWARD
5 JEFF COX
6 TODD HOWARD
7 MIKE BALDWIN
8 MIKE BALDWIN
9 JEFF COX
10 JC WHITE
What I want to do: 1) Count how many times each unique value occurs. For this I used:
df1 = offsets.groupby('Contact Account Name').size()
df2 = transactions.groupby('Contact Account Name').size()
I get:
df1:
Contact Account Name
TODD HOWARD 3
JEFF COX 3
MIKE BALDWIN 1
df2:
Contact Account Name
JC WHITE 1
TODD HOWARD 4
JEFF COX 4
MIKE BALDWIN 2
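2) I want to merge the two DataFrames. I tried merge, but it did not work.
3) I want to create another DataFrame and calculate offsets as a percentage of total transactions.
What result do I want to see in the end?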
Contact Account Name Offset Percentage
TODD HOWARD 75
JEFF COX 75
MIKE BALDWIN 50
JC WHITE 100
Thanks!
Answer 0: (score: 1)
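The output of the aggregation is a Series, so you can divide with div (fill_value=1 stands in for a name that is missing from one of the two Series), then multiply by 100 with mul, and finally call reset_index to get a DataFrame: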
df = df1.div(df2, fill_value=1).mul(100).reset_index(name='Offset Percentage')
print (df)
Contact Account Name Offset Percentage
0 JC WHITE 100.0
1 JEFF COX 75.0
2 MIKE BALDWIN 50.0
3 TODD HOWARD 75.0
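For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the two DataFrames contain exactly the rows shown in the question; the variable name result is chosen only for this example.

```python
import pandas as pd

# Sample data copied from the question.
offsets = pd.DataFrame({'Contact Account Name': [
    'TODD HOWARD', 'TODD HOWARD', 'JEFF COX', 'JEFF COX',
    'TODD HOWARD', 'JEFF COX', 'MIKE BALDWIN']})
transactions = pd.DataFrame({'Contact Account Name': [
    'TODD HOWARD', 'TODD HOWARD', 'JEFF COX', 'JEFF COX', 'TODD HOWARD',
    'JEFF COX', 'TODD HOWARD', 'MIKE BALDWIN', 'MIKE BALDWIN', 'JEFF COX',
    'JC WHITE']})

# Count rows per name; both results are Series indexed by name.
df1 = offsets.groupby('Contact Account Name').size()
df2 = transactions.groupby('Contact Account Name').size()

# JC WHITE is missing from df1, so fill_value=1 substitutes 1 before dividing.
result = (df1.div(df2, fill_value=1)
             .mul(100)
             .reset_index(name='Offset Percentage'))
print(result)
```

Note that fill_value=1 is what produces 100 for JC WHITE, as requested in the question, even though that name has no rows in offsets.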
A similar solution with value_counts:
df1 = offsets['Contact Account Name'].value_counts()
df2 = transactions['Contact Account Name'].value_counts()
df = (df1.div(df2, fill_value=1)
.mul(100)
.rename_axis('Contact Account Name')
.reset_index(name='Offset Percentage'))
print (df)
Contact Account Name Offset Percentage
0 JC WHITE 100.0
1 JEFF COX 75.0
2 MIKE BALDWIN 50.0
3 TODD HOWARD 75.0
If you need to join the two Series together, use concat:
df = pd.concat([df2, df1], axis=1, keys=('Offset Percentage','b'))
df['Offset Percentage'] = df.b.div(df['Offset Percentage'], fill_value=1).mul(100)
# drop('b', 1) fails on pandas 2.0+, where the axis can no longer be passed positionally
df = df.drop(columns='b').rename_axis('Contact Account Name').reset_index()
print (df)
Contact Account Name Offset Percentage
0 JC WHITE 100.0
1 JEFF COX 75.0
2 MIKE BALDWIN 50.0
3 TODD HOWARD 75.0
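Since the question mentions that merge did not work, here is a rough sketch of the same calculation done with merge. It assumes the counts are held in two-column DataFrames rather than Series; the column names offsets and transactions and the variables merged and out are made up for this example. A likely reason a plain merge appears to fail is that the default how='inner' drops JC WHITE, which exists in only one of the two frames.

```python
import pandas as pd

# Counts from the question, rebuilt by hand as two-column DataFrames.
# The column names 'offsets' and 'transactions' are chosen for this sketch.
df1 = pd.DataFrame({
    'Contact Account Name': ['TODD HOWARD', 'JEFF COX', 'MIKE BALDWIN'],
    'offsets': [3, 3, 1]})
df2 = pd.DataFrame({
    'Contact Account Name': ['JC WHITE', 'TODD HOWARD', 'JEFF COX', 'MIKE BALDWIN'],
    'transactions': [1, 4, 4, 2]})

# how='outer' keeps JC WHITE, which has no row in df1.
merged = df2.merge(df1, on='Contact Account Name', how='outer')

# Mirror fill_value=1 from the answer: a missing offsets count is treated as 1.
merged['Offset Percentage'] = (merged['offsets'].fillna(1)
                               .div(merged['transactions'])
                               .mul(100))
out = merged[['Contact Account Name', 'Offset Percentage']]
print(out)
```

If you start from the Series produced by groupby(...).size(), calling reset_index(name='offsets') on it gives a frame of this shape.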