I am using Python. I have the following code:
import pandas as pd
df = pd.DataFrame({"Function": ["Agent", "Seller", "Agent", "Director", "Agent", "Seller", "Seller", "Seller"],
                   "Rating": [1, 2, 1, 3, 7, 7, 3, 1]},
                  index=["John", "Mathew", "Martin", "Clain", "McGregor", "Clause", "Bob", "Viktor"])
which produces the following DataFrame:
Name Function Rating
John Agent 1
Mathew Seller 2
Martin Agent 1
Clain Director 3
McGregor Agent 7
Clause Seller 7
Bob Seller 3
Viktor Seller 1
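I want to group the DataFrame by Rating and, at the same time, create additional columns showing the count and the percentage of each Function (Agent, Seller, Director) within each rating. The expected result looks like this: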
Rating Agents Seller Director Agent Seller Director
1 2 0 0 100% 0% 0%
2 0 1 0 0% 100% 0%
3 0 1 1 0% 50% 50%
7 1 1 0 50% 50% 0%
Thank you very much for your help. Cheers.
Answer 0 (score: 2)
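First use crosstab to count each Function per Rating, then divide by the row sum (sum with axis=1) to get a new DataFrame, multiply by 100, and call add_suffix to prevent duplicate column names. Finally, combine the two DataFrames with join: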
df1 = pd.crosstab(df['Rating'], df['Function'])
df2 = df1.div(df1.sum(axis=1), 0).mul(100).add_suffix('%').round(2)
df = df1.join(df2).reset_index().rename_axis(None, axis=1)
print (df)
Rating Agent Director Seller Agent% Director% Seller%
0 1 2 0 1 66.67 0.0 33.33
1 2 0 0 1 0.00 0.0 100.00
2 3 0 1 1 0.00 50.0 50.00
3 7 1 0 1 50.00 0.0 50.00
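To see what each link in that chain contributes, it can be split into intermediate steps. This is only a sketch that rebuilds the question's data; the names counts, row_totals and shares are mine, not part of the answer above.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame(
    {"Function": ["Agent", "Seller", "Agent", "Director",
                  "Agent", "Seller", "Seller", "Seller"],
     "Rating": [1, 2, 1, 3, 7, 7, 3, 1]},
    index=["John", "Mathew", "Martin", "Clain",
           "McGregor", "Clause", "Bob", "Viktor"],
)

# Step 1: count how often each Function occurs for each Rating.
counts = pd.crosstab(df["Rating"], df["Function"])

# Step 2: total number of people per Rating (one value per row).
row_totals = counts.sum(axis=1)

# Step 3: divide every row by its own total; axis=0 aligns row_totals
# with the row index (Rating) rather than with the column labels.
shares = counts.div(row_totals, axis=0).mul(100).round(2)

print(counts)
print(row_totals)
print(shares)
```

The axis=0 argument (written positionally as 0 in the answer) is what makes the division row-wise; without it, pandas would try to align the totals with the column names Agent, Director and Seller.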
If you want the percentages as strings with a % sign:
df2 = df1.div(df1.sum(axis=1), 0).mul(100).add_suffix('%').round(2).astype(str).add('%')
df = df1.join(df2).reset_index().rename_axis(None, axis=1)
print (df)
Rating Agent Director Seller Agent% Director% Seller%
0 1 2 0 1 66.67% 0.0% 33.33%
1 2 0 0 1 0.0% 0.0% 100.0%
2 3 0 1 1 0.0% 50.0% 50.0%
3 7 1 0 1 50.0% 0.0% 50.0%
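As a possible variation, not part of the answer above, crosstab can compute the row shares itself through its normalize parameter, and a format string gives every percentage the same number of decimals (astype(str) yields 0.0% next to 66.67%). The names counts, pct, pct_str and result are chosen here for illustration.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame(
    {"Function": ["Agent", "Seller", "Agent", "Director",
                  "Agent", "Seller", "Seller", "Seller"],
     "Rating": [1, 2, 1, 3, 7, 7, 3, 1]},
    index=["John", "Mathew", "Martin", "Clain",
           "McGregor", "Clause", "Bob", "Viktor"],
)

# Plain counts per Rating and Function.
counts = pd.crosstab(df["Rating"], df["Function"])

# normalize="index" divides each row by its row total, giving fractions.
pct = pd.crosstab(df["Rating"], df["Function"], normalize="index").mul(100)

# Format every value with two decimals and a trailing % sign,
# then rename the columns so they do not clash with the counts.
pct_str = pct.apply(lambda col: col.map("{:.2f}%".format)).add_suffix("%")

result = counts.join(pct_str).reset_index().rename_axis(None, axis=1)
print(result)
```

The trade-off is the same as in the answer: once the percentages are strings they can no longer be sorted or summed numerically, so keep the numeric pct around if further calculations are needed.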