Operations based on DataFrame columns

Time: 2019-02-09 19:21:58

Tags: python pandas

I am using Python and have the following code:

import pandas as pd
df = pd.DataFrame({"Function": ["Agent","Seller","Agent","Director","Agent","Seller","Seller","Seller"],
"Rating": [1,2,1,3,7,7,3,1]}, index=["John","Mathew","Martin","Clain","McGregor","Clause","Bob","Viktor"])

which produces the following DataFrame:

Name       Function  Rating
      John     Agent          1
      Mathew   Seller         2
      Martin   Agent          1
      Clain    Director       3
      McGregor Agent          7
      Clause   Seller         7
      Bob      Seller         3
      Viktor   Seller         1

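I want to group the DataFrame by Rating and, at the same time, create additional columns showing the count and the percentage of each Function (Agent, Seller, Director) within each Rating. The expected result looks like this: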

  Rating    Agents  Seller  Director    Agent   Seller  Director
    1          2       0       0          100%    0%       0%
    2          0       1       0          0%      100%     0%
    3          0       1       1          0%      50%      50%
    7          1       1       0          50%     50%      0%

Thank you very much for your help. Cheers.

1 Answer:

Answer 0: (score: 2)

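Use crosstab first, then divide by the row sums (sum) to get a new DataFrame, multiply by 100, apply add_suffix to avoid duplicate column names, and finally join the two together: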

df1 = pd.crosstab(df['Rating'], df['Function'])

df2 = df1.div(df1.sum(axis=1), 0).mul(100).add_suffix('%').round(2)

df = df1.join(df2).reset_index().rename_axis(None, axis=1)
print (df)
   Rating  Agent  Director  Seller  Agent%  Director%  Seller%
0       1      2         0       1   66.67        0.0    33.33
1       2      0         0       1    0.00        0.0   100.00
2       3      0         1       1    0.00       50.0    50.00
3       7      1         0       1   50.00        0.0    50.00
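
As a possible alternative, `pd.crosstab` accepts a `normalize` argument. With `normalize='index'` each row is divided by its own total, which may replace the explicit `div`/`sum` step. A minimal self-contained sketch reusing the question's data (the names `counts`, `pct` and `result` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {"Function": ["Agent", "Seller", "Agent", "Director",
                  "Agent", "Seller", "Seller", "Seller"],
     "Rating": [1, 2, 1, 3, 7, 7, 3, 1]},
    index=["John", "Mathew", "Martin", "Clain",
           "McGregor", "Clause", "Bob", "Viktor"],
)

# Raw counts of each Function per Rating
counts = pd.crosstab(df["Rating"], df["Function"])

# Row-wise shares: normalize="index" divides each row by its row total
pct = (pd.crosstab(df["Rating"], df["Function"], normalize="index")
       .mul(100)
       .round(2)
       .add_suffix("%"))

# Combine counts and percentages into one flat table
result = counts.join(pct).reset_index().rename_axis(None, axis=1)
print(result)
```

This should yield the same columns as the `div`-based version; which form reads better is largely a matter of taste.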

If you want strings with a % sign:

df2 = df1.div(df1.sum(axis=1), 0).mul(100).add_suffix('%').round(2).astype(str).add('%')

df = df1.join(df2).reset_index().rename_axis(None, axis=1)
print (df)

   Rating  Agent  Director  Seller  Agent% Director% Seller%
0       1      2         0       1  66.67%      0.0%  33.33%
1       2      0         0       1    0.0%      0.0%  100.0%
2       3      0         1       1    0.0%     50.0%   50.0%
3       7      1         0       1   50.0%      0.0%   50.0%
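
Note that `astype(str)` keeps whatever decimals each float happens to have, so the strings above mix `66.67%` with `0.0%`. If a uniform number of decimals is preferred, one option is to format each value explicitly. A small sketch under that assumption (the names `counts`, `shares`, `pct_str` and `result` are illustrative, not from the original answer):

```python
import pandas as pd

df = pd.DataFrame(
    {"Function": ["Agent", "Seller", "Agent", "Director",
                  "Agent", "Seller", "Seller", "Seller"],
     "Rating": [1, 2, 1, 3, 7, 7, 3, 1]},
    index=["John", "Mathew", "Martin", "Clain",
           "McGregor", "Clause", "Bob", "Viktor"],
)

counts = pd.crosstab(df["Rating"], df["Function"])

# Percentage of each Function within a Rating (row-wise)
shares = counts.div(counts.sum(axis=1), axis=0).mul(100)

# Format every cell with exactly two decimals and a trailing percent sign
pct_str = shares.apply(lambda col: col.map("{:.2f}%".format)).add_suffix("%")

result = counts.join(pct_str).reset_index().rename_axis(None, axis=1)
print(result)
```

The percentage columns then hold strings, so they are suitable for display but not for further arithmetic.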