I'm new to Python programming. I have a pandas DataFrame with two string columns.
The DataFrame looks like this:
Case    Action
Create  Create New Account
        Create New Account
        Create New Account
        Create New Account
        Create Old Account
Delete  Delete New Account
        Delete New Account
        Delete Old Account
        Delete Old Account
        Delete Old Account
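Here we can see 5 actions under the Create case, 4 of which are Create New Account, so its share is 4/5 (= 80%). Similarly, for the Delete case the most frequent action is Delete Old Account, with 3 of 5 (= 60%). So my requirement is: the next time a Create case comes in, I should get Create New Account as the output, together with its frequency score.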
Expected output:
Case    Action              Score
Create  Create New Account  80
Delete  Delete Old Account  60
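To make the arithmetic above reproducible, here is a minimal sketch that rebuilds the sample data and computes each Action's share within its Case. It assumes the blank Case cells in the table repeat the value above them; the variable names `df` and `share` are only illustrative.

```python
import pandas as pd

# Sample data from the question.
# Assumption: a blank Case cell means "same Case as the row above".
df = pd.DataFrame({
    "Case": ["Create"] * 5 + ["Delete"] * 5,
    "Action": ["Create New Account"] * 4 + ["Create Old Account"]
              + ["Delete New Account"] * 2 + ["Delete Old Account"] * 3,
})

# Share of each Action within its Case, expressed as a percentage.
share = df.groupby("Case")["Action"].value_counts(normalize=True).mul(100)
print(share)
```

The result is a Series indexed by (Case, Action), so the highest value within each Case is the score being asked for.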
Answer 0 (score: 1)
Use crosstab, followed by groupby and tail:
pd.crosstab(df.Case,df.Action,normalize='index').stack().sort_values().groupby(level=0).tail(1)
Out[769]:
Case    Action
Delete  DeleteOldAccount    0.6
Create  CreateNewAccount    0.8
dtype: float64
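The one-liner above can be read as three steps. The sketch below spells them out on the question's sample data, assuming blank Case cells repeat the value above; the intermediate names `freq`, `stacked` and `top` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data, assuming blank Case cells repeat the value above.
df = pd.DataFrame({
    "Case": ["Create"] * 5 + ["Delete"] * 5,
    "Action": ["Create New Account"] * 4 + ["Create Old Account"]
              + ["Delete New Account"] * 2 + ["Delete Old Account"] * 3,
})

# Step 1: contingency table normalised per row, so each Case sums to 1.
freq = pd.crosstab(df["Case"], df["Action"], normalize="index")

# Step 2: reshape to long form, one value per (Case, Action) pair.
stacked = freq.stack()

# Step 3: sort ascending so each Case's largest share comes last,
# then keep the last row of every Case group.
top = stacked.sort_values().groupby(level=0).tail(1)
print(top)
```

Because the values are sorted globally before grouping, the rows come back ordered by score rather than by Case.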
Or use where:
pdf=pd.crosstab(df.Case,df.Action,normalize='index')
pdf.where(pdf.eq(pdf.max(1),axis=0)).stack()
Out[781]:
Case    Action
Create  CreateNewAccount    0.8
Delete  DeleteOldAccount    0.6
dtype: float64
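Both results above are a Series of fractions. To get the three-column table with an integer Score that the question asks for, one possible sketch combines the same crosstab with idxmax. This is a different technique from the where approach, and the names `freq` and `result` are illustrative; the sample data again assumes blank Case cells repeat the value above.

```python
import pandas as pd

# Sample data, assuming blank Case cells repeat the value above.
df = pd.DataFrame({
    "Case": ["Create"] * 5 + ["Delete"] * 5,
    "Action": ["Create New Account"] * 4 + ["Create Old Account"]
              + ["Delete New Account"] * 2 + ["Delete Old Account"] * 3,
})

# Row-normalised frequencies: one row per Case, one column per Action.
freq = pd.crosstab(df["Case"], df["Action"], normalize="index")

# For each Case, take the most frequent Action and its share as a percentage.
result = pd.DataFrame({
    "Action": freq.idxmax(axis=1),
    "Score": freq.max(axis=1).mul(100).round().astype(int),
}).reset_index()
print(result)

# Look up the suggestion for a new incoming Case.
lookup = result.set_index("Case")
print(lookup.loc["Create", "Action"], lookup.loc["Create", "Score"])
```

Note that idxmax returns only the first Action when two are tied for the maximum, whereas the where approach keeps every tied Action, so the choice depends on how ties should be handled.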