Question

My DataFrame is:

id       score
1          50
2          88
3          44
4          77
5          93

I want the DataFrame to look like this:

id       score      is_good
1          50        low
2          88        high
3          44        low
4          77        medium
5          93        high

I have written the following code:

def selector(row):
    if row['score'] >= 0 and row['score'] <= 50 :
        return "low"
    elif row['score'] > 50 and row['score'] <=80 :
        return "medium"
    else:
        return "high"

x['is_good'] = x.apply(lambda row : selector(x), axis=1)

I think the logic is fine, but the code does not work. Maybe we could use the map function.

Answer 1

This is a good use case for pd.cut:

import numpy as np
import pandas as pd

df['is_good'] = pd.cut(df.score, 
                       [-np.inf,50,80,np.inf], 
                       labels=['low','medium','high'])

print(df)
   id  score is_good
0   1     50     low
1   2     88    high
2   3     44     low
3   4     77  medium
4   5     93    high
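
As a small sketch of why the score of 50 lands in "low" here: pd.cut closes each bin on the right by default (right=True), so the bins are (-inf, 50], (50, 80] and (80, inf). The DataFrame below is rebuilt from the question's data; the names bins, labels, right_closed and left_closed are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"id": [1, 2, 3, 4, 5], "score": [50, 88, 44, 77, 93]})

bins = [-np.inf, 50, 80, np.inf]
labels = ["low", "medium", "high"]

# Default right=True: bins are (-inf, 50], (50, 80], (80, inf),
# so a score of exactly 50 is labelled "low".
right_closed = pd.cut(df["score"], bins, labels=labels)

# right=False: bins are [-inf, 50), [50, 80), [80, inf),
# so a score of exactly 50 moves into "medium".
left_closed = pd.cut(df["score"], bins, labels=labels, right=False)

print(right_closed.tolist())
print(left_closed.tolist())
```

The default matches the question's selector, which uses <= 50 and <= 80 as the upper bounds.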

Answer 2

You can use np.where together with Series.between:

import numpy as np

df['is_good'] = (
    np.where(df.score.between(0, 50), "low",
             np.where(df.score.between(51, 80), "medium", "high"))
)

   id  score is_good
0   1     50     low
1   2     88    high
2   3     44     low
3   4     77  medium
4   5     93    high
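
One caveat worth sketching: Series.between is inclusive on both ends by default, so between(0, 50) followed by between(51, 80) leaves a gap for non-integer scores such as 50.5, which fall through to "high". The fractional scores below are hypothetical (the question only has integers), and np.select is shown as one possible alternative that mirrors the question's selector conditions, not as the method this answer proposes.

```python
import numpy as np
import pandas as pd

# Hypothetical fractional scores, not taken from the question.
scores = pd.DataFrame({"score": [50.0, 50.5, 80.0, 80.5]})
s = scores["score"]

# Nested np.where as in the answer: 50.5 matches neither range
# and ends up in the final "high" branch.
nested = np.where(s.between(0, 50), "low",
                  np.where(s.between(51, 80), "medium", "high"))

# Conditions written like the question's selector (> 50 and <= 80),
# so no value between 50 and 51 is skipped. The first matching
# condition wins; anything unmatched gets the default.
conditions = [s.between(0, 50), (s > 50) & (s <= 80)]
selected = np.select(conditions, ["low", "medium"], default="high")

print(nested.tolist())
print(selected.tolist())
```

For the integer scores in the question, both versions give the same labels.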

Answer 3

Your code has an error in this line:

x['is_good'] = x.apply(lambda row : selector(x), axis=1)

It should be:

x['is_good'] = x.apply(lambda row : selector(row), axis=1)

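selector(x) passes the whole DataFrame, so x['score'] is a Series rather than a single row's value; that is why you get the error.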
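
To make the difference concrete, here is a minimal runnable sketch. It rebuilds the question's data, reproduces the failure (a ValueError, because a whole Series cannot be used as a single True/False in an if statement), and then applies the fix. The selector below is the question's function with the comparisons written as chained ranges; error_message is just an illustrative name.

```python
import pandas as pd


def selector(row):
    # Same thresholds as the question's selector.
    if 0 <= row["score"] <= 50:
        return "low"
    elif 50 < row["score"] <= 80:
        return "medium"
    else:
        return "high"


# Sample data copied from the question.
x = pd.DataFrame({"id": [1, 2, 3, 4, 5], "score": [50, 88, 44, 77, 93]})

# Broken call: the lambda ignores `row` and passes the whole DataFrame,
# so the comparison inside selector yields a Series, not a single bool.
error_message = None
try:
    x.apply(lambda row: selector(x), axis=1)
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Fixed call: pass each row; the lambda wrapper is not needed.
x["is_good"] = x.apply(selector, axis=1)
print(x)
```

Passing selector directly is equivalent to lambda row: selector(row).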

Implementing multiple if-else conditions in a pandas DataFrame
