I am reading a blog post about computing a new column based on a condition, in which a new column 'category' is inserted.
import numpy as np
import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'age': [42, 52, 36, 24, 73],
        'preTestScore': [4, 24, 31, 2, 3],
        'postTestScore': [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, columns=['name', 'age', 'preTestScore', 'postTestScore'])
# Single condition: label each row 'yes' if age >= 50, otherwise 'no'
df['category'] = np.where(df['age'] >= 50, 'yes', 'no')
How can I extend this to multiple conditions? For example: if age is less than 20, then kid; if between 21 and 40, then young; if over 40, then old.
Answer 0 (score: 5)
For multiple conditions, you can simply use numpy.select instead of numpy.where.
import numpy as np
cond = [df['age'] < 20, df['age'].between(20, 39), df['age'] >= 40]
choice = ['kid', 'young', 'old']
df['category'] = np.select(cond, choice)
#     name  age  preTestScore  postTestScore category
# 0  Jason   42             4             25      old
# 1  Molly   52            24             94      old
# 2   Tina   36            31             57    young
# 3   Jake   24             2             62    young
# 4    Amy   73             3             70      old
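A minimal, self-contained sketch of the same approach, assuming the sample ages from the question. As far as I know, numpy.select checks the conditions in order and takes the first one that is True; rows that match no condition receive the `default` value, which is 0 unless specified. With string labels it seems safer to pass a string default explicitly, since newer NumPy releases may refuse to mix an integer default with string choices. The 'unknown' label is a hypothetical placeholder, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
    'age': [42, 52, 36, 24, 73],
})

# Conditions are evaluated in order; the first True condition decides the label.
cond = [df['age'] < 20, df['age'].between(20, 39), df['age'] >= 40]
choice = ['kid', 'young', 'old']

# 'unknown' is a hypothetical fallback for rows matching no condition
# (for example, a missing age).
df['category'] = np.select(cond, choice, default='unknown')
print(df)
```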
Answer 1 (score: 1)
You can use pd.cut
(BTW, 40 is not old :-( )
pd.cut(df.age,bins=[0,20,39,np.inf],labels=['kid','young','old'])
Out[179]:
0      old
1      old
2    young
3    young
4      old
Name: age, dtype: category
Categories (3, object): [kid < young < old]
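One caveat worth sketching: pd.cut uses right-closed bins by default, so with bins=[0, 20, 39, np.inf] an age of exactly 20 would land in 'kid'. If the intent is "under 20", "20 to 39", and "40 and over" (the thresholds used in the numpy.select answer), one option is right=False with a boundary at 40. The two extra ages (20 and 40) below are hypothetical boundary cases added only to show the edge behaviour.

```python
import numpy as np
import pandas as pd

# Ages from the question, plus 20 and 40 as hypothetical boundary cases.
ages = pd.Series([42, 52, 36, 24, 73, 20, 40], name='age')

# right=False makes each bin left-closed: [0, 20), [20, 40), [40, inf)
category = pd.cut(ages, bins=[0, 20, 40, np.inf],
                  labels=['kid', 'young', 'old'], right=False)
print(category)
```

The result is an ordered categorical, so it can be assigned directly with df['category'] = ... or converted to plain strings with .astype(str) if ordering is not needed.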