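For example, I want to set every value in the 'ModelPrediction' column to 1 where the 'AgeGrp' column equals (0, 5], the 'Sex' column equals male, and the 'Pclass' column equals 1 or 2.
I have already changed the dtype of the AgeGrp and Pclass columns to object.
My attempt is below: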
train.loc[train['Sex'] == 'male' & ['Pclass'] == 1 & ['Pclass'] == 2 & ['AgeGrp'] == (0, 5], 'ModelPrediction'] = 1
I'm new to all things Python/pandas, so any help is appreciated. Thanks!
Answer 0 (score: 1)
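I think you need to wrap each condition in parentheses () and compare AgeGrp against an Interval. Pclass also appears in two conditions; if you want to match either value, I think isin is what you need here: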
import pandas as pd

train = pd.DataFrame({'Sex': ['male', 'female', 'male'],
                      'Pclass': [1, 0, 1],
                      'AgeGrp': [pd.Interval(0, 5, closed='right'),
                                 pd.Interval(6, 10, closed='right'),
                                 pd.Interval(0, 5, closed='right')],
                      'ModelPrediction': [0, 1, 0]})
print(train)
      Sex  Pclass   AgeGrp  ModelPrediction
0    male       1   (0, 5]                0
1  female       0  (6, 10]                1
2    male       1   (0, 5]                0
train.loc[(train['Sex'] == 'male') &
          (train['Pclass'].isin([1, 2])) &
          (train['AgeGrp'] == pd.Interval(0, 5, closed='right')), 'ModelPrediction'] = 1
print(train)
      Sex  Pclass   AgeGrp  ModelPrediction
0    male       1   (0, 5]                1
1  female       0  (6, 10]                1
2    male       1   (0, 5]                1
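The parentheses are needed because of operator precedence: in Python, & binds more tightly than ==, so an unparenthesized condition is grouped differently than intended. Here is a minimal sketch of the difference, using a hypothetical two-row frame named demo that is not part of the question's data:

```python
import pandas as pd

# Hypothetical two-row frame, used only to illustrate operator precedence.
demo = pd.DataFrame({'Sex': ['male', 'female'], 'Pclass': [1, 2]})

# Unparenthesized, as in the question: Python evaluates 'male' & ['Pclass']
# before any comparison, and str & list is not a supported operation.
try:
    demo['Sex'] == 'male' & ['Pclass'] == 1
    unparenthesized_error = None
except TypeError as exc:
    unparenthesized_error = type(exc).__name__

# Parenthesized: each comparison yields a boolean Series first,
# and & then combines those Series element by element.
mask = (demo['Sex'] == 'male') & (demo['Pclass'].isin([1, 2]))

print(unparenthesized_error)
print(mask.tolist())
```

Note also that a bare ['Pclass'] is just a Python list containing a string; the column has to be selected from the frame, as in demo['Pclass'].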
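Answer 1 (score: 1)
You were very close, but one of your conditions requires Pclass to be both 1 and 2, which is impossible; the interval syntax (0, 5] does not exist in Python; and each condition needs to be wrapped in parentheses: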
train.loc[(train['Sex'] == 'male') &
          ((train['Pclass'] == 1) | (train['Pclass'] == 2)) &
          (train['AgeGrp'] > 0) &
          (train['AgeGrp'] <= 5), 'ModelPrediction'] = 1
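The > 0 and <= 5 comparisons above apply to plain numbers, so this form presumably suits a column that stores numeric ages rather than interval labels. As a rough sketch under that assumption, here is the same selection on a made-up frame named people with a hypothetical numeric Age column, with isin standing in for the two Pclass comparisons joined by |:

```python
import pandas as pd

# Hypothetical data: 'Age' is an assumed numeric column, not from the question.
people = pd.DataFrame({'Sex': ['male', 'female', 'male', 'male'],
                       'Pclass': [1, 2, 3, 2],
                       'Age': [3, 4, 2, 30],
                       'ModelPrediction': [0, 0, 0, 0]})

# Build the boolean mask first; every comparison sits in its own parentheses
# (the isin call is already a single expression, so it needs none).
mask = ((people['Sex'] == 'male')
        & people['Pclass'].isin([1, 2])
        & (people['Age'] > 0)
        & (people['Age'] <= 5))

# Assign 1 only to the rows where all conditions hold.
people.loc[mask, 'ModelPrediction'] = 1
print(people['ModelPrediction'].tolist())
```

Storing the mask in a variable before passing it to loc keeps the assignment line short and makes the conditions easier to inspect one at a time.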