This is my current DataFrame:
sports_gpa music_gpa Activity Sport
2 3 nan nan
0 2 nan nan
3 3.5 nan nan
2 1 nan nan
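I have the following condition:
If 'sports_gpa' is greater than 0 and 'music_gpa' is greater than 'sports_gpa', fill the 'Activity' column with the value of 'sports_gpa' and the 'Sport' column with the string 'basketball'.
Expected output: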
sports_gpa music_gpa Activity Sport
2 3 2 basketball
0 2 nan nan
3 3.5 3 basketball
2 1 nan nan
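To do this, I am using the following statement:
df['Activity'], df['Sport'] = np.where((df['sports_gpa'] > 0) & (df['music_gpa'] > df['sports_gpa']), (df['sports_gpa'], 'basketball'), (df['Activity'], df['Sport']))
This of course raises an error: operands could not be broadcast together with shapes.
To work around it, I can add a helper column to the DataFrame.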
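For reference, the shape mismatch can be reproduced without any DataFrame. This is only a rough sketch: it assumes NumPy sees the tuple (Series, 'basketball') as a two-element sequence of shape (2,), while the tuple of two Series stacks into shape (2, 4). Newer NumPy versions may reject the ragged tuple with a different error before broadcasting is even attempted. The variable names are made up for illustration.

```python
import numpy as np

n_rows = 4
cond_shape = (n_rows,)      # boolean condition: one value per row
mixed_shape = (2,)          # assumed shape of (Series, 'basketball')
pair_shape = (2, n_rows)    # (Series, Series) stacked as 2 rows of 4 values

# Trailing axes are compared first: 4 vs 2 vs 4 cannot be reconciled.
try:
    np.broadcast_shapes(cond_shape, mixed_shape, pair_shape)
    compatible = True
except ValueError:
    compatible = False

# A bare scalar has shape () and broadcasts against a single column without trouble.
per_column = np.broadcast_shapes(cond_shape, (), (n_rows,))
print(compatible, per_column)
```

So the scalar itself is not the problem; bundling it in a tuple next to a full-length column is.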
df.loc[:,'str'] = 'basketball'
df['Activity'], df['Sport'] = np.where((df['sports_gpa'] > 0) & (df['music_gpa'] > df['sports_gpa']), (df['sports_gpa'], df['str']), (df['Activity'], df['Sport']))
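This gives me the expected output.
I would like to know whether there is a way to avoid this error without creating a new column, so that the string 'basketball' can be written to the 'Sport' column directly in the np.where statement.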
Answer 0 (score: 0)
Use np.where + Series.fillna:
where = df['sports_gpa'].gt(0) & (df['sports_gpa'] < df['music_gpa'])
df['Activity'], df['Sport'] = np.where(where, (df['sports_gpa'],df['Sport'].fillna('basketball')), (df['Activity'], df['Sport']))
You can also use Series.where + Series.mask:
df['Activity']=df['sports_gpa'].where(where)
df['Sport']=df['Sport'].mask(where,'basketball')
print(df)
sports_gpa music_gpa Activity Sport
0 2 3.0 2.0 basketball
1 0 2.0 NaN NaN
2 3 3.5 3.0 basketball
3 2 1.0 NaN NaN
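For anyone who wants to run the Series.where + Series.mask variant end to end, here is a minimal self-contained sketch. The DataFrame construction is my own reconstruction of the sample data from the question, and I assume the 'Sport' column has object dtype so that it can hold strings. The mask is named cond here rather than where.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; 'Sport' is created as object dtype (assumption).
df = pd.DataFrame({
    "sports_gpa": [2, 0, 3, 2],
    "music_gpa": [3, 2, 3.5, 1],
    "Activity": [np.nan] * 4,
    "Sport": pd.Series([np.nan] * 4, dtype=object),
})

cond = df["sports_gpa"].gt(0) & df["music_gpa"].gt(df["sports_gpa"])

# Series.where keeps values where cond is True and puts NaN elsewhere.
df["Activity"] = df["sports_gpa"].where(cond)
# Series.mask is the inverse: it replaces values where cond is True.
df["Sport"] = df["Sport"].mask(cond, "basketball")

print(df)
```

Note that the scalar 'basketball' is passed directly here, so no helper column is needed.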
Answer 1 (score: 0)
Just figured out that I can do:
df['Activity'], df['Sport'] = np.where(((df['sports_gpa'] > 0) & (df['music_gpa'] > df['sports_gpa'])), (df['sports_gpa'],df['Sport'].astype(str).replace({"nan": "basketball"})), (df['Activity'], df['Sport']))
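One caveat with the line above: because only the "nan" strings are replaced, a row that already has a value in 'Sport' keeps that value even when the condition holds. If it is acceptable to give up the single tuple assignment, a possible alternative is one np.where call per column, where a bare scalar broadcasts without problems. This is a sketch under my own assumptions: the sample data is rebuilt from the question, and 'Sport' is assumed to be object dtype so that NaN is not turned into the string 'nan' when mixed with text.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; 'Sport' is object dtype (assumption).
df = pd.DataFrame({
    "sports_gpa": [2, 0, 3, 2],
    "music_gpa": [3.0, 2.0, 3.5, 1.0],
    "Activity": [np.nan] * 4,
    "Sport": pd.Series([np.nan] * 4, dtype=object),
})

cond = (df["sports_gpa"] > 0) & (df["music_gpa"] > df["sports_gpa"])

# One call per column: each argument is either a scalar or a length-4 column.
df["Activity"] = np.where(cond, df["sports_gpa"], df["Activity"])
df["Sport"] = np.where(cond, "basketball", df["Sport"])

print(df)
```

Unlike the one-liner, this overwrites 'Sport' with 'basketball' on every matching row, whether or not a value was already there.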