I am trying to take a pandas DataFrame and return a new DataFrame with an added column 'Size_Category', whose value is 'small', 'medium' or 'large' depending on some conditions.
mod_df = df.copy(deep=True)
mod_df.loc[(mod_df['Length'] <= 300 , 'Size_Category')] = 'small' # condition, new_column
mod_df.loc[(mod_df['Length'] <= 300 | mod_df['Length'] > 450, 'Size_Category')] = 'medium' # condition, new_column
mod_df.loc[(mod_df['Length'] >= 450, 'Size_Category')] = 'large' # condition, new_column
When I do this, it gives me an error saying
The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can I handle this?
Answer 0 (score: 2)
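You are missing parentheses around each comparison. The | operator binds more tightly than <= and >, so Python evaluates 300 | mod_df['Length'] first and then chains the two comparisons, which forces a whole Series into a single True/False and raises the error. Wrap each condition in ():
mod_df.loc[(mod_df['Length'] <= 300) | (mod_df['Length'] > 450), 'Size_Category'] = 'medium'
Note that for 'medium' you probably want & with > 300 and < 450 rather than |, otherwise this line overwrites the 'small' rows.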
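To put the parenthesised conditions in context, here is a minimal sketch of the whole function. It assumes the intended ranges are small for Length <= 300, medium for 300 < Length < 450, and large for Length >= 450 (the question does not state the medium range explicitly), and the helper name add_size_category is made up for illustration.

```python
import pandas as pd


def add_size_category(df):
    """Return a copy of df with a 'Size_Category' column (hypothetical helper)."""
    mod_df = df.copy(deep=True)
    length = mod_df['Length']

    # Each comparison is wrapped in () before combining with & or |.
    mod_df.loc[length <= 300, 'Size_Category'] = 'small'
    # Assumed medium range: strictly between 300 and 450.
    mod_df.loc[(length > 300) & (length < 450), 'Size_Category'] = 'medium'
    mod_df.loc[length >= 450, 'Size_Category'] = 'large'
    return mod_df


df = pd.DataFrame({'Length': [0, 10, 300, 400, 449, 450, 500]})
result = add_size_category(df)
print(result)
```

Because the function works on a deep copy, the original df is left without the new column.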
Another solution is to use pd.cut, which assigns each value to a labelled bin (the 449 edge below assumes integer lengths):
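import numpy as np
import pandas as pd

df = pd.DataFrame({'Length': [0, 10, 300, 400, 449, 450, 500]})
# Bins are right-inclusive by default: (-inf, 300], (300, 449], (449, inf)
bins = [-np.inf, 300, 449, np.inf]
labels = ['small', 'medium', 'large']
df['Size_Category'] = pd.cut(df['Length'], bins=bins, labels=labels)
print(df)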
   Length Size_Category
0       0         small
1      10         small
2     300         small
3     400        medium
4     449        medium
5     450         large
6     500         large
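If Length can hold non-integer values, a bin edge of 449 would put something like 449.5 into 'large'. One possible alternative, not part of the answer above, is np.select, which takes the conditions exactly as written. This is only a sketch: the medium range (300 < Length < 450), the extra 449.5 sample value and the 'unknown' fallback label are all assumptions.

```python
import numpy as np
import pandas as pd

# 449.5 is an added sample value to show the non-integer case.
df = pd.DataFrame({'Length': [0, 10, 300, 400, 449, 449.5, 450, 500]})
length = df['Length']

# Conditions are checked in order; the first match wins.
conditions = [
    length <= 300,
    (length > 300) & (length < 450),  # assumed medium range
    length >= 450,
]
choices = ['small', 'medium', 'large']

# Rows matching no condition (e.g. NaN) get the assumed 'unknown' label.
df['Size_Category'] = np.select(conditions, choices, default='unknown')
print(df)
```

Unlike pd.cut, this gives a plain string column rather than a categorical one.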