I have a data table like the one below, and I want to use a mask to select only a subset of its rows:
      Item Colour   Item Range    Item Size
789   COLOUR-BLUE   RANGE-PANT    SIZE-XXL
2507  COLOUR-BLACK  RANGE-OTHER   SIZE-XXL
2376  COLOUR-BLACK  RANGE-JACKET  SIZE-S
1378  COLOUR-WHITE  RANGE-OTHER   SIZE-L
598   COLOUR-BLUE   RANGE-JACKET  SIZE-M
1589  COLOUR-BLUE   RANGE-JACKET  SIZE-L
2580  COLOUR-BLACK  RANGE-SHIRT   SIZE-L
366   COLOUR-BLUE   RANGE-PANT    SIZE-XXL
2320  COLOUR-WHITE  RANGE-OTHER   SIZE-L
1247  COLOUR-GREEN  RANGE-PANT    SIZE-M
2224  COLOUR-BLACK  RANGE-JACKET  SIZE-L
3615  COLOUR-BLACK  RANGE-OTHER   SIZE-S
4176  COLOUR-GREEN  RANGE-PANT    SIZE-XL
1640  COLOUR-BLACK  RANGE-PANT    SIZE-S
1136  COLOUR-WHITE  RANGE-OTHER   SIZE-M
3437  COLOUR-BLACK  RANGE-JACKET  SIZE-S
4448  COLOUR-WHITE  RANGE-OTHER   SIZE-S
1188  COLOUR-WHITE  RANGE-SHIRT   SIZE-XXL
3332  COLOUR-GREEN  RANGE-OTHER   SIZE-M
1080  COLOUR-WHITE  RANGE-OTHER   SIZE-XXL
I tried this mask:
mask = (df['Item Colour'] == 'COLOUR-WHITE') & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']) & (df['Item Size'] not in ['SIZE-XXL'])
df[mask]
but it gives me this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can I avoid this error?
Update: it still does not work. This is what I have so far:
import numpy as np
import pandas as pd
df = pd.read_clipboard()
df.drop(['Item','Item.2','Size'], inplace=True,axis=1)
df.columns = ['Item Colour', 'Item Range', 'Item Size']
print(df)
mask = (df['Item Colour'] == 'COLOUR-WHITE') & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']) & (df['Item Size'] not in ['SIZE-XXL'])
dff = df[mask]
dff
Answer 0 (score: 2)
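The problem is in how you build the mask: you check membership with Python's in operator, which has to produce a single True or False, while a comparison involving a Series produces a whole Series of booleans, hence the "ambiguous truth value" error. Use the Series method pd.Series.isin([item1, item2, ...]) instead. So, instead of:
df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']
do this:
df['Item Range'].isin(['RANGE-JACKET','RANGE-PANT'])
To negate it, i.e. to express "not in", instead of:
df['Item Size'] not in ['SIZE-XXL']
you can do:
~df['Item Size'].isin(['SIZE-XXL'])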
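Putting the pieces together, here is a minimal sketch of the full mask. It assumes a small DataFrame built by hand from five rows of the question's table (rather than from the clipboard), and `build_mask` is a hypothetical helper name introduced only for this illustration. The `COLOUR-BLACK` call is likewise an addition, included only so that the example has a non-empty result.

```python
import pandas as pd

# A few rows copied from the question's table; the index holds the item ids.
df = pd.DataFrame(
    {
        "Item Colour": ["COLOUR-BLUE", "COLOUR-BLACK", "COLOUR-WHITE",
                        "COLOUR-BLACK", "COLOUR-WHITE"],
        "Item Range": ["RANGE-PANT", "RANGE-JACKET", "RANGE-OTHER",
                       "RANGE-PANT", "RANGE-SHIRT"],
        "Item Size": ["SIZE-XXL", "SIZE-S", "SIZE-L", "SIZE-S", "SIZE-XXL"],
    },
    index=[789, 2376, 1378, 1640, 1188],
)

# The original approach fails: `in` needs one True/False, not a Series.
try:
    df["Item Range"] in ["RANGE-JACKET", "RANGE-PANT"]
    raised = False
except ValueError:
    raised = True


def build_mask(frame, colour):
    """Hypothetical helper: the question's mask, rewritten with isin and ~."""
    return (
        (frame["Item Colour"] == colour)
        & frame["Item Range"].isin(["RANGE-JACKET", "RANGE-PANT"])
        & ~frame["Item Size"].isin(["SIZE-XXL"])
    )


white = df[build_mask(df, "COLOUR-WHITE")]  # the colour asked for in the question
black = df[build_mask(df, "COLOUR-BLACK")]  # illustration only, not in the question

print(white)
print(black)
```

Note that in the sample data no white item is a jacket or a pant, so the question's mask legitimately selects nothing. With `COLOUR-BLACK` the same mask selects items 2376 and 1640.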