Using multiple conditions on a pandas DataFrame raises ValueError

Time: 2018-12-08 21:19:47

Tags: python pandas

I have a data table like the one below, and I want to select only a subset of its rows using a mask:

       Item Colour    Item Range Item Size
789    COLOUR-BLUE    RANGE-PANT  SIZE-XXL
2507  COLOUR-BLACK   RANGE-OTHER  SIZE-XXL
2376  COLOUR-BLACK  RANGE-JACKET    SIZE-S
1378  COLOUR-WHITE   RANGE-OTHER    SIZE-L
598    COLOUR-BLUE  RANGE-JACKET    SIZE-M
1589   COLOUR-BLUE  RANGE-JACKET    SIZE-L
2580  COLOUR-BLACK   RANGE-SHIRT    SIZE-L
366    COLOUR-BLUE    RANGE-PANT  SIZE-XXL
2320  COLOUR-WHITE   RANGE-OTHER    SIZE-L
1247  COLOUR-GREEN    RANGE-PANT    SIZE-M
2224  COLOUR-BLACK  RANGE-JACKET    SIZE-L
3615  COLOUR-BLACK   RANGE-OTHER    SIZE-S
4176  COLOUR-GREEN    RANGE-PANT   SIZE-XL
1640  COLOUR-BLACK    RANGE-PANT    SIZE-S
1136  COLOUR-WHITE   RANGE-OTHER    SIZE-M
3437  COLOUR-BLACK  RANGE-JACKET    SIZE-S
4448  COLOUR-WHITE   RANGE-OTHER    SIZE-S
1188  COLOUR-WHITE   RANGE-SHIRT  SIZE-XXL
3332  COLOUR-GREEN   RANGE-OTHER    SIZE-M
1080  COLOUR-WHITE   RANGE-OTHER  SIZE-XXL

I tried the following mask:

mask = (df['Item Colour'] == 'COLOUR-WHITE') & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']) & (df['Item Size'] not in ['SIZE-XXL'])

df[mask]

but it gave me this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I avoid this error?

Update: it still does not work. This is what I have so far:

import numpy as np
import pandas as pd

df = pd.read_clipboard()
df.drop(['Item','Item.2','Size'], inplace=True,axis=1)
df.columns = ['Item Colour', 'Item Range', 'Item Size']
print(df)

mask = (df['Item Colour'] == 'COLOUR-WHITE') & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']) & (df['Item Size'] not in ['SIZE-XXL'])

dff = df[mask]
dff

1 answer:

Answer 0 (score: 2)

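The problem is in how you build the mask when checking whether an item is in a list. Python's in operator has to reduce the comparison to a single True or False, and the truth value of a whole Series is ambiguous, hence the ValueError. For an element-wise membership test, use the Series method pd.Series.isin([item1, item2, ...]). So instead of: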

df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']

do this:

df['Item Range'].isin(['RANGE-JACKET','RANGE-PANT'])

For the negation, that is, "not in", instead of:

df['Item Size'] not in ['SIZE-XXL']

you can do this:

~df['Item Size'].isin(['SIZE-XXL'])
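
Putting both replacements together, the full mask might look like the sketch below. The four sample rows and their index labels are hypothetical, invented only so that the result is non-empty; they are not the asker's data. The column names and filter values are the ones from the question.

```python
import pandas as pd

# Hypothetical sample rows (not the asker's data), chosen so that one row matches.
df = pd.DataFrame(
    {
        "Item Colour": ["COLOUR-WHITE", "COLOUR-WHITE", "COLOUR-BLUE", "COLOUR-WHITE"],
        "Item Range": ["RANGE-PANT", "RANGE-JACKET", "RANGE-PANT", "RANGE-OTHER"],
        "Item Size": ["SIZE-M", "SIZE-XXL", "SIZE-L", "SIZE-S"],
    },
    index=[101, 102, 103, 104],
)

# The original form fails: `in` needs one True/False for the whole Series.
try:
    df["Item Range"] in ["RANGE-JACKET", "RANGE-PANT"]
    raised = False
except ValueError:
    raised = True

# Element-wise version: isin() returns a boolean Series, ~ inverts it,
# and & combines the three conditions row by row.
mask = (
    (df["Item Colour"] == "COLOUR-WHITE")
    & df["Item Range"].isin(["RANGE-JACKET", "RANGE-PANT"])
    & ~df["Item Size"].isin(["SIZE-XXL"])
)

dff = df[mask]
print(dff)
```

With these sample rows, only the row labelled 101 satisfies all three conditions. The parentheses around the == comparison are still needed, because & binds more tightly than ==.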