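I have an Excel sheet with over a hundred columns. I need to filter on five of them to see where "No" appears in one of their cells. Is there a way to filter multiple columns with a single search criterion, for example: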
no_invoice_filter = df[(df['M1: PL - INVOICED']) & (df['M2: EX - INVOICED']) & (df['M3: TEST DEP - INVOICED']) == 'No']
as opposed to writing out a separate "equals No" check for each column?
Error from the code above:
TypeError: unsupported operand type(s) for &: 'str' and 'bool'
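Answer 0 (score: 1)
You need to compare the subset of columns with 'No' and then use any(axis=1) to keep the rows where at least one of those columns is 'No':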
df[(df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No')
.any(axis=1)]
Sample:
import pandas as pd

df = pd.DataFrame({'M1: PL - INVOICED':['a','Yes','No'],
                   'M2: EX - INVOICED':['Yes','No','b'],
                   'M3: TEST DEP - INVOICED':['s','a','No']})
print (df)
  M1: PL - INVOICED M2: EX - INVOICED M3: TEST DEP - INVOICED
0                 a               Yes                       s
1               Yes                No                       a
2                No                 b                      No
print ((df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No'))
  M1: PL - INVOICED M2: EX - INVOICED M3: TEST DEP - INVOICED
0             False             False                   False
1             False              True                   False
2              True             False                    True
print ((df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No')
.any(axis=1))
0    False
1     True
2     True
dtype: bool
print (df[(df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No')
.any(axis=1)])
  M1: PL - INVOICED M2: EX - INVOICED M3: TEST DEP - INVOICED
1               Yes                No                       a
2                No                 b                      No
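
For comparison, here is a minimal, self-contained sketch of the same idea wrapped in a small helper and checked against the long-hand version the question wanted to avoid. The helper name rows_with_value and the variable names are made up for illustration and are not part of pandas; the data is the sample frame from this answer.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    'M1: PL - INVOICED': ['a', 'Yes', 'No'],
    'M2: EX - INVOICED': ['Yes', 'No', 'b'],
    'M3: TEST DEP - INVOICED': ['s', 'a', 'No'],
})


def rows_with_value(frame, columns, value):
    """Return the rows of `frame` where at least one of `columns` equals `value`.

    Hypothetical helper for illustration; not a pandas function.
    """
    mask = (frame[columns] == value).any(axis=1)
    return frame[mask]


invoice_cols = ['M1: PL - INVOICED', 'M2: EX - INVOICED', 'M3: TEST DEP - INVOICED']
no_invoice = rows_with_value(df, invoice_cols, 'No')

# Long-hand version: every comparison needs its own parentheses,
# because & and | bind more tightly than == in Python.
long_hand = df[
    (df['M1: PL - INVOICED'] == 'No')
    | (df['M2: EX - INVOICED'] == 'No')
    | (df['M3: TEST DEP - INVOICED'] == 'No')
]

print(no_invoice)
print(no_invoice.equals(long_hand))
```

With a hundred-column sheet, only the list passed as `columns` has to change; the comparison and the any(axis=1) step stay the same.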
Answer 1 (score: 1)
You can do this:
df[(df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No')]
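So you basically pass a list of the columns of interest and compare those columns against your scalar value. On its own this returns a masked DataFrame with NaN wherever the comparison is False; if you want the rows where 'No' appears in any of those columns, use any(axis=1).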
In [115]:
df = pd.DataFrame({'a':'no', 'b':'yes', 'c':['yes','no','yes','no','no']})
df
Out[115]:
    a    b    c
0  no  yes  yes
1  no  yes   no
2  no  yes  yes
3  no  yes   no
4  no  yes   no
Using any(axis=1) then returns every row where 'no' appears in at least one of the columns of interest:
In [133]:
df[(df[['a','c']] == 'no').any(axis=1)]
Out[133]:
    a    b    c
0  no  yes  yes
1  no  yes   no
2  no  yes  yes
3  no  yes   no
4  no  yes   no
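You can also index with the mask itself and then use dropna. Indexing with the boolean DataFrame keeps the matching cells and sets everything else to NaN, so dropna(subset=['c']) drops the rows where c did not match: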
In [132]:
df[df[['a','c']] == 'no'].dropna(subset=['c'])
Out[132]:
    a    b   c
1  no  NaN  no
3  no  NaN  no
4  no  NaN  no
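
To make the difference between the two approaches concrete, the sketch below reuses the sample frame from this answer and compares any(axis=1) with all(axis=1). Note that all(axis=1) is not used in the answer itself; it is added here as one possible way to keep only the rows where every checked column is 'no'. On this data those are the same rows the dropna variant keeps, but the values in column b are left intact. The variable names are illustrative.

```python
import pandas as pd

# Same sample data as in the answer above: scalars are broadcast to the length of 'c'.
df = pd.DataFrame({'a': 'no', 'b': 'yes', 'c': ['yes', 'no', 'yes', 'no', 'no']})

# Boolean DataFrame with one column per checked column.
matches = df[['a', 'c']] == 'no'

# Rows where at least one checked column is 'no' (every row here, since 'a' is always 'no').
any_rows = df[matches.any(axis=1)]

# Rows where every checked column is 'no' (an addition, not from the answer).
all_rows = df[matches.all(axis=1)]

print(any_rows.index.tolist())
print(all_rows.index.tolist())
print(all_rows)
```

Because the row filter indexes df with a boolean Series rather than a boolean DataFrame, no cells are replaced with NaN.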