I'm cleaning a dataset and want to filter it using a list of variables that satisfy a condition. For example:
import pandas as pd
import numpy as np
data = {"var1": [0,1,0,0,0],
        "var2": [0,0,0,0,0],
        "var3": [0,0,0,0,1],
        "var4": [0,0,0,0,0],
        "var5": [1,2,3,4,5]
        }
df = pd.DataFrame(data)
#here is a list as an example
SelLst = ['var1','var2','var3','var4']
#This is what I'd like to do, but instead of 4 variables, I have any number.
b = df[SelLst ].query('var1 ==1 | var2 == 1 | var3 ==1 | var4 == 1')
#This doesn't work, but would be cool
c = df[df[SelLst ].isin([1])]
#Something like this works, but I feel like pandas has something under the hood that would be easier.
strSel = " ".join([i + '== 1 |' for i in SelLst])
d = df[SelLst].query(strSel[:-1])
So, is there a built-in function or a neater way to do this? Or is the approach above already the way to go? Thanks!
Answer 0 (score: 1)
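Use isin + any(axis=1) to build a row-wise boolean mask (recent pandas versions require axis to be passed by keyword, so any(1) no longer works there):
df[['var1','var2','var3','var4']].isin([1]).any(axis=1)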
Out[538]:
0 False
1 True
2 False
3 False
4 True
dtype: bool
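The mask above only tells you which rows match. To actually filter the frame with an arbitrary list of columns, something along these lines should work. This is a minimal sketch that reuses the question's sample data; the names sel_lst, mask and filtered are just illustrative, and eq(1) is used here as an equivalent of isin([1]) for a single value.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "var1": [0, 1, 0, 0, 0],
    "var2": [0, 0, 0, 0, 0],
    "var3": [0, 0, 0, 0, 1],
    "var4": [0, 0, 0, 0, 0],
    "var5": [1, 2, 3, 4, 5],
})

# Any number of column names can go in this list (illustrative name)
sel_lst = ["var1", "var2", "var3", "var4"]

# True for each row where at least one selected column equals 1
mask = df[sel_lst].eq(1).any(axis=1)

# Keep only the matching rows; all columns (including var5) are retained
filtered = df[mask]
print(filtered)
```

If you only want the selected columns in the result, as in the question's query version, df.loc[mask, sel_lst] should do it.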