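I have a dictionary in which every key is a column of my dataframe:
dict = {"colA":1,"colB":1,"colC":1}
where colA, colB and colC are columns of my dataframe.
I want to do something like:
df.loc[(df["colA"] <= dict["colA"]) & (df["colB"] <= dict["colB"]) & (df["colC"] <= dict["colC"])]
but dynamically (I don't know the length of the dictionary / the number of columns in advance).
Is there a way to apply & to a dynamic number of arguments?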
Answer 0 (score: 1)
You can use:
import numpy as np
import pandas as pd
from functools import reduce

df = pd.DataFrame({'colA':[1,2,0],
                   'colB':[0,5,6],
                   'colC':[1,8,9]})
print (df)
   colA  colB  colC
0     1     0     1
1     2     5     8
2     0     6     9
d = {"colA":1,"colB":1,"colC":1}
# Hard-coded filter from the question, shown for reference
a = df[(df["colA"] <= d["colA"]) & (df["colB"] <= d["colB"]) & (df["colC"] <= d["colC"])]
print (a)
   colA  colB  colC
0     1     0     1
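One solution is to create a Series from the dictionary, compare it against the DataFrame with le, check that all values in each row are True with all, and finally filter with boolean indexing: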
d = {"colA":1,"colB":1,"colC":1}
s = pd.Series(d)
print (s)
colA    1
colB    1
colC    1
dtype: int64
print (df.le(s).all(axis=1))
0     True
1    False
2    False
dtype: bool
print (df[df.le(s).all(axis=1)])
   colA  colB  colC
0     1     0     1
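If the dictionary names only some of the DataFrame's columns, the same idea can be applied after selecting those columns first. The following is a minimal sketch, assuming every dictionary key exists as a column; the helper name filter_le and the extra column colD are made up for illustration and are not part of the original answer:

```python
import pandas as pd

def filter_le(df, limits):
    """Keep rows where every column named in `limits` is <= its limit."""
    s = pd.Series(limits)
    # Select only the columns named in the dict, then compare column-wise
    mask = df[s.index].le(s, axis=1).all(axis=1)
    return df[mask]

df = pd.DataFrame({'colA': [1, 2, 0],
                   'colB': [0, 5, 6],
                   'colC': [1, 8, 9],
                   'colD': [7, 7, 7]})   # hypothetical column with no limit

result = filter_le(df, {"colA": 1, "colB": 1})
print(result)
```

Only colA and colB are compared here; colD is carried through to the result untouched.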
Another solution uses numpy.logical_and with reduce to build the mask, and a list comprehension to apply the conditions:
print ([df[x] <= d[x] for x in df.columns])
[0     True
1    False
2     True
Name: colA, dtype: bool, 0     True
1    False
2    False
Name: colB, dtype: bool, 0     True
1    False
2    False
Name: colC, dtype: bool]
mask = reduce(np.logical_and, [df[x] <= d[x] for x in df.columns])
print (mask)
0     True
1    False
2    False
Name: colA, dtype: bool
print (df[mask])
   colA  colB  colC
0     1     0     1
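Because the list comprehension iterates over df.columns, it raises a KeyError when the DataFrame has a column that is missing from the dictionary. A possible variant, sketched here under the assumption that every dictionary key is a column, iterates over the dictionary instead and combines the conditions with np.logical_and.reduce. The column colD is hypothetical:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 0],
                   'colB': [0, 5, 6],
                   'colC': [1, 8, 9],
                   'colD': [7, 7, 7]})   # hypothetical column, not in the dict
d = {"colA": 1, "colB": 1, "colC": 1}

# One boolean Series per dictionary entry
conditions = [df[col] <= limit for col, limit in d.items()]

# AND all conditions together element-wise, giving one boolean per row
mask = np.logical_and.reduce(conditions)
print(df[mask])
```

functools.reduce with operator.and_ over the same list should give an equivalent mask if you would rather stay with pandas objects.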
Answer 1 (score: 1)
Here is a SQL-like solution that uses the .query() method:
Data:
In [23]: df
Out[23]:
   colA  colB  colC
0     2     2     5
1     3     0     8
2     5     9     2
3     3     0     2
4     9     1     3
5     7     5     6
6     7     8     0
7     0     4     1
8     8     2     6
9     9     6     7
Solution:
In [20]: dct = {"colA":4,"colB":4,"colC":4}
In [21]: qry = ' and '.join(('{0[0]} <= {0[1]}'.format(tup) for tup in dct.items()))
In [22]: qry
Out[22]: 'colB <= 4 and colA <= 4 and colC <= 4'
In [24]: df.query(qry)
Out[24]:
   colA  colB  colC
3     3     0     2
7     0     4     1
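For a self-contained version of the same approach, the query string can also be built with an f-string. This is only a sketch: it assumes the column names are valid Python identifiers and the limits are plain numbers, and it rebuilds the sample data from the table above:

```python
import pandas as pd

df = pd.DataFrame({'colA': [2, 3, 5, 3, 9, 7, 7, 0, 8, 9],
                   'colB': [2, 0, 9, 0, 1, 5, 8, 4, 2, 6],
                   'colC': [5, 8, 2, 2, 3, 6, 0, 1, 6, 7]})
dct = {"colA": 4, "colB": 4, "colC": 4}

# Build one "column <= limit" clause per dictionary entry and join with "and"
qry = ' and '.join(f'{col} <= {limit}' for col, limit in dct.items())
print(qry)

result = df.query(qry)
print(result)
```

On Python 3.7 and later, dictionaries keep insertion order, so the clauses appear in the order the keys were written. The order does not affect which rows are selected.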