Dynamic & comparison with pandas

Time: 2016-11-15 06:03:50

Tags: python pandas

I have a dictionary in which each key is a column of my dataframe:

dict = {"colA":1,"colB":1,"colC":1}

where colA, colB and colC are the columns of my dataframe.

I would like to do something like:

df.loc[(df["colA"] <= dict["colA"]) & (df["colB"] <= dict["colB"]) & (df["colC"] <= dict["colC"])]

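but dynamically (I don't know the length of the dictionary / the number of columns in advance).

Is there a way to apply & to a dynamic number of arguments?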

2 answers:

Answer 0 (score: 1)

You can use:

import numpy as np
import pandas as pd
from functools import reduce

df = pd.DataFrame({'colA':[1,2,0],
                   'colB':[0,5,6],
                   'colC':[1,8,9]})

print (df)
   colA  colB  colC
0     1     0     1
1     2     5     8
2     0     6     9

d = {"colA":1,"colB":1,"colC":1}

a = df[(df["colA"] <= d["colA"]) & (df["colB"] <= d["colB"]) & (df["colC"] <= d["colC"])]
print (a)
   colA  colB  colC
0     1     0     1

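One solution creates a Series from the dictionary, compares it with the dataframe using le, checks with all that every value in a row is True, and finally uses boolean indexing: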

d = {"colA":1,"colB":1,"colC":1}

s = pd.Series(d)
print (s)
colA    1
colB    1
colC    1
dtype: int64

print (df.le(s).all(axis=1))
0     True
1    False
2    False
dtype: bool

print (df[df.le(s).all(axis=1)])
   colA  colB  colC
0     1     0     1
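
One caveat with the le approach: le aligns the Series index with the dataframe's columns, so a column that is missing from the dictionary ends up compared against NaN, which should evaluate to False and drop every row. A minimal sketch, assuming the dataframe may carry extra columns (colD here is hypothetical, not part of the question), restricts the comparison to the dictionary's keys first:

```python
import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 0],
                   'colB': [0, 5, 6],
                   'colC': [1, 8, 9],
                   'colD': [7, 7, 7]})   # hypothetical extra column, not a key of d

d = {"colA": 1, "colB": 1, "colC": 1}
s = pd.Series(d)

# select only the columns named in the dictionary before comparing
mask = df[list(d)].le(s).all(axis=1)
result = df[mask]
print(result)
```

The filtered rows still contain colD, because the mask is only used to pick rows from the full dataframe.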

Another solution uses numpy.logical_and with reduce to build the mask, and a list comprehension to apply the conditions:

print ([df[x] <= d[x] for x in df.columns])
[0     True
1    False
2     True
Name: colA, dtype: bool, 0     True
1    False
2    False
Name: colB, dtype: bool, 0     True
1    False
2    False
Name: colC, dtype: bool]

mask = reduce(np.logical_and, [df[x] <= d[x] for x in df.columns])
print (mask)
0     True
1    False
2    False
Name: colA, dtype: bool

print (df[mask])
   colA  colB  colC
0     1     0     1
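
The same idea can be wrapped in a small helper. This is only a sketch: filter_le is a hypothetical name, and it loops over the dictionary's items rather than df.columns, so columns that are absent from the dictionary are simply ignored. It uses np.logical_and.reduce, which collapses the whole list of masks in one call:

```python
import numpy as np
import pandas as pd

def filter_le(df, limits):
    """Keep the rows where df[col] <= limits[col] for every key in limits."""
    if not limits:
        # nothing to compare against, so no row is filtered out
        return df
    masks = [df[col] <= value for col, value in limits.items()]
    # stack the masks and AND them together: one boolean per row
    mask = np.logical_and.reduce(masks)
    return df[mask]

df = pd.DataFrame({'colA': [1, 2, 0],
                   'colB': [0, 5, 6],
                   'colC': [1, 8, 9]})
d = {"colA": 1, "colB": 1, "colC": 1}

print(filter_le(df, d))
```

Passing a dictionary with fewer keys, for example {"colA": 1}, filters on that column only.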

Answer 1 (score: 1)

Here is an SQL-like solution that uses the .query() method:

Data:

In [23]: df
Out[23]:
   colA  colB  colC
0     2     2     5
1     3     0     8
2     5     9     2
3     3     0     2
4     9     1     3
5     7     5     6
6     7     8     0
7     0     4     1
8     8     2     6
9     9     6     7

Solution:

In [20]: dct = {"colA":4,"colB":4,"colC":4}

In [21]: qry = ' and '.join(('{0[0]} <= {0[1]}'.format(tup) for tup in dct.items()))

In [22]: qry
Out[22]: 'colB <= 4 and colA <= 4 and colC <= 4'

In [24]: df.query(qry)
Out[24]:
   colA  colB  colC
3     3     0     2
7     0     4     1
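
If a column name contains a space or another character that is not valid in a Python identifier, the query string built above would not parse. A hedged sketch, assuming pandas 0.25 or later (where .query() accepts backtick-quoted column names); the column name "col A" and the data are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'col A': [2, 3, 5, 0],
                   'colB':  [2, 0, 9, 4]})
dct = {"col A": 4, "colB": 3}

# backticks let .query() refer to names that are not valid identifiers
qry = ' and '.join('`{}` <= {!r}'.format(col, val) for col, val in dct.items())
result = df.query(qry)

print(qry)
print(result)
```

Quoting every name with backticks keeps the join expression uniform, so the dictionary keys need no special-casing.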