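NB: A similar question has been asked before, but it did not fully answer mine.
How can I subset a pandas DataFrame that has many columns, based on a large, specific set of those columns each satisfying some boolean condition?
Right now, I have to do something like this: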
df[(df.column4 > a1) | (df.column23 < a2) | (df.column27 == a3) | ...
(df.column56 > a21) | (df.column72 < a22)]
Because
Answer 0 (score: 0)
You have to specify your conditions somehow. You can create a separate mask for each condition and then reduce them to a single mask:
import seaborn as sns
import operator
import numpy as np
# Load a sample dataframe to play with
df = sns.load_dataset('iris')
# Define individual conditions as tuples
# ([column], [compare_function], [compare_value])
cond1 = ('sepal_length', operator.gt, 5)
cond2 = ('sepal_width', operator.lt, 2)
cond3 = ('species', operator.eq, 'virginica')
conditions = [cond1, cond2, cond3]
# Apply those conditions on the df, creating a list of 3 masks
masks = [fn(df[var], val) for var, fn, val in conditions]
# Reduce those 3 masks to one using logical OR
mask = np.logical_or.reduce(masks)
result = df.loc[mask]
When we compare this with the "hand-crafted" selection, we see that they are the same:
result_manual = df[(df.sepal_length>5) | (df.sepal_width<2) | (df.species == 'virginica')]
result_manual.equals(result) # == True
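If the list of conditions grows long, the same idea can be wrapped in a small helper. The following is only a sketch: the function name `any_condition_mask` and the toy DataFrame are made up for illustration, and it swaps `np.logical_or.reduce` for `functools.reduce` with `operator.or_` so that the resulting mask stays a pandas Series aligned with the frame's index.

```python
import operator
from functools import reduce

import pandas as pd


def any_condition_mask(df, conditions):
    """Return a boolean Series that is True where at least one condition holds.

    `conditions` is a non-empty iterable of
    (column, compare_function, compare_value) tuples.
    (Hypothetical helper, not part of pandas.)
    """
    # One boolean Series per condition
    masks = [fn(df[col], val) for col, fn, val in conditions]
    # Combine them pairwise with element-wise OR
    return reduce(operator.or_, masks)


# Small made-up frame, used only for illustration
df = pd.DataFrame({
    "a": [1, 5, 9],
    "b": [2.0, 0.5, 3.0],
    "c": ["x", "y", "z"],
})

conditions = [
    ("a", operator.gt, 4),
    ("b", operator.lt, 1.0),
    ("c", operator.eq, "z"),
]

subset = df.loc[any_condition_mask(df, conditions)]
```

To require that all conditions hold instead of any, replace `operator.or_` with `operator.and_` in the `reduce` call.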