Question

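I am trying to filter a df using several Boolean columns that are part of that df, but I have not been able to get it to work.

Sample data: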

A | B | C | D
John Doe | 45 | True | False
Jane Smith | 32 | False | False
Alan Holmes | 55 | False | True
Eric Lamar | 29 | True | True

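The dtype of columns C and D is Boolean. I want to create a new df (df1) that contains only the rows where either C or D is True. It should look like this: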

A | B | C | D
John Doe | 45 | True | False
Alan Holmes | 55 | False | True
Eric Lamar | 29 | True | True

I tried something like the following, but ran into problems, apparently because it cannot handle the Boolean type:

df1 = df[(df['C']=='True') or (df['D']=='True')]

Any ideas?

Answer 1

In [82]: d
Out[82]:
             A   B      C      D
0     John Doe  45   True  False
1   Jane Smith  32  False  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

Solution 1:

In [83]: d.loc[d.C | d.D]
Out[83]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

Solution 2:

In [94]: d[d[['C','D']].any(axis=1)]
Out[94]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

Solution 3:

In [95]: d.query("C or D")
Out[95]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True
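For anyone who wants to reproduce the three solutions outside an IPython session, here is a minimal sketch. It rebuilds the sample frame from the question (the constructor call is my assumption about how the data was created) and checks that all three approaches select the same rows.

```python
import pandas as pd

# Sample data from the question; C and D are bool columns.
d = pd.DataFrame({
    "A": ["John Doe", "Jane Smith", "Alan Holmes", "Eric Lamar"],
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

by_mask = d.loc[d.C | d.D]               # element-wise OR of two bool Series
by_any = d[d[["C", "D"]].any(axis=1)]    # True where any listed column is True
by_query = d.query("C or D")             # query() accepts `or` inside the string

print(by_mask)
```

Solution 1 is probably the most direct for two columns, while the `any(axis=1)` form scales better when the list of Boolean columns grows.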

P.S. If you change your solution to:

df[(df['C']==True) | (df['D']==True)]

it will work as well.
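The reason the comparison has to be against `True` rather than `'True'` can be seen in a small sketch, assuming C and D really are bool columns as the question states: comparing a Boolean to a string never matches, so the string version produces an all-False mask.

```python
import pandas as pd

df = pd.DataFrame({
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

# Comparing bool values to the string 'True' matches nothing.
string_mask = (df["C"] == "True") | (df["D"] == "True")

# Comparing to the Boolean True gives the same mask as df["C"] | df["D"].
bool_mask = (df["C"] == True) | (df["D"] == True)

print(string_mask.any())
print(bool_mask.equals(df["C"] | df["D"]))
```

Since the columns are already Boolean, the `== True` comparison is redundant and `df[df['C'] | df['D']]` is enough.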

Pandas docs - boolean indexing

Answer 2

Hooray! More options!

`np.where`

df[np.where(df.C | df.D, True, False)]

             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

`pd.Index.where` on `df.index`

df.loc[df.index.where(df.C | df.D).dropna()]

               A   B      C      D
0.0     John Doe  45   True  False
2.0  Alan Holmes  55  False   True
3.0   Eric Lamar  29   True   True

`df.select_dtypes`

df[df.select_dtypes([bool]).any(axis=1)]

             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

Abusing `np.select`

df.iloc[np.select([df.C | df.D], [df.index])].drop_duplicates()

             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True
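A short sketch of what two of these options do under the hood, using the same sample data (the variable names are mine). It suggests that `np.where(mask, True, False)` simply rebuilds the mask, and that the `0.0, 2.0, 3.0` labels in the `df.index.where` output come from the NaN placeholders forcing a float index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": ["John Doe", "Jane Smith", "Alan Holmes", "Eric Lamar"],
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})
mask = df.C | df.D

# np.where(mask, True, False) returns the mask itself as a NumPy array.
same_as_mask = np.array_equal(np.where(mask, True, False), mask.to_numpy())

# Index.where() replaces non-matching labels with NaN, so the index becomes float.
where_idx = df.index.where(mask)

# Indexing the index with the Boolean array keeps the integer labels.
labels = df.index[mask.to_numpy()]
via_labels = df.loc[labels]

print(where_idx.dtype, labels.dtype)
```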

Answer 3

Or

d[d.eval('C or D')]

Out[1065]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True
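As a rough illustration of how this differs from `query`: `eval` with a Boolean expression returns the mask (one value per row), which is then used for indexing, whereas `query` returns the filtered frame directly. The sketch below rebuilds the sample data to show this.

```python
import pandas as pd

d = pd.DataFrame({
    "A": ["John Doe", "Jane Smith", "Alan Holmes", "Eric Lamar"],
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

mask = d.eval("C or D")        # Boolean Series, one value per row
filtered = d[mask]             # index the frame with that mask
via_query = d.query("C or D")  # query() does both steps at once

print(mask)
```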

Answer 4

So, the simplest approach is

import numpy as np
import pandas as pd

students = [('jack1', 'Apples1', 341),
            ('Riti1', 'Mangos1', 311),
            ('Aadi1', 'Grapes1', 301),
            ('Sonia1', 'Apples1', 321),
            ('Lucy1', 'Mangos1', 331),
            ('Mike1', 'Apples1', 351),
            ('Mik', 'Apples1', np.nan)]
# Create a DataFrame object
df = pd.DataFrame(students, columns=['Name1', 'Product1', 'Sale1'])
print(df)


    Name1 Product1  Sale1
0   jack1  Apples1  341.0
1   Riti1  Mangos1  311.0
2   Aadi1  Grapes1  301.0
3  Sonia1  Apples1  321.0
4   Lucy1  Mangos1  331.0
5   Mike1  Apples1  351.0
6     Mik  Apples1    NaN

# Select the rows of the DataFrame above where the 'Product1' column equals 'Apples1'
subset = df[df['Product1'] == 'Apples1']
print(subset)

    Name1 Product1  Sale1
0   jack1  Apples1  341.0
3  Sonia1  Apples1  321.0
5   Mike1  Apples1  351.0
6     Mik  Apples1    NaN

# Select the rows where 'Product1' equals 'Apples1' AND 'Sale1' is not null

subsetx = df[(df['Product1'] == "Apples1") & (df['Sale1'].notnull())]
print(subsetx)

    Name1 Product1  Sale1
0   jack1  Apples1  341.0
3  Sonia1  Apples1  321.0
5   Mike1  Apples1  351.0

# Select the rows where 'Product1' equals 'Apples1' AND 'Sale1' equals 351

subsetx = df[(df['Product1'] == "Apples1") & (df['Sale1'] == 351)]
print(subsetx)

   Name1 Product1  Sale1
5  Mike1  Apples1  351.0

# Another example: rows where 'Product1' is one of several values
subsetData = df[df['Product1'].isin(['Mangos1', 'Grapes1'])]
print(subsetData)

   Name1 Product1  Sale1
1  Riti1  Mangos1  311.0
2  Aadi1  Grapes1  301.0
4  Lucy1  Mangos1  331.0

This code comes from: https://thispointer.com/python-pandas-select-rows-in-dataframe-by-conditions-on-multiple-columns/
I made a few small changes to it.

Answer 5

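You can simply try this:

df1 = df[(df['C']==True) | (df['D']==True)]

Notes:

The `or` logical operator needs to be replaced by the bitwise `|` operator.
Make sure each operand is enclosed in `()`.
Compare against the Boolean True, not the string 'True'; with bool columns the string comparison matches no rows.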
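Both notes can be illustrated with a small sketch on the question's data (the `B > 40` condition is a hypothetical extra filter I added to show why the parentheses matter). Python's `or` asks for a single truth value from a whole Series, which pandas refuses to give, while `|` combines the two masks element by element.

```python
import pandas as pd

df = pd.DataFrame({
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

# `or` needs one truth value per operand, so pandas raises ValueError.
error_message = None
try:
    df[(df["C"] == True) or (df["D"] == True)]
except ValueError as exc:
    error_message = str(exc)
print("`or` failed:", error_message)

# `|` binds tighter than `>` and `==`, so each comparison needs its own ().
# Without them, df["B"] > 40 | df["D"] would parse as df["B"] > (40 | df["D"]).
mask = (df["B"] > 40) | (df["D"] == True)
print(df[mask])
```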
