Find rows where two columns do not equal predetermined values

Time: 2018-08-31 22:20:48

Tags: python pandas dataframe indexing

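I have a dataframe and I am trying to find the rows where two columns do not match.

For example, the column landing_page can be new_page or old_page, and the column group can be control or treatment. Currently I am using: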

no_line_up = df.query('group = treatment and landing_page = old_page or group = control and landing_page = new_page')

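I am trying to find the rows where new_page and treatment do not line up.

However, the query above throws an error. What is the correct way to do this?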

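2 Answers:

Answer 0: (score: 1)

With pd.DataFrame.query you still need to use the same basic operators as in Python, for example == to test for equality, quote string values, and use parentheses to separate the conditions: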

import pandas as pd

df = pd.DataFrame({'group': ['treatment', 'control', 'hello'],
                   'landing_page': ['old_page', 'new_page', 'test']})

res = df.query('(group == "treatment" and landing_page == "old_page") \
                 or (group == "control" and landing_page == "new_page")')

print(res)

       group landing_page
0  treatment     old_page
1    control     new_page
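
As a variation on the query above, the comparison values can be kept in Python variables and referenced inside the expression with the @ prefix, which avoids nesting quotes. This is only a sketch that reuses the sample frame from this answer; the variable names (treat, ctrl, old, new, res_vars) are illustrative and not part of the question:

```python
import pandas as pd

df = pd.DataFrame({'group': ['treatment', 'control', 'hello'],
                   'landing_page': ['old_page', 'new_page', 'test']})

# Illustrative names; they do not come from the original question.
treat, ctrl = 'treatment', 'control'
old, new = 'old_page', 'new_page'

# @name pulls the value of a local Python variable into the query string.
# Adjacent string literals are concatenated, so no backslash is needed.
res_vars = df.query('(group == @treat and landing_page == @old) '
                    'or (group == @ctrl and landing_page == @new)')

print(res_vars)
```

On the sample frame this should select the same two rows as the literal-string query.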

A more readable option is to combine Boolean masks and use pd.DataFrame.loc:

m1 = (df['group'] == 'treatment') & (df['landing_page'] == 'old_page')
m2 = (df['group'] == 'control') & (df['landing_page'] == 'new_page')

# A row can satisfy only one of the two masks, so combine them with | (or), not & (and)
res = df.loc[m1 | m2]
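
If group and landing_page only ever take the two values described in the question (an assumption; the sample frame above also contains a 'hello'/'test' row), the mismatch can probably be written as a single comparison: a row is out of line when "is treatment" and "is new_page" disagree. A minimal sketch with a hypothetical sample frame covering all four combinations:

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({'group': ['treatment', 'control', 'treatment', 'control'],
                   'landing_page': ['old_page', 'new_page', 'new_page', 'old_page']})

is_treatment = df['group'] == 'treatment'
is_new_page = df['landing_page'] == 'new_page'

# Rows where exactly one of the two flags is True do not line up.
no_line_up = df.loc[is_treatment != is_new_page]

print(no_line_up)
```

Note that with other values present, any row that is new_page but not treatment (or the reverse) is flagged, so the explicit masks above are safer when the columns are not strictly binary.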

Answer 1: (score: 0)

Maybe:

# Compare against string values; 'treatment', 'old_page', etc. are values, not columns
df.loc[((df['group'] == 'treatment') | (df['landing_page'] == 'new_page'))
       & ((df['group'] == 'control') | (df['landing_page'] == 'old_page'))]