Compare multiple pairs of columns to produce a boolean result column

Time: 2015-04-21 16:25:00

Tags: python-2.7 pandas

Using Pandas, I have a DataFrame that looks like this:

col_a   col_b   col_a1   col_b1
Larry   Larry   Peter    Peter
Lee     Lee     Jeremy   Ilia

I want to compare col_a with col_b, and col_a1 with col_b1. If both pairs match, indicate that in a new column (flag):

col_a   col_b   col_a1   col_b1   flag
Larry   Larry   Peter    Peter    True
Lee     Lee     Jeremy   Ilia     False

How can I do this?

3 answers:

Answer 0: (score: 1)

You can use the apply function:

import pandas as pd

df = pd.DataFrame({'col_a': ('A', 'B'), 'col_b': ('A', 'B'), 'col_a1': ('C', 'D'), 'col_b1': ('C', 'E')})

# Fix the column order for display
df = df[['col_a', 'col_b', 'col_a1', 'col_b1']]

# Row-wise: True only when both pairs match (a boolean, not a string)
df['flag'] = df.apply(lambda x: x['col_a'] == x['col_b'] and x['col_a1'] == x['col_b1'], axis=1)

print(df)

Answer 1: (score: 0)

You can use DataFrame.eval:

import pandas as pd

df = pd.DataFrame({
    "col_a":["Larry","Lee"],
    "col_b":["Larry","Lee"],
    "col_a1":["Peter","Jeremy"],
    "col_b1":["Peter","Ilia"]
    })

print(df)
df["flag"] = df.eval("col_a==col_b and col_a1==col_b1")
print(df)

Output:

   col_a  col_a1  col_b col_b1
0  Larry   Peter  Larry  Peter
1    Lee  Jeremy    Lee   Ilia

   col_a  col_a1  col_b col_b1   flag
0  Larry   Peter  Larry  Peter   True
1    Lee  Jeremy    Lee   Ilia  False

If the columns to compare are stored in two lists, for example a_cols and b_cols, you can do the following:

a_cols = ["col_a","col_a1"]
b_cols = ["col_b","col_b1"]
# Builds the expression "col_a==col_b and col_a1==col_b1" from the paired lists
df["flag"] = df.eval(" and ".join("%s==%s" % pair for pair in zip(a_cols, b_cols)))
print(df)

Output:

   col_a  col_a1  col_b col_b1   flag
0  Larry   Peter  Larry  Peter   True
1    Lee  Jeremy    Lee   Ilia  False
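
The same list-driven comparison can also be written without building an expression string. The following is only a minimal sketch, assuming the two lists have the same length and pair up by position; the helper name pairs_match is hypothetical and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    "col_a": ["Larry", "Lee"],
    "col_b": ["Larry", "Lee"],
    "col_a1": ["Peter", "Jeremy"],
    "col_b1": ["Peter", "Ilia"],
})

a_cols = ["col_a", "col_a1"]
b_cols = ["col_b", "col_b1"]


def pairs_match(frame, left_cols, right_cols):
    """Hypothetical helper: True for rows where every left/right pair is equal."""
    # .values drops the column labels, so the comparison is positional
    # (left_cols[i] against right_cols[i]) instead of aligned by name.
    equal = frame[left_cols].values == frame[right_cols].values
    # A row is flagged only if all of its pairs matched.
    return pd.Series(equal.all(axis=1), index=frame.index)


df["flag"] = pairs_match(df, a_cols, b_cols)
print(df["flag"].tolist())
```

Comparing the underlying arrays avoids string formatting, so it should also cope with column names that are not valid identifiers, which an eval expression may not.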

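Answer 2: (score: 0)

I find the following code easier to read. You compare one pair of columns at a time, then AND the two results together (using &) to get the flag column:

In one line: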

In [18]: tf['flag'] = (tf['col_a'] == tf['col_b']) & (tf['col_a1'] == tf['col_b1'])

In [19]: tf
Out[19]: 
   col_a  col_b  col_a1 col_b1   flag
0  Larry  Larry   Peter  Peter   True
1    Lee    Lee  Jeremy   Ilia  False
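
The session above does not show how tf was built. Below is a self-contained sketch, assuming tf simply holds the sample data from the question. It also folds the comparisons with & so the same idea covers any number of pairs; the pairs list is introduced here purely for illustration.

```python
from functools import reduce
import operator

import pandas as pd

# Assumption: tf holds the sample data from the question.
tf = pd.DataFrame({
    "col_a": ["Larry", "Lee"],
    "col_b": ["Larry", "Lee"],
    "col_a1": ["Peter", "Jeremy"],
    "col_b1": ["Peter", "Ilia"],
})

# Illustrative list of (left, right) column pairs to compare.
pairs = [("col_a", "col_b"), ("col_a1", "col_b1")]

# Each comparison yields a boolean Series; & combines them element-wise.
# Python's plain `and` would raise here, because the truth value of a
# whole Series is ambiguous.
tf["flag"] = reduce(operator.and_, (tf[left] == tf[right] for left, right in pairs))
print(tf["flag"].tolist())
```

The parentheses around each comparison in the one-line version matter for the same reason: & binds more tightly than ==, so leaving them out changes what is being compared.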