Using pandas, I have a DataFrame that looks like this:
col_a col_b col_a1 col_b1
Larry Larry Peter Peter
Lee Lee Jeremy Ilia
I want to compare col_a with col_b, and col_a1 with col_b1. If both pairs match, indicate this in a new column (flag):
col_a col_b col_a1 col_b1 flag
Larry Larry Peter Peter True
Lee Lee Jeremy Ilia False
How can I do this?
Answer 0 (score: 1)
You can use the apply function:
import pandas as pd

df = pd.DataFrame({'col_a': ('A', 'B'), 'col_b': ('A', 'B'), 'col_a1': ('C', 'D'), 'col_b1': ('C', 'E')})
df = df[['col_a', 'col_b', 'col_a1', 'col_b1']]  # fix the column order
# Row-wise check: True only when both pairs match (booleans, not the strings 'True'/'False')
df['flag'] = df.apply(lambda x: x['col_a'] == x['col_b'] and x['col_a1'] == x['col_b1'], axis=1)
print(df)
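One practical reason to store booleans rather than the strings 'True' and 'False' is that a boolean column can be used directly as a row mask. The following is a minimal sketch of that idea using the same toy data; the variable name matching_rows is only illustrative and does not come from the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "col_a": ["A", "B"],
    "col_b": ["A", "B"],
    "col_a1": ["C", "D"],
    "col_b1": ["C", "E"],
})

# Row-wise comparison; the lambda returns a Python bool for each row.
df["flag"] = df.apply(
    lambda row: row["col_a"] == row["col_b"] and row["col_a1"] == row["col_b1"],
    axis=1,
)

# A boolean column can be used directly to select the matching rows.
# (matching_rows is a hypothetical name used only for this illustration.)
matching_rows = df[df["flag"]]
print(matching_rows)
```

Had the column held the strings 'True' and 'False', the same selection would need an explicit comparison such as df["flag"] == "True".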
Answer 1 (score: 0)
You can use DataFrame.eval:
import pandas as pd

df = pd.DataFrame({
    "col_a": ["Larry", "Lee"],
    "col_b": ["Larry", "Lee"],
    "col_a1": ["Peter", "Jeremy"],
    "col_b1": ["Peter", "Ilia"]
})
print(df)
# Evaluate the comparison expression against the DataFrame's columns
df["flag"] = df.eval("col_a==col_b and col_a1==col_b1")
print(df)
Output:
col_a col_a1 col_b col_b1
0 Larry Peter Larry Peter
1 Lee Jeremy Lee Ilia
col_a col_a1 col_b col_b1 flag
0 Larry Peter Larry Peter True
1 Lee Jeremy Lee Ilia False
If the columns to compare are stored in two lists, for example a_cols and b_cols, you can do the following:
a_cols = ["col_a", "col_a1"]
b_cols = ["col_b", "col_b1"]
# Build "col_a==col_b and col_a1==col_b1" from the paired lists, then evaluate it
df["flag"] = df.eval(" and ".join("%s==%s" % pair for pair in zip(a_cols, b_cols)))
print(df)
Output:
col_a col_a1 col_b col_b1 flag
0 Larry Peter Larry Peter True
1 Lee Jeremy Lee Ilia False
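If building an expression string feels fragile (for instance, when column names contain spaces), the same list-driven comparison can be sketched without eval. This is an alternative and not part of the answer above: it compares each pair of columns as Series and combines the results with functools.reduce. The name comparisons is illustrative only.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame({
    "col_a": ["Larry", "Lee"],
    "col_b": ["Larry", "Lee"],
    "col_a1": ["Peter", "Jeremy"],
    "col_b1": ["Peter", "Ilia"],
})

a_cols = ["col_a", "col_a1"]
b_cols = ["col_b", "col_b1"]

# One boolean Series per (a, b) pair of column names.
comparisons = [df[a] == df[b] for a, b in zip(a_cols, b_cols)]

# Elementwise AND across all pairs: a row is flagged only if every pair matches.
df["flag"] = reduce(operator.and_, comparisons)
print(df)
```

This assumes a_cols and b_cols have the same, non-zero length; zip silently drops unpaired names, and reduce raises on an empty list.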
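Answer 2 (score: 0)
I find the following code easier to read.
You only need to compare two columns at a time and combine the two results with a logical AND (&) to get the flag column.
In one line: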
In [18]: tf['flag'] = (tf['col_a'] == tf['col_b']) & (tf['col_a1'] == tf['col_b1'])
In [19]: tf
Out[19]:
col_a col_b col_a1 col_b1 flag
0 Larry Larry Peter Peter True
1 Lee Lee Jeremy Ilia False
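The session above uses a DataFrame named tf that is not defined in the snippet. A self-contained sketch, assuming tf holds the data from the question, might look like this:

```python
import pandas as pd

# Assumption: tf contains the example data from the question.
tf = pd.DataFrame({
    "col_a": ["Larry", "Lee"],
    "col_b": ["Larry", "Lee"],
    "col_a1": ["Peter", "Jeremy"],
    "col_b1": ["Peter", "Ilia"],
})

# Each comparison yields a boolean Series; & combines them elementwise.
# The parentheses are required because & binds more tightly than ==.
tf["flag"] = (tf["col_a"] == tf["col_b"]) & (tf["col_a1"] == tf["col_b1"])
print(tf)
```

The Python keyword and cannot be used here in place of &, because it asks for the truth value of a whole Series, which pandas refuses to compute.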