Compare two DataFrame columns to check whether they have the same values in Python

Time: 2017-08-09 06:29:27

Tags: python pandas dataframe data-analysis

I have two DataFrames:

new1
      Name       city
 0    sri won    chn
 1    pechi won  pune
 2    Ram won    mum
 0    pec won    kerala

new3
    req
0   pec
1   mut

I tried:

mask=new1.Name.str.contains("|".join(new3.req.values.tolist()))
new1[mask]

I got:

 new1[mask]
      Name       city
 1  pechi won    pune
 0  pec won      kerala

Because "pechi" contains "pec", that row was selected as well. I want only whole-word matches, not substring ("contains") matches.
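For anyone who wants to reproduce this, here is a minimal, self-contained sketch. The two frames are rebuilt by hand from the tables above (including the repeated index label 0), so the construction code is an assumption rather than the original data source.

```python
import pandas as pd

# Frames rebuilt by hand from the tables in the question (assumed construction).
new1 = pd.DataFrame(
    {"Name": ["sri won", "pechi won", "Ram won", "pec won"],
     "city": ["chn", "pune", "mum", "kerala"]},
    index=[0, 1, 2, 0],
)
new3 = pd.DataFrame({"req": ["pec", "mut"]})

# Joining the terms with "|" gives a regex alternation with no boundaries,
# so any Name that merely contains "pec" or "mut" as a substring matches.
pattern = "|".join(new3["req"].tolist())
mask = new1["Name"].str.contains(pattern)

matched = new1.loc[mask.to_numpy(), "Name"].tolist()
print(pattern)
print(matched)
```

Both "pechi won" and "pec won" are selected, because the pattern is searched anywhere inside each string.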

My desired output is:

 new1[mask]
      Name       city
 0  pec won      kerala

2 Answers:

Answer 0: (score: 1)

You need \b, which stands for a "word boundary":

a = r'\b(' + "|".join(new3.req.values.tolist()) + r')\b'
print(a)
\b(pec|mut)\b

mask=new1.Name.str.contains(a)
df = new1[mask]
print(df)
      Name    city
0  pec won  kerala
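A slightly more defensive variant of the same idea, as a sketch only: it escapes each search term with re.escape in case a term contains regex metacharacters, and uses a non-capturing group (?:...), since pandas may emit a warning when str.contains is given a pattern with capture groups. The frame construction and the names terms, pattern and result are assumptions for illustration.

```python
import re
import pandas as pd

# Frames rebuilt by hand from the question (assumed construction).
new1 = pd.DataFrame(
    {"Name": ["sri won", "pechi won", "Ram won", "pec won"],
     "city": ["chn", "pune", "mum", "kerala"]},
    index=[0, 1, 2, 0],
)
new3 = pd.DataFrame({"req": ["pec", "mut"]})

# Escape each term so characters such as "." or "+" are matched literally.
terms = [re.escape(t) for t in new3["req"]]

# (?:...) groups the alternatives without capturing;
# \b on both sides requires a word boundary around the matched term.
pattern = r"\b(?:" + "|".join(terms) + r")\b"

mask = new1["Name"].str.contains(pattern, regex=True)
result = new1[mask]
print(pattern)
print(result)
```

With the sample data this keeps only the "pec won" row, because in "pechi won" the letters "pec" are followed by "h" and so there is no word boundary after them.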

Answer 1: (score: 0)

You need spaces in the separator:

In [1350]: new1
Out[1350]:
        Name    city
0    sri won     chn
1  pechi won    pune
2    Ram won     mum
0    pec won  kerala

In [1351]: new3
Out[1351]:
   req
0  pec
1  mut

In [1352]: ' | '.join(new3.req)
Out[1352]: 'pec | mut'

In [1353]: new1.Name.str.contains(' | '.join(new3.req))
Out[1353]:
0    False
1    False
2    False
0     True
Name: Name, dtype: bool

In [1354]: new1[new1.Name.str.contains(' | '.join(new3.req))]
Out[1354]:
      Name    city
0  pec won  kerala
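One caveat worth checking before relying on the space separator: the pattern 'pec | mut' is the alternation of "pec " (with a trailing space) and " mut" (with a leading space), so the result depends on where the term sits in the string. The sketch below compares it with a word-boundary pattern on a few made-up names; the names Series is hypothetical and not part of the question's data.

```python
import pandas as pd

# Hypothetical names chosen to probe the edges of both patterns.
names = pd.Series(["pec won", "won pec", "mut won", "apec won"])

# 'pec | mut' matches "pec " or " mut" anywhere in the string.
space_pattern = " | ".join(["pec", "mut"])
space_mask = names.str.contains(space_pattern)

# Word-boundary pattern, for comparison.
boundary_mask = names.str.contains(r"\b(?:pec|mut)\b")

print(pd.DataFrame({"name": names,
                    "space": space_mask,
                    "boundary": boundary_mask}))
```

On this sample the space-separated pattern misses "won pec" and "mut won" and accepts "apec won", while the word-boundary pattern treats all four as whole-word matching would suggest. For the data in the question both approaches return the same single row.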