How do I match and merge two dataframes whose values are completely different (apart from a single word)?

Asked: 2019-01-29 09:51:09

Tags: python python-3.x pandas dataframe

I have a dataframe ABC with these values:

    0                      1           2
0   sun is rising        | UNKNOWN   | 1465465
1   micheal has arrived  | UNKNOWN   | 324654
2   goal has been scored | UNKNOWN   | 547854

and another dataframe, XYZ, with these values:

    0         1
0   sun     | password1
1   goal    | password2
2   micheal | password3

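How can I map ABC to XYZ using the keywords (sun, goal and micheal), so that each UNKNOWN in column 1 of ABC is replaced by the matching password (password1 for the row containing sun, and so on)?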

The output I need:

    0                      1           2
0   sun is rising        | password1 | 1465465
1   micheal has arrived  | password3 | 324654
2   goal has been scored | password2 | 547854
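For readers who want to try this locally, the two frames can be rebuilt roughly as follows. This is only a sketch: the integer column labels 0, 1, 2 are assumed from the printed frames, and the name `expected` is introduced here purely so a solution can be compared against the desired output.

```python
import pandas as pd

# Integer column labels (0, 1, 2) are an assumption taken from the printouts.
ABC = pd.DataFrame({
    0: ["sun is rising", "micheal has arrived", "goal has been scored"],
    1: ["UNKNOWN", "UNKNOWN", "UNKNOWN"],
    2: [1465465, 324654, 547854],
})
XYZ = pd.DataFrame({
    0: ["sun", "goal", "micheal"],
    1: ["password1", "password2", "password3"],
})

# The desired result, written out by hand (hypothetical helper name).
expected = ABC.copy()
expected[1] = ["password1", "password3", "password2"]
print(expected)
```

Note that the keywords in XYZ are not in the same row order as the sentences in ABC, so a purely positional assignment is not enough.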

3 answers:

Answer 0: (score: 2)

Here is one way to do it, combining str.contains with boolean indexing to select the password wherever a match is found between the two dataframes:

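# For each keyword in xyz, build a boolean mask of the abc rows whose text
# contains it, then write that keyword's password into column 1 of those rows.
# Assigning through the mask keeps the result independent of row order.
for keyword, password in zip(xyz[0], xyz[1]):
    abc.loc[abc[0].str.contains(keyword), 1] = password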

                      0          1        2
0         sun is rising  password1  1465465
1   micheal has arrived  password3   324654
2  goal has been scored  password2   547854
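A self-contained sketch of the str.contains plus boolean-mask idea is shown here. The frame contents are copied from the question; the integer column labels and the names `keyword`, `password` and `mask` are assumptions made for illustration. `regex=False` is used so each keyword is treated as a literal substring.

```python
import pandas as pd

# Data copied from the question; integer column labels are assumed.
abc = pd.DataFrame({
    0: ["sun is rising", "micheal has arrived", "goal has been scored"],
    1: ["UNKNOWN", "UNKNOWN", "UNKNOWN"],
    2: [1465465, 324654, 547854],
})
xyz = pd.DataFrame({
    0: ["sun", "goal", "micheal"],
    1: ["password1", "password2", "password3"],
})

for keyword, password in zip(xyz[0], xyz[1]):
    # True for every abc row whose sentence contains the literal keyword.
    mask = abc[0].str.contains(keyword, regex=False)
    # Overwrite column 1 only on the matching rows.
    abc.loc[mask, 1] = password

print(abc)
```

Rows that match no keyword are never touched, so they would simply keep their UNKNOWN value.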

Answer 1: (score: 2)

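# Build a keyword -> password lookup from XYZ.
d = dict(zip(XYZ[0], XYZ[1]))
# {'sun': 'password1', 'goal': 'password2', 'micheal': 'password3'}
# Alternation pattern with a single capture group: (sun|goal|micheal)
pat = r'({})'.format('|'.join(d.keys()))
# Extract the first keyword found in each sentence, then map it to its password.
ABC[1] = ABC[0].str.extract(pat, expand=False).map(d)
print(ABC)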

                      0          1        2
0         sun is rising  password1  1465465
1   micheal has arrived  password3   324654
2  goal has been scored  password2   547854
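A slightly more defensive sketch of the same extract-and-map idea follows. Three things here are additions rather than part of the answer: `re.escape` (in case a keyword contains regex metacharacters), `\b` word boundaries (so that sun does not match inside sunset), and `fillna` to keep the old value when nothing matches. The fourth row is hypothetical and exists only to show the no-match case.

```python
import re
import pandas as pd

ABC = pd.DataFrame({
    0: ["sun is rising", "micheal has arrived", "goal has been scored",
        "sunset was late"],          # hypothetical row with no whole-word match
    1: ["UNKNOWN"] * 4,
    2: [1465465, 324654, 547854, 0],  # the final 0 is a placeholder value
})
XYZ = pd.DataFrame({
    0: ["sun", "goal", "micheal"],
    1: ["password1", "password2", "password3"],
})

d = dict(zip(XYZ[0], XYZ[1]))
# One capture group of escaped keywords, anchored on word boundaries.
pat = r"\b(" + "|".join(re.escape(k) for k in d) + r")\b"

# NaN where no keyword was found; otherwise the mapped password.
matched = ABC[0].str.extract(pat, expand=False).map(d)
# Fall back to the existing value in column 1 for unmatched rows.
ABC[1] = matched.fillna(ABC[1])
print(ABC)
```

Because the pattern is built once and applied with vectorised string methods, this variant avoids a Python-level loop over the rows.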

Answer 2: (score: 2)

Create a dictionary, then use get and next to pick the password for the first matching word:

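# Keyword -> password lookup built from XYZ.
d = dict(zip(XYZ[0], XYZ[1]))
# For each sentence, take the password of the first word that is a key in d.
# Note: next() raises StopIteration if a sentence contains no keyword.
ABC[1] = [next(d.get(y) for y in x.split() if y in d) for x in ABC[0]]
print(ABC)

                      0          1        2
0         sun is rising  password1  1465465
1   micheal has arrived  password3   324654
2  goal has been scored  password2   547854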
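If some sentences might contain none of the keywords, `next` accepts a default as its second argument, which avoids the StopIteration. The sketch below assumes that unmatched rows should keep their current value; the helper name `first_password` and the fourth row are hypothetical.

```python
import pandas as pd

ABC = pd.DataFrame({
    0: ["sun is rising", "micheal has arrived", "goal has been scored",
        "nobody is here"],           # hypothetical row with no keyword
    1: ["UNKNOWN"] * 4,
    2: [1465465, 324654, 547854, 0],  # the final 0 is a placeholder value
})
XYZ = pd.DataFrame({
    0: ["sun", "goal", "micheal"],
    1: ["password1", "password2", "password3"],
})

d = dict(zip(XYZ[0], XYZ[1]))

def first_password(sentence, current):
    """Return the password of the first word found in d, else `current`."""
    return next((d[word] for word in sentence.split() if word in d), current)

ABC[1] = [first_password(s, cur) for s, cur in zip(ABC[0], ABC[1])]
print(ABC)
```

Splitting on whitespace matches whole words only, so this behaves differently from a substring search when a keyword appears inside a longer word.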