Question

Here is a sample DataFrame:

import pandas as pd

data = {'A': ['hi UK','hi IN','hi US']}

df = pd.DataFrame(data)

I want to update the UK and IN values in column A using the mapping dictionary below:

abs = {'U': 'UK -- extra', 'UK': 'test Kingdom', 'IN':'India'}

Then I used the replace function (pandas.DataFrame.replace):

df['A'] = df['A'].replace(to_replace = abs, regex=True)
print(df)

                           A
0  hi test Kingdom -- extraK
1                   hi India
2  hi test Kingdom -- extraS

It first replaces U with 'UK -- extra', and then replaces UK with 'test Kingdom', so the final result is 'hi test Kingdom -- extraK'. It should simply give 'hi test Kingdom'.
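The output above is consistent with the patterns being applied one after another in dictionary order, each one seeing the result of the previous substitution. A minimal sketch of that reading, using only the standard `re` module; the helper `sequential_replace` and the name `mapping` are illustrative assumptions, not pandas internals:

```python
import re

# Same mapping as in the question (called `mapping` here, an illustrative name).
mapping = {'U': 'UK -- extra', 'UK': 'test Kingdom', 'IN': 'India'}

def sequential_replace(text, mapping):
    # Apply each pattern in dict order; later patterns see earlier replacements.
    for pattern, repl in mapping.items():
        text = re.sub(pattern, repl, text)
    return text

results = [sequential_replace(s, mapping) for s in ['hi UK', 'hi IN', 'hi US']]
print(results)
# ['hi test Kingdom -- extraK', 'hi India', 'hi test Kingdom -- extraS']
```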

Expected output:

                 A
0  hi test Kingdom
1         hi India
2            hi US

Am I missing something? Or is there another way to achieve the result above?

Thanks.

Answer 1

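I think word boundaries should help here, so that UK and US are matched only as whole words and a bare U inside them is not: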

import pandas as pd

data = {'A': ['hi UK','hi IN','hi US']}

d = {'U': 'UK -- extra', 'UK': 'test Kingdom', 'IN':'India'}

# Wrap every key in \b...\b so it only matches as a whole word
d = {r'\b{}\b'.format(k):v for k, v in d.items()}
df = pd.DataFrame(data)

df['A'] = df['A'].replace(to_replace = d, regex=True)
print(df)

                 A
0  hi test Kingdom
1         hi India
2            hi US
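One caveat: with word boundaries the patterns are still applied one after another, so a standalone U would become 'UK -- extra' and its UK could then be rewritten by the next pattern. A possible alternative, sketched here with the standard `re` module, is to build a single alternation and replace everything in one pass. The helper name `replace_whole_words` is hypothetical, and `re.escape` is added on the assumption that keys are literal text rather than regex patterns:

```python
import re

d = {'U': 'UK -- extra', 'UK': 'test Kingdom', 'IN': 'India'}

def replace_whole_words(text, mapping):
    # Longest keys first, so 'UK' is tried before 'U' in the alternation.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = r'\b(?:' + '|'.join(re.escape(k) for k in keys) + r')\b'
    # Single pass: replacement text is never scanned again.
    return re.sub(pattern, lambda m: mapping[m.group(0)], text)

results = [replace_whole_words(s, d) for s in ['hi UK', 'hi IN', 'hi US', 'hi U']]
print(results)
# ['hi test Kingdom', 'hi India', 'hi US', 'hi UK -- extra']
```

On the DataFrame this could be applied with something like df['A'].apply(lambda s: replace_whole_words(s, d)).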
