How to replace substrings in a data frame in Python

Time: 2018-09-16 13:50:07

Tags: python pandas dataframe replace


I have a data frame, and I want to replace certain words in it with other words, based on a second data frame:

import pandas as pd
dist = pd.DataFrame([["21","apple"],["25","balana"],["30","lemon"]],columns=["idx","item"])
a = pd.DataFrame(["apple - banana"],columns=["pf"])
a['pf'] = a['pf'].replace(dist["item"], dist["idx"], regex=True)
print(a)

How can I do this? (It does not work in its current form.)

2 answers:

Answer 0: (score: 0)

You can try the following:

import pandas as pd

dist = pd.DataFrame([["21","apple"],["25","balana"],["30","lemon"]],columns=["idx","item"])
a = pd.DataFrame(["apple - banana"],columns=["pf"])
# Map each idx to its item name
b = dict(zip(dist["idx"], dist["item"]))

def replace_items(token):
    # Replace every item name found in the string with its idx
    for key, value in b.items():
        token = token.replace(value, key)
    return token

a["pf"] = a["pf"].apply(replace_items)

Note that banana is spelled balana in your dist data frame. Not sure whether that is intentional...
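
A single-pass variant of the loop above could look like the sketch below. It assumes banana is the intended spelling in dist (the question has balana), and the names lookup and pattern are only illustrative. It builds one alternation regex from the escaped item names and lets Series.str.replace call a function for each match, so text that has already been substituted is never rescanned.

```python
import re
import pandas as pd

# Assumption: "banana" is the intended spelling (the question has "balana")
dist = pd.DataFrame([["21", "apple"], ["25", "banana"], ["30", "lemon"]],
                    columns=["idx", "item"])
a = pd.DataFrame(["apple - banana"], columns=["pf"])

# item -> idx lookup (hypothetical helper name)
lookup = dict(zip(dist["item"], dist["idx"]))

# One alternation pattern; longer items first so they win over shorter prefixes
pattern = "|".join(re.escape(item) for item in sorted(lookup, key=len, reverse=True))

# The callable receives each regex match and returns its replacement
a["pf"] = a["pf"].str.replace(pattern, lambda m: lookup[m.group(0)], regex=True)
print(a)
```

Sorting by length only matters when one item is a prefix of another; with the three items here it changes nothing.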

Answer 1: (score: 0)

Converting the translation table to a dictionary seems to do the trick:

import pandas as pd
dist = pd.DataFrame([["apple","21"],["banana","25"],["lemon","30"]],columns=["item","idx"])
dist = dist.set_index('item')['idx'].to_dict()
a = pd.DataFrame(["apple - banana"],columns=["pf"])
a['pf'] = a['pf'].replace(dist, regex=True)
print(a)
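
One caveat, offered as a hedged sketch rather than a definitive fix: with regex=True the dictionary keys are interpreted as regular expressions and can match anywhere in the string, so apple would also match inside pineapple. Assuming whole-word replacement is what is wanted, the keys can be escaped with re.escape and wrapped in \b word boundaries. The second row of a and the name mapping are made up for illustration.

```python
import re
import pandas as pd

dist = pd.DataFrame([["apple", "21"], ["banana", "25"], ["lemon", "30"]],
                    columns=["item", "idx"])
# Second row is a made-up example to show the word-boundary effect
a = pd.DataFrame(["apple - banana", "pineapple + lemon"], columns=["pf"])

# Escape each item and require word boundaries on both sides
mapping = {rf"\b{re.escape(item)}\b": idx
           for item, idx in zip(dist["item"], dist["idx"])}

a["pf"] = a["pf"].replace(mapping, regex=True)
print(a)
```

Without the \b anchors, the second row would have its pineapple altered as well.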