I have a dataframe, and I want to replace certain words in it with other words, based on a second dataframe:
import pandas as pd
dist = pd.DataFrame([["21","apple"],["25","balana"],["30","lemon"]],columns=["idx","item"])
a = pd.DataFrame(["apple - banana"],columns=["pf"])
a['pf'] = a['pf'].replace(dist["item"], dist["idx"], regex=True)
print(a)
How can I do this? (It doesn't work in its current form.)
Answer 0 (score: 0)
You could try the following:
import pandas as pd
dist = pd.DataFrame([["21","apple"],["25","balana"],["30","lemon"]],columns=["idx","item"])
a = pd.DataFrame(["apple - banana"],columns=["pf"])
# Map each idx to its item, e.g. {"21": "apple", ...}
b = dict(zip(dist["idx"], dist["item"]))

def replace_items(token):
    # Replace every item name found in the string with its idx
    for key, value in b.items():
        token = token.replace(value, key)
    return token

a["pf"] = a["pf"].apply(replace_items)
Note that "banana" is spelled "balana" in your dist dataframe. Not sure whether that is intentional...
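To illustrate why the spelling matters, here is a minimal sketch comparing the same loop-based replacement with the misspelled and the corrected item. The two-argument replace_items helper and the names with_typo and corrected are made up for this illustration and are not part of the question:

```python
import pandas as pd

def replace_items(text, mapping):
    # Hypothetical helper: plain substring replacement of each item by its idx
    for item, idx in mapping.items():
        text = text.replace(item, idx)
    return text

a = pd.DataFrame(["apple - banana"], columns=["pf"])

# Assumed mappings: one with the typo from the question, one corrected
with_typo = {"apple": "21", "balana": "25", "lemon": "30"}
corrected = {"apple": "21", "banana": "25", "lemon": "30"}

result_typo = a["pf"].apply(lambda s: replace_items(s, with_typo))
result_fixed = a["pf"].apply(lambda s: replace_items(s, corrected))

print(result_typo[0])   # "banana" is left untouched: no key matches it
print(result_fixed[0])  # both words are replaced
```

With the typo, only "apple" is replaced, so the mismatch fails silently rather than raising an error.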
Answer 1 (score: 0)
Converting the translation table to a dictionary seems to do the trick:
import pandas as pd
dist = pd.DataFrame([["apple","21"],["banana","25"],["lemon","30"]],columns=["item","idx"])
dist = dist.set_index('item')['idx'].to_dict()
a = pd.DataFrame(["apple - banana"],columns=["pf"])
a['pf'] = a['pf'].replace(dist, regex=True)
print(a)
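One caveat with regex=True is that the dictionary keys are treated as regular expressions and match anywhere in the string, so "apple" would also match inside "pineapple". If whole-word matching is wanted, a possible variation is sketched below. It assumes the items are plain words, escapes them with re.escape, and wraps them in \b word boundaries; the names mapping and pattern and the extra "pineapple - lemon" row are added here purely for illustration:

```python
import re
import pandas as pd

dist = pd.DataFrame([["apple", "21"], ["banana", "25"], ["lemon", "30"]], columns=["item", "idx"])
a = pd.DataFrame(["apple - banana", "pineapple - lemon"], columns=["pf"])

# Lookup table from item to idx
mapping = dict(zip(dist["item"], dist["idx"]))

# One alternation of escaped items, anchored on word boundaries
pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in mapping) + r")\b")

# The callable receives each match object and returns the replacement text
a["pf"] = a["pf"].str.replace(pattern, lambda m: mapping[m.group(1)], regex=True)
print(a)
```

Here "pineapple" is left alone because there is no word boundary before its "apple" part, while the standalone words are replaced.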