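I have a DataFrame column of strings. I want to replace specific words in those strings with values from another DataFrame that maps each word to its replacement. I am currently using iterrows(), which takes about 2 minutes for 25,000 rows. Is there a more efficient way to do this?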
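import pandas as pd

syn = pd.ExcelFile("C:/Key-Value.xlsx")
df_syn = syn.parse("Keys")
for idx, row in df_syn.iterrows():
    # pandas >= 2.0 defaults to regex=False, so pass regex=True for \b to match a word boundary
    df['col'] = df['col'].str.replace(r"\b" + row['synonym'] + r"\b", row['word'], regex=True)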
Answer 0 (score: 1)
IIUC:
Setup
df_syn = pd.DataFrame(dict(synonym=['hug', 'kiss'], word=['warm', 'tender']))
df = pd.DataFrame(dict(col=['I want a hug', 'a kiss would be great']))
print(df_syn, df, sep='\n\n')
   synonym    word
0      hug    warm
1     kiss  tender

                     col
0           I want a hug
1  a kiss would be great
Solution
# Wrap each synonym in \b word boundaries, then build a {pattern: replacement} dict
mapping = df_syn.assign(
    synonym=df_syn.synonym.radd(r'\b').add(r'\b')
).set_index('synonym').word.to_dict()
df.replace({'col': mapping}, regex=True)
                       col
0            I want a warm
1  a tender would be great
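If the synonym list is long, another option worth trying is to combine all synonyms into one alternation pattern and look up the replacement in a plain dict, so each string is scanned once rather than once per synonym. The following is only a sketch under some assumptions: the synonyms are literal words (hence `re.escape`), matching is case-sensitive, and the names `lookup`, `alternatives`, `pattern` and `result` are made up for illustration. Whether it is actually faster on your data is something you would need to time yourself.

```python
import re
import pandas as pd

# Same toy data as in the setup above
df_syn = pd.DataFrame(dict(synonym=['hug', 'kiss'], word=['warm', 'tender']))
df = pd.DataFrame(dict(col=['I want a hug', 'a kiss would be great']))

# Plain dict: synonym -> replacement word
lookup = dict(zip(df_syn['synonym'], df_syn['word']))

# Longest synonyms first, so a longer entry is tried before a shorter one
# that shares its prefix
alternatives = sorted(lookup, key=len, reverse=True)

# One compiled pattern: \b(?:hug|kiss)\b, with each synonym escaped as a literal
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, alternatives)) + r')\b')

# The callable receives each match object and returns its replacement
result = df['col'].str.replace(pattern, lambda m: lookup[m.group(0)], regex=True)
print(result)
```

As with `df.replace`, this returns a new object and leaves `df` unchanged, so assign it back with `df['col'] = result` if you want to keep it.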