Question

I have a DataFrame df with one column:

     List
 0   What are you trying to achieve
 1   What is your purpose right here
 2   When students don’t have a proper foundation
 3   I am going to DESCRIBE a sunset

I have another DataFrame, df2, with two columns:

    original       correct
0     are          were
1     sunset       sunrise
2     I            we
3     right        correct
4     is           was
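
For anyone who wants to reproduce this, the two frames can be built roughly as follows. This is a minimal sketch that assumes pandas is imported as pd and that the column names are exactly those shown in the tables above.

```python
import pandas as pd

# Sentences to be corrected, one per row.
df = pd.DataFrame({"List": [
    "What are you trying to achieve",
    "What is your purpose right here",
    "When students don’t have a proper foundation",
    "I am going to DESCRIBE a sunset",
]})

# Lookup table: each word in "original" should become the word in "correct".
df2 = pd.DataFrame({
    "original": ["are", "sunset", "I", "right", "is"],
    "correct": ["were", "sunrise", "we", "correct", "was"],
})
```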

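I want to replace every word in df that occurs in the original column of df2 with the corresponding word from the correct column, and store the resulting strings in another DataFrame, df_new.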

Is it possible to do this without loops or explicit iteration, using only plain pandas operations?

That is, my df_new should contain:

     List
 0   What were you trying to achieve
 1   What was your purpose correct here
 2   When students don’t have a proper foundation
 3   we am going to DESCRIBE a sunrise

This is just a test example. My df may contain millions of rows of strings, and so may df2. What is the most efficient approach I can take?

Answer 1

One of many possible solutions:

In [371]: boundary = r'\b'
     ...:
     ...: df.List.replace((boundary + df2.original + boundary).values.tolist(),
     ...:                 df2.correct.values.tolist(),
     ...:                 regex=True)
     ...:
Out[371]:
0                  What were you trying to achieve
1               What was your purpose correct here
2     When students don’t have a proper foundation
3                we am going to DESCRIBE a sunrise
Name: List, dtype: object
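
The list form of replace above applies the patterns one after another, so a word produced by an earlier replacement could in principle be matched again by a later one. A possible alternative, sketched below, builds a single alternation pattern and looks each match up in a dict, so every word is replaced at most once in one pass. The names mapping and pattern are illustrative, the sample frames are rebuilt so the block is self-contained, and whether this is actually faster on millions of rows has not been measured here; a very large df2 would also produce a very large pattern.

```python
import re
import pandas as pd

df = pd.DataFrame({"List": [
    "What are you trying to achieve",
    "What is your purpose right here",
    "When students don’t have a proper foundation",
    "I am going to DESCRIBE a sunset",
]})
df2 = pd.DataFrame({
    "original": ["are", "sunset", "I", "right", "is"],
    "correct": ["were", "sunrise", "we", "correct", "was"],
})

# Map each original word to its replacement (hypothetical helper name).
mapping = dict(zip(df2["original"], df2["correct"]))

# One whole-word alternation; re.escape guards against regex metacharacters.
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in mapping) + r")\b"
)

# Replace each match with its dict entry in a single pass over every string.
df_new = pd.DataFrame({
    "List": df["List"].str.replace(
        pattern, lambda m: mapping[m.group(0)], regex=True
    )
})
print(df_new)
```

As in the answer above, matching is case-sensitive, so "DESCRIBE" and lowercase "i" are left alone.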
