pandas: replace does not replace the target substring

Time: 2017-10-14 13:18:20

Tags: python string python-3.x pandas replace

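I am trying to iterate over a list of strings in dataframe1 and check whether another dataframe, dataframe2, contains any of those strings, so that I can replace them.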

for index, row in nlp_df.iterrows():
    print( row['x1'] )
    string1 = row['x1'].replace("(","\(")
    string1 = string1.replace(")","\)")
    string1 = string1.replace("[","\[")
    string1 = string1.replace("]","\]")
    nlp2_df['title'] = nlp2_df['title'].replace(string1,"")

To do this, I iterate with the code shown above, checking for and replacing any string found in df1.

The output below shows the strings in df1:

wait_timeout
interactive_timeout
pool_recycle
....
__all__
folder_name
re.compile('he(lo') 

The output below shows df2 after the replacement:

0   have you tried watching the traffic between th...
1   /dev/cu.xxxxx is the "callout" device, it's wh...
2               You'll want the struct package.\r\r\n

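In the df2 output, strings such as /dev/cu.xxxxx should have been replaced during the iteration, but as shown, they were not removed. However, when I tried nlp2_df['title'] = nlp2_df['title'].replace("/dev/cu.xxxxx","") directly, it was removed successfully. Is there a reason why writing the string literally works, but looping with a variable for the replacement doesn't?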

Thanks in advance!

1 Answer:

Answer 0: (score: 0)

IIUC you can simply use a regular expression:

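# regex=True is passed explicitly: since pandas 2.0, str.replace treats the pattern as a literal string by default
nlp2_df['title'] = nlp2_df['title'].str.replace(r'([\(\)\[\]])', r'\\\1', regex=True)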
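
Note that the line above uses .str.replace rather than .replace. As far as I know, Series.replace without regex=True only replaces cells whose whole value equals the search string, while Series.str.replace works on substrings, which may be what the question is running into. A minimal sketch of the difference, using a made-up two-row Series (the data is hypothetical, not the asker's):

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the titles in the question.
s = pd.Series(['/dev/cu.xxxxx is the "callout" device', "/dev/cu.xxxxx"])

# Series.replace compares against the whole cell value (regex is off by default),
# so only the second row, which equals the search string exactly, is changed.
whole = s.replace("/dev/cu.xxxxx", "")

# Series.str.replace looks for the search string inside each cell.
# regex=False treats it as a literal, so no escaping is needed.
sub = s.str.replace("/dev/cu.xxxxx", "", regex=False)

print(whole.tolist())  # ['/dev/cu.xxxxx is the "callout" device', '']
print(sub.tolist())    # [' is the "callout" device', '']
```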

PS: you don't need a for loop at all...

Demo:

In [15]: df
Out[15]:
           title
0  aaa (bbb) ccc
1   A [word] ...

In [16]: df['new'] = df['title'].str.replace(r'([\(\)\[\]])', r'\\\1', regex=True)

In [17]: df
Out[17]:
           title              new
0  aaa (bbb) ccc  aaa \(bbb\) ccc
1   A [word] ...   A \[word\] ...
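
If the goal is to remove every string listed in nlp_df['x1'] from the titles, as the question describes, the loop can probably be dropped too. The sketch below is one way to do it, not necessarily what the asker needs: it assumes the strings should be matched literally, builds a single alternation pattern with re.escape (so the manual replace("(", "\(") calls become unnecessary), and uses small made-up frames in place of the real nlp_df and nlp2_df.

```python
import re
import pandas as pd

# Hypothetical stand-ins for the question's nlp_df and nlp2_df.
nlp_df = pd.DataFrame({"x1": ["/dev/cu.xxxxx", "re.compile('he(lo')", "__all__"]})
nlp2_df = pd.DataFrame({"title": [
    '/dev/cu.xxxxx is the "callout" device',
    "You'll want the struct package.",
    "see re.compile('he(lo') and __all__",
]})

# re.escape escapes every regex metacharacter in each search string.
# Longer strings go first so that a short string cannot shadow a longer
# one that starts with it.
terms = sorted(nlp_df["x1"], key=len, reverse=True)
pattern = "|".join(re.escape(t) for t in terms)

# One vectorised substring replacement over the whole column.
nlp2_df["title"] = nlp2_df["title"].str.replace(pattern, "", regex=True)

print(nlp2_df["title"].tolist())
# [' is the "callout" device', "You'll want the struct package.", 'see  and ']
```

With many thousands of search strings the joined pattern can get large, so this is a starting point rather than a tuned solution.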