Replace a substring

Date: 2018-11-12 10:50:34

Tags: regex pandas replace

I am trying to replace part of a text column using a regular expression.

My DataFrame:

       A          B                                    C
  French      house               Phone. <phone_numbers>
 English      house               email - <adresse_mail>
  French  apartment                       code : bla!123
  French      house                        Hello George!
 English  apartment   Ethan, my phone is <phone_numbers>

Desired output:

       A          B                                    C
  French      house               Phone. <phone_numbers>
 English      house               email - <adresse_mail>
  French  apartment                        code : <code>
  French      house                        Hello George! 
 English  apartment   Ethan, my phone is <phone_numbers>

First, I tried:

df['C'] = df['C'].str.replace(r'((ask code)|(code))\s?:?\s?\w+','<code>')

It works, but not completely. For this input:

code : bla!123

the output is:

<code>!123

So I tried this:

df['C'] = df['C'].str.replace(r'(ask code)|(code)\s?:?\s?), (\s?\w+)', r'\2,<code>')

But nothing happened...

2 Answers:

Answer 0: (score: 3)

I would do:

df['C'] = df['C'].str.replace(r'(ask code|code)(\s?:?\s?).+', r'\1\2<code>')
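A minimal, self-contained sketch of this approach on the sample column from the question. It assumes a recent pandas, where `Series.str.replace` treats the pattern as a literal string unless `regex=True` is passed (the default changed in pandas 2.0). The small DataFrame is rebuilt here only for illustration.

```python
import pandas as pd

# Sample column rebuilt from the question (illustrative only).
df = pd.DataFrame({
    "C": [
        "Phone. <phone_numbers>",
        "email - <adresse_mail>",
        "code : bla!123",
        "Hello George!",
        "Ethan, my phone is <phone_numbers>",
    ]
})

# Group 1 keeps the keyword, group 2 keeps the optional " : " separator,
# and .+ consumes the rest of the value, including punctuation such as "!".
pattern = r"(ask code|code)(\s?:?\s?).+"

# regex=True is passed explicitly because newer pandas defaults to literal matching.
df["C"] = df["C"].str.replace(pattern, r"\1\2<code>", regex=True)

print(df["C"].tolist())
```

The original attempt stopped at `!` because `\w+` only matches letters, digits and underscores, whereas `.+` runs to the end of the value.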

Answer 1: (score: 2)

Input:

 import re
 string = 'code : bla!123'
 # Group 1 captures everything after the "code :" prefix; that text is then replaced.
 string.replace(re.match(r'code\s?:?\s?(.*)', string)[1], '<code>')

Output:

 'code : <code>'
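
Note that `re.match` only matches at the start of the string and returns `None` when the keyword is missing, so the snippet above fits the single example string. A hedged alternative sketch using `re.sub`, which leaves strings without the keyword untouched. The helper name `mask_code` is hypothetical and not part of the original answer.

```python
import re

# The keyword plus its optional " : " separator is captured; the remainder is replaced.
CODE_RE = re.compile(r"(code\s?:?\s?).+")

def mask_code(text: str) -> str:
    """Replace whatever follows 'code' (and an optional colon) with '<code>'."""
    return CODE_RE.sub(r"\1<code>", text)

print(mask_code("code : bla!123"))
print(mask_code("Hello George!"))
```

Assuming the DataFrame from the question, this could be applied to the column with `df['C'].map(mask_code)`.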