Change the characters between two defined words

Time: 2019-08-15 20:58:15

Tags: regex python-3.x pandas text replace

Background

I have the following df:

import pandas as pd

df = pd.DataFrame({
    'Text': ['But the here is SERG BATH # : S00-1111 MR # 111 is Here ',
             'Found here SERG BATH # : E22-22222 MR # 000',
             'So so SERG BATH # : L88-888 MR # 975 hey the ',
             'The SERG BATH # : V99-99 MR # 232 here but',
             'The is not here is the SERG BATH # : A33-3 MR # 212 here and'],
    'ID': [1, 2, 3, 4, 5],
    'P_ID': ['A', 'B', 'C', 'D', 'E'],
})

Goal

1) Replace all characters between SERG BATH # : and MR # with **BLOCK**

2) Create a new column, New_Text, to hold the result

Example

Change

"SERG BATH # : A33-3 MR #" 

into

"SERG BATH # : **BLOCK** MR #"

Desired output

   ID P_ID  Text  New_Text
0                 "But the here is SERG BATH # : **BLOCK** MR # 111 is Here"
1                 "Found here SERG BATH # : **BLOCK** MR # 000"
2                 "So so SERG BATH # : **BLOCK** MR # 975 hey the"
3                 "The SERG BATH # : **BLOCK** MR # 232 here but"
4                 "The is not here is the SERG BATH # : **BLOCK** MR # 212 here and"

1 Answer:

Answer 0: (score: 3)

Try:

# regex=True is needed on pandas 2.0+, where str.replace defaults to literal matching
df['New_Text'] = df['Text'].str.replace(r'BATH # :(.+?)MR #', 'BATH # : **BLOCK** MR #', regex=True)
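
As an illustration of the same idea outside pandas, here is a minimal sketch using only the standard-library `re` module. The names `PATTERN` and `block_between` are hypothetical and not part of the original answer. It assumes both markers appear literally as `SERG BATH # :` and `MR #`, and it keeps them via capture groups instead of retyping them in the replacement.

```python
import re

# Hypothetical helper for illustration; assumes the two markers appear literally.
# Group 1 keeps the left marker, group 2 keeps the right marker, and the lazy
# `.+?` consumes only the text between them.
PATTERN = re.compile(r"(SERG BATH # :).+?(MR #)")


def block_between(text: str) -> str:
    """Replace whatever sits between 'SERG BATH # :' and 'MR #' with **BLOCK**."""
    return PATTERN.sub(r"\1 **BLOCK** \2", text)


samples = [
    "Found here SERG BATH # : E22-22222 MR # 000",
    "The SERG BATH # : V99-99 MR # 232 here but",
]
results = [block_between(s) for s in samples]
for line in results:
    print(line)
```

With pandas, the helper could presumably be applied per row with `df['Text'].apply(block_between)`, or the same pattern and replacement string could be passed to `df['Text'].str.replace(..., regex=True)`. Rows that do not contain both markers are returned unchanged.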