Background
I have the following df:
import pandas as pd
df = pd.DataFrame({'Text': ['But the here is SERG BATH # : S00-1111 MR # 111 is Here ',
                            'Found here SERG BATH # : E22-22222 MR # 000',
                            'So so SERG BATH # : L88-888 MR # 975 hey the ',
                            'The SERG BATH # : V99-99 MR # 232 here but',
                            'The is not here is the SERG BATH # : A33-3 MR # 212 here and'],
                   'ID': [1, 2, 3, 4, 5],
                   'P_ID': ['A', 'B', 'C', 'D', 'E'],
                   })
Goal
1) Block the text between SERG BATH # : and MR #
2) Create a new column New_Text
Example
Change
"SERG BATH # : A33-3 MR #"
into
"SERG BATH # : **BLOCK** MR #"
Desired output
ID P_ID Text New_Text
0 "But the here is SERG BATH # : **BLOCK** MR # 111 is Here"
1 "Found here SERG BATH # : **BLOCK** MR # 000"
2 "So so SERG BATH # : **BLOCK** MR # 975 hey the"
3 "The SERG BATH # : **BLOCK** MR # 232 here but"
4 "The is not here is the SERG BATH # : **BLOCK** MR # 212 here and"
Answer 0 (score: 3)
Try:
# regex=True is required in pandas 2.0+, where str.replace defaults to literal matching
df['New_Text'] = df['Text'].str.replace(r'BATH # :(.+?)MR #', 'BATH # : **BLOCK** MR #', regex=True)
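For reference, here is a self-contained sketch of the same idea that can be run as-is. It is an illustration rather than the only way to do this: it assumes the two markers always appear in this order on a single line, uses a shortened copy of the question's df, and the `pattern` variable name is introduced here. Capture groups keep the markers in place so that only the text between them is replaced.

```python
import pandas as pd

# Shortened copy of the question's data (three of the five rows).
df = pd.DataFrame({
    'Text': ['But the here is SERG BATH # : S00-1111 MR # 111 is Here ',
             'Found here SERG BATH # : E22-22222 MR # 000',
             'The is not here is the SERG BATH # : A33-3 MR # 212 here and'],
    'ID': [1, 2, 3],
})

# Group 1 and group 2 capture the markers; the lazy .+? matches
# only the text between them.
pattern = r'(SERG BATH # :).+?(MR #)'

# \1 and \2 re-insert the captured markers around the placeholder.
# regex=True is passed explicitly because newer pandas versions
# treat the pattern as a literal string by default.
df['New_Text'] = df['Text'].str.replace(pattern, r'\1 **BLOCK** \2', regex=True)

print(df['New_Text'].tolist())
```

The original Text column is left untouched, so the blocked and unblocked versions can be compared row by row.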