Extracting tags from a string with re.sub

Time: 2018-07-17 10:45:01

Tags: html regex string

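I am trying to understand why the following use of a regular expression does not work, i.e. the pattern is not found in the string (the general form of re.sub is re.sub(pattern, repl, string)). Can you tell me why, and if possible show how to modify the code so that it works?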

import re

string = " Many of them are not just serving time in prison, but they are being let out of prison and back into our communities, having committed appalling crimes. They are not being kicked out.</p><p><i>[</i></p><p><i>Interruption.</i></p><p><i>]</i></p><p>And no doubt they are indeed receiving benefits. That is why the British people are fed up and want action to be taken. It is unlikely that my hon."

testOutput = re.sub(r'</p><p><i>[</i></p><p><i>Interruption.</i></p><p><i>]</i></p><p>',' ',string)

print(testOutput)

1 Answer:

Answer 0: (score: 0)

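I actually found the answer myself. Square brackets have their own meaning in the pattern passed to re.sub (they delimit a character class), so I had to replace '[' with '\['.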
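To make the special meaning concrete, here is a small sketch. The sample text 'x[abc]y' and the variable names are made up for illustration and are not from the question.

```python
import re

# Unescaped, [abc] is a character class: it matches ONE character
# that is 'a', 'b' or 'c'. It does not match the literal text "[abc]".
class_hits = re.findall(r'[abc]', 'x[abc]y')
print(class_hits)    # ['a', 'b', 'c']

# Escaped, \[ and \] are literal bracket characters.
literal_hits = re.findall(r'\[abc\]', 'x[abc]y')
print(literal_hits)  # ['[abc]']
```

In the question's pattern, everything between the first '[' and the first ']' is read as one such character class. The literal '[' in the string is not among its members, so the pattern cannot match.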

The backslash turns [ into a literal character, so the pattern now matches that part of the string.
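A sketch of the corrected call, using a shortened copy of the question's string. Escaping by hand is what I did. The re.escape and str.replace variants are alternatives I am adding here for comparison, and the variable names are mine.

```python
import re

string = ("They are not being kicked out.</p><p><i>[</i></p><p><i>Interruption.</i>"
          "</p><p><i>]</i></p><p>And no doubt they are indeed receiving benefits.")

# Option 1: escape the metacharacters by hand. The dot is escaped too,
# because an unescaped '.' would match any character.
pattern = r'</p><p><i>\[</i></p><p><i>Interruption\.</i></p><p><i>\]</i></p><p>'
manual = re.sub(pattern, ' ', string)

# Option 2: let re.escape build the pattern from the literal text.
literal = '</p><p><i>[</i></p><p><i>Interruption.</i></p><p><i>]</i></p><p>'
escaped = re.sub(re.escape(literal), ' ', string)

# Option 3: for a fixed piece of text, no regex is needed at all.
plain = string.replace(literal, ' ')

print(manual)
# They are not being kicked out. And no doubt they are indeed receiving benefits.
```

All three give the same result here. re.escape is handy when the text to remove comes from somewhere else and may contain other special characters.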