I receive an API response as a string, and it can come in two different formats:
1)This is a message. <br><br>This message was created by Jimmy.
2)
This is a message.
Text can be in the new row.
This message was created by Jimmy.
I want to remove the text "This message was created by ['name']" from each message. Expected result:
This is a message.
This is what I tried:
modified_message = re.search('(.+?)<br><br>', message).group(1)
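It works for example 1), but of course not for 2).
How can I filter out the text in example 2), given that it is a multi-line string? Is it possible to handle both cases with a single expression?
Answer 0 (score: 1)
Please check this. I added code that handles the multi-line string as well.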
import re

data1 = "This is a message. <br><br>This message was created by Jimmy."
data2 = """
This is a message.
This message was created by Jimmy.
"""

print("First case...")
print(data1)
# re.DOTALL lets "." match newlines, so the same pattern covers both formats
output1 = re.findall('(.*?)This message was created', data1, re.DOTALL)[0].replace("<br>", '')
print("Output is ...")
print(output1)
print("----------------------------------------")
print("Second Case...")
print(data2)
print("Output is ...")
# Same pattern applied to the multi-line input; strip() drops the surrounding newlines
output2 = re.findall('(.*?)This message was created', data2, re.DOTALL)[0].replace("<br>", '').strip()
print(output2)
Output:
C:\Users>python main.py
First case...
This is a message. <br><br>This message was created by Jimmy.
Output is ...
This is a message.
----------------------------------------
Second Case...
This is a message.
This message was created by Jimmy.
Output is ...
This is a message.
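If a single expression for both formats is preferred, one possible sketch is to delete the signature with re.sub instead of capturing what precedes it. The helper name strip_signature and the SIGNATURE pattern are hypothetical, not part of the code above, and the sketch assumes the signature is always the last line of the message and always starts with "This message was created by".

```python
import re

# Hypothetical pattern: optional whitespace or <br> tags, then the signature
# sentence up to the end of its line, then any trailing whitespace.
# Assumes the signature is the last line of the message.
SIGNATURE = re.compile(r'(?:\s|<br\s*/?>)*This message was created by [^\n]*\s*$')


def strip_signature(message):
    """Return the message without the trailing 'created by' signature."""
    return SIGNATURE.sub('', message).strip()


single_line = "This is a message. <br><br>This message was created by Jimmy."
multi_line = """This is a message.
Text can be in the new row.
This message was created by Jimmy."""

print(strip_signature(single_line))
print(strip_signature(multi_line))
```

With this approach a message that has no signature is returned unchanged (apart from stripped surrounding whitespace), whereas indexing `[0]` on an empty re.findall result raises an IndexError.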