Given a string:
string = 'Other unwanted text here and start here: This is the first sentence.\nIt is the second one.\nNow, this is the third one.\nThis is not I want.\n'
I want to extract the first three sentences:
This is the first sentence.\nIt is the second one.\nNow, this is the third one.
Obviously, the following regular expression does not work:
re.search('(?<=This)(.*?)(?=\n)', string)
What is the correct expression to extract the text between This and the third \n?
Thanks.
Answer 0: (score: 1)
You can use this regular expression to capture three sentences starting with the text This:
This(?:[^\n]*\n){3}
Edit:
Python code:
import re
s = 'Other unwanted text here and start here: This is the first sentence.\nIt is the second one.\nNow, this is the third one.\nThis is not I want.\n'
m = re.search(r'This(?:[^\n]*\n){3}',s)
if m:
    print(m.group())
This prints:
This is the first sentence.
It is the second one.
Now, this is the third one.
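Note that the match above ends with the third \n. If that trailing newline is unwanted, one possible variant (a sketch, not part of the original answer) matches the first line and then exactly two more lines, each preceded by a newline. The variable name result is just for illustration:

```python
import re

s = 'Other unwanted text here and start here: This is the first sentence.\nIt is the second one.\nNow, this is the third one.\nThis is not I want.\n'

# The line starting at "This", then exactly two more lines,
# each introduced by "\n", so the match does not end with a newline.
m = re.search(r'This[^\n]*(?:\n[^\n]*){2}', s)
result = m.group() if m else None
print(result)
```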
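Answer 1: (score: 0)
Jerry is right: a regular expression is not the right tool here, and there are much easier and more efficient ways to solve the problem: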
this = 'This is the first sentence.\nIt is the second one.\nNow, this is the third one.\nThis is not I want.\n'
print('\n'.join(this.split('\n', 3)[:-1]))
Output:
This is the first sentence.
It is the second one.
Now, this is the third one.
If you just want to practice regular expressions, following a tutorial would be much easier.
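The snippet above starts from a string that already begins with This. For the full string from the question, a rough sketch of the same idea would first locate This with str.find and then split. The names start and result are illustrative:

```python
s = 'Other unwanted text here and start here: This is the first sentence.\nIt is the second one.\nNow, this is the third one.\nThis is not I want.\n'

result = None
start = s.find('This')  # index of the first "This", or -1 if absent
if start != -1:
    # Split into at most four pieces and keep the first three lines.
    result = '\n'.join(s[start:].split('\n', 3)[:3])
print(result)
```

Slicing with [:3] instead of [:-1] keeps the result sensible even if the text has fewer than three newlines after This.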
Answer 2: (score: 0)
Try the following:
import re
string = 'Other unwanted text here and start here: This is the first sentence.\nIt is the second one.\nNow, this is the third one.\nThis is not I want.\n'
extracted_text = re.search(r'This(.*?\n.*?\n.*?)\n', string).group(1)
print(extracted_text)
This gives you:
is the first sentence.
It is the second one.
Now, this is the third one.
This assumes that a \n was missing before Now. If you want to keep This, you can move it inside the (.
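As a sketch of that last suggestion, with This moved inside the capturing group so that it is part of group(1):

```python
import re

string = 'Other unwanted text here and start here: This is the first sentence.\nIt is the second one.\nNow, this is the third one.\nThis is not I want.\n'

# "This" is now inside the parentheses, so the captured text includes it.
# The third "\n" stays outside the group and is not captured.
m = re.search(r'(This.*?\n.*?\n.*?)\n', string)
extracted_text = m.group(1) if m else None
print(extracted_text)
```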
Answer 3: (score: 0)
(?s)(This.*?)(?=\nThis)
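Use (?s) to make . match newlines as well, then look for a sequence that starts with This and is followed by \nThis.
Don't forget that the __repr__ of the search result does not print the whole matched string, so you need: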
print(re.search('(?s)(This.*?)(?=\nThis)', string)[0])
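To illustrate the point about __repr__, here is a small sketch comparing the match object's repr with the full matched text. The names m, summary and full are illustrative only:

```python
import re

string = 'Other unwanted text here and start here: This is the first sentence.\nIt is the second one.\nNow, this is the third one.\nThis is not I want.\n'

m = re.search('(?s)(This.*?)(?=\nThis)', string)

summary = repr(m)  # abbreviated description of the match object
full = m[0]        # the complete matched text, same as m.group(0)

print(summary)
print(full)
```

Indexing a match object with [0] requires Python 3.6 or later; m.group(0) works in older versions too.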