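I am trying to use a regular expression to delete everything before a certain line in a multi-line string. Is there a regular expression that captures everything before (and including) an expression?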
import re
sample = '''
This is content I need to delete
I do not need any of this.
===
Text I need
Is here'''
content = re.sub(r'\n===', "", sample)
print(content)
Answer 0 (score: 1)
Your pattern does not match the characters that occur before \n===, so only the delimiter itself is removed. You can use this:
content = re.sub(r'.*\n===', "", sample, flags=re.DOTALL)
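A minimal sketch of why this works, using the sample from the question. The second input, `two_marks`, is a hypothetical string that is not part of the question; it is only there to show that the greedy `.*` runs up to the last `===` line, while a lazy `.*?` with `count=1` stops at the first one.

```python
import re

sample = '''
This is content I need to delete
I do not need any of this.
===
Text I need
Is here'''

# re.DOTALL lets "." match newlines too, so ".*" can span several lines.
content = re.sub(r'.*\n===', "", sample, flags=re.DOTALL)
print(repr(content))

# Hypothetical input with two delimiter lines (not from the question).
two_marks = "a\n===\nb\n===\nc"

# Greedy: removes everything up to and including the LAST "\n===".
greedy = re.sub(r'.*\n===', "", two_marks, flags=re.DOTALL)

# Lazy, one replacement only: removes up to and including the FIRST "\n===".
lazy = re.sub(r'.*?\n===', "", two_marks, count=1, flags=re.DOTALL)

print(repr(greedy))
print(repr(lazy))
```

Note that the newline that follows the === is kept, so the result still starts with a line break.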
Answer 1 (score: 0)
If you want to be left with just
Text I need
Is here
(so without any more newlines after the ===), you can use:
content = re.sub(r'(.|\n)*===\n*', "", sample)
The (.|\n)* will get rid of all the text and newlines up to the ===, and the \n* will delete the newlines that follow it. You can also leave this last part out if you want to keep the newlines after the ===. So
content = re.sub(r'(.|\n)*===', "", sample)
will result in
// newline
// newline
Text I need
Is here
Two newlines are left when an empty line follows the === line: one directly after the === and a second one for the empty line. If you just want one newline before Text I need... then use:
r'(.|\n)*===\n'
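To compare the three variants side by side, here is a small sketch. It assumes a sample in which an empty line follows the === line, as this answer describes; the `patterns` and `results` names are only illustrative.

```python
import re

# Assumed sample: note the empty line right after "===".
sample = (
    "This is content I need to delete\n"
    "I do not need any of this.\n"
    "===\n"
    "\n"
    "Text I need\n"
    "Is here"
)

patterns = {
    "strip all newlines": r'(.|\n)*===\n*',
    "keep all newlines": r'(.|\n)*===',
    "strip one newline": r'(.|\n)*===\n',
}

# Apply each pattern and show the result with its newlines visible.
results = {}
for label, pattern in patterns.items():
    results[label] = re.sub(pattern, "", sample)
    print(f"{label}: {results[label]!r}")

# Same match as the first pattern, written with re.DOTALL instead of (.|\n).
dotall = re.sub(r'.*===\n*', "", sample, flags=re.DOTALL)
print(f"DOTALL variant: {dotall!r}")
```

The last call suggests that `.*` combined with `flags=re.DOTALL` can stand in for `(.|\n)*` here, which avoids the alternation inside the repeated group.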