Question

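I am trying to use a regular expression to delete everything before a certain line in a multi-line string. Is there a regex that captures everything before (and including) an expression?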

import re

sample = '''
This is content I need to delete
I do not need any of this.
===

Text I need
Is here'''

content = re.sub(r'\n===', "", sample)
print(content)

Answer 1

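Your pattern matches only \n=== itself, so none of the characters before it are removed. Match everything before it as well, and pass re.DOTALL so that . also matches newlines: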

content = re.sub(r'.*\n===', "", sample, flags=re.DOTALL)
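To see why the flag matters, here is a minimal sketch that runs the same pattern with and without re.DOTALL. The sample string is an assumed stand-in for the one in the question, with === on its own line.

```python
import re

# Assumed stand-in for the question's sample; "===" sits on its own line.
sample = "This is content\nI need to delete\n===\nText I need\nIs here"

# Without re.DOTALL, "." does not match a newline, so ".*" can only cover
# the single line directly above "===". Earlier lines are left in place.
without_flag = re.sub(r'.*\n===', "", sample)

# With re.DOTALL, "." also matches "\n", so the greedy ".*" covers
# everything from the start of the string up to the last "\n===".
with_flag = re.sub(r'.*\n===', "", sample, flags=re.DOTALL)

print(repr(without_flag))  # 'This is content\n\nText I need\nIs here'
print(repr(with_flag))     # '\nText I need\nIs here'
```

Because .* is greedy, the match ends at the last \n=== in the string. If the string could contain several such markers and only the text before the first one should go, a non-greedy .*? together with count=1 is one option.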

Answer 2

If you want to be left with just

Text I need
Is here

(so without any more newlines after the ===) you can use

content = re.sub(r'(.|\n)*===\n*', "", sample)

The (.|\n)* consumes all the text and newlines up to the ===, and the \n* deletes the newlines that follow it. Leave that last part out if you want to keep the newlines after ===:

content = re.sub(r'(.|\n)*===', "", sample)

will result in

             // newline 
             // newline
Text I need 
Is here
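Printing the repr of each result makes the leftover newlines visible. This is a small sketch on an assumed sample in which === is followed by one empty line, as the answer describes.

```python
import re

# Assumed sample: "===" on its own line, followed by one empty line.
sample = "Delete me\nAnd me\n===\n\nText I need\nIs here"

# "\n*" after the marker also removes every newline that follows it.
stripped = re.sub(r'(.|\n)*===\n*', "", sample)

# Without "\n*", the newlines after the marker stay in the result.
kept = re.sub(r'(.|\n)*===', "", sample)

print(repr(stripped))  # 'Text I need\nIs here'
print(repr(kept))      # '\n\nText I need\nIs here'
```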

There will be two newlines left (one directly after the === and the second one for the empty line). If you just want one newline before Text I need... then use:

r'(.|\n)*===\n'
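As a rough check, the sketch below applies that pattern to an assumed sample and compares it with a non-regex alternative, str.rpartition. The rpartition call is not part of either answer; it is shown only because it splits at the last occurrence of the marker, which mirrors what the greedy regex does.

```python
import re

# Assumed sample: "===" on its own line, followed by one empty line.
sample = "Delete me\nAnd me\n===\n\nText I need\nIs here"

# Remove everything up to the marker plus exactly one newline after it.
one_newline = re.sub(r'(.|\n)*===\n', "", sample)

# Non-regex alternative (an assumption, not from the answers): keep only
# what follows the LAST "===\n". If the marker is absent, rpartition
# returns the whole string as the third element, so nothing is removed.
_, _, tail = sample.rpartition("===\n")

print(repr(one_newline))    # '\nText I need\nIs here'
print(one_newline == tail)  # True
```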
