Question

I need to compare the first elements of two different files, starting after a certain phrase. So far, I have this:

import re

data1 = ""
data2 = ""
first = re.match(r".*Ignore until after this:(?P<data1>.*)", firstlist[0])
second = re.match(r".*Ignore until after this:(?P<data2>.*)", secondarray[0])
data1 = first.group('data1')
data2 = second.group('data2')
if data1 == data2:
    # rest of the code...

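I want to ignore everything up to a certain point and save the rest into a variable. I do almost exactly the same thing earlier in the script and it works. However, when I run this, I get the following error: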

File "myfile.py", line [whatever line it is], in <module>
    data1 = first.group('data1')
AttributeError: 'NoneType' object has no attribute 'group'

Why doesn't re.match work properly for first and second?

Edit

As suggested, I have changed [\s\S]* to .*.

Edit 2: This is what the input looks like (not the way it appears in the comments below):

Random text
More random text
Even more random text
Ignore until after this:
Meaningful text, keep this
...and everything else...
...until the end of the file here

That is basically all there is to it: a block of text, of which everything after a certain point needs to be saved.

Answer 1

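You are probably running into this problem simply because of the newlines in the file. As Martijn Pieters pointed out in the comments, you can use the re.DOTALL flag so that . also matches newlines and the group captures everything. So with a file like this (named tmp in this example)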

Random text

More random text

Even more random text

Ignore until after this:

Meaningful text, keep this

...and everything else...

...until the end of the file here

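you can do something like this

import re

with open('tmp') as f:
    # re.DOTALL lets '.' match newlines, so the pattern can span the whole file
    first = re.match(r'.*Ignore until after this:(?P<data1>.*)', f.read(), re.DOTALL)
    print(first.group('data1'))

which gives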

Meaningful text, keep this

...and everything else...

...until the end of the file here
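To tie this back to the comparison in the question, here is a minimal sketch, assuming both inputs have already been read into strings. The helper name text_after_marker and the two sample strings are hypothetical, not part of the original code. It checks for None before calling .group, so a missing phrase no longer raises AttributeError.

```python
import re

# Compiled once; re.DOTALL lets '.' match newline characters as well.
MARKER_PATTERN = re.compile(r".*Ignore until after this:(?P<data>.*)", re.DOTALL)


def text_after_marker(text):
    """Return everything after the marker phrase, or None if the phrase is absent."""
    match = MARKER_PATTERN.match(text)
    if match is None:
        return None
    return match.group("data")


# Hypothetical sample inputs standing in for the contents of the two files.
first_text = "Random text\nIgnore until after this:\nMeaningful text, keep this\n"
second_text = "Other text\nMore\nIgnore until after this:\nMeaningful text, keep this\n"

data1 = text_after_marker(first_text)
data2 = text_after_marker(second_text)

# Only treat the inputs as equal if the phrase was actually found.
same = data1 is not None and data1 == data2
print(same)
```

Note that the captured text starts with the newline that follows the phrase; calling .strip() on the result may be useful if surrounding whitespace should not affect the comparison.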

Answer 2

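The dot '.' in a regular expression matches any character except a newline. So if you have the whole file as a single string, .* can only consume characters up to the first newline, which means the phrase can only be found if it sits on the first line. If it does not, re.match returns None, and calling .group on None raises the AttributeError.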
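The difference can be seen in a small sketch. The sample string below is made up for illustration; the pattern is the one from the question with the group renamed to data.

```python
import re

# A made-up two-line-plus sample where the phrase is not on the first line.
text = "Random text\nIgnore until after this:\nMeaningful text"
pattern = r".*Ignore until after this:(?P<data>.*)"

# Without re.DOTALL, '.*' cannot cross the first newline, so nothing matches.
without_flag = re.match(pattern, text)

# With re.DOTALL, '.' also matches newlines and the phrase is found.
with_flag = re.match(pattern, text, re.DOTALL)

print(without_flag)                    # None
print(repr(with_flag.group("data")))   # '\nMeaningful text'
```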

See this and this.
