Question

I have a list of strings like this:

data = ['This is the sentence "Hello" by writer "MK"', '2 Worlds [Harry]']

I want to extract only "Hello". This is what I did:

import re
s = re.match('This is the sentence (.*) by writer', data[0])
s

But instead of "Hello" I get <_sre.SRE_Match object; span=(0, 38), match='This is the sentence "Hello" by writer'>

Can someone tell me how to write this correctly?

Answer 1

Assuming you just want whatever is inside the quotes, use re.search with a capture group, and extract the first group if a match is found.

m = re.search('"(.*?)"', data[0])
if m:
    print(m.group(1))

Hello

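If a match is found, re.search returns a match object. You can call this object's group(n) method to extract the captured string. If there is no match, it returns None. So you need to check the return value before printing, otherwise you will get an AttributeError.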
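To illustrate the None case, here is a minimal sketch that runs the same pattern over both strings in the question's list. The helper name first_quoted is just an assumption for this example, not something from the question.

```python
import re

data = ['This is the sentence "Hello" by writer "MK"', '2 Worlds [Harry]']

def first_quoted(text):
    """Return the first double-quoted substring of text, or None if there is none."""
    m = re.search(r'"(.*?)"', text)
    # re.search returns None when nothing matches, so guard before calling group()
    return m.group(1) if m else None

results = [first_quoted(s) for s in data]
print(results)  # ['Hello', None]
```

The second string contains no double quotes, so the helper returns None for it instead of raising an AttributeError.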

Details

"        # double quote
(        # open 1st capture group
.*?      # non-greedy matcher
)        # close 1st capture group
"        # double quote

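Note that you should not hard-code the surrounding text into your pattern. More importantly, do not use the greedy capture .* unless you know what you are doing.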
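As a quick sketch of why greediness matters here, compare the greedy and non-greedy versions of the quote pattern on the first string from the question. The variable names are only for illustration.

```python
import re

text = 'This is the sentence "Hello" by writer "MK"'

# Greedy: .* runs from the first quote to the LAST quote in the string
greedy = re.search(r'"(.*)"', text).group(1)

# Non-greedy: .*? stops at the first closing quote it can
lazy = re.search(r'"(.*?)"', text).group(1)

# findall with the non-greedy pattern returns every quoted substring
all_quoted = re.findall(r'"(.*?)"', text)

print(greedy)      # Hello" by writer "MK
print(lazy)        # Hello
print(all_quoted)  # ['Hello', 'MK']
```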

Answer 2

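When you print s, you are printing the regex match object itself. It is like writing a function and then printing its name without calling it: you get the string representation of the function, not the value the function would return: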

def hello():
    return 'hello!'

print(hello)
>>> <function hello at 0x7f570e3aa9b0>

If you want the group captured by (.*), you have to access it explicitly:

s = re.match('This is the sentence (.*) by writer', data[0])
print(s.group(1))
>>> "Hello"

You can also check whether there was a match first, which avoids an AttributeError when there is none:

s = re.match('This is the sentence (.*) by writer', data[0])
if s:
    print(s.group(1))
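Note that with the pattern above, group(1) still includes the surrounding double quotes. If you want the bare word, one option (a sketch, assuming the sentence always has this exact shape) is to move the quotes outside the capture group:

```python
import re

data = ['This is the sentence "Hello" by writer "MK"', '2 Worlds [Harry]']

# Quotes are part of the pattern but outside the parentheses,
# so they are matched without being captured
s = re.match(r'This is the sentence "(.*)" by writer', data[0])
word = s.group(1) if s else None

print(word)  # Hello
```

Here group(0) would be the whole matched text, while group(1) is only what the parentheses captured.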

Answer 3

re.match returns a match object; you need to call .group(1) on it to get the captured data out of the match.