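Python regex: match all occurrences

Date: 2017-05-29 15:00:13

Tags: regex

I am trying to do a multi-line match over a 4-line record. My code finds the first match, but not the others.

Here is the pattern: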

pattern = re.compile("([a-z]+\.com\.|net\.)[.\s\S]+(Z[A-Z0-9]+)")

Here is the subject text:

sub = """yahoo.com.
Public
8
Z2RVE9XGX4PFJN
google.com.
Public
7
Z2VATLWTLBDR5D
""" 

Here is the complete code:

import re
pattern = re.compile("([a-z]+\.com\.|net\.)[.\s\S]+(Z[A-Z0-9]+)")

sub = """yahoo.com.
Public
8
Z2RVE9JJGX4PFJN
google.com.
Public
7
Z2VATZOPLBDR5D
"""

m = pattern.findall(sub)

print(m)

The result is:

[('yahoo.com.', 'ZOPLBDR5D')]

Finally, this is the desired result:

[('yahoo.com.', 'Z2RVE9JJGX4PFJN'), ('google.com.', 'Z2VATZOPLBDR5D')]

Thanks.

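1 answer:

Answer 0: (score: 0)

You are close. The greedy [.\s\S]+ runs as far as it can, so the whole text is consumed by a single match. Just make the quantifier non-greedy: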

import re
pattern = re.compile(r"([a-z]+\.com\.|net\.)[\s\S]+?(Z[A-Z0-9]+)")
# Note the non-greedy quantifier                   ^
# '.' is redundant in the character class   ^
sub = """yahoo.com.
Public
8
Z2RVE9JJGX4PFJN
google.com.
Public
7
Z2VATZOPLBDR5D
"""

m = pattern.findall(sub)

print(m)

This prints:

[('yahoo.com.', 'Z2RVE9JJGX4PFJN'), ('google.com.', 'Z2VATZOPLBDR5D')]
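
To see why the single ? matters, here is a small side-by-side sketch of the greedy and non-greedy forms on the same sample text. The variable names greedy and lazy are mine, not from the original post.

```python
import re

sub = """yahoo.com.
Public
8
Z2RVE9JJGX4PFJN
google.com.
Public
7
Z2VATZOPLBDR5D
"""

# Greedy: [\s\S]+ first runs to the end of the text, then backtracks only
# as far as the LAST position where Z[A-Z0-9]+ can still match.
greedy = re.findall(r"([a-z]+\.com\.|net\.)[\s\S]+(Z[A-Z0-9]+)", sub)

# Non-greedy: [\s\S]+? grows one character at a time and stops at the FIRST
# position where Z[A-Z0-9]+ can match, so each record matches on its own.
lazy = re.findall(r"([a-z]+\.com\.|net\.)[\s\S]+?(Z[A-Z0-9]+)", sub)

print(greedy)
print(lazy)
```

With the greedy form, findall returns a single tuple, [('yahoo.com.', 'ZOPLBDR5D')], because backtracking stops at the last Z in the text. The non-greedy form returns one tuple per record.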

For more specificity at the ends of the pattern, you may want to use anchors:

pattern = re.compile(r"^([a-z]+\.com\.|net\.)$[\s\S]+?^(Z[A-Z0-9]+)$", re.M)
# Start of line        ^                              ^
# End of line                                ^                     ^
# Multi-line flag                                                      ^
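
One further point, which is my own observation rather than part of the answer above: the alternation in the first group reads as "[a-z]+\.com\." OR "net\.", so a .net name would be captured only as 'net.' (and with the anchors it would not match at all). If the intent is to accept both TLDs, grouping just the TLD is one possible fix. The sketch below assumes that intent. The example.net. record and the names loose and strict are hypothetical.

```python
import re

# Hypothetical sample: the second record uses a .net domain.
text = """yahoo.com.
Public
8
Z2RVE9JJGX4PFJN
example.net.
Public
7
Z2VATZOPLBDR5D
"""

# Alternation as in the original pattern: "[a-z]+\.com\." OR just "net\."
loose = re.compile(r"([a-z]+\.com\.|net\.)[\s\S]+?(Z[A-Z0-9]+)")

# Only the TLD is alternated, combined with the line anchors and re.M
strict = re.compile(r"^([a-z]+\.(?:com|net)\.)$[\s\S]+?^(Z[A-Z0-9]+)$", re.M)

loose_pairs = loose.findall(text)
strict_pairs = strict.findall(text)

print(loose_pairs)
print(strict_pairs)
```

Here loose captures the second record as ('net.', 'Z2VATZOPLBDR5D'), while strict captures ('example.net.', 'Z2VATZOPLBDR5D').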