I am trying to do a multiline match that spans 4 lines. My code finds the first match, but not the others.
Here is the pattern:
pattern = re.compile("([a-z]+\.com\.|net\.)[.\s\S]+(Z[A-Z0-9]+)")
Here is the subject string:
sub = """yahoo.com.
Public
8
Z2RVE9XGX4PFJN
google.com.
Public
7
Z2VATLWTLBDR5D
"""
Here is the full code:
import re
pattern = re.compile("([a-z]+\.com\.|net\.)[.\s\S]+(Z[A-Z0-9]+)")
sub = """yahoo.com.
Public
8
Z2RVE9JJGX4PFJN
google.com.
Public
7
Z2VATZOPLBDR5D
"""
m = pattern.findall(sub)
print(m)
Here is the result:
[('yahoo.com.', 'Z2RVE9JJGX4PFJN')]
Finally, here is the desired result:
[('yahoo.com.', 'Z2RVE9JJGX4PFJN'), ('google.com.', 'Z2VATZOPLBDR5D')]
Thanks.
Answer 0 (score: 0)
You are close. Just make your match less greedy:
import re
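pattern = re.compile(r"([a-z]+\.com\.|net\.)[\s\S]+?(Z[A-Z0-9]+)")
# Note the '?' added after '[\s\S]+': it makes the quantifier lazy ("less greedy"),
# so it stops at the first ID instead of running on to the last one.
# The '.' from the original character class is not necessary: inside [...] it is
# a literal dot, which \S already covers.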
sub = """yahoo.com.
Public
8
Z2RVE9JJGX4PFJN
google.com.
Public
7
Z2VATZOPLBDR5D
"""
m = pattern.findall(sub)
print(m)
Prints:
[('yahoo.com.', 'Z2RVE9JJGX4PFJN'), ('google.com.', 'Z2VATZOPLBDR5D')]
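To see why the quantifier matters, it may help to compare how much of the text each version consumes. The following is only a minimal sketch reusing the sample string from the question; the names `greedy`, `lazy`, `greedy_spans` and `lazy_spans` are illustrative and not part of the original code.

```python
import re

sub = """yahoo.com.
Public
8
Z2RVE9JJGX4PFJN
google.com.
Public
7
Z2VATZOPLBDR5D
"""

# Same pattern twice; the only difference is the '?' after '+'.
# (The redundant '.' in the question's character class is dropped in both.)
greedy = re.compile(r"([a-z]+\.com\.|net\.)[\s\S]+(Z[A-Z0-9]+)")
lazy = re.compile(r"([a-z]+\.com\.|net\.)[\s\S]+?(Z[A-Z0-9]+)")

# span() gives the (start, end) offsets of each whole match
greedy_spans = [m.span() for m in greedy.finditer(sub)]
lazy_spans = [m.span() for m in lazy.finditer(sub)]

print(greedy_spans)  # a single match that runs across both records
print(lazy_spans)    # one match per record
```

The greedy version grabs as much as it can and only backtracks far enough to find some ID near the end, so both records are swallowed by one match. The lazy version expands one character at a time and stops at the first ID it can reach.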
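For more specificity at the ends of the pattern, you may want to use anchors:
pattern = re.compile(r"^([a-z]+\.com\.|net\.)$[\s\S]+?^(Z[A-Z0-9]+)$", re.M)
# ^ anchors the domain and the ID to the start of a line
# $ anchors each of them to the end of a line
# re.M (the multiline flag) makes ^ and $ match at every line boundary,
# not only at the start and end of the whole string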
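One more thing worth checking: `|` has the lowest precedence in a regex, so `[a-z]+\.com\.|net\.` reads as "a name followed by `.com.`" or "the bare text `net.`". If the intent is to accept both `.com.` and `.net.` domains (an assumption about the asker's goal), a non-capturing group around the alternatives is one way to express that. The sketch below uses a hypothetical sample in which the `example.net.` record and its ID are made up for illustration.

```python
import re

# Hypothetical sample: the second record is invented to exercise the .net case
sample = """yahoo.com.
Public
8
Z2RVE9JJGX4PFJN
example.net.
Private
3
Z1ABCDEF234567
"""

# Anchored pattern from the answer, alternation left as in the question
original = re.compile(r"^([a-z]+\.com\.|net\.)$[\s\S]+?^(Z[A-Z0-9]+)$", re.M)

# Variant with the TLD alternatives grouped: name + '.' + (com or net) + '.'
grouped = re.compile(r"^([a-z]+\.(?:com|net)\.)$[\s\S]+?^(Z[A-Z0-9]+)$", re.M)

original_pairs = original.findall(sample)
grouped_pairs = grouped.findall(sample)

print(original_pairs)
print(grouped_pairs)
```

With the ungrouped alternation, the anchored pattern can only match a `.net.` entry if the whole line is exactly `net.`, so the second record is skipped. The grouped variant keeps the full domain in the first capture group for both endings.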