Question

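I am trying to match both line1 and line2 below with a single regular expression, but it currently matches only line1. How can I make problem/ optional so that the regex also matches line2?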

import re

line1 = '<change://problem/52547719> DEM: Increase granularity of the lower size bins in the packet burst size histograms'
line2 = '<change://51736404> [KIC] Not seeing NACK events from tech when packet ex'

# "problem/" is mandatory in this pattern, so only line1 produces a match
match = re.findall(r"[\S]*(?:change:\/\/problem\/)(\d{8,8})", line1)
print(match)
match = re.findall(r"[\S]*(?:change:\/\/problem\/)(\d{8,8})", line2)
print(match)

Answer 1

You can do this by wrapping problem/ in a group and adding the quantifier ?, which matches between 0 and 1 times:

(?:problem\/)?

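Note that your pattern greedily matches all non-whitespace characters before the change:// part. If your lines always follow this pattern and are enclosed in brackets, try the following: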

[\S]*change:\/\/(?:problem\/)?\d{8}
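As a minimal sketch of the optional group applied to the two lines from the question: the capture group around \d{8} is an addition here (the pattern above has none), so that `re.findall` returns only the ID.

```python
import re

line1 = '<change://problem/52547719> DEM: Increase granularity of the lower size bins in the packet burst size histograms'
line2 = '<change://51736404> [KIC] Not seeing NACK events from tech when packet ex'

# (?:problem/)? makes the "problem/" segment optional (0 or 1 times).
# (\d{8}) is an added capture group so findall returns just the 8-digit ID.
pattern = re.compile(r"change://(?:problem/)?(\d{8})")

ids = [pattern.findall(line) for line in (line1, line2)]
print(ids)  # [['52547719'], ['51736404']]
```

The leading [\S]* is left out in this sketch because `re.findall` scans the whole string anyway.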

Answer 2

My guess is that this expression might match the strings we want:

<change:\/\/.*?(\d{8})\s*>

Test with `re.findall`

import re

regex = r"<change:\/\/.*?(\d{8})\s*>"

test_str = ("<change://problem/52547719> DEM: Increase granularity of the lower size bins in the packet burst size histograms\n"
    "<change://51736404> [KIC] Not seeing NACK events from tech when packet ex\n"
    "<change://problem/problem/problem/52547719> DEM: Increase granularity of the lower size bins in the packet burst size histograms")

print(re.findall(regex, test_str))

Test with `re.finditer`

import re

regex = r"<change:\/\/.*?(\d{8})\s*>"

test_str = ("<change://problem/52547719> DEM: Increase granularity of the lower size bins in the packet burst size histograms\n"
    "<change://51736404> [KIC] Not seeing NACK events from tech when packet ex\n"
    "<change://problem/problem/problem/52547719> DEM: Increase granularity of the lower size bins in the packet burst size histograms")

matches = re.finditer(regex, test_str)

for matchNum, match in enumerate(matches, start=1):
    print("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum=matchNum, start=match.start(), end=match.end(), match=match.group()))

    # Capture groups are numbered from 1; group 0 is the whole match
    for groupNum in range(1, len(match.groups()) + 1):
        print("Group {groupNum} found at {start}-{end}: {group}".format(groupNum=groupNum, start=match.start(groupNum), end=match.end(groupNum), group=match.group(groupNum)))

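The expression is explained in the top right panel of this demo, if you wish to explore, simplify, or modify it. In this link, you can also watch step by step how it matches against some sample inputs.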

RegEx Circuit

jex.im can be used to visualize regular expressions.

Answer 3

Use a simple pattern that matches <change://, then an optional part that matches any text up to and including the first /, and then captures one or more digits:

match = re.search(r"<change://(?:[^/]*/)?(\d+)", line)
if match:
    print(match.group(1))

Note: if you have strings like <change://more/problems/52547719>, you can use a small variation:

match = re.search(r"<change://[^>]*?(\d+)>", line)

See this regex demo.
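The variation can be tried against all three input shapes as follows. This is only a sketch: the helper name `extract_id` and the shortened sample strings are made up for illustration.

```python
import re

# [^>]*? lazily skips any characters other than ">", so any number of
# path segments may precede the digits; (\d+) must be followed by ">".
PATTERN = re.compile(r"<change://[^>]*?(\d+)>")

def extract_id(line):
    """Return the numeric ID as a string, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.group(1) if match else None

samples = [
    '<change://problem/52547719> DEM: Increase granularity',    # shortened sample
    '<change://51736404> [KIC] Not seeing NACK events',         # shortened sample
    '<change://more/problems/52547719> several path segments',  # hypothetical text
]

results = [extract_id(s) for s in samples]
print(results)  # ['52547719', '51736404', '52547719']
```

Because the digits must sit directly before the closing >, this form does not match a line whose ID is followed by other text inside the brackets.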

See the Python demo:

import re
lines = ['<change://problem/52547719> DEM: Increase granularity of the lower size bins in the packet burst size histograms',
         '<change://51736404> [KIC] Not seeing NACK events from tech when packet ex']
for line in lines:
    match = re.search(r"<change://(?:[^/]*/)?(\d+)", line)
    if match:                 # Check if matched or exception will be raised
        print(match.group(1)) # .group(1) only prints Group 1 value

See the regex demo and the regex graph.

Details

<change:// - literal text
(?:[^/]*/)? - an optional sequence:
- [^/]* - 0 or more characters other than /
- / - a / character
(\d+) - Group 1: one or more digits
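The breakdown above can also be written directly into the pattern with `re.VERBOSE`, as sketched below. The names `CHANGE_ID` and `STRICT_ID` and the `tricky` input are hypothetical. The second pattern is not part of the answer: it is a possible tightening for the case where the description text after > contains a slash followed by digits.

```python
import re

# Same pattern as in the answer, annotated piece by piece
CHANGE_ID = re.compile(
    r"""
    <change://      # literal text
    (?:             # optional non-capturing sequence:
        [^/]*       #   0 or more characters other than a slash
        /           #   one slash
    )?
    (\d+)           # group 1: one or more digits
    """,
    re.VERBOSE,
)

line1 = '<change://problem/52547719> DEM: Increase granularity of the lower size bins in the packet burst size histograms'
line2 = '<change://51736404> [KIC] Not seeing NACK events from tech when packet ex'
found = [CHANGE_ID.search(line).group(1) for line in (line1, line2)]
print(found)  # ['52547719', '51736404']

# Hypothetical input: [^/]* is allowed to run past the closing bracket,
# so a later slash followed by digits is picked up instead of the ID.
tricky = '<change://51736404> see foo/123'
loose = CHANGE_ID.search(tricky).group(1)

# Assumed tightening: also exclude the closing bracket from the optional segment
STRICT_ID = re.compile(r"<change://(?:[^/>]*/)?(\d+)")
strict = STRICT_ID.search(tricky).group(1)
print(loose, strict)  # 123 51736404
```

For the two lines in the question both patterns return the same IDs. They differ only on inputs like `tricky`.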
