Question

Input string:

I0419 01:52:16.606123   136 TrainerInternal.cpp:181]  Pass=15 Batch=74 samples=3670 AvgCost=263.331 Eval: classification_error_evaluator=0.970178 I0419 01:52:16.815407   136 Tester.cpp:115]  Test samples=458 cost=203.737 Eval: classification_error_evaluator=0.934446

Pattern:

Pass=([0-9]+).*classification_error_evaluator=(0\.[0-9]+).*classification_error_evaluator=(0\.[0-9]+)

Desired output:

(15, 0.970178, 0.934446)

On Regex101 (https://regex101.com/r/Hwxsib/1), the pattern appears to capture the right groups.

In Python, however, it does not match the groups; it finds nothing at all:

import re

x = "I0419 01:52:16.606123   136 TrainerInternal.cpp:181]  Pass=15 Batch=74 samples=3670 AvgCost=263.331 Eval: classification_error_evaluator=0.970178 I0419 01:52:16.815407   136 Tester.cpp:115]  Test samples=458 cost=203.737 Eval: classification_error_evaluator=0.934446"

pattern = "Pass=([0-9]+).*classification_error_evaluator=(0\.[0-9]+).*classification_error_evaluator=(0\.[0-9]+)"

re.match(pattern, x)

How does the regex101 setup differ from Python's re package, or are they the same? Do they use different flags or settings?

Why does the pattern not match in Python?

Answer 1

You want re.search. re.match only returns a match when the match starts at the beginning of the string!

import re

x = "I0419 01:52:16.606123   136 TrainerInternal.cpp:181]  Pass=15 Batch=74 samples=3670 AvgCost=263.331 Eval: classification_error_evaluator=0.970178 I0419 01:52:16.815407   136 Tester.cpp:115]  Test samples=458 cost=203.737 Eval: classification_error_evaluator=0.934446"

pattern = r'Pass=([0-9]+).*classification_error_evaluator=(0\.[0-9]+).*classification_error_evaluator=(0\.[0-9]+)'

# groups() returns every captured group as a tuple of strings
print(re.search(pattern, x).groups())
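To make the difference concrete, here is a minimal sketch that runs both functions on the question's string and then converts the captured strings into the numeric tuple the question asks for. The helper name parse_metrics is hypothetical and only used for illustration.

```python
import re

# The log line from the question, split across literals for readability.
log_line = (
    "I0419 01:52:16.606123   136 TrainerInternal.cpp:181]  Pass=15 Batch=74 "
    "samples=3670 AvgCost=263.331 Eval: classification_error_evaluator=0.970178 "
    "I0419 01:52:16.815407   136 Tester.cpp:115]  Test samples=458 cost=203.737 "
    "Eval: classification_error_evaluator=0.934446"
)
pattern = (
    r"Pass=([0-9]+).*classification_error_evaluator=(0\.[0-9]+)"
    r".*classification_error_evaluator=(0\.[0-9]+)"
)

# re.match only tries to match at position 0, where the string starts with "I0419".
anchored = re.match(pattern, log_line)

# re.search scans forward until it finds a position where the pattern matches.
found = re.search(pattern, log_line)


def parse_metrics(match):
    """Hypothetical helper: turn the three captured strings into numbers."""
    pass_id, train_error, test_error = match.groups()
    return int(pass_id), float(train_error), float(test_error)


metrics = parse_metrics(found)
print(anchored)  # None
print(metrics)   # (15, 0.970178, 0.934446)
```

If you did want to keep re.match, prefixing the pattern with .* would also work, since the match would then start at position 0.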

Answer 2

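You probably want re.search; re.match returns a match only if it occurs at the start of the string.

regex101 also shows the code it uses: https://regex101.com/r/Hwxsib/1/codegen?language=python

From the regex101 code, here is what it does (copied and edited for brevity):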

import re

regex = r"..."

test_str = "..."

matches = re.finditer(regex, test_str)

...
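The generated code is elided above, so the following is only a sketch of what a finditer loop over the question's data might look like, not the actual regex101 output. It shows that re.finditer, like re.search, scans the whole string rather than anchoring at position 0; the variable names are assumptions.

```python
import re

# Pattern and log line taken from the question.
regex = (
    r"Pass=([0-9]+).*classification_error_evaluator=(0\.[0-9]+)"
    r".*classification_error_evaluator=(0\.[0-9]+)"
)
test_str = (
    "I0419 01:52:16.606123   136 TrainerInternal.cpp:181]  Pass=15 Batch=74 "
    "samples=3670 AvgCost=263.331 Eval: classification_error_evaluator=0.970178 "
    "I0419 01:52:16.815407   136 Tester.cpp:115]  Test samples=458 cost=203.737 "
    "Eval: classification_error_evaluator=0.934446"
)

# finditer yields one match object per non-overlapping match, found anywhere in the string.
matches = list(re.finditer(regex, test_str))

for match in matches:
    # start() is greater than 0 here, which is exactly the case re.match rejects.
    print(match.start(), match.groups())
```

Because finditer is not anchored to the start of the string, regex101 reports a match where re.match returns None.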

Python regex vs. Regex101
