Question

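I'm using the finditer function from the re module to match some things, and everything is working.

Now I need to find out how many matches I've got. Is that possible without looping through the iterator twice (once to get the count, then again for the real iteration)?

Edit: As requested, some code: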

imageMatches = re.finditer(r'<img src="(?P<path>[-/\w.]+)"', response[2])
# <here I need to get the number of matches>
for imageMatch in imageMatches:
    doStuff  # placeholder for the real work

Everything works, I just need to get the number of matches before the loop.

Answer 1

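If you know you will want all the matches, you can use the re.findall function. It returns a list of all the matches (as strings, or as tuples of strings when the pattern has more than one group, not as match objects). Then you can just call len(result) to get the number of matches.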
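A minimal sketch of this approach. The sample string is made up for illustration, and the pattern is a simplified version of the one in the question with a plain group instead of a named one:

```python
import re

# Sample input is illustrative only, not taken from the question.
html = '<img src="/a/one.png"> text <img src="/b/two.jpg">'

# With exactly one group in the pattern, findall returns a list
# containing that group's text for every match.
paths = re.findall(r'<img src="([-/\w.]+)"', html)

print(len(paths))  # the number of matches is known before any loop
for path in paths:
    print(path)
```

Because the results are plain strings, details such as match positions are not available this way.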

Answer 2

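If you always need to know the length, and you only need the matched text rather than the other information a match object carries, you might as well use re.findall. Otherwise, if you only need the length sometimes, you can use e.g.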

matches = re.finditer(...)
...
matches = tuple(matches)

to store the iteration of the matches in a reusable tuple. Then just take len(matches).

Another option, if you only need the total count after you have done whatever you do with the match objects, is to use

matches = enumerate(re.finditer(...))

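which returns an (index, match) pair for each of the original matches. You can then keep the first element of each tuple in a variable; since enumerate starts at 0 by default, the total is the last index plus one.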
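One way this could look, as a sketch on a made-up sample string. Passing start=1 is an addition here (the snippet above uses the default start of 0); it makes the last index equal to the total. The counter is initialized first so that it is still defined when nothing matches:

```python
import re

text = 'i like python jython and dython'  # sample input for illustration

count = 0  # stays 0 if the pattern never matches
for count, match in enumerate(re.finditer(r'.ython', text), start=1):
    # Work with the full match object while the running index is tracked.
    print(count, match.group(0), match.start())

print('total matches:', count)
```

The total is only known once the loop has finished, so this does not help when the count is needed up front.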

But if you need the length up front, and you need match objects rather than just strings, you should simply do

matches = tuple(re.finditer(...))
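Applied to the pattern from the question, this might look like the sketch below. The body string is a hypothetical stand-in for response[2]:

```python
import re

# Hypothetical stand-in for response[2] from the question.
body = '<img src="/img/a.png"><p>hi</p><img src="/img/b.png">'

# Materialize the iterator once; the tuple can be measured and re-iterated.
matches = tuple(re.finditer(r'<img src="(?P<path>[-/\w.]+)"', body))

print(len(matches))  # the count is available before the loop
for image_match in matches:
    # Full match objects survive, so named groups and positions still work.
    print(image_match.group('path'), image_match.span())
```

The trade-off is that every match object is held in memory at once, which is usually fine unless the input is very large.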

Answer 3

If you find you need to stick with finditer(), you can simply use a counter while you iterate over the iterator.

Example:

>>> import re
>>> pattern = re.compile(r'.ython')
>>> string = 'i like python jython and dython (whatever that is)'
>>> iterator = pattern.finditer(string)
>>> count = 0
>>> for match in iterator:
...     count += 1
...
>>> count
3

Use this approach if you need what finditer() provides: a match object for each non-overlapping match.

Answer 4

# An example of counting the groups captured by a single match
# (this is the number of groups, not the number of matches)
import re

pattern = re.compile(r'(\w+).(\d+).(\w+).(\w+)', re.IGNORECASE)
search_str = "My 11 Char String"

res = pattern.match(search_str)
print(len(res.groups()))  # 4
print(res.group(1))  # My
print(res.group(2))  # 11
print(res.group(3))  # Char
print(res.group(4))  # String

Answer 5

For those moments when you really want to avoid building lists:

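import re
import operator
from functools import reduce

# my_pattern and my_string are placeholders for your own pattern and text.
# The initial value 0 keeps reduce from raising TypeError when there are no matches.
count = reduce(operator.add, (1 for _ in re.finditer(my_pattern, my_string)), 0)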

Sometimes you may need to work on very large strings. This can help.
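The same count can also be written with the built-in sum, which returns 0 for an empty generator. A small sketch, where count_matches and the sample data are hypothetical names introduced for illustration:

```python
import re

def count_matches(pattern, text):
    """Count non-overlapping matches without storing them (hypothetical helper)."""
    # The generator yields one 1 per match, so nothing is kept in memory.
    return sum(1 for _ in re.finditer(pattern, text))

big_text = 'ab12cd345 ' * 1000  # sample data for illustration
print(count_matches(r'\d+', big_text))
print(count_matches(r'xyz', big_text))  # no matches, so the sum is 0
```

Like the reduce version, this consumes the iterator, so the match objects themselves are gone once the count is known.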

Answer 6

I know this is a little old, but here is a concise function for counting regex pattern matches.

import re

def regex_cnt(string, pattern):
    return len(re.findall(pattern, string))

string = 'abc123'

print(regex_cnt(string, '[0-9]'))  # 3

Number of regex matches
