Question

I'm currently learning how regular expressions work in Python, and so far I've found everything easy to understand.

I know you can use a match object's .start() method to find the starting position of a match in Python.

I know you can use the re.findall() function to retrieve the list of all matches (as strings).
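For reference, here is a minimal sketch of the two facilities mentioned above. The sample string and variable names are made up for illustration and are not part of the original question.

```python
import re

# Hypothetical sample data, chosen only for illustration
sample = "Monkey-abc then Monkey-xyz"
pattern = r'Monkey-[a-z]{3}'

# re.search returns a match object for the first match (or None if there is none)
first = re.search(pattern, sample)
first_start = first.start()  # index where the first match begins

# re.findall returns only the matched substrings, without their positions
all_matches = re.findall(pattern, sample)

print(first_start, all_matches)
```

Note that the list from re.findall() carries no position information, which is why it does not answer the question on its own.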

Does anyone know whether there is an easy way to find the starting position of the nth match?

So far I can only think of one way, a hand-coded solution in which I split the string at each match, n times in a row, and add up the character counts as I go:

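import re

def getNthStartingPosOfPattern(pattern, text, n):
    all_matches = re.findall(pattern, text)
    result = 0
    for i in range(n):
        # Text that precedes the next occurrence of this match
        prefix = text.split(all_matches[i], 1)[0]
        result += len(prefix)
        # Drop everything up to and including this match
        text = text[len(prefix) + len(all_matches[i]):]
        if i < n - 1:
            # Matches before the nth one count toward its offset
            result += len(all_matches[i])
    return result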


text = "09834 82 Monkey-wtf 2323, 8371853 Monkey-wtf 244, 39082348 Monkey-ftw 827,2  Monkey-lbj 2,24857 Monkey-kkk,oo293 Monkey-iij 55, 273 Monkey-eif 7,22288888 Monkey-abc"
pattern = r'Monkey-[a-z]{3}'
result = getNthStartingPosOfPattern(pattern, text, 5)
print(result)

This seems to work, but it feels laborious and prone to edge cases. Does the Python library offer a simpler way to achieve this that I'm just not seeing?

Thank you very much for your time.

Answer 1

You can use re.finditer together with MatchObject.start():

Here is how you can get the starting position of the 5th match:

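import re

def getNthStartingPosOfPattern(pattern, text, n):
    # finditer yields match objects lazily, in order; n is 1-based
    for i, x in enumerate(re.finditer(pattern, text)):
        if i == n - 1:
            return x.start()
    # Fewer than n matches: nothing to report
    return None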

text = "09834 82 Monkey-wtf 2323, 8371853 Monkey-wtf 244, 39082348 Monkey-ftw 827,2  Monkey-lbj 2,24857 Monkey-kkk,oo293 Monkey-iij 55, 273 Monkey-eif 7,22288888 Monkey-abc"
pattern = r'Monkey-[a-z]{3}'
print(getNthStartingPosOfPattern(pattern, text, 5))

See the IDEONE demo.
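As a possible variation on the same idea, the loop can be replaced with itertools.islice, which skips the first n-1 matches without building a list. This is only a sketch: the function name nth_match_start and the sample string are hypothetical, and it assumes n is 1-based and at least 1 (islice rejects a negative start).

```python
import re
from itertools import islice

def nth_match_start(pattern, text, n):
    """Return the start index of the n-th match (1-based, n >= 1),
    or None if the text contains fewer than n matches."""
    # islice lazily skips n-1 matches; next() takes the n-th, or None as the default
    match = next(islice(re.finditer(pattern, text), n - 1, None), None)
    return match.start() if match is not None else None

# Hypothetical sample data, chosen only for illustration
sample = "a1 b22 c333"
print(nth_match_start(r'\d+', sample, 2))  # start of the second run of digits
print(nth_match_start(r'\d+', sample, 9))  # fewer than 9 matches
```

Like the loop above, this returns None when there are fewer than n matches, so the caller should check for that before using the result as an index.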

How do I find the starting position of the Nth regex match in Python?
