Question

我正在编写关于Python的“高级”正则表达式的教程，我无法理解

lastindex

我的意思是这个例子：

re.match('((ab))', 'ab').lastindex

为什么是1？第二组也是匹配的。

Answer 1

lastindex是匹配的最后一个组的索引。文档中的示例包括使用2个捕获组的示例：

(a)(b)

其中lastindex设置为2，因为要匹配的最后一个捕获组为(b)。

当您有可选的捕获组时，该属性会派上用场;比较：

>>> re.match('(required)( optional)?', 'required').lastindex
1
>>> re.match('(required)( optional)?', 'required optional').lastindex
2

当您拥有嵌套组时，外部组是最后匹配的组。因此，对于((ab))或((a)(b))，外部组是第1组，最后一个匹配。

Answer 2

在正则表达式组中使用()捕获。在python re中，lastindex拥有最后一个捕获组。

让我们看看这个小代码示例：

match = re.search("(\w+).+?(\d+).+?(\W+)", "input 123 ---")
if match:
    print match.lastindex

在这个例子中，输出将是3，因为我在我的正则表达式中使用了三个()并且它匹配了所有这些。

对于上面的代码，如果在if块中执行以下行，它将输出123，因为它是第二个捕获组。

print match.group(2)