Question

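I am trying to search a string and match several capture groups. For two of these capture groups the data is optional, so they may or may not match. I use the pcregrep option -onumber to return the individual capture groups. The question is: how can I return a default value when there is no match for a group? I tried using a disjunction, without success.

Example: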

../pcre-8.32/pcregrep  -Min -o1 -o2 --om-separator="; " '(?s)<!-- BOUNDARY -->(?!.*?Read the full review).*?((\d*) of (\d*) people found the following review helpful|.*?).*?Help other customers find the most helpful' shirts/B000W18VGW

This generates the correct line numbers.

-Min -o1 -o2 --om-separator="; " '(?s)<!-- BOUNDARY -->(?!.*?Read the full review).*?(\d*) of (\d*) people found the following review helpful.*?Help other customers find the most helpful' shirts/B000W18VGW

This produces the correct output, but only for lines that contain:

(\d*) of (\d*) people found the following review helpful

If the line above is not present, I want to return "0" for each capture group.

Is this possible, and if so, how?

Answer 1

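You can't make characters magically appear. That is, if there is no 0 in the subject string, there is no 0 to capture. So if you want to capture a 0, you have to insert a 0 into the subject.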
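For comparison, where handling the default in code is an option, a minimal sketch could look like the following. It uses Python's re module rather than pcregrep, and helpful_counts is a hypothetical helper name introduced only for illustration; the pattern itself is the one from the question.

```python
import re

# Pattern taken from the question; everything else here is illustrative.
HELPFUL = re.compile(r"(\d*) of (\d*) people found the following review helpful")


def helpful_counts(review):
    """Return the two captured counts, or ("0", "0") if the sentence is absent."""
    m = HELPFUL.search(review)
    if m is None:
        # Nothing to capture, so the default is supplied by the code, not the regex.
        return ("0", "0")
    return (m.group(1), m.group(2))
```

The regex only reports what is in the text; the fallback value lives in the surrounding code.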

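Now, let's say that for some crazy reason you are able and willing to modify your subject string (although apparently you cannot or will not handle the zero case outside the regex, i.e. in code). Then here is a solution.

Append 0 of 0 people found the following review helpful to the very end of the subject string, and instead of: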

((\d*) of (\d*) people found the following review helpful|.*?)

do this:

(?=.*?(\d*) of (\d*) people found the following review helpful)

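In other words, by appending 0 of 0 people [...], you guarantee that the sentence exists somewhere. By capturing the digits inside a zero-width lookahead assertion, you can then find the sentence anywhere in the subject string before continuing with the rest of the regex.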
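The same idea can be tried out in Python as a rough sketch. This assumes Python's re module behaves like PCRE for this pattern (captures set inside a lookahead are kept), and the names SENTINEL, PATTERN and counts_with_sentinel are hypothetical, chosen for illustration only.

```python
import re

# Sentence appended to the subject so the lookahead always has something to find.
SENTINEL = "0 of 0 people found the following review helpful"

# The question's regex with the disjunction replaced by the lookahead.
# The negative lookahead from the question is left out to keep the sketch short.
PATTERN = re.compile(
    r"(?s)<!-- BOUNDARY -->"
    r"(?=.*?(\d*) of (\d*) people found the following review helpful)"
    r".*?Help other customers find the most helpful"
)


def counts_with_sentinel(subject):
    """Append the sentinel, then return the two counts captured by the lookahead."""
    m = PATTERN.search(subject + "\n" + SENTINEL)
    if m is None:
        return None  # the boundary or the closing phrase is missing
    return (m.group(1), m.group(2))
```

One caveat: the lookahead scans everything after the boundary, so if the subject holds several reviews, a review without the sentence would pick up the counts of the next review that has one, and only fall back to the appended zeros when no later review has them.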

pcre regex: group capture and disjunction - returning a default value when there is no match
