I want a regular expression that captures multiple occurrences into a single group. For example, take the following phrases:
cat | likes her | mat
dog | goes to his | basket
I want to be able to capture each part of the phrase into a fixed position:
array(
0 => cat likes her mat
1 => cat
2 => likes her
3 => mat
)
Obviously, using:
$regex = '/(cat|dog)( likes| goes| to| his| her)* (mat|basket)/';
preg_match($regex, "The cat likes her mat", $m);
gives:
array(
0 => cat likes her mat
1 => cat
2 => likes
3 => her
4 => mat
)
But I always want the mat/basket in $m[3], no matter how many words match in the middle.
I tried this:
$regex = '/(cat|dog)(?:( likes| goes| to| his| her)*) (mat|basket)/';
to try to stop multiple sub-patterns from being captured, but this results in only the first word being captured, i.e.
array(
0 => cat likes her mat
1 => cat
2 => likes
3 => mat
)
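Does anyone know how to capture the middle part of the phrase (an unknown number of words long) while still getting it into a predictable position in the output?
By the way, I can't use (cat|dog).*?(mat|basket), because only the specified words are allowed in the middle.
The above is just an example; the real use case has many more options for each sub-pattern.
Thanks.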
Answer 0 (score: 2)
/\b(cat|dog) ((?: ?(?:likes|goes|to|his|her)\b)*) ?(mat|basket)\b/
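The idea in this pattern is that the outer parentheses capture the whole repeated run, while the inner groups are non-capturing, so the group numbers stay fixed however many middle words match. Here is a minimal sketch of how it could be used; the helper name `splitPhrase` is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): returns
// [animal, middle words, target], or null when the phrase does not match.
function splitPhrase(string $phrase): ?array
{
    // Pattern from the answer above. Group 2 wraps the whole repetition,
    // so $m[3] is always the mat/basket.
    $regex = '/\b(cat|dog) ((?: ?(?:likes|goes|to|his|her)\b)*) ?(mat|basket)\b/';

    if (preg_match($regex, $phrase, $m) !== 1) {
        return null;
    }

    // Drop $m[0] (the full match) and keep the three fixed groups.
    return array_slice($m, 1);
}

foreach (['The cat likes her mat', 'The dog goes to his basket'] as $phrase) {
    $parts = splitPhrase($phrase);
    if ($parts !== null) {
        echo implode(' | ', $parts), "\n";
    }
}
// cat | likes her | mat
// dog | goes to his | basket
```

A phrase with a word outside the allowed list in the middle, such as "The cat jumps on her mat", makes `splitPhrase` return null.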
Answer 1 (score: 1)
How about this pattern?
$regex = '/\b(cat|dog)\b((?:\b(?:\s+|likes|goes|to|his|her)\b)*)\b(mat|basket)\b/';
preg_match($regex, "The cat likes her mat", $m);
I get this result:
array (size=4)
0 => string 'cat likes her mat' (length=17)
1 => string 'cat' (length=3)
2 => string ' likes her ' (length=11)
3 => string 'mat' (length=3)
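Note that group 2 in this result includes the surrounding whitespace (' likes her ', length 11). If that matters, it can be trimmed after matching. This is a rough sketch only; the helper name `matchPhrase` is made up for illustration.

```php
<?php
// Hypothetical helper (name is an assumption): applies the pattern above
// and trims the whitespace that group 2 picks up around the middle words.
function matchPhrase(string $phrase): ?array
{
    $regex = '/\b(cat|dog)\b((?:\b(?:\s+|likes|goes|to|his|her)\b)*)\b(mat|basket)\b/';

    if (preg_match($regex, $phrase, $m) !== 1) {
        return null;
    }

    // $m[2] is e.g. ' likes her ', so trim it before returning.
    return [$m[1], trim($m[2]), $m[3]];
}

print_r(matchPhrase('The cat likes her mat'));
```

The group numbers stay the same as before, so $m[3] is still the mat/basket.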
I upvoted Casimir's answer, but his pattern returns false positives on these strings:
cat likesher mat
cat likes her mat
cat mat