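I'm trying to match two parts of a string with a regular expression in PHP. I think the problem is greediness. I want the first regex (see the comments) to give me the first two captures, as the second regex does, while still matching both strings. What am I doing wrong?
I'm trying to capture +123 (if cd: is present, as in the first string) and 456.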
<?php
$data[] = 'longstring start waste cd:+123yz456z longstring';
$data[] = 'longstring start waste +yz456z longstring';
$regexs[] = '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/'; // first
$regexs[] = '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/'; // second
foreach ($regexs as $regex) {
    foreach ($data as $string) {
        if (preg_match($regex, $string, $match)) {
            // array_slice() drops $match[0] (the full match) and keeps only the capture groups
            echo "Tried '$regex' on '$string' and got " . implode(',', array_slice($match, 1));
            echo "\n";
        }
    }
}
?>
The output is:
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste +yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456
There is no fourth line because the second regex requires cd:, which the second string does not contain.
Expected output (bear in mind that I'm no expert); only the first line differs from the actual output:
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste +yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456
Answer 0 (score: 1)
OK, so you want to capture +123 if cd: is present, and always capture 456? This is how I would do it:
$data[] = 'longstring start waste cd:+123yz456z longstring';
$data[] = 'longstring start waste +yz456z longstring';
$regexs[] = '/start.+?(?:cd:(.+?)y)?.*?z(.+?)z/';
With liberal use of the non-greedy (?) quantifier modifier, you can get it to match exactly what you want.
Also note the (?:) non-capturing groups. They are very useful.
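To make the two constructs mentioned here concrete, the following is a minimal sketch of how a greedy quantifier, a non-greedy quantifier, and a non-capturing group behave. The helper captures() and the sample strings are invented for illustration and are not part of the original post.

```php
<?php
// Hypothetical helper (illustration only): returns the capture groups
// of the first match, without the full match in $m[0].
function captures(string $regex, string $subject): array
{
    return preg_match($regex, $subject, $m) ? array_slice($m, 1) : [];
}

$subject = 'start 1z 2z';

// Greedy: .+ grabs as much as possible, then backs off to the LAST z.
$greedy = captures('/start(.+)z/', $subject);

// Non-greedy: .+? grows one character at a time and stops at the FIRST z.
$lazy = captures('/start(.+?)z/', $subject);

// Non-capturing: (?:ab) groups "ab" for the + quantifier but creates
// no numbered group, so (c) is group 1.
$grouped = captures('/(?:ab)+(c)/', 'ababc');

var_dump($greedy, $lazy, $grouped);
```

Under these assumptions the greedy pattern captures " 1z 2", the non-greedy one captures " 1", and the last pattern returns only "c".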
Edit: Apparently this doesn't work, so let's try a different approach using an "either/or" group:
$regexs[] = '/start.+?(?:cd:(.+?)yz(.+?)z|\+yz(.+?)z)/';
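As a rough check, the sketch below runs the either/or pattern against the two sample strings from the question. One thing to watch: when cd: is absent, 456 lands in group 3 rather than group 2, so the sketch normalizes the result. The helper name extract_parts() and its return shape are assumptions made here for illustration.

```php
<?php
// Hypothetical helper (illustration only): returns [optional part, required part],
// or null when the pattern does not match at all.
function extract_parts(string $s): ?array
{
    $regex = '/start.+?(?:cd:(.+?)yz(.+?)z|\+yz(.+?)z)/';
    if (!preg_match($regex, $s, $m)) {
        return null;
    }
    // The second branch (no "cd:") fills group 3 and leaves groups 1 and 2 empty.
    if (isset($m[3]) && $m[3] !== '') {
        return [null, $m[3]];
    }
    // The first branch fills groups 1 and 2.
    return [$m[1], $m[2]];
}

$data = [
    'longstring start waste cd:+123yz456z longstring',
    'longstring start waste +yz456z longstring',
];
foreach ($data as $string) {
    var_dump(extract_parts($string));
}
```

For the first string this should give +123 and 456; for the second, null and 456. Keep in mind that the second branch only matches the literal text +yz, so other inputs without cd: would need a looser pattern.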