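Regex mixing greedy and non-greedy?

Asked: 2013-04-04 13:35:31

Tags: php regex

I have a string that I'm trying to break up into data that is easy to work with. For this example, I want the revenue figure as well as the consensus figure.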

$digits = '[\$]?[\d]{1,3}(?:[\.][\d]{1,2})?';
$price = '(?:' . $digits . '(?:[\-])?' . $digits . '[\s]?(?:million|billion)?)';

$str = 'revenue of $31-34 billion, versus the consensus of $29.3 billion';
preg_match_all('/(?:revenue|consensus)(?:.*)' . $price . '/U', $str, $matches[]);
print_r($matches);

This returns:

Array (
    [0] => Array (
        [0] => Array (
            [0] => 'revenue of $31'
            [1] => 'consensus of $29'
        )
    )
)

What I'm expecting:

Array (
    [0] => Array (
        [0] => Array (
            [0] => 'revenue of $31-34 billion'
            [1] => 'consensus of $29.3 billion'
        )
    )
)

When I leave out the U modifier:

Array (
    [0] => Array (
        [0] => Array (
            [0] => 'revenue of $31-34 billion, versus the consensus of $29.3 billion'
        )
    )
)

I can't rely on "of" as a fixed part of the pattern in "revenue of $31-34 billion", because the data may or may not contain it. That is why I used (?:.*)

1 answer:

Answer 0 (score: 2)

preg_match_all('/(?:revenue|consensus)(?:.*?)' . $price . '/', $str, $matches[]);
                                           ^               ^  

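You can make one specific wildcard non-greedy by adding a ? after it, as in .*?. Get rid of the global /U modifier, which inverts the greediness of every quantifier in the pattern (including the ones inside $digits and $price), and change only the wildcard above to be non-greedy, leaving $digits and $price alone.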

Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => revenue of $31-34 billion
                    [1] => consensus of $29.3 billion
                )
        )
)
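
To compare the three variants discussed here side by side, the following is a minimal sketch rather than a definitive implementation. It reuses $digits, $price and $str from the question. The helper find_prices() and the three result variable names are hypothetical, introduced only for illustration. Unlike the original code, it passes a plain $matches instead of $matches[], so the result has one less level of nesting.

```php
<?php
// Patterns and input copied from the question.
$digits = '[\$]?[\d]{1,3}(?:[\.][\d]{1,2})?';
$price  = '(?:' . $digits . '(?:[\-])?' . $digits . '[\s]?(?:million|billion)?)';
$str    = 'revenue of $31-34 billion, versus the consensus of $29.3 billion';

// Hypothetical helper (not part of the original post): builds the pattern
// with the given wildcard and modifiers and returns only the full matches.
function find_prices(string $wildcard, string $modifiers, string $price, string $str): array
{
    $pattern = '/(?:revenue|consensus)(?:' . $wildcard . ')' . $price . '/' . $modifiers;
    preg_match_all($pattern, $str, $matches);
    return $matches[0];
}

// 1) Global /U: every quantifier becomes lazy, so the price is cut short.
$ungreedyAll = find_prices('.*', 'U', $price, $str);

// 2) No modifier: the greedy wildcard swallows everything up to the last price.
$greedyAll = find_prices('.*', '', $price, $str);

// 3) Only the wildcard is lazy; $digits and $price keep their greedy quantifiers.
$lazyOnly = find_prices('.*?', '', $price, $str);

print_r($ungreedyAll);
print_r($greedyAll);
print_r($lazyOnly);
```

The third call should reproduce the two matches shown above, while the first two correspond to the outputs shown in the question with and without the U modifier.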