Question

I'm using preg_match_all to get the quoted users from posts on a forum:

    preg_match_all('/quote author=(.*) link=/', $post, $quotedUsers);

The $post string typically looks something like this:

[quote author=John link=topic=1234.msg123456#msg123456 date=1234567890]Lorem ipsum dolor sit amet[/quote]
Lorem ipsum dolor sit amet consectetur elit...

When only one user is quoted, preg_match_all works fine and returns something like this:

Array
(
    [0] => Array
        (
            [0] => quote author=John link=
        )

    [1] => Array
        (
            [0] => John
        )

)

My code loops over $quotedUsers[1] to get the usernames, and I thought all was well. Except that when two users are quoted, the result looks more like this:

Array
(
    [0] => Array
        (
            [0] => quote author=Bob link=topic=1234.msg123456#msg13456 date=1234567890]Lorem ipsum dolor sit amet[/quote]

[quote author=John link=
        )

    [1] => Array
        (
            [0] => Bob link=topic=1234.msg123456#msg13456 date=1234567890]Lorem ipsum dolor sit amet[/quote]

[quote author=John
        )

)

What's going on, and how do I fix it? I thought preg_match_all would put all the usernames into the $quotedUsers[1] array.

Answer 1

You have to make the * in your regex non-greedy:

    '/quote author=(.*?) link=/'

Just add a ? after the *.
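As a rough sketch of how that pattern behaves, here it is applied to a two-quote post and fed into the kind of loop the question describes. The sample $post text is made up for illustration, modelled on the question's example.

```php
<?php
// Hypothetical sample post with two quoted users (not from a real forum).
$post = "[quote author=Bob link=topic=1234.msg123456#msg13456 date=1234567890]Lorem ipsum dolor sit amet[/quote]\n"
      . "\n"
      . "[quote author=John link=topic=1234.msg123456#msg123456 date=1234567890]Lorem ipsum dolor sit amet[/quote]\n"
      . "Lorem ipsum dolor sit amet consectetur elit...";

// The lazy quantifier .*? stops at the first " link=" after each "author=".
preg_match_all('/quote author=(.*?) link=/', $post, $quotedUsers);

// $quotedUsers[1] now holds one captured username per quote tag.
foreach ($quotedUsers[1] as $user) {
    echo $user, "\n";
}
```

With this input the loop should print Bob and then John, one per line.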

Answer 2

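Make the * non-greedy:

    /quote author=(.*?) link=/

This matches as few characters as possible, stopping at the first " link=" that follows. Otherwise it matches as many characters as possible, which means it runs all the way to the last " link=" it can find.

See "Repetition with Star and Plus" for more information.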
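To see the difference side by side, here is a small comparison of the greedy and non-greedy patterns on the same input. The one-line $post string is a made-up example, kept on a single line because . does not match a newline by default.

```php
<?php
// Hypothetical one-line post containing two quote tags.
$post = '[quote author=Bob link=a]hi[/quote][quote author=John link=b]yo[/quote]';

// Greedy: .* runs to the end of the line, then backtracks to the LAST " link=".
preg_match_all('/quote author=(.*) link=/', $post, $greedy);

// Non-greedy: .*? expands one character at a time and stops at the FIRST " link=".
preg_match_all('/quote author=(.*?) link=/', $post, $lazy);

var_dump($greedy[1]); // a single capture that swallows both tags
var_dump($lazy[1]);   // two captures: "Bob" and "John"
```

The greedy version yields one match whose capture is "Bob link=a]hi[/quote][quote author=John", which is the same shape as the unexpected output in the question.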

Answer 3

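The problem is that in your current RegExp the .* is being greedy and grabbing too much.

    preg_match_all('/\[quote author=([^\]]+) link=/', $post, $quotedUsers);

should do it.

Edit: hopefully usernames won't contain square brackets...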
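Here is a minimal sketch of the character-class approach, assuming the pattern is wrapped in / delimiters (which PHP's preg functions require). The sample $post and the username "John Smith" are invented for illustration.

```php
<?php
// Hypothetical post with two quoted users, one with a space in the name.
$post = '[quote author=Bob link=topic=1.msg2#msg2 date=1]Hi[/quote]'
      . '[quote author=John Smith link=topic=1.msg3#msg3 date=2]Hello[/quote]';

// [^\]]+ can never cross a closing bracket, so each match stays inside one tag.
// It is still greedy, so it backtracks to the last " link=" before the "]".
preg_match_all('/\[quote author=([^\]]+) link=/', $post, $quotedUsers);

print_r($quotedUsers[1]);
```

This should capture Bob and John Smith. As noted above, a username containing ] would cut the match short, so this only helps if the forum disallows brackets in names.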
