Question

Possible duplicate:
Regex - Greedyness - matching HTML tags, content and attributes

The text I want to parse looks like this:

Dir: <a href="/name/nm0381817/">Vinton Heuck</a>, <a href="/name/nm1367649/">Ciro Nieli</a>
    With: <a href="/name/nm0519680/">Eric Loomis</a>, <a href="/name/nm0732436/">Bumper Robinson</a>, <a href="/name/nm1685408/">Dawn Olivieri</a>

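Usually there are one or two anchor elements after "Dir" and several anchor elements after "With".

What I want is to get the text of every anchor element that comes after "Dir" and before "With". I tried a few regular expressions like this one: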

preg_match_all("/Dir: <a href=\"\/name\/.+\/\">(.+)<\/a>/", $content, $matches);

But this only works when there is exactly one anchor element after "Dir". Any suggestions? Thanks!

Answer 1

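I think you are missing a grouping construct, "( )+", so that the pattern matches not just one link but one or two. Take a look at this to test your regular expression.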
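One caveat with a repeated group, shown as a minimal sketch: in PCRE, a capturing group inside a repetition keeps only the text from its last iteration, so a pattern can match both links and still report only the final name. The sample line is copied from the question. The tightened pattern (`[^"]+` and `[^<]+` in place of `.+`, with `~` as the delimiter) and the variable names are assumptions of this sketch, not the asker's code.

```php
<?php
// One "Dir:" line, copied from the question.
$line = 'Dir: <a href="/name/nm0381817/">Vinton Heuck</a>, <a href="/name/nm1367649/">Ciro Nieli</a>';

// The outer (?: ... )+ repeats once per anchor; the inner ( ... ) captures the link text.
$pattern = '~Dir: (?:<a href="/name/[^"]+/">([^<]+)</a>(?:, )?)+~';

preg_match($pattern, $line, $m);

// The whole match spans both anchors, but group 1 was overwritten on the
// second iteration, so only the last name survives.
$lastCaptured = $m[1];
echo $lastCaptured, "\n"; // Ciro Nieli
```

So grouping alone is enough to match one or two links, but not to collect each name in a single preg_match call.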

Answer 2

You have to group the part of the regex that matches an anchor tag and apply + to it, meaning one or more.

Something like this:

/Dir: (<a href=\"\/name\/.+\/\">(.+)<\/a>)+/

You will have to edit it to account for the commas, but it should get you started.
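A possible way to finish this idea, as a sketch rather than a definitive solution: use the grouped pattern only to isolate the run of anchors after "Dir: ", then run preg_match_all over that run to collect every name. The input is the sample from the question. The variable names `$run` and `$directors`, the `~` delimiter, and the assumption that the links are separated by exactly ", " are choices made here for illustration.

```php
<?php
// Sample input copied from the question.
$content = <<<'HTML'
Dir: <a href="/name/nm0381817/">Vinton Heuck</a>, <a href="/name/nm1367649/">Ciro Nieli</a>
    With: <a href="/name/nm0519680/">Eric Loomis</a>, <a href="/name/nm0732436/">Bumper Robinson</a>, <a href="/name/nm1685408/">Dawn Olivieri</a>
HTML;

$directors = [];

// Step 1: capture the whole run of comma-separated anchors after "Dir: ".
// [^"]+ and [^<]+ replace the greedy .+ so each anchor is matched on its own.
$run = '~Dir: ((?:<a href="/name/[^"]+/">[^<]+</a>(?:, )?)+)~';

if (preg_match($run, $content, $m)) {
    // Step 2: pull the link text out of every anchor inside that run.
    preg_match_all('~<a href="/name/[^"]+/">([^<]+)</a>~', $m[1], $names);
    $directors = $names[1];
}

print_r($directors); // Vinton Heuck, Ciro Nieli
```

Because the run stops at the first character that does not start another anchor, the links after "With:" on the next line are never reached.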

Answer 3

Assuming the line containing "Dir:" appears only once:

preg_match_all("/(<([[:graph:]]+)[^>]*>)(.*?)(<\/\\2>)/", preg_replace("/[[:blank:]]*With:.*/","",$content), $matches);

print_r($matches[3]);
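For readers who want to see what each capture group holds, here is the same call split into two steps and run on the sample from the question. This is only an illustrative sketch: the intermediate variables `$dirOnly`, `$tagNames`, and `$directors` are names introduced here, and the patterns are written in single quotes so that `\2` needs no extra escaping.

```php
<?php
// Sample input copied from the question.
$content = <<<'HTML'
Dir: <a href="/name/nm0381817/">Vinton Heuck</a>, <a href="/name/nm1367649/">Ciro Nieli</a>
    With: <a href="/name/nm0519680/">Eric Loomis</a>, <a href="/name/nm0732436/">Bumper Robinson</a>, <a href="/name/nm1685408/">Dawn Olivieri</a>
HTML;

// Remove "With:" and the rest of its line; "." does not cross the newline,
// so the "Dir:" line above it is left untouched.
$dirOnly = preg_replace('/[[:blank:]]*With:.*/', '', $content);

// \2 is a backreference to the tag name captured by the second group,
// so the closing tag must match the opening one.
preg_match_all('/(<([[:graph:]]+)[^>]*>)(.*?)(<\/\2>)/', $dirOnly, $matches);

// $matches[1]: opening tags, $matches[2]: tag names,
// $matches[3]: text between the tags, $matches[4]: closing tags.
$tagNames  = $matches[2];
$directors = $matches[3];

print_r($directors); // Vinton Heuck, Ciro Nieli
```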

How do I write a regular expression for this in PHP?
