Regex pattern to find Markdown link definitions

Date: 2014-03-20 11:02:22

Tags: php regex markdown

I am trying to find all link definitions in a Markdown document:

See you saturday!
  [1]: //xxxx.com/assets/uploads/2014/01/0334633e32aa851cac33d55f4b30a1c4.jpg
  [2]: http://a_url.com/aaaa-2013
  [3]: //xxxx.com/assets/uploads/2014/01/cfdd0c7ce2bbef9b0536455c0914c2f8.jpg

I am using this pattern (taken from Michelf's Markdown parser, http://michelf.com/projects/php-markdown):

/
^[ ]{0,3}(?:\[(.+?)\][ ]?:) # id = $1
[ ]*
\n?             # maybe *one* newline
[ ]*
(?:
<(.+?)>         # url = $2
|
(\S+?)          # url = $3
)
[ ]*
\n?             # maybe one newline
[ ]*
(?:
(?<=\s)         # lookbehind for whitespace
["(]
(.*?)           # title = $4
[")]
[ ]*
)?  # title is optional
(?:\n+|\Z)
/xm

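This works fine until the document contains scheme-relative URLs (those starting with `//`). With the input example above, preg_match_all returns only the last link, but with this input: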

See you saturday!
  [1]: //xxxx.com/assets/uploads/2014/01/0334633e32aa851cac33d55f4b30a1c4.jpg
  [2]: http://a_url.com/aaaa-2013
  [3]: http://xxxx.com/assets/uploads/2014/01/cfdd0c7ce2bbef9b0536455c0914c2f8.jpg

it returns all three, and I don't understand why.
I suspect it has something to do with the greedy behavior of the regex.
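
For anyone who wants to reproduce this, here is a minimal sketch of how the pattern might be run with preg_match_all. The helper name `find_link_definitions` and the variable names are assumptions for illustration, not part of the original code. The sample input is assumed to use LF (`\n`) line endings, because the pattern itself only recognises `\n` as a line break.

```php
<?php
// Pattern copied verbatim from the question. The nowdoc avoids any string
// escaping. Everything else in this file is an illustrative assumption.
$pattern = <<<'REGEX'
/
^[ ]{0,3}(?:\[(.+?)\][ ]?:) # id = $1
[ ]*
\n?             # maybe *one* newline
[ ]*
(?:
<(.+?)>         # url = $2
|
(\S+?)          # url = $3
)
[ ]*
\n?             # maybe one newline
[ ]*
(?:
(?<=\s)         # lookbehind for whitespace
["(]
(.*?)           # title = $4
[")]
[ ]*
)?  # title is optional
(?:\n+|\Z)
/xm
REGEX;

// Hypothetical helper: returns one array per link definition found.
function find_link_definitions(string $pattern, string $text): array
{
    $definitions = [];
    if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER) === false) {
        return $definitions; // the regex engine reported an error
    }
    foreach ($matches as $m) {
        $definitions[] = [
            'id'    => $m[1],
            // Group 2 is the <bracketed> URL, group 3 the bare one.
            'url'   => ($m[2] !== '') ? $m[2] : $m[3],
            // Trailing unmatched groups are omitted from the match array.
            'title' => $m[4] ?? '',
        ];
    }
    return $definitions;
}

// Sample input from the question, written with LF line endings (assumption).
$text = "See you saturday!\n"
      . "  [1]: //xxxx.com/assets/uploads/2014/01/0334633e32aa851cac33d55f4b30a1c4.jpg\n"
      . "  [2]: http://a_url.com/aaaa-2013\n"
      . "  [3]: //xxxx.com/assets/uploads/2014/01/cfdd0c7ce2bbef9b0536455c0914c2f8.jpg\n";

foreach (find_link_definitions($pattern, $text) as $def) {
    printf("%s => %s\n", $def['id'], $def['url']);
}
```

Printing the id and URL of each match makes it easy to compare the two inputs side by side, and to check whether the line endings of the real document differ from the ones assumed here.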

0 answers:

No answers