Change a regex so it does not include anything inside [ and ]

Date: 2013-06-12 11:21:23

Tags: php regex

I have this auto-linking regex code:

// turn any url into url bbcode that doesn't have it already - so we can auto link urls- thanks stackoverflow

$URLRegex = '/(?:(?<!(\[\/url\]|\[\/url=))(\s|^))'; // No [url]-tag in front and is start of string, or has whitespace in front
$URLRegex.= '(';                                    // Start capturing URL
$URLRegex.= '(https?|ftps?|ircs?):\/\/';            // Protocol
$URLRegex.= '\S+';                                  // Any non-space character
$URLRegex.= ')';                                    // Stop capturing URL
$URLRegex.= '(?:(?<![.,;!?:\"\'()-])(\/|\s|\.?$))/i';      // Doesn't end with punctuation and is end of string, or has whitespace after

$body = preg_replace($URLRegex,"$2[url=$3]$3[/url]$5", $body);

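The problem is that if a URL is inside a quote tag and the closing quote tag comes directly after the link, the closing quote tag gets included in the link, which of course breaks it!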

How can I adjust the regex so that the link does not include anything inside [ and ]?

Example input:

[quote=liamdawe] Have you had a look at [url=http://pcgamingwiki.com/wiki/Serious_Sam_II#Linux_Installation]this howto[/url]? :)

http://pcgamingwiki.com/wiki/Serious_Sam_II#Linux_Installation[/quote]
Testing

The correct output would be:

<div class="quote"><strong>Quote from liamdawe</strong><br />  Have you had a look at <a href="http://pcgamingwiki.com/wiki/Serious_Sam_II#Linux_Installation" target="_blank">this howto</a>? <img src="/jscripts/sce/emoticons/smile.png" alt="" /><br />
<br />
<a href="http://pcgamingwiki.com/wiki/Serious_Sam_II#Linux_Installation" target="_blank">http://pcgamingwiki.com/wiki/Serious_Sam_II#Linux_Installation</a></div><br />
Testing

But the output I get is:

<div class="quote"><strong>Quote from liamdawe</strong><br />  Have you had a look at <a href="http://pcgamingwiki.com/wiki/Serious_Sam_II#Linux_Installation" target="_blank">this howto</a>? <img src="/jscripts/sce/emoticons/smile.png" alt="" /><br />
<br />
<a href="http://pcgamingwiki.com/wiki/Serious_Sam_II#Linux_Installation </div>" target="_blank">http://pcgamingwiki.com/wiki/Serious_Sam_II#Linux_Installation[/quote]</a><br />
Testing<br />

As you can see, it includes the [/quote] tag in the link, because the auto-linker regex does not ignore BBCode tags.
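The cause can be reproduced in isolation. The following is a minimal sketch, not part of the original code: it applies only the protocol and `\S+` parts of the regex to the last line of the quote, and the variable names `$url`, `$input` and `$m` are made up for illustration. Because `\S` matches every non-whitespace character, the match runs straight through the brackets of the closing tag.

```php
<?php
// Minimal reproduction (illustrative names, not from the original post).
$url   = 'http://pcgamingwiki.com/wiki/Serious_Sam_II#Linux_Installation';
$input = "\n" . $url . "[/quote]\nTesting";

// Only the protocol + \S+ part of the auto-link regex.
// \S+ is greedy and stops at the first whitespace character,
// so "[" , "/" and "]" are all consumed as part of the URL.
preg_match('/(https?|ftps?|ircs?):\/\/\S+/i', $input, $m);

echo $m[0], "\n";
// prints the URL with "[/quote]" still attached
```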

If needed, here is the code that performs that type of quote:

// Quoting an actual person, a book or whatever
$pattern = '/\[quote\=(.+?)\](.+?)\[\/quote\]/is';

$replace = "<div class=\"quote\"><strong>Quote from $1</strong><br />$2</div>";

while(preg_match($pattern, $body))
{
    $body = preg_replace($pattern, $replace, $body);
}

1 Answer:

Answer 0: (score: 2)

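Try this. It restricts the URL body to a whitelist of characters that does not contain [ or ], and it allows the URL to be followed by [: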

$URLRegex = '/(?:(?<!(\[\/url\]|\[\/url=))(\s|^))'; // No [url]-tag in front and is start of string, or has whitespace in front
$URLRegex.= '(';                                    // Start capturing URL
$URLRegex.= '(https?|ftps?|ircs?):\/\/';            // Protocol
$URLRegex.= '[\w\d\.\/#\_\-\?:=]+';                        // Whitelisted URL characters only; [ and ] are not in the set
$URLRegex.= ')';                                    // Stop capturing URL
$URLRegex.= '(?:(?<![.,;!?:\"\'()-])(\/|\[|\s|\.?$))/i';      // Doesn't end with punctuation and is end of string, or is followed by "/", "[" or whitespace

$body = preg_replace($URLRegex,"$2[url=$3]$3[/url]$5", $body);
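
One limitation of the whitelist above is that characters it omits, such as & or %, stop the URL from being matched in full. A possible alternative, shown here only as a sketch and not as the answer author's method, is to negate the class instead: accept everything except whitespace, [ and ]. The function name `autoLinkUrls` and the variables `$url` and `$input` are hypothetical; the rest of the pattern is the same as in the answer.

```php
<?php
// Hypothetical helper (not from the original post): auto-link bare URLs,
// ending the URL at whitespace or at a BBCode bracket.
function autoLinkUrls(string $body): string
{
    $regex  = '/(?:(?<!(\[\/url\]|\[\/url=))(\s|^))';        // No [/url] in front; start of string or whitespace
    $regex .= '(';                                            // Start capturing URL
    $regex .= '(https?|ftps?|ircs?):\/\/';                    // Protocol
    $regex .= '[^\s\[\]]+';                                   // Assumption: anything except whitespace, [ and ]
    $regex .= ')';                                            // Stop capturing URL
    $regex .= '(?:(?<![.,;!?:\"\'()-])(\/|\[|\s|\.?$))/i';    // Followed by "/", "[", whitespace or end of string

    // Single quotes so PHP leaves $2, $3 and $5 for preg_replace to fill in.
    // Fall back to the untouched input if preg_replace reports an error (null).
    return preg_replace($regex, '$2[url=$3]$3[/url]$5', $body) ?? $body;
}

$url   = 'http://pcgamingwiki.com/wiki/Serious_Sam_II#Linux_Installation';
$input = "[quote=liamdawe] Have you had a look at [url=$url]this howto[/url]? :)\n\n"
       . $url . "[/quote]\nTesting";

echo autoLinkUrls($input), "\n";
```

With the example input from the question, the bare URL on the second line is wrapped in [url=...]...[/url] and the [/quote] tag stays outside it, while the URL that is already inside a [url=...] tag is left alone because it is not preceded by whitespace.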