Question

I am using the following functions:

function MakeLinks($source){
 return preg_replace('!(((f|ht){1}tp://)[-a-zA-Zа-яА-Я()0-9@:%_+.~#?&;//=]+)!i', '<a href="/1">$1</a>', $source);
}

function simpleWiki($text){
 $text = preg_replace('/\[\[Image:(.*)\]\]/', '<a href="$1"><img src="$1" /></a>', $text);
 return $text;
}

The first one converts http://example.com into an http://example.com link.

The second function converts strings like [[Image:http://example.com/logo.png]] into an image.

Now, if I have the text

$text = 'this is my image [[Image:http://example.com/logo.png]]';

and convert it with simpleWiki(makeLinks($text)), the output is something like:

this is my image <a href="url"><img src="<a href="url">url</a>"/></a>

How can I prevent this? How do I check whether a URL is not part of an [[Image:URL]] construct?

Answer 1

Add [^:"]{1} to the pattern in MakeLinks, as shown here:

function MakeLinks($source){
    return preg_replace('![^:"]{1}(((f|ht){1}tp://)[-a-zA-Zа-яА-Я()0-9@:%_+.~#?&;//=]+)!i', '<a href="/1">$1</a>', $source);
}

Then only links that are not preceded by ":" (as in "Image:") will be converted. Call it as $text = simpleWiki(MakeLinks($text));.
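One caveat with the pattern above: [^:"]{1} consumes the character in front of the URL, so that character is dropped by the replacement, and a URL at the very start of the text has no preceding character to match. A minimal sketch of an alternative, assuming a negative lookbehind is acceptable here, is shown below. The function name makeLinksLookbehind is hypothetical, and the href is assumed to be the matched URL itself.

```php
<?php
// Hypothetical variant of MakeLinks() (the name is an assumption, not from the answer).
// The negative lookbehind (?<![:"]) rejects URLs directly preceded by ':' or '"'
// without consuming that character, so surrounding text is left intact.
function makeLinksLookbehind(string $source): string
{
    return preg_replace(
        '!(?<![:"])((?:f|ht)tp://[-a-zA-Zа-яА-Я()0-9@:%_+.~#?&;/=]+)!iu',
        '<a href="$1">$1</a>', // assumption: href should be the matched URL
        $source
    );
}

// Plain URLs are linked; the URL inside [[Image:...]] is skipped.
echo makeLinksLookbehind(
    'see http://example.com and [[Image:http://example.com/logo.png]]'
), "\n";
```

Because the [[Image:...]] URL is left alone, simpleWiki() can then be applied to the result to turn it into an image.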

Edit: you can change it to: preg_replace('![[:space:]](((f|ht){1}tp://)[-a-zA-Zа-яА-Я()0-9@:%_+.~#?&;//=]+)[[:space:]]!i', '<a href="$1">$1</a>', $source);

Answer 2

Your immediate problem can be solved by combining the two expressions into one (with two alternatives) and then using the not-so-well-known but very powerful preg_replace_callback() function, which handles each case separately as it walks through the target string, like so:

<?php // test.php 20110312_1200
$data = "[[Image:http://example.com/logo1.png]]\n".
        "http://example1.com\n".
        "[[Image:http://example.com/logo2.png]]\n".
        "http://example2.com\n";

$re = '!# Capture WikiImage URLs in $1 and other URLs in $2.
      # Either $1: WikiImage URL
      \[\[Image:(.*?)\]\]
    | # Or $2: Non-WikiImage URL.
      (((f|ht){1}tp://)[-a-zA-Zа-яА-Я()0-9@:%_+.~#?&;//=]+)
    !ixu';

$data = preg_replace_callback($re, '_my_callback', $data);

// The callback function is called once for each
// match found and is passed one parameter: $matches.
function _my_callback($matches)
{ // Either $1 or $2 matched, but never both.
    if ($matches[1]) { // $1: WikiImage URL
        return '<a href="'. $matches[1] . '"><img src="'. $matches[1] .'" /></a>';
    }
    else { // $2: Non-WikiImage URL.
        return '<a href="'. $matches[2] . '">'. $matches[2] .'</a>';
    }
}
echo($data);
?>

This script implements your two regexes and does what you are asking. Note that I did change the greedy (.*) to the lazy (.*?) version, because the greedy version does not work correctly (it fails to handle multiple WikiImages). I also added the 'u' modifier to the regex (needed when a pattern contains Unicode characters). As you can see, the preg callback function is very powerful. (This technique can be used to do some pretty heavy lifting, text-processing-wise.)
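To illustrate why the greedy quantifier breaks with more than one WikiImage, here is a small sketch that compares (.*) with (.*?) on a single line containing two images. The sample text and variable names are made up for the demonstration, and the replacement is simplified to a bare img tag.

```php
<?php
// Two WikiImages on the same line (sample input, invented for this demo).
$text = '[[Image:http://example.com/a.png]] and [[Image:http://example.com/b.png]]';
$replacement = '<img src="$1" />';

// Greedy: (.*) runs to the end of the line and backtracks only to the LAST "]]",
// so both images are swallowed into a single match.
$greedy = preg_replace('/\[\[Image:(.*)\]\]/', $replacement, $text);

// Lazy: (.*?) stops at the FIRST "]]", so each image is matched on its own.
$lazy = preg_replace('/\[\[Image:(.*?)\]\]/', $replacement, $text);

echo substr_count($greedy, '<img'), "\n"; // number of img tags with the greedy pattern
echo substr_count($lazy, '<img'), "\n";   // number of img tags with the lazy pattern
```

With the greedy pattern the first capture spans everything between the first "[[Image:" and the final "]]", which is why only the lazy form handles each image separately.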

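Note, however, that the regex you are using to pick out URLs can be significantly improved. For more on "linkifying" URLs, check out the following resources (hint: there are a bunch of "gotchas"):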
The Problem With URLs
An Improved Liberal, Accurate Regex Pattern for Matching URLs
URL Linkification (HTTP/FTP)

Simple wiki parser and automatic link detection
