Question

I have the following:

$reg[0] = '`<a(\s[^>]*)href="([^"]*)"([^>]*)>`si';
$reg[1] = '`<a(\s[^>]*)href="([^"]*)"([^>]*)>`si';
$replace[0] = '<a$1href="http://www.yahoo.com"$3>';
$replace[1] = '<a$1href="http://www.live.com"$3>';
$string = 'Test <a href="http://www.google.com">Google!!</a>Test <a href="http://www.google.com">Google!!2</a>Test';
echo preg_replace($reg, $replace, $string);

The result is:

Test <a href="http://www.live.com">Google!!</a>Test <a href="http://www.live.com">Google!!2</a>Test

I would like to end up with this instead (the difference is in the first link):

Test <a href="http://www.yahoo.com">Google!!</a>Test <a href="http://www.live.com">Google!!2</a>Test

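The idea is to replace every URL inside a link in the string with a different, unique URL. This is for a newsletter system: I want to track what people click on, so each URL will be a "fake" URL that records the click and then redirects the user to the real URL.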

Answer 1

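The problem is that the output of your first replacement is matched again by the second search pattern, so the second replacement string effectively overwrites the first one.

Unless you can somehow distinguish the "modified" links from the original ones so that they are not caught by the other expressions (perhaps by adding an extra HTML attribute?), I don't think you can really solve this with a single preg_replace() call. One possible solution that comes to mind (apart from differentiating the links in the regex) is to use preg_match_all(), since it gives you an array of matches to work with. You could then swap the matched URLs for your tracking URLs by iterating over that array and running str_replace() on each matched URL.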
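
A minimal sketch of the preg_match_all() idea, under a few assumptions. Because str_replace() would rewrite every occurrence of a repeated URL at once (both links in the question point at the same address), this version swaps in offset-based replacement with PREG_OFFSET_CAPTURE and substr_replace() instead. The function name replaceLinksByOffset() and the http://foo.com/out.php?id= tracking format are hypothetical, not part of the question.

```php
<?php
// Hypothetical helper: gives every href in $html its own tracking URL.
function replaceLinksByOffset($html, $trackingBase)
{
    $pattern = '`<a\s[^>]*href="([^"]*)"[^>]*>`si';
    // With PREG_OFFSET_CAPTURE each entry is array(matched text, byte offset).
    preg_match_all($pattern, $html, $matches, PREG_OFFSET_CAPTURE);

    // Work from the last match to the first so earlier offsets stay valid.
    for ($i = count($matches[1]) - 1; $i >= 0; $i--) {
        list($url, $offset) = $matches[1][$i];
        $html = substr_replace($html, $trackingBase . ($i + 1), $offset, strlen($url));
    }
    return $html;
}

$string = 'Test <a href="http://www.google.com">Google!!</a>Test <a href="http://www.google.com">Google!!2</a>Test';
$result = replaceLinksByOffset($string, 'http://foo.com/out.php?id=');
echo $result;
```

Replacing from the end of the string backwards is the design choice that matters here: a replacement changes the string length, which would otherwise invalidate the offsets of the links after it.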

Answer 2

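I'm not great with regular expressions, but if all you are doing is replacing external URLs (i.e. ones that are not part of your site/application) with an internal URL that tracks the click and redirects the user, then it should be easy to build a regular expression that matches only external URLs.

So, assuming your domain is foo.com, you just need a regular expression that only matches hyperlinks whose URL does not begin with http://foo.com. Now, as I said, I'm pretty bad at regular expressions, but here is my best attempt: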

$reg[0] = '`<a(\s[^>]*)href="(?!http://foo.com)([^"]*)"([^>]*)>`si';

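Edit: If you also want to track clicks that lead to internal URLs, just replace http://foo.com in the pattern with the URL of your redirect/tracking page, e.g. http://foo.com/out.php.

Let me walk through an example scenario to show what I mean. Say you have the following newsletter: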

<h1>Newsletter Name</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec lobortis,
ligula <a href="http://bar.com">sed sollicitudin</a> dignissim, lacus dolor
suscipit sapien, <a href="http://foo.com">eget auctor</a> ipsum ligula
non tortor. Quisque sagittis sodales elit. Mauris dictum blandit lacus.
Mauris consequat <a href="http://last.fm">laoreet lacus</a>.</p>

For the purposes of this exercise, the search pattern will be:

// Only match links that don't begin with: http://foo.com/out.php
`<a(\s[^>]*)href="(?!http://foo.com/out\.php)([^"]*)"([^>]*)>`si

This regular expression can be broken down into three parts:

<a(\s[^>]*)href="
(?!http://foo.com/out\.php)([^"]*)
"([^>]*)>

On the first pass of the search, the script will examine:

<a href="http://bar.com">

This link satisfies all 3 parts of the regexp, so the URL is stored in the database and replaced with http://foo.com/out.php?id=1.

On the second pass of the search, the script will examine:

<a href="http://foo.com/out.php?id=1">

This link matches parts 1 and 3, but not part 2, so the search moves on to the next link:

<a href="http://foo.com">

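This link satisfies all 3 parts of the regexp, so the URL is stored in the database and replaced with http://foo.com/out.php?id=2.

On the third pass of the search, the script will examine the first two (already replaced) links, skip them, and then find a match at the last link in the newsletter.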
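
The passes described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the $stored array stands in for the database table, the shortened newsletter text is made up for the example, and rewriting one link per pass with the limit argument of preg_replace() is one possible way to drive the loop, not necessarily how the real script would do it.

```php
<?php
// Only match links that do not already point at the tracking page.
$pattern = '`<a(\s[^>]*)href="(?!http://foo\.com/out\.php)([^"]*)"([^>]*)>`si';

$newsletter = '<p><a href="http://bar.com">one</a> <a href="http://foo.com">two</a> '
            . '<a href="http://last.fm">three</a></p>';

$stored = array(); // stands in for the database: id => original URL

// Each pass rewrites the first link that is still untracked.
while (preg_match($pattern, $newsletter, $m)) {
    $id = count($stored) + 1;
    $stored[$id] = $m[2];
    $replacement = '<a${1}href="http://foo.com/out.php?id=' . $id . '"${3}>';
    // The limit of 1 makes sure only this one link receives this id.
    $newsletter = preg_replace($pattern, $replacement, $newsletter, 1);
}

echo $newsletter;
```

The loop ends on its own because every rewritten link now starts with http://foo.com/out.php, which the negative lookahead excludes from further matches.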

Answer 3

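I'm not sure I have understood the question correctly, but I wrote the following snippet. A regular expression matches the hyperlinks. The code then loops over the results and compares each link text with the hyperlink references (the href values). When a link text is found inside an href, it extends the matches by inserting a tracking link with a unique key.

Update: the snippet now finds all hyperlinks:

Find the links
Build the tracking links
Find the position of each found link (matches[3]) and set a template tag
Replace the template tags with the tracking links. Each link position is unique.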

$string = '<h1>Newsletter Name</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec lobortis,
ligula <a href="http://bar.com">sed sollicitudin</a> dignissim, lacus dolor
suscipit sapien, <a href="http://foo.com">bar.com</a> ipsum ligula
non tortor. Quisque sagittis sodales elit. Mauris dictum blandit lacus.
Mauris consequat <a href="http://last.fm">laoreet lacus</a>.</p>
<h1>Newsletter Name</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec lobortis,
ligula <a href="http://bar.com">sed sollicitudin</a> dignissim, lacus dolor
suscipit sapien, <a href="http://foo.com">bar.com</a> ipsum ligula
non tortor. Quisque sagittis sodales elit. Mauris dictum blandit lacus.
Mauris consequat <a href="http://last.fm">laoreet lacus</a>.</p>
<h1>Newsletter Name</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec lobortis,
ligula <a href="http://bar.com">sed sollicitudin</a> dignissim, lacus dolor
suscipit sapien, <a href="http://foo.com">bar.com</a> ipsum ligula
non tortor. Quisque sagittis sodales elit. Mauris dictum blandit lacus.
Mauris consequat <a href="http://last.fm">laoreet lacus</a>.</p>
';

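// $matches[1] holds the href values, $matches[2] the link texts.
preg_match_all("'<a\s+href=\"(.*)\"\s*>(.*)<\/[^>]+>'U", $string, $matches);

$uniqueURL = 'http://www.yourdomain.com/trackback.php?id=';

// Build a tracking link in $matches[3] for every href that contains
// the text of one of the links (case-insensitive comparison).
$matches[3] = array();
foreach ($matches[2] as $k2 => $m2) {
    foreach ($matches[1] as $k1 => $m1) {
        if (stristr($m1, $m2)) {
            $uniq = $uniqueURL . md5($matches[0][$k2]) . "_" . rand(1000, 9999);
            $matches[3][$k1] = $uniq . "&refLink=" . urlencode($m1);
        }
    }
}
ksort($matches[3]); // place the markers in document order

// Swap the first remaining occurrence of each href for a unique marker,
// so that repeated URLs each get their own position.
foreach ($matches[3] as $key => $val) {
    $startAt = strpos($string, $matches[1][$key]);
    $endAt = $startAt + strlen($matches[1][$key]);

    $strBefore = substr($string, 0, $startAt);
    $strAfter = substr($string, $endAt);

    $string = $strBefore . "@@@" . $key . "@@@" . $strAfter;
}

// Replace each marker with its tracking link.
foreach ($matches[3] as $key => $val) {
    $string = str_replace("@@@" . $key . "@@@", $val, $string);
}
print "<pre>";
echo $string;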

Answer 4

Until PHP 5.3, where you can create a function on the spot, you have to use either create_function (which I hate) or a helper class.

/**
 * For retrieving a new string from a list.
 */
class StringRotation {
    var $i = -1;
    var $strings = array();

    function addString($string) {
        $this->strings[] = $string;
    }

    /**
     * Use sprintf to produce result string
     * Rotates forward
     * @param array $params the string params to insert
     * @return string
     * @uses StringRotation::getNext()
     */
    function parseString($params) {
        $string = $this->getNext();
        array_unshift($params, $string);
        return call_user_func_array('sprintf', $params);
    }

    function getNext() {
        $this->i++;
        // Wrap around once the pointer runs past the last string.
        if ($this->i >= count($this->strings)) {
            $this->i = 0;
        }
        return $this->strings[$this->i];
    }

    function resetPointer() {
        $this->i = -1;
    }
}

$reg = '`<a(\s[^>]*)href="([^"]*)"([^>]*)>`si';
$replaceLinks[0] = '<a%2$shref="http://www.yahoo.com"%4$s>';
$replaceLinks[1] = '<a%2$shref="http://www.live.com"%4$s>';

$string = 'Test <a href="http://www.google.com">Google!!</a>Test <a href="http://www.google.com">Google!!2</a>Test';

$linkReplace = new StringRotation();
foreach ($replaceLinks as $replaceLink) {
    $linkReplace->addString($replaceLink);
}

echo preg_replace_callback($reg, array($linkReplace, 'parseString'), $string);
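
For comparison, here is a rough sketch of what the same rotation could look like on PHP 5.3 or later, where an anonymous function can be passed straight to preg_replace_callback(). The $targets array and the $i counter are names made up for this example; the pattern and input string are the ones from the question.

```php
<?php
$reg = '`<a(\s[^>]*)href="([^"]*)"([^>]*)>`si';
$targets = array('http://www.yahoo.com', 'http://www.live.com');

$string = 'Test <a href="http://www.google.com">Google!!</a>Test <a href="http://www.google.com">Google!!2</a>Test';

$i = 0; // counts how many links have been rewritten so far
$result = preg_replace_callback($reg, function ($m) use (&$i, $targets) {
    // Rotate through the target URLs, one per matched link.
    $url = $targets[$i++ % count($targets)];
    return '<a' . $m[1] . 'href="' . $url . '"' . $m[3] . '>';
}, $string);

echo $result;
```

The counter is captured by reference so that it keeps its value between callback invocations; the modulo makes the list wrap around if there are more links than target URLs.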

How do I replace each URL in a string with another unique URL?
