PHP: Find all links or plain-text URLs in a string and append a query string value

Time: 2014-12-09 10:56:24

Tags: php dom

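I'm trying to find all links (or plain-text URLs) in a given string (an email body) and append a custom query string value to every URL (for Google link tracking).

I'm using this as an example: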

$html = <<< S
<html><body><p></p><div align="center"><img
src="https://domain.com/assets/uploads/291c7977c3b2dc87cdfd77533aa95d25.png"></div><br><br>Hello&nbsp;<strong></strong>,&nbsp;<br><br>Type
your message
here...<br><br>https://domain.com/qa/<br><br><br>Thanks</body></html>
S;

$dom = new DOMDocument;

$dom->loadHTML($html);

$anchors = $dom->getElementsByTagName('body')->item(0)->getElementsByTagName('a');

foreach($anchors as $anchor) {

    $href = $anchor->getAttribute('href');

    $url = parse_url($href);

    $attach = 'stackoverflow=true'; // attach this to all urls

    if (isset($url['query'])) {
        $href .= '&' . $attach;
    } else {
        $href .= '?' . $attach;
    }

    $anchor->setAttribute('href', $href);
}

echo $dom->saveHTML();

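But the links are not replaced. In this case I expect stackoverflow=true to be appended to every link in the given string, but that doesn't happen.

Any help would be greatly appreciated. Thanks.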

1 answer:

Answer 0: (score: 1)

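OK, I found the solution. The sample body contains the URL only as plain text, so there are no <a> elements for the DOM loop to visit. I first need to linkify all plain-text URLs, then use the DOM to do the appending. Here is the modified code: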

$html = <<< S
<html><body><p></p><div align="center"><img
src="https://domain.com/assets/uploads/291c7977c3b2dc87cdfd77533aa95d25.png"></div><br><br>Hello&nbsp;<strong></strong>,&nbsp;<br><br>Type
your message
here...<br><br>https://domain.com/qa/<br><br><br>Thanks</body></html>
S;

// first linkify any non-links
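// The lookbehinds skip URLs that are already inside href="..." or src="...".
$s = preg_replace(
   "/(?<!a href=\")(?<!src=\")((?:https?|ftps?):\/\/[^<>\s]+)/i",
   "<a href=\"\\0\">\\0</a>",
   $html // the input is the $html heredoc defined above
);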

// now find links and append custom query string values
$dom = new DOMDocument;
$dom->loadHTML($s);

$anchors = $dom->getElementsByTagName('body')->item(0)->getElementsByTagName('a');

foreach($anchors as $anchor) {

    $href = $anchor->getAttribute('href');

    $url = parse_url($href);

    $attach = 'stackoverflow=true'; // attach this to all urls

    if (isset($url['query'])) {
        $href .= '&' . $attach;
    } else {
        $href .= '?' . $attach;
    }

    $anchor->setAttribute('href', $href);
}

echo $dom->saveHTML();

So the only thing I added is this part at the top:

// first linkify any non-links
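// The lookbehinds skip URLs that are already inside href="..." or src="...".
$s = preg_replace(
   "/(?<!a href=\")(?<!src=\")((?:https?|ftps?):\/\/[^<>\s]+)/i",
   "<a href=\"\\0\">\\0</a>",
   $html // the input is the $html heredoc defined above
);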
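
One caveat with the appending step: concatenating '?' or '&' onto the end of $href puts the parameter after any #fragment, where it is no longer part of the query string. Below is a minimal sketch of a helper that inserts the parameter before the fragment. The function name appendQueryParam is hypothetical and not part of the original code. It also assumes the parameter is already URL-encoded.

```php
<?php
// Hypothetical helper (not from the original answer): appends a query
// parameter to a URL while keeping any #fragment at the very end.
function appendQueryParam(string $url, string $param): string
{
    // Split off the fragment, if any, so the parameter lands before it.
    $fragment = '';
    $hashPos = strpos($url, '#');
    if ($hashPos !== false) {
        $fragment = substr($url, $hashPos);
        $url = substr($url, 0, $hashPos);
    }

    // Use '&' when the URL already has a query string, '?' otherwise.
    $separator = (strpos($url, '?') !== false) ? '&' : '?';

    return $url . $separator . $param . $fragment;
}

echo appendQueryParam('https://domain.com/qa/', 'stackoverflow=true'), "\n";
// prints https://domain.com/qa/?stackoverflow=true
echo appendQueryParam('https://domain.com/qa/?page=2#top', 'stackoverflow=true'), "\n";
// prints https://domain.com/qa/?page=2&stackoverflow=true#top
```

Inside the foreach loop, the if/else could then be replaced by a single call such as $anchor->setAttribute('href', appendQueryParam($href, $attach));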