Regex to detect and replace <a> tags?

Time: 2013-12-08 19:22:35

Tags: php html regex hyperlink preg-match

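If I were to use preg_replace, what regular expression would determine whether a string contains one or more <a> tags and then add rel="nofollow" to each of them?

So I need to take this: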

Hi! What's up? <a href="http://test.com">Click here</a> to check out
<a href="http://apple.com">my</a> website. This is <b>also</b> a test.

and turn it into this:

Hi! What's up? <a href="http://test.com" rel="nofollow">Click here</a>
to check out <a href="http://apple.com" rel="nofollow">my</a> website. This is
<b>also</b> a test.

2 answers:

Answer 0 (score: 1)

Using the DOM is a better fit for this than a regular expression.

$html = <<<DATA
Hi! What's up? <a href="http://test.com">Click here</a> to check out
<a href="http://apple.com">my</a> website. This is <b>also</b> a test.
DATA;

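// Parse the fragment; loadHTML() builds a full document around it.
$dom = new DOMDocument;
$dom->loadHTML($html);

// Select every <a> element in the document.
$xpath = new DOMXPath($dom);
$links = $xpath->query('//a');

// Add rel="nofollow" to each link (an existing rel value is overwritten).
foreach ($links as $link) {
    $link->setAttribute('rel', 'nofollow');
}

echo $dom->saveHTML();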

Output (relevant part):

Hi! What's up? <a href="http://test.com" rel="nofollow">Click here</a> 
to check out <a href="http://apple.com" rel="nofollow">my</a> website. This is 
<b>also</b> a test.
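
Calling saveHTML() with no argument serializes the whole document, so the result also carries a DOCTYPE and an <html><body> wrapper that loadHTML() added around the fragment. The following is a minimal sketch of one way to get only the fragment back, assuming PHP 7 or later. The function name addNofollow and the temporary <div> wrapper are illustrative choices, not part of the answer above.

```php
<?php
// Hypothetical helper (not from the answer): adds rel="nofollow" to every
// <a> in an HTML fragment and returns only the fragment itself.
function addNofollow(string $fragment): string
{
    $dom = new DOMDocument();

    // Assumption: a temporary <div> gives the fragment a known root element.
    // Parser warnings about imperfect markup are collected, not printed.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML('<div>' . $fragment . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('a') as $link) {
        $link->setAttribute('rel', 'nofollow');
    }

    // Serialize the wrapper's children one by one, leaving out the wrapper,
    // the DOCTYPE, and the <html>/<body> elements.
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    $result = '';
    foreach ($wrapper->childNodes as $child) {
        $result .= $dom->saveHTML($child);
    }
    return $result;
}

$html = 'Hi! <a href="http://test.com">Click here</a> to check out '
      . '<a href="http://apple.com">my</a> website.';
echo addNofollow($html);
```

Because setAttribute() replaces the attribute value, a link that already carries something like rel="noopener" would end up with rel="nofollow" only. Merging the values would need an extra getAttribute() check.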

Answer 1 (score: 0)

Here you go: just match the contents of the <a> tag and modify them.

// \1 already begins with the whitespace that follows "<a", so the replacement adds no extra space.
$new_text = preg_replace('#<a\b((?![^>]*rel="nofollow")[^>]+)>#', '<a\1 rel="nofollow">', $your_starting_text);

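The negative lookahead ((?![^>]*rel="nofollow")) is there to avoid adding the rel attribute twice. It says: if the <a> tag already contains rel="nofollow", do not match. Edit: fixed the double-adding glitch.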
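
To see the lookahead at work, here is a small sketch that runs the same pattern twice over a made-up input in which one link is already tagged. The variable names and sample string are assumptions for illustration. The replacement is written as <a\1 with no space before \1, because the captured group already begins with the whitespace that follows <a.

```php
<?php
// Pattern from the answer: the negative lookahead rejects any <a ...> tag
// whose attribute list already contains rel="nofollow".
$pattern = '#<a\b((?![^>]*rel="nofollow")[^>]+)>#';

// \1 holds everything between "<a" and ">", leading whitespace included.
$replacement = '<a\1 rel="nofollow">';

// Sample input (made up for illustration): the first link is already tagged.
$input = '<a href="http://test.com" rel="nofollow">one</a> '
       . '<a href="http://apple.com">two</a>';

// First pass: only the second link is rewritten.
$once = preg_replace($pattern, $replacement, $input);

// Second pass: every link now carries rel="nofollow", so nothing matches.
$twice = preg_replace($pattern, $replacement, $once);

echo $once, "\n";
var_dump($once === $twice); // the second pass leaves the string unchanged
```

The lookahead only recognizes the exact text rel="nofollow". A tag written with rel='nofollow' or rel="nofollow noopener" would still receive a second rel attribute, and an uppercase <A> tag is not matched at all unless the i modifier is added.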

Demo:

$your_starting_text = 'Hi! What\'s up? <a href="http://test.com" rel="nofollow">Click here</a>
    to check out <a href="http://apple.com" rel="nofollow">my</a> website. This is
    <b>also</b> a test.';
$new_text = preg_replace('#<a\b((?![^>]*rel="nofollow")[^>]+)>#', '<a\1 rel="nofollow">', $your_starting_text);
echo htmlentities($new_text);

Output:

Hi! What's up? <a href="http://test.com" rel="nofollow">Click here</a> to check out <a href="http://apple.com" rel="nofollow">my</a> website. This is <b>also</b> a test.