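I need a function that corrects all outgoing links in a given HTML text by adding the attribute rel="nofollow" to them. Only outgoing links should be changed.
Example: my domain is www.laptops.com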
$myDomain = "www.laptops.com";
// Single quotes on the outside, so the double quotes in the markup need no escaping
$html =
'Hello World have a look at <a href="www.laptops.com/apple">Apple Laptops</a>.
For more info go to <a href="www.apple.com">Apple.com</a>
or to <a href="www.appleblog.com">Appleblog.com</a>';
function correct($html, $myDomain) {
    // get all links by filtering '<a ... href="{$link}" .....>' and
    // check with isOutgoing($href, $myDomain)
}
$newHTML = correct($html, $myDomain);
echo $newHTML;
// Expected output:
// Hello World have a look at <a href="www.laptops.com/apple">Apple Laptops</a>.
// For more info go to <a rel="nofollow" href="www.apple.com">Apple.com</a>
// or to <a rel="nofollow" href="www.appleblog.com">Appleblog.com</a>
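So far I have a function isOutgoing($link) that detects whether a link is outgoing. What I am stuck on is finding every '<a ... href="{$link}" ...>' part of the HTML text and extracting {$link} from it. I know preg_match() should be able to do this, but I don't know how to go about it.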
Answer 0 (score: 2)
You should avoid regular expressions for this and use DOMDocument and DOMXPath instead.
Example:
<?php
$dom = new DOMDocument();
// Parse the fragment without adding <html>/<body> wrappers or a doctype
$dom->loadHTML('
Hello World have a look at <a href="www.laptops.com/apple">Apple Laptops</a>.
For more info go to <a href="www.apple.com">Apple.com</a>
or to <a href="www.appleblog.com">Appleblog.com</a>
', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
// Visit every <a> element in the document
foreach ($xpath->query("//a") as $link) {
    $href = $link->getAttribute('href');
    // the href does not contain www.laptops.com, so add the rel attribute
    if (strpos($href, 'www.laptops.com') === false) {
        $link->setAttribute("rel", "nofollow noopener");
    }
}
echo $dom->saveHTML();
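To get closer to the correct($html, $myDomain) signature from the question, the same DOM approach could be wrapped up roughly as below. This is only a sketch under a few assumptions. isOutgoingLink() is a hypothetical stand-in for your own isOutgoing(), not your actual function. It treats a scheme-less href such as "www.apple.com" as a host name, as your sample does. It does not special-case mailto: or tel: links. The input is assumed to be ASCII, since loadHTML() needs extra care with other encodings.

```php
<?php
// Hypothetical stand-in for the asker's isOutgoing(); not their implementation.
// Assumes a scheme-less href such as "www.apple.com" is meant as a host name,
// as in the question's sample HTML.
function isOutgoingLink(string $href, string $myDomain): bool
{
    $href = trim($href);
    // Empty, fragment-only, and query-only links stay on the current page.
    if ($href === '' || $href[0] === '#' || $href[0] === '?') {
        return false;
    }
    if (strpos($href, '//') === 0) {
        // Protocol-relative link: borrow a scheme so parse_url() sees a host.
        $href = 'http:' . $href;
    } elseif ($href[0] === '/') {
        // Root-relative path on the current site.
        return false;
    } elseif (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $href)) {
        // No scheme at all: treat the leading segment as the host.
        $href = 'http://' . $href;
    }
    $host = parse_url($href, PHP_URL_HOST);
    // Leave the link alone when no host can be extracted.
    if (!is_string($host) || $host === '') {
        return false;
    }
    // Exact, case-insensitive host match; anything else counts as outgoing.
    return strcasecmp($host, $myDomain) !== 0;
}

function correct(string $html, string $myDomain): string
{
    $dom = new DOMDocument();
    // Collect libxml warnings about imperfect markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The wrapper <div> gives the fragment a single root element.
    $dom->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('a') as $link) {
        if ($link->hasAttribute('href')
            && isOutgoingLink($link->getAttribute('href'), $myDomain)) {
            $link->setAttribute('rel', 'nofollow');
        }
    }

    // Serialize only the wrapper's children so the <div> is not in the output.
    $out = '';
    foreach ($dom->documentElement->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$html = 'Hello World have a look at <a href="www.laptops.com/apple">Apple Laptops</a>.
For more info go to <a href="www.apple.com">Apple.com</a>
or to <a href="www.appleblog.com">Appleblog.com</a>';

echo correct($html, 'www.laptops.com');
```

Comparing the parsed host instead of using strpos() avoids a false match on something like http://evil.example/?r=www.laptops.com, which contains your domain as a substring but points elsewhere. Because the match is exact, subdomains such as shop.laptops.com would count as outgoing, so loosen the comparison if that is not what you want.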
Answer 1 (score: 0)
With a bit of jQuery, this becomes much easier.
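<script type="text/javascript">
$(document).ready(function(){
    $('a').each(function(){
        let href = $(this).prop('href');
        // indexOf() returns -1 when the domain is absent; the original
        // !href.search(...) was only true for a match at position 0
        if (href.indexOf('laptops.com') === -1) {
            $(this).prop('rel', 'nofollow');
        }
    });
});
</script>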