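The function below replaces the first two occurrences of the keyword with the markup returned by the callback, i.e. it wraps them in bold and em tags.
One case I need to handle: if the keyword is already inside an h1 tag, I don't want the callback to fire.
Example:
<h1>this is keyword inside a heading tag</h1>
After replacement:
<h1>this is <b>keyword</b> inside a heading tag</h1>
How can I change the replacement so that it skips keywords that occur inside heading tags (h1-h6) and moves on to the next match?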
function doReplace($matches)
{
    static $count = 0;
    switch($count++) {
        case 0: return ' <b>'.trim($matches[1]).'</b>';
        case 1: return ' <em>'.trim($matches[1]).'</em>';
        default: return $matches[1];
    }
}

function save_content($content){
    $mykeyword = "test";
    if ((strpos($content,"<b>".$mykeyword) > -1 ||
         strpos($content,"<strong>".$mykeyword) > -1) &&
         strpos($content,"<em>".$mykeyword) > -1 )
    {
        return $content;
    }
    else
    {
        $theContent = preg_replace_callback("/\b(?<!>)($mykeyword)\b/i","doReplace", $content);
        return $theContent;
    }
}
Answer 0 (score: 4)
Don't use regular expressions on HTML/XML; query the DOM with XPath instead:
$d = new DOMDocument();
$d->loadHTML($your_html);
$x = new DOMXpath($d);
foreach($x->query("//text()[
    contains(.,'keyword')
    and not(ancestor::h1)
    and not(ancestor::h2)
    and not(ancestor::h3)
    and not(ancestor::h4)
    and not(ancestor::h5)
    and not(ancestor::h6)]") as $node){
    // do with the node as you like
}
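To tie this back to the question, here is a minimal sketch of what "do with the node as you like" could look like for the bold/em case. It is one possible way to do it, not a definitive implementation. The function name highlightKeyword, the wrapper div, and the choice to match case-insensitively in PHP (XPath 1.0 contains() is case-sensitive, while the original regex used /i) are all my own assumptions. The XML declaration prefix is a common workaround for hinting UTF-8 to loadHTML.

```php
<?php
// Hypothetical helper, not part of the original answer: wraps the first match
// of $keyword outside h1-h6 in <b> and the second in <em>, leaving the rest alone.
function highlightKeyword(string $html, string $keyword): string
{
    $doc = new DOMDocument();
    $prev = libxml_use_internal_errors(true); // keep sloppy markup from raising warnings
    // The wrapper div (an assumption of this sketch) lets us extract the fragment again.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($prev);

    $xpath = new DOMXPath($doc);
    // Every text node that has no heading element among its ancestors.
    $query = '//text()[not(ancestor::h1 or ancestor::h2 or ancestor::h3'
           . ' or ancestor::h4 or ancestor::h5 or ancestor::h6)]';
    // Copy the result into an array before modifying the tree.
    $nodes = iterator_to_array($xpath->query($query));

    $pattern  = '/\b(' . preg_quote($keyword, '/') . ')\b/iu';
    $wrappers = ['b', 'em']; // first match gets <b>, second gets <em>
    $used     = 0;

    foreach ($nodes as $node) {
        if ($used >= count($wrappers)) {
            break; // both replacements are done
        }
        // Split into [text, match, text, match, ...]; odd indexes are matches.
        $parts = preg_split($pattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if ($parts === false || count($parts) < 2) {
            continue; // no keyword in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1 && $used < count($wrappers)) {
                $el = $doc->createElement($wrappers[$used++]);
                $el->appendChild($doc->createTextNode($part));
                $fragment->appendChild($el);
            } elseif ($part !== '') {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }

    // Serialize only the children of the wrapper div.
    $wrap = $xpath->query('/html/body/div')->item(0);
    $out  = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo highlightKeyword('<h1>A test title</h1><p>First test, second Test, third test.</p>', 'test');
```

With that input, the keyword inside the h1 is skipped and the first two matches in the paragraph are wrapped in b and em. Because the counter is a local variable rather than a static one, the function can be called more than once per request without the count carrying over.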