How can I use preg_replace to strip tags that are identical to their parent tag?
For example, I have a tag named <strip>, and I want to remove any <strip> tags nested inside it:
<strip><strip>Avoid my tag</strip></strip>
I want to end up with this -> <strip>Avoid my tag</strip>
I don't know much about the preg_* functions, but this is what I have so far:
$content = preg_replace_callback(
    '#<strip>(.+?)</strip>#s',
    // create_function() was removed in PHP 8; a closure does the same job
    function ($matches) {
        return '<strip>' . htmlentities($matches[1]) . '</strip>';
    },
    $content
);
This small function applies htmlentities to everything inside the <strip> tags, and I don't want <strip> tags to be nested inside one another.
Thanks
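Answer 0: (score: 0)
Please don't use regular expressions on an HTML DOM; take a look at DOMXPath instead.
The documentation is in the PHP manual.
Here is an example: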
$doc = new DOMDocument();
$doc->loadHTMLFile($file);
$xpath = new DOMXPath($doc);
$elements = $xpath->query("/html/body/*");
foreach ($elements as $element) {
    $nodes = $element->childNodes;
    foreach ($nodes as $node) {
        // Do your stuff here :)
        echo $node->nodeValue . "\n";
    }
}
If the content is XML, take a look at SimpleXMLElement (documented in the PHP manual).
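To make the "do your stuff here" step concrete for the question's case, here is a minimal sketch of unwrapping nested <strip> tags with DOMXPath. It is not from the answer itself: flattenNestedStrip() is a hypothetical helper name, and the sketch assumes the input is an HTML fragment (a string) rather than a file, which is why it wraps the fragment in a temporary <div> before parsing.

```php
<?php
// Sketch only: flattenNestedStrip() is a hypothetical helper, not part of PHP.
function flattenNestedStrip(string $html): string
{
    $doc = new DOMDocument();
    // <strip> is not a real HTML tag, so silence libxml's parser warnings.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in one <div> so it has exactly one root element.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Every <strip> that sits inside another <strip>, copied to an array
    // so the tree can be modified while looping.
    $nested = iterator_to_array($xpath->query('//strip[ancestor::strip]'));
    foreach ($nested as $inner) {
        // Move the children up in front of the nested tag, then drop the tag.
        while ($inner->firstChild !== null) {
            $inner->parentNode->insertBefore($inner->firstChild, $inner);
        }
        $inner->parentNode->removeChild($inner);
    }

    // Serialize the children of the wrapper <div>, not the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo flattenNestedStrip('abc<strip><strip>Avoid my tag</strip></strip>def');
```

Because the query selects by ancestry rather than by adjacency, this approach should also handle nested tags that have other text or markup between them, which a pattern anchored on adjacent tags cannot do.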
Answer 1: (score: 0)
If you want a regular expression, use this:
$str = 'abc<strip><strip><strip>Avoid my tag</strip></strip></strip>def';
echo preg_replace(
    '/((<strip>(?=<strip>))*)(<strip>[^<]+<\/strip>)(((?<=<\/strip>)<\/strip>)*)/',
    '\3',
    $str
); // abc<strip>Avoid my tag</strip>def
Update: going one step further
echo preg_replace_callback(
    '/((<strip>(?=<strip>))*)<strip>([^<]+)<\/strip>(((?<=<\/strip>)<\/strip>)*)/',
    // create_function() was removed in PHP 8; a closure does the same job
    function ($matches) {
        return '<strip>' . htmlentities($matches[3]) . '</strip>';
    },
    $content
);
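As a rough illustration of what the callback version does and where it stops working, the sketch below runs the same pattern on two made-up inputs. The wrapper name collapseAndEscapeStrip() and both sample strings are assumptions for demonstration only. Because the pattern uses [^<]+ for the tag content, it only matches when the innermost content contains no other tag.

```php
<?php
// Hypothetical wrapper around the pattern from the update.
function collapseAndEscapeStrip(string $content): string
{
    return preg_replace_callback(
        '/((<strip>(?=<strip>))*)<strip>([^<]+)<\/strip>(((?<=<\/strip>)<\/strip>)*)/',
        function (array $matches): string {
            // $matches[3] is the text inside the innermost <strip>.
            return '<strip>' . htmlentities($matches[3]) . '</strip>';
        },
        $content
    );
}

// Directly nested tags with plain text inside: collapsed and escaped.
echo collapseAndEscapeStrip('abc<strip><strip>Fish & chips</strip></strip>def'), "\n";
// prints: abc<strip>Fish &amp; chips</strip>def

// Content that contains another tag: [^<]+ cannot match, so nothing changes.
echo collapseAndEscapeStrip('<strip><strip>a <b>bold</b></strip></strip>'), "\n";
// prints: <strip><strip>a <b>bold</b></strip></strip>
```

If the content inside <strip> may itself contain markup, the DOM-based approach is likely the safer choice.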