Sample input text:
<a href="http://www.google.com" onclick="unwanted_code" style="unwanted_style" ondblclick="unwanted_code" onmouseover="unwanted_code">google</a> is a search engine. There are other engines too. <a href="http://www.yahoo.com" onclick="unwanted_code" ondblclick="unwanted_code" onmouseover="unwanted_code" style="unwanted_style">yahoo</a> is another engine.
First attempt:
$pattern[0] = '/(<[^>]+) on.*=".*?"/iU';
$replace[0] = '$1';
$pattern[1] = '/(<[^>]+) style=".*?"/iU';
$replace[1] = '$1';
$out = preg_replace($pattern, $replace, $in);
Output:
<a href="http://www.google.com">yahoo</a> is another engine.
Second attempt:
$out = preg_replace_callback('/(<[^>]+) on.*=".*?"/iU', function($m) {return $m[1];}, $in);
Output:
<a href="http://www.google.com">yahoo</a> is another engine.
The output I want is:
<a href="http://www.google.com">google</a> is a search engine. There are other engines too. <a href="http://www.yahoo.com">yahoo</a> is another engine.
Can anyone help me?
Answer 0 (score: 3)
How about:
$content = '<a href="http://www.google.com" onclick="unwanted_code" style="unwanted_style" ondblclick="unwanted_code" onmouseover="unwanted_code">google</a> is a search engine. There are other engines too. <a href="http://www.yahoo.com" onclick="unwanted_code" ondblclick="unwanted_code" onmouseover="unwanted_code" style="unwanted_style">yahoo</a> is another engine.';
$result = preg_replace('%(<a href="[^"]+")[^>]+(>)%m', "$1$2", $content);
echo $result,"\n";
Output:
<a href="http://www.google.com">google</a> is a search engine. There are other engines too. <a href="http://www.yahoo.com">yahoo</a> is another engine.
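The pattern above relies on `href` being the first attribute of each `<a>` tag. As a rough sketch of how the same idea could cope with `href` appearing anywhere in the tag, the hypothetical helper `keepOnlyHref` below (the name is an assumption, not part of the answer) rewrites each opening `<a>` tag and keeps only a double-quoted `href`. It still assumes that no attribute value contains a `>` character.

```php
<?php
// Hypothetical helper (not from the answer above): keep only href on <a> tags.
function keepOnlyHref(string $html): string
{
    return preg_replace_callback(
        '/<a\b[^>]*>/i',
        function (array $m): string {
            // Look for a double-quoted href anywhere inside the opening tag.
            if (preg_match('/\s(href="[^"]*")/i', $m[0], $href)) {
                return '<a ' . $href[1] . '>';
            }
            // No href found: emit a bare anchor tag.
            return '<a>';
        },
        $html
    );
}

$content = '<a onclick="x" href="http://www.google.com" style="y">google</a>';
echo keepOnlyHref($content), "\n";
```

Because the callback sees one tag at a time, the order of the attributes no longer matters.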
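Answer 1 (score: 2)
Even though this question is tagged regex, I'm adding this answer anyway because it is more robust for input validation; this particular solution accepts only certain tags and restricts which attributes are allowed: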
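// $html is assumed to hold the input markup.
$doc = new DOMDocument();
$doc->loadHTML('<html><body>' . $html . '</body></html>');

// Map of allowed tag name => list of allowed attribute names.
$allowedTags = ['a' => ['href']];

$body = $doc->getElementsByTagName('body')->item(0);
$elements = $body->getElementsByTagName('*');

// The node list is live, so only advance $k when nothing was removed.
for ($k = 0; $element = $elements->item($k); ) {
    $name = strtolower($element->nodeName);
    if (isset($allowedTags[$name])) {
        $allowedAttributes = $allowedTags[$name];
        // The attribute map is live too; same trick with $i.
        for ($i = 0; $attribute = $element->attributes->item($i); ) {
            if (!in_array($attribute->nodeName, $allowedAttributes)) {
                $element->removeAttribute($attribute->nodeName);
                continue;
            }
            ++$i;
        }
    } else {
        // Disallowed tag: remove the element together with its contents.
        $element->parentNode->removeChild($element);
        continue;
    }
    ++$k;
}

// Serialize only the children of <body>, not the wrapper itself.
$result = '';
foreach ($body->childNodes as $childNode) {
    $result .= $doc->saveXML($childNode);
}
echo $result;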
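To try the whitelist approach on the sample text from the question, it can help to wrap it in a function. The sketch below is one possible packaging, not the answer's own code: the name `stripAttributes` and its signature are assumptions, it snapshots the node lists with `iterator_to_array` instead of using the manual index trick, and it serializes with `saveHTML` rather than `saveXML`. It also silences libxml warnings, which `loadHTML` tends to raise on imperfect markup.

```php
<?php
// Hypothetical wrapper; the name and signature are assumptions for illustration.
function stripAttributes(string $html, array $allowedTags): string
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);

    // Snapshot the live node list so removals do not disturb the iteration.
    foreach (iterator_to_array($body->getElementsByTagName('*'), false) as $element) {
        $name = strtolower($element->nodeName);
        if (!isset($allowedTags[$name])) {
            // Disallowed tag: drop it together with its contents.
            if ($element->parentNode !== null) {
                $element->parentNode->removeChild($element);
            }
            continue;
        }
        // Snapshot the attributes as well before removing any of them.
        foreach (iterator_to_array($element->attributes, false) as $attribute) {
            if (!in_array($attribute->nodeName, $allowedTags[$name], true)) {
                $element->removeAttribute($attribute->nodeName);
            }
        }
    }

    // Serialize only the children of <body>.
    $result = '';
    foreach ($body->childNodes as $child) {
        $result .= $doc->saveHTML($child);
    }
    return $result;
}

$in = '<a href="http://www.google.com" onclick="unwanted_code" style="unwanted_style">google</a> is a search engine.';
echo stripAttributes($in, ['a' => ['href']]), "\n";
```

Extending the whitelist is then a matter of adding entries, for example `['a' => ['href', 'title'], 'b' => []]`.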
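Answer 2 (score: 0)
Since you want to keep one attribute (href), you can't simply strip them all. The code below does what you want, but every unwanted attribute has to be listed explicitly. The leading \s+ also removes the whitespace in front of each attribute, so no stray spaces are left behind:
$out = preg_replace('#\s+(onclick|style|ondblclick|onmouseover)="[^"]*"#i', '', $in);
Maybe it can be simplified, but it works :)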
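One possible simplification, offered as a sketch rather than a vetted sanitizer: match any `on...` event handler plus `style` instead of listing each attribute, and restrict the replacement to opening tags so ordinary text is left alone. It avoids the `U` modifier from the question, which inverts greediness and turns `.*?` into a greedy match. It handles only double-quoted values and does nothing about things like `javascript:` URLs in `href`.

```php
<?php
$in = '<a href="http://www.yahoo.com" onclick="unwanted_code" style="unwanted_style">yahoo</a> is another engine.';

// Visit each opening tag, then strip on* handlers and style inside that tag only.
$out = preg_replace_callback(
    '/<[a-z][^>]*>/i',
    function (array $m): string {
        return preg_replace('/\s+(?:on[a-z]+|style)\s*=\s*"[^"]*"/i', '', $m[0]);
    },
    $in
);

echo $out, "\n";
```

The inner `preg_replace` removes every match within the tag, so multiple unwanted attributes on the same element are all dropped in one pass.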