Question

This is a simple one :)

I have this line, which works well:

$listing['biz_description'] = preg_replace('/<!--.*?--\>/','',$listing['biz_description']);

What is the correct regex to remove the HTML-entity version of the comment?

This is the entity form:

&lt;!-- --&gt;

Answer 1

If you are happy with the preg_replace regex you already have, I would decode the HTML entities first with html_entity_decode. As @ircmaxell mentioned, parsing HTML with regex can be very painful.

$str = "This is a  of the emergency  system";
$str = preg_replace('/<!--.*?--\>/', '', html_entity_decode($str));
echo $str;
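As a rough sketch of the two options, assuming the encoded comments look like the &lt;!-- --&gt; shown in the question: you can either decode first and reuse the existing pattern, or match the entity-encoded delimiters directly. The sample string and variable names below are made up for illustration, and the `s` modifier is an addition so that comments spanning several lines are matched too.

```php
<?php
// Hypothetical sample input; the comment text is invented for illustration.
$encoded = 'This is a &lt;!-- hidden --&gt; test of the &lt;!-- also hidden --&gt;system';

// Option 1: decode the entities, then reuse the existing comment pattern.
// Note: html_entity_decode() decodes every entity in the string,
// not only the ones that form the comment markers.
$decoded = preg_replace('/<!--.*?-->/s', '', html_entity_decode($encoded));

// Option 2: match the entity-encoded delimiters directly,
// leaving any other entities in the string untouched.
$direct = preg_replace('/&lt;!--.*?--&gt;/s', '', $encoded);

echo $decoded, "\n";
echo $direct, "\n";
```

Option 2 is probably the safer choice if the rest of the text must stay entity-encoded, since nothing outside the comments is altered.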

Answer 2

NEVER use regex to parse HTML/XML ...

An implementation using DomDocument (assuming valid XML):

$dom = new DomDocument();
$dom->loadXml($listing['biz_description']);
removeComments($dom);
$listing['biz_description'] = $dom->saveXml();

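function removeComments(DomNode $node) {
    if ($node instanceof DomComment) {
        $node->parentNode->removeChild($node);
    } elseif ($node->hasChildNodes()) {
        // childNodes is a live list, so iterate over a copy;
        // removing nodes while walking the live list skips siblings.
        foreach (iterator_to_array($node->childNodes) as $child) {
            removeComments($child);
        }
    }
}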
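For comparison, here is a sketch of a different technique from the recursive walk above: selecting every comment node with an XPath query and removing each one. It assumes the input is well-formed XML; the sample document and variable names are hypothetical.

```php
<?php
// Hypothetical sample document, assumed to be well-formed XML.
$xml = '<root><!-- one --><p>Keep <!-- two -->this</p><!-- three --></root>';

$dom = new DOMDocument();
$dom->loadXML($xml);

// '//comment()' selects every comment node anywhere in the document.
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//comment()') as $comment) {
    $comment->parentNode->removeChild($comment);
}

// Serialize only the root element, which omits the XML declaration.
$result = $dom->saveXML($dom->documentElement);
echo $result, "\n";
```

Like the recursive version, this only sees real comment nodes, so entity-encoded comments would have to be decoded before the document is loaded.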
