Question

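How to remove all tags, together with their text, in PHP

I have read other SO answers, but none of them worked as expected. I tried /<[^>]*>/ and other regular expressions but could not get them to work. And strip_tags only removes the tags themselves, not the text they contain.

Here is my example: http://www.regexr.com/3dmif

How can I remove tags that are nested inside other tags? For example: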

<a>test</a> hello mate <p> test2 <a> test3 </a></p>

The output should be: hello mate

Answer 1

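Getting this result with a regular expression will be very hard, because it requires knowledge of HTML scope (which tag is nested inside which), and a regular expression has no access to that. Using one here would be a very poor solution.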
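To make the scope problem concrete, here is a small sketch. The input string is a made-up example (not the one from the question) in which a tag is nested inside another tag of the same name; the pattern is a typical lazy match with a backreference to the opening tag name.

```php
<?php
// Hypothetical input: a <div> nested inside another <div>.
$html = '<div>outer <div>inner</div> tail</div> hello mate';

// Lazy match from an opening tag to the first closing tag of the same name.
// The regex cannot know that the first </div> belongs to the inner <div>.
$result = preg_replace('~<([a-z]+)>.*?</\1>~is', '', $html);

var_dump($result); // string(22) " tail</div> hello mate"
```

The match stops at the first closing tag it finds, so part of the outer element and a stray closing tag are left behind.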

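A simple way to solve the problem is to parse the HTML and collect only the text nodes at the first level, that is, the direct children of the body element.

This snippet solves the problem you describe, but you will have to extend or change it to suit your needs.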

<?php 
// creates a new dom document with your html
// contents
$dom = new DOMDocument;
$dom->loadHTML("<a>test</a> hello mate <p> test2 <a> test3 </a></p>");

// always use the body element
$body = $dom->getElementsByTagName('body')->item(0);

// prepare your text
$text = '';

// iterate over all items on the first level
// and check if they are a text node:
foreach($body->childNodes as $node)
{
    if ($node->nodeName === '#text')
    {
        $text .= $node->nodeValue;
    }
}

var_dump($text); // hello mate

Cheers.

Edit

As @splash58 pointed out, you can also use XPath to access the text nodes directly.

<?php
// creates a new dom document with your html
// contents
$dom = new DOMDocument;
$dom->loadHTML("<a>test</a> hello mate <p> test2 <a> test3 </a></p>");

$xpath = new DOMXPath($dom);

$text = '';

foreach ($xpath->query("/html/body/text()") as $node)
{
    $text .= $node->nodeValue;
}

var_dump($text); // hello mate
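If you need this in more than one place, the XPath approach could be wrapped in a small function. This is only a sketch: the function name topLevelText is made up for illustration, and the libxml error handling and the final trim() are additions of mine, on the assumption that real input may be invalid HTML and that you do not want the surrounding whitespace.

```php
<?php
// Hypothetical helper: returns only the text that sits directly under <body>.
function topLevelText(string $html): string
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them,
    // then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $text = '';

    // Select only text nodes that are direct children of <body>.
    foreach ($xpath->query('/html/body/text()') as $node) {
        $text .= $node->nodeValue;
    }

    // Drop the whitespace left over around the remaining text.
    return trim($text);
}

echo topLevelText('<a>test</a> hello mate <p> test2 <a> test3 </a></p>'); // hello mate
```

Anything inside an element, at any depth, is skipped, because only the direct text children of the body are selected.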

Answer 2

This snippet solves the problem you describe. I hope it helps.

<?php

$title = "<a>test</a> hello mate <p> test2 <a> test3 </a></p>";

$result = preg_replace("(<([a-z]+)>.*?</\\1>)is","",$title);
echo $result;   // hello mate

?>
