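Get plain text from an HTML string

Time: 2016-03-03 18:33:43

Tags: strip-tags

I am creating an XML file that pulls in product descriptions (OpenCart). The problem is that it outputs all of the HTML contained in the description. I want plain text only, without any HTML tags, divs, styles, and so on.

For example, the string puts the following HTML into the XML output: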

<p><span style="font-weight: bold; font-family: Arial; color: rgb(0, 0, 0); font-size: 18px; font-style: italic;">Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed</span></p>

<p><span style="font-family: Arial; font-size: 14px; color: rgb(0, 0, 0);">diam nonummy nibh euismod tincidunt ut laoreet dolore magna.</span></p>

<p><span style="font-family: Arial; font-size: 14px; color: rgb(0, 0, 0);">aliquam erat volutpat. Ut wisi enim ad minim veniam, quis</span></p>

<p><span style="font-family: Arial; font-size: 14px; color: rgb(0, 0, 0);">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip&nbsp;</span><span style="color: rgb(0, 0, 0); font-family: Arial; font-size: 14px;">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip.</span><span style="color: rgb(0, 0, 0); font-family: Arial; font-size: 14px;">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip</span><span style="color: rgb(0, 0, 0); font-family: Arial; font-size: 14px;">&nbsp;nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip</span><span style="color: rgb(0, 0, 0); font-family: Arial; font-size: 14px;">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip.</span><span style="color: rgb(0, 0, 0); font-family: Arial; font-size: 14px;">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip</span></p>

<p><span style="font-family: Arial; font-size: 14px; color: rgb(0, 0, 0);">&nbsp;</span></p>

<p><span style="font-family: Arial; color: rgb(0, 0, 0); font-size: 18px; font-weight: bold;">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip</span></p>

<ul>
	<li><span style="font-family: Arial; font-size: 14px; color: rgb(0, 0, 0);">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip</span></li>
	<li><span style="font-family: Arial; font-size: 14px; color: rgb(0, 0, 0);">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip</span></li>
	<li><span style="font-family: Arial; font-size: 14px; color: rgb(0, 0, 0);">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip</span></li>
	<li><span style="font-family: Arial; font-size: 14px; color: rgb(0, 0, 0);">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip</span></li>
	<li><span style="font-family: Arial; font-size: 14px; color: rgb(0, 0, 0);">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip</span></li>
</ul>

<p><em><strong><span style="font-family: Arial; font-size: 14px; color: rgb(0, 0, 0);">nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip</span></strong></em></p>

<p><span style="font-family: Arial;">&nbsp;</span></p>

<p>&nbsp;</p>

In $product['description'] I want to keep only the description text, without the HTML tags.

I tried

$proddescr = strip_tags(html_entity_decode($product['description'], ENT_QUOTES, 'UTF-8'));

but it gives me an XML error.

I also tried

$proddescr = strip_tags($product['description']); echo $proddescr;

but had no luck with that either.

Can you show me a way to keep only the text of the string? Thanks.

1 answer:

Answer 0: (score: 0)

'<![CDATA[' . strip_tags(htmlspecialchars_decode($product['description'])) . ']]>'
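
To make the idea above concrete, here is a minimal sketch, assuming the description is stored entity-encoded (as the need for a decode step in the answer suggests). The function names description_to_text, xml_text and xml_cdata are hypothetical helpers invented for illustration; they are not part of OpenCart. The sketch decodes the entities, strips the tags, tidies whitespace, and then shows two ways of making the result safe inside an XML element: escaping it, or wrapping it in a CDATA section.

```php
<?php
// Hypothetical helper: reduce an HTML description to plain text.
function description_to_text(string $html): string
{
    // Decode entities first so that encoded markup (&lt;p&gt;) becomes real tags
    // and &nbsp; becomes a non-breaking space (U+00A0).
    $decoded = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Remove all tags; only the text between them is kept.
    $text = strip_tags($decoded);

    // Collapse runs of whitespace, including non-breaking spaces, into one space.
    $text = preg_replace('/[\s\x{00A0}]+/u', ' ', $text);

    return trim($text);
}

// Option 1 (hypothetical helper): escape &, <, > and quotes for use as XML text.
function xml_text(string $text): string
{
    return htmlspecialchars($text, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Option 2 (hypothetical helper): wrap the text in a CDATA section.
// A literal "]]>" inside the text would end the section early, so it is
// split across two adjacent CDATA sections.
function xml_cdata(string $text): string
{
    return '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $text) . ']]>';
}

// Sample input modelled on the description in the question.
$html = '<p><span style="font-weight: bold;">Lorem &amp; ipsum&nbsp;</span></p><ul><li>dolor</li></ul>';
$proddescr = description_to_text($html);

echo '<description>' . xml_text($proddescr) . '</description>' . PHP_EOL;
echo '<description>' . xml_cdata($proddescr) . '</description>' . PHP_EOL;
```

One likely reason the first attempt in the question produced an XML error is that decoding turns &amp; back into a bare &, which is not allowed in XML text unless it is escaped again or placed inside CDATA; this is an inference, since the question does not show the error message. Note also that strip_tags inserts nothing where a tag was removed, so text from adjacent elements is joined directly unless the markup already contains whitespace between them.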