Question

This question is a follow-up to: Garbage values coming on pulling data from wordpress

I handled the garbage values with the following code:

 htmlentities($entry->title, ENT_QUOTES | ENT_IGNORE, 'UTF-8')

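The problem with the code above is that if the data contains a URL, the link is not rendered; instead the markup is broken down into the following: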

… <a href="http://abc.com/blog/">Continue reading <span class="meta-nav">→</span></a>

Please tell me how to leave URLs alone when the data contains them.

Answer 1

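This is a hacky solution, but judging by how you are approaching this, without worrying about character encoding, you probably just want to get the damn thing to work.

First, we convert the hyperlinks into hacky BBCode. Then we run htmlentities() on the result, and finally we replace the hacky BBCode with the original HTML. Take a look: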

$foo = 'Opening quietly in Chicagos West Loop, the Inspire Business Center is looking to take a more active role in Chicagos startup scene &#8230; Continue reading <span class="meta-nav">&#8594;</span>';
echo smartencode($foo);

function smartencode($str) {
    // Tags to preserve, written as a regex alternation
    $tags = 'a|span';

    // Convert the allowed tags to hacky BBCode: <a ...> becomes [a ...].
    // \b stops "a" from also matching tags such as <abbr>.
    $ret = preg_replace('/<(\/?(?:'.$tags.')\b[^>]*)>/', '[$1]', $str);

    // Remove the so-called garbage: every byte outside \x20-\x7F
    $ret = preg_replace('/[^\x20-\x7F]+/', '', $ret);
    // $ret = htmlentities($ret, ENT_QUOTES | ENT_IGNORE, 'UTF-8');

    // Reinstate the allowed tags as HTML: [a ...] becomes <a ...>
    $ret = preg_replace('/\[(\/?(?:'.$tags.')\b[^\]]*)\]/', '<$1>', $ret);
    return $ret;
}

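Again, it isn't elegant. In fact, if you look closely you may spot some pitfalls, but I think it may be good enough for your use case.

Tested on http://writecodeonline.com/php/ and it works as expected.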
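
One pitfall worth sketching: if the commented-out htmlentities() call were enabled with ENT_QUOTES, the quotes inside the bracketed attributes would be encoded too, so the restored tags would come back as <a href=&quot;...&quot;>. A possible variant, offered only as a sketch, stashes each allowed tag in an array and leaves a plain-ASCII placeholder in its place while encoding. The function name encode_keep_tags, the @@TAGn@@ placeholder format, and passing false for double_encode are all assumptions of this example, not part of the answer above; it also assumes the input never contains text that looks like @@TAG0@@.

```php
<?php
// Hypothetical helper: entity-encode a string but keep whitelisted tags intact.
function encode_keep_tags(string $str, array $tags = ['a', 'span']): string
{
    $stash = [];
    $pattern = '/<\/?(?:' . implode('|', $tags) . ')\b[^>]*>/i';

    // Step 1: swap every allowed tag for a placeholder that htmlentities() ignores.
    $masked = preg_replace_callback($pattern, function (array $m) use (&$stash): string {
        $stash[] = $m[0];
        return '@@TAG' . (count($stash) - 1) . '@@';
    }, $str);

    // Step 2: encode the rest. double_encode = false leaves entities that are
    // already in the text (such as &#8230;) untouched.
    $encoded = htmlentities($masked, ENT_QUOTES | ENT_IGNORE, 'UTF-8', false);

    // Step 3: put the original tags back in place of the placeholders.
    return preg_replace_callback('/@@TAG(\d+)@@/', function (array $m) use (&$stash): string {
        return $stash[(int) $m[1]];
    }, $encoded);
}

echo encode_keep_tags('<a href="http://abc.com/blog/">Continue reading <span class="meta-nav">&#8594;</span></a>');
```

Unlike the bracket trick, this does not rewrite literal [a ...] text that happens to appear in the feed, and tags outside the whitelist (for example <b>) still come out entity-encoded.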

Answer 2

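Using the htmlspecialchars_decode() function, which converts special HTML entities back into characters, fixed the URL problem.

The following lines of code fix the garbage values problem as well as the URLs: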

$ret = $feed;     
echo htmlspecialchars_decode(htmlentities($ret, ENT_QUOTES | ENT_IGNORE, 'UTF-8'));
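
To see why the round trip works, here is a small step-by-step sketch. The sample $feed value is an assumption built from the markup in the question, with a real UTF-8 arrow written as "\u{2192}" (which needs PHP 7.0 or later). ENT_QUOTES is passed to htmlspecialchars_decode() explicitly here, which the line above does not do, so the result does not depend on the default flags of the PHP version in use.

```php
<?php
// Assumed sample input: the question's link markup with a UTF-8 arrow.
$feed = '<a href="http://abc.com/blog/">Continue reading '
      . '<span class="meta-nav">' . "\u{2192}" . '</span></a>';

// Step 1: encode everything, markup included (< becomes &lt;, the arrow becomes &rarr;).
$encoded = htmlentities($feed, ENT_QUOTES | ENT_IGNORE, 'UTF-8');

// Step 2: decode only the special characters (&lt; &gt; &amp; and the quotes),
// so the tags become markup again while &rarr; stays an entity.
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);

echo $decoded;
// <a href="http://abc.com/blog/">Continue reading <span class="meta-nav">&rarr;</span></a>
```

Note that this restores every tag in the feed, not only links, so it is only appropriate when the feed content is trusted.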

