Question

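I have a website with a script running behind it that parses RSS feeds and publishes them on a web page. The script reads and reformats the RSS feed, and it currently strips all HTML tags.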

Here is the code:
$description = strip_tags($description);

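I would like to allow tags such as <p>, <a>, or <br />, but if I do, my site gets messed up for some reason. For example, a large gap appears above the heading.
What would be the solution?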

=== Edit === (more code)

// get all of the sources of news from the database
$get_sources = $db->query("SELECT * FROM ".$prefix."sources ORDER BY last_crawled ASC");
while ($source = $db->fetch_array($get_sources)) {

$feed = new SimplePie($source['url']);

$feed->handle_content_type();  

foreach ($feed->get_items() as $item)  
{  

    $title = $item->get_title();  
    $link = $item->get_link();
    $description = $item->get_content();

    // strip all html
    $description = strip_tags($description);

    // format the data to make sure it's all fine
    $title = html_entity_decode($title, ENT_QUOTES, 'UTF-8');

    // create the path, or slug if you will
    $path = post_slug($title);

    $description = html_entity_decode($description, ENT_QUOTES, 'UTF-8');

Answer 1

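Before stripping the tags, run a string replacement that converts the special characters of the tags you want to keep into entities.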

$source = str_replace('<p>', '&lt;p&gt;', $source);
$source = strip_tags($source);

Then use htmlspecialchars_decode(trim($source)) when you output to HTML.
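The replace, strip, and decode steps above could be wrapped in one function. This is only a minimal sketch: the function name and the list of protected tags are assumptions made for illustration, and literal replacement cannot match an opening <a href="..."> tag because its attributes vary. Note also that htmlspecialchars_decode will decode any entity-encoded markup that was already present in the text, not just the tags protected here.

```php
<?php
// Hypothetical helper (name is an assumption): hide a few tags from
// strip_tags() by entity-encoding them, then restore them afterwards.
function strip_tags_except_protected(string $html): string
{
    // Assumed whitelist of literal tag strings to keep.
    $keep = ['<p>', '</p>', '<br />', '<br>'];

    foreach ($keep as $tag) {
        // '<p>' becomes '&lt;p&gt;', which strip_tags() leaves alone.
        $html = str_replace($tag, htmlspecialchars($tag), $html);
    }

    // Remove every tag that is still written with real angle brackets.
    $html = strip_tags($html);

    // Turn the protected entities back into real tags and trim whitespace.
    return htmlspecialchars_decode(trim($html));
}
```

As an alternative to this technique, PHP's strip_tags accepts an optional second argument listing the tags to leave in place, for example strip_tags($description, '<p><a><br>'). That form also keeps <a> tags together with their attributes.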

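I'd be willing to bet that the reason your page layout breaks is CSS-related. Look closely at the generated source (with Firebug, if possible) and make sure that every HTML element has a matching closing tag and that your script isn't altering any of the page's intentional HTML elements, though I don't know why it would be.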
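One rough way to check for missing closing tags is to count opening and closing tags of a given kind in the generated description. The sketch below is an assumption-laden illustration, not a real HTML validator: the function name is hypothetical, and it only compares counts, so it will not notice tags that are closed in the wrong order.

```php
<?php
// Hypothetical helper: how many opening tags of one kind have no closing tag.
function unclosed_tag_count(string $html, string $tag): int
{
    $name = preg_quote($tag, '/');

    // Opening tags such as <p> or <a href="...">.
    // Self-closed forms such as <br /> are not counted.
    $opens = preg_match_all('/<' . $name . '\b[^>]*(?<!\/)>/i', $html);

    // Closing tags such as </p>.
    $closes = preg_match_all('/<\/' . $name . '\s*>/i', $html);

    return $opens - $closes;
}
```

A result other than zero for <p> or <a> in a feed description would suggest the feed's markup is leaking unclosed elements into the surrounding page.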

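Try isolating the script's output on a blank page so you can look closely at what is happening. Then, once you are sure everything is where it should be, try placing the output in a different part of the page if the problem persists. Also, make sure you trim whitespace.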
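Trimming could look something like the following. It is a minimal sketch with a hypothetical function name, and it assumes that collapsing all runs of whitespace into single spaces is acceptable for the feed descriptions.

```php
<?php
// Hypothetical helper: tidy whitespace in a description before output.
function tidy_whitespace(string $text): string
{
    // Collapse runs of spaces, tabs, and newlines into a single space.
    // preg_replace() returns null on failure, so fall back to the input.
    $text = preg_replace('/\s+/', ' ', $text) ?? $text;

    // Remove whatever whitespace is left at either end.
    return trim($text);
}
```

On the blank test page, printing htmlspecialchars(tidy_whitespace($description)) shows the raw markup as text, which makes stray or unclosed tags easier to spot.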

Let us know what you find.
