Question

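I've found a few posts that touch on this problem, but none of them fully solves it.

I need a function that outputs content with all special characters converted the way htmlentities() does, but that leaves all HTML tags intact.

I've tried many different approaches but, as mentioned above, none of them works as expected.

I was wondering whether there is a way to do this with the PHP DOMDocument class.

I tried the following: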

$objDom = new DOMDocument('1.0', 'utf-8');
$objDom->loadHTML($content);
return $objDom->saveHTML();

This works, but it also adds the full page structure, i.e.

<head><body> etc.

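I just need the contents of the $content variable converted, and nothing more.

It's also worth mentioning that $content may already have some characters converted to be XHTML compliant, because it comes from a WYSIWYG editor. So it may contain &amp; and so on, which should be preserved as well.

Does anyone know a way to do this with DOMDocument? Perhaps I should be using a different save method?

OK, I've come up with the following. It's not pretty, but it does the job: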

$objDom = new DOMDocument('1.0', 'UTF-8');
$objDom->loadHTML($string);
$output = $objDom->saveXML($objDom->documentElement);
$output = str_replace('<html><body>', '', $output);
$output = str_replace('</body></html>', '', $output);
$output = str_replace('&#13;', '', $output);
return $output;

Any better ideas would be appreciated.
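
One way to avoid the str_replace() cleanup above is to serialize only the children of the <body> element that loadHTML() adds. The following is a minimal sketch, not a tested drop-in replacement: the function name inner_body_html is made up for illustration, and the XML encoding prefix is a commonly used workaround to make libxml read the input as UTF-8. Whitespace between elements and the way non-ASCII characters are written out can vary between libxml versions, so check the output against your own content.

```php
<?php
// Hypothetical helper (name not from the original post): parse an HTML
// fragment and return it without the <html>/<body> wrapper that
// DOMDocument::loadHTML() adds.
function inner_body_html(string $content): string
{
    $dom = new DOMDocument();

    // Collect parser warnings for sloppy markup instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // Assumption: the input is UTF-8. The prefix hints that to libxml.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Serialize each child of <body> on its own, so the wrapper
    // elements never reach the output.
    $output = '';
    foreach ($body->childNodes as $child) {
        $output .= $dom->saveHTML($child);
    }
    return $output;
}

echo inner_body_html('<p>Tom &amp; Jerry</p>');
```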

Answer 1

You can use get_html_translation_table and remove the < and > entries:

$trans = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES);
unset($trans['<'], $trans['>']);
$output = strtr($input, $trans);

Answer 2

get_html_translation_table(HTML_ENTITIES) gives you the translation table used by htmlentities() as an array. You can remove <, > and " from the array like this:

<?php
$trans = get_html_translation_table(HTML_ENTITIES);
unset($trans["\""], $trans["<"], $trans[">"]);
$str = "Hallo <strong>& Frau</strong> & Krämer";
$encoded = strtr($str, $trans);

echo $encoded;
?>
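
Note that the translation table still maps & to &amp;, so an entity that is already present in the input (for example &amp; coming from a WYSIWYG editor, as the question describes) would be encoded a second time. A possible variation, sketched here under the assumption that the input is UTF-8, removes & from the table and encodes only those ampersands that do not already start an entity. The function name encode_keep_tags and the regular expression are illustrative and not part of the answers above.

```php
<?php
// Hypothetical helper: entity-encode special characters while leaving
// tags and already-encoded entities alone.
function encode_keep_tags(string $input): string
{
    $table = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES, 'UTF-8');
    // Leave tag delimiters alone; ampersands are handled separately below.
    unset($table['<'], $table['>'], $table['&']);

    // Encode only ampersands that are not the start of a named,
    // decimal, or hexadecimal entity.
    $pattern = '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)/';
    $input = preg_replace($pattern, '&amp;', $input) ?? $input;

    // Translate the remaining characters (for example accented letters).
    return strtr($input, $table);
}

echo encode_keep_tags('Hallo <strong>& Frau</strong> &amp; Krämer');
```

As with the table-based answers, this does nothing to restrict which tags get through, so it is only suitable for input you already trust.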

Answer 3

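Let me start by saying that, in my opinion, what you are trying to do is misguided. What if someone wants to enter a less-than sign? Personally, I see htmlentities() as a way of making sure users cannot enter their own HTML code.

If you need users to be able to style text, there are many solutions already built for that (take a look at TinyMCE or Markdown, for example).

If you must allow users to enter HTML tags, and you must assume they don't know how to use entities, here is a simple function that works: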

function my_htmlentities ($str)
{
  // We'll append everything to this.
  $result = '';

  // Continue while there are HTML tags.
  while (($lt = strpos($str, '<')) !== false)
  {
    // Run `htmlentities` on everything before the tag, and pop it 
    // off the original string.
    $result .= htmlentities(substr($str, 0, $lt));
    $str = substr($str, $lt);

    // We want to continue until we reach the end of the tag. I know 
    // these loops are bad form. Sorry. I still think in F77 :p
    while (true)
    {
      // Find the closing tag as well as quotes.
      $gt = strpos($str, '>');
      $quot = strpos($str, '"');

      // If there is no closing bracket, append the rest of the tag 
      // as plaintext and exit.
      if ($gt === false)
        return $result . $str;

      // If there is a quote before the closing bracket, take care 
      // of it.
      if ($quot !== false && $quot < $gt)
      {
        // Grab everything before the quote.
        $result .= substr($str, 0, $quot+1);
        $str = substr($str, $quot+1);

        // Find the closing quote (if there is none, append and 
        // exit).
        if (($quot = strpos($str, '"')) === false)
          return $result . $str;

        // Grab the inside of the quote.
        $result .= substr($str, 0, $quot+1);
        $str = substr($str, $quot+1);

        // Start over as if we were at the beginning of the tag.
        continue;
      }

      // We just have the closing bracket to deal with. Deal.
      $result .= substr($str, 0, $gt+1);
      $str = substr($str, $gt+1);
      break;
    }
  }

  // There are no more tags, so we can run `htmlentities()` on the 
  // rest of the string.
  return $result . htmlentities($str);

  // Alternatively, if you want users to be able to enter their own
  // entities as well, use this line instead of the return above:
  // return str_replace('&amp;', '&', $result . htmlentities($str));
}

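But let me reiterate: this is very insecure! I'll give you the benefit of the doubt that you know what you want, but I don't think you (or anyone) should want this.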
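
The same idea (encode the text between tags, copy the tags through untouched) can be sketched more compactly with preg_split(). This is a simplified illustration and not equivalent to the function above: it treats anything between < and the next > as a tag, so it does not handle a > inside a quoted attribute value, and it carries the same security caveat. The function name encode_text_between_tags is made up for this example, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper: run htmlentities() only on the text between tags.
function encode_text_between_tags(string $str): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates between text
    // (even indexes) and captured tags (odd indexes).
    $parts = preg_split('/(<[^>]*>)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        return $str;
    }

    $result = '';
    foreach ($parts as $index => $part) {
        if ($index % 2 === 1) {
            // A tag: copy it through unchanged.
            $result .= $part;
        } else {
            // Plain text: encode it, but do not re-encode entities
            // that are already present (double_encode = false).
            $result .= htmlentities($part, ENT_NOQUOTES, 'UTF-8', false);
        }
    }
    return $result;
}

echo encode_text_between_tags('<b>Krämer & Co</b>');
```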

Answer 4

OK, after a lot of research I've settled on a final option, which seems to be exactly what I need.

I used HTMLPurifier and filtered my content with the following:

require_once('HTMLPurifier/HTMLPurifier.auto.php');
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
$objPurifier = new HTMLPurifier($config);
return $objPurifier->purify($string);

I hope others find it useful.

PHP htmlentities() without converting HTML tags
