Question

Basically, I want to turn a string like this:

<code> &lt;div&gt; blabla &lt;/div&gt; </code>

into this:

&lt;code&gt; <div> blabla </div> &lt;/code&gt;

How can I do this?

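The use case (since some people were curious):

A page like this one, with a list of allowed HTML tags and examples. For example, <code> is an allowed tag, and this would be the sample: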

<code>&lt;?php echo "Hello World!"; ?&gt;</code>

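I wanted a reverse function because there are many such tags with samples. I store them all in one array and iterate over it in a single loop, instead of handling each one individually...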

Answer 1

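There is no existing function for this, but have a look at the one below. So far I have only tested it on your example, but it should work for all htmlentities.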

function html_entity_invert($string) {
    $matches = $store = array();
    // Find every existing entity such as &lt; or &#39;
    preg_match_all('/(&(#?\w){2,6};)/', $string, $matches, PREG_SET_ORDER);

    foreach ($matches as $i => $match) {
        // Swap each entity for a placeholder and remember its decoded form
        $key = '__STORED_ENTITY_' . $i . '__';
        $store[$key] = html_entity_decode($match[0]);
        $string = str_replace($match[0], $key, $string);
    }

    // Encode what is left, then put the decoded entities back in
    return str_replace(array_keys($store), $store, htmlentities($string));
}

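Update

Thanks to @Mike for taking the time to test my function with other strings. I have updated my regex from /(\&(.+)\;)/ to /(\&([^\&\;]+)\;)/, which should take care of the issue he raised.
I also added {2,6} to limit the length of each match and reduce the chance of false positives.
Changed the regex from /(\&([^\&\;]+){2,6}\;)/ to /(&([^&;]+){2,6};)/ to remove unnecessary escaping.
Whoa, brainwave! Changed the regex from /(&([^&;]+){2,6};)/ to /(&(#?\w){2,6};)/ to reduce the chance of false positives even further!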
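
To make the difference between the first and the last pattern concrete, here is a small sketch. The sample string is made up for illustration and is not from @Mike's tests.

```php
<?php
// Hypothetical input: a stray ampersand followed by three real entities.
$sample = 'Tom & Jerry &lt;b&gt; &#39;';

// First pattern: the greedy .+ runs from the first "&" to the last ";".
preg_match_all('/(\&(.+)\;)/', $sample, $greedy);

// Final pattern: only "#" and word characters (2 to 6 repetitions) between "&" and ";".
preg_match_all('/(&(#?\w){2,6};)/', $sample, $final);

print_r($greedy[0]); // one match that swallows almost the whole string
print_r($final[0]);  // the three entities, matched separately
```

The stray "&" in "Tom & Jerry" is what trips up the greedy version: everything up to the last ";" becomes a single match.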

Answer 2

My version, using regular expressions:

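$string = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';
// The /e modifier was removed in PHP 7, so a callback does the evaluation instead.
$new_string = preg_replace_callback(
    '/(.*?)(<.*?>|$)/s',
    function ($m) {
        // $m[1] is the text before a tag, $m[2] is the tag itself (empty at the end)
        return html_entity_decode($m[1]) . htmlentities($m[2]);
    },
    $string
);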

It tries to match each tag and each text node, and then applies htmlentities and html_entity_decode to them, respectively.
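
The same tag/text-node idea can also be written as an explicit split, which makes it easier to see which piece gets which treatment. This is only a sketch: it assumes every "<" in the input opens a well-formed tag, and the variable names are my own.

```php
<?php
$string = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';

// Split on tags, keeping the tags themselves in the result and dropping empty pieces.
$parts = preg_split('/(<.*?>)/s', $string, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

$out = '';
foreach ($parts as $part) {
    if ($part[0] === '<') {
        $out .= htmlentities($part);        // real tag: encode it
    } else {
        $out .= html_entity_decode($part);  // text node: decode its entities
    }
}

echo $out, "\n";
```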

Answer 3

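A single replace on its own will not be good enough for you, whether it is a regular expression or a simple string replace. If you replace the &lt; and &gt; entities and then the < and > characters (or vice versa), everything ends up in one form: either all &lt; and &gt; or all < and >.

So if you want to do this, you need to set one of the two sets aside (I chose to replace it with placeholders), do one replacement, then put the first set back and do the other replacement.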

$str = "<code> &lt;div&gt; blabla &lt;/div&gt; </code>";
$search = array("&lt;","&gt;",);

//place holder for &lt; and &gt;
$replace = array("[","]");

//first replace to sub out &lt; and &gt; for [ and ] respectively
$str = str_replace($search, $replace, $str);

//second replace to get rid of original < and >
$search = array("<",">");
$replace = array("&lt;","&gt;",);
$str = str_replace($search, $replace, $str);

//third replace to turn [ and ] into < and >
$search = array("[","]");
$replace = array("<",">");

$str = str_replace($search, $replace, $str);

echo $str;
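
As a possible alternative to the placeholder steps, strtr() with an array argument does all replacements in one pass and does not search text it has already replaced, so the two sets cannot clobber each other and literal [ and ] in the input stay untouched. A minimal sketch, handling only &lt; and &gt; like the code above (the $swap and $inverted names are mine):

```php
<?php
$str = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';

// Both directions in one map; strtr() tries the longest keys first
// and never re-scans its own output.
$swap = array(
    '&lt;' => '<',
    '&gt;' => '>',
    '<'    => '&lt;',
    '>'    => '&gt;',
);

$inverted = strtr($str, $swap);
echo $inverted, "\n";
```

For this example, running strtr() with the same map a second time gives back the original string.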

Answer 4

I think I have a small solution: why not break the HTML tags up into an array, then compare and change them as needed?

function invertHTML($str) {
    // Start with one empty chunk so text before the first tag has somewhere to go
    $res = array('');
    for ($i=0, $j=0; $i < strlen($str); $i++) {
        // Square brackets: the $str{$i} syntax was removed in PHP 8
        if ($str[$i] == "<") {
           if (isset($res[$j]) && strlen($res[$j]) > 0){
                $j++;
                $res[$j] = '';
           } else {
               $res[$j] = '';
           }
           $pos = strpos($str, ">", $i);
           $res[$j] .= substr($str, $i, $pos - $i+1);
           $i += ($pos - $i);
           $j++;
           $res[$j] = '';
           continue;
        }
        $res[$j] .= $str[$i];
    }
    } 

    $newString = '';
    foreach($res as $html){
        $change = html_entity_decode($html);
        if($change != $html){
            $newString .= $change;
        } else {
            $newString .= htmlentities($html);
        }
    }
    return $newString; 
}

Edited... no errors.

Answer 5

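So, while others here have recommended regular expressions, which may well be the right way to go... I wanted to post this, since it is enough to answer the question you asked.

Assuming you are always working with HTML-esque code: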

 $str = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';
 xml_parse_into_struct(xml_parser_create(), $str, $nodes);
 $xmlArr = array();
 foreach($nodes as $node) { 
     echo htmlentities('<' . $node['tag'] . '>') . html_entity_decode($node['value']) . htmlentities('</' . $node['tag'] . '>');
 }

This gives me the following output:

&lt;CODE&gt; <div> blabla </div> &lt;/CODE&gt;

I am fairly sure this will not support going backwards again, the way the other posted solutions do, meaning:

 $orig = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';
 $modified = '&lt;CODE&gt; <div> blabla </div> &lt;/CODE&gt;';
 $modifiedAgain = '<code> &lt;div&gt; blabla &lt;/div&gt; </code>';

Answer 6

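Edit: It seems I had not fully answered your question. There is no built-in PHP function that does what you want, but you can do a find and replace with regular expressions or even simple string functions: str_replace, preg_replace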

Answer 7

I would suggest using regular expressions, e.g. preg_replace().

Reverse htmlentities / html_entity_decode
