PHP \DOMDocument converts > to &gt; and & to &amp;

Date: 2016-12-29 08:58:14

Tags: php special-characters domdocument

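I pass HTML to \DOMDocument, and \DOMDocument converts all the special characters.

How can I tell \DOMDocument not to convert special characters between {% ... %}?

For example, {% if &a > 10 %} is converted to {% if &amp;a &gt; 10 %}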

Input

<!DOCTYPE html>
<body>
    {% if &a > 10 %}
        {% print &a %}
    {% end if %}
<img src="{%# image %}" >
<script>
    if a > 10
</script>
</body>

Output

<!DOCTYPE html>
<html><body>
    {% if &amp;a &gt; 10 %}
        {% print &amp;a %}
    {% end if %}
<img src="%7B%# image %%7D" >
<script>
    if a > 10
</script></body></html>

$dom = new \DOMDocument('1.0', 'UTF-8');
$content = '<!DOCTYPE html><body>
                    {% if &a > 10 %}
                        {% print &a %}
                    {% end if %}
                <img src="{%# image %}" >
                <script>
                    if a > 10
                </script>
            </body>';
@$dom->loadHTML($content);
echo $dom->saveHTML();

2 Answers:

Answer 0 (score: 0)

Try using htmlspecialchars

$dom = new DOMDocument('1.0', 'UTF-8');
$content =  htmlspecialchars('<!DOCTYPE html><body>
                    {% if &a > 10 %}
                        {% print &a %}
                    {% end if %}
                <img src="{%# image %}" >
                <script>
                    if a > 10
                </script>
            </body>');
$dom->loadHTML($content);
echo $dom->saveHTML();

Output:

  

<!DOCTYPE html><body> {% if &a > 10 %} {% print &a %} {% end if %} <img src="{%# image %}" > <script> if a > 10 </script> </body>

Answer 1 (score: 0)

Before sending the HTML to DOMDocument, we should encode the special data, then decode it once the DOM work is finished.

Encoding

<?php
$dom = new DomDocument();
$content = '<!DOCTYPE html>
<html><body>
                    {% if &a > 10 %}
                        {% print &a %}
                    {% end if %}
                <img src="{%# image %}"><script>
                    if a > 10
                </script></body></html>';

$tag_start = '(base64';
$tag_end   = ')';
// Encode every {% ... %} block as (base64...) so that DOMDocument
// treats it as plain text and leaves its special characters alone.
$pattern = '/({%[^}]+})/ium';
preg_match_all($pattern, $content, $matches);
foreach($matches[0] as $key => $val){
    $base64 = $tag_start.base64_encode($val).$tag_end;
    $content = str_replace($val, $base64, $content);
}

// echo $content;

$dom->loadHTML($content);
$domContent = $dom->saveHTML();

Output

<!DOCTYPE html>
<html><body>
                (base64eyUgaWYgJmEgPiAxMCAlfQ==)
                    (base64eyUgcHJpbnQgJmEgJX0=)
                (base64eyUgZW5kIGlmICV9)
            <img src="(base64eyUjIGltYWdlICV9)"><script>
                if a > 10
            </script></body></html>
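
Decoding

The answer above shows only the encoding half. What follows is a minimal sketch of the decoding step, assuming the same `(base64` and `)` markers used above. The function name `decodeTemplateTags` is hypothetical and not part of the original answer. It finds each placeholder in the string returned by saveHTML() and swaps the original {% ... %} block back in.

```php
<?php
// Hypothetical helper (not from the original answer): restores the
// (base64...) placeholders produced by the encoding step.
function decodeTemplateTags(string $html, string $tagStart = '(base64', string $tagEnd = ')'): string
{
    // Start marker, a run of base64 characters, then the end marker.
    $pattern = '/' . preg_quote($tagStart, '/')
             . '([A-Za-z0-9+\/=]+)'
             . preg_quote($tagEnd, '/') . '/';

    return preg_replace_callback(
        $pattern,
        static function (array $m): string {
            // Strict mode returns false on characters outside the base64 alphabet.
            $decoded = base64_decode($m[1], true);
            // If decoding fails, leave the matched text untouched.
            return $decoded === false ? $m[0] : $decoded;
        },
        $html
    );
}
```

With the variables from the encoding step, `echo decodeTemplateTags($domContent);` should print the document with the {% ... %} blocks restored. One caveat of this sketch: any literal text in the page that happens to look like `(base64...)` would also be decoded, so a less common marker may be safer in practice.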