Alternative to PHP Tidy?

Asked: 2011-07-31 18:26:04

Tags: php tidy htmltidy

I use PHP Tidy to process HTML input from my database:

$fragment = tidy_repair_string($dom->saveHTML(), array('output-xhtml'=>1,'show-body-only'=>1));

I enabled php_tidy on my own server, but my live server does not support Tidy:

Fatal error: Call to undefined function tidy_repair_string() in /customers/0/5/a/mysite.com/httpd.www/models/functions.php on line 587

How can I work around this?

5 answers:

Answer 0 (score: 8)

I have found htmLawed to be very fast. I came across it while looking for an alternative to HTMLPurifier, which is very slow.

Answer 1 (score: 6)

Or just pass the markup through a DOMDocument object:

$dirty = "<xml>some content</xml>";
$x = new DOMDocument;
$x->loadHTML($dirty);
$clean = $x->saveXML();
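
To get something closer to Tidy's show-body-only output, you can serialize only the children of the body element instead of the whole document. The following is a minimal sketch, not a drop-in replacement for Tidy: the function name repair_fragment is made up for illustration, it assumes the DOM extension is enabled, and it does not deal with character encodings.

```php
<?php
// Hypothetical helper (not part of PHP): repair an HTML fragment using only DOMDocument.
function repair_fragment(string $html): string
{
    if (trim($html) === '') {
        return ''; // loadHTML() rejects an empty string
    }
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of emitting them for sloppy markup.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The parser wraps the fragment in <html><body>; keep only the body's children.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            // saveXML() on a node gives XML-style serialization (closed tags, <br/>).
            $out .= $doc->saveXML($child);
        }
    }
    return $out;
}

echo repair_fragment('<p>Unclosed <b>bold<br>text'), "\n";
```

Unlike Tidy, this does not re-indent anything and does not filter out dangerous markup; it only balances the tags.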

Answer 2 (score: 5)

If you are on a RedHat / CentOS / Fedora Linux box and have root access to the server, you can run...

yum install php-tidy

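as root. Then restart Apache, and that should get you going.

You may see errors about missing dependencies that need to be added, but usually the command above is all you need.

The command differs slightly on other distributions, but there should be an equivalent.

On Windows you need to install it manually. Instructions can be found here... http://devzone.zend.com/article/761#Heading3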
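
Whichever way you install it, you can check from PHP whether the extension actually got loaded before calling it. This is only a sketch: the names tidy_is_available and repair_or_passthrough are invented for the example, and returning the input unchanged when Tidy is missing is an assumption about what you want, not a recommendation.

```php
<?php
// Hypothetical helper: true only if the tidy extension is loaded in this PHP process.
function tidy_is_available(): bool
{
    return extension_loaded('tidy') && function_exists('tidy_repair_string');
}

// Hypothetical wrapper: use Tidy when present, otherwise hand the markup back untouched.
function repair_or_passthrough(string $html): string
{
    if (tidy_is_available()) {
        // Same options as in the question.
        $options = ['output-xhtml' => true, 'show-body-only' => true];
        $repaired = tidy_repair_string($html, $options, 'utf8');
        return $repaired === false ? $html : $repaired;
    }
    return $html; // assumption: no fallback cleaner is configured
}

echo tidy_is_available() ? "tidy is loaded\n" : "tidy is NOT loaded\n";
echo repair_or_passthrough('<p>Unclosed <b>bold'), "\n";
```

Run this through the web server as well as the command line, since the two can load different php.ini files.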

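Answer 3 (score: 4)

HTML Purifier can rewrite HTML to be standards-compliant, like HTML Tidy does. If you need to filter that input for XSS prevention and the like, it does that too.

It is written entirely in PHP, so you should be able to use it on any server.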
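
A rough sketch of what that might look like. It assumes HTML Purifier is already installed and autoloadable (for example through Composer's vendor/autoload.php, which is not shown here); the wrapper name purify_html is made up, and the doctype setting is just one plausible choice.

```php
<?php
// Sketch only: assumes the HTML Purifier library has been loaded elsewhere
// (e.g. via an autoloader). purify_html is a hypothetical wrapper name.
function purify_html(string $dirty): string
{
    if (!class_exists('HTMLPurifier')) {
        // Fail loudly rather than return unfiltered markup.
        throw new RuntimeException('HTML Purifier is not loaded');
    }
    $config = HTMLPurifier_Config::createDefault();
    $config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
    $purifier = new HTMLPurifier($config);
    return $purifier->purify($dirty);
}

try {
    echo purify_html('<p>Unclosed <b>bold<script>alert(1)</script>'), "\n";
} catch (RuntimeException $e) {
    echo 'Skipped: ', $e->getMessage(), "\n";
}
```

The purifier needs a writable cache directory by default, so check its documentation for the cache settings if you see permission warnings.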

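Answer 4 (score: 0)

PHP SuperTidy?

I got tired of how PHP Tidy works, so I started writing this. It should also tidy all JavaScript. It has not been thoroughly tested, so you may run into unexpected cases that still need handling. It would be interesting to see some of the other talented developers here weigh in on it. P.S. I know this is an old thread, but I wanted to share it somewhere...

Here's a start. Enjoy.

Using SuperTidy: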

$Tidy = new SuperTidy($html);
$Tidy->SetIndentSize(4);
$Tidy->SetOffset(0);
echo $Tidy->BeautifiedHTML();

The SuperTidy class:

<?php
    class SuperTidy
    {
        /*
            Name: PHP SuperTidy
            Author: Paul Ishak
            Copyright: 2020
        */
        private $usedJSNames = [];
        private $indentSize = 4;
        private $sourceHtml = "";
        private $offset = -4;
        public function SetIndentSize($size)
        {
            $this->indentSize = $size;
        }
        public function __construct($html)
        {
            $this->sourceHtml = $html;
        }
        public function OriginalSource()
        {
            return $this->sourceHtml;
        }
        public function UpdateSource($html)
        {
            $this->sourceHtml = $html;          
        }
        public function SetOffset($offset)
        {
            $this->offset = $offset;
        }
        function BeautifiedHTML()
        {
            $this->usedJSNames = [];
            $buffer = $this->sourceHtml;
            $spacesPerIndent = $this->indentSize;
            $JSPlaceHolders = [];
            $out = str_replace("\r","\n",$buffer);
            $out = str_replace("\n\n","\n",$out);
            $out = str_replace("<script", "\n<script",$out);
            $out = str_replace("</script>", "\n</script>\n",$out);
            $lines = explode("\n",$out);
            $javascript = "";
            $outLines = [];
            for($i = 0; $i < count($lines); $i++)
            {
                $line = $lines[$i];
                $line = trim($line);
                if($line == "</script>") continue;
                if(strlen($line) >= strlen("<script"))
                {
                    if(strtolower(substr($line,0,7)) == "<script")
                    {
                        // Opening and closing tag on the same line: keep the line as is.
                        if(strpos(strtolower($line),"</script>") !== false)
                        {
                            $outLines[] = $line;
                        }
                        else
                        {
                            $counter = $i + 1;
                            $jsLine = $lines[$counter] ?? ""; // guard: "<script" may be on the last line
                            $javascript = "";
                            $lineCount = 0;
                            while(strtolower(trim($jsLine)) !== "</script>")
                            {
                                $lineCount++;
                                $javascript.=$jsLine."\n";
                                $counter++;
                                if($counter > count($lines) - 1) break;
                                $jsLine = $lines[$counter];
                            }
                            $i+=$lineCount;
                            if(trim($javascript) == "")
                            {
                                $i++;
                                $line2 = $lines[$i] ?? ""; // guard against running past the last line
                                $thisLine = $line.$line2;
                                if(strpos($thisLine,"src="))
                                {
                                    $outLines[] = $thisLine;
                                }
                                else
                                {
                                    // Inline script on a single line: find the end of the
                                    // opening tag and the start of the closing tag.
                                    $lower = strtolower($thisLine);
                                    $stO = strpos($lower,"<script");
                                    $enO = strpos($lower,">",$stO)+1;
                                    $stC = strpos($lower,"</script");
                                    $javascript = substr($thisLine,$enO,$stC - $enO);
                                    $outLines[] = "<script type='application/javascript'>".$javascript."</script>";             
                                }
                            }
                            else
                            {
                                $unique = $this->GetUniqueJSPlaceHolder($out);
                                $JSPlaceHolders[$unique] = ['javascript'=>$javascript];
                                $outLines[] = "<$unique type='application/javascript'></$unique>";
                            }
                        }
                    }
                    else
                    {
                        $outLines[] = $line;
                    }
                }
                else
                {
                    $outLines[] = $line;
                }
            }
            $modHTML = "";
            foreach($outLines as $line)
            {
                $modHTML .= $line."\n";
            }
            $modHTML = str_replace("\n","",$modHTML);
            $modHTML = str_replace(">",">\n",$modHTML);
            $modHTML = str_replace("<","\n<",$modHTML);
            $modHTML = str_replace("\n\n","\n",$modHTML);
            $lines = explode("\n",$modHTML);
            $outLines = [];
            $indentLevel = -$spacesPerIndent + $this->offset;
            $openTags = [];
            foreach($lines as $line)
            {
                $line = trim($line);
                if($line !== "") $outLines[] = $line;
            }
            $modHTML = "";
            for($j = 0; $j < count($outLines); $j++)
            {
                $line = $outLines[$j];
                $isCloseTag = false;
                $firstChar = substr($line,0,1);
                $isMetaTag = substr(strtolower($line),1, 4) == "meta" ? true: false;
                $isDocType = substr(strtolower($line),2, 7) == "doctype" ? true: false;
                $isSelfClosing = substr($line, strlen($line)-2,1) == "/" ? true : false;
                $beginComment = substr($line, 0,4) == "<!--" ? true : false;
                $applyIndent = ($firstChar == "<") ? true : false;
                $applyIndent = $isMetaTag     ? false : $applyIndent;
                $applyIndent = $isDocType     ? false : $applyIndent;
                $applyIndent = $isSelfClosing ? false : $applyIndent;
                $applyIndent = $beginComment ? false : $applyIndent;
                $contentIndent = $applyIndent ? false : true;
                $tag = "";
                if($applyIndent) 
                { //This is a tag only
                    $tagInner = substr($line,1,-1);
                    $tag = "";
                    for($i = 0; $i < strlen($tagInner); $i++)
                    {
                        $char = substr($tagInner,$i,1);
                        if($char == " ") break;
                        if($char == ">") break; 
                        $tag .=$char;
                    }
                    $isCloseTag = substr($tag,0,1) == "/" ? true: false;
                    
                    if($isCloseTag)
                    {
                        $indentLevel -= $spacesPerIndent;   
                    }
                    else
                    {
                        $indentLevel += $spacesPerIndent;
                        $findTag = "</$tag>";
                        $line2 = $outLines[$j+1] ?? ""; // the last line has no successor
                        if(strtolower($line2) == strtolower($findTag))
                        {
                            $line = $line.$line2;
                            $j+=1;
                            $indentLevel -= $spacesPerIndent;
                            $isCloseTag = true;
                        }
                    }
                }
                $spaces = $indentLevel;
                $spaces += $contentIndent ? $spacesPerIndent : 0;
                $spaces += $isCloseTag    ? $spacesPerIndent : 0;
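                // Unbalanced markup or a negative offset can push the level below zero;
                // str_repeat() throws a ValueError on a negative count in PHP 8.
                $prependSpace = str_repeat(" ", max(0, $spaces));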
                $line = $prependSpace.$line;
                if($tag !== "")
                {
                    $keys = array_keys($JSPlaceHolders);
                    if(in_array($tag,$keys))
                    {
                        $JSPlaceHolders[$tag]['indent'] = $indentLevel;
                    }
                }           
                $modHTML .= $line."\n";
            }
            $keys = array_keys($JSPlaceHolders);
            foreach($keys as $key)
            {
                $javascript = $JSPlaceHolders[$key]['javascript'];
                $indentOffset = $JSPlaceHolders[$key]['indent']+1;
                $javascript = $this->JSTidy($javascript, $indentOffset + ($spacesPerIndent*2), $spacesPerIndent);
                $otStart = strpos($modHTML,"<$key");
                $otEnd   = strpos($modHTML,">", $otStart)+1;
                $ot = substr($modHTML,$otStart, ($otEnd - $otStart));
                $otOut = str_replace($key, "script",$ot);
                $ctStart = strpos($modHTML,"</$key", $otEnd);
                $ctEnd = strpos($modHTML,">", $ctStart)+1;
                $ct = substr($modHTML,$ctStart, ($ctEnd - $ctStart));
                $ctOut = str_repeat(" ",$indentOffset+$spacesPerIndent-1).str_replace($key, "script",$ct);
                $otOut .= "\n".$javascript."\n";
                $modHTML = str_replace($ot,$otOut,$modHTML);
                $modHTML = str_replace($ct,$ctOut,$modHTML);
            }
            return $modHTML;
        }
        function JSTidy($javascript, $indentOffset, $spacesPerIndent)
        {
            $javascript = str_replace("{", "\n{",$javascript);
            $javascript = str_replace("}", "\n}",$javascript);
            $minJs = preg_replace(array("/\s+\n/", "/\n\s+/", "/ +/"), array("\n", "\n ", " "), $javascript);
            $jsLines = explode("\n",$minJs);
            $jsOut = "";
            $indent = $indentOffset;
            $count = count($jsLines);
            for($j = 0; $j < $count;$j++)
            {
                $line = trim($jsLines[$j]);
                if($line == "") continue;
                $c = substr($line,0,1);
                if($c == "}") $indent = $indent - $spacesPerIndent;
                $i = 0;
                $outLine = "";
                while(++$i < $indent)
                {
                    $outLine .=" ";
                }
                $outLine .=$line;
                $jsOut .=$outLine;
                if($j < $count - 2)
                {
                    $jsOut .="\n";
                }
                if($c == "{") $indent = $indent + $spacesPerIndent;             
            }
            return $jsOut;
        }
        function GetUniqueJSPlaceHolder($targetHTML)
        {
            // Draw random placeholder tag names until one is found that neither
            // occurs in the document nor has been handed out before.
            do
            {
                $unique = "JS".strtoupper(hash("sha256", (string)rand()));
            }
            while(strpos($targetHTML,$unique) !== false || in_array($unique, $this->usedJSNames));
            $this->usedJSNames[] = $unique;
            return $unique;
        }
    }
?>