Iterating over an XML file using recursion

Asked: 2015-12-07 15:52:16

Tags: php xml recursion

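I've run into a problem with a recursive function I'm struggling with. I have an XML file that basically records a file/folder structure. The XML nodes (representing files and folders) can be N levels deep, so I am trying to write a script that iterates over all the nodes and creates entries in a database. For folders, I have a table with ID, field_name and parent_id columns. parent_id points to the ID of the folder that contains the current folder. If the folder is at the root level, its parent_id is 0.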

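My problem is that I can't accurately keep track of parent_id as I go down through the levels and then come back up. Here is an example of the XML, but keep in mind that the folders can be nested any number of levels deep: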

<xml>
    <programs>
        <program name ="xxx">
            <groups>
                <group id ="1" name ="yyy">
                    <folder name = "ggg">
                        <file name = "ddfdf"/>
                        <file name = "ddfdf"/>
                        <folder name = "sub" />
                    </folder>
                    <folder name = "sdfsdfs">
                        <file name = "ddfdf"/>                  
                        <folder name = "sub" >
                            <file name = "ddfdf"/>
                        </folder>       
                    </folder>
                </group>
            </groups>
        </program>
    </programs>
</xml>

Script:

    foreach ($program as $p) {
        // creates the root folder and returns its ID
        $id = create_folder($folder);
        $rootId = $id;
        $groups = $program->groups;
        if ($p->groups) {
            foreach ($p->groups as $group) {
                foreach ($group as $folder) {
                    process_folder($folder, $id, $rootId);
                }
            }
        }
    }

    function process_folder($folder, $id, $rootId) {
        foreach ($folder as $key => $value) {
            switch ($key) {
                case "folder":
                    // creates the folder, then returns the ID of the db record
                    $parentId = create_folder($folder);
                    process_folder($value, $parentId, $rootId);
                    // reset the ID, but this doesn't seem to work
                    $parentId = $rootId;
                    break;
                case "file":
                    break;
            }
        }
    }

2 answers:

Answer 0 (score: 0)

This is not a complete solution, but it shows how you can use a RecursiveIterator to do what you are trying to do.

    $strxml='
    <xml>
        <programs>
            <program name ="xxx">
                <groups>
                    <group id ="1" name ="yyy">
                        <folder name = "ggg">
                            <file name = "ddfdf"/>
                            <file name = "ddfdf"/>
                            <folder name = "sub" />
                        </folder>
                        <folder name = "sdfsdfs">
                            <file name = "ddfdf"/>                  
                            <folder name = "sub" >
                                <file name = "ddfdf"/>
                            </folder>       
                        </folder>
                    </group>
                </groups>
            </program>
        </programs>
    </xml>';

    class RecursiveDOMIterator implements RecursiveIterator {
        private $index;
        private $list;

        public function __construct(DOMNode $domNode){
            $this->index = 0;
            $this->list = $domNode->childNodes;
        }
        public function current(){
            return $this->list->item($this->index);
        }
        public function getChildren(){
            return new self( $this->current() );
        }
        public function hasChildren(){
            return $this->current()->hasChildNodes();
        }
        public function key(){
            return $this->index;
        }
        public function next(){
            $this->index++;
        }
        public function rewind(){
            $this->index = 0;
        }
        public function valid(){
            return $this->index < $this->list->length;
        }
    }



    $xml=new DOMDocument;
    $xml->loadXML( $strxml );
    $rootnode=$xml->getElementsByTagName('programs')->item(0);

    $nodeItr=new RecursiveDOMIterator( $rootnode );
    $itr=new RecursiveIteratorIterator( $nodeItr, RecursiveIteratorIterator::SELF_FIRST );
    foreach( $itr as $node ) {
        if( $node->nodeType === XML_ELEMENT_NODE ) {
            $id=$node->hasAttribute('id') ? $node->getAttribute('id') : false;
            $attr=$node->hasAttribute('name') ? $node->getAttribute('name') : false;
            echo $id.' '.$node->nodeName . ' ' . $node->nodeValue. ' ' . $attr .' '.$node->parentNode->tagName . ' ' . '<br />';
        }
    }
    $xml=$rootnode=$itr=$nodeItr=null;
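The iterator above visits every node but does not yet record parent IDs. As a minimal sketch of one way to do that with plain recursion over the DOM: pass the parent ID down as a function argument, so each call keeps its own value and nothing has to be reset on the way back up. The create_folder() below is a hypothetical stand-in that appends to an in-memory array rather than writing to a database, and the function names and the shortened XML are assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for the question's create_folder(): appends a row to
// an in-memory list and returns its 1-based ID, the way an auto-increment
// primary key would.
function create_folder(array &$rows, string $name, int $parentId): int
{
    $id = count($rows) + 1;
    $rows[] = ['id' => $id, 'field_name' => $name, 'parent_id' => $parentId];
    return $id;
}

// Walk the child elements of $node. $parentId is passed by value, so every
// level of the recursion keeps its own copy.
function process_folders(DOMNode $node, int $parentId, array &$rows): void
{
    foreach ($node->childNodes as $child) {
        if ($child->nodeType !== XML_ELEMENT_NODE || $child->nodeName !== 'folder') {
            continue; // skip whitespace text nodes and <file> elements
        }
        $id = create_folder($rows, $child->getAttribute('name'), $parentId);
        process_folders($child, $id, $rows);
    }
}

$doc = new DOMDocument;
$doc->loadXML(
    '<group>'
    . '<folder name="ggg"><file name="f"/><folder name="sub"/></folder>'
    . '<folder name="sdfsdfs"><folder name="sub"><file name="f"/></folder></folder>'
    . '</group>'
);

$rows = [];
process_folders($doc->documentElement, 0, $rows); // 0 marks the root level

foreach ($rows as $row) {
    echo $row['id'] . ' ' . $row['field_name'] . ' parent=' . $row['parent_id'] . "\n";
}
```

With this sample input, the two "sub" folders end up with different parents (1 and 3), which is the bookkeeping the question was struggling with.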

Answer 1 (score: 0)

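Consider using XSLT to simplify your XML into a format that is only one child level deep. Then run a simple PHP loop to extract the parent ID and folder data for the database migration. In fact, if you are using MySQL, you can import the transformed XML document straight into the database with the LOAD XML statement, provided the node names match the column names.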

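As background, XSLT is a special-purpose, declarative programming language (the same type as SQL) used to restructure XML documents into various forms to suit end-use needs. Like most general-purpose languages, including C#, Java, Python, Perl and VB, PHP maintains an XSLT processor. Below are an XSLT script (to which levels can be added for each nesting in the XML document) and a PHP script (which transforms the XML and iterates over the output).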

XSLT script (save as .xsl or .xslt)

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <root>
      <xsl:apply-templates select="*"/>
    </root>
  </xsl:template>  

  <xsl:template match="folder" name="foldertemplate">
    <row>      
      <parent_id>0</parent_id>      
      <field_name><xsl:value-of select="@name"/></field_name>      
    </row>

    <!-- ADD LEVELS FOR EACH NEST IN XML DOCUMENT -->
    <row>
      <parent_id><xsl:value-of select="@name"/></parent_id>      
      <field_name><xsl:value-of select="folder/@name"/></field_name>      
    </row>

    <!-- EXAMPLE NEXT LEVEL -->
    <!-- <row> -->
    <!--   <parent_id><xsl:value-of select="folder/@name"/></parent_id> -->      
    <!--   <field_name><xsl:value-of select="folder/folder/@name"/></field_name> -->    
    <!-- </row> -->
  </xsl:template>

</xsl:transform>

Transformed XML output (easier to parse and iterate)

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <row>
    <parent_id>0</parent_id>
    <field_name>ggg</field_name>
  </row>
  <row>
    <parent_id>ggg</parent_id>
    <field_name>sub</field_name>
  </row>
  <row>
    <parent_id>0</parent_id>
    <field_name>sdfsdfs</field_name>
  </row>
  <row>
    <parent_id>sdfsdfs</parent_id>
    <field_name>sub</field_name>
  </row>
</root>

PHP script (transform and loop through the output)

// Set current directory
$cd = dirname(__FILE__);

// Load the XML source and XSLT file
$doc = new DOMDocument();
$doc->load($cd.'/Input.xml');

$xsl = new DOMDocument;
$xsl->load($cd.'/XSLTScript.xsl');

// Configure the transformer
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl); 

// Transform XML source
$newXml = $proc->transformToXML($doc);

// Save output to file
$xmlfile = $cd.'/Output.xml';
file_put_contents($xmlfile, $newXml);    

// Load new XML with SimpleXML
$newdoc = simplexml_load_file($cd.'/Output.xml');

$data = [];
$node = $newdoc->xpath('//row');
$parents = $newdoc->xpath('//row/parent_id');
$folders = $newdoc->xpath('//row/field_name');    

// Loop through folder names and parent
for($i=0; $i < sizeof($node); $i++) {        
    echo 'parent: '.$parents[$i]. ' folder: ' . $folders[$i]."\n";    
}

#parent: 0 folder: ggg
#parent: ggg folder: sub
#parent: 0 folder: sdfsdfs
#parent: sdfsdfs folder: sub
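
The loop above indexes into three parallel XPath results. As a minimal sketch of an alternative, each row element can be read as a unit and collected into an array ready for insertion. The XML is inlined here, an assumption made so the example runs without Output.xml on disk. Note that the XSLT above writes the parent folder's name into parent_id rather than a numeric ID.

```php
<?php
// Inline copy of the transformed output shown above.
$xmlString = <<<XML
<root>
  <row><parent_id>0</parent_id><field_name>ggg</field_name></row>
  <row><parent_id>ggg</parent_id><field_name>sub</field_name></row>
  <row><parent_id>0</parent_id><field_name>sdfsdfs</field_name></row>
  <row><parent_id>sdfsdfs</parent_id><field_name>sub</field_name></row>
</root>
XML;

$newdoc = simplexml_load_string($xmlString);

$data = [];
foreach ($newdoc->row as $row) {
    // Cast to string: SimpleXML returns element objects, not plain values.
    $data[] = [
        'parent_id'  => (string) $row->parent_id,
        'field_name' => (string) $row->field_name,
    ];
}

foreach ($data as $entry) {
    echo 'parent: ' . $entry['parent_id'] . ' folder: ' . $entry['field_name'] . "\n";
}
```

Because folder names such as "sub" can repeat, a name in parent_id does not always identify a single folder, so a numeric ID would still have to be resolved before inserting into a table keyed on IDs.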