Manipulating child elements with XPath in PHP

Time: 2016-07-14 09:55:14

Tags: php xpath

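I have an XML document with an element (named region) that may or may not have several child elements. For my import I need to make sure there is exactly one child element. If there are more, I need to remove all of them except the last one. The following cases can occur: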

 // Option 1: No child elements
 <property name="region">
 </property>

 // Option 2: One child element
 <property name="region">
 <value>Bottom</value>
 </property>

 // Option 3: Two child elements
 <property name="region">
 <value>Top</value>
 <value>Bottom</value>
 </property>

 // Option 4: Three child elements
 <property name="region">
 <value>Top</value>
 <value>Middle</value>
 <value>Bottom</value>
 </property>

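What I need is to filter the XML so that the region element always ends up with exactly one child element (value), holding either the value of the last child or, if there are no children, the value 'none'.

The output I want looks like this: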

 // For Option 1
 <property name="region">
 <value>none</value>
 </property>

 // For option 2, 3 & 4
 <property name="region">
 <value>Bottom</value>
 </property>

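I know I can select the region property with the XPath query //property[@name = "region"], but I don't know how to manipulate the children from there.

This is the code I have so far: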

 <?php
 $xml = '<properties>
      <property name="region">
      </property>

      <property name="region">
      <value>Bottom</value>
      </property>

      <property name="region">
      <value>Top</value>
      <value>Bottom</value>
      </property>

      <property name="region">
      <value>Top</value>
      <value>Middle</value>
      <value>Bottom</value>
      </property>
 </properties>';
 $document = new DOMDocument();
 $document->loadXML($xml);
 $xpath = new DOMXpath($document);
 foreach($xpath->query('//property[@name = "region"]') as $node){

      // Now I need something like the pseudocode below, but I can't find a way to make it work:
      //   if $node has child elements
      //        remove all but the last child
      //   else
      //        create a child element with the text "none"
 }

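I hope someone can point me in the right direction.

2 answers:

Answer 0 (score: 0)

You can use a nested for loop together with the getElementsByTagName() method of the DOMElement class. Here's how: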

<?php
    $xml = '<properties>
      <property name="region">
      </property>

      <property name="region">
      <value>Bottom</value>
      </property>

      <property name="region">
      <value>Top</value>
      <value>Bottom</value>
      </property>

      <property name="region">
      <value>Top</value>
      <value>Middle</value>
      <value>Bottom</value>
      </property>
 </properties>';

    $document           = new DOMDocument();
    $document->loadXML($xml);
    $xpath              = new DOMXpath($document);

    foreach($xpath->query('//property[@name = "region"]') as $node){
        /**@var DOMElement $node*/
        // COPY THE <value> ELEMENTS INTO A PLAIN ARRAY FIRST: THE LIST RETURNED BY
        // getElementsByTagName() IS LIVE, SO REMOVING NODES WHILE INDEXING INTO IT
        // SHIFTS THE REMAINING ITEMS AND THE WRONG NODES GET REMOVED
        $valueNodes         = iterator_to_array($node->getElementsByTagName("value"));
        $numChildNodes      = count($valueNodes);

        // IF THE CURRENT NODE HAS AT LEAST 1 CHILD, REMOVE THE CHILD NODES...
        // EXCEPT FOR THE LAST CHILD NODE...
        if($numChildNodes > 0){
            $cuePoint       = ($numChildNodes - 1);

            // LOOP UP TO (BUT NOT INCLUDING) THE LAST <value> AND REMOVE EACH ONE
            for($index=0; $index<$cuePoint; $index++){
                $currentNode    = $valueNodes[$index];
                // REMOVE THIS NODE FROM ITS OWN PARENT...
                $currentNode->parentNode->removeChild($currentNode);
            }
        }else{
            // IF THE CURRENT NODE HAS NO CHILD AT ALL,
            // SIMPLY CREATE AN ELEMENT NODE AND APPEND IT IN THE RIGHT CONTEXT...
            // HERE WE ARE USING none AS DEFAULT BUT YOU CAN USE WHATEVER
            // STRING YOU PLEASE ...
            $newElementNode     = $document->createElement("value", "none");
            $node->appendChild($newElementNode);
        }
    }
    $document->save("abc.xml");
    var_dump($document);

The abc.xml file should then look something like this:

    <?xml version="1.0"?>
    <properties>
        <property name="region">
        <value>none</value>
        </property>

        <property name="region">
        <value>Bottom</value>
        </property>

        <property name="region">          
        <value>Bottom</value>
        </property>

        <property name="region">          
        <value>Bottom</value>         
        </property>
     </properties>

And the result of var_dump($document); should be something like this:

    object(DOMDocument)[1]
      public 'doctype' => null
      public 'implementation' => string '(object value omitted)' (length=22)
      public 'documentElement' => string '(object value omitted)' (length=22)
      public 'actualEncoding' => null
      public 'encoding' => null
      public 'xmlEncoding' => null
      public 'standalone' => boolean true
      public 'xmlStandalone' => boolean true
      public 'version' => string '1.0' (length=3)
      public 'xmlVersion' => string '1.0' (length=3)
      public 'strictErrorChecking' => boolean true
      public 'documentURI' => string '/Applications/MAMP/htdocs/poiz/so/' (length=34)
      public 'config' => null
      public 'formatOutput' => boolean false
      public 'validateOnParse' => boolean false
      public 'resolveExternals' => boolean false
      public 'preserveWhiteSpace' => boolean true
      public 'recover' => boolean false
      public 'substituteEntities' => boolean false
      public 'nodeName' => string '#document' (length=9)
      public 'nodeValue' => null
      public 'nodeType' => int 9
      public 'parentNode' => null
      public 'childNodes' => string '(object value omitted)' (length=22)
      public 'firstChild' => string '(object value omitted)' (length=22)
      public 'lastChild' => string '(object value omitted)' (length=22)
      public 'previousSibling' => null
      public 'attributes' => null
      public 'ownerDocument' => null
      public 'namespaceURI' => null
      public 'prefix' => string '' (length=0)
      public 'localName' => null
      public 'baseURI' => string '/Applications/MAMP/htdocs/poiz/so/' (length=34)
      public 'textContent' => string '

                none

                Bottom

                Bottom

                Bottom

         ' (length=146)

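Hopefully this gives you an idea of how to do it yourself.

Cheers and good luck!!!

Answer 1 (score: 0)

Every node has a property called childNodes, which in turn has a length property, so you can get the number of children of the current node that way. But because whitespace between elements is itself treated as a (text) node, you would have to run another query for those nodes and subtract them from childNodes->length.

That solution is correct, but it isn't the best one, because we can shorten it: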
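To make the whitespace issue concrete, here is a minimal sketch of the longer approach described above. The sample XML and the variable names are illustrative assumptions, not part of the original question, and the subtraction only accounts for text nodes (comments or processing instructions would need the same treatment).

```php
<?php
// Hypothetical sample: one region property with two <value> children,
// written with line breaks and indentation between the elements.
$document = new DOMDocument();
$document->loadXML('<property name="region">
  <value>Top</value>
  <value>Bottom</value>
</property>');

$node  = $document->documentElement;
$xpath = new DOMXPath($document);

// childNodes also counts the whitespace-only text nodes between the elements.
$allChildren = $node->childNodes->length;

// Second query: count the text nodes that are direct children of $node.
$textNodes = $xpath->query('./text()', $node)->length;

// What is left over are the element children.
$elementChildren = $allChildren - $textNodes;

echo "all child nodes:  $allChildren\n";
echo "text nodes:       $textNodes\n";
echo "element children: $elementChildren\n";
```

The first number is larger than the number of value elements, which is why childNodes->length on its own is not a reliable child count here.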

$xpath->query('./*', $node)->length > 1

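The expression ./* matches all element nodes (not whitespace) directly inside the current node. Once we have found an element with more than one element child, we're almost done: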

./*[position() < last()]

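This matches every child element of the current node whose position is before the last one. After that we can easily remove them. Here is your modified foreach loop: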

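foreach ($xpath->query('//property[@name = "region"]') as $node) {
    // Only act when there is more than one element child
    if ($xpath->query('./*', $node)->length > 1) {
        // Every element child except the last one
        $children = $xpath->query('./*[position() < last()]', $node);
        foreach ($children as $child) {
            $child->parentNode->removeChild($child);
        }
    }
}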

You can confirm the result by printing the modified document:

echo $document->saveXML();
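
The loop above only covers the "more than one child" case. As a rough sketch of how the whole requirement from the question could be handled in one pass, the example below also adds the 'none' default for an empty property. The function name normalizeRegions and the preserveWhiteSpace / formatOutput settings are assumptions made for illustration; neither answer uses them.

```php
<?php
// Hypothetical helper (not from either answer): keep only the last element
// child of each region property, or add <value>none</value> if it has none.
function normalizeRegions(DOMDocument $document): void
{
    $xpath = new DOMXPath($document);
    foreach ($xpath->query('//property[@name = "region"]') as $node) {
        // No element children at all: append the default value and move on.
        if ($xpath->query('./*', $node)->length === 0) {
            $node->appendChild($document->createElement('value', 'none'));
            continue;
        }
        // The XPath result is not a live list, so removing nodes while
        // iterating over it does not skip any entries.
        foreach ($xpath->query('./*[position() < last()]', $node) as $child) {
            $node->removeChild($child);
        }
    }
}

$xml = '<properties>
  <property name="region">
  </property>
  <property name="region">
    <value>Bottom</value>
  </property>
  <property name="region">
    <value>Top</value>
    <value>Bottom</value>
  </property>
  <property name="region">
    <value>Top</value>
    <value>Middle</value>
    <value>Bottom</value>
  </property>
</properties>';

$document = new DOMDocument();
$document->preserveWhiteSpace = false; // drop whitespace-only text nodes on load
$document->formatOutput = true;        // re-indent the output
$document->loadXML($xml);

normalizeRegions($document);
echo $document->saveXML();
```

Disabling preserveWhiteSpace is optional; it just keeps the output free of the blank lines left behind where elements were removed.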