Flatten an XML structure in PHP and add all values to an array

Asked: 2016-11-22 21:56:49

Tags: php xml multidimensional-array flatten

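I have some browse nodes returned from the Amazon API as XML, which look like the output below. How can I work through this mess, flatten it, and pull out the data I need? This is the input: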

object(SimpleXMLElement)#72 (1) {
  ["BrowseNode"]=>
  array(2) {
    [0]=>
    object(SimpleXMLElement)#73 (3) {
      ["BrowseNodeId"]=>
      string(10) "1342630031"
      ["Name"]=>
      string(8) "Chargers"
      ["Ancestors"]=>
      object(SimpleXMLElement)#75 (1) {
        ["BrowseNode"]=>
        object(SimpleXMLElement)#76 (3) {
          ["BrowseNodeId"]=>
          string(9) "389516011"
          ["Name"]=>
          string(11) "Accessories"
          ["Ancestors"]=>
          object(SimpleXMLElement)#77 (1) {
            ["BrowseNode"]=>
            object(SimpleXMLElement)#78 (3) {
              ["BrowseNodeId"]=>
              string(9) "389514011"
              ["Name"]=>
              string(38) "Sat Nav, GPS, Navigation & Accessories"
              ["Ancestors"]=>
              object(SimpleXMLElement)#79 (1) {
                ["BrowseNode"]=>
                object(SimpleXMLElement)#80 (4) {
                  ["BrowseNodeId"]=>
                  string(6) "560800"
                  ["Name"]=>
                  string(10) "Categories"
                  ["IsCategoryRoot"]=>
                  string(1) "1"
                  ["Ancestors"]=>
                  object(SimpleXMLElement)#81 (1) {
                    ["BrowseNode"]=>
                    object(SimpleXMLElement)#82 (2) {
                      ["BrowseNodeId"]=>
                      string(6) "560798"
                      ["Name"]=>
                      string(19) "Electronics & Photo"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
    [1]=>
    object(SimpleXMLElement)#74 (3) {
      ["BrowseNodeId"]=>
      string(9) "340328031"
      ["Name"]=>
      string(12) "Car Chargers"
      ["Ancestors"]=>
      object(SimpleXMLElement)#75 (1) {
        ["BrowseNode"]=>
        object(SimpleXMLElement)#76 (3) {
          ["BrowseNodeId"]=>
          string(9) "340327031"
          ["Name"]=>
          string(8) "Chargers"
          ["Ancestors"]=>
          object(SimpleXMLElement)#77 (1) {
            ["BrowseNode"]=>
            object(SimpleXMLElement)#78 (3) {
              ["BrowseNodeId"]=>
              string(6) "560826"
              ["Name"]=>
              string(11) "Accessories"
              ["Ancestors"]=>
              object(SimpleXMLElement)#79 (1) {
                ["BrowseNode"]=>
                object(SimpleXMLElement)#80 (3) {
                  ["BrowseNodeId"]=>
                  string(10) "1340509031"
                  ["Name"]=>
                  string(29) "Mobile Phones & Communication"
                  ["Ancestors"]=>
                  object(SimpleXMLElement)#81 (1) {
                    ["BrowseNode"]=>
                    object(SimpleXMLElement)#82 (4) {
                      ["BrowseNodeId"]=>
                      string(6) "560800"
                      ["Name"]=>
                      string(10) "Categories"
                      ["IsCategoryRoot"]=>
                      string(1) "1"
                      ["Ancestors"]=>
                      object(SimpleXMLElement)#83 (1) {
                        ["BrowseNode"]=>
                        object(SimpleXMLElement)#84 (2) {
                          ["BrowseNodeId"]=>
                          string(6) "560798"
                          ["Name"]=>
                          string(19) "Electronics & Photo"
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

I would like to walk through it and flatten it into a structure I can work with, like this:

array(
    (1342630031,'Chargers'),
    (389516011,'Accessories'),
    (389514011,'Sat Nav, GPS, Navigation & Accessories'),
    (560800,'Categories'),
    (560798,'Electronics & Photo'),
    (340328031,'Car Chargers'),
    (340327031,'Chargers'),
    (560826,'Accessories'),
    (1340509031,'Mobile Phones & Communication'),
    (560800,'Categories'),
    (560798,'Electronics & Photo')
)

So that I can do:

echo $array[0][0];
echo $array[0][1];
echo $array[4][1];

which would give:

1342630031
Chargers
Electronics & Photo

etc...

In case it helps, here is the raw XML:

<?xml version="1.0" encoding="UTF-8"?>
<BrowseNodes>
   <BrowseNode>
      <BrowseNodeId>1342630031</BrowseNodeId>
      <Name>Chargers</Name>
      <Ancestors>
         <BrowseNode>
            <BrowseNodeId>389516011</BrowseNodeId>
            <Name>Accessories</Name>
            <Ancestors>
               <BrowseNode>
                  <BrowseNodeId>389514011</BrowseNodeId>
                  <Name>Sat Nav, GPS, Navigation &amp; Accessories</Name>
                  <Ancestors>
                     <BrowseNode>
                        <BrowseNodeId>560800</BrowseNodeId>
                        <Name>Categories</Name>
                        <IsCategoryRoot>1</IsCategoryRoot>
                        <Ancestors>
                           <BrowseNode>
                              <BrowseNodeId>560798</BrowseNodeId>
                              <Name>Electronics &amp; Photo</Name>
                           </BrowseNode>
                        </Ancestors>
                     </BrowseNode>
                  </Ancestors>
               </BrowseNode>
            </Ancestors>
         </BrowseNode>
      </Ancestors>
   </BrowseNode>
   <BrowseNode>
      <BrowseNodeId>340328031</BrowseNodeId>
      <Name>Car Chargers</Name>
      <Ancestors>
         <BrowseNode>
            <BrowseNodeId>340327031</BrowseNodeId>
            <Name>Chargers</Name>
            <Ancestors>
               <BrowseNode>
                  <BrowseNodeId>560826</BrowseNodeId>
                  <Name>Accessories</Name>
                  <Ancestors>
                     <BrowseNode>
                        <BrowseNodeId>1340509031</BrowseNodeId>
                        <Name>Mobile Phones &amp; Communication</Name>
                        <Ancestors>
                           <BrowseNode>
                              <BrowseNodeId>560800</BrowseNodeId>
                              <Name>Categories</Name>
                              <IsCategoryRoot>1</IsCategoryRoot>
                              <Ancestors>
                                 <BrowseNode>
                                    <BrowseNodeId>560798</BrowseNodeId>
                                    <Name>Electronics &amp; Photo</Name>
                                 </BrowseNode>
                              </Ancestors>
                           </BrowseNode>
                        </Ancestors>
                     </BrowseNode>
                  </Ancestors>
               </BrowseNode>
            </Ancestors>
         </BrowseNode>
      </Ancestors>
   </BrowseNode>
</BrowseNodes>

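4 answers:

Answer 0 (score: 1)

Using XPath is the easiest way to read data from an XML document. You can use one expression to iterate over the items and a few more expressions to extract the data for each item.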

$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);

$result = [];
foreach($xpath->evaluate('//BrowseNode[BrowseNodeId]') as $browseNode) {
  $id = $xpath->evaluate('string(BrowseNodeId)', $browseNode);
  if (array_key_exists($id, $result)) {
    continue;
  }
  $result[$id] = [
    'id' => $id,
    'name' => $xpath->evaluate('string(Name)', $browseNode)
  ];
}

var_dump($result);

Output:

array(9) {
  [1342630031]=>
  array(2) {
    ["id"]=>
    string(10) "1342630031"
    ["name"]=>
    string(8) "Chargers"
  }
  [389516011]=>
  array(2) {
    ["id"]=>
    string(9) "389516011"
    ["name"]=>
    string(11) "Accessories"
  }
  ...
}

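//BrowseNode[BrowseNodeId] fetches any BrowseNode element in the document that has a BrowseNodeId child element. string(BrowseNodeId) is evaluated in the context of that node: it selects all BrowseNodeId children and casts the first one to a string (or returns an empty string if no node is found).

Because the id is used as the array key, duplicates are removed.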
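
If the duplicates should be kept, as in the output the question asks for, the same XPath expressions can fill a plain list instead of a keyed array. The following is a minimal sketch under a few assumptions: the XML is available as a string, the sample document is shortened, and the helper name flattenBrowseNodes is made up for illustration.

```php
<?php
// Hypothetical helper: returns a list of [id, name] pairs in document order,
// keeping duplicates.
function flattenBrowseNodes(string $xml): array
{
    $document = new DOMDocument();
    $document->loadXML($xml);
    $xpath = new DOMXPath($document);

    $result = [];
    foreach ($xpath->evaluate('//BrowseNode[BrowseNodeId]') as $browseNode) {
        $result[] = [
            $xpath->evaluate('string(BrowseNodeId)', $browseNode),
            $xpath->evaluate('string(Name)', $browseNode),
        ];
    }
    return $result;
}

// Shortened sample input (an assumption; the real response nests deeper).
$xml = <<<'XML'
<BrowseNodes>
  <BrowseNode>
    <BrowseNodeId>1342630031</BrowseNodeId>
    <Name>Chargers</Name>
    <Ancestors>
      <BrowseNode>
        <BrowseNodeId>389516011</BrowseNodeId>
        <Name>Accessories</Name>
      </BrowseNode>
    </Ancestors>
  </BrowseNode>
</BrowseNodes>
XML;

$array = flattenBrowseNodes($xml);
echo $array[0][0], "\n"; // 1342630031
echo $array[1][1], "\n"; // Accessories
```

Each entry is a two-element array, so $array[n][0] is the id and $array[n][1] is the name, which matches the access pattern in the question.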

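Answer 1 (score: 0)

This is a bit ugly, but it turns the data into a structure I can work with. It is not the output I wanted, but it may be close enough to use.

// Convert the SimpleXMLElement to a nested array via JSON
$json = json_encode($xml);
$array = json_decode($json, TRUE);

// Visit every leaf value of the nested array
$it = new RecursiveIteratorIterator(new RecursiveArrayIterator($array));

$values = [];
foreach ($it as $v) {
    $values[] = $v;
}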
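
Note that the flat list built this way also contains the IsCategoryRoot values, so ids and names do not always line up in pairs. One possible alternative, sketched here and not part of the original answer, is to recurse over the SimpleXMLElement directly and collect one pair per BrowseNode. The helper name collectBrowseNodes and the shortened sample document are assumptions for illustration.

```php
<?php
// Hypothetical helper: follows BrowseNode -> Ancestors -> BrowseNode
// recursively and appends one [id, name] pair per node to $out.
function collectBrowseNodes(SimpleXMLElement $parent, array &$out): void
{
    foreach ($parent->BrowseNode as $node) {
        $out[] = [(string) $node->BrowseNodeId, (string) $node->Name];
        // Only descend when the node actually has an Ancestors child
        if (isset($node->Ancestors)) {
            collectBrowseNodes($node->Ancestors, $out);
        }
    }
}

// Shortened sample input (an assumption; the real response nests deeper).
$nodes = simplexml_load_string(
    '<BrowseNodes>'
    . '<BrowseNode><BrowseNodeId>560800</BrowseNodeId><Name>Categories</Name>'
    . '<IsCategoryRoot>1</IsCategoryRoot>'
    . '<Ancestors><BrowseNode><BrowseNodeId>560798</BrowseNodeId>'
    . '<Name>Electronics &amp; Photo</Name></BrowseNode></Ancestors>'
    . '</BrowseNode>'
    . '</BrowseNodes>'
);

$values = [];
collectBrowseNodes($nodes, $values);
echo $values[0][1], "\n"; // Categories
echo $values[1][1], "\n"; // Electronics & Photo
```

Because only BrowseNodeId and Name are read, extra elements such as IsCategoryRoot are ignored.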

Answer 2 (score: 0)

$DOM = new DOMDocument();
// The input is XML, so load it with loadXML() rather than loadHTML()
$DOM->loadXML($xml);

$XPATH = new DOMXPath($DOM);

// Gets all BrowseNodeId elements anywhere within the document
$r = $XPATH->query("//BrowseNodeId");

// Gets only the BrowseNodeId elements directly below a BrowseNode that is directly below BrowseNodes
$r = $XPATH->query("/BrowseNodes/BrowseNode/BrowseNodeId");

You will probably want to use the first XPath query to get all the Id elements.

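$r = $XPATH->query("//BrowseNodeId");

foreach ($r as $element) { // $element will be a DOMElement object
     // Walk the following siblings until the matching Name element is found
     for ($sibling = $element->nextSibling; $sibling !== null; $sibling = $sibling->nextSibling) {
          // Skip whitespace text nodes, which have no tagName
          if ($sibling instanceof DOMElement && "Name" == $sibling->tagName) {
                echo "The ID for " . $sibling->nodeValue . " is " . $element->nodeValue . "\n";
                break;
          }
     }
}

This should at least give you a starting point.

Untested.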

Answer 3 (score: 0)

Consider using XSLT to flatten the source XML, then iterate over the result to populate the array:

// Load the XML source and XSLT string
$doc = simplexml_load_file('Input.xml');

$xslstr = '<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
             <xsl:output version="1.0" encoding="UTF-8" indent="yes" />
             <xsl:strip-space elements="*"/>      
             <xsl:template match="/BrowseNodes">
                <xsl:copy>            
                   <xsl:apply-templates select="descendant::BrowseNodeId"/>
                </xsl:copy>
             </xsl:template>      
             <xsl:template match="BrowseNodeId">
                <data>            
                    <xsl:copy-of select="."/>
                    <xsl:copy-of select="following-sibling::Name"/>
                </data>
            </xsl:template>  
          </xsl:transform>';
$xsl = new SimpleXMLElement($xslstr);

// Configure and run the transformer
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl); 
$newXML = $proc->transformToXML($doc);

// Populate flattened array
$output = new SimpleXMLElement($newXML);

$values = [];
foreach ($output->data as $line){
    $inner = [];
    $inner[] = (string)$line->BrowseNodeId;
    $inner[] = (string)$line->Name;
    $values[] = $inner;
}

New XML

<?xml version="1.0" encoding="UTF-8"?>
<BrowseNodes>
  <data>
    <BrowseNodeId>1342630031</BrowseNodeId>
    <Name>Chargers</Name>
  </data>
  <data>
    <BrowseNodeId>389516011</BrowseNodeId>
    <Name>Accessories</Name>
  </data>
  <data>
    <BrowseNodeId>389514011</BrowseNodeId>
    <Name>Sat Nav, GPS, Navigation &amp; Accessories</Name>
  </data>
  <data>
    <BrowseNodeId>560800</BrowseNodeId>
    <Name>Categories</Name>
  </data>
  <data>
    <BrowseNodeId>560798</BrowseNodeId>
    <Name>Electronics &amp; Photo</Name>
  </data>
  <data>
    <BrowseNodeId>340328031</BrowseNodeId>
    <Name>Car Chargers</Name>
  </data>
  <data>
    <BrowseNodeId>340327031</BrowseNodeId>
    <Name>Chargers</Name>
  </data>
  <data>
    <BrowseNodeId>560826</BrowseNodeId>
    <Name>Accessories</Name>
  </data>
  <data>
    <BrowseNodeId>1340509031</BrowseNodeId>
    <Name>Mobile Phones &amp; Communication</Name>
  </data>
  <data>
    <BrowseNodeId>560800</BrowseNodeId>
    <Name>Categories</Name>
  </data>
  <data>
    <BrowseNodeId>560798</BrowseNodeId>
    <Name>Electronics &amp; Photo</Name>
  </data>
</BrowseNodes>

Values array

array(11) {
  [0]=>
  array(2) {
    [0]=>
    string(10) "1342630031"
    [1]=>
    string(8) "Chargers"
  }
  [1]=>
  array(2) {
    [0]=>
    string(9) "389516011"
    [1]=>
    string(11) "Accessories"
  }
  [2]=>
  array(2) {
    [0]=>
    string(9) "389514011"
    [1]=>
    string(38) "Sat Nav, GPS, Navigation & Accessories"
  }
  [3]=>
  array(2) {
    [0]=>
    string(6) "560800"
    [1]=>
    string(10) "Categories"
  }
  [4]=>
  array(2) {
    [0]=>
    string(6) "560798"
    [1]=>
    string(19) "Electronics & Photo"
  }
  [5]=>
  array(2) {
    [0]=>
    string(9) "340328031"
    [1]=>
    string(12) "Car Chargers"
  }
  [6]=>
  array(2) {
    [0]=>
    string(9) "340327031"
    [1]=>
    string(8) "Chargers"
  }
  [7]=>
  array(2) {
    [0]=>
    string(6) "560826"
    [1]=>
    string(11) "Accessories"
  }
  [8]=>
  array(2) {
    [0]=>
    string(10) "1340509031"
    [1]=>
    string(29) "Mobile Phones & Communication"
  }
  [9]=>
  array(2) {
    [0]=>
    string(6) "560800"
    [1]=>
    string(10) "Categories"
  }
  [10]=>
  array(2) {
    [0]=>
    string(6) "560798"
    [1]=>
    string(19) "Electronics & Photo"
  }
}