xml中具有相同名称的多个节点

时间:2013-02-05 04:39:50

标签: php xml cakephp dom

我在xml文件下面有这个: -

 <item> 
  <title>Troggs singer Reg Presley dies at 71</title>  
  <description>Reg Presley, the lead singer of British rock band The Troggs, whose hits in the 1960s included Wild Thing, has died aged 71.</description>  
  <link>http://www.bbc.co.uk/news/uk-21332048#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
  <guid isPermaLink="false">http://www.bbc.co.uk/news/uk-21332048</guid>  
  <pubDate>Tue, 05 Feb 2013 01:13:07 GMT</pubDate>  
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701366_65701359.jpg"/>  
  <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701387_65701359.jpg"/> 
</item>  
<item> 
  <title>Horsemeat found at Newry cold store</title>  
  <description>Horse DNA has been found in frozen meat in a cold store in Northern Ireland, as Irish police investigate a third case of contamination.</description>  
  <link>http://www.bbc.co.uk/news/world-europe-21331208#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
  <guid isPermaLink="false">http://www.bbc.co.uk/news/world-europe-21331208</guid>  
  <pubDate>Mon, 04 Feb 2013 23:47:38 GMT</pubDate>  
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700000_002950295-1.jpg"/>  
  <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700001_002950295-1.jpg"/> 
</item>  
<item> 
  <title>US 'will sue' Standard &amp; Poor's</title>  
  <description>Standard &amp; Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.</description>  
  <link>http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
  <guid isPermaLink="false">http://www.bbc.co.uk/news/21331018</guid>  
  <pubDate>Mon, 04 Feb 2013 22:45:52 GMT</pubDate>  
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701717_mediaitem65699884.jpg"/>  
  <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701718_mediaitem65699884.jpg"/> 
   </item>  

现在,当我将输入节点作为“项目”来检索数据时,而不是显示所有项目节点,它只显示最后一个项目节点.....

我的代码是: -

    $dom->load($url);
    $link = $dom->getElementsByTagName($tag_name);
    $value = array();

    for ($i = 0; $i < $link->length; $i++) {
        $childnode['name'] = $link->item($i)->nodeName;
        $childnode['value'] = $link->item($i)->nodeValue;
        $value[$childnode['name']] = $childnode['value'];
    }

这里,$ url是我的xml页面的url $ tag_name是节点的名称,在这种情况下它是“item”

我得到的输出是: -

  US 'will sue' Standard &amp; Poor's.Standard &amp; Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa.http://www.bbc.co.uk/news/world-europe-21331208.Mon, 04 Feb 2013 22:45:52 GMT

这是最后一个标签的数据。我想要所有商品标签的数据,我也希望数据采用这种格式: -

title :-  US 'will sue' Standard &amp; Poor's
description :- Standard &amp; Poor's says it is to be sued by the US government over 
the credit ratings agency's assessment of mortgage bonds before the financial crisis

我甚至想要输出中的childnodes(如果有的话)的名字...... 请帮帮我....

4 个答案:

答案 0 :(得分:2)

(不要忘记根节点。)看起来其中一个方法就是将该元素下的所有文本节点连接在一起(几乎相当于一个xsl:value-of select =。)。我从未在PHP中使用DOMDocument类和相关类做过多。但是你可以做的是使用C14N()方法规范化DOMNode,然后解析生成的字符串。它并不漂亮,但它可以获得您想要的结果并且易于扩展:

    $tag_name = 'item';
    $link = $dom->getElementsByTagName($tag_name);
    for ($i = 0; $i < $link->length; $i++) {
        $treeAsString = $link->item($i)->C14N();
        $curBranchParts = explode("\n",$treeAsString);
        $curBranchPartsSize = count($curBranchParts);
        $curBranchParts = explode("\n",$treeAsString);
        $curBranchPartsSize = count($curBranchParts);
        for ($j = 1; $j < ($curBranchPartsSize - 1); $j++) { 
            $curItem = $curBranchParts[$j];
            $curItemParts = explode('<', $curItem);
            $tagWithContent = $curItemParts[1];
            $tagWithContentParts = explode('>',$tagWithContent);
            $tag = $tagWithContentParts[0];
            $content = $tagWithContentParts[1];

            if (trim($content) != '') echo $tag . ' :- ' . $content . '<br />';
            else echo $tag . '<br />';   
        }
    }

答案 1 :(得分:2)

您似乎只是在'item'节点上循环,正如其他人提到的那样,它会覆盖每次迭代的前一个值。

如果使用print_r($ value)循环中调试$ value数组;

$dom->load($url);
$link = $dom->getElementsByTagName($tag_name);
$value = array();

for ($i = 0; $i < $link->length; $i++) {
    $childnode['name'] = $link->item($i)->nodeName;
    $childnode['value'] = $link->item($i)->nodeValue;
    $value[$childnode['name']] = $childnode['value'];

    echo 'iteration: ' . $i . '<br />';
    echo '<pre>'; print_r($value); echo '</pre>';
}

你可能会看到类似这样的东西

// iteration: 0
Array
(
    [item] => Troggs singer Reg Presley dies at 71 ......
)

// iteration: 1
Array
(
    [item] => Horsemeat found at Newry cold store .........
)

// iteration: 2
Array
(
    [item] => US 'will sue' Standard & Poor's .........
)

你应该做的是:

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->load($url);
$items = $dom->getElementsByTagName($tag_name);
$values = array();

foreach ($items as $item) {
    $itemProperties = array();

    // Loop through the 'sub' items 
    foreach ($item->childNodes as $child) {
        // Note: using 'localName' to remove the namespace
        if (isset($itemProperties[(string) $child->localName])) {
            // Quickfix to support multiple 'thumbnails' per item (although they have no content)
            $itemProperties[$child->localName] = (array) $itemProperties[$child->localName];
            $itemProperties[$child->localName][] = $child->nodeValue;
        } else {
            $itemProperties[$child->localName] = $child->nodeValue;
        }
    }

    // Append the item to the 'values' array
    $values[] = $itemProperties;

}


// Output the result
echo '<pre>'; print_r($values); echo '</pre>';

哪个输出:

Array
(
    [0] => Array
        (
            [title] => Troggs singer Reg Presley dies at 71
            [description] => Reg Presley, the lead singer of British rock band The Troggs, whose hits in the 1960s included Wild Thing, has died aged 71.
            [link] => http://www.bbc.co.uk/news/uk-21332048#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa
            [guid] => http://www.bbc.co.uk/news/uk-21332048
            [pubDate] => Tue, 05 Feb 2013 01:13:07 GMT
            [thumbnail] => Array
                (
                    [0] => 
                    [1] => 
                )

        )

    [1] => Array
        (
            [title] => Horsemeat found at Newry cold store
            [description] => Horse DNA has been found in frozen meat in a cold store in Northern Ireland, as Irish police investigate a third case of contamination.
            [link] => http://www.bbc.co.uk/news/world-europe-21331208#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa
            [guid] => http://www.bbc.co.uk/news/world-europe-21331208
            [pubDate] => Mon, 04 Feb 2013 23:47:38 GMT
            [thumbnail] => Array
                (
                    [0] => 
                    [1] => 
                )

        )

    [2] => Array
        (
            [title] => US 'will sue' Standard & Poor's
            [description] => Standard & Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.
            [link] => http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa
            [guid] => http://www.bbc.co.uk/news/21331018
            [pubDate] => Mon, 04 Feb 2013 22:45:52 GMT
            [thumbnail] => Array
                (
                    [0] => 
                    [1] => 
                )

        )

)

答案 2 :(得分:1)

您的问题是源XML需要有一个根节点(可以随意调用它)。要成为有效的XML,您始终需要一个根节点。也就是说,每个有效的XML文件都只有一个没有父元素或兄弟元素的元素。获得根节点后,您的XML将加载到您的对象中。

例如:

<root>
    <item> 
      <title>Troggs singer Reg Presley dies at 71</title>  
      <description>Reg Presley, the lead singer of British rock band The Troggs, whose hits in the 1960s included Wild Thing, has died aged 71.</description>  
      <link>http://www.bbc.co.uk/news/uk-21332048#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/uk-21332048</guid>  
      <pubDate>Tue, 05 Feb 2013 01:13:07 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701366_65701359.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701387_65701359.jpg"/> 
    </item>  
    <item> 
      <title>Horsemeat found at Newry cold store</title>  
      <description>Horse DNA has been found in frozen meat in a cold store in Northern Ireland, as Irish police investigate a third case of contamination.</description>  
      <link>http://www.bbc.co.uk/news/world-europe-21331208#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/world-europe-21331208</guid>  
      <pubDate>Mon, 04 Feb 2013 23:47:38 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700000_002950295-1.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700001_002950295-1.jpg"/> 
    </item>  
    <item> 
      <title>US 'will sue' Standard &amp; Poor's</title>  
      <description>Standard &amp; Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.</description>  
      <link>http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/21331018</guid>  
      <pubDate>Mon, 04 Feb 2013 22:45:52 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701717_mediaitem65699884.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701718_mediaitem65699884.jpg"/> 
    </item>
</root>

答案 3 :(得分:0)

我认为代码存在问题:

    for ($i = 0; $i < $link->length; $i++) {
        $childnode['name'] = $link->item($i)->nodeName;
        $childnode['value'] = $link->item($i)->nodeValue;
        $value[$childnode['name']] = $childnode['value'];
    } 

每次$childnode['name']for loop按新值分配,$i等于$link.length长度时,{/ 1}}此值将分配给$childnode array }。 因此,要减少问题,它应该是一个多维数组,如

for ($i = 0; $i < $link->length; $i++) {
    $childnode['name'][$i] = $link->item($i)->nodeName;
    $childnode['value'][$i] = $link->item($i)->nodeValue;
    $value[$childnode['name'][$i]][$i] = $childnode['value'];
}

测试它:print_r($childnode);