Retrieve all elements in nested div tags using DOM

Asked: 2015-11-27 06:36:17

Tags: php xml class dom

<?xml version="1.0" encoding="utf-8" ?><rss version="2.0" xml:base="http://www.example.com/feeds/events.xml" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:og="http://ogp.me/ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:sioc="http://rdfs.org/sioc/ns#" xmlns:sioct="http://rdfs.org/sioc/types#" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
      <channel>
        <title>Event Calendar</title>
        <link>http://www.example.com/feeds/events.xml</link>
        <description></description>
        <language>en</language>

         <item>
        <title>Thanksgiving Break 2015</title>
        <link>http://www.example.com/event/42811211</link>
        <description>&lt;div class=&quot;field field-name-body field-type-text-with-summary field-label-hidden&quot;&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot; property=&quot;content:encoded&quot;&gt;&lt;p&gt;Happy Holidays.&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;field field-name-field-date field-type-datetime field-label-inline clearfix&quot;&gt;&lt;div class=&quot;field-label&quot;&gt;Date:&amp;nbsp;&lt;/div&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;&lt;span class=&quot;date-display-single&quot; property=&quot;dc:date&quot; datatype=&quot;xsd:dateTime&quot; content=&quot;2015-11-25T00:00:00-05:00&quot;&gt;Wednesday, November 25, 2015 (All day)&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;field-item odd&quot;&gt;&lt;span class=&quot;date-display-single&quot; property=&quot;dc:date&quot; datatype=&quot;xsd:dateTime&quot; content=&quot;2015-11-26T00:00:00-05:00&quot;&gt;Thursday, November 26, 2015 (All day)&lt;/span&gt;&lt;/div&gt;&lt;div class=&quot;field-item even&quot;&gt;&lt;span class=&quot;date-display-single&quot; property=&quot;dc:date&quot; datatype=&quot;xsd:dateTime&quot; content=&quot;2015-11-27T00:00:00-05:00&quot;&gt;Friday, November 27, 2015 (All day)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;field field-name-field-location field-type-text field-label-inline clearfix&quot;&gt;&lt;div class=&quot;field-label&quot;&gt;Location:&amp;nbsp;&lt;/div&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;Blacksburg, VA&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</description>
         <pubDate>Wed, 19 Aug 2015 17:20:12 +0000</pubDate>
        <dc:creator>Cronus</dc:creator>
        <guid isPermaLink="false">311191 at http://www.example.com</guid>
        <category domain="http://www.example.com/event-categories/othermiscellaneous">Other/Miscellaneous</category>
      </item>
      </channel>
    </rss>

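I am trying to extract all the elements embedded in the nested div tags of the XML above into an array, so that I can pull out each one individually by type, for example Date and Location. The ideal output would look like: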

Title : Thanksgiving Break 2015
Link  : http://www.example.com/event/42811211
Description : Happy Holidays.
Date  : Wednesday, November 25, 2015 (All day), 
        Thursday, November 26, 2015 (All day), 
        Friday, November 27, 2015 (All day)
Location : Blacksburg, VA

I am new to PHP and DOM, and I would sincerely appreciate help with this code. This is what I have so far:

<?php

$rss    = simplexml_load_file('http://www.example.com/feeds/events.xml');
$html   = "";
$dom    = new DOMDocument(); // the HTML parser used for descriptions' HTML

 foreach ($rss->channel->item as $item) {
     $title         = $item->title;
     $link          = $item->link;
     $description   = $item->description;

     foreach ($description as $desc)
    {
        $dom->loadHTML($desc);
        $html = simplexml_import_dom($dom)->body;
        // ?????
    }        

     $html .= "Title : $title <br /> Link : $link <br /> Description : $description <br /> Date : <br /> Location : <hr>";
}    
echo $html;

?>

Thanks in advance!

2 Answers:

Answer 0 (score: 0)

Try this:

$xmlfile = 'http://www.example.com/feeds/events.xml';
$xml = simplexml_load_file($xmlfile) or die("Error: Cannot create object");

// <item> elements sit under <channel>, not directly under the <rss> root
foreach ($xml->channel->item as $item) {
    $title       = (string) $item->title;
    $link        = (string) $item->link;
    // The description is one string of escaped HTML; it has no
    // title/link/description child elements to loop over
    $description = (string) $item->description;
}

$document = new DOMDocument();
$document->load($xmlfile);

// this will also output doctype and comments at top level
$result = '';
foreach ($document->childNodes as $node) {
    $result .= $document->saveXML($node) . "\n";
}

echo $result;
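
The loop above can only give you the description as a whole. Here is a minimal sketch of why, assuming a hypothetical one-item feed trimmed from the sample in the question: SimpleXML decodes the entities and returns the description as a single HTML string, and the `<description>` element itself has no child elements. The variable names `$feed`, `$descriptionHtml` and `$childCount` are only for illustration.

```php
<?php
// Hypothetical one-item feed, trimmed from the sample in the question.
$feed = <<<'XML'
<rss version="2.0">
  <channel>
    <item>
      <title>Thanksgiving Break 2015</title>
      <description>&lt;div class=&quot;field-item even&quot;&gt;&lt;p&gt;Happy Holidays.&lt;/p&gt;&lt;/div&gt;</description>
    </item>
  </channel>
</rss>
XML;

$xml  = simplexml_load_string($feed);
$item = $xml->channel->item[0];

// SimpleXML decodes &lt; and &quot;, so the cast yields a plain HTML string.
$descriptionHtml = (string) $item->description;

// The <description> element contains text only, so it has no child elements.
$childCount = count($item->description->children());

echo $descriptionHtml, PHP_EOL;
echo "child elements: $childCount", PHP_EOL;
```

So the nested divs only become nodes once that string is handed to an HTML parser such as `DOMDocument::loadHTML()`.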

Answer 1 (score: 0)

Use simplexml, DOMDocument, and XPath:

$rss    = simplexml_load_file('http://www.example.com/feeds/events.xml');

foreach($rss->channel->item as $item){

    print 'Title: ' . $item->title . PHP_EOL;
    print 'Link: ' . $item->link . PHP_EOL;

    $dom = new DOMDocument();
    $dom->loadHTML((string) $item->description); // loadHTML() expects a string, so cast the SimpleXMLElement

    $xpath = new DOMXpath($dom);

    $description = $xpath->query("//div[contains(@class, 'field-name-body')]");
    print 'Description: ' . $description->item(0)->nodeValue . PHP_EOL;

    $date = $xpath->query("//div[contains(@class, 'field-name-field-date')]");
    print 'Date: ' . $date->item(0)->nodeValue . PHP_EOL;

    $location = $xpath->query("//div[contains(@class, 'field-name-field-location')]/div/div");
    print 'Location: ' . $location->item(0)->nodeValue . PHP_EOL;   
}
/*
will output

Title: Thanksgiving Break 2015
Link: http://www.example.com/event/42811211
Description: Happy Holidays.
Date: Date: Wednesday, November 25, 2015 (All day)Thursday, November 26, 2015 (All day)Friday, November 27, 2015 (All day)
Location: Blacksburg, VA

*/
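
The code above prints the dates run together because it reads the `nodeValue` of the whole date field, label included. If you want each date as its own array entry, as in the question's ideal output, one possible approach is to query the individual `field-item` divs inside each field. The following is only a sketch under some assumptions: the description HTML is a hypothetical, shortened copy of the one in the feed, and `fieldItems()` is a helper name made up for this example, not part of any library.

```php
<?php
// Hypothetical, shortened description HTML modelled on the feed in the question.
$descriptionHtml = '<div class="field field-name-body"><div class="field-items">'
    . '<div class="field-item even"><p>Happy Holidays.</p></div></div></div>'
    . '<div class="field field-name-field-date"><div class="field-label">Date:&nbsp;</div>'
    . '<div class="field-items">'
    . '<div class="field-item even"><span class="date-display-single">Wednesday, November 25, 2015 (All day)</span></div>'
    . '<div class="field-item odd"><span class="date-display-single">Thursday, November 26, 2015 (All day)</span></div>'
    . '</div></div>'
    . '<div class="field field-name-field-location"><div class="field-label">Location:&nbsp;</div>'
    . '<div class="field-items"><div class="field-item even">Blacksburg, VA</div></div></div>';

// Hypothetical helper: returns the trimmed text of every "field-item" div
// nested inside the div that carries the class token $fieldClass.
function fieldItems(DOMXPath $xpath, string $fieldClass): array
{
    // Matching on a space-padded class list selects whole class tokens,
    // so "field-item" does not also match the "field-items" wrapper.
    $query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' $fieldClass ')]"
           . "//div[contains(concat(' ', normalize-space(@class), ' '), ' field-item ')]";

    $values = [];
    foreach ($xpath->query($query) as $node) {
        $values[] = trim($node->textContent);
    }
    return $values;
}

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$dom->loadHTML($descriptionHtml);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$event = [
    'description' => fieldItems($xpath, 'field-name-body'),
    'date'        => fieldItems($xpath, 'field-name-field-date'),
    'location'    => fieldItems($xpath, 'field-name-field-location'),
];

echo 'Description : ', implode(', ', $event['description']), PHP_EOL;
echo 'Date : ', implode(', ', $event['date']), PHP_EOL;
echo 'Location : ', implode(', ', $event['location']), PHP_EOL;
```

Inside the `foreach` over `$rss->channel->item` you would pass `(string) $item->description` to `loadHTML()` instead of the sample string. Because the "Date:" label lives in a `field-label` div, it is not picked up, and `$event['date']` holds one entry per date.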