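I'm parsing a large (roughly 1.5 MB) XML file with PHP. The nodes I care about sit about two levels deep, and for each one I want to extract certain values.
I was hoping to use SimplePie for this, but from what I've read, XMLReader seems to be the best approach. I've never used XMLReader before and am testing this example. Unfortunately, it isn't working for me.
Here is (part of) the XML: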
<?xml version="1.0" encoding="UTF-8"?>
<comicinfo>
<comiclist>
<comic>
<id>117</id>
<index>1</index>
<mainsection>
<pagecount>33</pagecount>
<credits>
<credit>
<role id="dfPenciler">Penciller</role>
<roleid>dfPenciler</roleid>
<person>
<displayname>Jim Lawson</displayname>
<sortname>Jim Lawson</sortname>
</person>
</credit>
<credit>
<role id="dfWriter">Writer</role>
<roleid>dfWriter</roleid>
<person>
<displayname>Peter Laird</displayname>
<sortname>Peter Laird</sortname>
</person>
</credit>
</credits>
<characters/>
<series>
<displayname>Teenage Mutant Ninja Turtles</displayname>
<sortname>Teenage Mutant Ninja Turtles</sortname>
<complete>No</complete>
<bpseriesid>0</bpseriesid>
</series>
</mainsection>
<collectionstatus listid="3">In Collection</collectionstatus>
<rare boolvalue="0">No</rare>
<coverfront>/Data/Images/tmnt_2.jpg</coverfront>
<format>
<displayname>Standard Comic Format</displayname>
<sortname>Standard Comic Format</sortname>
</format>
<publisher>
<displayname>Mirage Studios</displayname>
<sortname>Mirage Studios</sortname>
</publisher>
<country>
<displayname>USA</displayname>
<sortname>USA</sortname>
</country>
<language>
<displayname>English</displayname>
<sortname>English</sortname>
</language>
<store>
<displayname>All About Books &amp; Comics</displayname>
<sortname>All About Books &amp; Comics</sortname>
</store>
<purchaseprice>$2.95</purchaseprice>
<coverprice>$2.95</coverprice>
<purchasedate>
<year>
<displayname>2003</displayname>
</year>
<month>1</month>
<date>January 2003</date>
</purchasedate>
<condition>
<displayname>Near Mint</displayname>
<sortname>094 Near Mint</sortname>
<lastname>094 Near Mint</lastname>
</condition>
<issuenr>2</issuenr>
<publicationdate>
<year>
<displayname>2002</displayname>
</year>
<month>2</month>
<date>February 2002</date>
</publicationdate>
<genres>
<genre>
<displayname>Science Fiction</displayname>
<sortname>Science Fiction</sortname>
</genre>
</genres>
<tags/>
<links/>
<lastmodified>
<date>10/4/2007 6:17:29 AM</date>
</lastmodified>
<thumbfilepath>/Thumbnails/6108a98d11f81eee6dbd2a67c20b1650.jpg</thumbfilepath>
<sections/>
<seriesgroup>
<displayname>Other</displayname>
<sortname>Other</sortname>
</seriesgroup>
<issue>2</issue>
<quantity>1</quantity>
<bpcomicid>0</bpcomicid>
<bpcomiclastreceivedrevision>0</bpcomiclastreceivedrevision>
<bpseriesid>0</bpseriesid>
<wraparoundcover boolvalue="0">No</wraparoundcover>
<seriefirstletter>
<displayname>T</displayname>
<sortname>T</sortname>
</seriefirstletter>
<allcreators>Jim Lawson; Peter Laird</allcreators>
<submissiondate/>
<releasedate/>
<readingdate/>
<readtimes>0</readtimes>
<readit>No</readit>
</comic>
</comiclist>
</comicinfo>
Here is the PHP I'm using:
<?php
$z = new XMLReader;
$z->open('comiclist.xml');
$doc = new DOMDocument;
while ($z->read() && $z->name !== 'comic');
while ($z->name === 'comic')
{
$node = simplexml_import_dom($doc->importNode($z->expand(), true));
var_dump($node->element_1);
$z->next('comic');
}
?>
This is what it outputs:
object(SimpleXMLElement)#3 (0) { } object(SimpleXMLElement)#4 (0) { }
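This repeats over and over, once for every node. What am I doing wrong, and is there a better way to do what I'm trying to accomplish?
Answer 0 (score: 0)
I managed to solve this myself.
After several hours of trial and error (and research), I figured out how to do what I wanted. I've posted my test code below for others. It prints three values for each "comic" node: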
<?php
$xml = simplexml_load_file('comiclist.xml');
foreach ($xml->comiclist->comic as $comic) {
echo $comic->mainsection->series->displayname . ' #' . $comic->issuenr . ' is ID number: ' . $comic->id . '<br />';
}
?>
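For anyone who still wants the streaming XMLReader approach from the question, here is a minimal sketch of how that loop could be adapted. As far as I can tell, the empty var_dump output comes from `element_1`, which is a placeholder name in the example and not a child of `<comic>`, so SimpleXML returns an empty element for it. The function name `readComics` and the array keys are my own, not part of the original code, and the sketch assumes the structure shown in the sample XML (`id`, `issuenr`, and `mainsection/series/displayname` under each `<comic>`).

```php
<?php
// Hypothetical helper (not from the original post): streams <comic> elements
// one at a time instead of loading the whole document with simplexml_load_file().
function readComics(string $path): array
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }

    $doc = new DOMDocument();
    $rows = [];

    // Advance to the first <comic> start tag, if there is one.
    $found = false;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'comic') {
            $found = true;
            break;
        }
    }

    while ($found) {
        // Expand only the current <comic> subtree and wrap it in SimpleXML.
        $comic = simplexml_import_dom($doc->importNode($reader->expand(), true));

        // Use the real child element names rather than the placeholder element_1.
        $rows[] = [
            'id'     => (string) $comic->id,
            'series' => (string) $comic->mainsection->series->displayname,
            'issue'  => (string) $comic->issuenr,
        ];

        // Jump to the next <comic> sibling, skipping the subtree just read.
        $found = $reader->next('comic');
    }

    $reader->close();
    return $rows;
}
```

Calling `readComics('comiclist.xml')` and looping over the result with `echo $row['series'] . ' #' . $row['issue'] . ' is ID number: ' . $row['id'] . '<br />';` should give the same lines as the SimpleXML version above. For a 1.5 MB file either approach is probably fine; the streaming version mainly matters if the file grows much larger.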