Searching for data in a very large XML feed with PHP

Date: 2012-08-30 05:03:48

Tags: php xml

http://www.topbuy.com.au/tbcart/tbadmin/datafeed/shoppingcom.xml

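This is my feed URL. It never finishes opening in a browser, or it takes hours to open.

I want to create a PHP page with a search box (I will add the box myself). Once the form is submitted, my PHP code should search the feed and display the PRICE of the matching product.

So should I use SimpleXML for this, or is there a better suggestion?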

<Products>
<Product><MPN><![CDATA[INK-PE-009]]></MPN><Manufacturer><![CDATA[Epson]]></Manufacturer><ProductName><![CDATA[Epson T009 Colour Compatible Inkjet Cartridge]]></ProductName><ProductURL><![CDATA[http://www.topbuy.com.au/tbcart/pc/Epson-T009-Colour-Compatible-Inkjet-Cartridge-p3343.htm?utm_source=TopBuy_ShoppingCom&utm_content=&utm_medium=cpc&dismode=1&utm_campaign=TBDF-XX10421]]></ProductURL><ProductType><![CDATA[Compatible Ink Cartridges]]></ProductType><ImageURL><![CDATA[http://www2.topbuy.com.au/tbcart/pc/catalog/General/TBDF-XX10421_1.jpg]]></ImageURL><Price>4.09</Price><OriginalPrice>9</OriginalPrice><Category><![CDATA[Consumables->Compatible Ink Cartridges]]></Category><ProductDescription><![CDATA[$4.05 Cash Price see store for detailsRelated Brand  EpsonOriginal Cartridge Equivalent T009Related Printers STYLUS 1270, STYLUS 1280, STYLUS 1290, STYLUS 3300C, STYLUS PHOTO 1270, STYLUS PHOTO 1290, STYLUS PHOTO 1290 silverThis cartridge works in the following printers  Epson Stylus Photo 1270/1280Please check the name (code) of the cartridge in your printer before ordering to ensure that it matches the name of the cartridges you are ordering from us. In some instances a printer can take more than one cartridge type and ...]]></ProductDescription><Stock>Y</Stock><ShippingCost>10</ShippingCost><StockDescription>No.1 OZ SUPERSTORE AUS WARRANTY FAST SHIPPING</StockDescription><Condition>Brand New</Condition></Product>
<Product><MPN><![CDATA[INK-PE-013]]></MPN><Manufacturer><![CDATA[Epson]]></Manufacturer><ProductName><![CDATA[Epson T013 Black Compatible Inkjet Cartridge]]></ProductName><ProductURL><![CDATA[http://www.topbuy.com.au/tbcart/pc/Epson-T013-Black-Compatible-Inkjet-Cartridge-p3345.htm?utm_source=TopBuy_ShoppingCom&utm_content=&utm_medium=cpc&dismode=1&utm_campaign=TBDF-XX10423]]></ProductURL><ProductType><![CDATA[Compatible Ink Cartridges]]></ProductType><ImageURL><![CDATA[http://www2.topbuy.com.au/tbcart/pc/catalog/General/TBDF-XX10423_1.jpg]]></ImageURL><Price>2.09</Price><OriginalPrice>5</OriginalPrice><Category><![CDATA[Consumables->Compatible Ink Cartridges]]></Category><ProductDescription><![CDATA[$2.07 Cash Price see store for detailsRelated Brand  EpsonOriginal Cartridge Equivalent T013Related Printers STYLUS COLOR 480, STYLUS COLOR 580, STYLUS COLOR C20, STYLUS COLOR C40, STYLUS COLOUR 480, STYLUS COLOUR 580, STYLUS COLOUR C20UX, STYLUS COLOUR C40SX, STYLUS COLOUR C40UXThis cartridge works in the following printers  Epson Stylus Colour 480/580Please check the name (code) of the cartridge in your printer before ordering to ensure that it matches the name of the cartridges you are ordering from us. In some instances a ...]]></ProductDescription><Stock>Y</Stock><ShippingCost>10</ShippingCost><StockDescription>No.1 OZ SUPERSTORE AUS WARRANTY FAST SHIPPING</StockDescription><Condition>Brand New</Condition></Product>

1 answer:

Answer 0 (score: 2)

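I suggest using the XMLReader class, because it does not need to load the complete XML into memory at once.

The downside is that you have to implement searching and filtering by hand. For a simple name search, you can do something like this: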

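$reader = new XMLReader;
$reader->open('shoppingcom.xml');

while ($reader->read()) {
    // react only to an opening <Product> tag, not to the closing one
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'Product') {
        // serialize the whole product before descending into its children
        $productxml = $reader->readOuterXML();
        while ($reader->read()) {
            if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'ProductName'
                && stristr($reader->readInnerXML(), 'adidas')) {
                print $productxml;
                // $productxml now contains the <Product>...</Product> fragment of the xml
                // you can use simplexml on this fragment, or just add an if for the Price node
            }
            // stop the inner loop at the closing </Product> tag
            if ($reader->name == 'Product' && $reader->nodeType == XMLReader::END_ELEMENT) {
                break;
            }
        }
    }
}
$reader->close();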
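
As a follow-up to the comment about using SimpleXML on the fragment, here is a minimal sketch of how the price could be pulled out of each matching product. The function name `findPrices` and the shape of the returned array are assumptions made for illustration; the element names `Product`, `ProductName` and `Price` come from the sample feed in the question.

```php
<?php
// Hypothetical helper (not from the original answer): streams the feed and
// returns the name and price of every product whose name contains $needle.
function findPrices(string $file, string $needle): array
{
    $reader = new XMLReader();
    if (!$reader->open($file)) {
        throw new RuntimeException("Cannot open $file");
    }

    $matches = [];
    while ($reader->read()) {
        // Only opening <Product> tags are interesting.
        if ($reader->nodeType !== XMLReader::ELEMENT || $reader->name !== 'Product') {
            continue;
        }
        // Parse just this one product; the rest of the feed is not held in memory.
        $product = simplexml_load_string($reader->readOuterXml());
        if ($product === false) {
            continue; // skip a fragment that fails to parse
        }
        $name = (string) $product->ProductName;
        if (stripos($name, $needle) !== false) {
            $matches[] = ['name' => $name, 'price' => (float) $product->Price];
        }
    }
    $reader->close();

    return $matches;
}
```

A search page could then call something like `findPrices('shoppingcom.xml', $_GET['q'])` and print the `price` of each returned entry. Every request still reads the whole file, so this keeps memory use low but does not make the search fast.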

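That said, the sample feed is a solid 64 MB, so you would probably get better real-time search results by building a database for it that indexes the fields you want to search and returns the XML fragment. That way you do not have to normalize everything into tables.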
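
One way to sketch the database idea is to import the feed once into SQLite and search that instead of the XML. This assumes the `pdo_sqlite` extension is available; the table name `products`, its columns, and the function names `buildIndex` and `searchIndex` are hypothetical choices for illustration, not something the answer prescribes.

```php
<?php
// Hypothetical one-off import: stream the feed and store one row per product,
// keeping the searchable fields as columns and the raw fragment as text.
function buildIndex(PDO $db, string $file): int
{
    $db->exec('CREATE TABLE IF NOT EXISTS products (name TEXT, price REAL, xml TEXT)');
    $insert = $db->prepare('INSERT INTO products (name, price, xml) VALUES (?, ?, ?)');

    $reader = new XMLReader();
    if (!$reader->open($file)) {
        throw new RuntimeException("Cannot open $file");
    }

    $count = 0;
    $db->beginTransaction(); // one transaction keeps the bulk insert fast
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'Product') {
            $xml = $reader->readOuterXml();
            $product = simplexml_load_string($xml);
            if ($product === false) {
                continue; // skip a fragment that fails to parse
            }
            $insert->execute([(string) $product->ProductName, (float) $product->Price, $xml]);
            $count++;
        }
    }
    $db->commit();
    $reader->close();

    return $count;
}

// Hypothetical search: substring match on the product name.
function searchIndex(PDO $db, string $needle): array
{
    $stmt = $db->prepare('SELECT name, price, xml FROM products WHERE name LIKE ?');
    $stmt->execute(['%' . $needle . '%']);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

The import would be rerun whenever the feed changes, and the search page would only call `searchIndex`. A `LIKE` pattern with a leading wildcard scans the table rather than using an ordinary index, which is usually still far cheaper than reparsing the XML on each request; a full-text index would be the next step if that scan becomes too slow.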