Question

I have seen a lot of tutorials here on Stack Overflow, but I cannot work out what I am missing, so I need some help.

I have an online XML file that I am trying to parse. It looks like this:

<products>
    <product>
        <id>13389</id>
        <name><![CDATA[ product name ]]></name>
        <category id="14"><![CDATA[ Shoes > test1 ]]></category>
        <price>41.30</price>
    </product>
</products>

So far I am reading the XML and parsing it like this:

$reader = new XMLReader();
$reader->open($product_xml_link);
while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'product') {
        $product = new SimpleXMLElement($reader->readOuterXml());
        $pid = $product->id;
        $name = $product->name;
        $name = strtolower($name);
        $link = $product->link;
        $price = $product->price; // element names are case-sensitive
        ...
        ...
    }
} // end while loop

As you can see, the category tag has an id attribute. That is the value I want to grab and use in my code.

I tried something like this:

echo "product= " . (string)$product->category->getAttribute('id');

The error I get is: Call to undefined method SimpleXMLElement::getAttribute()

I need this id so that I can test it before inserting the product into the DB. So:

if ($id == 600) {
    // insert into DB
}

Answer 1

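A few things here. First, $product = new SimpleXMLElement($reader->readOuterXml()); means that you serialize each product as a separate XML document and then parse it a second time. XMLReader has expand(), which returns a DOM node directly, and a DOM node can be imported into SimpleXML.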

For attributes, use array syntax:

$reader = new XMLReader();
$reader->open($product_xml_link);

// a document to expand to
$document = new DOMDocument();

// find the first product node
while ($reader->read() && $reader->localName !== 'product') {
  continue;
}

while ($reader->localName === 'product') {
  $product = simplexml_import_dom($reader->expand($document));
  $data = [
    'id' => (string)$product->id,
    'name' => (string)$product->name,
    'category_id' => (string)$product->category['id'],
    // ...
  ];
  var_dump($data);
  // move to the next product sibling
  $reader->next('product');
}
$reader->close();

Output:

array(3) {
  ["id"]=>
  string(5) "13389"
  ["name"]=>
  string(14) " product name "
  ["category_id"]=>
  string(2) "14"
}
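To connect this back to the check in the question, here is a minimal, self-contained sketch of reading the attribute with SimpleXML and comparing it. The XML string is sample data modelled on the question (with the closing </product> tag added), and the target value 14 is only an illustrative assumption.

```php
<?php
// Sample data modelled on the question's XML.
$xml = <<<XML
<products>
  <product>
    <id>13389</id>
    <name><![CDATA[ product name ]]></name>
    <category id="14"><![CDATA[ Shoes > test1 ]]></category>
    <price>41.30</price>
  </product>
</products>
XML;

$products = simplexml_load_string($xml);
$product = $products->product[0];

// Array syntax reads an attribute; the result is a SimpleXMLElement, so cast it.
$categoryId = (int) $product->category['id'];

// attributes() exposes the same value as a property.
$sameId = (int) $product->category->attributes()->id;

// isset() reports whether an attribute exists at all.
$hasFoo = isset($product->category['foo']);

// Compare with == or ===; a single = would assign instead of compare.
if ($categoryId === 14) {
    echo "category {$categoryId} matches\n";
}
```

Casting matters here: without the (int) or (string) cast the value stays a SimpleXMLElement object, which can give surprising results in strict comparisons.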

Of course, you can also use DOM directly and fetch the detail data with XPath expressions:

$reader = new XMLReader();
$reader->open($product_xml_link);

// prepare a document to expand to
$document = new DOMDocument();
// and an xpath instance to use
$xpath = new DOMXpath($document);

// find the first product node
while ($reader->read() && $reader->localName !== 'product') {
  continue;
}

while ($reader->localName === 'product') {
  $product = $reader->expand($document);
  $data = [
    'id' => $xpath->evaluate('string(id)', $product),
    'name' => $xpath->evaluate('string(name)', $product),
    'category_id' => $xpath->evaluate('string(category/@id)', $product),
    // ...
  ];
  var_dump($data);
  // move to the next product sibling
  $reader->next('product');
}
$reader->close();
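The same loop can also filter products before anything is written to the database. The following is a sketch under assumptions: the second product, the target category id 14 and the $matches array are made up for illustration, and the XML is passed as a string instead of a URL so the example runs on its own.

```php
<?php
// Two-product sample; the second product is invented for illustration.
$xml = <<<XML
<products>
  <product>
    <id>13389</id>
    <category id="14"><![CDATA[ Shoes > test1 ]]></category>
  </product>
  <product>
    <id>13390</id>
    <category id="15"><![CDATA[ Bags ]]></category>
  </product>
</products>
XML;

$reader = new XMLReader();
$reader->XML($xml); // string source here; open($url) is used for a remote file

$document = new DOMDocument();
$xpath = new DOMXPath($document);

$matches = [];

// advance to the first product element
while ($reader->read() && $reader->localName !== 'product') {
    continue;
}

while ($reader->localName === 'product') {
    $product = $reader->expand($document);
    // read the attribute as a string, then cast it for a numeric comparison
    $categoryId = (int) $xpath->evaluate('string(category/@id)', $product);
    if ($categoryId === 14) {
        // this is where the DB insert would go; here the id is only collected
        $matches[] = $xpath->evaluate('string(id)', $product);
    }
    $reader->next('product');
}
$reader->close();

var_dump($matches);
```

Only one product is expanded into a DOM node at a time, so memory use stays low even for a large feed.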

Answer 2

You want to loop over all products and extract the text content of the child elements id, name, link and price? That can be done like this:

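// loadXML() is an instance method; the input is XML, so the XML parser is used
$document = new DOMDocument();
$document->loadXML($xml);
foreach ($document->getElementsByTagName('product') as $product) {
    $vars = array('id', 'name', 'link', 'price');
    foreach ($vars as $v) {
        // a missing child element (for example <link>) yields null instead of an error
        $node = $product->getElementsByTagName($v)->item(0);
        ${$v} = $node ? $node->textContent : null;
    }
    unset($v, $vars);
    // now you have $id, $name, $link, $price as raw text, and $product is the DOMElement for the <product> tag.
}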

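If you only want to process id 600, add if($id!=600){continue;} after the unset(). If you want to save some CPU, you should in that case also add a break; at the end of the foreach loop (it stops looping as soon as id 600 has been found).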
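The continue/break pattern described above could look like the following sketch. The three products, their names and the $visited counter are assumptions added for illustration only.

```php
<?php
// Sample data; the ids (including 600) and names are illustrative.
$xml = '<products>'
     . '<product><id>599</id><name>first</name></product>'
     . '<product><id>600</id><name>second</name></product>'
     . '<product><id>601</id><name>third</name></product>'
     . '</products>';

$document = new DOMDocument();
$document->loadXML($xml);

$found = null;
$visited = 0;
foreach ($document->getElementsByTagName('product') as $product) {
    $visited++;
    $id = $product->getElementsByTagName('id')->item(0)->textContent;
    if ($id != 600) {
        continue; // not the product we are looking for
    }
    $found = $product->getElementsByTagName('name')->item(0)->textContent;
    break; // id 600 was found, so the remaining products are skipped
}

echo "found: {$found} after {$visited} products\n";
```

With this sample the third product is never visited, which is the CPU saving the break is meant to provide.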

Edit: fixed a code-breaking typo; the code would not work without the fix.

Edit: if you want to use XPath to find the right element, that would be $document = new DOMDocument(); $document->loadXML($xml); $product = (new DOMXPath($document))->query('//product/id[text()=\'600\']')->item(0)->parentNode;

Edit: fixed another code-breaking typo (items(0) -> item(0))
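One caveat with the XPath one-liner is that item(0) returns null when no product matches, so chaining ->parentNode would fail. A guarded variant might look like this sketch; findProductById is a hypothetical helper name, and the sample data is invented.

```php
<?php
// Sample data; ids and names are illustrative.
$xml = '<products>'
     . '<product><id>599</id><name>first</name></product>'
     . '<product><id>600</id><name>second</name></product>'
     . '</products>';

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

// Hypothetical helper: returns the <product> element with the given id, or null.
function findProductById(DOMXPath $xpath, int $id): ?DOMElement
{
    // the predicate selects the product itself, so no parentNode step is needed
    $node = $xpath->query("//product[id = {$id}]")->item(0);
    return $node instanceof DOMElement ? $node : null;
}

$hit = findProductById($xpath, 600);
$miss = findProductById($xpath, 42);

echo $hit ? $xpath->evaluate('string(name)', $hit) : 'not found', "\n";
echo $miss ? 'found' : 'not found', "\n";
```

Taking the id as an int keeps the interpolated XPath expression safe; a string id would need quoting and escaping.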
