PHP, DOMXPath: invalid item(x) return value

Asked: 2015-02-19 12:18:49

Tags: php dom domxpath

Simple, but... we have this PHP code:

$oPath = new \DOMXPath($this->oHtmlProperty);
$oNode = $oPath->query('//div[@class="product-spec__body"]');

foreach ($oNode as $oNodeProperty) {
    $oListTitle = $oPath->query('h2[@class="title title_size_22"]', $oNodeProperty);

    // ### VARIANT 1 (error with message 'Trying to get property of non-object')

    // $aPropertyGroup = [
    //     'title' => $oListTitle->item(0)->textContent,
    //     'property' => []
    // ];

    // ### VARIANT 2

    foreach ($oListTitle as $oListTitleItem){
        $aPropertyGroup = [
             'title' => $oListTitleItem->textContent,
             'property' => []
        ];

        break; // we need only first item
   }

// ....

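In the main content, $oListTitle always contains a node at ->item(0) and nothing more. When we try to access it, we get an error with the message 'Trying to get property of non-object', yet this node exists! When we do the same thing by iterating (which yields the same node class that ->item(x) returns), we get what we need.

Can anyone explain why? XD

Added:

$oListTitle is: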

object(DOMNodeList)#340 (1) { ["length"]=> int(1) } 

Added:

var_dump($oListTitle->item(0)); returns this:

object(DOMElement)#338 (18) { ["tagName"]=> string(2) "h2" ["schemaTypeInfo"]=> NULL ["nodeName"]=> string(2) "h2" ["nodeValue"]=> string(45) "Основные характеристики" ["nodeType"]=> int(1) ["parentNode"]=> string(22) "(object value omitted)" ["childNodes"]=> string(22) "(object value omitted)" ["firstChild"]=> string(22) "(object value omitted)" ["lastChild"]=> string(22) "(object value omitted)" ["previousSibling"]=> NULL ["nextSibling"]=> string(22) "(object value omitted)" ["attributes"]=> string(22) "(object value omitted)" ["ownerDocument"]=> string(22) "(object value omitted)" ["namespaceURI"]=> NULL ["prefix"]=> string(0) "" ["localName"]=> string(2) "h2" ["baseURI"]=> NULL ["textContent"]=> string(45) "Основные характеристики" }

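In other words, it is not empty and it exists.

2 answers:

Answer 0 (score: 1)

I could not reproduce the problem with PHP 5.6.3 / win32 and the following code (your code plus some boilerplate):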

<?php
$foo = new Foo;
var_export($foo->bar());

class Foo {
    // Declared explicitly; dynamic properties are deprecated as of PHP 8.2.
    private $oHtmlProperty;

    public function __construct() {
        $this->oHtmlProperty = new DOMDocument;
        $this->oHtmlProperty->loadhtml('<html><head><title>...</title></head><body>
    <div class="product-spec__body">
        <h2 class="title title_size_22">h2_1</h2>
        <h2 class="title title_size_22">h2_2</h2>
    </div>
    <div></div>
    <div class="product-spec__body">
        <h2 class="title title_size_22">h2_3</h2>
        <h2 class="title title_size_22">h2_4</h2>
    </div>
</body></html>');
    }

    public function bar() {
        $retval = array(); $aPropertyGroup = array();
        $oPath = new \DOMXPath($this->oHtmlProperty);
        $oNode = $oPath->query('//div[@class="product-spec__body"]');

        foreach ($oNode as $oNodeProperty) {
            $oListTitle = $oPath->query('h2[@class="title title_size_22"]', $oNodeProperty);
            // ### VARIANT 1 (error with message 'Trying to get property of non-object')
            if ( !is_object($oListTitle) ) die('$oListTitle is not an object');
            if ( ! ($oListTitle instanceof DOMNodeList) ) die('$oListTitle is not a DOMNodeList');
            if ( $oListTitle->length < 1 ) die('oListTitle->length < 1');
            $node = $oListTitle->item(0);
            if ( is_null($node) ) die('$node is NULL');
            if ( !is_object($node) ) die('$node is not an object');
            if ( ! ($node instanceof DOMNode) ) die('$node is not a DOMNode');

            $aPropertyGroup = [
                'title' => $oListTitle->item(0)->textContent,
                'property' => []
            ];

            if ( !empty($aPropertyGroup) ) {
                $retval[] = $aPropertyGroup;
                $aPropertyGroup = array();
            }
        } 

        return $retval;
    }
}

Output:

array (
  0 => 
  array (
    'title' => 'h2_1',
    'property' => 
    array (
    ),
  ),
  1 => 
  array (
    'title' => 'h2_3',
    'property' => 
    array (
    ),
  ),
)

as expected. But maybe libxml_get_last_error() can tell you more...
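A minimal sketch of how the libxml messages could be inspected, assuming the document is loaded from an HTML string. The sample markup is made up and not from the question. libxml_use_internal_errors(true) collects parser messages instead of emitting PHP warnings, and libxml_get_errors() returns them as LibXMLError objects.

```php
<?php
// Collect libxml parser messages instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
// Made-up markup with an unknown tag, which many libxml2 versions
// report as an error; whether it is reported depends on the version.
$dom->loadHTML('<html><body><foo>text</foo></body></html>');

$errors = libxml_get_errors();
foreach ($errors as $error) {
    // Each entry is a LibXMLError with level, code, line and message.
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// The most recent error only, or false if nothing was recorded.
$last = libxml_get_last_error();

libxml_clear_errors();
libxml_use_internal_errors($previous); // restore the previous setting
```

An empty list here would suggest that the parser accepted the markup, so the cause would have to be looked for in the XPath expressions or in the document itself.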

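Answer 1 (score: 1)

You have two expressions, so the first one may match several items, and the inner expression can return a different result for each outer match. You only set a single variable, so it is filled only when the outer match actually contains the needed element; for any other outer match the node list is empty and item(0) returns NULL.

You did not provide the HTML, so the error cannot really be reproduced.

But if you are using DOMNodeList::item(), you should always verify that the return value is a node.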
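A minimal sketch of such a check. The helper name firstText() and the sample markup are assumptions for illustration, not part of the question. DOMNodeList::item() returns NULL when the index is out of range, so the result is tested before textContent is read.

```php
<?php
// Hypothetical helper (the name is an assumption): returns the text of the
// first node matched by $expression relative to $context, or null if
// nothing matched.
function firstText(DOMXPath $xpath, $expression, DOMNode $context)
{
    $list = $xpath->query($expression, $context);
    if (!($list instanceof DOMNodeList)) {
        return null; // query() returns false for an invalid expression
    }
    $node = $list->item(0); // null when the list is empty
    return ($node instanceof DOMNode) ? $node->textContent : null;
}

// Made-up sample markup: the second container has no matching h2.
$dom = new DOMDocument();
$dom->loadHTML('<html><body>
  <div class="product-spec__body"><h2 class="title title_size_22">h2_1</h2></div>
  <div class="product-spec__body"></div>
</body></html>');
$xpath = new DOMXPath($dom);

$titles = [];
foreach ($xpath->query('//div[@class="product-spec__body"]') as $spec) {
    $titles[] = firstText($xpath, 'h2[@class="title title_size_22"]', $spec);
}
var_dump($titles);
```

The second container yields null instead of triggering 'Trying to get property of non-object'.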

Here are two possible optimizations:

  1. Limit the result to the first node:
    h2[@class="title title_size_22"][1]
  2. Fetch the text content of the first node as a string (works only with DOMXPath::evaluate()):
    string(h2[@class="title title_size_22"])

Example:

    $html = <<<'HTML'
    <html><head><title>...</title></head><body>
        <div class="product-spec__body">
            <h2 class="title title_size_22">h2_1</h2>
            <h2 class="title title_size_22">h2_2</h2>
        </div>
        <div></div>
        <div class="product-spec__body">
        </div>
    </body></html>
    HTML;
    
    $dom = new DOMDocument();
    $dom->loadHtml($html);
    $xpath = new DOMXpath($dom);
    
    foreach ($xpath->evaluate('//div[@class="product-spec__body"]') as $index => $spec) {
      echo "Run #", $index, "\n";
      // all h2 with the class
      var_dump($xpath->evaluate('h2[@class="title title_size_22"]', $spec));
      // first h2 with the class
      var_dump($xpath->evaluate('h2[@class="title title_size_22"][1]', $spec));
      // first h2 with the class as string
      var_dump($xpath->evaluate('string(h2[@class="title title_size_22"])', $spec));
      echo "\n\n";
    }
    

    Output (compare the results of the two runs):

    Run #0
    object(DOMNodeList)#9 (1) {
      ["length"]=>
      int(2)
    }
    object(DOMNodeList)#8 (1) {
      ["length"]=>
      int(1)
    }
    string(4) "h2_1"
    
    
    Run #1
    object(DOMNodeList)#8 (1) {
      ["length"]=>
      int(0)
    }
    object(DOMNodeList)#8 (1) {
      ["length"]=>
      int(0)
    }
    string(0) ""
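Applied to the loop from the question, the string() variant could look like the sketch below. The sample markup is made up, the variable names are illustrative, and skipping containers without a title is an assumption about the desired behavior.

```php
<?php
// Made-up sample markup: the second container has no h2 at all.
$dom = new DOMDocument();
$dom->loadHTML('<html><body>
  <div class="product-spec__body">
    <h2 class="title title_size_22">h2_1</h2>
    <h2 class="title title_size_22">h2_2</h2>
  </div>
  <div class="product-spec__body"></div>
</body></html>');
$xpath = new DOMXPath($dom);

$groups = [];
foreach ($xpath->evaluate('//div[@class="product-spec__body"]') as $spec) {
    // string() yields '' when no h2 matches, so item(0) is never called.
    $title = $xpath->evaluate('string(h2[@class="title title_size_22"])', $spec);
    if ($title === '') {
        continue; // assumption: containers without a title are skipped
    }
    $groups[] = ['title' => $title, 'property' => []];
}
var_export($groups);
```

Only the first container contributes an entry, with the title taken from its first matching h2.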