How can I get the text of a div without the text of its child divs, using PHP XPath?

Asked: 2013-04-05 16:03:13

Tags: php text xpath

<div class="date">
  <div class="rating">good</div>
  Movie Review - Mar 24, 2013
</div>

<div class="date">
  Movie Review - Mar 23, 2013
</div>

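What XPath query will get the "Movie Review..." part without the content of the rating div (where it says "good")? The rating div is sometimes present and sometimes not.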

With the div node held in $reviewdate, I tried this approach:

  $thedate = $xpath->query('text()[1]', $reviewdate)->item(0);

But it also captures the content of the rating div.

The document being parsed is HTML5.

2 answers:

Answer 0: (score: 1)

This should return the text children of the divs that contain the string "Movie":

//div[@class = "date"]/text()[contains(., "Movie")]

If you only want the first non-whitespace text node of each div, you can use

//div[@class = "date"]/text()[normalize-space(.) != ''][1]
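As a rough sketch of how these expressions might be run from PHP (the variable names `$html`, `$byKeyword` and `$firstText` are only illustrative, and the markup is the sample from the question wrapped in a body element), both queries can be passed to DOMXPath::query() and the resulting text nodes trimmed:

```php
<?php
// Sample markup from the question, wrapped in <html><body> for loadHTML().
$html = <<<HTML
<html><body>
<div class="date">
  <div class="rating">good</div>
  Movie Review - Mar 24, 2013
</div>
<div class="date">
  Movie Review - Mar 23, 2013
</div>
</body></html>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Variant 1: direct text children that contain the string "Movie".
$byKeyword = [];
foreach ($xpath->query('//div[@class = "date"]/text()[contains(., "Movie")]') as $node) {
    $byKeyword[] = trim($node->nodeValue);
}

// Variant 2: the first non-blank direct text child of each div.
// XPath 1.0 accepts "" as well as '' for the empty string literal,
// which avoids escaping quotes inside the PHP string.
$firstText = [];
foreach ($xpath->query('//div[@class = "date"]/text()[normalize-space(.) != ""][1]') as $node) {
    $firstText[] = trim($node->nodeValue);
}

print_r($byKeyword);
print_r($firstText);
```

For this sample both arrays should hold the two "Movie Review - ..." strings without "good", because text() only selects text nodes that are direct children of the date div, never the text inside the rating div.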

Answer 1: (score: 0)

You are looking for the first text node child that is not a whitespace-only node:

// xpath: text()[normalize-space(.)][1]

$thedate = $xpath->query(
    'text()[normalize-space(.)][1]', $reviewdate
)->item(0);

Result (var_dump($thedate->data)):

string(39) "\n      Movie Review - Mar 24, 2013\n    "
string(39) "\n      Movie Review - Mar 23, 2013\n    "

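Also, since you are after the value, you probably want to retrieve the string value directly. DOMXPath::evaluate() returns a string here, and the outer normalize-space() trims the surrounding whitespace: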

// xpath: normalize-space(text()[normalize-space(.)])

$thedate = $xpath->evaluate(
    'normalize-space(text()[normalize-space(.)])', $reviewdate
);

Result (var_dump($thedate)):

string(27) "Movie Review - Mar 24, 2013"
string(27) "Movie Review - Mar 23, 2013"
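One practical difference between the two variants shows up when a div has no text of its own: query(...)->item(0) then yields null, while the evaluate() form yields an empty string. A minimal sketch, assuming a hypothetical helper named firstOwnText() (not part of the original answer) and a made-up second div that contains only a rating:

```php
<?php
// Hypothetical helper: returns the first non-blank direct text child of
// $context with whitespace normalized, or '' when there is no such node.
function firstOwnText(DOMXPath $xpath, DOMNode $context): string
{
    return $xpath->evaluate('normalize-space(text()[normalize-space(.)])', $context);
}

$doc = new DOMDocument();
$doc->loadHTML(
    '<html><body>'
    . '<div class="date"><div class="rating">good</div> Movie Review - Mar 24, 2013 </div>'
    . '<div class="date"><div class="rating">bad</div></div>' // illustrative: no own text
    . '</body></html>'
);
$xpath = new DOMXPath($doc);

$texts = [];
foreach ($xpath->query('//div[@class = "date"]') as $reviewdate) {
    $texts[] = firstOwnText($xpath, $reviewdate);
}
var_dump($texts);
```

Here the first entry should be the review line and the second an empty string, so callers can test for '' instead of guarding against a null node before reading ->data.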

I hope this helps. See also the online demo and the full code example:

<?php
/**
 * how can I get the text data of a div without the child divs text data - with php xpath?
 *
 * @link http://stackoverflow.com/q/15838487/367456
 * @link http://eval.in/15474
 */
$buffer = <<<BUFFER
<html>
    <div class="date">
      <div class="rating">good</div>
      Movie Review - Mar 24, 2013
    </div>

    <div class="date">
      Movie Review - Mar 23, 2013
    </div>
</html>
BUFFER;

$doc = new DOMDocument();
$doc->loadHTML($buffer);
$xpath = new DOMXPath($doc);

foreach ($xpath->query('/*/body/div[@class = "date"]') as $reviewdate) {
    $thedate = $xpath->query('text()[normalize-space(.)][1]', $reviewdate)->item(0);
    var_dump($thedate->data);

    // string:
    $thedate = $xpath->evaluate('normalize-space(text()[normalize-space(.)])', $reviewdate);
    var_dump($thedate);
}