XPath: get text content from multiple tags

Asked: 2018-11-05 22:21:29

Tags: php html xpath domdocument

I have this HTML template:

<ul>
<li>
<div>
<span class="field_full"><strong>Title 1</strong></span> :
<span itemprop="alternativeHeadline">
<span itemprop="alternativeHeadline">
DESC 1
</span>
</span></div>
</li>
<li>
<div>
<span class="field_full"><strong>Title 2</strong></span> :
<span itemscope="" itemtype="http://schema.org/type2" itemprop="type2">
<a href="/"><span itemprop="name">DESC 2</span></a>
</span>
</div>
</li>
<li>
<div>
<span class="field_full"><strong> Title 3</strong></span>:
<span itemprop="type3" itemscope="" itemtype="http://schema.org/type3">
<a href="/"><span itemprop="name">DESC 3-1</span></a>, <a href="/"><span itemprop="name">DESC 3-2</span></a>, <a href="/"><span itemprop="name">DESC 3-3</span></a>
</span>
</div>
</li>
<li>
<span class="field_full"><strong>Title 4</strong></span>:
<span> <a href="/">DESC 4</a></span>
</li>
<li>
<span class="field_full"><strong>Title 5</strong></span>:
<span itemprop="type">
<a href="/">DESC 5-1</a>, <a href="/">DESC 5-2</a>
</span>
</li>
<li>
<span class="field_full"><strong>Title 6</strong></span>:
<span itemprop="type">
DESC 6
</span>
</li>
<li>
<span class="field_full"><strong>Title 7</strong></span>:
<span itemprop="type">
DESC 7
</span>
</li>
<li>
<span class="field_full"><strong>Title 8</strong></span>:
<span itemprop="type">
<a href="/">DESC 8</a>
</span>
</li>
</ul>

I want to use XPath to get this expected result:

TITLE 1 = DESC 1
TITLE 2 = DESC 2
TITLE 3 = DESC 3-1, DESC 3-2, DESC 3-3
TITLE 4 = DESC 4
TITLE 5 = DESC 5-1, DESC 5-2
TITLE 6 = DESC 6
TITLE 7 = DESC 7
TITLE 8 = DESC 8

What have I tried?

$dom = new DOMDocument();
$dom->loadHTML($html_string);
$xpath = new DOMXpath($dom);

$elements = $xpath->query("//span[@class='field_full']");
foreach($elements as $e) {
    echo $e->nodeValue . '<br/>';
}

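Unfortunately, this only returns TITLE 1, TITLE 2, TITLE 3, and so on.

I want to get their respective values (in this case DESC 1, DESC 2, etc.).

What approach can I take to achieve this?

Thanks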

2 answers:

Answer 0 (score: 1)

To get the exact result you want, you can use a relative XPath query, with the original span node as its root:

$elements = $xpath->query("//span[@class='field_full']");
foreach ($elements as $e) {
    // Print the title, then every span that follows it inside the same parent.
    echo trim($e->nodeValue) . ' = ';
    $spans = $xpath->query("following-sibling::span", $e);
    foreach ($spans as $span) {
        echo " " . trim($span->nodeValue);
    }
    echo "<br/>";
}

Output:

Title 1 =  DESC 1<br/>
Title 2 =  DESC 2<br/>
Title 3 =  DESC 3-1, DESC 3-2, DESC 3-3<br/>
Title 4 =  DESC 4<br/>
Title 5 =  DESC 5-1, DESC 5-2<br/>
Title 6 =  DESC 6<br/>
Title 7 =  DESC 7<br/>
Title 8 =  DESC 8<br/>

Demo on 3v4l.org
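
As a variation on the relative-query idea, here is a minimal sketch that collects the pairs into an associative array instead of echoing them. It is an illustration and not the answerer's code: the function name extractFields is hypothetical, the markup is a shortened copy of the question's template, and it assumes each title has one description span directly after it (hence following-sibling::span[1]). It uses DOMXPath::evaluate with the XPath function normalize-space() so that line breaks inside the spans are collapsed.

```php
<?php
// Shortened copy of the question's markup (three of the eight items).
$html = <<<HTML
<ul>
<li><div>
<span class="field_full"><strong>Title 1</strong></span> :
<span itemprop="alternativeHeadline">
<span itemprop="alternativeHeadline">
DESC 1
</span>
</span></div></li>
<li><div>
<span class="field_full"><strong> Title 3</strong></span>:
<span itemprop="type3">
<a href="/"><span itemprop="name">DESC 3-1</span></a>, <a href="/"><span itemprop="name">DESC 3-2</span></a>
</span>
</div></li>
<li>
<span class="field_full"><strong>Title 4</strong></span>:
<span> <a href="/">DESC 4</a></span>
</li>
</ul>
HTML;

// extractFields() is a hypothetical helper name, not part of any library.
function extractFields(string $html): array
{
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    $fields = [];
    foreach ($xpath->query("//span[@class='field_full']") as $titleNode) {
        // normalize-space() trims the string and collapses whitespace runs.
        $title = $xpath->evaluate('normalize-space(.)', $titleNode);
        // Assumption: the description is the first span after the title.
        $desc = $xpath->evaluate('normalize-space(following-sibling::span[1])', $titleNode);
        $fields[$title] = $desc;
    }
    return $fields;
}

foreach (extractFields($html) as $title => $desc) {
    echo $title . ' = ' . $desc . PHP_EOL;
}
```

With the sample markup above, this should print one "Title = DESC" line per item, with the leading space in " Title 3" removed. Keying the array by title assumes titles are unique; a list of pairs would be safer if they can repeat.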

Answer 1 (score: 0)

The following expression should do it:

//span[@class="field_full"]/following-sibling::span

Demo: https://3v4l.org/rTmq9
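
To show what this expression selects on its own, here is a small sketch run against a two-item excerpt of the question's markup. The variable names and the shortened HTML are assumptions made for illustration. Note that the expression returns only the description spans, so the titles still have to be read separately if they are needed.

```php
<?php
// Two items taken from the question's template, shortened for the example.
$html = '<ul>'
    . '<li><span class="field_full"><strong>Title 5</strong></span>: '
    . '<span itemprop="type"><a href="/">DESC 5-1</a>, <a href="/">DESC 5-2</a></span></li>'
    . '<li><span class="field_full"><strong>Title 6</strong></span>: '
    . '<span itemprop="type"> DESC 6 </span></li>'
    . '</ul>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Each match is a span that follows a title span under the same parent.
$descriptions = [];
foreach ($xpath->query('//span[@class="field_full"]/following-sibling::span') as $span) {
    $descriptions[] = trim($span->nodeValue);
}

print_r($descriptions);
```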