Question

I want to parse data that looks like the following:

<table-wrap id ="T1">
<table-wrap-foot>
<fn>
<p>
Blah blah blah <strong>dsf</strong> blah blah blah <br>
</p>
</fn>
</table-wrap-foot>
</table-wrap>

When I call

$x = $xpath->query("//table-wrap-foot[@id='" . $tableAttributes . "']/p")->item(0);

I get the paragraph node, containing the tags and the data, but also the <p> tag itself.

$x = $xpath->query("//table-wrap-foot[@id='".$tableAttributes."']/p")->item(0)->nodeValue;

I get the data inside the <p> tag, but it does not include the <strong> tag.

So what I need is the data together with its inner tags, but without the enclosing <p> tag.

Is it possible to do this?

Answer 1

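You can simply select the node() children of the p element and iterate over the list. Taking your example expression at face value (although it does not match your example input):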

//table-wrap-foot[@id='".$tableAttributes."']/p/node()

Note that there are five such nodes:

#text 
strong
#text 
br
#text
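
A minimal sketch of iterating that list in PHP might look like the following. It assumes a well-formed variant of the sample input (closing tags and `<br/>` corrected) and a path adjusted so that it actually matches it, with `@id` on `table-wrap` and `p` nested under `fn`. Each child node is serialized with `DOMDocument::saveXML()`, so the markup inside `<p>` is kept while the `<p>` wrapper itself is not.

```php
<?php
// Well-formed variant of the sample input (an assumption, not the asker's exact data).
$xml = <<<XML
<table-wrap id="T1">
<table-wrap-foot>
<fn>
<p>
Blah blah blah <strong>dsf</strong> blah blah blah <br/>
</p>
</fn>
</table-wrap-foot>
</table-wrap>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

$tableAttributes = 'T1'; // the id to look up, as in the question

// Select every child node of <p>: text nodes and elements alike.
$nodes = $xpath->query(
    "//table-wrap[@id='" . $tableAttributes . "']/table-wrap-foot/fn/p/node()"
);

$names = [];
$inner = '';
foreach ($nodes as $node) {
    $names[] = $node->nodeName;      // "#text" for text nodes, tag name for elements
    $inner  .= $doc->saveXML($node); // serialize the child; <p> itself is never emitted
}

echo implode(', ', $names), PHP_EOL;
echo trim($inner), PHP_EOL;
```

Passing a node to `saveXML()` serializes only that node and its descendants, which is why concatenating the children yields the inner markup of `<p>`.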

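It may be more appropriate to select the union of just the text and element nodes, since node() also matches comments and processing instructions: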

//table-wrap-foot[@id='".$tableAttributes."']/p/*|
//table-wrap-foot[@id='".$tableAttributes."']/p/text()
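
To illustrate where the two expressions differ, here is a small sketch using a hypothetical one-line input that contains a comment inside `<p>`. The paths are simplified to `/p/...` for the example; they are not the asker's actual expressions.

```php
<?php
// Hypothetical input: a comment sits between the text and the <strong> element.
$doc = new DOMDocument();
$doc->loadXML('<p>text <!-- note --><strong>dsf</strong> more</p>');
$xpath = new DOMXPath($doc);

$all   = $xpath->query('/p/node()');        // includes the comment node
$union = $xpath->query('/p/* | /p/text()'); // element and text nodes only

// Serialize only the nodes selected by the union.
$inner = '';
foreach ($union as $node) {
    $inner .= $doc->saveXML($node);
}

echo $all->length, ' vs ', $union->length, PHP_EOL;
echo $inner, PHP_EOL;
```

If comments or processing instructions should be preserved in the output, the plain `node()` form is the one to use instead.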