Question

Hello, I have some HTML files like this:

<div class="text">
   <p></p>
   <p>text in p2</p>
   <p></p>
   <p>text in p4</p>
</div>

and others like this:

<div class="text">    
   <p>text in p1</p>
   <p></p>
   <p>text in p3</p>
   <p></p>
</div>

My query (in RapidMiner) is:

//h:div[contains(@class,'inside')]/h:div[contains(@class,'text')]/h:p/node()/text()

But it only returns <p>.

My question is: how can I join all the text in the <p> elements into a single string?

Thanks

Answer 1

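I will limit the expressions to the HTML snippet you provided, so I have dropped the first few axis steps.

First, this query should not return anything: the paragraph elements have no child nodes other than text nodes, and text nodes have no children of their own, so the trailing /text() step has nothing to select.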

//h:div[contains(@class,'text')]/h:p/node()/text()

To access all the text nodes, you should use something like

//h:div[contains(@class,'text')]/h:p/text()
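The difference between the two expressions can be seen by inspecting the node tree directly. The following is a minimal sketch, assuming Python's standard-library `xml.dom.minidom` as a stand-in for whatever parser RapidMiner uses; it omits the `h:` namespace for brevity and only illustrates the tree shape, it does not evaluate XPath.

```python
from xml.dom import minidom

# First snippet from the question (no namespace, for brevity).
SNIPPET = """<div class="text">
   <p></p>
   <p>text in p2</p>
   <p></p>
   <p>text in p4</p>
</div>"""

doc = minidom.parseString(SNIPPET)

# For each <p>: (number of children, number of grandchildren).
shape = []
# Roughly what h:p/text() selects: text nodes that are direct children of <p>.
direct_text = []

for p in doc.getElementsByTagName("p"):
    children = list(p.childNodes)
    # Roughly what h:p/node()/text() would look at: children of the children.
    grandchildren = [g for c in children for g in c.childNodes]
    shape.append((len(children), len(grandchildren)))
    direct_text.extend(c.data for c in children if c.nodeType == c.TEXT_NODE)

print(shape)        # [(0, 0), (1, 0), (0, 0), (1, 0)]
print(direct_text)  # ['text in p2', 'text in p4']
```

Every paragraph has zero grandchildren, which is why the extra node() step leaves /text() with nothing to match.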

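How to join the strings depends heavily on the XPath version you are able to use. If RapidMiner offers XPath 2.0 (it probably does not), you are lucky and can use string-join(...), which concatenates all the strings into one. Note that in XPath 2.0 the second argument, the separator, is required; pass an empty string to join without a separator:

string-join(//h:div[contains(@class,'text')]/h:p/text(), '')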

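If you are stuck with XPath 1.0, you cannot do this in general; the only option is to enumerate the strings one by one, which works only for a fixed number of them. I added line breaks for readability; remove them if you like: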

concat(
  //h:div[contains(@class,'text')]/h:p[1]/text(),
  //h:div[contains(@class,'text')]/h:p[2]/text(),
  //h:div[contains(@class,'text')]/h:p[3]/text(),
  //h:div[contains(@class,'text')]/h:p[4]/text()
)
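If the surrounding tool lets you post-process results outside XPath, another option is to select the paragraphs and join their text in the host language. The following is a minimal sketch in Python using only the standard library, not a RapidMiner feature. The namespace URI bound to `h`, the wrapper `html`/`body` elements, and the helper name `join_paragraph_text` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Sample from the question, wrapped in an XHTML namespace (assumption)
# so that the h: prefix used in the queries has something to bind to.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<div class="inside">
<div class="text">
   <p></p>
   <p>text in p2</p>
   <p></p>
   <p>text in p4</p>
</div>
</div>
</body></html>"""

NS = {"h": "http://www.w3.org/1999/xhtml"}


def join_paragraph_text(xml_source, separator=""):
    """Join the text of every h:p under an h:div whose class contains 'text'."""
    root = ET.fromstring(xml_source)
    parts = []
    for div in root.iterfind(".//h:div", NS):
        # ElementTree has no contains(); a substring test mirrors
        # contains(@class,'text').
        if "text" not in div.get("class", ""):
            continue
        for p in div.findall("h:p", NS):
            # p.text is only the text before the first child element and is
            # None for an empty <p>; enough for paragraphs without markup.
            if p.text:
                parts.append(p.text)
    return separator.join(parts)


joined = join_paragraph_text(XHTML)
print(joined)  # text in p2text in p4
```

Unlike the concat(...) enumeration, this does not depend on knowing the number of paragraphs in advance, and passing `separator=" "` keeps the paragraphs from running together.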
