Question

I have some HTML like this. How can I get the title text?

<a class="question_link" href="/n/1639322" target="_blank">
<div class="question_text_icons">
<span></span>
</div>
"
This is the page title, which I want to get.
"
</a>

My XPath is

//a[@class="question_link"]/text()

but the output is

"\n"
"\nThis is the page title, which I want to get.\n"

I only want "This is the page title, which I want to get.".

Answer 1

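Another option is to use normalize-space() in a predicate to filter out the whitespace-only text nodes:

//a[@class="question_link"]/text()[normalize-space()]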
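
To make the effect of that predicate concrete, here is a minimal Python sketch using only the standard library. It is an illustration, not part of the original answer: `xml.etree.ElementTree` implements only a subset of XPath (no `text()` steps and no `normalize-space()`), so the same filter is written in plain Python. The names `SNIPPET` and `non_blank_text_nodes` are made up for the example, and the markup is assumed to be well-formed XML with the inspector's quote marks left out.

```python
import xml.etree.ElementTree as ET

# The markup from the question, without the quote marks that a
# browser inspector draws around text nodes (an assumption).
SNIPPET = """<a class="question_link" href="/n/1639322" target="_blank">
<div class="question_text_icons">
<span></span>
</div>
This is the page title, which I want to get.
</a>"""


def non_blank_text_nodes(element):
    """Return the direct text children of `element` that are not whitespace-only."""
    # ElementTree stores text children as .text (before the first child)
    # and as each child's .tail (the text that follows that child).
    nodes = [element.text] + [child.tail for child in element]
    # Same test as the XPath predicate [normalize-space()]: keep a node
    # only if something is left after trimming whitespace.
    return [node for node in nodes if node and node.strip()]


link = ET.fromstring(SNIPPET)
kept = non_blank_text_nodes(link)
print(kept)             # the one surviving text node, newlines included
print(kept[0].strip())  # the bare title
```

Note that the surviving node still carries its leading and trailing newline. To get the bare title you still need to trim it, either in the host language as above or by wrapping the whole XPath in `normalize-space(...)`.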

Answer 2

One option is to find the inner div and take its following text sibling:

//a[@class="question_link"]/div[@class="question_text_icons"]/following-sibling::text()

Or get the last text node:

//a[@class="question_link"]/text()[last()]
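
For readers without a full XPath engine at hand, the first expression above has a rough standard-library equivalent in Python. This is only a sketch under assumptions: `xml.etree.ElementTree` does not support the `following-sibling` axis, but it stores the text that follows an element in that element's `.tail`, which is the same node here. `SNIPPET` is a made-up name, and the markup is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET

# The markup from the question, without the inspector's quote marks (an assumption).
SNIPPET = """<a class="question_link" href="/n/1639322" target="_blank">
<div class="question_text_icons">
<span></span>
</div>
This is the page title, which I want to get.
</a>"""

link = ET.fromstring(SNIPPET)

# ElementTree's XPath subset is enough to locate the inner div by class.
div = link.find("div[@class='question_text_icons']")

# The text immediately after the div (its following text sibling) is div.tail.
# It may be None when nothing follows, hence the fallback to "".
title = (div.tail or "").strip()
print(title)
```

In this snippet the div is the only child of the link, so the text after it is also the last text node, which is why both expressions pick the same node.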
