Question

My HTML structure is as follows.

<div id="description">
   wanted text
   <div class="text-smaller normal wine-user-description">
    <a href = "/users/user1"> unwanted text</a>
   </div>
</div>

I am using Selenium to open the URL and extract the wanted text from the markup above. Here is the code:

val = self.driver.find_element_by_xpath('//div[@id="description"]').text

But the code above returns all of the text (both the wanted and the unwanted text). I even tried

 val = self.driver.find_element_by_xpath('//div[@id="description"]/text()').text

but I get an XPath error. This is my first time using Selenium and I am having a hard time with it. Any help would be much appreciated.

Answer 1

Try the following jQuery to get the text of the first node inside the div:

$('#description')[0].childNodes[0].nodeValue

I tested the code above against your HTML and it works. If your site does not use jQuery, this will not work as-is; you would have to inject jQuery into the DOM first and then try it.

String node_text=(String)((JavascriptExecutor)driver).executeScript("return $('#description')[0].childNodes[0].nodeValue");

System.out.println(node_text.trim());


I tried this in Java rather than Python. If you are using Python, use browser.execute_script instead of JavascriptExecutor.
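
For Python, the same idea might look like the sketch below. It assumes `driver` is a Selenium WebDriver (anything exposing `execute_script` will do), and `first_text_node` is a hypothetical helper name, not part of Selenium. Because `$('#description')[0]` is just the underlying DOM element, the script uses `document.getElementById` instead, so jQuery does not need to be present on the page.

```python
# Plain-DOM equivalent of $('#description')[0].childNodes[0].nodeValue.
SCRIPT = "return document.getElementById(arguments[0]).childNodes[0].nodeValue;"


def first_text_node(driver, element_id):
    """Return the stripped value of the first child node of the element with this id.

    `driver` is assumed to be a Selenium WebDriver; this helper name is
    hypothetical and only used for illustration.
    """
    # Extra arguments to execute_script are exposed to the script as arguments[0], ...
    value = driver.execute_script(SCRIPT, element_id)
    # nodeValue is null (None in Python) when the first child is an element, not text.
    return value.strip() if value is not None else None
```

With the question's setup this would be called as `val = first_text_node(self.driver, "description")`.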

Answer 2

There are two reasons your XPath does not work:

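In Python, Selenium's XPath methods do not support '/text()' at the end of the XPath expression: the expression has to select an element, and text() selects a text node. I think you can use text() as a condition for selecting DOM elements, but not to return the text itself.
The XPath is too broad for your use case: it selects the parent div, whose text includes the text of its children. You need to exclude the child elements from the parent div.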
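
One way to exclude the children's text is to read the parent's text and then remove each direct child's text from it. The following is only a sketch under stated assumptions: `element` is a Selenium WebElement, `own_text` is a hypothetical helper name, and `"xpath"` is the string value of `By.XPATH`, passed directly to avoid an import. It removes the first occurrence of each child's text, so it can give the wrong result if the parent's own text happens to contain the same string earlier.

```python
def own_text(element):
    """Return the element's visible text with its direct children's text removed.

    `element` is assumed to be a Selenium WebElement; this helper name is
    hypothetical and only used for illustration.
    """
    text = element.text
    # "./*" selects the direct child elements of the current element.
    for child in element.find_elements("xpath", "./*"):
        child_text = child.text
        if child_text:
            # Remove only the first occurrence of this child's text.
            text = text.replace(child_text, "", 1)
    return text.strip()
```

With the question's code, the element returned by `find_element_by_xpath('//div[@id="description"]')` would be passed to `own_text`.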

However, we can try to get the text on its own without changing much of your code:

    val = self.driver.find_element_by_xpath('//div[@id="description"]').get_attribute('textContent')
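
Note that `textContent` also includes the text of descendant elements, so a variation that keeps only the div's own text nodes may be closer to what the question asks for. A minimal sketch, assuming `driver` is a Selenium WebDriver and `element` is the already located div; `direct_text` and `OWN_TEXT_SCRIPT` are hypothetical names used for illustration:

```python
# Concatenate only the text nodes that are direct children of the element.
OWN_TEXT_SCRIPT = """
return Array.from(arguments[0].childNodes)
    .filter(function (node) { return node.nodeType === Node.TEXT_NODE; })
    .map(function (node) { return node.nodeValue; })
    .join('');
"""


def direct_text(driver, element):
    """Return the stripped text of the element's own text nodes, skipping child elements."""
    # A WebElement passed to execute_script arrives in the script as a DOM element.
    return driver.execute_script(OWN_TEXT_SCRIPT, element).strip()
```

Unlike reading only the first child node, this also picks up text that appears after a child element.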

Selenium: selecting a node's text
