Why does my Selenium XPath expression return [object Attr] instead of an element?

Time: 2019-06-27 20:26:41

Tags: python python-3.x selenium selenium-webdriver xpath

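I am building a scraper that walks through my web page and collects all of its links. Many of the links sit inside a collapsed list (a tree), so I wrote an XPath that matches all of them. I ran the following XPath in the Chrome DevTools console; it worked fine and gave me this output.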

var result=$x("//div[@id='index__tree']//a[contains(text(),doku.php)]/@href")

result[0].value
"/doku.php?ihome"
result[4].value
"/doku.php?start"

I then translated the XPath into Selenium code:

a = driver.find_elements_by_xpath("//div[@id='index__tree']//a[contains(text(),doku.php)]/@href")

for aa in a:
    print(aa)

When I ran the code, I got the following error:

opening browser
Login Successful
Traceback (most recent call last):
  File "wiki.py", line 49, in <module>
    a = driver.find_elements_by_xpath("//div[@id='index__tree']//a[contains(text(),doku.php)]/@href")
  File "/home/aevans/wikiProject/venv/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 410, in find_elements_by_xpath
    return self.find_elements(by=By.XPATH, value=xpath)
  File "/home/aevans/wikiProject/venv/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 1007, in find_elements
    'value': value})['value'] or []
  File "/home/aevans/wikiProject/venv/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/home/aevans/wikiProject/venv/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidSelectorException: Message: invalid selector: The result of the xpath expression "//div[@id='index__tree']//a[contains(text(),doku.php)]/@href" is: [object Attr]. It should be an element.
  (Session info: headless chrome=73.0.3683.86)
  (Driver info: chromedriver=73.0.3683.86,platform=Linux 3.10.0-957.12.2.el7.x86_64 x86_64)

1 Answer:

Answer 0 (score: 1)

Try replacing

a = driver.find_elements_by_xpath("//div[@id='index__tree']//a[contains(text(),doku.php)]/@href")
for aa in a:
    print(aa)

with

a = [elem.get_attribute("href") for elem in driver.find_elements_by_xpath("//div[@id='index__tree']//a[contains(text(),doku.php)]")]

for aa in a:
    print(aa)

Note that I removed "/@href" from the end of the selector.

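A Selenium selector must resolve to WebElement objects. With "/@href" appended, the XPath evaluates to the href attribute node of each link (the [object Attr] in the error message) rather than the element itself, which is why Selenium raises InvalidSelectorException.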

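The get_attribute(attribute_name) method returns the value of that attribute for an element. The list comprehension collects one value per matched link, and you can then iterate over the resulting list.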
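
The same idea can be wrapped in a small helper. This is only a sketch: the names collect_hrefs, LINKS_XPATH and XPATH_STRATEGY are made up for illustration and are not part of Selenium. It calls driver.find_elements(by, value), which is what find_elements_by_xpath delegates to in the traceback above; the find_elements_by_* shortcuts were removed in Selenium 4.3, so this form should work on both older and newer versions.

```python
# Sketch: locate the <a> elements first, then read "href" from each one.
# By.XPATH (selenium.webdriver.common.by) is the plain string "xpath";
# the literal is used so this helper has no import-time dependency.
XPATH_STRATEGY = "xpath"

# Same expression as in the question, minus the trailing "/@href".
LINKS_XPATH = "//div[@id='index__tree']//a[contains(text(),doku.php)]"


def collect_hrefs(driver, xpath=LINKS_XPATH):
    """Return the href value of every element matched by xpath."""
    elements = driver.find_elements(XPATH_STRATEGY, xpath)
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        if href:  # get_attribute returns None when the attribute is absent
            hrefs.append(href)
    return hrefs
```

With a real browser session you would call collect_hrefs(driver) after logging in and print each entry. Two things to be aware of: get_attribute("href") usually returns the fully resolved URL rather than the raw "/doku.php?..." string from the markup, and the unquoted doku.php in the predicate is most likely parsed by XPath as a child element name rather than a string, so the contains() test probably does not filter anything. If filtering matters, something like contains(@href, 'doku.php') may be closer to the intent.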