How to make the driver wait up to 10 seconds before clicking a hyperlink

Time: 2019-01-05 13:51:00

Tags: python selenium web-scraping

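On the URL below I need to click a mail-icon hyperlink. Sometimes the click does not work even though the code is correct; in that case the driver should wait up to 10 seconds before moving on to the next step.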

https://www.sciencedirect.com/science/article/pii/S1001841718305011

tags = driver.find_elements_by_xpath('//a[@class="author size-m workspace-trigger"]//*[local-name()="svg"]')
if tags:
    for tag in tags:
        tag.click()

How can I apply an explicit or implicit wait here, around `tag.click()`?

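5 answers:

Answer 0: (score: 0)

By the way, you can extract the author contact emails (the same ones you get by clicking) from the JSON embedded in a script tag on the page.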

from selenium import webdriver
import json
d = webdriver.Chrome()
d.get('https://www.sciencedirect.com/science/article/pii/S1001841718305011#!')
script = d.find_element_by_css_selector('script[data-iso-key]').get_attribute('innerHTML')
script = script.replace(':false',':"false"').replace(':true',':"true"')
data = json.loads(script)
authors = data['authors']['content'][0]['$$']
emails = [author['$$'][3]['$']['href'].replace('mailto:','') for author in authors if len(author['$$']) == 4]
print(emails)
d.quit()
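
Indexing `author['$$'][3]` assumes the email always sits in the fourth position of an author entry. As a hedged alternative, the sketch below walks any parsed JSON structure and collects every `href` that starts with `mailto:`. The helper name `find_mailto_links`, the key names, and the sample payload are made up for illustration; the real page data is much larger and its layout may differ. The sample also shows that `json.loads` parses bare `true` and `false` on its own.

```python
import json


def find_mailto_links(node):
    """Recursively collect addresses from every 'href' that starts with 'mailto:'."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == 'href' and isinstance(value, str) and value.startswith('mailto:'):
                found.append(value[len('mailto:'):])
            else:
                found.extend(find_mailto_links(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_mailto_links(item))
    return found


# Hypothetical, heavily simplified stand-in for the script tag's contents.
sample = json.loads('''
{"authors": {"content": [{"$$": [
    {"#name": "author", "corresponding": true, "$$": [
        {"#name": "given-name", "_": "A"},
        {"#name": "e-address", "$": {"href": "mailto:a@example.org"}}
    ]},
    {"#name": "author", "corresponding": false, "$$": [
        {"#name": "given-name", "_": "B"}
    ]}
]}]}}
''')

print(find_mailto_links(sample))
```

In the Selenium script, the same helper could be called on `data` after `json.loads(script)`, provided the page still embeds the emails as `mailto:` links.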

You can also use requests to fetch all the recommendations data.

import requests
headers = {
    'User-Agent' : 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36'
          }
data = requests.get('https://www.sciencedirect.com/sdfe/arp/pii/S1001841718305011/recommendations?creditCardPurchaseAllowed=true&preventTransactionalAccess=false&preventDocumentDelivery=true', headers = headers).json()   
print(data)

Answer 1: (score: 0)

You have to wait until the element is clickable. You can do that with the WebDriverWait class.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get('url')

elements = driver.find_elements_by_xpath('xpath')

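for element in elements:
    # Wait up to 10 seconds for this element to be displayed and enabled.
    # If the wait times out, TimeoutException propagates and no click is attempted.
    WebDriverWait(driver, 10).until(
        lambda d: element.is_displayed() and element.is_enabled())
    element.click()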
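
The question mentions that the click sometimes fails even though the locator is right. One way to handle that, sketched here in plain Python under stated assumptions, is to retry the click itself until it succeeds or the 10 seconds run out. `click_with_retry` and `FlakyElement` are hypothetical names used only for this illustration; they are not part of Selenium, and real code would catch Selenium's `WebDriverException` subclasses rather than a bare `Exception`.

```python
import time


def click_with_retry(element, timeout=10.0, poll=0.5):
    """Call element.click() until it succeeds or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            element.click()
            return True
        except Exception:  # assumption: narrow this to Selenium's exceptions in real code
            if time.monotonic() >= deadline:
                raise  # give up and surface the last error
            time.sleep(poll)


class FlakyElement:
    """Hypothetical stand-in for a WebElement whose first clicks fail."""

    def __init__(self, failures):
        self.failures = failures
        self.clicks = 0

    def click(self):
        self.clicks += 1
        if self.clicks <= self.failures:
            raise RuntimeError('element not clickable yet')


element = FlakyElement(failures=2)
click_with_retry(element, timeout=2, poll=0.01)
print(element.clicks)  # 3: two failed attempts, then one success
```

Because the helper only needs an object with a `click()` method, it should accept a Selenium element unchanged.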

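Answer 2: (score: 0)

You can try the approach below to click the hyperlink icons that hold the mail address. Clicking one opens a popup with additional information, and the script below reads the email address from there. Digging anything out is always troublesome when svg elements are present. I used the BeautifulSoup library so that its .extract() function can strip out the svg elements and let the script reach the content.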

from bs4 import BeautifulSoup
from contextlib import closing
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

with closing(webdriver.Chrome()) as driver:
    driver.get("https://www.sciencedirect.com/science/article/pii/S1001841718305011")

    for elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//a[starts-with(@name,'baut')]")))[-2:]:
        elem.click()
        soup = BeautifulSoup(driver.page_source,"lxml")
        [item.extract() for item in soup.select("svg")]
        email = soup.select_one("a[href^='mailto:']").text
        print(email)

Output:

weibingzhang@ecust.edu.cn
junhongqian@ecust.edu.cn
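
If BeautifulSoup and lxml are not installed, the same `mailto:` lookup can be sketched with the standard library's `html.parser`. This is an alternative illustration, not the answer's method: `MailtoCollector` and the HTML fragment are hypothetical, and the fragment merely stands in for `driver.page_source`.

```python
from html.parser import HTMLParser


class MailtoCollector(HTMLParser):
    """Collect the address part of every <a href="mailto:..."> in a document."""

    def __init__(self):
        super().__init__()
        self.emails = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        for name, value in attrs:
            if name == 'href' and value and value.startswith('mailto:'):
                self.emails.append(value[len('mailto:'):])


# Hypothetical fragment standing in for driver.page_source.
html = ('<div><a href="mailto:someone@example.org"><svg></svg>someone@example.org</a>'
        '<a href="/other">other</a></div>')
collector = MailtoCollector()
collector.feed(html)
print(collector.emails)
```

Reading the address from the `href` attribute instead of the link text means any svg nested inside the link does not have to be removed first.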

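Answer 3: (score: 0)

As I understand it, after clicking the element you should wait for the author popup to appear and then extract the data with details()?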

from selenium.webdriver.support.ui import WebDriverWait

tags = driver.find_elements_by_css_selector('svg.icon-envelope')

if tags:
    for tag in tags:
        tag.click()
        # wait until the author dialog/popup on the right appears
        WebDriverWait(driver, 10).until(
            lambda d: d.find_element_by_class_name('e-address') # selector for email
        )
        try:
            details()
            # close the popup
            driver.find_element_by_css_selector('button.close-button').click()

        except Exception as ex:
            print(ex)
            continue

Answer 4: (score: -1)

Use the built-in time.sleep() function.

from time import sleep

tags = driver.find_elements_by_xpath('//a[@class="author size-m workspace-trigger"]//*[local-name()="svg"]')
if tags:
    for tag in tags:
        sleep(10)  # unconditionally pause 10 seconds before each click
        tag.click()
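
A fixed `sleep(10)` pays the full ten seconds before every click, even when the element is ready right away. An explicit wait instead polls a condition and returns as soon as it holds, with 10 seconds only as the upper bound. The sketch below illustrates that idea in plain Python; `wait_until` is a hypothetical helper, not Selenium's `WebDriverWait` implementation, and the condition is a made-up stand-in for "the element is ready".

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Poll `condition` until it returns a truthy value; raise TimeoutError after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %.1f seconds' % timeout)
        time.sleep(poll)


# Hypothetical condition that becomes true about 0.05 seconds after the start.
start = time.monotonic()
ready_at = start + 0.05
wait_until(lambda: time.monotonic() >= ready_at, timeout=10, poll=0.01)
elapsed = time.monotonic() - start
print('waited %.2f seconds instead of 10' % elapsed)
```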