Scraping dictionary.cambridge.org with Python Selenium

Date: 2018-04-22 23:49:03

Tags: python selenium selenium-webdriver web-scraping selenium-chromedriver

I want to get the download link for the mp3 file from dictionary.cambridge.org. My XPath finds the correct button, but I cannot get the link from it: I tried both .text and .get_attribute("href"). Do you have any ideas?

from selenium import webdriver
from selenium.webdriver.common.by import By

words = ['hunch']
link = 'https://dictionary.cambridge.org/dictionary/english-polish'

driver = webdriver.Chrome()

try:
    for word in words:
        driver.get(link + "/" + word)
        # Locate the audio button by its absolute XPath
        content = driver.find_elements(By.XPATH, '//*[@id="entryContent"]/div[3]/div/div/div[1]/span/span[2]/span[1]/span[2]')
        print(content)  # prints a list of WebElement objects, not the link
        # print(content.text)
finally:
    # Quit the browser once, after all words have been processed
    driver.quit()

1 answer:

Answer 0 (score: 1)

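On that page the button is a span element, so it has no href; the link is stored in its data-src-mp3 attribute. To retrieve it, induce WebDriverWait until the element is clickable and then read the attribute. You can use the following lines of code: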

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# lines of code
content = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//span[@class='circle circle-btn sound audio_play_button uk']")))
print(content.get_attribute("data-src-mp3"))

Console output:

https://dictionary.cambridge.org/media/english-polish/uk_pron/u/ukh/ukhun/ukhunch001.mp3
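
The question loops over a list of words, so the wait from the answer can be wrapped in a small helper that returns one link per word. The following is only a sketch: the names entry_url, collect_mp3_links and main are hypothetical, and it assumes the page still uses the class names and the data-src-mp3 attribute shown in the answer, which may have changed since 2018.

```python
BASE_URL = "https://dictionary.cambridge.org/dictionary/english-polish"
# XPath taken from the answer; the site's markup may differ today.
AUDIO_XPATH = "//span[@class='circle circle-btn sound audio_play_button uk']"


def entry_url(word, base_url=BASE_URL):
    """Build the dictionary entry URL for a single word."""
    return base_url + "/" + word


def collect_mp3_links(driver, words, timeout=20):
    """Return a dict mapping each word to its data-src-mp3 value (or None)."""
    # Imported here so entry_url can be used without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    links = {}
    for word in words:
        driver.get(entry_url(word))
        try:
            # Wait until the audio button is clickable, as in the answer.
            button = WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.XPATH, AUDIO_XPATH))
            )
            links[word] = button.get_attribute("data-src-mp3")
        except TimeoutException:
            # The button did not appear in time; record the miss and continue.
            links[word] = None
    return links


def main():
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        print(collect_mp3_links(driver, ["hunch"]))
    finally:
        # Quit once, after every word has been processed.
        driver.quit()
```

Calling main() opens Chrome, visits each entry, and prints the collected dictionary. Catching TimeoutException per word means a missing audio button for one entry does not stop the rest of the loop.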