The problem is that when I run the script I never get page_source: Selenium stops clicking, the script breaks, and no links are extracted from page_source.
from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.webdriver.support import ui
import time
#url = ''
driver = webdriver.Chrome(executable_path='C:/Users/yacerpc/Desktop/chrome/chromedriver')
driver.get('https://www.white-river-gems.com/shop')
while driver.find_element_by_class_name("dn9KO"):
    wait = ui.WebDriverWait(driver, 10)
    button = wait.until(lambda driver: driver.find_element_by_class_name("dn9KO"))
    button.click()
    print("clicked")
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
page = soup.find('div',{'class':'_1hM3_ jw2qu'})
find_links = page.find_all('li')
for url in find_links:
    link = url.find('a',{'class':'_2zTHN _2AHc6'}).get('href')
    print(link)
I expect the output to be the links extracted from page_source.
Answer 0: (score: 0)
Try it like this:
driver.set_script_timeout(120)
driver.execute_async_script("""
    var interval = setInterval(() => {
        var button = document.querySelector('[data-hook="load-more-button"]')
        if(button){
            button.click()
        } else {
            clearInterval(interval)
            arguments[0]()
        }
    }, 5000)
""")
Note that you want to select on [data-hook="load-more-button"], because the class name dn9KO looks like it will change with the next deployment.
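
The same "click until the button is gone" idea can also be sketched on the Python side. This is only a rough sketch, not part of the answer above: the helper names click_until_gone and make_selenium_finder, and the pause, max_clicks and timeout parameters, are hypothetical. It assumes the button simply stops appearing once every item has been loaded, and it treats a wait timeout as the signal to stop instead of letting the exception end the script.

```python
import time


def click_until_gone(find_button, pause=0.0, max_clicks=500):
    """Click the element returned by find_button() until it returns None.

    find_button is a zero-argument callable returning a clickable object
    or None. Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:   # safety cap (assumed value) against endless loops
        button = find_button()
        if button is None:       # button is gone: nothing more to load
            break
        button.click()
        clicks += 1
        time.sleep(pause)        # give the page time to append new items
    return clicks


def make_selenium_finder(driver, css='[data-hook="load-more-button"]', timeout=10):
    """Hypothetical adapter: wait for the button, return None on timeout."""
    # Imported lazily so the loop above can be exercised without Selenium.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def find_button():
        try:
            return WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.CSS_SELECTOR, css))
            )
        except TimeoutException:
            return None          # no button within the timeout: stop clicking

    return find_button
```

With a live driver, this would be called as click_until_gone(make_selenium_finder(driver), pause=1) before reading driver.page_source. The button is looked up again on every pass, so a button that the page re-renders after each click is not reused as a stale reference.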