I am learning to write a Python crawler, and I would like to know how to handle the "Load More" button on the following page:
https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1
(I am trying to grab all of the images.)
My current code uses BeautifulSoup:
from urllib.request import build_opener, HTTPCookieProcessor
from http.cookiejar import CookieJar
from bs4 import BeautifulSoup

url = 'https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1'
cj = CookieJar()
# Opener that keeps cookies between requests
opener = build_opener(HTTPCookieProcessor(cj))
try:
    p = opener.open(url)
    soup = BeautifulSoup(p, 'html.parser')
except Exception as e:
    print(str(e))
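Answer 0 (score: 0)
Well, I have a solution.
You should try the Selenium module for Python, which drives a real browser and can therefore click the button for you.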
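One likely reason the plain HTTP fetch in the question finds no images: everything after the `#` in the URL is a fragment, which is handled by the browser and never sent to the server. Assuming the search results are filled in by JavaScript based on that fragment (an assumption about how this site works, not something confirmed here), a non-browser client only ever requests the bare search page. A small sketch using only the standard library shows the split:

```python
from urllib.parse import urlsplit

url = 'https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1'
parts = urlsplit(url)

# What an HTTP client actually requests: scheme, host, and path only.
requested = f"{parts.scheme}://{parts.netloc}{parts.path}"
print(requested)

# The fragment stays on the client side; only a browser running the
# page's scripts can act on it.
print(parts.fragment)
```
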
1) Download the Chrome driver (ChromeDriver)
2) Install Selenium via pip
Here is an example of how to use it:
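from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# Selenium 4 takes the driver path through a Service object
browser = webdriver.Chrome(service=Service('Path to chrome driver'))
browser.get('https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1')
while True:
    try:
        # Wait up to 10 seconds for a "Load More" link to appear
        button = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.LINK_TEXT, 'Load More')))
    except TimeoutException:
        break  # no button appeared in time; assume everything is loaded
    button.click()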
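Once no more "Load More" button remains, the rendered HTML still has to be searched for the images the question asks about. The following is only a sketch: `ImageCollector` and `extract_image_urls` are hypothetical helper names, the sample HTML is made up, and it assumes the photos appear as ordinary `<img src="...">` tags (lazy-loaded or CSS background images would need different handling). It uses the standard-library parser so that it runs on its own; BeautifulSoup's `find_all('img')` would serve the same purpose.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            src = dict(attrs).get('src')
            if src:
                # Resolve relative paths against the page URL
                self.urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Return absolute URLs for all <img> tags with a src in the HTML."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    return parser.urls


# Made-up sample markup standing in for the rendered page
sample = ('<div><img src="/photos/1.jpg">'
          '<img src="https://example.com/2.jpg">'
          '<img alt="no src"></div>')
image_urls = extract_image_urls(sample, 'https://www.photo.net/search/')
print(image_urls)
```

With Selenium, the same function could be called as `extract_image_urls(browser.page_source, browser.current_url)` after the click loop has finished.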