Question

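I'm learning to write a Python crawler, and I'd like to know how to handle the "Load More" button on the following page:

https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1

(I'm trying to grab all of the images.)

My current code uses BeautifulSoup: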

from urllib.request import build_opener, HTTPCookieProcessor
from http.cookiejar import CookieJar

from bs4 import BeautifulSoup

url = 'https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1'

# Keep cookies between requests
cj = CookieJar()
opener = build_opener(HTTPCookieProcessor(cj))

try:
    p = opener.open(url)
    soup = BeautifulSoup(p, 'html.parser')
except Exception as e:
    print(str(e))

Answer 1

OK, I have a solution.

You should try the Selenium module for Python.
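One likely reason the urllib approach in the question never sees the extra images: everything after the `#` in that URL is a fragment, which the HTTP client does not send to the server. Presumably the page's JavaScript reads the fragment and loads the results in the browser, which is an assumption about how this site works, not something confirmed here. A minimal sketch that shows what actually gets requested:

```python
from urllib.parse import urldefrag

url = 'https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1'

# Split the URL into the part that is requested and the client-side fragment
requested, fragment = urldefrag(url)

print(requested)  # https://www.photo.net/search/
print(fragment)   # //Sort-View-Count/All-Categories/All-Time/Page-1
```

A real browser driven by Selenium runs the page's scripts, so the fragment and the "Load More" button behave as they do for a human visitor.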

1) Download the Chrome driver (ChromeDriver)

2) Install Selenium with pip

Here is an example of how to use it:

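from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

url = 'https://www.photo.net/search/#//Sort-View-Count/All-Categories/All-Time/Page-1'
# Selenium 4 style: the driver path is passed through a Service object
browser = webdriver.Chrome(service=Service('Path to chrome driver'))
browser.get(url)
while True:
    try:
        # Wait up to 10 seconds for the 'Load More' link to become clickable
        button = WebDriverWait(browser, 10).until(
            EC.element_to_be_clickable((By.LINK_TEXT, 'Load More')))
    except TimeoutException:
        break  # no button appeared in time, so assume everything is loaded
    button.click()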
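Once the button has been clicked until it stops appearing, the images still have to be pulled out of the rendered page. The following is a rough sketch using only the standard library; `ImageSrcParser` and `extract_image_urls` are hypothetical helper names, the sample markup is made up, and it assumes the images show up as plain `<img src="...">` tags, which may not hold if the site lazy-loads them through other attributes.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageSrcParser(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            src = dict(attrs).get('src')
            if src:
                self.sources.append(src)


def extract_image_urls(html, base_url):
    """Return absolute image URLs in document order, without duplicates."""
    parser = ImageSrcParser()
    parser.feed(html)
    urls = []
    for src in parser.sources:
        absolute = urljoin(base_url, src)  # resolve relative paths
        if absolute not in urls:
            urls.append(absolute)
    return urls


# Made-up markup standing in for the rendered page
sample = ('<div><img src="/photos/1.jpg">'
          '<img src="https://example.com/2.jpg">'
          '<img alt="no source"></div>')
image_urls = extract_image_urls(sample, 'https://www.photo.net/search/')
print(image_urls)
```

With Selenium, the same function could be called as `extract_image_urls(browser.page_source, browser.current_url)` after the loop finishes. BeautifulSoup, as used in the question, would work just as well for the parsing step.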

Python Crawler: handling the "Load More" button
