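BeautifulSoup and Selenium sometimes return None even though the element definitely exists

Time: 2020-03-09 06:03:29

Tags: selenium selenium-webdriver web-scraping dynamic beautifulsoup

I am trying to scrape the "Risks and challenges" section of a Kickstarter project. I have the following code: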

from selenium import webdriver
from bs4 import BeautifulSoup

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')

wd = webdriver.Chrome('chromedriver',chrome_options=chrome_options)
wd.get('https://www.kickstarter.com/projects/snapmaker/snapmaker-20-modular-3-in-1-3d-printers')
html = wd.page_source
soup = BeautifulSoup(html, "lxml")
project = soup.find("p", {"class" :"js-risks-text"})

However, it sometimes returns None. Is there a way to guarantee that I actually grab what is there?

1 Answer:

Answer 0: (score: 0)

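This content is loaded dynamically via JavaScript after the initial page load and is then rendered by JS. The requests module therefore cannot retrieve it, because it does not execute JavaScript. The same timing issue can affect Selenium: page_source is a snapshot of the DOM at the moment it is read, so reading it before the script has run can yield HTML that does not contain the element yet, and find then returns None.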
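To make the mechanism above concrete, here is a minimal sketch using only BeautifulSoup. The two HTML strings are invented for illustration and are not taken from the Kickstarter page; they stand in for a snapshot read before the script has inserted the paragraph and one read after. The helper name extract_risks is hypothetical as well.

```python
from bs4 import BeautifulSoup

# Hypothetical snapshots of the same page, invented for illustration:
# one captured before the script has inserted the paragraph, one after.
BEFORE_JS = '<div id="risks"></div>'
AFTER_JS = (
    '<div id="risks">'
    '<p class="js-risks-text text-preline">Some risks.</p>'
    '</div>'
)


def extract_risks(html):
    """Return the paragraph text, or None if the snapshot lacks the element."""
    soup = BeautifulSoup(html, "html.parser")
    # The CSS selector matches on the single class js-risks-text,
    # regardless of any other classes the paragraph carries.
    node = soup.select_one("p.js-risks-text")
    return node.get_text(strip=True) if node is not None else None


print(extract_risks(BEFORE_JS))  # None
print(extract_risks(AFTER_JS))   # Some risks.
```

Checking for None before reading the text avoids an AttributeError when the snapshot was taken too early.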

Selenium is a good option for now:

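from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from bs4 import BeautifulSoup

# Run Firefox without a visible window.
options = Options()
options.add_argument('--headless')
driver = webdriver.Firefox(options=options)

try:
    driver.get(
        "https://www.kickstarter.com/projects/snapmaker/snapmaker-20-modular-3-in-1-3d-printers")

    # Parse the browser's current DOM snapshot.
    soup = BeautifulSoup(driver.page_source, 'html.parser')

    # find() returns None if the paragraph is not in the snapshot.
    risk = soup.find("p", class_="js-risks-text text-preline")
    print(risk.text if risk is not None else None)
finally:
    # Always close the browser, even if an error occurs.
    driver.quit()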

Output:

This is our second Kickstarter campaign. To date, we have delivered over 10,000 units to users in over 100 countries. We are using the same supply base that we established over the past two years through our first product. And we continue to improve our production process and technics process. We’ve also selected the best suppliers in this industry. We feel confident that we will ensure product quality and hit the delivery dates we are providing.

The Snapmaker 2.0 is a more complex product than the original model. There could be unexpected risks that can arise during the pilot production. The safety class of the laser module in Snapmaker 2.0 is class 4. We will pay special attention to get FDA/FCC/CE/RoHS certifications in late August, which are necessaries for delivering our products to the US, EU and most areas in the world. When we fulfilled our first Kickstarter campaign, we experienced the hard decision of delaying the mass-production to hold ourselves to high quality and user experience standards. We won’t ship unless our product can meet or exceed these high standards.