BeautifulSoup returns None on an Amazon element ID

Time: 2019-07-18 16:37:43

Tags: python web-scraping beautifulsoup

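I am trying to get the price of an Amazon product using the BeautifulSoup library, but when I run my code it returns None even though the ID exists on the page.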

import requests 
from bs4 import BeautifulSoup

URL = 'https://www.amazon.com/Silicone-Heat-Resistant-Spatulas-Non-stick-Stainless/dp/B01MR507HZ'

headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}

page = requests.get(URL, headers=headers)

soup = BeautifulSoup(page.content, 'html.parser')

price = soup.find(id="priceblock_ourprice")
print(price)

I expect the output to be $6.99, but the actual output is None.

1 answer:

Answer 0: (score: 2)

Change the parser to lxml (it must be installed separately, for example with pip install lxml) and BeautifulSoup will find your tag:

import requests
from bs4 import BeautifulSoup

URL = 'https://www.amazon.com/Silicone-Heat-Resistant-Spatulas-Non-stick-Stainless/dp/B01MR507HZ'

headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}

page = requests.get(URL, headers=headers)

soup = BeautifulSoup(page.content, 'lxml')

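price = soup.find(id="priceblock_ourprice")
print(price)
# find() returns None when the element is missing, so check before using .text
if price is not None:
    price_float = float(price.text.replace('$', ''))
    print(price_float)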

Prints:

<span class="a-size-medium a-color-price priceBlockBuyingPriceString" id="priceblock_ourprice">$6.99</span>
6.99
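
The conversion float(price.text.replace('$', '')) works for $6.99, but it would raise an error on a price with a thousands separator or on non-numeric text. A slightly more defensive sketch follows. Note that parse_price is a hypothetical helper name, not part of BeautifulSoup, and it assumes US-style prices (dollar sign, comma as thousands separator):

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_price(text: Optional[str]) -> Optional[Decimal]:
    """Convert a US-style price string such as '$1,299.00' to a Decimal.

    Returns None when the text is missing or cannot be parsed
    (hypothetical helper, for illustration only).
    """
    if text is None:
        return None
    # Drop surrounding whitespace, the currency symbol, and thousands separators.
    cleaned = text.strip().replace("$", "").replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


print(parse_price("$6.99"))
print(parse_price(" $1,299.00 "))
print(parse_price("Currently unavailable"))
```

With the code above, this could be called as parse_price(price.text) once price is known not to be None. Decimal is used here instead of float only to avoid binary rounding on currency values; float would work the same way for display purposes.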

Edit: for problems like this, it is often useful to run diagnose() (doc).
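
As a rough sketch of that suggestion: diagnose() lives in bs4.diagnose and prints a report of how each installed parser handles a given document. The example below uses a small inline HTML string as a stand-in for page.content (an assumption, so that it runs without a network request), and adds a hypothetical helper, find_id_with_each_parser, that repeats the same find(id=...) lookup once per parser:

```python
from bs4 import BeautifulSoup, FeatureNotFound
from bs4.diagnose import diagnose

# Stand-in for page.content; the real Amazon markup is much larger.
SAMPLE_HTML = '<div><span id="priceblock_ourprice">$6.99</span></div>'


def find_id_with_each_parser(html, element_id,
                             parsers=("html.parser", "lxml", "html5lib")):
    """Return {parser_name: element text or None} for the given id.

    Parsers that are not installed are skipped (hypothetical helper).
    """
    results = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(html, name)
        except FeatureNotFound:
            continue  # this parser library is not installed
        tag = soup.find(id=element_id)
        results[name] = tag.get_text() if tag is not None else None
    return results


print(find_id_with_each_parser(SAMPLE_HTML, "priceblock_ourprice"))

# Prints a report of what each available parser makes of the document.
diagnose(SAMPLE_HTML)
```

On a well-formed snippet like this one, every parser should agree. Running the same helper on the actual response body is what would show whether one parser drops the element while another keeps it.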