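I am trying to get the price of an Amazon product using the BeautifulSoup library, but when I run my code it returns None even though the ID exists on the page.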
import requests
from bs4 import BeautifulSoup
URL = 'https://www.amazon.com/Silicone-Heat-Resistant-Spatulas-Non-stick-Stainless/dp/B01MR507HZ'
headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
price = soup.find(id="priceblock_ourprice")
print(price)
I expect the output to be $6.99, but the actual output is None.
Answer 0 (score: 2)
Change the parser to lxml and BeautifulSoup will find your tag:
import requests
from bs4 import BeautifulSoup
URL = 'https://www.amazon.com/Silicone-Heat-Resistant-Spatulas-Non-stick-Stainless/dp/B01MR507HZ'
headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'lxml')
price = soup.find(id="priceblock_ourprice")
print(price)
price_float = float(price.text.replace('$', ''))
print(price_float)
Prints:
<span class="a-size-medium a-color-price priceBlockBuyingPriceString" id="priceblock_ourprice">$6.99</span>
6.99
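One caveat about the float conversion above: if soup.find() returns None (as it did in the question), price.text raises AttributeError. A minimal sketch of a more defensive conversion follows. The helper name parse_price is hypothetical and not part of BeautifulSoup, and it assumes US-style prices with a comma as the thousands separator.

```python
import re
from typing import Optional


def parse_price(text: Optional[str]) -> Optional[float]:
    """Return the first number found in a price string such as '$6.99'.

    Hypothetical helper for illustration. Returns None when the text is
    missing or contains no number, instead of raising an exception.
    """
    if text is None:
        return None
    # Digits with optional thousands separators and an optional decimal part.
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return float(match.group().replace(",", ""))
```

With the answer's code, this could be called as parse_price(price.text if price is not None else None), so a missing tag yields None rather than an exception.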
Edit: For problems like this, running diagnose() (see the BeautifulSoup documentation) is often useful.
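As a rough sketch of how that check might look: diagnose() lives in bs4.diagnose and prints a report showing how each installed parser handles the given markup. The helper find_id_with_each_parser and the SAMPLE string are made up for illustration. SAMPLE is a tiny stand-in, not the real Amazon page, so it will not reproduce the original failure; in practice you would pass page.content instead.

```python
from bs4 import BeautifulSoup, FeatureNotFound
from bs4.diagnose import diagnose

# Made-up stand-in markup; replace with page.content from requests.get().
SAMPLE = '<div><span id="priceblock_ourprice">$6.99</span></div>'


def find_id_with_each_parser(markup, element_id,
                             parsers=("html.parser", "lxml", "html5lib")):
    """Report, per parser, whether an element with the given id is found.

    Hypothetical helper. The value is True or False for installed parsers
    and None for parsers that are not installed.
    """
    results = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(markup, name)
        except FeatureNotFound:
            # The parser library (lxml or html5lib) is not installed.
            results[name] = None
            continue
        results[name] = soup.find(id=element_id) is not None
    return results


if __name__ == "__main__":
    print(find_id_with_each_parser(SAMPLE, "priceblock_ourprice"))
    # Prints a per-parser report of how the markup is interpreted.
    diagnose(SAMPLE)
```

If one parser reports False where another reports True for the real page, that points to a parser difference like the one described in the answer.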