I have this code, which is supposed to get the price of any item on Amazon. However, instead of the price, I get an empty list.
from lxml import html
import requests
page = requests.get('https://www.amazon.com/gp/product/B06XP634L1?pf_rd_p=183f5289-9dc0-416f-942e-e8f213ef368b&pf_rd_r=W4XQCYJ4N9VQGF8HDAH0')
doc = html.fromstring(page.content)
price = doc.xpath("//span[@id='priceblock_ourprice']")
print(price)
This used to work for me. I would appreciate any help. Thanks in advance.
Answer 0 (score: 0)
You need to add a User-Agent header to the request.
from lxml import html
import requests
headers = {'User-Agent': 'Mozilla/5.0'}
page = requests.get('https://www.amazon.com/gp/product/B06XP634L1?pf_rd_p=183f5289-9dc0-416f-942e-e8f213ef368b&pf_rd_r=W4XQCYJ4N9VQGF8HDAH0', headers = headers)
doc = html.fromstring(page.content)
price = doc.xpath("//span[@id='priceblock_ourprice']")
print(price[0].text)
or
price = doc.xpath("//span[@id='priceblock_ourprice']/text()")
print(price)
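Since the original symptom was an empty list, it may help to guard against the case where the span is missing instead of indexing blindly. The following is only a minimal sketch: `extract_price` is a hypothetical helper name, and the two HTML snippets are made-up stand-ins for a real response, not actual Amazon markup.

```python
from lxml import html

def extract_price(page_bytes):
    """Return the price text, or None if the price span is absent."""
    doc = html.fromstring(page_bytes)
    matches = doc.xpath("//span[@id='priceblock_ourprice']/text()")
    # xpath() returns a list; it is empty when nothing matched
    return matches[0].strip() if matches else None

# Hypothetical snippets standing in for real responses
product_page = b"<html><body><span id='priceblock_ourprice'> $19.99 </span></body></html>"
blocked_page = b"<html><body><p>Enter the characters you see below</p></body></html>"

print(extract_price(product_page))
print(extract_price(blocked_page))
```

If the helper returns None for a live page, printing `page.status_code` and the start of `page.text` should show whether the site served something other than the product page.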
With bs4:
from bs4 import BeautifulSoup as bs
import requests
headers = {'User-Agent': 'Mozilla/5.0'}
page = requests.get('https://www.amazon.com/gp/product/B06XP634L1?pf_rd_p=183f5289-9dc0-416f-942e-e8f213ef368b&pf_rd_r=W4XQCYJ4N9VQGF8HDAH0', headers = headers)
soup = bs(page.content, 'lxml')
price = soup.select_one("#attach-base-product-price")['value']
print(price)
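One caveat with the bs4 version: `select_one` returns None when nothing matches, so subscripting the result with `['value']` would raise a TypeError on a page that lacks the element. A small sketch of a safer lookup follows; `extract_price_bs4` is a hypothetical helper name, and the sample markup assumes the element is a hidden input carrying a `value` attribute, which is a guess based on the selector above.

```python
from bs4 import BeautifulSoup

def extract_price_bs4(page_bytes):
    """Return the 'value' attribute of the price element, or None."""
    soup = BeautifulSoup(page_bytes, 'lxml')
    tag = soup.select_one("#attach-base-product-price")
    # select_one returns None when the selector matches nothing
    return tag.get('value') if tag is not None else None

# Hypothetical markup, assumed for illustration only
sample = b"<html><body><input type='hidden' id='attach-base-product-price' value='19.99'></body></html>"
empty = b"<html><body></body></html>"

print(extract_price_bs4(sample))
print(extract_price_bs4(empty))
```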