Question

from bs4 import BeautifulSoup
import urllib.request

page = urllib.request.urlopen('https://www.applied.com/categories/bearings/accessories/adapter-sleeves/c/1580?q=%3Arelevance&page=1')
html = page.read()
soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid the bs4 warning

items = soup.find_all(class_= 'product product--list ')

for i in items[0:1]:
    product_name = i.find(class_="product__name").a.string.strip()
    print(product_name)
    product_url = i.find(class_="product__name").a['href']
    print(product_url)
    price = i.find(itemprop="price").string
    print(price)

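With the code above I am trying to get the price of every product on that page. But when I run it, the price variable prints None.

When I inspect the price in the browser's HTML source, it appears as plain text, just like the element I read product_name from.

Can someone guide me on how to get the product prices from that page?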

Answer 1

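The prices are loaded via Ajax (https://www.applied.com/getprices) after the page has loaded, which is why they are not present in the downloaded HTML.

To get a product's price, send a POST request to https://www.applied.com/getprices with the following parameters: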

{
  "productCodes": "100731658",
  "page": "PLP",
  "productCode": "100731658",
  "CSRFToken": "172c7073-742f-4d7d-9c97-358e0d9e631e"
}
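
As a rough sketch, the request above could be built with urllib like this. I'm assuming the endpoint accepts ordinary form-encoded fields (it may expect JSON instead), and the helper names build_price_request and fetch_price_response are made up for illustration. The response format isn't shown here, so the sketch just returns the raw text.

```python
import urllib.parse
import urllib.request

PRICE_URL = 'https://www.applied.com/getprices'


def build_price_request(product_code, csrf_token):
    """Build (but do not send) the POST request for one product's price."""
    # Field names are copied from the parameters listed above.
    fields = {
        'productCodes': product_code,
        'page': 'PLP',
        'productCode': product_code,
        'CSRFToken': csrf_token,
    }
    # Assumption: the server accepts form-encoded data.
    body = urllib.parse.urlencode(fields).encode('utf-8')
    return urllib.request.Request(PRICE_URL, data=body, method='POST')


def fetch_price_response(product_code, csrf_token):
    """Send the request and return the raw response text, unparsed."""
    request = build_price_request(product_code, csrf_token)
    with urllib.request.urlopen(request) as response:
        return response.read().decode('utf-8', errors='replace')
```

Note that the CSRFToken value shown above came from one browser session and has most likely expired. You would probably need a fresh token, and possibly the matching session cookies, before calling something like fetch_price_response('100731658', token).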
