I'm using BeautifulSoup to parse a website and its products. I wrote a script that returns each item's name, price, and exact URL.
My problem is with these lines:
containers = soup.find_all("div", {"class": "ProductList-grid clear"})
print(len(containers))
# Output is ALWAYS 1
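As you can see in the screenshot, only 1 is printed to the console when it should print 4: http://prntscr.com/kbq6l3 I'm not sure why it finds only the first product and not the other 3.
Here is my script: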
from bs4 import BeautifulSoup as Bs
import requests

website = "https://www.revengeofficial.com"
session = requests.session()
urls_and_prices = {}


def get_items():
    response = session.get(website + "/webstore")
    soup = Bs(response.text, "html.parser")
    containers = soup.find_all("div", {"class": "ProductList-grid clear"})
    print(len(containers))
    for div in containers:
        item_name = div.a["href"]
        get_price(website + item_name)


def get_price(item_url):
    response = session.get(item_url)
    soup = Bs(response.text, "html.parser")
    container = soup.find_all("section", {"class": "ProductItem-details"})
    for element in container:
        if element.div is not None:
            name = element.h1.text
            price = element.div.span.text
            urls_and_prices[item_url] = price


def print_item_info():
    if len(urls_and_prices) == 0:
        print("Could not find any items")
        return
    for key, value in urls_and_prices.items():
        name = key.split("/")[4]
        print("Item name: " + name)
        print("Price: " + value)
        print("Link: " + key)


get_items()
print_item_info()
Thanks for any help.
Edit: I would also appreciate any critique of my code. I'm new to Python and want to improve as much as I can.
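Answer 0 (score: 0)
You are selecting the whole grid, and there is only 1 grid on the page, so find_all returns a single element. To select every product, use ProductList-item instead: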
soup.find_all("div", {"class": "ProductList-item"})
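To make the difference concrete, here is a minimal sketch that runs against a hand-written HTML snippet rather than the live site. The markup in SAMPLE_HTML is an assumption: it only mimics the class names mentioned in this thread, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one grid wrapper that contains four product items.
# The class names come from this thread; the structure itself is assumed.
SAMPLE_HTML = """
<div class="ProductList-grid clear">
  <div class="ProductList-item"><a class="ProductList-item-link" href="/webstore/item-1">Item 1</a></div>
  <div class="ProductList-item"><a class="ProductList-item-link" href="/webstore/item-2">Item 2</a></div>
  <div class="ProductList-item"><a class="ProductList-item-link" href="/webstore/item-3">Item 3</a></div>
  <div class="ProductList-item"><a class="ProductList-item-link" href="/webstore/item-4">Item 4</a></div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Matches the single wrapper element, so the loop body would run only once.
grids = soup.find_all("div", {"class": "ProductList-grid clear"})

# Matches each product element inside the wrapper.
items = soup.find_all("div", {"class": "ProductList-item"})

print(len(grids), len(items))
```

With this sample the first query matches only the wrapper, which is why div.a["href"] in the question only ever sees the first link.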
Answer 1 (score: 0)
This will find all 4 items:
containers = soup.find_all("a", {"class": "ProductList-item-link"})
print(len(containers))
for a in containers:
    item_name = a["href"]
    get_price(website + item_name)
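Since you asked for critique: one possible refinement is to collect the product URLs in a small function and build them with urljoin instead of string concatenation. This is only a sketch. The helper name extract_product_urls and the SAMPLE_HTML markup are made up for illustration, and it assumes the hrefs on the real page are site-relative paths.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.revengeofficial.com"

# Hypothetical markup that mimics the class names from this thread.
SAMPLE_HTML = """
<div class="ProductList-grid clear">
  <a class="ProductList-item-link" href="/webstore/item-1">Item 1</a>
  <a class="ProductList-item-link" href="/webstore/item-2">Item 2</a>
  <a class="ProductList-item-link">Link without an href</a>
</div>
"""


def extract_product_urls(html, base_url):
    """Return absolute URLs for every product link that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    links = soup.find_all("a", {"class": "ProductList-item-link"})
    # urljoin handles leading slashes; anchors without an href are skipped.
    return [urljoin(base_url, a["href"]) for a in links if a.has_attr("href")]


product_urls = extract_product_urls(SAMPLE_HTML, BASE_URL)
for url in product_urls:
    print(url)
```

Returning a list keeps the parsing step separate from the price requests, so each part can be checked on its own.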