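Apologies for the newbie question, but I have only just started my Python journey and am learning web scraping.
I have written some code that scrapes a fashion website and returns some product information. What I really want to do is scrape the main category page and extract all the product names and prices. I think I need a for loop, and I have tried various versions found on this site, but I can't get any of them to work.
I want to extract the product name and price of every item on the page so that I can export them afterwards. The code below returns the first item on the page fine, but I'm not sure how to add a loop to get the rest.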
import requests
from bs4 import BeautifulSoup
url = 'https://www.riverisland.com/c/men/seasonal-offers?icid=mhp/winter-treats/m/seasonal-offers/cat'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
data_item = []
for item in name_box, price_box:
    data_item.append()
name_box = soup.find('div', attrs={'class':'product__title ui-body-text'})
price_box = soup.find('div', attrs={'class':'product-price__headline-product-price__headline--sale'})
name = name_box.text.strip()
price = price_box.text.strip()
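Answer 0 (score: 1)
You need to get all the products on the page. find() only returns the first match, so use find_all() to get every product on the page. You can then iterate over the results and print them.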
import requests
from bs4 import BeautifulSoup
url = 'https://www.riverisland.com/c/men/seasonal-offers?icid=mhp/winter-treats/m/seasonal-offers/cat'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
name_box = soup.find_all('div', attrs={'class':'product__title ui-body-text'})
price_box = soup.find_all('div', attrs={'class':'product-price__headline product-price__headline--sale'})
# Pair each name element with the price element at the same position
for name, price in zip(name_box, price_box):
    name_proper = name.text.strip()
    price_proper = price.text.strip()
    print(name_proper, '-', price_proper)
Output:
Bellfield navy three-in-one mac coat - £50.00
Black rib muscle fit short sleeve T-shirt - £12.00
Criminal Damage black colour block zip jacket - £50.00
Jack & Jones Premium green puffer gilet - £30.00
Jack & Jones red faux fur bomber jacket - £50.00
Jack & Jones black parka jacket - £70.00
Light grey ribbed muscle fit T-shirt - £12.00
Navy satin velour panel slim fit T-shirt - £12.00
Pepe Jeans light blue denim jacket - £90.00
Navy slim fit tape crew neck T-shirt - £12.00
Superdry green camo parka jacket - £90.00
Superdry green double zip Fuji padded jacket - £60.00
Superdry green hooded parka jacket - £80.00
Superdry navy hooded quilted jacket - £80.00
Superdry navy triple zip funnel neck jacket - £60.00
Superdry red zip funnel neck puffer jacket - £60.00
Superdry yellow lightweight hooded jacket - £70.00
Superdry black camo funnel neck coat - £70.00
Superdry black double zip Fuji padded jacket - £60.00
Superdry black funnel neck puffer jacket - £60.00
Superdry blue lightweight hooded jacket - £70.00
Superdry green army jacket - £60.00
Only & Sons black hooded puffer jacket - £40.00
Pepe Jeans dark blue denim jacket - £90.00
Red waffle slim fit short sleeve T-shirt - £12.00
Selected Homme black stripe long sleeve top - £50.00
White waffle slim fit short sleeve T-shirt - £12.00
Big and Tall R96 burgundy muscle fit T-shirt - £12.00
Black Dean straight leg jeans - £20.00
Black R96 muscle fit long sleeve T-shirt - £12.00
Black R96 pique muscle fit long sleeve shirt - £15.00
Black ribbed crew neck long sleeve top - £12.00
Black velour R96 slim fit piped joggers - £20.00
Blue Dylan slim fit distressed jeans - £25.00
Dark blue straight leg jeans - £20.00
Dark blue straight leg jeans - £20.00
Dark blue straight leg manhattan jeans - £20.00
Dark blue ripped super skinny jeans - £25.00
Dark blue Dean straight leg jeans - £20.00
Dark blue Dylan slim fit jeans - £25.00
Dark grey R96 muscle fit grandad shirt - £15.00
Burgundy slim fit colour block sleeve hoodie - £20.00
Burgundy R96 muscle fit grandad shirt - £15.00
Dark red R95 muscle fit raglan T-shirt - £12.00
Dark red R96 muscle fit long sleeve T-shirt - £12.00
Dark red wasp embroidered Oxford shirt - £15.00
Green poplin muscle fit long sleeve shirt - £15.00
Grey check button down long sleeve shirt - £20.00
Light blue long sleeve flannel shirt - £20.00
R96 black velour slim fit hoodie - £20.00
Pink R96 muscle fit button-down shirt - £15.00
White ribbed crew neck long sleeve top - £12.00
Khaki slim fit tape sleeve hoodie - £20.00
Stone pique muscle fit long sleeve shirt - £15.00
Black lace up chukka boot - £25.00
Black 'Prolific' padded puffer coat - £45.00
Black muscle fit rib crew neck jumper - £20.00
Black hooded borg lined jacket - £45.00
Black longline faux fur hooded parka jacket - £45.00
Black zip front funnel neck puffer jacket - £25.00
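Since the question mentions exporting the results afterwards, here is a minimal sketch of one way to do that with the standard csv module. The helper name rows_to_csv and the column names are assumptions for illustration, and the two sample rows are copied from the output above; in the real script the rows would be collected inside the loop instead of printed.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize (name, price) pairs to CSV text with a header row.

    Hypothetical helper, not part of the original answer.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "price"])  # assumed column names
    writer.writerows(rows)
    return buffer.getvalue()


# Sample rows taken from the output above. In the scraping loop you would
# build this list with rows.append((name_proper, price_proper)).
rows = [
    ("Bellfield navy three-in-one mac coat", "£50.00"),
    ("Black rib muscle fit short sleeve T-shirt", "£12.00"),
]

csv_text = rows_to_csv(rows)
print(csv_text)
```

To write a file instead of a string, the same csv.writer calls can be pointed at a file opened with open(path, "w", newline="", encoding="utf-8").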
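Answer 1 (score: 1)
OK. You made a small mistake: you are scraping a single product name with find(). To get all products, use find_all() instead.
Another thing: the price element you are scraping actually has two classes. In a CSS selector they should be joined with . rather than - (in the class string passed to attrs they are separated by a space).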
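To illustrate the point about the two classes, here is a small sketch that runs against inline HTML rather than the live site. The markup is a made-up stand-in that only reuses the class names from the question; the real page structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one sale price (two classes) and one regular price.
html = """
<div class="product-price__headline product-price__headline--sale">£50.00</div>
<div class="product-price__headline">£70.00</div>
"""
soup = BeautifulSoup(html, "html.parser")

# CSS selector: chain both classes with '.'
by_selector = soup.select(
    "div.product-price__headline.product-price__headline--sale"
)

# attrs: pass the class attribute exactly as written, separated by a space
by_string = soup.find_all(
    "div",
    attrs={"class": "product-price__headline product-price__headline--sale"},
)

# Joining the two class names with '-' names a class that does not exist
by_hyphen = soup.find_all(
    "div",
    attrs={"class": "product-price__headline-product-price__headline--sale"},
)

print(len(by_selector), len(by_string), len(by_hyphen))
```

Note that the space-separated string form matches the class attribute value exactly, so it is sensitive to the order of the classes; the select() form is not.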
Answer 2 (score: 0)
I will try to find a solution for you, but for now try using the
soup.find_all('div', attrs={'your attributes'})
function.
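As a rough sketch of how the placeholder in that call might be filled in, the snippet below contrasts find() and find_all() on inline HTML. The markup is hypothetical and only borrows the title class from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, reusing the title class from the question.
html = """
<div class="product__title ui-body-text"> Coat </div>
<div class="product__title ui-body-text"> Jeans </div>
"""
soup = BeautifulSoup(html, "html.parser")
attrs = {"class": "product__title ui-body-text"}

# find() returns only the first matching Tag (or None if nothing matches)
first = soup.find("div", attrs=attrs)

# find_all() returns a list-like ResultSet with every matching Tag
all_titles = soup.find_all("div", attrs=attrs)

first_name = first.text.strip()
names = [tag.text.strip() for tag in all_titles]
print(first_name)
print(names)
```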