I am trying to extract the image URL of every product on this page, but I get the following error:
Traceback (most recent call last):
  File "D:\Documentos\ZalandoDiscountGen-main\Zalando discout gen\scrapersnipes.py", line 98, in <module>
    scraper()
  File "D:\Documentos\ZalandoDiscountGen-main\Zalando discout gen\scrapersnipes.py", line 92, in scraper
    imagen = producto.find("img", {"class": "b-dynamic_image_content b-product-tile-image ls-is-cached h-lazyloaded"})['src']
TypeError: 'NoneType' object is not subscriptable
The code I tried:
from bs4 import BeautifulSoup
from dhooks import Webhook, Embed
import requests
import pandas as pd
import time, datetime
import random
import numpy as np
import os
headers = {
    'authority': 'www.snipes.es',
    'cache-control': 'max-age=0',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:56.0) Gecko/20100101 Firefox/56.0',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'document',
    'accept-language': 'es-ES,es;q=0.9,en;q=0.8,de;q=0.7,eo;q=0.6',
    'dnt': '1',
}
def scraper():
    response = requests.get("https://www.snipes.es/c/shoes?q=jordan%2B1&openCategory=true&sz=all&srule=New", headers=headers)
    soup = BeautifulSoup(response.content, 'html.parser')
    listadoproductos = soup.find_all('div', {'class': 'b-product-grid-tile js-tile-container'})
    for producto in listadoproductos:
        marca = producto.find("span", {"class":"b-product-tile-brand b-product-tile-text js-product-tile-link"}).text
        titulo = producto.find("span", {"class":"b-product-tile-link js-product-tile-link"}).text
        precio = producto.find("span", {"class":"b-product-tile-price-item"}).text
        imagen = producto.find("img", {"class": "b-dynamic_image_content b-product-tile-image ls-is-cached h-lazyloaded"})['src']
        imagen2 = "https://www.snipes.es" + str(imagen)
        print(marca.strip(), titulo.strip(), precio.strip(), imagen2)

scraper()
I can't figure out what is going wrong and would appreciate a hint on where to start.
Answer 0 (score: 0)
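You are trying to find an <img> by matching several classes at once. That is fragile, because find() with a multi-class string only matches when the class attribute is exactly that string, and it is not necessary here.
I also don't think src is what you want, because it is a blank PNG; what you probably want is data-src.
Change the line where you look up the image to the following: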
imagen = producto.select_one('div.b-product-tile-image-container img')['data-src']
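To see why the original lookup returns None, here is a minimal sketch on a made-up product tile. The markup is a hypothetical simplification, not the real page, and it assumes the lazy-loading classes (ls-is-cached, h-lazyloaded) are only added by JavaScript in the browser, so they are missing from the HTML that requests downloads.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified tile; the real page's markup may differ.
html = """
<div class="b-product-grid-tile js-tile-container">
  <div class="b-product-tile-image-container">
    <img class="b-dynamic_image_content b-product-tile-image"
         src="data:image/png;base64,AAAA"
         data-src="https://example.com/img/1.jpg">
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
tile = soup.find("div", {"class": "b-product-grid-tile js-tile-container"})

# A multi-class string must equal the whole class attribute.
# No <img> here carries all four classes, so find() returns None,
# and None['src'] is what raises the TypeError.
missing = tile.find(
    "img",
    {"class": "b-dynamic_image_content b-product-tile-image ls-is-cached h-lazyloaded"},
)
print(missing)

# A CSS selector anchored on the container does not depend on those classes.
img = tile.select_one("div.b-product-tile-image-container img")
print(img["data-src"])
```

The selector only relies on the container class, so it keeps working even if the image's own class list changes.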
Also drop imagen2, you don't need it:
for producto in listadoproductos:
    marca = producto.find("span", {"class":"b-product-tile-brand b-product-tile-text js-product-tile-link"}).text
    titulo = producto.find("span", {"class":"b-product-tile-link js-product-tile-link"}).text
    precio = producto.find("span", {"class":"b-product-tile-price-item"}).text
    imagen = producto.select_one('div.b-product-tile-image-container img')['data-src']
    print(marca.strip(), titulo.strip(), precio.strip(), imagen)
Output
JORDAN WMNS Zoom '92 149,99 € https://www.snipes.es/dw/image/v2/BDCB_PRD/on/demandware.static/-/Sites-snse-master-eu/default/dw1986ce4d/1899597_P.jpg?sw=300&sh=300&sm=fit&sfrm=png
JORDAN Air Jordan 1 Mid (PS) 64,99 € https://www.snipes.es/dw/image/v2/BDCB_PRD/on/demandware.static/-/Sites-snse-master-eu/default/dwe2e88c0b/1930682_P.jpg?sw=300&sh=300&sm=fit&sfrm=png
JORDAN Air Jordan 11 Crib Bootie 59,99 € https://www.snipes.es/dw/image/v2/BDCB_PRD/on/demandware.static/-/Sites-snse-master-eu/default/dw2dd01aa4/1883653_P.jpg?sw=300&sh=300&sm=fit&sfrm=png
JORDAN Jordan Air Max 200 129,99 € https://www.snipes.es/dw/image/v2/BDCB_PRD/on/demandware.static/-/Sites-snse-master-eu/default/dw21a7bda8/1829411_P.jpg?sw=300&sh=300&sm=fit&sfrm=png
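If some tiles have no image at all, indexing the result directly would raise the same TypeError again. A possible defensive variant is sketched below; the helper name image_url, the BASE_URL constant and the sample markup are all hypothetical, made up for illustration. It returns None instead of failing, and it uses urljoin so that a data-src value comes out absolute whether the page gives it as a relative path or as a full URL.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.snipes.es"  # site root, as used in the question's code


def image_url(tile, base_url=BASE_URL):
    """Return the tile's image URL, or None if it has no usable <img>."""
    img = tile.select_one("div.b-product-tile-image-container img")
    if img is None:
        return None
    raw = img.get("data-src")  # .get() returns None instead of raising
    if not raw:
        return None
    # urljoin keeps absolute URLs unchanged and resolves relative paths.
    return urljoin(base_url, raw)


# Hypothetical markup for illustration only.
sample = BeautifulSoup(
    """
<div class="tile"><div class="b-product-tile-image-container">
  <img data-src="/dw/image/a.jpg"></div></div>
<div class="tile"><div class="b-product-tile-image-container">
  <img data-src="https://www.snipes.es/dw/image/b.jpg"></div></div>
<div class="tile"><span>no image here</span></div>
""",
    "html.parser",
)

urls = [image_url(t) for t in sample.select("div.tile")]
print(urls)
```

In the loop from the answer, this would replace the direct ['data-src'] lookup with imagen = image_url(producto), skipping or logging tiles where it returns None.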