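I'm new to Python and recently built a crawler for Amazon. My problem is that I never get all the items. I have a list of about 1,300 product links, but each time I run the crawler I get a different number of crawled items, somewhere between 700 and 1,100. What am I doing wrong that I don't get the information for all 1,300 items?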
class AmazoncrawlerSpider(scrapy.Spider):
    name = 'amazonCrawler'
    with open("/Users/username/PycharmProjects/whiskywebsite/src/csv/product_links.csv", "r") as f:
        start_urls = [url.strip() for url in f.readlines()]

    def parse(self, response):
        all_important_information = response.css('#dp')
        for information in all_important_information:
            product_name = information.css('#productTitle').css('::text').extract()
            product_name = [name.strip() for name in product_name]
            product_price = information.css('#price_inside_buybox').css('::text').extract()
            product_price = [price.strip() for price in product_price]
            #product_asin = information.css('.col2 tr:nth-child(1) .value').css('::text').extract()
            #product_asin = [asin.strip() for asin in product_asin]
            #product_rating = information.css('.a-icon-alt').css('::text').extract()
            #product_rating = [rating.strip() for rating in product_rating]
            #product_volume = information.css('.comparison_other_attribute_row:nth-child(11) .comparison_baseitem_column .a-color-base').css('::text').extract()
            #product_volume = [volume.strip() for volume in product_volume]
            #product_country = information.css('.comparison_other_attribute_row:nth-child(10) .comparison_baseitem_column .a-color-base').css('::text').extract()
            #product_country = [country.strip() for country in product_country]
            product_picture = information.css('#imgTagWrapperId').css('img::attr(data-old-hires)').extract()
            product_picture = [picture.strip() for picture in product_picture]
            result = zip(product_name, product_price, product_picture)
            for name, price, picture in result:
                items = AmazoncrawlingItem()
                items['Name'] = name
                items['Preis'] = price
                #items['Asin'] = asin
                items['Time'] = now
                #items['Rating'] = rating
                #items['Volume'] = volume
                #items['Country'] = country
                items['Picture'] = picture
                items['Website'] = website
                yield items
[scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.de/...link ..> (referer: None)
Does this have something to do with (referer: None)? If so, what does that mean?
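
One thing that may be worth checking regarding the fluctuating item count: in the parse method above, the three extracted lists are combined with zip(), and zip() stops at the shortest input. If any one selector matches nothing on a page (for example a product without a buy-box price), that page would yield no item at all, without any error. The sketch below illustrates this with made-up values; the sample data and the helper first_or_none are hypothetical and not part of the original spider, and this is only one possible cause, not a confirmed diagnosis.

```python
def first_or_none(values):
    """Return the first non-empty stripped string, or None if there is none."""
    for value in values:
        cleaned = value.strip()
        if cleaned:
            return cleaned
    return None


# Hypothetical selector results for one product page where the price is missing.
product_name = ["  Example Whisky 0.7l  "]
product_price = []  # assumed: '#price_inside_buybox' matched nothing on this page
product_picture = ["https://example.com/picture.jpg"]

# zip() stops at the shortest input, so this page produces no rows at all.
zipped_rows = list(zip(product_name, product_price, product_picture))

# Taking the first match per field keeps the row and makes the gap visible.
row = {
    "Name": first_or_none(product_name),
    "Preis": first_or_none(product_price),
    "Picture": first_or_none(product_picture),
}

print(len(zipped_rows))
print(row)
```

Building one item per page from the first match of each field, instead of zipping lists, would make pages with a missing field show up as items with an empty value, which should make it easier to see how many of the 1,300 pages were actually parsed.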