Print a blank in web scraping

Time: 2018-08-09 04:53:13

Tags: python scrapy

Let's say I have the following script:

# -*- coding: utf-8 -*-
import scrapy


class StrongSpider(scrapy.Spider):
    name = 'Strong'
    allowed_domains = ['https://www.strongflex.de/en/4-acura-integra-93-01/']
    start_urls = ['https://www.strongflex.de/en/4-acura-integra-93-01/']

def parse(self,response):
    product_container = response.css("div.product-container")
    prodname = product_container.css("a.product-name::text").extract_first().strip()
    price = product_container.css("span.price::text").extract_first().strip()
    description = product_container.css("p.product-desc::text").extract_first().strip()
    img = product_container.css("img.replace-2x.img-responsive::attr(src)").extract_first()

    for item in zip(prodname,price,description,img):
        scraped_info = {
        'prodname' : prodname[0],
        'price' : price[1],
        'description' : description[2],
        'img' : img[3],
    }
        yield scraped_info

In the loop, I want to say: if item[1] does not exist, print a blank. I don't actually know how to do that... If not all products have a price, my script skips them.

2 answers:

Answer 0 (score: 0)

If you want to emit an empty value instead of skipping the item, you can do this:

for item in zip(prodname,price,description,img):
    scraped_info = {
        'prodname' : prodname[0],
        # fall back to an empty string when price is too short to index
        'price' : price[1] if len(price)>=2 else '',
        'description' : description[2],
        'img' : img[3],
    }
    yield scraped_info
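As a complementary sketch (not part of the original answer): in Scrapy, extract_first() returns None when the selector matches nothing, so calling .strip() on the result fails for a product without a price. One way to get a blank instead is to normalise each extracted value per product. The helper names text_or_blank and build_item are hypothetical, and the CSS selectors are simply copied from the question.

```python
def text_or_blank(value):
    """Return the stripped text, or '' when the selector matched nothing (None)."""
    return value.strip() if value else ''


def build_item(product):
    """Build one item dict from a single product selector.

    `product` is assumed to behave like a Scrapy selector for one
    div.product-container element; it only needs a .css() method whose
    result offers extract_first().
    """
    def first(query):
        # extract_first() returns None when nothing matches the query
        return text_or_blank(product.css(query).extract_first())

    return {
        'prodname': first("a.product-name::text"),
        'price': first("span.price::text"),  # '' when the product has no price
        'description': first("p.product-desc::text"),
        'img': first("img.replace-2x.img-responsive::attr(src)"),
    }


# Hypothetical use inside the spider's parse() method:
#     for product in response.css("div.product-container"):
#         yield build_item(product)
```

Looping over the product containers one by one keeps each name paired with its own price, so a missing price leaves a blank for that product only.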

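Answer 1 (score: 0)

It is unusual to call extract_first() and make only a single page request. Maybe you should post more of your spider code? More importantly, can you show a debug print of scraped_info? The output makes little sense: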


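{'description': 'f', 'img': 'r', 'price': '0', 'prodname': '0'}

Each value is a single character because prodname[0], price[1], and so on index into strings. Instead, you have to stop truncating the items with [index] lookups, so that the output looks like this: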

{'description': 'ref: 081097B\r\nMaterial: POLYURETHANE (PUR/PU)\r\nHardness 80ShA\r\nPcs/prod: 1\r\nRequired/car: 2\r\nTo every product we add grease!', 'img': 'acura-integra-93-01_files/front-anti-roll-bar-bush.jpg', 'price': '10,01 €', 'prodname': '081097B: Front anti roll bar bush'}
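The truncation can be reproduced without Scrapy. The sketch below reuses the values from the output above (the description is shortened here for brevity) and shows that indexing a string returns one character, while using the whole string keeps the field intact.

```python
# Values copied from the output above; the description is shortened.
prodname = '081097B: Front anti roll bar bush'
price = '10,01 €'
description = 'ref: 081097B\r\nMaterial: POLYURETHANE (PUR/PU)'
img = 'acura-integra-93-01_files/front-anti-roll-bar-bush.jpg'

# Indexing a str picks out a single character, not a whole field.
truncated = {
    'prodname': prodname[0],
    'price': price[1],
    'description': description[2],
    'img': img[3],
}

# Using the strings directly keeps the complete values.
full = {
    'prodname': prodname,
    'price': price,
    'description': description,
    'img': img,
}

print(truncated)
print(full)
```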

Nota bene: the question looks like a test task from an employer, doesn't it? If that is the case, please reply later. Good luck!
