CSV files are empty even though items are scraped from the website

Date: 2019-12-18 07:31:09

Tags: python scrapy scrapy-pipeline

I need to dump the scraped items into two different CSV files. The data is scraped successfully, but the CSV files come out empty. Can anyone help with this?

Below are the code of the pipeline.py file and the console logs:

[Image: pipeline.py code]

[Image: console logs]

Code of pipeline.py:

# -*- coding: utf-8 -*-



# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://docs.scrapy.org/en/latest/topics/item-pipeline.html
from scrapy.exporters import CsvItemExporter
from scrapy import signals
from pydispatch import dispatcher



def item_type(item):
    # The CSV file names are used (imported) from the scrapy spider.
    return type(item)



class SecfilingsPipeline(object):
    fileNamesCsv = ['NonDerivatives','Derivatives']

    def __init__(self):
        self.files = {}
        self.exporters = {}
        dispatcher.connect(self.spider_opened, signal=signals.spider_opened)
        dispatcher.connect(self.spider_closed, signal=signals.spider_closed)

    def spider_opened(self, spider):
        # Open one binary file per CSV name and attach an exporter to each.
        self.files = {name: open(name + '.csv', 'wb') for name in self.fileNamesCsv}
        for name in self.fileNamesCsv:
            self.exporters[name] = CsvItemExporter(self.files[name])

            if name == "NonDerivatives":
                print("File Name 1" + name)
                self.exporters[name].fields_to_export = ['TitleofSecurity','TransactionDate','TransactionCode','Amount','SecuritiesAcquirednDisposed','AmountOfSecurityOwned','OwnershipForm']
                self.exporters[name].start_exporting()

            if name == "Derivatives":
                print("File Name 2" + name)
                self.exporters[name].fields_to_export = ['TitleofDerivativeSecurity','TransactionDate','TransactionCode','SecuritiesAcquired','SecuritiesDisposed','TitleOfSecurity','Amount','AmountOfSecurityOwned','OwnershipForm']
                self.exporters[name].start_exporting()



    def spider_closed(self, spider):
        # Finish every exporter, then close the underlying files.
        for exporter in self.exporters.values():
            exporter.finish_exporting()
        for f in self.files.values():
            f.close()


    def process_item(self, item, spider):
        typesItem = item_type(item)
        if typesItem in set(self.fileNamesCsv):
            self.exporters[typesItem].export_item(item)
        return item
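Judging only from the code shown, one likely cause is that `item_type` returns the item's class object, while `fileNamesCsv` holds strings. If so, the membership test in `process_item` never succeeds, `export_item` is never called, and the files stay empty. The sketch below reproduces that comparison in isolation. The two classes are hypothetical stand-ins for the spider's item classes, which the question does not show, and `item_type_name` is an illustrative helper, not part of the original code.

```python
# Hypothetical stand-ins for the spider's item classes (not shown in the
# question); real items would subclass scrapy.Item instead of dict.
class NonDerivatives(dict):
    pass


class Derivatives(dict):
    pass


FILE_NAMES_CSV = ['NonDerivatives', 'Derivatives']


def item_type(item):
    # As in the question: returns the class object itself.
    return type(item)


def item_type_name(item):
    # Illustrative alternative: returns the class name as a string.
    return type(item).__name__


item = NonDerivatives(TransactionCode='P')

# A class object is never equal to a string, so this lookup fails.
matches_original = item_type(item) in set(FILE_NAMES_CSV)

# The class name is a string and can match an entry in the list.
matches_by_name = item_type_name(item) in set(FILE_NAMES_CSV)

print(matches_original, matches_by_name)
```

If this is indeed the cause, returning `type(item).__name__` from `item_type` may be enough, assuming the item classes are actually named `NonDerivatives` and `Derivatives`. If they are named differently, the names would need to be mapped to the entries in `fileNamesCsv`.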

I have also enabled the pipeline in settings.py.
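For reference, a pipeline is usually enabled with an `ITEM_PIPELINES` entry like the fragment below. The package name `secfilings` is an assumption made for illustration, since the question does not state the project name.

```python
# settings.py (fragment)
# NOTE: "secfilings" is an assumed project package name, not taken from
# the question. The dotted path must match the module that actually
# defines SecfilingsPipeline.
ITEM_PIPELINES = {
    "secfilings.pipelines.SecfilingsPipeline": 300,
}
```

The integer sets the order in which pipelines run (lower values run first, conventionally in the 0 to 1000 range). Scrapy's project template names the module `pipelines.py`. If the file is really called `pipeline.py`, as written in the question, the path would need to use `pipeline` instead.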

0 answers:

No answers