Scrapy CSV feed spider takes a long time

Time: 2018-09-29 15:26:24

Tags: web-scraping scrapy-spider

I have written a CSVFeedSpider, but whenever the file is large it takes a very long time. Can someone tell me what I need to do to optimize it? Thanks.

from scrapy.spiders import CSVFeedSpider


class Asos(CSVFeedSpider):
    name = 'asos'

    start_urls = ['https://productdata.awin.com/datafeed/download/apikey/a9c759ce97541f939fc788547cc5d9d9/'
                  'language/en/fid/23139/columns/aw_deep_link,product_name,aw_product_id,merchant_product_id,'
                  'merchant_image_url,description,merchant_category,search_price,merchant_name,merchant_id,category_name,'
                  'category_id,aw_image_url,currency,store_price,delivery_cost,merchant_deep_link,language,last_updated,'
                  'display_price,data_feed_id/format/csv/delimiter/%2C/compression/gzip/adultcontent/1/']

    delimiter = ','
    quotechar = "\""

    headers = ["aw_deep_link","product_name","aw_product_id","merchant_product_id","merchant_image_url",
               "description","merchant_category","search_price","merchant_name","merchant_id","category_name",
               "category_id","aw_image_url","currency","store_price","delivery_cost","merchant_deep_link","language",
               "last_updated","display_price","data_feed_id"]

    def parse_row(self, response, row):
        # Printing every row to stdout is slow on large feeds.
        print(row["product_name"], row["aw_deep_link"])
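Two things stand out in the spider above: it prints every row to stdout, and the feed itself is a gzip-compressed CSV that could be streamed row by row rather than handled all at once. A minimal stdlib sketch of streaming a gzipped CSV with constant memory (the function name `iter_rows` and the tiny two-column sample are illustrative, not part of the original spider):

```python
import csv
import gzip
import io

def iter_rows(raw, headers):
    # Decompress and parse lazily, yielding one row at a time,
    # so memory use stays flat no matter how large the feed is.
    with gzip.open(io.BytesIO(raw), mode="rt", encoding="utf-8", newline="") as fh:
        reader = csv.DictReader(fh, fieldnames=headers)
        for row in reader:
            yield row

# Illustrative sample standing in for the real downloaded feed.
sample = gzip.compress(b"p1,http://example.com/1\np2,http://example.com/2\n")
rows = list(iter_rows(sample, ["product_name", "aw_deep_link"]))
```

Inside Scrapy itself, the usual pattern is to `yield` a dict from `parse_row` instead of calling `print()`, and let Scrapy's feed exports write the output, which is typically much faster per row.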

0 Answers:

There are no answers yet.