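Scrapy running order: prints before it gets assigned

Time: 2017-06-22 13:32:09

Tags: python scrapy web-crawler

The problem is the execution order: the code at the end of the class basically runs first.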

import scrapy

class uppspider(scrapy.Spider):
    start_urls = ['something.com']

    def parse(self, response):
        return scrapy.FormRequest.from_response(
            response,
            formdata={'login': '', 'Password': ''},
            callback=self.after_login
        )

    def after_login(self, response):
        # check login succeed before going on
        return scrapy.Request(url="", callback=self.ret)

    def ret(self, response):
        # scraping
        yield scrapy.Request(callback=self.parse_tastypage)

    def parse_tastypage(self, response):
        item = uppItem()
        er = response.status
        self = list()
        self.append(er)
        # scraping
        yield item

    print("whatever i print here, prints before the spider")

    mylist = list()
    parse_tastypage(mylist, 0)
    print(mylist)

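So if I want to print a variable that is assigned inside a function, it doesn't work, because it gets printed before the function has assigned it.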
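The ordering described here can be reproduced without Scrapy. The sketch below is plain Python with hypothetical names (`Demo`, `callback`, `events`); it only illustrates that statements written directly in a class body run while the class is being defined, whereas a method body runs only when something calls it later.

```python
events = []

class Demo:
    def callback(self):
        # Runs only when something calls it on an instance.
        events.append("callback ran")

    # This statement belongs to the class body, so it executes while the
    # class is being defined, before any instance or callback exists.
    events.append("class body ran")

events.append("class defined")
Demo().callback()
print(events)  # ['class body ran', 'class defined', 'callback ran']
```

In a spider the callbacks are invoked by the framework during the crawl, so anything printed from the class body appears well before they run.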

1 Answer:

Answer 0 (score: 1)

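import logging
import scrapy

class uppspider(scrapy.Spider):
    mylist = list()

    def parse_tastypage(self, response):
        # access the list declared above through self
        self.mylist = ['some data']

    def closed(self, reason):
        # Scrapy calls closed() when the crawl finishes, after the callbacks have run
        logging.info(self.mylist)  # logs ['some data'] if parse_tastypage ran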
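One more detail from the question's code may matter here: `parse_tastypage` contains a `yield`, so calling it directly, as in `parse_tastypage(mylist, 0)`, only creates a generator object and does not execute the body. The sketch below uses a hypothetical stand-in function (`parse_page`) rather than a real Scrapy callback to show that the side effect appears only once the generator is consumed.

```python
collected = []

def parse_page(status):
    # Hypothetical stand-in for a callback that records data and yields an item.
    collected.append(status)
    yield {"status": status}

gen = parse_page(200)      # creates the generator; the body has not run yet
before = list(collected)   # snapshot taken before the generator is consumed
items = list(gen)          # iterating the generator executes the body
after = list(collected)    # snapshot taken afterwards

print(before, items, after)  # [] [{'status': 200}] [200]
```

During a crawl Scrapy does this iteration itself, which is another reason to read the collected values only after the callbacks have finished.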