Getting the result of a Scrapy Request

Time: 2017-08-03 23:33:15

Tags: python request scrapy scrapy-spider

How can I get the result of a Scrapy request into a variable that I can use?

def parse_node(self,response,node):
    yield Request('LINK',callback=self.parse_listing)

def parse_listing(self,response):
    for agent in string.split(response.xpath('//node[@id="Agent"]/text()').extract_first() or "",'^'):
        HERE=Request('LINK',callback=self.parse_agent)
        print HERE

def parse_agent(self,response):
    yield response.xpath('//node[@id="Email"]/text()').extract_first()

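I am trying to get the result of HERE=Request('LINK',callback=self.parse_agent) and print it. parse_agent is supposed to pick up an email address, but I want to get that value back and use it inside parse_listing.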

2 Answers:

Answer 0: (score: 0)

import scrapy

# Both functions are methods of the spider class.
def parse_listing(self, response):
    agents = response.xpath('//node[@id="Agent"]/text()').extract_first() or ""
    for agent in agents.split('^'):
        # A Request only describes work to do. Yield it so that Scrapy
        # downloads 'LINK' and then calls parse_agent with the response.
        yield scrapy.Request('LINK', callback=self.parse_agent)


def parse_agent(self, response):
    # response is the downloaded page for the Request yielded above.
    self.logger.info("response: %s", response)
    email = response.xpath('//node[@id="Email"]/text()').extract_first()
    self.logger.info("email: %s", email)  # logging is better than print
    # Scrapy rejects a bare string as callback output, so yield a dict
    # (or an Item) instead.
    yield {'email': email}

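Answer 1: (score: 0)

Based on your comment under the first answer, I think what you really need is scrapy-inline-requests (see the examples there). Your code would look something like this: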

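from inline_requests import inline_requests
from scrapy import Request

def parse_node(self, response, node):
    yield Request('LINK', callback=self.parse_listing)

@inline_requests
def parse_listing(self, response):
    agents = response.xpath('//node[@id="Agent"]/text()').extract_first() or ""
    for agent in agents.split('^'):
        # The decorator sends the downloaded response back as the value of yield.
        agent_response = yield Request('LINK')
        email = agent_response.xpath('//node[@id="Email"]/text()').extract_first()
        # email is now available inside parse_listing, e.g. to emit as an item.
        yield {'agent': agent, 'email': email}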
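
To see why `agent_response = yield Request(...)` can work at all, it may help to look at the underlying Python mechanism: whoever drives a generator can pass a value back into it with send(), and that value becomes the result of the yield expression. The following is a toy, synchronous model of that idea using only the standard library. It is not the scrapy-inline-requests source, and FakeRequest, PAGES and run_inline are hypothetical names invented for this sketch.

```python
class FakeRequest(object):
    """Hypothetical stand-in for a request object; it only holds a URL."""

    def __init__(self, url):
        self.url = url


# Stand-in for the network: maps a URL to the "response" it would return.
PAGES = {
    "agent/1": "one@example.com",
    "agent/2": "two@example.com",
}


def run_inline(callback_gen):
    """Drive a generator: answer each yielded FakeRequest, collect the rest."""
    items = []
    try:
        value = next(callback_gen)
        while True:
            if isinstance(value, FakeRequest):
                # Send the response back in; it becomes the value of `yield`.
                value = callback_gen.send(PAGES[value.url])
            else:
                items.append(value)
                value = next(callback_gen)
    except StopIteration:
        pass
    return items


def parse_listing(agents):
    for agent in agents:
        # Looks like a blocking call, but control returns to run_inline here.
        email = yield FakeRequest("agent/%s" % agent)
        yield {"agent": agent, "email": email}


result = run_inline(parse_listing(["1", "2"]))
print(result)
```

In the real library the responses arrive asynchronously through Scrapy's engine rather than from a dict lookup, but the callback is written the same way: the response shows up where the Request was yielded.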