def parse(self, response):
    print("parse!!!!!!!!!!!!!!!!!!!")
    yield scrapy.Request("http://xx.com", callback=self.parseHeader, meta={'item': item})
    yield scrapy.Request("http://xx.com ", callback=self.parseBody, meta={'item': item})
    yield scrapy.Request("http://xx.com ", callback=self.parseFooter, meta={'item': item})

def parseHeader(self, response):
    print("parseHeader!!!!!!!!!!!!!!!!!!!")
    item = ItemHeader()
    # ...
    yield item

def parseBody(self, response):
    print("parseBody!!!!!!!!!!!!!!!!!!!")
    item = ItemBody()
    # ...
    yield item

def parseFooter(self, response):
    print("parseFooter!!!!!!!!!!!!!!!!!!!")
    item = ItemFooter()
    # ...
    yield item
Running the code above produces the following result.

Current result:
parse!!!!!!!!!!!!!!!!!!!
↓
parseHeader!!!!!!!!!!!!!!!!!!!
↓
pipeline
↓
Closing spider (finished)
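Only "parseHeader" is executed; the methods below it are never called. Changing yield to return does not change the result.
I would like to change the result above to the following.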
parse!!!!!!!!!!!!!!!!!!!
↓
parseHeader!!!!!!!!!!!!!!!!!!!
↓
pipeline
↓
parseBody!!!!!!!!!!!!!!!!!!!
↓
pipeline
↓
parseFooter!!!!!!!!!!!!!!!!!!!
↓
pipeline
↓
Closing spider (finished)
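How can I do this? If you have any hints, please let me know.

Answer 0: (score: 1)

If you have a single response and want to parse several things from it, you can split the parsing logic into separate methods and simply call them as ordinary Python methods that return items: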
def parse(self, response):
    yield scrapy.Request("http://xx.com",
                         callback=self.parse_item,
                         meta={'item': item})

def parse_item(self, response):
    # either return everything as one item:
    item = response.meta['item']
    item['header'] = self.parse_header(response)
    item['body'] = self.parse_body(response)
    item['footer'] = self.parse_footer(response)
    yield item
    # or as multiple items:
    yield self.parse_header(response)
    yield self.parse_body(response)
    yield self.parse_footer(response)

def parse_header(self, response):
    print("parseHeader!!!!!!!!!!!!!!!!!!!")
    item = ItemHeader()
    return item

def parse_body(self, response):
    print("parseBody!!!!!!!!!!!!!!!!!!!")
    item = ItemBody()
    return item

def parse_footer(self, response):
    print("parseFooter!!!!!!!!!!!!!!!!!!!")
    item = ItemFooter()
    return item
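
To see the ordering this gives without running a crawl, here is a minimal sketch of the same pattern that does not import Scrapy. `FakeResponse`, `SketchSpider`, the `"|"`-separated text, and the dict-shaped items are all assumptions made up for illustration; they stand in for a real response and for `ItemHeader`, `ItemBody`, and `ItemFooter`.

```python
class FakeResponse:
    """Hypothetical stand-in for a Scrapy response (not a Scrapy class)."""

    def __init__(self, text, meta=None):
        self.text = text
        self.meta = meta or {}


class SketchSpider:
    """Hypothetical spider-like class: one callback, three plain helpers."""

    def parse_item(self, response):
        # One response comes in; the helpers are called in a fixed order
        # and each returned item is yielded in turn.
        yield self.parse_header(response)
        yield self.parse_body(response)
        yield self.parse_footer(response)

    def parse_header(self, response):
        # Plain method: returns an item instead of yielding a new request.
        return {"section": "header", "value": response.text.split("|")[0]}

    def parse_body(self, response):
        return {"section": "body", "value": response.text.split("|")[1]}

    def parse_footer(self, response):
        return {"section": "footer", "value": response.text.split("|")[2]}


if __name__ == "__main__":
    # Assumed toy input: three sections separated by "|".
    for item in SketchSpider().parse_item(FakeResponse("H|B|F")):
        print(item["section"], item["value"])
```

Because the helpers run inside a single callback, only one request is needed. If three separate requests to the same URL are really wanted, `scrapy.Request` accepts `dont_filter=True`, which bypasses Scrapy's duplicate-request filter; that filter is a likely (though unconfirmed) reason only the first callback ran in the question.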