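I think this is more of a Python data structure question. I've been working on it for a few hours and haven't gotten anything to work.
In JSON terms, I want my scraped data to look like this: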
{
  "person": "john",
  "friends": [
    {"name": "sara",
     "link": "url"},
    {"name": "rick",
     "link": "url"}
  ]
}
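To make the target shape concrete, here is a minimal sketch of it as plain Python data (a dict holding a list of dicts), not yet as Scrapy items. The values are the placeholders from the JSON above, and the third friend, "alex", is a made-up name added only to show how the list grows.

```python
import json

# Target shape: one person, with any number of friend entries in a list.
# "john", "sara", "rick" and "url" are the placeholders from the JSON above.
person = {
    "person": "john",
    "friends": [
        {"name": "sara", "link": "url"},
        {"name": "rick", "link": "url"},
    ],
}

# An Nth friend is just one more dict appended to the list.
# "alex" is a hypothetical example value, not part of the original data.
person["friends"].append({"name": "alex", "link": "url"})

# Serialising with the standard json module gives the nested JSON layout.
print(json.dumps(person, indent=2))
```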
A person can have N friends in that object. How would this be written in Python and used with Scrapy? I have the following:
import scrapy

class people(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    friends = [friend()]  # this is where I'm having problems

class friend(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
My code that uses these (incomplete?) classes:
def parse_person(self, response):
    url = response.url
    item = people()
    item['title'] = response.xpath('//h1/text()').extract()
    item['link'] = response.url
    yield scrapy.Request(url, callback=self.parse_friends, meta={'item': item})

def parse_friends(self, response):
    item = response.meta['item']
    item['friends']['link'] = response.xpath('//h1/text()').extract()  # also this is a question
    yield item
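A minimal sketch of the shape parse_friends seems to need, assuming the friends field holds an ordinary list with one dict appended per friend, rather than being indexed as item['friends']['link']. Plain dicts stand in for the Scrapy items here so the snippet runs without Scrapy, and add_friend is a hypothetical helper, not part of Scrapy.

```python
def add_friend(item, name, link):
    """Append one friend record to item['friends'], creating the list if needed.

    Hypothetical helper for illustration; it is not a Scrapy API.
    """
    if "friends" not in item:
        item["friends"] = []
    item["friends"].append({"name": name, "link": link})
    return item


# Stand-in for the item built in parse_person and passed along via meta.
item = {"title": "john", "link": "url"}

# Stand-in for the (name, link) pairs that parse_friends would extract.
for name, link in [("sara", "url"), ("rick", "url")]:
    add_friend(item, name, link)

print(item)
```

In the spider, the loop would run over whatever the friend selectors return, and the item would be yielded once after the list is filled.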