Scrapy class with an array of subclass-like structures

Time: 2016-05-13 20:30:11

Tags: python data-structures scrapy

I think this is more of a Python data-structure question. I have been working on it for hours and have not gotten anything to work.

In JSON terms, I would like my scraped data to come out in this shape:

{"person":"john",
 "friends":[
     {"name":"sara",
      "link":"url"},
     {"name":"rick",
      "link":"url"}
    ]
}

A person can have N friends in that object. How is this written in Python and used with Scrapy? I have the following:

class people(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    friends = [friend()] #this is where I'm having problems

class friend(scrapy.Item):
     title = scrapy.Field()
     link = scrapy.Field()
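
For comparison, here is a minimal sketch of the nested shape described above, written with standard-library dataclasses rather than `scrapy.Item`. It assumes Scrapy 2.2 or later, which accepts dataclass objects as items; the class names `Person` and `Friend` and the sample values are hypothetical. On a `scrapy.Item` subclass, the rough equivalent would be to declare `friends = scrapy.Field()` and store an ordinary Python list in it, since a `Field` can hold any value.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Friend:
    # Hypothetical names, chosen to mirror the JSON keys in the question
    name: str
    link: str


@dataclass
class Person:
    person: str
    link: str
    # default_factory gives every Person its own empty list,
    # so friends are never shared between instances
    friends: List[Friend] = field(default_factory=list)


# Build one person with two friends (sample values only)
john = Person(person="john", link="url")
john.friends.append(Friend(name="sara", link="url"))
john.friends.append(Friend(name="rick", link="url"))

# asdict() converts nested dataclasses into plain dicts and lists
record = asdict(john)
```

The point of the sketch is that `friends` is just a list attribute, so it can grow to any number of entries.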

My code using the people and friend classes (which may be incomplete):

def parse_person(self,response):
    url = response.url
    item = people()
    item['title'] = response.xpath('//h1/text()').extract()
    item['link'] = response.url
    yield scrapy.Request(url, callback=self.parse_friends, meta={'item':item})

def parse_friends(self, response):
    item = response.meta['item']
    item['friends']['link'] = response.xpath('//h1/text()').extract() #also this is a question
    yield item
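
One likely reason `item['friends']['link']` fails is that `friends` is meant to be a list, and a list cannot be indexed by the string `'link'`. In addition, `friends` was not declared as a `scrapy.Field()`, so the item does not accept that key. Below is a rough sketch of one way to assemble the list, using plain dicts (which Scrapy callbacks may also yield as items). The helper `build_person` and its parameters are hypothetical; the two input lists stand in for what `.extract()` returns, which is a list of strings.

```python
import json


def build_person(title, url, friend_names, friend_links):
    """Hypothetical helper: combine extracted values into one nested record.

    friend_names and friend_links are parallel lists of strings, as
    .extract() would return for two matching XPath expressions.
    """
    # Pair each name with its link and wrap the pair in a dict
    friends = [
        {"name": name, "link": link}
        for name, link in zip(friend_names, friend_links)
    ]
    return {"person": title, "link": url, "friends": friends}


# Sample values only; in a spider these would come from response.xpath(...)
item = build_person("john", "url", ["sara", "rick"], ["url", "url"])
print(json.dumps(item, indent=2))
```

In a spider, `parse_friends` could build the list this way and assign it to `item['friends']` before yielding the item, assuming `friends` is declared as a field on the item class.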

0 answers:

No answers yet