Foreign keys in Scrapy

Date: 2013-02-22 05:14:47

Tags: django django-models web-scraping scrapy scrape

I am building a scraper with Scrapy, and my Django models are:

class Creative(models.Model):
    name = models.CharField(max_length=200)
    picture = models.CharField(max_length=200, null=True)

class Project(models.Model):
    title = models.CharField(max_length=200)
    description = models.CharField(max_length=500, null=True)
    creative = models.ForeignKey(Creative)

class Image(models.Model):
    url = models.CharField(max_length=500)
    project = models.ForeignKey(Project)

My Scrapy items:

from scrapy.contrib.djangoitem import DjangoItem
from app.models import Project, Creative

class ProjectItems(DjangoItem):
    django_model = Project

class CreativeItems(DjangoItem):
    django_model = Creative

So when I save:

creative["name"]  = hxs.select('//*[@id="owner"]/text()').extract()[0]
picture  = hxs.select('//*[@id="owner-icon"]/a/img/@src').extract()
if len(picture)>0:
    creative["picture"] = picture[0]
creative.save()


# Extract title and description of the project
project["title"] = hxs.select('//*[@id="project-title"]/text()').extract()[0]
description = hxs.select('//*[@class="project-description"]/text()').extract()
if len(description)>0:
    project["description"] = description[0]
project["creative"] = creative
project.save()

I get the error:

"Project.creative" must be a "Creative" instance.

So, how can I set a foreign key value in Scrapy?

2 answers:

Answer 0 (score: 2)

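This can be done by assigning the return value of creative.save() to project['creative']. The creative variable is a Scrapy item, not a Django model, while save() on a DjangoItem returns the saved model instance, which is what the foreign key expects. In the following example, the djangoCreativeItem variable carries that instance over to the project: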

creative["name"] = hxs.select('//*[@id="owner"]/text()').extract()[0]
picture = hxs.select('//*[@id="owner-icon"]/a/img/@src').extract()
if len(picture) > 0:
    creative["picture"] = picture[0]
# save() returns the saved Creative model instance, not the item
djangoCreativeItem = creative.save()

# Extract title and description of the project
project["title"] = hxs.select('//*[@id="project-title"]/text()').extract()[0]
description = hxs.select('//*[@class="project-description"]/text()').extract()
if len(description)>0:
    project["description"] = description[0]
project["creative"] = djangoCreativeItem
project.save()
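
To see why the return value matters, here is a minimal sketch that runs without Django or Scrapy. Everything in it is a toy stand-in: ToyItem, the simplified Creative and Project classes, and the sample values "Alice" and "Poster" are assumptions made for illustration, not the real library classes. It only mimics the two behaviours this answer relies on: the foreign key rejects anything that is not a Creative, and save() on the item hands back the model instance.

```python
class Creative:
    """Toy stand-in for the Django Creative model (not real Django)."""

    def __init__(self, name, picture=None):
        self.name = name
        self.picture = picture


class Project:
    """Toy stand-in for Project; mimics the ForeignKey type check."""

    def __init__(self, title, creative, description=None):
        if not isinstance(creative, Creative):
            raise ValueError('"Project.creative" must be a "Creative" instance.')
        self.title = title
        self.description = description
        self.creative = creative


class ToyItem(dict):
    """Toy stand-in for DjangoItem: dict-like, save() returns the model."""

    model = None  # set by subclasses

    def save(self):
        # Build the model from the item's fields and return it.
        return self.model(**self)


class CreativeItems(ToyItem):
    model = Creative


class ProjectItems(ToyItem):
    model = Project


creative = CreativeItems(name="Alice")   # hypothetical sample data
project = ProjectItems(title="Poster")   # hypothetical sample data

# Assigning the item itself reproduces the error from the question.
project["creative"] = creative
try:
    project.save()
    error = None
except ValueError as exc:
    error = str(exc)

# Assigning the return value of save() passes the type check.
project["creative"] = creative.save()
saved_project = project.save()
```

In the first attempt the project holds a CreativeItems object, so the check fails; in the second it holds a Creative, so the save goes through.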

Answer 1 (score: 1)

As has been done elsewhere, put the creative's ID directly into creative_id; I think it should work:

 project["creative_id"] = creative.id

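It will set the foreign key without complaining about a missing object (since you are in a Scrapy environment where you don't touch the model objects directly...).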
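
The idea behind this answer is Django's convention that a ForeignKey named creative also keeps the raw key in an attribute named creative_id. The sketch below illustrates that convention with toy classes only: the rows dictionary, the simplified constructors, and the sample values are assumptions, and none of this is the real Django or Scrapy API. Whether a DjangoItem accepts a creative_id key depends on the fields the item declares, which this thread does not confirm.

```python
class Creative:
    """Toy stand-in for a saved Creative row (not real Django)."""

    rows = {}  # toy "table": primary key -> instance

    def __init__(self, id, name):
        self.id = id
        self.name = name
        Creative.rows[id] = self


class Project:
    """Toy stand-in that stores only the raw foreign key value."""

    def __init__(self, title, creative_id):
        self.title = title
        self.creative_id = creative_id  # no Creative object needed here

    @property
    def creative(self):
        # Resolve the related object lazily from the stored key.
        return Creative.rows[self.creative_id]


Creative(id=1, name="Alice")                       # hypothetical sample data
project = Project(title="Poster", creative_id=1)   # set the key directly
```

Setting the key this way skips the instance type check, so it only makes sense once the Creative row has been saved and has an ID.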