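I want to filter scraped items on titleURL: if a record with the same titleURL already exists in MongoDB, the new item should overwrite it. How do I match titleURL between Scrapy and MongoDB?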
items.py:
    from scrapy.contrib.djangoitem import DjangoItem
    from mongo_test.models import Ct

    class CtItem(DjangoItem):
        django_model = Ct
mongo_test/models.py:
    from django.db import models

    class Ct(models.Model):
        title = models.CharField(max_length=100)
        titleURL = models.URLField(max_length=255)
        .....
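pipeline.py:
    from mongo_test.models import Ct

    class CtPipeline(object):
        def process_item(self, item, spider):
            # Build the model instance without writing it to the database yet
            ct = item.save(commit=False)
            # How do I match the scraped titleURL against the titleURL stored in MongoDB?
            ct_exist = Ct.objects.filter()
            if ct_exist:
                # The existing MongoDB record should be overwritten here
                pass
            ct.save()
            return item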
settings.py in the Django project:
    DATABASES = {
        'default': {
            'ENGINE': 'django_mongodb_engine',
            'NAME': 'scrapy',
        }
    }
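
For the filtering step in pipeline.py, one possible approach is to look up a stored record with the same titleURL and, if one is found, reuse its primary key so that save() overwrites it instead of inserting a duplicate. The sketch below is not tested against django_mongodb_engine; it assumes the backend handles a plain equality filter and a save() with a preset pk the way standard Django does. `upsert_by_title_url` is a hypothetical helper name, and `FakeCt` is an in-memory stand-in for `Ct` and `Ct.objects`, included only so the example runs without Django.

```python
import itertools


class FakeCt:
    """Hypothetical in-memory stand-in for the Ct model and its manager.

    save() inserts a new row when pk is None and overwrites the row
    with the same pk otherwise, mimicking Django's save() semantics.
    """
    _rows = {}                   # pk -> FakeCt
    _ids = itertools.count(1)    # source of new primary keys

    def __init__(self, title, titleURL):
        self.pk = None
        self.title = title
        self.titleURL = titleURL

    def save(self):
        if self.pk is None:
            self.pk = next(FakeCt._ids)
        FakeCt._rows[self.pk] = self

    @classmethod
    def filter(cls, **conditions):
        # Equality-only lookup, like Ct.objects.filter(titleURL=...)
        return [row for row in cls._rows.values()
                if all(getattr(row, key) == value
                       for key, value in conditions.items())]


def upsert_by_title_url(ct, manager):
    """Save ct, overwriting any stored record that has the same titleURL.

    ct      -- unsaved instance, e.g. the result of item.save(commit=False)
    manager -- object with filter(**kwargs), e.g. Ct.objects
    """
    matches = list(manager.filter(titleURL=ct.titleURL)[:1])
    if matches:
        # Reusing the existing key turns save() into an overwrite
        ct.pk = matches[0].pk
    ct.save()
    return ct


first = upsert_by_title_url(FakeCt("old title", "http://example.com/a"), FakeCt)
second = upsert_by_title_url(FakeCt("new title", "http://example.com/a"), FakeCt)
print(first.pk == second.pk, len(FakeCt._rows), FakeCt._rows[second.pk].title)
```

Inside `process_item`, the equivalent call would be `upsert_by_title_url(item.save(commit=False), Ct.objects)` followed by `return item`. If duplicates should be skipped rather than overwritten, raising `scrapy.exceptions.DropItem` when a match is found is the usual alternative.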