How can I store data from Scrapy in MongoDB as a tree structure?

Asked: 2016-04-12 14:26:39

Tags: mongodb python-2.7 scrapy

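I am working on a task that requires storing data in MongoDB as a tree structure. I have successfully scraped the data I want with Scrapy. I also know that MongoDB can store data in a tree structure, for example: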

db.categories.insert( { _id: "MongoDB", parent: "Databases" } )
db.categories.insert( { _id: "dbm", parent: "Databases" } )
db.categories.insert( { _id: "Databases", parent: "Programming" } )
db.categories.insert( { _id: "Languages", parent: "Programming" } )
db.categories.insert( { _id: "Programming", parent: "Books" } )
db.categories.insert( { _id: "Books", parent: null } )
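The shell commands above follow the parent-references pattern: each document stores the `_id` of its parent, and the root has `parent: null`. As a rough illustration only (plain Python, no database involved; `CATEGORIES` and `path_to_root` are names made up for this sketch), the same six documents can be walked from any node up to the root:

```python
# The six documents from the shell example, written as Python dicts.
# Python uses None where the shell uses null.
CATEGORIES = [
    {"_id": "MongoDB", "parent": "Databases"},
    {"_id": "dbm", "parent": "Databases"},
    {"_id": "Databases", "parent": "Programming"},
    {"_id": "Languages", "parent": "Programming"},
    {"_id": "Programming", "parent": "Books"},
    {"_id": "Books", "parent": None},
]


def path_to_root(node_id, docs=CATEGORIES):
    """Return the list of ids from node_id up to the root, inclusive."""
    parent_of = {doc["_id"]: doc["parent"] for doc in docs}
    path = []
    current = node_id
    while current is not None:
        path.append(current)
        current = parent_of[current]
    return path


print(path_to_root("MongoDB"))
```

The tree is implied entirely by the `parent` values, so nothing beyond ordinary documents with those two fields is needed to represent it.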

So I thought I could do something similar with pymongo, but it turns out I cannot.

import pymongo
from scrapy.conf import settings
from scrapy.exceptions import DropItem
from scrapy import log

class CategoryPipeline(object):

    def __init__(self):
        connection = pymongo.MongoClient(settings['MONGODB_SERVER'],
                                         settings['MONGODB_PORT']
                                         )
        db = connection[settings['MONGODB_DB']]
        self.collection = db[settings['MONGODB_COLLECTION']]

    def process_item(self, item, spider):
        valid = True
        for data in item:
            if not data:
                valid = False
                raise DropItem("Missing {0}!".format(data))
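        if valid:
            # This is the failing line: _id and parent are bare names here,
            # not string keys, so Python raises NameError.
            self.collection.insert( { _id: "MongoDB", parent: "Databases" } )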
            log.msg("question added to mongodb database!",
                    level=log.DEBUG,spider = spider)

        return item

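The code above does not work. The error says that _id is not defined. If I write it as "_id" (in quotes) it works, but I don't think that will give me the tree functionality.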
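For context on the error: in the mongo shell (JavaScript), an unquoted key such as `_id` in an object literal is simply a string key, whereas in a Python dict literal a bare `_id` is looked up as a variable. Quoting the keys in Python therefore yields the same document the shell would store. A small sketch of the difference, where the variable names `error_name` and `fixed` are purely illustrative:

```python
# Bare names inside a Python dict literal are variable lookups,
# so this fails before pymongo is ever involved.
try:
    broken = {_id: "MongoDB", parent: "Databases"}  # noqa: F821
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__

# Quoted keys give a document with the fields "_id" and "parent",
# the same shape as the shell's { _id: "MongoDB", parent: "Databases" }.
fixed = {"_id": "MongoDB", "parent": "Databases"}

print(error_name)
print(sorted(fixed.keys()))
```

Since the stored document is identical either way, quoting the keys should not affect how the parent references behave.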

My question is: how do I store tree-structured data in MongoDB using pymongo?
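One possible approach, offered as a sketch rather than a definitive answer: store each scraped item as a document with quoted `"_id"` and `"parent"` keys, then query by `parent` to read the tree back. The helpers `save_node` and `children_of` below are hypothetical names, and they assume pymongo 3.0 or later, where `Collection.replace_one` is available and the older `Collection.insert` is deprecated.

```python
def save_node(collection, node_id, parent_id):
    """Insert or overwrite one tree node in the parent-reference layout.

    `collection` is expected to be a pymongo Collection.
    Use parent_id=None for the root node.
    """
    collection.replace_one(
        {"_id": node_id},
        {"_id": node_id, "parent": parent_id},
        upsert=True,
    )


def children_of(collection, parent_id):
    """Return the sorted ids of all direct children of parent_id."""
    return sorted(doc["_id"] for doc in collection.find({"parent": parent_id}))


# Example wiring, left as comments because it needs a running MongoDB server.
# The host, port, database, and collection names are assumptions:
#
#   import pymongo
#   client = pymongo.MongoClient("localhost", 27017)
#   categories = client["scrapy_demo"]["categories"]
#   save_node(categories, "Books", None)
#   save_node(categories, "Programming", "Books")
#   print(children_of(categories, "Books"))
```

Inside a pipeline's `process_item`, a call along the lines of `save_node(self.collection, item["name"], item["parent"])` could replace the hard-coded insert, assuming the item defines `name` and `parent` fields (those field names are not from the original code).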

0 answers:

No answers yet