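I am working on a task that stores data in MongoDB as a tree structure. I have successfully scraped the data I want with scrapy. I also know that MongoDB can store data in a tree structure using parent references, for example: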
db.categories.insert( { _id: "MongoDB", parent: "Databases" } )
db.categories.insert( { _id: "dbm", parent: "Databases" } )
db.categories.insert( { _id: "Databases", parent: "Programming" } )
db.categories.insert( { _id: "Languages", parent: "Programming" } )
db.categories.insert( { _id: "Programming", parent: "Books" } )
db.categories.insert( { _id: "Books", parent: null } )
So I thought I could do something similar with pymongo, but it turns out I can't.
import pymongo

from scrapy.conf import settings
from scrapy.exceptions import DropItem
from scrapy import log


class CategoryPipeline(object):

    def __init__(self):
        connection = pymongo.MongoClient(
            settings['MONGODB_SERVER'],
            settings['MONGODB_PORT']
        )
        db = connection[settings['MONGODB_DB']]
        self.collection = db[settings['MONGODB_COLLECTION']]

    def process_item(self, item, spider):
        valid = True
        for data in item:
            if not data:
                valid = False
                raise DropItem("Missing {0}!".format(data))
        if valid:
            # This is the failing line: _id and parent are bare names here, not string keys.
            self.collection.insert( { _id: "MongoDB",parent:"Databases" } )
            log.msg("question added to mongodb database!",
                    level=log.DEBUG, spider=spider)
        return item
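The code above doesn't work. The error says that _id is not defined. If I write "_id" (quoted) instead, it works, but I don't think that gives me the tree functionality.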
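To make the quoted-key variant concrete, here is a minimal sketch of the shell documents above written as Python dicts. In Python the keys have to be quoted strings and the shell's null becomes None; the stored documents should otherwise carry the same parent references. The names CATEGORIES and ancestors are made up for illustration, and the helper walks the tree in plain Python rather than querying MongoDB.

```python
# The six shell documents as Python dicts.
# Keys are quoted strings; the shell's null becomes None.
CATEGORIES = [
    {"_id": "MongoDB", "parent": "Databases"},
    {"_id": "dbm", "parent": "Databases"},
    {"_id": "Databases", "parent": "Programming"},
    {"_id": "Languages", "parent": "Programming"},
    {"_id": "Programming", "parent": "Books"},
    {"_id": "Books", "parent": None},
]


def ancestors(docs, node_id):
    """Follow parent references from node_id up to the root (illustrative helper)."""
    # Map each node to its parent so the walk does not rescan the list.
    parent_of = {doc["_id"]: doc["parent"] for doc in docs}
    path = []
    current = parent_of[node_id]
    while current is not None:
        path.append(current)
        current = parent_of[current]
    return path
```

If this reading is right, the tree lives entirely in the values of the parent field, so quoting the keys does not change what is stored.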
My question is: how do I store tree-structured data in MongoDB using pymongo?
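One possible shape for the storage step, as a sketch rather than a definitive answer: it assumes collection is a pymongo Collection like self.collection in the pipeline above, and the helper names save_category and children are hypothetical. It uses replace_one with upsert=True and find, both of which are part of pymongo's Collection API, so running the spider again overwrites a node instead of failing on a duplicate _id.

```python
def save_category(collection, name, parent):
    """Insert or overwrite one tree node (collection: assumed pymongo Collection)."""
    doc = {"_id": name, "parent": parent}  # quoted keys; use None for the root
    # upsert=True inserts the document if no node with this _id exists yet,
    # and replaces it otherwise.
    collection.replace_one({"_id": name}, doc, upsert=True)
    return doc


def children(collection, parent):
    """Return the _id of every node whose parent reference equals `parent`."""
    return [doc["_id"] for doc in collection.find({"parent": parent})]
```

Inside process_item this might be called as save_category(self.collection, "MongoDB", "Databases"), with the two values taken from the scraped item instead of hard-coded.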