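To embed or not to embed?

Asked: 2013-04-22 15:31:45

Tags: mongodb schema normalization denormalization

I'm trying to figure out which schema design I should use.

(These are sample documents; the real ones have more properties.)

Embedded: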

{
   _id: ObjectId(),
   title: "trolo",
   subs: [
      {
         owner: refUserId
      },
      ...
   ]
}

I created an index: ensureIndex({ "subs.owner": 1 })

Normalized:

Collection A:
{
   _id: ObjectId(),
   title: "trolo"
}
Collection B:
{
   parent: refId,
   owner: refUserId
}

I created an index: ensureIndex({ owner: 1 })
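
To make the two layouts concrete, here is a minimal in-memory sketch in plain JavaScript (no MongoDB involved). The sample ids ("a1", "u1", ...) and the helper names findEmbeddedByOwner and findNormalizedByOwner are hypothetical and only illustrate the shape of each query's result; they are not part of the original schema.

```javascript
// Hypothetical sample data; plain strings stand in for ObjectIds.
const embedded = [
  { _id: "a1", title: "trolo", subs: [{ owner: "u1" }, { owner: "u2" }] },
  { _id: "a2", title: "other", subs: [{ owner: "u1" }] },
];

const collectionA = [
  { _id: "a1", title: "trolo" },
  { _id: "a2", title: "other" },
];
const collectionB = [
  { parent: "a1", owner: "u1" },
  { parent: "a1", owner: "u2" },
  { parent: "a2", owner: "u1" },
];

// Mimics find({ "subs.owner": owner }): returns whole parent documents.
function findEmbeddedByOwner(docs, owner) {
  return docs.filter((doc) => doc.subs.some((sub) => sub.owner === owner));
}

// Mimics find({ owner: owner }) on Collection B: returns one document per sub.
function findNormalizedByOwner(docs, owner) {
  return docs.filter((doc) => doc.owner === owner);
}

console.log(findEmbeddedByOwner(embedded, "u2").length);
console.log(findNormalizedByOwner(collectionB, "u1").length);
```

Note that the two queries do not return the same thing: the embedded query yields parent documents, while the normalized query yields one small document per subscription, so the result counts can differ for the same owner.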

I ran some benchRun() tests against the two models, and the results were very surprising.

Embedded query:

ops = [
    {op: "find", ns: t.getFullName(), query: { "subs.owner": someUserId }}
]

Normalized query:

ops = [
    {op: "find", ns: t.getFullName(), query: { owner: someUserId }}
]

benchRun script:

// Double the number of client threads on each pass, from 1 up to 128.
for (var x = 1; x <= 128; x *= 2) {
    var res = benchRun({
        parallel : x,
        seconds : 5,
        ops : ops
    });
    print( "threads: " + x + "\t queries/sec: " + res.query);
}

Output:

Embedded:

threads: 1       queries/sec: 11331
threads: 2       queries/sec: 16764.6
threads: 4       queries/sec: 21587
threads: 8       queries/sec: 25198.6
threads: 16      queries/sec: 24717.6
threads: 32      queries/sec: 24707.4
threads: 64      queries/sec: 25813.8
threads: 128     queries/sec: 30785.4

Normalized:

threads: 1       queries/sec: 8.4
threads: 2       queries/sec: 13.2
threads: 4       queries/sec: 16.4
threads: 8       queries/sec: 17.4
threads: 16      queries/sec: 18.2
threads: 32      queries/sec: 20.8
threads: 64      queries/sec: 27.4
threads: 128     queries/sec: 39.6
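
To put the gap in numbers, the following sketch divides the embedded throughput by the normalized throughput for each thread count. The figures are copied from the two outputs above; the variable names are only illustrative.

```javascript
// Queries per second copied from the benchRun outputs above.
const threads = [1, 2, 4, 8, 16, 32, 64, 128];
const embeddedQps = [11331, 16764.6, 21587, 25198.6, 24717.6, 24707.4, 25813.8, 30785.4];
const normalizedQps = [8.4, 13.2, 16.4, 17.4, 18.2, 20.8, 27.4, 39.6];

// Ratio of embedded to normalized throughput at each thread count.
const ratios = embeddedQps.map((qps, i) => qps / normalizedQps[i]);

threads.forEach((t, i) => {
  console.log("threads: " + t + "\t embedded/normalized: " + Math.round(ratios[i]));
});
```

At every thread count the embedded query is faster by roughly three orders of magnitude, which is far larger than the difference one would expect from schema layout alone.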

Why is the normalized model so much slower? I expected it to be the faster one.

Update

Here is what .explain() reports for my queries.

Embedded

> db.embedded.find({"subs.owner":ObjectId("516ea63322f2a93c4fef8542")}).explain()

{
        "cursor" : "BasicCursor",
        "isMultiKey" : false,
        "n" : 5,
        "nscannedObjects" : 5,
        "nscanned" : 5,
        "nscannedObjectsAllPlans" : 5,
        "nscannedAllPlans" : 5,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 0,
        "indexBounds" : {

        },
        "server" : "localhost:27017"
}

Normalized

> db.collectionB.find({owner: ObjectId("516ea63322f2a93c4fef8542")}).explain()
{
        "cursor" : "BtreeCursor owner_1",
        "isMultiKey" : false,
        "n" : 76625,
        "nscannedObjects" : 76625,
        "nscanned" : 76625,
        "nscannedObjectsAllPlans" : 76625,
        "nscannedAllPlans" : 76625,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 91,
        "indexBounds" : {
                "owner" : [
                        [
                                ObjectId("516ea63322f2a93c4fef8542"),
                                ObjectId("516ea63322f2a93c4fef8542")
                        ]
                ]
        },
        "server" : "localhost:27017"
}
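
The two explain() outputs can be compared side by side with a small sketch like the one below. The field values are copied from the outputs above; the summarize helper is hypothetical. In this legacy explain format, a cursor of "BasicCursor" indicates a collection scan and "BtreeCursor <name>" indicates that the named index was used.

```javascript
// Values copied from the two explain() outputs above.
const embeddedExplain = { cursor: "BasicCursor", n: 5, nscanned: 5, millis: 0 };
const normalizedExplain = { cursor: "BtreeCursor owner_1", n: 76625, nscanned: 76625, millis: 91 };

// Hypothetical helper: reduce an explain() result to the fields that matter here.
function summarize(explain) {
  return {
    usesIndex: explain.cursor.startsWith("BtreeCursor"),
    returned: explain.n,
    millis: explain.millis,
  };
}

const embeddedSummary = summarize(embeddedExplain);
const normalizedSummary = summarize(normalizedExplain);

// How many times more documents the normalized query returns per call.
const returnedRatio = normalizedSummary.returned / embeddedSummary.returned;

console.log(embeddedSummary, normalizedSummary, returnedRatio);
```

Read this way, the embedded query scans a 5-document collection without using the "subs.owner" index and returns 5 documents, while the normalized query uses the owner_1 index but returns 76625 documents per call. The two benchmarks may therefore not be measuring comparable workloads, which could account for much of the gap; this is a reading of the posted numbers, not an additional measurement.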

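1 answer:

Answer 0 (score: 0)

Why would you expect the normalized model to be faster? With embedded documents, each document is stored in a single location on disk, so one disk seek can bring back the entire document. If the data is normalized, it is spread across the disk, which means two disk seeks to fetch the same information. Depending on the speed of the disk and which sector the head has to move to, it will inevitably be slower than the embedded document model.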
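
The one-lookup versus two-lookup argument can be sketched in plain JavaScript as follows. This is a minimal illustration under stated assumptions: the data, the read counter, and the helpers titlesEmbedded and titlesNormalized are all hypothetical, and the counter tracks logical lookups only, not real disk seeks, which also depend on caching and the storage layout.

```javascript
// Hypothetical in-memory "collections"; each read() call stands in for one lookup.
let reads = 0;
function read(collection, predicate) {
  reads += 1;
  return collection.filter(predicate);
}

const embeddedDocs = [{ _id: "a1", title: "trolo", subs: [{ owner: "u1" }] }];
const parents = [{ _id: "a1", title: "trolo" }];
const subs = [{ parent: "a1", owner: "u1" }];

// Embedded model: one lookup returns the parent together with its subs.
function titlesEmbedded(owner) {
  return read(embeddedDocs, (d) => d.subs.some((s) => s.owner === owner)).map((d) => d.title);
}

// Normalized model: first find the subs, then fetch the parents they point to.
function titlesNormalized(owner) {
  const parentIds = new Set(read(subs, (s) => s.owner === owner).map((s) => s.parent));
  return read(parents, (p) => parentIds.has(p._id)).map((p) => p.title);
}

reads = 0;
const embeddedTitles = titlesEmbedded("u1");
const embeddedReads = reads;

reads = 0;
const normalizedTitles = titlesNormalized("u1");
const normalizedReads = reads;

console.log(embeddedReads, normalizedReads);
```

Both functions return the same titles, but the normalized version needs a second lookup to reach the parent documents.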