Question

I have some performance problems with Mongo.

I have this collection:

{
    "_id" : ObjectId,
    "status" : String,
    "song" : ObjectId,
    "room" : ObjectId,
    "duration" : Number,
    "order" : 0,
    "addedAt" : ISODate("2016-02-09T14:16:21.331Z"),
    "startedAt" : ISODate("2016-02-09T14:16:21.393Z")
}

on which I have the following indexes:

/* 1 */
{
    "0" : {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "mydb.mycollection"
    },
    "1" : {
        "v" : 1,
        "key" : {
            "song" : 1
        },
        "name" : "song_1",
        "ns" : "mydb.mycollection",
        "background" : true,
        "safe" : null
    },
    "2" : {
        "v" : 1,
        "key" : {
            "user" : 1
        },
        "name" : "user_1",
        "ns" : "mydb.mycollection",
        "background" : true,
        "safe" : null
    },
    "3" : {
        "v" : 1,
        "key" : {
            "room" : 1
        },
        "name" : "room_1",
        "ns" : "mydb.mycollection",
        "background" : true,
        "safe" : null
    },
    "4" : {
        "v" : 1,
        "key" : {
            "duration" : 1
        },
        "name" : "duration_1",
        "ns" : "mydb.mycollection",
        "background" : true,
        "safe" : null
    }
}

The collection holds more than 3 million records.

Now Mongo reports this slow query in the log (indented for readability):

2016-02-11T11:07:47.897+0000 [conn19] query mydb.mycollection query: {
    orderby: { startedAt: -1 },
    $query: { status: {$in: [ "ended", "skipped" ] }, room: ObjectId('myroomid') } }
    planSummary: IXSCAN {room: 1 }, IXSCAN { room: 1 } cursorid:64767933277   
    ntoreturn:10
    ntoskip:0
    nscanned:41663
    nscannedObjects:41663 keyUpdates:0
    numYields:4 locks(micros) r:2949888
    nreturned:10 reslen:2668
    1737ms

As you can see, the execution time is 1737 ms (sometimes even more), and I am also seeing high CPU utilization.

Does anyone know why? Is there an index I need to add? Is 3M records too much data?

Thanks!

Answer 1

Index intersection does exist, but it does not apply here, and in general a good rule of thumb is:

MongoDB uses only one index per query.

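Your query filters on two fields (status and room) and sorts on a third (startedAt). The query plan clearly shows that only the index on room is used. For everything else, MongoDB has to read every document that matches room, which is what the equal nscanned and nscannedObjects values (41663 each) tell you.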
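To make those log numbers more concrete, here is a toy model in plain JavaScript. It is not how MongoDB works internally, only a sketch of the effect: the data set and the helper names buildDocs and queryWithRoomIndexOnly are made up for illustration. It mimics a plan that can narrow the search by room only, so the status filter and the sort on startedAt have to run over every document of that room.

```javascript
// Toy model, NOT MongoDB internals. All names and data are hypothetical.

// Build 3000 fake documents: 3 rooms, 4 statuses, increasing timestamps.
function buildDocs() {
  const statuses = ["ended", "skipped", "playing", "queued"];
  const docs = [];
  for (let i = 0; i < 3000; i++) {
    docs.push({
      room: "room" + (i % 3),
      status: statuses[i % 4],
      startedAt: i, // stand-in for a real timestamp
    });
  }
  return docs;
}

// Emulates the plan from the log: narrow by room, then filter and sort.
function queryWithRoomIndexOnly(docs, room, wantedStatuses, limit) {
  // Stands in for the lookup in the { room: 1 } index:
  // every document of the room is fetched.
  const fetched = docs.filter((d) => d.room === room);
  // Without status and startedAt in the index, filtering and sorting
  // happen on the fetched documents.
  const matched = fetched.filter((d) => wantedStatuses.includes(d.status));
  matched.sort((a, b) => b.startedAt - a.startedAt);
  return { scanned: fetched.length, returned: matched.slice(0, limit) };
}

const result = queryWithRoomIndexOnly(
  buildDocs(),
  "room0",
  ["ended", "skipped"],
  10
);
console.log(result.scanned, result.returned.length);
```

In this model the whole room is scanned to hand back 10 documents, which is the same shape as nscanned:41663 against nreturned:10 in the log.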

To make full use of an index here, you need a compound index on room, status and startedAt. Note that the order matters, so if your query looks like this:

db.rooms.find({
    room: someRoomId,
    status: {$in: [ "ended", "skipped" ]}
}).sort({startedAt:-1})

the corresponding index should be

db.rooms.createIndex({room:1,status:1,startedAt:-1})

If your query looks like this:

db.rooms.find({
    status: {$in: [ "ended", "skipped" ]},
    room: someRoomId
}).sort({startedAt:-1})

your index should be

db.rooms.createIndex({status:1,room:1,startedAt:-1})

Set up the index accordingly and your query should be a lot faster.
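One way to check whether the new index is actually picked up is to ask for the query plan. The following is a minimal sketch for the mongo shell, assuming the collection is reachable as db.rooms as in the snippets above. The helper name explainRoomHistory is made up for this example, and the database handle is passed in as a parameter so the function has no hidden globals.

```javascript
// Hypothetical helper: creates the compound index and returns the plan
// for the query from the slow-query log.
function explainRoomHistory(db, someRoomId) {
  // Compound index on the filter fields plus the sort field.
  // Creating an index that already exists with the same spec is a no-op.
  db.rooms.createIndex({ room: 1, status: 1, startedAt: -1 });

  // Same filter, sort and limit as the logged query.
  return db.rooms
    .find({ room: someRoomId, status: { $in: ["ended", "skipped"] } })
    .sort({ startedAt: -1 })
    .limit(10)
    .explain();
}
```

In the shell this could be called as printjson(explainRoomHistory(db, someRoomId)). What to look for depends on the server version: the release that produced the log above reports counters such as nscanned and nscannedObjects, while later releases expose totalKeysExamined and totalDocsExamined when explain is called with "executionStats". In either case the scanned count should end up close to the number of documents returned rather than in the tens of thousands.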

Side note

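In your example you wrap string values in ObjectId. That makes no sense. Either use the string itself directly (for example, the room number), or use a real ObjectId as returned by new ObjectId(). When the cardinality of your field is high enough, there is no need for ObjectId() at all. A room number is such a case: no two rooms in the same building can have the same number.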

Answer 2

numYields:4 locks(micros) r:2949888 does not look good. It basically means the query was interrupted 4 times to let other operations run.

Answer 3

Add a descending index (-1) on startedAt. If you use and, select the largest set first. If you use or, select the smallest set first. That helps too.

So you should put the status condition before room: ObjectId('myroomid').

I think the count for status: {$in: [ "ended", "skipped" ] } is smaller than the count for room: ObjectId('myroomid').

Mongo slow query on a collection of 3M records
