MongoDB v3.2.6零星慢速查询

时间:2016-06-22 14:07:26

标签: mongodb database-performance wiredtiger

任何人都可以提供有关如何解决我遇到的m / moo偶尔缓慢的问题,或根据我的慢速操作日志指出任何明显的问题吗?

我使用安装在运行ubuntu服务器14.04的m4.large(8GB Ram,2个虚拟内核)ec2实例上的WiredTiger引擎的Mongo v3.2.6。在同一服务器上安装了一个java进程,它使用许多线程从/向mongo进程读/写。没有副本集(我知道应该有),没有我可以看到的内存问题,但是,我一直在遇到我无法诊断的偶发性慢查询。

以下是2016年6月21日的完整日志,其中显示了一些非常冗长的查询(10秒以上),查询通常会在几毫秒内完成。

2016-06-21T01:27:44.806Z I COMMAND  [conn156] update company.token query: { _id: "1111" } update: { _id: "1111", _class: "company.domain.Token", userUUID: "5555", expires: 1466476050734 } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 keyUpdates:0 writeConflicts:0 numYields:1 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 2 } } } 14071ms
2016-06-21T01:27:44.806Z I COMMAND  [conn151] update company.token query: { _id: "2222" } update: { _id: "2222", _class: "company.domain.Token", userUUID: "6666", expires: 1466476050414 } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 keyUpdates:0 writeConflicts:0 numYields:1 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 2 } } } 14391ms
2016-06-21T01:27:44.811Z I COMMAND  [conn150] insert company.count ninserted:1 keyUpdates:0 writeConflicts:0 numYields:0 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { w: 1 } }, Collection: { acquireCount: { w: 1 } } } 14074ms
2016-06-21T01:39:45.807Z I COMMAND  [conn151] update company.token query: { _id: "3333" } update: { _id: "3333", _class: "company.domain.Token", userUUID: "7777", expires: 1466476776277 } keysExamined:0 docsExamined:0 nMatched:1 nModified:1 upsert:1 keyUpdates:0 writeConflicts:0 numYields:0 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { w: 1 } }, Collection: { acquireCount: { w: 1 } } } 9529ms
2016-06-21T10:34:14.596Z I COMMAND  [conn156] command company.count command: aggregate { aggregate: "count", pipeline: [ { $match: { lId: "4444", $and: [ { iTS: { $gte: 1466475578989 } }, { iTS: { $lte: 1466505254467 } } ] } }, { $group: { _id: { rId: "$rId", sType: "$sType" }, count: { $sum: "$rVal" } } } ] } keyUpdates:0 writeConflicts:0 numYields:92 reslen:162 locks:{ Global: { acquireCount: { r: 190 } }, Database: { acquireCount: { r: 95 } }, Collection: { acquireCount: { r: 95 } } } protocol:op_query 113ms
2016-06-21T10:39:17.113Z I COMMAND  [conn146] command company.count command: aggregate { aggregate: "count", pipeline: [ { $match: { lId: "4444", $and: [ { iTS: { $gte: 1466475589300 } }, { iTS: { $lte: 1466505556971 } } ] } }, { $group: { _id: { rId: "$rId", sType: "$sType" }, count: { $sum: "$rVal" } } } ] } keyUpdates:0 writeConflicts:0 numYields:93 reslen:162 locks:{ Global: { acquireCount: { r: 192 } }, Database: { acquireCount: { r: 96 } }, Collection: { acquireCount: { r: 96 } } } protocol:op_query 141ms
2016-06-21T10:41:16.871Z I COMMAND  [conn156] command company.count command: aggregate { aggregate: "count", pipeline: [ { $match: { lId: "4444", $and: [ { iTS: { $gte: 1466475578989 } }, { iTS: { $lte: 1466505676744 } } ] } }, { $group: { _id: { rId: "$rId", sType: "$sType" }, count: { $sum: "$rVal" } } } ] } keyUpdates:0 writeConflicts:0 numYields:94 reslen:162 locks:{ Global: { acquireCount: { r: 194 } }, Database: { acquireCount: { r: 97 } }, Collection: { acquireCount: { r: 97 } } } protocol:op_query 110ms

慢查询似乎始终是updateinsert操作。上述慢查询表中的一个集合是company.token,它有大约55,000个文档,平均大小为160B。这个集合是非常频繁地读取的(每15秒大约爆发100x一次)。读取查询使用主_id键进行搜索,因此它会触及主键索引。

日志中的另一个集合是company.count,其中包含大约2.7M文档,平均大小为587B,并且非常频繁地写入(每15秒一次,每次插入大约100次)。该集合具有稳定的聚合读取查询(也显示在日志中),针对收集执行,每分钟几十次。聚合读取match子句命中复合索引,似乎永远不会扫描比返回的文档更多的文档(executionStats显示在底部)。插入查询往往非常快,每次插入大约1-4ms,但是,每天一次或两次mongo似乎抓住了,有时需要20秒来完成快速操作。

我对滞后的最好猜测来自高度并发的写入和来自日志的两个集合的读取,但是,我的印象是WiredTiger在文档级别使用了intent exclusive和intent共享锁,这应该允许高并发性。

最后一点需要注意的是,我有时会看到另一个只包含一个文档的集合需要花费10秒才能更新,所以看起来写入和读取队列可能会在全局或数据库级别上最大化。有什么方法可以确认吗?

{
  "executionStats":{
    "executionSuccess":true,
    "nReturned":12006,
    "executionTimeMillis":19,
    "totalKeysExamined":12006,
    "totalDocsExamined":12006,
    "executionStages":{
      "stage":"FETCH",
      "nReturned":12006,
      "executionTimeMillisEstimate":20,
      "works":12007,
      "advanced":12006,
      "needTime":0,
      "needYield":0,
      "saveState":93,
      "restoreState":93,
      "isEOF":1,
      "invalidates":0,
      "docsExamined":12006,
      "alreadyHasObj":0,
      "inputStage":{
        "stage":"IXSCAN",
        "nReturned":12006,
        "executionTimeMillisEstimate":0,
        "works":12007,
        "advanced":12006,
        "needTime":0,
        "needYield":0,
        "saveState":93,
        "restoreState":93,
        "isEOF":1,
        "invalidates":0,
        "keyPattern":{
          "lId":1,
          "iTS":1
        },
        "indexName":"lId_1_iTS_1",
        "isMultiKey":false,
        "isUnique":false,
        "isSparse":false,
        "isPartial":false,
        "indexVersion":1,
        "direction":"forward",
        "indexBounds":{
          "lId":[
            "[\"d4444\", \"4444\"]"
          ],
          "iTS":[
            "[1466475578989.0, 1466505676744.0]"
          ]
        },
        "keysExamined":12006,
        "dupsTested":0,
        "dupsDropped":0,
        "seenInvalidated":0
      }
    }
  }
}

0 个答案:

没有答案