Why does MongoDB (or node-mongodb-native) throttle performance to one operation per connection per second?

Time: 2019-03-22 13:30:51

Tags: node.js mongodb node-mongodb-native

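I have been trying to solve this problem for a long time without success, and I have not found a similar question either. When I fire 20 aggregation queries at mongod, the response time is the problem: it takes more than 5 seconds for all 20 requests to complete, even though each aggregation is resolved in under 1 ms. I would expect the code to run in under 50 ms.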

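One of the main symptoms is that the results arrive in batches ("block dispatch") with a clearly visible 1-second delay between them. I get one result per connection per second, and one additional connection is created every second, so I get 1, 2, 3, 4, 5 results per second.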
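As a rough sanity check of this description, the behaviour can be modelled in a few lines. The sketch below is only a model of the symptom, not of the driver's internals: it assumes the pool starts with one connection, gains one more each second up to a maximum, and that every open connection completes exactly one operation per second. The function name `simulateBatches` is hypothetical.

```javascript
// Hypothetical model (an assumption, not driver internals): the pool starts
// with one connection, gains one more each second, and every open connection
// completes exactly one operation per second.
function simulateBatches(totalOps, maxConnections) {
  if (maxConnections < 1) {
    throw new RangeError('maxConnections must be at least 1');
  }
  const batches = [];
  let remaining = totalOps;
  let connections = 1;
  while (remaining > 0) {
    // Each open connection finishes one operation in this one-second tick.
    const done = Math.min(connections, remaining);
    batches.push(done);
    remaining -= done;
    // One more connection is available in the next tick, up to the pool limit.
    connections = Math.min(connections + 1, maxConnections);
  }
  return batches;
}

console.log(simulateBatches(20, 10)); // [ 1, 2, 3, 4, 5, 5 ]
```

With 20 operations and a pool limit of 10, the model predicts six batches of 1, 2, 3, 4, 5 and 5 results, so the whole run would take a little over 5 seconds under these assumptions.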

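After many attempts, I could narrow the problem down to mongod. I ran the node.js code below on 3 different machines. On my local machine I get this block dispatch and a total runtime of more than 5 seconds. On the other 2 machines, on the same database, the results complete in <50 ms, with no block dispatch and no 1-second delay.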

//console output - you can see the block dispatch.
//first section connects to mongod and sends the same query to mongod 20 times
18:16:25.440Z  INFO WS BOT/test3.js: MONGO SUCCESFULLY CONNECTED!
18:16:25.441Z DEBUG WS BOT/test3.js: 0 Start... %dms
//(...) while loop to send the aggregations to mongod finished in about 10ms - which is fine. 
18:16:25.450Z DEBUG WS BOT/test3.js: 18 Start... %dms
18:16:25.451Z DEBUG WS BOT/test3.js: 19 Start... %dms

//now the results of the mongo aggregations should come:

// First Block dispatch - 14ms after the request. Which is fine! But only 1 of 20 request
18:16:25.455Z DEBUG WS BOT/test3.js: 0 End... 0s 14.150247ms

// Second Block dispatch - 1 second 14ms after the request. 2 Results.
18:16:26.475Z DEBUG WS BOT/test3.js: 1 End... 1s 28.834198ms
18:16:26.476Z DEBUG WS BOT/test3.js: 2 End... 1s 28.880778ms

// Third Block dispatch - 2 seconds after the request. 3 Results.
18:16:27.486Z DEBUG WS BOT/test3.js: 3 End... 2s 38.423885ms
18:16:27.487Z DEBUG WS BOT/test3.js: 5 End... 2s 38.926009ms
18:16:27.488Z DEBUG WS BOT/test3.js: 4 End... 2s 40.08026ms

// Fourth Block dispatch - 3 seconds after the request. 4 Results.
18:16:28.502Z DEBUG WS BOT/test3.js: 6 End... 3s 53.779308ms
18:16:28.503Z DEBUG WS BOT/test3.js: 8 End... 3s 54.27472ms
18:16:28.504Z DEBUG WS BOT/test3.js: 9 End... 3s 55.358707ms
18:16:28.505Z DEBUG WS BOT/test3.js: 7 End... 3s 56.856392ms

// now you see each second with increasing results per second
18:16:29.519Z DEBUG WS BOT/test3.js: 13 End... 4s 69.557508ms
18:16:29.520Z DEBUG WS BOT/test3.js: 10 End... 4s 70.682548ms
18:16:29.520Z DEBUG WS BOT/test3.js: 14 End... 4s 70.377011ms
18:16:29.521Z DEBUG WS BOT/test3.js: 11 End... 4s 71.37652ms
18:16:29.535Z DEBUG WS BOT/test3.js: 12 End... 4s 85.550058ms
18:16:30.538Z DEBUG WS BOT/test3.js: 15 End... 5s 87.94975ms
18:16:30.539Z DEBUG WS BOT/test3.js: 16 End... 5s 88.68675ms
18:16:30.540Z DEBUG WS BOT/test3.js: 19 End... 5s 88.786227ms
18:16:30.540Z DEBUG WS BOT/test3.js: 17 End... 5s 90.013903ms
18:16:30.541Z DEBUG WS BOT/test3.js: 18 End... 5s 90.523922ms
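To make the batching in this output easier to see, the "End" timestamps can be counted per wall-clock second. This is a minimal sketch: the timestamps are copied from the console output above, and the helper name `countPerSecond` is hypothetical.

```javascript
// "End" timestamps copied from the console output above (HH:MM:SS.mmmZ).
const endTimes = [
  '18:16:25.455Z',
  '18:16:26.475Z', '18:16:26.476Z',
  '18:16:27.486Z', '18:16:27.487Z', '18:16:27.488Z',
  '18:16:28.502Z', '18:16:28.503Z', '18:16:28.504Z', '18:16:28.505Z',
  '18:16:29.519Z', '18:16:29.520Z', '18:16:29.520Z', '18:16:29.521Z', '18:16:29.535Z',
  '18:16:30.538Z', '18:16:30.539Z', '18:16:30.540Z', '18:16:30.540Z', '18:16:30.541Z',
];

// Count how many results arrived within each wall-clock second.
function countPerSecond(timestamps) {
  const counts = new Map();
  for (const ts of timestamps) {
    const second = ts.slice(0, 8); // keep only "HH:MM:SS"
    counts.set(second, (counts.get(second) || 0) + 1);
  }
  return counts;
}

for (const [second, n] of countPerSecond(endTimes)) {
  console.log(second, n);
}
```

For the output above this gives 1, 2, 3, 4, 5 and 5 results for the seconds 18:16:25 through 18:16:30.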

I simplified the code so that it requests the same query 20 times; it shows the same slow response.

Here is my code:

const MongoClient = require('mongodb').MongoClient;
const dbName = 'kd';
var db;

//Connect to Mongo first. Once connected, send 20x the same aggregation.
//Note: config, log and yesterday are not defined in this simplified snippet.

MongoClient.connect(`mongodb://${config.mongo.user}:${config.mongo.password}@localhost:${config.mongo.port}/${config.mongo.dbname}?authSource=${config.mongo.authSource}`, { useNewUrlParser: true, poolSize: 10 })
.then((data) => {
  log.info("MONGO SUCCESFULLY CONNECTED!")
  db = data.db(dbName);
  for (var i = 0; i < 20; i++) {
  aggregation(i)
  }
})
.catch(e => { log.error(e)})

// aggregation which will be used (simplified)
function aggregation(i) {
  const hrstart = process.hrtime()
  log.debug(`${i} Start... %dms`)
  db.collection("s.log").aggregate([
    { $match: { "updated_at": { $gte: yesterday } } },
    { $sort: { updated_at: 1 } },
  ],
  function(err, cursor) {
    if (err) { log.error(err) }
    else {
      cursor.toArray(function(err, product) {
        if (err) { return log.error(err) }
        // declare hrend locally instead of leaking an implicit global
        const hrend = process.hrtime(hrstart)
        log.debug(`${i} End... %ds %dms`, hrend[0], hrend[1] / 1000000)
      })
    }
  })
}
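A side note on the timing output: the code logs the two parts of the `process.hrtime()` result separately, so a line such as "1s 28.834198ms" means roughly 1028.8 ms in total. A small helper (the name `hrtimeToMs` is hypothetical, not part of the original code) could combine both parts into a single millisecond value:

```javascript
// Convert a process.hrtime() tuple [seconds, nanoseconds] to milliseconds.
function hrtimeToMs([seconds, nanoseconds]) {
  return seconds * 1000 + nanoseconds / 1e6;
}

const start = process.hrtime();
// ... the work to be measured would run here ...
const elapsedMs = hrtimeToMs(process.hrtime(start));
console.log(`elapsed: ${elapsedMs.toFixed(3)}ms`);
```

Passing the start tuple back into `process.hrtime(start)` returns the elapsed time as a new tuple, which is the same pattern the code above already uses.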

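I tried increasing poolSize so that the driver would have more connections available and could run the queries in parallel, but without success. Using Mongoose results in the same performance degradation.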

I also checked mongostat and could not see anything unusual, except that the number of connections increases by one every second:

    *0    *0     *0     *0       0    13|0  0.1% 3.4%       0 1.03G 40.0M 0|0 1|0  10.4k   73.4k    3 Mar 22 14:07:52.173
    *0    *0     *0     *0       0    12|0  0.1% 3.4%       0 1.03G 40.0M 0|0 1|0  6.72k   66.3k    3 Mar 22 14:07:53.201
    *0    *0     *0     *0       0    17|0  0.1% 3.4%       0 1.03G 40.0M 0|0 1|0  7.97k   68.5k    4 Mar 22 14:07:54.210
insert query update delete getmore command dirty used flushes vsize   res qrw arw net_in net_out conn                time
    *0    *0     *0     *0       0    18|0  0.1% 3.4%       0 1.03G 40.0M 0|0 1|0  8.62k   70.4k    5 Mar 22 14:07:55.198
    *0    *0     *0     *0       0    20|0  0.1% 3.4%       0 1.03G 40.0M 0|0 1|0  14.0k   74.5k    6 Mar 22 14:07:56.207
    *0    *0     *0     *0       0    20|0  0.1% 3.4%       0 1.04G 40.0M 0|0 1|0  9.23k   70.7k    7 Mar 22 14:07:57.198
    *0    *0     *0     *0       0    20|0  0.1% 3.4%       0 1.04G 40.0M 0|0 1|0  9.37k   69.7k    8 Mar 22 14:07:58.207
    *0    *0     *0     *0       0    22|0  0.1% 3.4%       0 1.04G 40.0M 0|0 1|0  10.0k   71.6k    9 Mar 22 14:07:59.198
    *0    *0     *0     *0       0    13|0  0.1% 3.4%       0 1.04G 41.0M 0|0 1|0  6.90k   67.6k   10 Mar 22 14:08:00.206
    *0    *0     *0     *0       0    14|0  0.1% 3.4%       0 1.04G 41.0M 0|0 1|0  7.07k   68.7k   10 Mar 22 14:08:01.203
    *0    *0     *0     *0       0    13|0  0.1% 3.4%       0 1.04G 41.0M 0|0 1|0  21.0k   84.8k   10 Mar 22 14:08:02.162
    *0    *0     *0     *0       0    11|0  0.1% 3.4%       0 1.04G 41.0M 0|0 1|0  1.36k   60.4k   10 Mar 22 14:08:03.200

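Here is the system.profile log, which shows that each query executes in <1 ms. Here you can also see the 1-second delay before mongod executes the queries, and that it executes one operation per connection per second.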

> db.system.profile.find({ns: "kdm.shop.log"}, {ts: 1, millis: 1}).sort({ ts: -1}).limit(20)
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:30.538Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:30.538Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:30.537Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:30.536Z") }
{ "millis" : 1, "ts" : ISODate("2019-03-22T18:16:30.535Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:29.534Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:29.519Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:29.519Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:29.518Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:29.518Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:28.504Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:28.503Z") }
{ "millis" : 1, "ts" : ISODate("2019-03-22T18:16:28.502Z") }
{ "millis" : 1, "ts" : ISODate("2019-03-22T18:16:28.501Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:27.488Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:27.487Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:27.486Z") }
{ "millis" : 1, "ts" : ISODate("2019-03-22T18:16:26.476Z") }
{ "millis" : 2, "ts" : ISODate("2019-03-22T18:16:26.474Z") }
{ "millis" : 0, "ts" : ISODate("2019-03-22T18:16:25.453Z") }
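The server-side pattern can be checked by splitting the profiler timestamps into batches wherever consecutive entries are far apart. This is a minimal sketch: the timestamps are copied from the output above and reordered oldest first, the 500 ms threshold is an arbitrary choice, and `splitIntoBatches` is a hypothetical helper name.

```javascript
// Profiler timestamps copied from the system.profile output above, oldest first.
const profileTs = [
  '2019-03-22T18:16:25.453Z',
  '2019-03-22T18:16:26.474Z', '2019-03-22T18:16:26.476Z',
  '2019-03-22T18:16:27.486Z', '2019-03-22T18:16:27.487Z', '2019-03-22T18:16:27.488Z',
  '2019-03-22T18:16:28.501Z', '2019-03-22T18:16:28.502Z', '2019-03-22T18:16:28.503Z',
  '2019-03-22T18:16:28.504Z',
  '2019-03-22T18:16:29.518Z', '2019-03-22T18:16:29.518Z', '2019-03-22T18:16:29.519Z',
  '2019-03-22T18:16:29.519Z', '2019-03-22T18:16:29.534Z',
  '2019-03-22T18:16:30.535Z', '2019-03-22T18:16:30.536Z', '2019-03-22T18:16:30.537Z',
  '2019-03-22T18:16:30.538Z', '2019-03-22T18:16:30.538Z',
].map((s) => Date.parse(s));

// Split sorted timestamps into batches wherever the gap exceeds thresholdMs.
function splitIntoBatches(sortedMs, thresholdMs) {
  const batches = [];
  for (const t of sortedMs) {
    const last = batches[batches.length - 1];
    if (last && t - last[last.length - 1] <= thresholdMs) {
      last.push(t);
    } else {
      batches.push([t]);
    }
  }
  return batches;
}

const batches = splitIntoBatches(profileTs, 500);
const sizes = batches.map((b) => b.length);
// Time from the start of one batch to the start of the next, in milliseconds.
const gapsMs = batches.slice(1).map((b, i) => b[0] - batches[i][0]);

console.log(sizes);  // [ 1, 2, 3, 4, 5, 5 ]
console.log(gapsMs); // [ 1021, 1012, 1015, 1017, 1017 ]
```

The batch sizes grow by one per step and the batches start roughly one second apart, which matches the pattern in the client-side log: the queries reach mongod in the same one-second steps in which their results come back.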

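Does anyone know why this is so slow, and why the results are dispatched in blocks with a one-second delay? It seems that the number of connections increases by 1 every second, but each connection performs only 1 operation. Even if only a single connection were used and it executed one operation at a time, it would have to be much faster, as my other 2 systems show!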

Thanks for your help. If you need any further information, please let me know!

Best regards, Meffesino

0 answers:

No answers