Question

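I am facing a performance issue on my API, which is built with NodeJS + Express + MongoDB.

When I run the aggregation with a $match on a specific product, performance is good, but for an open search it is really slow.

I want to group on two columns, country and exporter, and then limit the result to 3 results per country group.

Requirement: the total count of unique exporters for each country, along with any 3 records for each country.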

When I run explain() on my aggregate function, I get the following key pointers, which I believe indicate that my query is slow. Please correct me if I am wrong.

"indexFilterSet": false
"winningPlan": { "stage": "COLLSCAN", "direction": "forward" },

The query runs on 9,264,947 records and takes about 32 seconds. I have tried a compound index as well as single-field indexes, but they do not help at all; I suspect the index is not used because $match is empty ({}).

Below is the query I am running on MongoDB using the mongoose driver:

Model.aggregate([
    { "$match": query },
    { $group: { _id: { country: "$Country", exporter: "$Exporter" }, id: { $first: "$_id" }, product: { $first: "$Description" } } },
    { $group: { _id: "$_id.country", data: { $push: { id: "$id", company: "$_id.exporter", product: "$product" } }, count: { $sum: 1 } } },
    { "$sort": { "count": -1 } },
    { $project: { "data": { "$slice": [ "$data", 3 ] }, "_id": 1, "count": 1 } },
]).allowDiskUse(true).explain()

Here, query is generated dynamically and defaults to an empty {} for a collection-wide search. The indexed fields are:

Compound index: {Country: 1, Exporter: 1}
Text index: {Description: "text"}

Full explain() response:

{
  "success": "Successfull",
  "status": 200,
  "data": {
    "stages": [
      {
        "$cursor": {
          "query": {},
          "fields": { "Country": 1, "Description": 1, "Exporter": 1, "_id": 1 },
          "queryPlanner": {
            "plannerVersion": 1,
            "namespace": "db.OpenExportData",
            "indexFilterSet": false,
            "parsedQuery": {},
            "winningPlan": { "stage": "COLLSCAN", "direction": "forward" },
            "rejectedPlans": []
          }
        }
      },
      { "$group": { "_id": { "country": "$Country", "exporter": "$Exporter" }, "id": { "$first": "$_id" }, "product": { "$first": "$Description" } } },
      { "$group": { "_id": "$_id.country", "data": { "$push": { "id": "$id", "company": "$_id.exporter", "product": "$product" } }, "count": { "$sum": { "$const": 1 } } } },
      { "$sort": { "sortKey": { "count": -1 } } },
      { "$project": { "_id": true, "count": true, "data": { "$slice": [ "$data", { "$const": 3 } ] } } }
    ],
    "ok": 1
  }
}

Collection size: 9,264,947 records and 10.2 GB

Response time: 32154 ms

The query gets slower as the size of my collection increases.

Answer 1

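Using aggregation like this means that MongoDB has to go through all the records, then group the data (loading the full 10 GB), and then slice the arrays that were built.

Of course, the more your collection grows, the longer this takes.

I think you need to rethink how you process the data rather than optimize the current request.

I would first use one request to find every country name, and then one request per country to get its first three exporters.

Use the index on country and exporter.

That is more requests, but they are smaller and do not need to load all the data; they access the data directly through the right indexes.

This assumes there are not thousands of different countries.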
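The per-country approach described above could be sketched roughly as follows. This is only an illustration, not the answerer's code: it assumes `Model` is the mongoose model for the collection, that the `{Country: 1, Exporter: 1}` index from the question exists, and the function name `summarizeByCountry` is made up for the example. It returns any three records per country, so unlike the original pipeline the three records are not guaranteed to come from three different exporters.

```javascript
// Sketch only: `Model` is assumed to be a mongoose model for the collection.
// `summarizeByCountry` is a hypothetical helper name, not part of mongoose.
async function summarizeByCountry(Model, perCountry = 3) {
  // One small query for the list of country names.
  const countries = await Model.distinct("Country");

  const results = [];
  for (const country of countries) {
    // Distinct exporters for this country; the equality filter on Country
    // is the kind of query the compound index is meant to serve.
    const exporters = await Model.distinct("Exporter", { Country: country });

    // Any `perCountry` records for this country.
    const data = await Model.find({ Country: country })
      .select({ Exporter: 1, Description: 1 })
      .limit(perCountry)
      .lean();

    results.push({ _id: country, count: exporters.length, data });
  }

  // Same ordering as the original pipeline: most exporters first.
  results.sort((a, b) => b.count - a.count);
  return results;
}
```

Because `distinct` returns every exporter name for a country just to count them, a country with a very large number of exporters may need a different counting strategy; checking each query with explain() is the way to confirm the index is actually used.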

Answer 2

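If your query is {}, the mongo engine skips the $match stage and goes directly to $group, so no index is used. You can verify this from the explain() result. The $match and $sort pipeline operators can take advantage of an index when they occur at the beginning of the pipeline. Looking at your pipeline, you group by Country and Exporter. What you can do is create an index on {Country: 1, Exporter: 1} and use a $sort on {Country: 1, Exporter: 1} as the first stage of the pipeline. This makes $group more efficient.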
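A minimal sketch of that suggestion, reusing the stages from the question: the helper below only builds the pipeline array, and its name `buildPipeline` is an assumption made for this example. The $match stage is added only when the filter is non-empty, so for a collection-wide search the $sort on `{Country: 1, Exporter: 1}` is the first stage. Whether this actually changes the winning plan should be checked with explain() on your own data and MongoDB version.

```javascript
// Sketch only: `buildPipeline` is a hypothetical helper, not a mongoose API.
// It assumes the compound index { Country: 1, Exporter: 1 } already exists.
function buildPipeline(query = {}) {
  const pipeline = [];

  // Skip $match entirely when there is no filter to apply.
  if (Object.keys(query).length > 0) {
    pipeline.push({ $match: query });
  }

  pipeline.push(
    // Leading sort on the indexed fields, as suggested in the answer.
    { $sort: { Country: 1, Exporter: 1 } },
    // One document per (country, exporter) pair.
    {
      $group: {
        _id: { country: "$Country", exporter: "$Exporter" },
        id: { $first: "$_id" },
        product: { $first: "$Description" },
      },
    },
    // Collect exporters per country and count them.
    {
      $group: {
        _id: "$_id.country",
        data: { $push: { id: "$id", company: "$_id.exporter", product: "$product" } },
        count: { $sum: 1 },
      },
    },
    { $sort: { count: -1 } },
    // Keep only three records per country.
    { $project: { data: { $slice: ["$data", 3] }, _id: 1, count: 1 } }
  );

  return pipeline;
}
```

It would be used in place of the literal array from the question, for example `Model.aggregate(buildPipeline(query)).allowDiskUse(true).explain()`.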

Slow query in MongoDB, collection-wide aggregate query using group
