I have about 60,000 documents in the users collection, and the following query:
db.getCollection('users').aggregate([
{"$match":{"userType":"employer"}},
{"$lookup":{"from":"companies","localField":"_id","foreignField":"owner.id","as":"company"}},
{"$unwind":"$company"},
{"$lookup":{"from":"companytypes","localField":"company.type.id","foreignField":"_id","as":"companyType"}},
{"$unwind":"$companyType"},
{ $group: { _id: null, count: { $sum: 1 } } }
])
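The count takes about 12 seconds. I call the count function before the list function, yet the list function with limit: 10 responds faster than the count.
Here is the explain result: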
{
"stages" : [
{
"$cursor" : {
"query" : {
"userType" : "employer"
},
"fields" : {
"company" : 1,
"_id" : 1
},
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "jobs.users",
"indexFilterSet" : false,
"parsedQuery" : {
"userType" : {
"$eq" : "employer"
}
},
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"userType" : {
"$eq" : "employer"
}
},
"direction" : "forward"
},
"rejectedPlans" : []
}
}
},
{
"$lookup" : {
"from" : "companies",
"as" : "company",
"localField" : "_id",
"foreignField" : "owner.id",
"unwinding" : {
"preserveNullAndEmptyArrays" : false
}
}
},
{
"$match" : {
"$nor" : [
{
"company" : {
"$eq" : []
}
}
]
}
},
{
"$group" : {
"_id" : {
"$const" : null
},
"total" : {
"$sum" : {
"$const" : 1
}
}
}
},
{
"$project" : {
"_id" : false,
"total" : true
}
}
],
"ok" : 1.0
}
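Answer 0 (score: 3)
$lookup operations are slow because they mimic left join behavior. From the docs:
$lookup performs an equality match on the localField to the foreignField from the documents of the from collection.
So if there is no index on the field used for joining the collections, MongoDB is forced to do a collection scan for every input document.
Adding an index on the foreignField attribute should prevent the collection scan and improve performance, possibly by an order of magnitude.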
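As a minimal sketch of that suggestion, the snippet below creates indexes on the fields the question's pipeline filters and joins on. The collection and field names (users.userType, companies.owner.id) are taken from the question; the helper name ensureLookupIndexes is hypothetical. The second $lookup joins on companytypes._id, which already has the default _id index, so it is left out here.

```javascript
// Sketch only: index the fields used by the question's $match and first $lookup.
// ensureLookupIndexes is a hypothetical helper name, not part of MongoDB.
function ensureLookupIndexes(database) {
  return [
    // Can let the initial $match on userType avoid the COLLSCAN shown in explain.
    database.getCollection("users").createIndex({ userType: 1 }),
    // Lets each $lookup probe find companies by owner.id through an index.
    database.getCollection("companies").createIndex({ "owner.id": 1 }),
  ];
}

// In the mongo shell a global `db` exists; outside the shell this call is skipped.
if (typeof db !== "undefined") {
  ensureLookupIndexes(db);
}
```

After creating the indexes, rerunning the aggregation and comparing timings is the simplest way to check whether they helped in a given deployment.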
Answer 1 (score: -1)
@paizo's answer is good, but what about when my foreignField is already an _id (with an index) and the query still takes a long time?
Here is my query:
db.customers.aggregate([
{
"$match": {}
},
{
"$lookup": {
"from": "core.entities",
"localField": "entityId",
"foreignField": "_id",
"as": "entity"
}
},
{
"$unwind": "$entity"
},
{
"$project": {
"entity._id": 0
}
},
{
"$replaceRoot": {
"newRoot": {
"$mergeObjects": [
"$entity",
"$$ROOT"
]
}
}
},
{
"$project": {
"entity": 0
}
},
{
$facet: {
paginatedResults: [
{
$skip: 0
},
{
$limit: 10
}
],
totalCount: [
{
$count: 'count'
}
]
}
}])
Here are the indexes on my customers collection:
[{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "applekkus-gmp.core.customers"
},
{
"v" : 2,
"key" : {
"name" : 1
},
"name" : "name_1",
"ns" : "applekkus-gmp.core.customers"
}]
...and here are the indexes on my entities collection:
[{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "applekkus-gmp.core.entities"
}]
...and here is the explain() output for my aggregation:
"stages": [
{
"$cursor": {
"query": {
},
"queryPlanner": {
"plannerVersion": 1,
"namespace": "applekkus-gmp.core.customers",
"indexFilterSet": false,
"parsedQuery": {
},
"winningPlan": {
"stage": "COLLSCAN",
"direction": "forward"
},
"rejectedPlans": []
}
}
},
{
"$lookup": {
"from": "core.entities",
"as": "entity",
"localField": "entityId",
"foreignField": "_id",
"unwinding": {
"preserveNullAndEmptyArrays": false
}
}
},
{
"$project": {
"entity": {
"_id": false
}
}
},
{
"$replaceRoot": {
"newRoot": {
"$mergeObjects": [
"$entity",
"$$ROOT"
]
}
}
},
{
"$project": {
"entity": false
}
},
{
"$facet": {
"paginatedResults": [
{
"$limit": NumberLong(10)
}
],
"totalCount": [
{
"$group": {
"_id": {
"$const": null
},
"count": {
"$sum": {
"$const": 1
}
}
}
},
{
"$project": {
"_id": false,
"count": true
}
}
]
}
}
],
"ok": 1}
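My case is very similar to @jones's: I have a collection of 40,000 documents, and this aggregation takes 8 seconds to return just 10 documents (the limit) plus the total count (40,000).
P.S. If I run customers.find().count(), it returns the count of 40,000 in less than 1 second.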
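One possible restructuring, offered as a sketch rather than a confirmed fix: because $facet sits after the $lookup, the totalCount branch needs every joined document, so all 40,000 customers are looked up before the page is cut. If every customer has exactly one matching entity (an assumption, not something the post states), the join changes neither the count nor the page membership, so the page can be cut first and the total counted separately. The helper name buildPagePipeline is hypothetical; the stages are the ones from the query above.

```javascript
// Sketch only: page first, join second, count separately.
// Assumes each customer matches exactly one entity, so $lookup/$unwind
// neither drops nor duplicates customers. buildPagePipeline is hypothetical.
function buildPagePipeline(skip, limit) {
  return [
    // Cut the page before the join so only `limit` documents are looked up.
    { $skip: skip },
    { $limit: limit },
    { $lookup: { from: "core.entities", localField: "entityId", foreignField: "_id", as: "entity" } },
    { $unwind: "$entity" },
    { $project: { "entity._id": 0 } },
    { $replaceRoot: { newRoot: { $mergeObjects: ["$entity", "$$ROOT"] } } },
    { $project: { entity: 0 } },
  ];
}

// In the mongo shell a global `db` exists; outside the shell this part is skipped.
if (typeof db !== "undefined") {
  const page = db.customers.aggregate(buildPagePipeline(0, 10)).toArray();
  // Same count call as in the P.S. above, which returned in under a second.
  const total = db.customers.find().count();
  printjson({ paginatedResults: page, totalCount: total });
}
```

As in the original query, there is no $sort, so the page order is whatever the server returns; stable paging would need a $sort on an indexed field before $skip.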