Question

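I have a large database and I am trying to optimize my queries. For example, I have a collection whose documents have the following structure: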

{
"field1": "value",
"field2": "value",
"field3": "value"
}

I have about 1,000,000 documents, so I can measure query performance.

My goal is to optimize a search for documents where field1 = 1 and field2 exists.

First, I tried it without any index:

db.Collection.aggregate({"$match": {"field1": 1, "field2": {"$exists": true}}}, {"$count": "count"})

This query takes 1720 ms. OK, let's add an index.

db.Collection.createIndex({"field1": 1, "field2": 1}, {"sparse": true})

The query now takes 2212 ms. What?! Maybe I should try a non-sparse index instead:

db.Collection.createIndex({"field1": 1, "field2": 1})

2225 ms. OK, let's experiment. How long does it take to query on just one field, without an index?

db.Collection.aggregate({"$match": {"field1": 1}}, {"$count": "count"})

1456 ms.

db.Collection.aggregate({"$match": {"field2": {"$exists": true}}}, {"$count": "count"})

1807 ms.

Let's try adding single-field indexes:

db.Collection.createIndex({"field1": 1})
db.Collection.aggregate({"$match": {"field1": 1}}, {"$count": "count"})

447 ms. Better.

db.Collection.createIndex({"field2": 1})
db.Collection.aggregate({"$match": {"field2": {"$exists": true}}}, {"$count": "count"})

322 ms. Better.

What about both fields together? I ran the original query again and got 1821 ms.

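What is going on? I can see in explain() that the index is being used, so why is it so slow? I assumed that querying on both fields would be faster, because the field2 condition only narrows down the field1 result: the database could use the index to find all documents with field1 = 1, and then pick out the ones that have field2 from that set.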
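
To make my expectation concrete, here is a rough sketch in plain JavaScript of how I pictured a compound index on (field1, field2) being used. This is only a toy model of my mental picture, not a claim about how MongoDB works internally; buildIndex, lowerBound, and countMatches are names I made up for illustration, and the sample documents are invented.

```javascript
// Toy model only: a sorted array stands in for a compound index on
// (field1, field2). None of these names are MongoDB APIs.
function buildIndex(docs) {
  return docs
    .map((doc, id) => ({ field1: doc.field1, hasField2: "field2" in doc, id }))
    .sort((a, b) => a.field1 - b.field1);
}

// Binary search for the first entry whose field1 is >= target.
function lowerBound(index, target) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid].field1 < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Count entries with field1 === target that also have field2,
// touching only the contiguous run of matching index entries.
function countMatches(index, target) {
  let count = 0;
  for (
    let i = lowerBound(index, target);
    i < index.length && index[i].field1 === target;
    i++
  ) {
    if (index[i].hasField2) count++;
  }
  return count;
}

// Invented sample data: three documents have field1 = 1, two of them have field2.
const docs = [
  { field1: 1, field2: "a" },
  { field1: 2, field2: "b" },
  { field1: 1 },
  { field1: 1, field2: "c", field3: "d" },
  { field1: 3 },
];
const index = buildIndex(docs);
console.log(countMatches(index, 1)); // prints 2
```

In this picture the count never has to look at the documents themselves, only at the index entries for field1 = 1, which is why I expected the two-field query to be no slower than the single-field ones.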

So how can I optimize this query? I think it should be possible to get it down to no more than about 700 ms.