Question

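In a collection, I have objects that each contain an array, and I want to find certain elements in that array without scanning the whole array. The objects in my collection look like this: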

{
    "transactions": [
        {"id": randint(0, 100000), "hello": randint(0, 1000)} for _ in range(100000)
    ]
}
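For reference, the shape above can be generated with a small Python helper. This is only a sketch: the function name make_document and its seed parameter are hypothetical and are not taken from toto_mongo.py.

```python
import random


def make_document(n_transactions=100_000, seed=None):
    """Build one document with the shape shown above.

    `make_document` and `seed` are illustrative names, not part of the
    original script; `seed` only makes the output reproducible.
    """
    rng = random.Random(seed)
    return {
        "transactions": [
            {"id": rng.randint(0, 100000), "hello": rng.randint(0, 1000)}
            for _ in range(n_transactions)
        ]
    }


# Small example: 5 transactions instead of 100000.
doc = make_document(n_transactions=5, seed=0)
print(len(doc["transactions"]))  # prints "5"
```

Each call returns a plain dict, so it could be passed to an insert call on a pymongo collection as-is.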

I want to get all the transactions in the collection whose id is 17, so I created this index:

db.toto.createIndex({'transactions.id': 1})

But to see only the transactions I want, I have to $unwind, and that unwind is still slow:

db.toto.aggregate([
    {"$match": {"transactions.id": 17}},
    {"$unwind": "$transactions"},
    {"$match": {"transactions.id": 17}},
])

This gives me:

[{'_id': ObjectId('5bf854f685699a394ce5ba82'),
  'transactions': {'hello': 920, 'id': 17}},
 {'_id': ObjectId('5bf854f685699a394ce5ba82'),
  'transactions': {'hello': 446, 'id': 17}},
 {'_id': ObjectId('5bf854f685699a394ce5ba84'),
  'transactions': {'hello': 822, 'id': 17}},
 {'_id': ObjectId('5bf854f685699a394ce5ba84'),
  'transactions': {'hello': 830, 'id': 17}},
 [...]
 {'_id': ObjectId('5bf854f885699a394ce5ba89'),
  'transactions': {'hello': 301, 'id': 17}},
 {'_id': ObjectId('5bf854f985699a394ce5ba8b'),
  'transactions': {'hello': 666, 'id': 17}}]

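Adding the first $match makes the query somewhat faster, because it does use the index to select only the objects that contain the transaction I am looking for. It does not, however, use the index to speed up the $unwind: MongoDB still iterates over the entire array of 100000 transactions to find the ones I want.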
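To make the cost concrete, here is a pure-Python model of the three stages. It is a simplified illustration, not MongoDB's actual execution plan; simulate_pipeline and the tiny sample data are made up for the example.

```python
def simulate_pipeline(docs, tx_id):
    """Model $match -> $unwind -> $match; return (results, elements_visited)."""
    # Stage 1: $match keeps whole documents that contain at least one
    # matching element. In MongoDB, this is the stage the index can serve.
    matched = [
        d for d in docs
        if any(t["id"] == tx_id for t in d["transactions"])
    ]

    results = []
    visited = 0
    for doc in matched:
        # Stage 2: $unwind emits one document per array element,
        # whether or not that element matches.
        for t in doc["transactions"]:
            visited += 1
            # Stage 3: the second $match drops the unwanted elements.
            if t["id"] == tx_id:
                results.append({"_id": doc["_id"], "transactions": t})
    return results, visited


docs = [
    {"_id": 1, "transactions": [
        {"id": 17, "hello": 1}, {"id": 3, "hello": 2}, {"id": 17, "hello": 5},
    ]},
    {"_id": 2, "transactions": [{"id": 4, "hello": 7}]},
]
results, visited = simulate_pipeline(docs, 17)
print(len(results), visited)  # prints "2 3"
```

In this model, the second document is skipped entirely by the first stage, but every element of the first document is visited even though only two of them match. With 100000 elements per array, that per-element work dominates.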

The query takes 5 seconds to find about 100 objects, whereas a query that uses the index, such as db.toto.count({"transactions.id": 17}), takes less than 0.1 seconds.

Here is the Python file I used to investigate this problem. You can reproduce the problem as follows:

pip3 install fire pymongo
chmod +x toto_mongo.py
./toto_mongo.py insert
./toto_mongo.py create_index
time ./toto_mongo.py slow_query

Is it possible to make $unwind fast in MongoDB by creating an index?
