Question

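I have a web application that stores some data in Mongo, and I need to return paginated responses from either a find or an aggregation pipeline. I use Django Rest Framework and its pagination, which ultimately slices the Cursor object. This works seamlessly for Cursors, but aggregate returns a CommandCursor, which does not implement __getitem__().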

cursor = collection.find({})
cursor[10:20] # works, no problem

command_cursor = collection.aggregate([{'$match': {}}])
command_cursor[10:20] # throws not subscriptable error

What is the reason for this? Does anyone have an implementation of CommandCursor.__getitem__()? Is it feasible at all?

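I would like to find a way to avoid fetching all the values when I only need one page. Converting to a list and then slicing it is not feasible for large pipeline results (more than 100,000 documents). There is a workaround based on this answer, but it only works well for the first few pages, and performance degrades rapidly toward the last pages.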

Answer 1

Mongo has aggregation pipeline stages that handle this, such as $skip and $limit, which you can use like so:

aggregation_results = list(collection.aggregate([{'$match': {}}, {'$skip':  10}, {'$limit':  10}]))

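As you noticed, Pymongo's command_cursor has no implementation of __getitem__, so the usual slicing syntax does not work as expected. I would personally recommend not tampering with their code unless you are interested in becoming a contributor to their package.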
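If slicing syntax is still wanted (for example, because a paginator slices whatever it is given), one option that leaves Pymongo untouched is a small wrapper that translates a slice into $skip and $limit stages. The following is only a minimal sketch: the class name SliceableAggregation is hypothetical, it assumes the wrapped object exposes an aggregate(pipeline) method as a Pymongo collection does, and it supports only forward slices with non-negative bounds.

```python
class SliceableAggregation:
    """Hypothetical wrapper (not part of Pymongo): maps slices to $skip/$limit."""

    def __init__(self, collection, pipeline):
        self.collection = collection
        self.pipeline = list(pipeline)

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("only slices are supported in this sketch")
        if key.step not in (None, 1):
            raise ValueError("slice steps are not supported")
        start = key.start or 0
        if start < 0 or (key.stop is not None and key.stop < 0):
            raise ValueError("negative indices are not supported")

        # Copy the base pipeline so repeated slicing never mutates it.
        stages = list(self.pipeline)
        if start:
            stages.append({"$skip": start})
        if key.stop is not None:
            limit = key.stop - start
            if limit <= 0:
                return []  # empty slice: skip the round trip to the server
            stages.append({"$limit": limit})

        # Only the requested page is fetched and materialized.
        return list(self.collection.aggregate(stages))
```

With this, SliceableAggregation(collection, [{'$match': {}}])[10:20] runs the pipeline with {'$skip': 10} and {'$limit': 10} appended. A paginator will usually also ask for a total count, which this sketch does not provide.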

Answer 2

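MongoDB cursors for find and aggregate work differently. The cursor returned by an aggregation query holds (in most cases) the result of processing the data, whereas a find cursor does not: its documents are static, so they can be skipped and limited as you wish.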

You can add the paginator limits as $skip and $limit stages in the aggregation pipeline.

For example:

command_cursor = collection.aggregate([
    {
        "$match": {
            # Match Conditions
        }
    },
    {
        "$skip": 10  # Number of documents to skip (should be 0 for page 1)
    },
    {
        "$limit": 10  # No. of documents to be displayed on your webpage
    }
])
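The skip value for a given page follows directly from the page number and page size. A minimal sketch of that arithmetic, assuming 1-based page numbers; the helper name page_stages is hypothetical and not part of Pymongo:

```python
def page_stages(page, page_size):
    """Return the $skip/$limit stages for a 1-based page number (illustrative helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return [
        {"$skip": (page - 1) * page_size},  # 0 for page 1
        {"$limit": page_size},
    ]


# Page 2 with 10 documents per page reproduces the stages shown above.
pipeline = [{"$match": {}}] + page_stages(page=2, page_size=10)
```

The resulting pipeline can then be passed to collection.aggregate(pipeline) as in the example above.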

How do I paginate aggregation pipeline results in pymongo?
