Question

I store data in MongoDB in the following format.

{
    "_id" : ObjectId("570b487fb5360dd1e5ef840c"),
    "internal_id" : 1,
    "created_at" : ISODate("2015-07-14T10:08:38.994Z"),
    "updated_at" : ISODate("2016-01-10T00:35:19.748Z"),
    "ad_account_id" : 1,
    "updated_time" : "2013-08-05T04:48:49-0700",
    "created_time" : "2013-08-05T04:46:35-0700",
    "name" : "Sale1",
    "daily" : [
        {"clicks" : 5000, "date" : "2015-04-16"},
        {"clicks" : 5100, "date" : "2015-04-17"},
        {"clicks" : 5030, "date" : "2015-04-20"}
    ],
    "custom_tags" : {
        "Event" : {
            "name" : "Clicks"
        },
        "Objective" : {
            "name" : "Sale"
        },
        "Image" : {
            "name" : "43c3fe7b262cde5f476ed303e472c65a"
        },
        "Goal" : {
            "name" : "10"
        },
        "Type" : {
             "name" : "None"
        },
        "Call To Action" : {
            "name" : "None"
        },
        "Landing Pages" : {
            "name" : "www.google.com"
        }
    }
}

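I am trying to group the individual documents by internal_id to find the total number of clicks from 2015-04-15 to 2015-04-21 using the aggregate method.

In pymongo, when I run aggregate with a $project on internal_id, I get results, but when I try to $project the custom_tags field, I get the following error:

OperationFailure: Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.

After following the answer here, I even changed my aggregate call to pass allowDiskUse, but it still raises the same error.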

Answer 1

Take a look at this link: Can't get allowDiskUse:True to work with pymongo

This worked for me:

someSampleList = db.collectionName.aggregate(pipeline, allowDiskUse=True)

where

pipeline = [
    {'$sort': {'sortField': 1}},
    {'$group': {'_id': '$distinctField'}}, 
    {'$limit': 20000}]
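
Applied to the documents in the question, the same keyword-argument call could look like the sketch below. It is only an illustration: the helper names clicks_pipeline and total_clicks and the output field total_clicks are made up here, and it assumes the date range is meant to be inclusive and that the "date" values are always "YYYY-MM-DD" strings, so plain string comparison orders them chronologically.

```python
def clicks_pipeline(start_date, end_date):
    """Build a pipeline that sums daily clicks per internal_id.

    Hypothetical helper: keeps only "daily" entries whose date lies in
    [start_date, end_date], both given as "YYYY-MM-DD" strings.
    """
    return [
        # Emit one document per element of the "daily" array.
        {"$unwind": "$daily"},
        # ISO-style date strings compare in chronological order.
        {"$match": {"daily.date": {"$gte": start_date, "$lte": end_date}}},
        # Add up the clicks for each internal_id.
        {"$group": {"_id": "$internal_id",
                    "total_clicks": {"$sum": "$daily.clicks"}}},
    ]


def total_clicks(collection, start_date, end_date):
    """Run the pipeline on a PyMongo-style collection object."""
    # allowDiskUse is passed as a keyword argument, not inside a dict.
    cursor = collection.aggregate(
        clicks_pipeline(start_date, end_date), allowDiskUse=True
    )
    return list(cursor)
```

It would be called as total_clicks(db.collectionName, "2015-04-15", "2015-04-21"). Because the pipeline never touches custom_tags, the $group stage only has to carry internal_id and the click counts.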

Answer 2

Try:

list(collection._get_collection().aggregate(mongo_query["pipeline"], allowDiskUse=True))
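
The option has to reach aggregate as a Python keyword argument. The sketch below shows why a positional dict is not equivalent. It uses a stand-in function, aggregate_stub, which is hypothetical and not part of PyMongo; it only assumes that the real method has a signature of the general shape (pipeline, session=None, **kwargs), which you should check against the documentation for your installed version.

```python
def aggregate_stub(pipeline, session=None, **kwargs):
    """Hypothetical stand-in, NOT PyMongo.

    Mimics a signature in which options such as allowDiskUse are read
    from keyword arguments.
    """
    return {"session": session, "options": kwargs}


pipeline = [{"$group": {"_id": "$internal_id"}}]

# Keyword form: the option ends up among the keyword options.
as_keyword = aggregate_stub(pipeline, allowDiskUse=True)

# Positional dict: it binds to the second positional parameter instead,
# so no option named allowDiskUse is ever seen.
as_positional = aggregate_stub(pipeline, {"allowDiskUse": True})

print(as_keyword["options"])
print(as_positional["options"])
```

A JavaScript-style literal such as {allowDiskUse: true} fails even earlier in Python, since allowDiskUse and true are undefined names there.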
