allowDiskUse in pymongo

Time: 2016-06-22 14:03:27

Tags: mongodb python-2.7 pymongo

I store data in MongoDB in the following format.

{
    "_id" : ObjectId("570b487fb5360dd1e5ef840c"),
    "internal_id" : 1,
    "created_at" : ISODate("2015-07-14T10:08:38.994Z"),
    "updated_at" : ISODate("2016-01-10T00:35:19.748Z"),
    "ad_account_id" : 1,
    "updated_time" : "2013-08-05T04:48:49-0700",
    "created_time" : "2013-08-05T04:46:35-0700",
    "name" : "Sale1",
    "daily": [
                 {"clicks": 5000, "date": "2015-04-16"},
                 {"clicks": 5100, "date": "2015-04-17"},
                 {"clicks": 5030, "date": "2015-04-20"}
             ],
    "custom_tags" : {
        "Event" : {
            "name" : "Clicks"
        },
        "Objective" : {
            "name" : "Sale"
        },
        "Image" : {
            "name" : "43c3fe7b262cde5f476ed303e472c65a"
        },
        "Goal" : {
            "name" : "10"
        },
        "Type" : {
            "name" : "None"
        },
        "Call To Action" : {
            "name" : "None"
        },
        "Landing Pages" : {
            "name" : "www.google.com"
        }
    }
}

I am trying to group the individual documents by internal_id to find the total number of clicks from 2015-04-15 to 2015-04-21 using the aggregate method.
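For reference, the grouping described above could be expressed roughly as the pipeline below. This is only a sketch, not the asker's actual code: the helper name `build_clicks_pipeline` and the output field `total_clicks` are made up for illustration, and it assumes the `daily.date` values are always "YYYY-MM-DD" strings as in the sample document, so plain string comparison orders them chronologically.

```python
def build_clicks_pipeline(start_date, end_date):
    """Return a pipeline that sums daily clicks per internal_id.

    Hypothetical helper; start_date and end_date are "YYYY-MM-DD" strings.
    """
    return [
        # Emit one document per element of the "daily" array.
        {"$unwind": "$daily"},
        # Keep only the days inside the requested range (inclusive).
        {"$match": {"daily.date": {"$gte": start_date, "$lte": end_date}}},
        # Add up the clicks of the remaining days for each internal_id.
        {"$group": {
            "_id": "$internal_id",
            "total_clicks": {"$sum": "$daily.clicks"},
        }},
    ]


pipeline = build_clicks_pipeline("2015-04-15", "2015-04-21")
```

Since the `$group` stage only needs `internal_id` and `daily`, large fields such as `custom_tags` do not have to travel through the pipeline at all, which may also reduce memory pressure.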

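In pymongo, when I $project only internal_id in the aggregate pipeline I get results, but when I also $project the custom_tags field, I get the following error:

OperationFailure: Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.

After following the answer here, I even changed my aggregate call to pass allowDiskUse. But this still raises the same error.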

2 Answers:

Answer 0 (score: 1)

Take a look at this link: Can't get allowDiskUse:True to work with pymongo

This worked for me:

someSampleList = db.collectionName.aggregate(pipeline, allowDiskUse=True)

where

pipeline = [
    {'$sort': {'sortField': 1}},
    {'$group': {'_id': '$distinctField'}}, 
    {'$limit': 20000}]
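The key point is that allowDiskUse goes in as a keyword argument to aggregate, not as a dict in the second position. A minimal sketch of wrapping that call, assuming PyMongo 3.0 or later (where aggregate returns a cursor); the helper name `aggregate_with_disk` is hypothetical:

```python
def aggregate_with_disk(collection, pipeline):
    """Run an aggregation, letting the server spill large stages to disk.

    `collection` is expected to be a pymongo Collection (assumption).
    """
    # allowDiskUse is a keyword argument of aggregate(); passing
    # {"allowDiskUse": True} as a positional dict does not enable it.
    cursor = collection.aggregate(pipeline, allowDiskUse=True)
    # Drain the cursor so the caller gets a plain list of documents.
    return list(cursor)
```

It would be called as `aggregate_with_disk(db.collectionName, pipeline)` with the pipeline defined above.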

Answer 1 (score: 0)

Try:

list(collection._get_collection().aggregate(mongo_query["pipeline"], allowDiskUse=True))