How to get a single document per group with a duplicate key in MongoDB

Time: 2016-07-06 03:38:56

Tags: mongodb aggregation-framework

My aggregation is as follows:

[
    {
        "$project" : {
                "country_code" : "$country_code",
                "event" : "$event",
                "user_id" : "$user_id",
                "os" : "$os",
                "register_time" : "$register_time",
                "channel" : "$channel"
        }
    },
    {
        "$match" : {
                "channel" : "000001",
                "register_time" : {
                    "$gt" : ISODate("2016-06-01T00:00:00Z"),
                    "$lt" : ISODate("2016-06-30T23:59:00Z")
                },
                "event" : "Register_with_number"
        }
    },
    {
        "$group" : {
                "_id" : {
                    "country_code" : "$country_code",
                    "user_id" : "$user_id",
                    "os" : "$os",
                    "channel" : "$channel",
                    "register_time" : "$register_time"
                },
                "count" : {
                    "$sum" : 1
                }
        }
    }
]

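The result is shown below. As you can see, for country_code IN there are two records with the same user_id but different register_time values. How can I get only one record when the user_id is the same?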

{ "_id" : { "country_code" : "US", "user_id" : "d2a0fe91", "os" : "Android", "channel" : "000001", "register_time" : ISODate("2016-06-30T22:47:43Z") }, "count" : 1 }    
{ "_id" : { "country_code" : "US", "user_id" : "77911591", "os" : "Android", "channel" : "000001", "register_time" : ISODate("2016-06-30T19:47:21Z") }, "count" : 1 }
{ "_id" : { "country_code" : "IN", "user_id" : "1b72fd12", "os" : "Android", "channel" : "000001", "register_time" : ISODate("2016-06-30T19:17:28Z") }, "count" : 1 }
{ "_id" : { "country_code" : "IN", "user_id" : "1b72fd12", "os" : "Android", "channel" : "000001", "register_time" : ISODate("2016-06-30T19:15:13Z") }, "count" : 1 }
{ "_id" : { "country_code" : "ID", "user_id" : "045f1637", "os" : "Android", "channel" : "000001", "register_time" : ISODate("2016-06-30T19:02:19Z") }, "count" : 1 }

1 answer:

Answer 0 (score: 1)

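There are several solutions, because you did not say what the output document should look like when several documents have the same user but different register_time values.
The following changes your last $group stage: register_time is removed from the group _id, and either an array of all register_time values is kept with $push or, if you only need one, any one of them is kept with $first. Note that if you sort the pipeline by register_time before grouping, $first / $last keep the earliest / latest register_time for each user, which may be the result you want.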

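{
    "$group" : {
        // register_time is no longer part of _id, so each user yields one group
        "_id" : {
            "country_code" : "$country_code",
            "user_id" : "$user_id",
            "os" : "$os",
            "channel" : "$channel"
        },
        // every register_time seen for this group, as an array
        "register_times" : {
            "$push" : "$register_time"
        },
        // one register_time per group (arbitrary unless the input is sorted)
        "any_register_time" : {
            "$first" : "$register_time"
        },
        "count" : {
            "$sum" : 1
        }
    }
}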
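
To make the $first / $last remark concrete, here is a minimal sketch of a full pipeline with a $sort stage placed directly before $group. It is only an illustration under a few assumptions: the field names first_register_time and last_register_time and the collection name events are hypothetical, the original $project stage is left out for brevity, and dates are written with new Date(...) instead of ISODate(...) so the snippet is also valid plain JavaScript.

```javascript
// Sketch only: output field names and the "events" collection are hypothetical.
const pipeline = [
    {
        // Same filter as in the question.
        $match: {
            channel: "000001",
            event: "Register_with_number",
            register_time: {
                $gt: new Date("2016-06-01T00:00:00Z"),
                $lt: new Date("2016-06-30T23:59:00Z")
            }
        }
    },
    // Sort ascending so that $first / $last below pick the earliest / latest value.
    { $sort: { register_time: 1 } },
    {
        $group: {
            // register_time is not part of the key, so each user appears once.
            _id: {
                country_code: "$country_code",
                user_id: "$user_id",
                os: "$os",
                channel: "$channel"
            },
            first_register_time: { $first: "$register_time" },
            last_register_time: { $last: "$register_time" },
            count: { $sum: 1 }
        }
    }
];

// In the mongo shell this would be run as: db.events.aggregate(pipeline)
```

With the sample output from the question, the two IN records for user_id 1b72fd12 would be expected to collapse into a single group with count 2, assuming their other key fields really are identical as shown.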