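Select rows at or after the most recent row with a specific value, per user

Time: 2017-11-12 13:08:48

Tags: mongodb mongodb-query

I have a mongo collection 'orders' that contains a list of users with an orderid and a time, as shown below: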

user    orderid     time    has_pending
10001   1       1510489123  0
10002   2       1510489125  0
10003   3       1510489127  0
10001   5       1510489131  1
10001   6       1510489133  1
10002   7       1510489135  0
10003   8       1510489137  0
10001   9       1510489139  1
10001   10      1510489141  0
10002   11      1510489143  1
10001   12      1510489145  0 <<<<< 
10002   13      1510489147  0 <<<<< 
10001   14      1510489149  1
10002   15      1510489151  1
10003   16      1510489153  1
10003   17      1510489155  1
10003   18      1510489157  1
10003   21      1510489163  1
10003   22      1510489165  0 <<<<< 

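I am trying to get, for each user, the list of orders whose time is >= the time of that user's last occurrence of has_pending = 0.

For example, if we look at the data for user 10001: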

user    orderid time    has_pending
10001   1   1510489123  0
10001   5   1510489131  1
10001   6   1510489133  1
10001   9   1510489139  1
10001   10  1510489141  0
10001   12  1510489145  0
10001   14  1510489149  1

So the result of this query for this user would be:

10001   12  1510489145  0
10001   14  1510489149  1

The desired query should fetch the data for all users, and the result should look like this:

user    orderid     time    has_pending
10001   12      1510489145  0
10002   13      1510489147  0
10001   14      1510489149  1
10002   15      1510489151  1
10003   22      1510489165  0
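
For reference, the rule can be sketched in plain JavaScript outside MongoDB. This only illustrates the expected logic and is not a Mongo query; the helper name `ordersSinceLastClear` and the in-memory `orders` array are assumptions made for this sketch, and dropping users who have no has_pending = 0 row is also an assumption, since the question does not cover that case.

```javascript
// Sample rows from the table above: [user, orderid, time, has_pending]
const rows = [
  [10001, 1, 1510489123, 0], [10002, 2, 1510489125, 0], [10003, 3, 1510489127, 0],
  [10001, 5, 1510489131, 1], [10001, 6, 1510489133, 1], [10002, 7, 1510489135, 0],
  [10003, 8, 1510489137, 0], [10001, 9, 1510489139, 1], [10001, 10, 1510489141, 0],
  [10002, 11, 1510489143, 1], [10001, 12, 1510489145, 0], [10002, 13, 1510489147, 0],
  [10001, 14, 1510489149, 1], [10002, 15, 1510489151, 1], [10003, 16, 1510489153, 1],
  [10003, 17, 1510489155, 1], [10003, 18, 1510489157, 1], [10003, 21, 1510489163, 1],
  [10003, 22, 1510489165, 0],
];
const orders = rows.map(([user, orderid, time, has_pending]) => ({
  user, orderid, time, has_pending,
}));

// Hypothetical helper: keep each user's orders whose time is >= the time of
// that user's most recent order with has_pending = 0.
function ordersSinceLastClear(docs) {
  // Pass 1: latest time per user at which has_pending was 0.
  const lastClear = new Map();
  for (const d of docs) {
    if (d.has_pending !== 0) continue;
    const prev = lastClear.get(d.user);
    if (prev === undefined || d.time > prev) lastClear.set(d.user, d.time);
  }
  // Pass 2: keep rows at or after that time, ordered by time ascending.
  // Users without any has_pending = 0 row are dropped (assumption).
  return docs
    .filter((d) => lastClear.has(d.user) && d.time >= lastClear.get(d.user))
    .sort((a, b) => a.time - b.time);
}

const result = ordersSinceLastClear(orders);
console.log(result.map((d) => d.orderid)); // [ 12, 13, 14, 15, 22 ]
```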

MySQL query:

SELECT
    t1.*
FROM
    test AS t1
LEFT JOIN test AS t2 ON t1.time >= t2.time AND t1.user = t2.user
WHERE
    t2.orderid= (SELECT max(orderid) FROM test WHERE user= t1.user AND has_pending = 0)

Any ideas on how to get this result in a single mongo query?

Thanks

3 answers:

Answer 0: (score: 2)

Given the following input documents:

{ "user" : 10001, "orderid" : 1, "time" : 1510489123, "has_pending" : 0 }
{ "user" : 10002, "orderid" : 2, "time" : 1510489125, "has_pending" : 0 }
{ "user" : 10003, "orderid" : 3, "time" : 1510489127, "has_pending" : 0 }
{ "user" : 10001, "orderid" : 5, "time" : 1510489131, "has_pending" : 1 }
{ "user" : 10001, "orderid" : 6, "time" : 1510489133, "has_pending" : 1 }
{ "user" : 10002, "orderid" : 7, "time" : 1510489135, "has_pending" : 0 }
{ "user" : 10003, "orderid" : 8, "time" : 1510489137, "has_pending" : 0 }
{ "user" : 10001, "orderid" : 9, "time" : 1510489139, "has_pending" : 1 }
{ "user" : 10001, "orderid" : 10, "time" : 1510489141, "has_pending" : 0 }
{ "user" : 10002, "orderid" : 11, "time" : 1510489143, "has_pending" : 1 }
{ "user" : 10001, "orderid" : 12, "time" : 1510489145, "has_pending" : 0 }
{ "user" : 10002, "orderid" : 13, "time" : 1510489147, "has_pending" : 0 }
{ "user" : 10001, "orderid" : 14, "time" : 1510489149, "has_pending" : 1 }
{ "user" : 10002, "orderid" : 15, "time" : 1510489151, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 16, "time" : 1510489153, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 17, "time" : 1510489155, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 18, "time" : 1510489157, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 21, "time" : 1510489163, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 22, "time" : 1510489165, "has_pending" : 0 }

Your query would need to look like this:

db.collection.aggregate([
{
    $sort: {
        "time": -1 // sort by "time" descending
    }
}, {
    $group: { // we want to slice our data per "user" so let's group by that field
        _id: "$user",
        "orders": {
            $push: "$$ROOT" // remember each document in an array per each "user" group (entries still sorted by "time" descending)
        }
    }
}, {
    $project: {
        "orders": { // our orders array shall only contain...
            $slice: [ "$orders", 0, { // ...all items from the last one up until...
                $add: [ { $indexOfArray: [ "$orders.has_pending", 0 ] }, 1 ] // ...the first appearance of an "has_pending" == 0 entry
                // the $add makes sure that we include the found element with "has_pending" == 0, too
            }]
        }
    }
}, {
    $unwind: "$orders" // restore original documents again by flattening the "orders" array
}, {
    $replaceRoot: { // move the (single) entry of the orders array to the root level of each document
        "newRoot": "$orders"
    }
}, {
    $sort: {
        "time": 1 // your example output was sorted by date so that's why we do that here, too...
    }
}])
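
To see why the $indexOfArray / $slice combination keeps the right rows, the same two steps can be mimicked on a plain array. The sketch below is not MongoDB code: `sliceUpToFirstZero` is a hypothetical helper, and the sample array is user 10001's orders sorted by time descending, as they would look after the $sort and $group stages. One edge case to be aware of: for a user with no has_pending = 0 entry, $indexOfArray returns -1, so the computed count becomes 0. As far as I can tell from the documentation, the three-argument form of $slice expects a positive count, so such users probably need a guard (for example a $cond); the sketch simply returns no rows in that case, which is an assumption.

```javascript
// User 10001's orders, newest first (as after the $sort and $group/$push stages).
const orders = [
  { orderid: 14, time: 1510489149, has_pending: 1 },
  { orderid: 12, time: 1510489145, has_pending: 0 },
  { orderid: 10, time: 1510489141, has_pending: 0 },
  { orderid: 9, time: 1510489139, has_pending: 1 },
  { orderid: 6, time: 1510489133, has_pending: 1 },
  { orderid: 5, time: 1510489131, has_pending: 1 },
  { orderid: 1, time: 1510489123, has_pending: 0 },
];

// Hypothetical helper mirroring the $project stage of the pipeline above.
function sliceUpToFirstZero(sortedDesc) {
  // Counterpart of { $indexOfArray: [ "$orders.has_pending", 0 ] }
  const idx = sortedDesc.map((o) => o.has_pending).indexOf(0);
  // No has_pending = 0 entry at all: return nothing (assumption for this sketch).
  if (idx === -1) return [];
  // Counterpart of { $slice: [ "$orders", 0, idx + 1 ] }; the +1 keeps the
  // has_pending = 0 entry itself.
  return sortedDesc.slice(0, idx + 1);
}

console.log(sliceUpToFirstZero(orders).map((o) => o.orderid)); // [ 14, 12 ]
```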

This gives you exactly the order and content you asked for (plus the _id field, which I have omitted for brevity):

{ "user" : 10001, "orderid" : 12, "time" : 1510489145, "has_pending" : 0 }
{ "user" : 10002, "orderid" : 13, "time" : 1510489147, "has_pending" : 0 }
{ "user" : 10001, "orderid" : 14, "time" : 1510489149, "has_pending" : 1 }
{ "user" : 10002, "orderid" : 15, "time" : 1510489151, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 22, "time" : 1510489165, "has_pending" : 0 }

Answer 1: (score: 0)

Answer

db.getCollection('orders').aggregate([
{ $sort: {"time": -1}},
{ 
    $group:{
        _id: {
            user: "$user", 
            has_pending: "$has_pending"
            },
        time: { $first: "$time"},
        orderid: { $first: "$orderid"}
    }
},
{
    $project: {
        _id: 0,
        user: "$_id.user",
        orderid: "$orderid",
        time: "$time",
        has_pending: "$_id.has_pending"
    }
}
])

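If you want to understand what each stage of the aggregation pipeline is doing, read on.

To explain what happens in each stage, I will take a subset of what you posted. So let's assume we have these documents: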

user    orderid     time    has_pending
10001   1       1510489123  0
10002   2       1510489125  0
10001   5       1510489131  1
10002   7       1510489135  0
10002   11      1510489143  1
10001   12      1510489145  0  
10002   13      1510489147  0 
10001   14      1510489149  1
10002   15      1510489151  1

Explaining the $sort result

The { $sort: {"time": -1} } stage sorts the results by time in descending order. This makes your results look like this:

user    orderid     time    has_pending
10002   15      1510489151  1
10001   14      1510489149  1
10002   13      1510489147  0
10001   12      1510489145  0
10002   11      1510489143  1
10002   7       1510489135  0
10001   5       1510489131  1
10002   2       1510489125  0
10001   1       1510489123  0

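Explaining the $group stage

The key we want to group by

Now we can group the results by user and has_pending, because we want only one result per user and per has_pending value. So we want just one document for each of:

user: 10001 with has_pending: 0
user: 10001 with has_pending: 1
user: 10002 with has_pending: 0
user: 10002 with has_pending: 1

This is what the _id of the $group stage does: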

_id: {
        user: "$user", 
        has_pending: "$has_pending"
        }

The _id field in your $group stage is mandatory; it describes what you want to group by.

Using $first in the $group stage

Note that I added:

time: { $first: "$time"},
orderid: { $first: "$orderid"}

I use $first because I know my documents are already sorted, so I can be sure that the first document in each group is the most recent one:

user: 10001 with has_pending: 0 will take "time" : 1510489145 and "orderid" : 12
user: 10001 with has_pending: 1 will take "time" : 1510489149 and "orderid" : 14
user: 10002 with has_pending: 0 will take "time" : 1510489147 and "orderid" : 13
user: 10002 with has_pending: 1 will take "time" : 1510489151 and "orderid" : 15

Explaining the $project stage

In this case, $project is only used to "normalize" your results, so that each document has the same fields as the output you asked for.
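
The pipeline can be checked against the full sample data from the question with a small plain-JavaScript emulation. This is a sketch, not MongoDB code: `latestPerUserAndFlag` is a hypothetical helper that mimics $sort followed by $group with $first, and the final sort is added only to make the printed order stable. It returns the newest row per (user, has_pending) pair. That matches the requested rows for users 10001 and 10002, but for user 10003 it also returns orderid 21, which is older than that user's last has_pending = 0 row, so it is worth verifying the output on your own data.

```javascript
// Full sample data from the question: [user, orderid, time, has_pending]
const rows = [
  [10001, 1, 1510489123, 0], [10002, 2, 1510489125, 0], [10003, 3, 1510489127, 0],
  [10001, 5, 1510489131, 1], [10001, 6, 1510489133, 1], [10002, 7, 1510489135, 0],
  [10003, 8, 1510489137, 0], [10001, 9, 1510489139, 1], [10001, 10, 1510489141, 0],
  [10002, 11, 1510489143, 1], [10001, 12, 1510489145, 0], [10002, 13, 1510489147, 0],
  [10001, 14, 1510489149, 1], [10002, 15, 1510489151, 1], [10003, 16, 1510489153, 1],
  [10003, 17, 1510489155, 1], [10003, 18, 1510489157, 1], [10003, 21, 1510489163, 1],
  [10003, 22, 1510489165, 0],
];
const orders = rows.map(([user, orderid, time, has_pending]) => ({
  user, orderid, time, has_pending,
}));

// Hypothetical helper emulating: $sort { time: -1 }, then $group on
// { user, has_pending } taking $first of each field.
function latestPerUserAndFlag(docs) {
  const sorted = [...docs].sort((a, b) => b.time - a.time); // newest first
  const groups = new Map();
  for (const d of sorted) {
    const key = `${d.user}|${d.has_pending}`; // compound _id
    if (!groups.has(key)) groups.set(key, d); // $first: keep the first doc seen
  }
  // Sorted by time ascending only so the output order is deterministic.
  return [...groups.values()].sort((a, b) => a.time - b.time);
}

const picked = latestPerUserAndFlag(orders);
console.log(picked.map((d) => d.orderid)); // [ 12, 13, 14, 15, 21, 22 ]
```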

Answer 2: (score: -1)

You can use the sort property. An example in the MongoDB shell, with output the same as the SQL query:

db.collection.find({}).sort({ user: 1, orderid: 1, time: 1, has_pending: 1 }).pretty()