在mongodb聚合中包含字段

时间:2018-05-14 22:33:09

标签: mongodb mongodb-query aggregation-framework

我有以下系列:

{"orderID" : "30688", "branch" : "CO", "customerID" : "11396783", "customerEmail" : "foo@bar.com"}
{"orderID" : "30688", "branch" : "CO", "customerID" : "11396783", "customerEmail" : "foo@bar.com"}
{"orderID" : "30688", "branch" : "CO", "customerID" : "11396783", "customerEmail" : "foo@bar.com"}
{"orderID" : "89765", "branch" : "CO", "customerID" : "54157526", "customerEmail" : ""}
{"orderID" : "89765", "branch" : "CO", "customerID" : "54157526", "customerEmail" : ""}
{"orderID" : "21546", "branch" : "CO", "customerID" : "20103585", "customerEmail" : "xxx@yyy.com"}
{"orderID" : "21546", "branch" : "CO", "customerID" : "20103585", "customerEmail" : "xxx@yyy.com"}
{"orderID" : "21546", "branch" : "KA", "customerID" : "89374792", "customerEmail" : "aaa@ccc.com"}
{"orderID" : "21794", "branch" : "NY", "customerID" : "78125522", "customerEmail" : ""}

我需要在customerEmail不为null的某个分支中获取所有唯一的customerID。我期望"分支":" CO"

{"customerID" : "11396783", "customerEmail" : "foo@bar.com"}
{"customerID" : "20103585", "customerEmail" : "xxx@yyy.com"}

到目前为止,我已尝试过:

db.collection.aggregate([
    { $match: { branch: "CO" } },   
    { $group:
        {
            _id: { customer:"$customerID"}
        }
    },
    {
        $group: {_id:"$_id.customer"}
    },
    {
        $addFields: {  email: "$customerEmail"} 
    }
]);

但它没有带来电子邮件字段。

1 个答案:

答案 0 :(得分:2)

它不包括该字段,因为您没有要求该字段返回。您在此处缺少的是使用$first或类似的"accumulator",以便在$group期间返回元素。

此外,如果您不想要空的电子邮件地址,请在$match管道阶段将其排除,因为这是最有效的方法。

db.collection.aggregate([
    { $match: { branch: "CO", "customerEmail": { "$ne": "" } } },
    { $group:
        {
            _id: { customer:"$customerID"},
            email: { "$first": "$customerEmail" }
        }
    }
]);

A"管道"只返回"输出"从您实际要求的$group$project等阶段开始。就像" Unix管道" |运营商,"下一阶段可用的唯一内容"是你输出的。

这应该很明显地来自:

db.collection.aggregate([
    { $match: { branch: "CO" } },   
    { $group:
        {
            _id: { customer:"$customerID"}
        }
    }
]);

甚至:

db.collection.aggregate([
    { $match: { branch: "CO" } },   
    { $project:
        {
            _id: { customer:"$customerID"}
        }
    }
]);

当然只返回_id值,因为这就是你要求的全部内容。

您只能在任何管道阶段访问前一阶段"输出的数据。在$group内只表示分组键的_id,以及使用有效"accumulator"明确指定"明确" 的任何内容您希望返回的其他房产。任何累加器(对于"字符串"此处有效)都可以,但_id 以外的任何内容必须使用"accumulator"

我建议花时间查看所有aggregation operators以及他们实际做的事情。每个操作员都有示例用法