Given the following document:
{
"_id" : ObjectId("53cd79bb300ccae6b3904402"),
"name" : "test product",
"sku" : "product-1",
"price" : 35,
"cost" : 12,
"max_cpc" : 100,
"price_in_cents" : 3500,
"cost_in_cents" : 1200,
"max_cpc_in_cents" : 10000,
"clicks" : [
{
"date" : ISODate("2014-04-25T00:00:00Z"),
"clicks" : 2,
"channel" : "google",
"campaign" : "12345687",
"group" : "987654321"
},
{
"date" : ISODate("2014-04-25T00:00:00Z"),
"clicks" : 3,
"channel" : "google",
"campaign" : "8675309",
"group" : "9035768"
},
{
"date" : ISODate("2014-04-24T00:00:00Z"),
"clicks" : 1,
"channel" : "google",
"campaign" : "8675309",
"group" : "9035768"
}
],
"impressions" : [
{
"date" : ISODate("2014-04-25T00:00:00Z"),
"impressions" : 15,
"channel" : "google",
"campaign" : "8675309",
"group" : "9035768"
},
{
"date" : ISODate("2014-04-24T00:00:00Z"),
"impressions" : 33,
"channel" : "google",
"campaign" : "8675309",
"group" : "9035768"
}
]
}
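I want to sum the total clicks and the total impressions for this document, but I can't figure out how to set up the aggregation pipeline correctly.
The final result should be: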
{
_id: ObjectId("53cd79bb300ccae6b3904402"),
total_clicks: 6,
total_impressions: 48
}
Answer 0 (score: 4)
This is a relatively simple aggregation operation. The usual caveat is that you need to apply $unwind to each array separately:
db.collection.aggregate([
// Unwind the first array
{ "$unwind": "$clicks" },
// Sum results and keep the other array per document
{ "$group": {
"_id": "$_id",
"total_clicks": { "$sum": "$clicks.clicks" },
"impressions": { "$first": "$impressions" }
}},
// Unwind the second array
{ "$unwind": "$impressions" },
// Group the final result keeping the first result
{ "$group": {
"_id": "$_id",
"total_clicks": { "$first": "$total_clicks" },
"total_impressions": { "$sum": "$impressions.impressions" }
}}
])
This gives you the desired result:
{
"_id": ObjectId("53cd79bb300ccae6b3904402"),
"total_clicks": 6,
"total_impressions": 48
}
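The $first operator can be used here because each grouping operates on a single document, so the "impressions" array and the "total_clicks" value are the same on every row of the group. If you want totals across all documents, or by some other key, sum the arrays in the same way and then add a final $group stage for that other grouping level.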
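As a rough sketch of that last point, the pipeline above could be extended with one more $group stage. The variable name `pipeline` and the choice of `_id: null` (meaning "all documents in one group") are assumptions for illustration; to group by another key such as "sku" instead, that field would also need to be carried through the earlier stages, for example with $first.

```javascript
// Sketch only: the four stages from the answer plus one extra roll-up stage.
const pipeline = [
  { $unwind: "$clicks" },
  { $group: {
      _id: "$_id",
      total_clicks: { $sum: "$clicks.clicks" },
      impressions: { $first: "$impressions" }
  }},
  { $unwind: "$impressions" },
  { $group: {
      _id: "$_id",
      total_clicks: { $first: "$total_clicks" },
      total_impressions: { $sum: "$impressions.impressions" }
  }},
  // Extra stage (assumption): add the per-document totals into a single result.
  { $group: {
      _id: null,
      total_clicks: { $sum: "$total_clicks" },
      total_impressions: { $sum: "$total_impressions" }
  }}
];

// In the mongo shell this would be run as: db.collection.aggregate(pipeline)
```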
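Just remember to "unwind" each array separately. If you $unwind both arrays before grouping, every element of one array is repeated once for each element of the other, and the sums are inflated accordingly.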
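To make the duplication concrete, here is a small plain-JavaScript model of what happens when both arrays are unwound back to back. It does not talk to MongoDB; `unwind` is a hypothetical helper that only imitates the stage, and the document is trimmed to the fields that matter.

```javascript
// Hypothetical helper imitating $unwind: one output row per array element.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((item) => ({ ...doc, [field]: item }))
  );
}

// Trimmed copy of the sample document (values taken from the question).
const doc = {
  _id: "product-1",
  clicks: [{ clicks: 2 }, { clicks: 3 }, { clicks: 1 }],
  impressions: [{ impressions: 15 }, { impressions: 33 }],
};

// Unwinding both arrays before grouping yields 3 x 2 = 6 rows.
const rows = unwind(unwind([doc], "clicks"), "impressions");

const inflatedClicks = rows.reduce((sum, r) => sum + r.clicks.clicks, 0);
const inflatedImpressions = rows.reduce(
  (sum, r) => sum + r.impressions.impressions,
  0
);

console.log(rows.length);          // 6
console.log(inflatedClicks);       // 12, double the true total of 6
console.log(inflatedImpressions);  // 144, triple the true total of 48
```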
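Depending on your usage patterns, you might consider changing the schema a little. For example, since these entries differ only by "type", you could store them in a single "events" array: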
{
"_id" : ObjectId("53cd79bb300ccae6b3904402"),
"name" : "test product",
"sku" : "product-1",
"price" : 35,
"cost" : 12,
"max_cpc" : 100,
"price_in_cents" : 3500,
"cost_in_cents" : 1200,
"max_cpc_in_cents" : 10000,
"events" : [
{
"type": "click",
"date" : ISODate("2014-04-25T00:00:00Z"),
"number" : 2,
"channel" : "google",
"campaign" : "12345687",
"group" : "987654321"
},
{
"type": "click",
"date" : ISODate("2014-04-25T00:00:00Z"),
"number" : 3,
"channel" : "google",
"campaign" : "8675309",
"group" : "9035768"
},
{
"type": "click",
"date" : ISODate("2014-04-24T00:00:00Z"),
"number" : 1,
"channel" : "google",
"campaign" : "8675309",
"group" : "9035768"
},
{
"type": "impression",
"date" : ISODate("2014-04-25T00:00:00Z"),
"number" : 15,
"channel" : "google",
"campaign" : "8675309",
"group" : "9035768"
},
{
"type": "impression",
"date" : ISODate("2014-04-24T00:00:00Z"),
"number" : 33,
"channel" : "google",
"campaign" : "8675309",
"group" : "9035768"
}
]
}
The aggregation for this kind of schema then looks like this:
db.collection.aggregate([
// Unwind the events array
{ "$unwind": "$events" },
// Group each "type" conditionally
{ "$group": {
"_id": "$_id",
"total_clicks": {
"$sum": {
"$cond": [
{ "$eq": [ "$events.type", "click" ] },
"$events.number",
0
]
}
},
"total_impressions": {
"$sum": {
"$cond": [
{ "$eq": [ "$events.type", "impression" ] },
"$events.number",
0
]
}
}
}}
])
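Here $cond acts as a ternary operator: it evaluates a logical condition and chooses the value to pass to $sum depending on whether the condition is true or false.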
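For readers more used to application code, the conditional sum above behaves roughly like the following plain-JavaScript reduction. This is only an analogy under stated assumptions: `sumWhereType` is a hypothetical helper, and the `events` array is a trimmed copy of the sample data.

```javascript
// Trimmed copy of the "events" array from the restructured document.
const events = [
  { type: "click", number: 2 },
  { type: "click", number: 3 },
  { type: "click", number: 1 },
  { type: "impression", number: 15 },
  { type: "impression", number: 33 },
];

// Hypothetical helper mirroring
// { "$sum": { "$cond": [ { "$eq": [ type, wanted ] }, number, 0 ] } }
function sumWhereType(items, wanted) {
  return items.reduce(
    // The ternary plays the role of $cond: add the number or add 0.
    (total, e) => total + (e.type === wanted ? e.number : 0),
    0
  );
}

const totalClicks = sumWhereType(events, "click");
const totalImpressions = sumWhereType(events, "impression");

console.log({ totalClicks, totalImpressions });
```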
Or you can simply aggregate by "type" as part of the grouping key:
db.collection.aggregate([
// Unwind the events array
{ "$unwind": "$events" },
// Group by document and event "type"
{ "$group": {
"_id": { "_id": "$_id", "type": "$events.type" },
"total": { "$sum": "$events.number" }
}}
])
The results come out slightly differently, with one document per type:
{
"_id": {
"_id": ObjectId("53cd79bb300ccae6b3904402"),
"type": "click"
},
"total": 6
},
{
"_id": {
"_id": ObjectId("53cd79bb300ccae6b3904402"),
"type": "impression"
},
"total": 48
}
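Finally, if you can live with the trade-offs, such as losing atomic updates to fields outside the array when you add or otherwise update array members, then moving your "event stream" into a separate collection removes the need to call $unwind at all: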
{
"sku_id" : ObjectId("53cd79bb300ccae6b3904402"),
"name" : "test product",
"sku" : "product-1",
"type": "click",
"date" : ISODate("2014-04-25T00:00:00Z"),
"number" : 2,
"channel" : "google",
"campaign" : "12345687",
"group" : "987654321"
},
{
"sku_id" : ObjectId("53cd79bb300ccae6b3904402"),
"name" : "test product",
"sku" : "product-1",
"type": "impression",
"date" : ISODate("2014-04-24T00:00:00Z"),
"number" : 33,
"channel" : "google",
"campaign" : "8675309",
"group" : "9035768"
}
And the aggregation statement:
db.eventstream.aggregate([
{ "$group": {
"_id": "$sku_id",
"total_clicks": {
"$sum": {
"$cond": [
{ "$eq": [ "$type", "click" ] },
"$number",
0
]
}
},
"total_impressions": {
"$sum": {
"$cond": [
{ "$eq": [ "$type", "impression" ] },
"$number",
0
]
}
}
}}
])