I need to summarize the following data:
- Country: One, Car: Volvo, Name: Smith, Price: 100
- Country: One, Car: BMW, Name: Smith, Price: 200
- Country: Two, Car: Romeo, Name: Joe, Price: 50
- Country: Two, Car: KIA, Name: Joe, Price: 110
- Country: Two, Car: KIA, Name: Joe, Price: 90
(Names are unique, and each person owns cars from only one country.)
As a result, I expect (pluralization is not required):
- Name: Smith, Type: Volvos, Country: One, Val: 1 // Count of car-type
- Name: Smith, Type: BMWs, Country: One, Val: 1
- Name: Smith, Type: Total, Country: One, Val: 2 // Count of all his cars
- Name: Smith, Type: Price, Country: One, Val: 300 // Total car price
- Name: Joe, Type: Romeos, Country: Two, Val: 1
- Name: Joe, Type: KIAs, Country: Two, Val: 2
- Name: Joe, Type: Total, Country: Two, Val: 3
- Name: Joe, Type: Price, Country: Two, Val: 250
That is, this is a flattened version of the pivoted data used to build a report:
Country | Name | Volvos | BMWs | Romeos | KIAs | Total | Price
----------------------------------------------------------------
One | Smith | 1 | 1 | | | 2 | 300
----------------------------------------------------------------
Two | Joe | | | 1 | 2 | 3 | 250
| Other | ? | ? | ... etc
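I am wondering whether the aggregation framework in Mongo can handle this, or whether I should resort to hardcore map-reduce?
Answer 0: (score: 2)
Not exactly the result you specified, but more the MongoDB way of doing it: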
db.cars.aggregate([
    { "$group": {
        "_id": {
            "name": "$Name",
            "type": "$Car"
        },
        "Country": { "$first": "$Country" },
        "CarCount": { "$sum": 1 },
        "TotalPrice": { "$sum": "$Price" }
    }},
    { "$group": {
        "_id": "$_id.name",
        "cars": {
            "$push": {
                "type": "$_id.type",
                "country": "$Country",
                "carCount": "$CarCount",
                "TotalPrice": "$TotalPrice"
            }
        },
        "TotalPrice": { "$sum": "$TotalPrice" }
    }}
])
This gives you:
{
    "_id" : "Smith",
    "cars" : [
        {
            "type" : "BMW",
            "country" : "One",
            "carCount" : 1,
            "TotalPrice" : 200
        },
        {
            "type" : "Volvo",
            "country" : "One",
            "carCount" : 1,
            "TotalPrice" : 100
        }
    ],
    "TotalPrice" : 300
}
{
    "_id" : "Joe",
    "cars" : [
        {
            "type" : "KIA",
            "country" : "Two",
            "carCount" : 2,
            "TotalPrice" : 200
        },
        {
            "type" : "Romeo",
            "country" : "Two",
            "carCount" : 1,
            "TotalPrice" : 50
        }
    ],
    "TotalPrice" : 250
}
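If the flat Name/Type/Country/Val rows from the question are still needed, the nested documents above can be unwound on the client. Here is a minimal sketch in plain JavaScript, assuming the aggregation output has already been read into an array (for example with toArray()). The helper name flattenReport and the sample variable names are made up for illustration.

```javascript
// Hypothetical helper (not part of MongoDB): turns the nested per-owner
// documents into flat { Name, Type, Country, Val } rows.
function flattenReport(docs) {
  const rows = [];
  for (const doc of docs) {
    let total = 0;
    let country = null;
    for (const car of doc.cars) {
      // One row per car type, carrying the count of that type.
      rows.push({ Name: doc._id, Type: car.type, Country: car.country, Val: car.carCount });
      total += car.carCount;
      country = car.country; // assumes each owner has cars from one country only
    }
    // Two summary rows per owner: number of cars and total price.
    rows.push({ Name: doc._id, Type: "Total", Country: country, Val: total });
    rows.push({ Name: doc._id, Type: "Price", Country: country, Val: doc.TotalPrice });
  }
  return rows;
}

// Sample input copied from the output shown above (Smith only).
const sampleDocs = [{
  _id: "Smith",
  cars: [
    { type: "BMW", country: "One", carCount: 1, TotalPrice: 200 },
    { type: "Volvo", country: "One", carCount: 1, TotalPrice: 100 }
  ],
  TotalPrice: 300
}];
const flatRows = flattenReport(sampleDocs);
```

For Smith this yields four rows: one each for BMW and Volvo, then the Total and Price rows.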
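Answer 1: (score: 1)
There may be some trick that works, but with a variable number of types I don't believe you can get this in a single aggregation query. You can, however, split the whole table into two queries.
I should mention that the totals could be computed client side, which should also be very fast.
I should also note that the aggregation framework currently cannot "merge" two outputs: http://docs.mongodb.org/manual/reference/operator/aggregation/out/ but you can sort both result sets in the same order.
First, you need the totals (if you do this via the aggregation framework):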
db.cars.aggregate([
    {$group: {
        _id: {
            Country: '$Country',
            Name: '$Name'
        },
        car_count: {$sum: 1},
        value_total: {$sum: '$Price'} // the source field is Price
    }},
    {$sort: {_id: 1}} // we now sort by the country and name
])
So now you want the totals per car type:
db.cars.aggregate([
    {$group: {
        _id: {
            Country: '$Country',
            Name: '$Name',
            Type: '$Car'
        },
        // No separate sort_key is needed: every non-_id field in $group must
        // be an accumulator, and the group key already holds Country and Name.
        car_count: {$sum: 1},
        value_total: {$sum: '$Price'}
    }},
    {$sort: {'_id.Country': 1, '_id.Name': 1}} // same order as the totals
])
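Now you can use JavaScript, for example, to iterate over the first result set (the totals) and, in a nested loop, iterate over the detailed results of the other aggregation, printing it all out.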
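The nested iteration could look roughly like the sketch below. It is plain JavaScript and assumes both result sets have been read into arrays sorted by country and then name, with the field names car_count, value_total and _id.Type used in the queries above. The helper name mergeReport and the sample data (derived from the question) are illustrative only.

```javascript
// Hypothetical helper: pairs each totals row with its per-car detail rows.
// Both arrays are assumed to be sorted by Country, then Name.
function mergeReport(totals, details) {
  const lines = [];
  let i = 0; // cursor into details; it only moves forward because both inputs are sorted
  for (const total of totals) {
    const { Country, Name } = total._id;
    const parts = [];
    // Inner loop: consume the detail rows that belong to this owner.
    while (i < details.length &&
           details[i]._id.Country === Country &&
           details[i]._id.Name === Name) {
      parts.push(details[i]._id.Type + ": " + details[i].car_count);
      i++;
    }
    lines.push(Country + " | " + Name + " | " + parts.join(", ") +
               " | Total: " + total.car_count + " | Price: " + total.value_total);
  }
  return lines;
}

// Sample data in the shape the two aggregations would return for the question's input.
const totalsSample = [
  { _id: { Country: "One", Name: "Smith" }, car_count: 2, value_total: 300 },
  { _id: { Country: "Two", Name: "Joe" }, car_count: 3, value_total: 250 }
];
const detailsSample = [
  { _id: { Country: "One", Name: "Smith", Type: "BMW" }, car_count: 1, value_total: 200 },
  { _id: { Country: "One", Name: "Smith", Type: "Volvo" }, car_count: 1, value_total: 100 },
  { _id: { Country: "Two", Name: "Joe", Type: "KIA" }, car_count: 2, value_total: 200 },
  { _id: { Country: "Two", Name: "Joe", Type: "Romeo" }, car_count: 1, value_total: 50 }
];
const reportLines = mergeReport(totalsSample, detailsSample);
reportLines.forEach(line => console.log(line));
```

Because both inputs share the same sort order, the detail cursor never has to rewind, so the merge is a single pass over each array.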
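This will probably be faster than Map Reduce, but another option is to run Map Reduce every so often to update an aggregated collection and query from that. This means the results won't be real time (they may lag by, say, 5 minutes), but reads will be super fast.
Answer 2: (score: 0)
Aggregation should be fine for this. The simplest way is two separate commands. If your collection is called cars, you can run something like this: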
db.cars.aggregate([{$group:{_id:{"Country":"$Country","Name":"$Name"},"sum":{$sum:1},"price":{$sum:"$Price"}}}])
db.cars.aggregate([{$group:{_id:{"Country":"$Country","Name":"$Name","Car":"$Car"},"sum":{$sum:1},"price":{$sum:"$Price"}}}])
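Since the per-car output of the second command already carries the counts and prices, the rows can also be pivoted into the report layout from the question on the client. The following is a rough sketch in plain JavaScript, assuming that output has been read into an array with the sum and price fields shown above. The helper name pivotCars is hypothetical, and the sketch assumes no car type is literally named Country, Name, Total or Price.

```javascript
// Hypothetical helper: pivots per-car rows into one report row per owner,
// with one column per car type plus Total and Price.
function pivotCars(perCar) {
  const byOwner = new Map();
  for (const row of perCar) {
    // JSON-encode the pair so that any characters in the names are safe in the key.
    const key = JSON.stringify([row._id.Country, row._id.Name]);
    if (!byOwner.has(key)) {
      byOwner.set(key, { Country: row._id.Country, Name: row._id.Name, Total: 0, Price: 0 });
    }
    const out = byOwner.get(key);
    out[row._id.Car] = row.sum; // column named after the car type
    out.Total += row.sum;       // running count of all cars for this owner
    out.Price += row.price;     // running total price for this owner
  }
  return Array.from(byOwner.values());
}

// Sample rows in the shape the second command would return for the question's data.
const perCarSample = [
  { _id: { Country: "One", Name: "Smith", Car: "Volvo" }, sum: 1, price: 100 },
  { _id: { Country: "One", Name: "Smith", Car: "BMW" }, sum: 1, price: 200 },
  { _id: { Country: "Two", Name: "Joe", Car: "Romeo" }, sum: 1, price: 50 },
  { _id: { Country: "Two", Name: "Joe", Car: "KIA" }, sum: 2, price: 200 }
];
const pivoted = pivotCars(perCarSample);
```

Car types an owner does not have are simply absent from that owner's row, which corresponds to the blank cells in the report table.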