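I have started learning MongoDB and have run into a problem. I have a collection named server_logs.
It contains the following columns (SOURCE_SERVER, SOURCE_PORT, DESTINATION_PORT, DESTINATION_SERVER, MBYTES).
I need the total MBYTES transferred for each SOURCE_SERVER. (One more point: if a SOURCE_SERVER also appears as a DESTINATION_SERVER, those MBYTES should be added to that SOURCE_SERVER's total as well.)
For example, I have the following table structure: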
   SOURCE   S_PORT  DEST     D_PORT  MBYTES
1) server1  446     server2  555     10MB
2) server3  226     server1  666     2MB
3) server1  446     server3  226     5MB
I need the following result:
Server1 17MB
Server3 7MB
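I wrote a query in MySQL that calculates the top SOURCE based on the MBYTES of data transferred to that SOURCE. It works fine, and this query gives me the result I need in MySQL: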
SELECT SOURCE, DEST, sum( logs.MBYTES )+(
SELECT SUM(log.MBYTES) as sum
from logs as log
where logs.DEST=log.SOURCE
) AS MBYTES
I would like to run this query in MongoDB. Please help.
Thanks in advance.
Answer 0 (score: 1)
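Although it may not be obvious how to write this kind of "self join" query in MongoDB, it can be done with the aggregation framework. It just takes a slight shift in thinking.
With your data in MongoDB in this form, which is still very much like the original SQL source: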
{
"source" : "server1",
"s_port" : 446,
"dest" : "server2",
"d_port" : 555,
"transferMB" : 10
},
{
"source" : "server3",
"s_port" : 226,
"dest" : "server1",
"d_port" : 666,
"transferMB" : 2
},
{
"source" : "server1",
"s_port" : 446,
"dest" : "server3",
"d_port" : 226,
"transferMB" : 5
}
With versions of MongoDB prior to 2.6, your query would look like this:
db.logs.aggregate([
// Project a "type" tag in order to transform, then unwind
{ "$project": {
"source": 1,
"dest": 1,
"transferMB": 1,
"type": { "$cond": [ 1,[ "source", "dest" ],0] }
}},
{ "$unwind": "$type" },
// Map the "source" and "dest" servers onto the type, keep the source
{ "$project": {
"type": 1,
"tag": { "$cond": [
{ "$eq": [ "$type", "source" ] },
"$source",
"$dest"
]},
"mbytes": "$transferMB",
"source": 1
}},
// Group for totals, keep an array of the "source" for each
{ "$group": {
"_id": "$tag",
"mbytes": { "$sum": "$mbytes" },
"source": { "$addToSet": "$source" }
}},
// Unwind that array
{ "$unwind": "$source" },
// Is our grouped tag one of the sources? This simulates an inner join
{ "$project": {
"mbytes": 1,
"matched": { "$eq": [ "$source", "$_id" ] }
}},
// Filter the results that did not match
{ "$match": { "matched": true }},
// Discard duplicates for each server tag
{ "$group": {
"_id": "$_id",
"mbytes": { "$first": "$mbytes" }
}}
])
For 2.6 and later, you can use some additional operators to simplify this, or at least to do it with different operators:
db.logs.aggregate([
// Project a "type" tag in order to transform, then unwind
{ "$project": {
"source": 1,
"dest": 1,
"transferMB": 1,
"type": { "$literal": [ "source", "dest" ] }
}},
{ "$unwind": "$type" },
// Map the "source" and "dest" servers onto the type, keep the source
{ "$project": {
"type": 1,
"tag": { "$cond": [
{ "$eq": [ "$type", "source" ] },
"$source",
"$dest"
]},
"mbytes": "$transferMB",
"source": 1
}},
// Group for totals, keep an array of the "source" for each
{ "$group": {
"_id": "$tag",
"mbytes": { "$sum": "$mbytes" },
"source": { "$addToSet": "$source" }
}},
// Coerce the server tag into an array (of one element)
{ "$group": {
"_id": "$_id",
"mbytes": { "$first": "$mbytes" },
"source": { "$first": "$source" },
"tags": { "$push": "$_id" }
}},
// Use set intersection to find the common element count of the arrays
{ "$project": {
"mbytes": 1,
"matched": { "$size": {
"$setIntersection": [
"$source",
"$tags"
]
}}
}},
// Filter those that had nothing in common
{ "$match": { "matched": { "$gt": 0 } }},
// Remove the un-required field
{ "$project": { "mbytes": 1 }}
])
Both forms produce the result:
{ "_id" : "server1", "mbytes" : 17 }
{ "_id" : "server3", "mbytes" : 7 }
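The general principle in both is that, by keeping a list of the valid "source" servers, you can "filter" the combined results so that only servers listed as a source have their total transfer reported.
So there are a few techniques you can use to "re-shape", "combine" and "filter" your documents in order to get the result you want.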
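To make that principle concrete outside the database, here is a minimal plain JavaScript sketch of the same idea. It is only an illustration under assumptions, not a MongoDB API: `totalsForSources` is a hypothetical helper name, and the sample rows reuse the field names from the documents above.

```javascript
// Sample rows from the question, using the field names from the answer above.
const logs = [
  { source: "server1", dest: "server2", transferMB: 10 },
  { source: "server3", dest: "server1", transferMB: 2 },
  { source: "server1", dest: "server3", transferMB: 5 },
];

// Hypothetical helper (not a MongoDB API): mimics the pipeline in memory.
function totalsForSources(docs) {
  const totals = new Map();  // server name -> MB summed over both roles
  const sources = new Set(); // servers that appear as a source at least once

  for (const doc of docs) {
    sources.add(doc.source);
    // Count the transfer once per role, like the "type" tag plus $unwind.
    for (const tag of [doc.source, doc.dest]) {
      totals.set(tag, (totals.get(tag) || 0) + doc.transferMB);
    }
  }

  // Keep only servers that are also sources, like the final $match.
  return [...totals]
    .filter(([server]) => sources.has(server))
    .map(([server, mbytes]) => ({ _id: server, mbytes }));
}

console.log(totalsForSources(logs));
```

For the three sample rows this keeps server1 (17) and server3 (7) and drops server2, which only ever appears as a destination.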
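Have a read up on the aggregation operators, and it is also worth looking at the SQL to Aggregation mapping chart in the documentation to get an idea of how common operations translate.
You can even browse the aggregation-framework tag here on Stack Overflow to find some interesting transformation operations.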
Answer 1 (score: 0)
You can use the aggregation framework:
db.logs.aggregate([
{$group:{_id:"$SOURCE",MBYTES:{$sum:"$MBYTES"}}}
])
This assumes you have only numeric values in the MBYTES field. You will then have:
{
_id: "server1",
MBYTES: 15
},
{
_id: "server3",
MBYTES: 2
}
If you also need to count the cases where the server appears in the DEST field, you should use the map-reduce approach:
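// Emit each document's MBYTES once for its source and once for its destination
var mapF = function () {
    emit(this.SOURCE, this.MBYTES);
    emit(this.DEST, this.MBYTES);
};
// Reduce must return the same type that map emits (a number), because
// MongoDB may call it again on its own partial results
var reduceF = function (serverId, mbytesValues) {
    var total = 0;
    mbytesValues.forEach(function (value) {
        total += value;
    });
    return total;
};
db.logs.mapReduce(mapF, reduceF, { out: "server_stats" });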
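After that you can find the results in the server_stats collection. Note that this also includes servers that appear only as a destination (server2 in the example), so filter those out if you only want source servers.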
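One thing worth checking with any reduce function is that it still gives the right total when it is fed its own output, since MongoDB may call reduce more than once for the same key. The sketch below tries that in plain JavaScript, with a reducer that returns a plain number. `reduceInChunks` is a hypothetical stand-in for what the server might do, not a MongoDB function.

```javascript
// Reducer that returns the same type the mapper emits (a number).
function reduceF(serverId, mbytesValues) {
  let total = 0;
  mbytesValues.forEach(function (value) {
    total += value;
  });
  return total;
}

// Hypothetical stand-in for the server: reduces the values in chunks and then
// reduces the partial results again, as MongoDB is allowed to do.
function reduceInChunks(key, values, chunkSize) {
  const partials = [];
  for (let i = 0; i < values.length; i += chunkSize) {
    partials.push(reduceF(key, values.slice(i, i + chunkSize)));
  }
  return reduceF(key, partials);
}

// Values emitted for server1 by the three sample rows: 10, 2 and 5.
const oneShot = reduceF("server1", [10, 2, 5]);
const chunked = reduceInChunks("server1", [10, 2, 5], 2);
console.log(oneShot, chunked);
```

Both calls should agree. A reducer that returns an object while the mapper emits numbers would not pass this check, because the second pass would receive objects instead of numbers.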