I am trying to find common friends with a MapReduce program in MongoDB. After sorting by key in MongoDB, I ended up with the following data:
{"user" : " Hari","friend" : "Shiva",
"friendList": ["Hanks"," Tom"," Karma"," Hari"," Dinesh"]}
{"user" : "Hari","friend" : " Shiva",
"friendList" : ["Karma"," Tom"," Ram"," Bindu"," Shiva",
" Kishna"," Bikash"," Bakshi"," Dinesh"]}
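Now I want to group the records that share the same key into a single group, using JavaScript in the map function, before the key-value pairs are sent to the reducer. How can I group the data? For example, I would like the output to look like this: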
{"user" : " Hari","friend" : "Shiva",
"friendList": ["Hanks"," Tom"," Karma"," Hari"," Dinesh"],["Karma"," Tom"," Ram"," Bindu"," Shiva"," Kishna"," Bikash"," Bakshi"," Dinesh"]}
Answer 0: (score: 1)
You can concatenate the friendList arrays of the two records into a single array, producing an object like this:
{
"_id": {
"user": " Hari",
"friend": "Shiva"
},
"value": {
"friendList": [
"Hanks",
" Tom",
" Karma",
" Hari",
" Dinesh",
"Karma",
" Tom",
" Ram",
" Bindu",
" Shiva",
" Kishna",
" Bikash",
" Bakshi",
" Dinesh"
]
}
}
See the code at https://jsfiddle.net/b6hxswvk/1/ for how to create this single object.
If you would rather have friendList as a two-dimensional array, like this:
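Since the fiddle is only linked, here is a minimal sketch of one way the concatenation could be done in plain JavaScript. The helper name mergeFriendLists and the trimming of the key fields are assumptions made for illustration (the sample data has stray leading spaces such as " Hari" and "Hari"); they are not taken from the linked code.

```javascript
// Hypothetical helper: merges records that share the same (user, friend) key
// into one object whose friendList is the concatenation of all their lists.
function mergeFriendLists(records) {
  const groups = new Map();
  for (const rec of records) {
    // Assumption: keys are compared after trimming whitespace.
    const user = rec.user.trim();
    const friend = rec.friend.trim();
    const id = JSON.stringify([user, friend]);
    if (!groups.has(id)) {
      groups.set(id, { _id: { user: user, friend: friend }, value: { friendList: [] } });
    }
    const group = groups.get(id);
    group.value.friendList = group.value.friendList.concat(rec.friendList);
  }
  return Array.from(groups.values());
}

const records = [
  { user: " Hari", friend: "Shiva", friendList: ["Hanks", " Tom", " Karma", " Hari", " Dinesh"] },
  { user: "Hari", friend: " Shiva", friendList: ["Karma", " Tom", " Ram"] },
];
const merged = mergeFriendLists(records);
console.log(JSON.stringify(merged, null, 2));
```

Without the trimming step, the two sample records would land in different groups, because their key strings are not byte-for-byte identical.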
{
"_id": {
"user": " Hari",
"friend": "Shiva"
},
"value": {
"friendList": [
[
"Hanks",
" Tom",
" Karma",
" Hari",
" Dinesh"
],
[
"Karma",
" Tom",
" Ram",
" Bindu",
" Shiva",
" Kishna",
" Bikash",
" Bakshi",
" Dinesh"
]
]
}
}
The original answer linked to code for this variant as well.
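For the two-dimensional form, a possible sketch in plain JavaScript is shown here. The helper name nestFriendLists is hypothetical, and trimming the key fields is again an assumption; the only real difference from concatenation is that each record's list is pushed as its own inner array.

```javascript
// Hypothetical helper: groups records by (user, friend) and keeps each
// record's friendList as a separate inner array.
function nestFriendLists(records) {
  const groups = {};
  for (const rec of records) {
    const user = rec.user.trim();
    const friend = rec.friend.trim();
    const id = JSON.stringify([user, friend]);
    if (!groups[id]) {
      groups[id] = { _id: { user: user, friend: friend }, value: { friendList: [] } };
    }
    // push (not concat) so the list stays nested
    groups[id].value.friendList.push(rec.friendList);
  }
  return Object.values(groups);
}

const sample = [
  { user: "Hari", friend: "Shiva", friendList: ["Hanks", "Tom"] },
  { user: "Hari", friend: "Shiva", friendList: ["Karma", "Ram", "Bindu"] },
];
const nested = nestFriendLists(sample);
console.log(JSON.stringify(nested));
```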
Answer 1: (score: 1)
You can use the aggregation framework with $group on the user and friend fields.
db.collection.aggregate([
{$group:{
_id:{
user:'$user',
friend:'$friend'
},
friendList:{$push:'$friendList'}
}},
// project the fields as your wish
{$project:{
user:'$_id.user',
friend:'$_id.friend',
friendList:'$friendList'
}}
])
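Hopefully this aggregation pipeline returns the result you expect.
Answer 2: (score: 0)
Friend, why group the values for the same key yourself, when MapReduce already does this by grouping the values of identical keys and passing them to reduce as key, list[values]?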
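Note that $push on an array field collects whole arrays, so friendList comes out as an array of arrays (the two-dimensional form). If a single flat list is wanted instead, one possible sketch, assuming MongoDB 3.4 or later for $reduce and $concatArrays, is the pipeline here. The variable name pipeline and the intermediate field name lists are assumptions for illustration.

```javascript
// Sketch: group by (user, friend), then flatten the pushed arrays into one list.
const pipeline = [
  { $group: {
      _id: { user: '$user', friend: '$friend' },
      lists: { $push: '$friendList' }          // array of arrays
  } },
  { $project: {
      _id: 0,
      user: '$_id.user',
      friend: '$_id.friend',
      friendList: {
        // fold the inner arrays together, starting from an empty array
        $reduce: {
          input: '$lists',
          initialValue: [],
          in: { $concatArrays: ['$$value', '$$this'] }
        }
      }
  } }
];
// In the mongo shell this would be run as: db.collection.aggregate(pipeline)
```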
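I strongly recommend doing the grouping in the reducer rather than in the map function. The main reason is that the map task reads one record at a time and only emits key-value pairs; the framework takes on the burden of identifying which values belong to the same key, and how the grouped values are laid out in the output is something we can decide in the reduce logic.
You can then take the reducer's output for further processing.
Input: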
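To make this concrete, here is a small plain-JavaScript simulation of what happens between map and reduce. It is not MongoDB's implementation: runMapReduce and its internals are hypothetical names, and emit is passed in as an argument (in MongoDB it is a global) so that the sketch runs anywhere.

```javascript
// Hypothetical simulation of the grouping step of map-reduce.
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  // emit files each value under its serialized key
  const emit = (key, value) => {
    const id = JSON.stringify(key);
    if (!groups.has(id)) groups.set(id, { key: key, values: [] });
    groups.get(id).values.push(value);
  };
  for (const doc of docs) mapFn.call(doc, emit);
  // reduce is called once per key; keys with a single value skip it
  return Array.from(groups.values()).map((g) => ({
    _id: g.key,
    value: g.values.length === 1 ? g.values[0] : reduceFn(g.key, g.values),
  }));
}

const docs = [
  { user: "Hari", friend: "Shiva", friendList: ["Hanks", "Tom"] },
  { user: "Hari", friend: "Shiva", friendList: ["Karma", "Ram"] },
  { user: "Hari", friend: "Tom", friendList: ["Bindu"] },
];
const mapFn = function (emit) {
  emit({ user: this.user, friend: this.friend }, { friendList: this.friendList });
};
const reduceFn = function (key, values) {
  return { friendList: values.reduce((all, v) => all.concat(v.friendList), []) };
};
const results = runMapReduce(docs, mapFn, reduceFn);
console.log(JSON.stringify(results));
```

The map function never looks at more than one document, yet the reducer still receives every value for a key together.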
{"_id" : {"user" : " Hari","friend" : "Shiva"},
"value" : {"friendList": ["Hanks"," Tom"," Karma"," Hari"," Dinesh"]}}
{"_id" : {"user" : "Hari","friend" : " Shiva"},
"value" : {"friendList" : ["Karma"," Tom"," Ram"," Bindu"," Shiva",
" Kishna"," Bikash"," Bakshi"," Dinesh"]}}
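MapReduce code:
var mapper = function () {
    // Reads the top-level user/friend/friendList fields, as in the question's documents.
    var key = {"user": this.user, "friend": this.friend};
    emit(key, {"friendList": this.friendList});
};
var reducer = function (key, values) {
    // Return the same shape as the emitted value: MongoDB may call reduce
    // again on earlier results, and it adds the _id/value wrapper itself.
    var combinedfriendList = {"friendList": []};
    for (var i = 0; i < values.length; i++) {
        var inter = values[i];
        for (var j = 0; j < inter.friendList.length; j++) {
            combinedfriendList.friendList.push(inter.friendList[j]);
        }
    }
    return combinedfriendList;
};
// Run it in the mongo shell, for example:
// db.collection.mapReduce(mapper, reducer, {out: {inline: 1}})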
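One property worth checking in any reducer: MongoDB may call reduce more than once for the same key and feed earlier results back in, so the return value should have the same shape as the emitted values. A sketch of such a check in plain JavaScript follows; concatReducer is a hypothetical standalone reducer written for this test, not the exact function from the answer.

```javascript
// Hypothetical reducer that returns the same shape it receives.
function concatReducer(key, values) {
  const combined = { friendList: [] };
  for (const v of values) {
    combined.friendList.push(...v.friendList);
  }
  return combined;
}

const key = { user: "Hari", friend: "Shiva" };
const a = { friendList: ["Hanks", "Tom"] };
const b = { friendList: ["Karma"] };
const c = { friendList: ["Ram", "Bindu"] };

// Reducing everything at once should equal reducing a partial result again.
const allAtOnce = concatReducer(key, [a, b, c]);
const inTwoSteps = concatReducer(key, [concatReducer(key, [a, b]), c]);
const sameResult = JSON.stringify(allAtOnce) === JSON.stringify(inTwoSteps);
console.log(sameResult); // true
```

A reducer that wrapped its result in an extra _id/value object would fail this check, because the second call would no longer find friendList at the top level.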
Expected output:
{"_id" : {"user" : " Hari","friend" : "Shiva"},
"value" : {"friendList": ["Hanks"," Tom"," Karma"," Hari"," Dinesh","Karma"," Tom"," Ram"," Bindu"," Shiva"," Kishna"," Bikash"," Bakshi"," Dinesh"]}}
I hope this helps. You can test it in your environment (adjusting it if needed) and share your feedback.