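I have 1 million documents in MongoDB. I want to find the duplicate entries and remove (unset) them. Can you give me a method or an idea?
My documents look like this: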
{
"regions" : [
{"id" : "1", "name" : "World"},
{"id" : "10370","name" : "South America"},
{"id" : "1426","name" : "Suriname"}
]
}
{
"regions" : [
{"id" : "1", "name" : "World"},
{"id" : "10370","name" : "South America"},
{"id" : "1426","name" : "Suriname"}
]
}
{
"regions" : [
{"id" : "1","name" : "World"},
{"id" : "1734","name" : "USA"},
{"id" : "1136","name" : "Pennsylvania"},
{"id" : "16962","name" : "Greater Philadelphia area"},
]
}
{
"regions" : [
{"id" : "1","name" : "World"},
{"id" : "1734","name" : "USA"},
{"id" : "1136","name" : "Pennsylvania"},
{"id" : "16962","name" : "Greater Philadelphia area"},
]
}
{
"regions" : [
{"id" : "1","name" : "World"},
{"id" : "34964","name" : "Oceania"},
{"id" : "15","name" : "Australia"},
{"id" : "470","name" : "Western Australia"},
{"id" : "36282","name" : "Perth"},
]
}
How can I change them into this:
{
"regions" : [
{"id" : "1", "name" : "World"},
{"id" : "10370","name" : "South America"},
{"id" : "1426","name" : "Suriname"}
]
}
{
"regions" : [
{"id" : "1","name" : "World"},
{"id" : "1734","name" : "USA"},
{"id" : "1136","name" : "Pennsylvania"},
{"id" : "16962","name" : "Greater Philadelphia area"},
]
}
{
"regions" : [
{"id" : "1","name" : "World"},
{"id" : "34964","name" : "Oceania"},
{"id" : "15","name" : "Australia"},
{"id" : "470","name" : "Western Australia"},
{"id" : "36282","name" : "Perth"},
]
}
Thanks in advance for your answers and your interest.
Update: I am trying this code:
db.collection.aggregate(
{"$group":{"_id": {"id": "$regions.id","name": "$regions.name"},}},
{"$group":{"_id":ObjectId(),"regions": { "$push": {"id": "$_id.id","name": "$_id.name"}}}},
{"$unwind": "$regions"},
{"$out": "newcollection"}
)
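It gives this error: "errmsg" : "insert for $out failed: { connectionId: 111, err: \"E11000 duplicate key error collection: collection.tmp.agg_out.12 index: _id_ dup key: { : ObjectId('5767f378ff8f5e9302d95bc8') }\", code: 11000, n: 0, ok: 1.0 }",
How can I provide a unique key?
Answer 0 (score: 0)
With aggregation, you can remove the duplicate regions by grouping on the array elements. Would something like this help?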
db.regs.aggregate([{$group:{"_id":{id:"$regions.id",name:"$regions.name"}}}]).pretty()
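As for the E11000 error in the question: ObjectId() in the second $group is evaluated once by the shell, so every group is merged into a single document, and after $unwind every output document carries that same _id, which $out then rejects as a duplicate key. A minimal sketch of one way around this, assuming that documents count as duplicates when their whole regions arrays are identical, is to group on the array itself and drop _id before $out so that a new ObjectId is generated for each written document. The helper name buildDedupePipeline is hypothetical and only used here for illustration; the collection names are the ones from the question.

```javascript
// Hypothetical helper (not part of MongoDB): builds a pipeline that keeps one
// document per distinct `regions` array and writes the result to `outName`.
function buildDedupePipeline(outName) {
  return [
    // Documents whose `regions` arrays are identical fall into the same group.
    { $group: { _id: "$regions" } },
    // Move the array back into `regions` and drop `_id`, so that each
    // document written by $out receives a freshly generated ObjectId.
    { $project: { _id: 0, regions: "$_id" } },
    // Write the deduplicated documents to a separate collection.
    { $out: outName }
  ];
}

// Usage in the mongo shell (allowDiskUse lets $group spill to disk,
// which may be needed with around 1 million input documents):
// db.collection.aggregate(buildDedupePipeline("newcollection"), { allowDiskUse: true })
```

Writing to a separate collection leaves the original data untouched, so the result can be checked before anything is replaced.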